pyzor.digest
Handle digesting the messages.
- class pyzor.digest.DataDigester(msg, spec=None)
Bases: object
The major workhorse class.
- atomic_num_lines = 4
- digest
- classmethod digest_payloads(msg)
- email_ptrn = <compiled regular expression>
- handle_atomic(lines)
Digest every line in lines.
- handle_line(line)
- handle_pieced(lines, spec)
Digest the lines selected by spec.
- longstr_ptrn = <compiled regular expression>
- min_line_length = 8
- classmethod normalize(s)
- static normalize_html_part(s)
- classmethod should_handle_line(s)
- unwanted_txt_repl = ''
- url_ptrn = <compiled regular expression>
- value
- ws_ptrn = <compiled regular expression>
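The attributes and methods above suggest a pipeline: normalize each line, drop lines that are too short, then hash either every line or a selection of them. The following is a minimal standalone sketch of that idea, not pyzor's implementation. The regular expressions, the order in which they are applied, the `(offset percent, line count)` layout of the spec, the SHA-1 hash, and the names `sketch_digest` and `EXAMPLE_SPEC` are all assumptions made for illustration; only the values 4 and 8 come from the listing above.

```python
import hashlib
import re

# Assumed patterns: the generated docs only show that these attributes
# are compiled regular expressions, not what they match.
EMAIL_PTRN = re.compile(r"\S+@\S+")
URL_PTRN = re.compile(r"[a-z]+:\S+", re.IGNORECASE)
LONGSTR_PTRN = re.compile(r"\S{10,}")
WS_PTRN = re.compile(r"\s")

MIN_LINE_LENGTH = 8    # mirrors min_line_length above
ATOMIC_NUM_LINES = 4   # mirrors atomic_num_lines above

# Hypothetical spec: (offset as a percentage of the line count, number of lines).
EXAMPLE_SPEC = [(20, 3), (60, 3)]


def normalize(s):
    """Remove long tokens, e-mail addresses and URLs, then all whitespace."""
    for ptrn in (LONGSTR_PTRN, EMAIL_PTRN, URL_PTRN):
        s = ptrn.sub("", s)
    return WS_PTRN.sub("", s)


def should_handle_line(s):
    """Keep only lines that are still long enough after normalization."""
    return len(s) >= MIN_LINE_LENGTH


def sketch_digest(text, spec=EXAMPLE_SPEC):
    """Hash all lines of a short text, or spec-selected lines of a long one."""
    lines = [n for n in map(normalize, text.splitlines()) if should_handle_line(n)]
    if len(lines) <= ATOMIC_NUM_LINES:
        selected = lines                      # "atomic": digest everything
    else:
        selected = []                         # "pieced": digest slices only
        for offset, length in spec:
            start = offset * len(lines) // 100
            selected.extend(lines[start:start + length])
    h = hashlib.sha1()
    for line in selected:
        h.update(line.encode("utf8"))
    return h.hexdigest()
```

Under these assumptions, two messages that differ only in whitespace, addresses, or long unique tokens produce the same digest, which is the property a spam-signature scheme would want.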
- class pyzor.digest.HTMLStripper(collector)
Bases: HTMLParser.HTMLParser
Strip all tags from the HTML.
- handle_data(data)
Keep track of the data.
- handle_endtag(tag)
- handle_starttag(tag, attrs)
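A collector-based tag stripper of this kind can be sketched with the standard library parser. The example below is written for Python 3 (`html.parser`), whereas the listing above names the Python 2 module `HTMLParser`. The class `TagStripperSketch`, the helper `strip_tags`, and the choice of a plain list as the collector are hypothetical. The sketch overrides only `handle_data`; what the documented `handle_starttag` and `handle_endtag` overrides do is not stated here, so they are left out.

```python
from html.parser import HTMLParser


class TagStripperSketch(HTMLParser):
    """Hypothetical stand-in: appends text nodes to a caller-supplied list."""

    def __init__(self, collector):
        super().__init__()
        self.collector = collector  # assumed to be a list

    def handle_data(self, data):
        # Called by the parser for text between tags; skip blank runs.
        data = data.strip()
        if data:
            self.collector.append(data)


def strip_tags(html):
    """Return the text content of an HTML fragment, joined by spaces."""
    collector = []
    parser = TagStripperSketch(collector)
    parser.feed(html)
    parser.close()
    return " ".join(collector)
```

Passing the collector in, rather than having the parser own it, lets the caller reuse one list across several HTML parts.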
- class pyzor.digest.PrintingDataDigester(msg, spec=None)
Bases: pyzor.digest.DataDigester
Extends DataDigester to print out what is being digested.
- handle_line(line)
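The override pattern, a subclass that reports each line and then defers to the parent, might look like the sketch below. Both classes are hypothetical stand-ins rather than pyzor code, and the assumptions that lines arrive as bytes and are fed to a SHA-1 hash are made only for the example.

```python
import hashlib


class LineDigesterSketch:
    """Hypothetical stand-in for a digester that hashes one line at a time."""

    def __init__(self, lines):
        self.digest = hashlib.sha1()
        for line in lines:
            self.handle_line(line)
        self.value = self.digest.hexdigest()

    def handle_line(self, line):
        self.digest.update(line)


class PrintingLineDigesterSketch(LineDigesterSketch):
    """Same digest as the parent, but shows each line before hashing it."""

    def handle_line(self, line):
        print(line.decode("utf8"))
        super().handle_line(line)
```

Because only `handle_line` changes, the printing variant yields the same digest value as its parent, which makes it useful for debugging why two messages hash differently.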