headers with no charset or 'us-ascii' charsets. Actually this is only
partially true: we know about semicolons (but not true parameters) and
we know about whitespace (but not technically folding whitespace).
Still it should be good enough for all practical purposes.
Other changes include:
__init__(): Add a continuation_ws argument, which defaults to a single
space. Set this to change the whitespace used for continuation lines
when a header must be split. Also, changed the way header line
lengths are calculated, so that they take into account continuation_ws
(when tabs-expanded) and any provided header_name parameter. This
should do much better on returning split headers for which the first
and subsequent lines must fit into a specified width.
guess_maxlinelen(): Removed. I don't think we need this method as
part of the public API.
encode_chunks() -> _encode_chunks(): I don't think we need this one as
part of the public API either.
know anything about RFC 2047 encoded headers. Fortunately we have a
perfectly good header splitter in Header.encode(). So we just call
that to give us a properly formatted and split header.
Header.encode() didn't know about "highest-level syntactic breaks" but
that's been fixed now too.
__call__() can be 2-3x slower than the equivalent normal method.
_handle_message(): The structure of message/rfc822 message has
changed. Now parent's payload is a list of length 1, and the zeroth
element is the Message sub-object. Adjust the printing of such
message trees to reflect this change.
subclasses.
MIMENonMultipart: Base class for non-multipart/* content type subclass
specializations, e.g. image/gif. This class overrides attach() which
raises an exception, since it makes no sense to attach a subpart to
e.g. an image/gif message.
MIMEMultipart: Base class for multipart/* content type subclass
specializations, e.g. multipart/mixed. Does little more than provide
a useful constructor.
email package's Parser to handle the three common line endings.
Certain protocols such as IMAP define CRLF line endings and it doesn't
make sense for the client app to have to normalize the line endings
before handing it message off to the Parser.
_parsebody(): Be more flexible in the matching of line endings for
finding the MIME separators. Accept any of \r, \n and \r\n. Note
that we do /not/ change the line endings in the payloads, we just
accept any of those three around MIME boundaries.
single byte character sets. Also fixed a semantic problem with the
constructor's default arguments. Specifically,
__init__(): Change the maxlinelen argument default to None instead of
MAXLINELEN. The semantics should have been (and now are) that if
maxlinelen is given it is always honored. If it isn't given, but
header_name is given, then the maximum line length is calculated. If
neither are given then the default 76 characters is used.
_split(): If the character set is a single byte character set then we
can split the line at the maxlinelen because we know that encoding the
header won't increase its length. If the charset isn't a single byte
charset then we use the quicker divide-and-conquer line splitting
algorithm as before.
for the email package. The former is now just a shell project that
has some extra files for packaging for independent use (e.g. setup.py
and README).
Added a compatibility layer so that the same API can be used in Python
2.1 and 2.2/2.3 with the major differences shuffled off into helper
modules (_compat21.py and _compat22.py).
Also bumped the package version number to 2.0.3 for some fixes to be
checked in momentarily.
double call to AddressList.getaddrlist(), and /that/ always returns an
empty list for the second and subsequent calls.
Instead, instantiate an AddressList directly, and get the parsed
addresses out of the addresslist attribute.
non-us-ascii character sets in headers and bodies. Some API changes
(with DeprecationWarnings for the old APIs). Better RFC-compliant
implementations of base64 and quoted-printable.
Updated test cases. Documentation updates to follow (after I finish
writing them ;).
incorrect for "uneven" timezones. This algorithm should work for even
timezones (e.g. America/New_York) and uneven timezones (e.g.
Australia/Adelaide and America/St_Johns).
Closes SF bug #483231.
rfc822.py. The old rfc822.formatdate() produced date strings using
obsolete syntax. The new version produces the preferred RFC 2822
dates.
Also, an optional argument `localtime' is added, which if true,
produces a date relative to the local timezone, with daylight savings
time properly taken into account.
the separating semi-colon shows up on a continuation line (legal, but
weird).
Bug reported and fixed by Matthew Cowles. Test case and sample email
included.
_handle_multipart(): If there is an epilogue and the epilogue does
not itself start with a newline, add a newline before writing the
epilogue. Closes SF bug #472481.
_split_header(): Split on folding whitespace if the attempt to split
on semi-colons failed.
_split_header(): Patch by Matthew Cowles for fixing SF bug # 471918,
Generator splitting long headers.
failobj, and when getting the subtype use 'plain' as the failobj.
text/plain is supposed to be the default if the message contains no
Content-Type: header.
headers. It does not parse the body of the message, instead simply
assigning it as a string to the container's payload. This can be much
faster when you're only interested in a message's header.
Also, add a clause to the big-if to handle message/delivery-status
content types. These create a message with subparts that are
Message instances, which best represent the header blocks of this
content type.
get_type(): Use a compiled regular expression, which can be shared.
_get_params_preserve(): A helper method which extracts the header's
parameter list preserving value quoting. I'm not sure that this
needs to be a public method. It's necessary because we want
get_param() and friends to return the unquoted parameter value,
however we want the quote-preserved form for set_boundary().
get_params(), get_param(), set_boundary(): Implement in terms of
_get_params_preserve().
walk(): Yield ourself first, then recurse over our subparts (if any).