When the new policies are used (and only when the new policies are explicitly
used) headers turn into objects that have attributes based on their parsed
values, and can be set using objects that encapsulate the values, as well as
set directly from unicode strings. The folding algorithm then takes care of
encoding unicode where needed, and folding according to the highest level
syntactic objects.
With this patch only date and time headers are parsed as anything other than
unstructured, but that is all the helper methods in the existing API handle.
I do plan to add more parsers, and complete the set specified in the RFC
before the package becomes stable.
This patch primarily does two things: (1) it adds some internal-interface
methods to Policy that allow for Policy to control the parsing and folding of
headers in such a way that we can construct a backward compatibility policy
that is 100% compatible with the 3.2 API, while allowing a new policy to
implement the email6 API. (2) it adds that backward compatibility policy and
refactors the test suite so that the only differences between the 3.2
test_email.py file and the 3.3 test_email.py file is some small changes in
test framework and the addition of tests for bugs fixed that apply to the 3.2
API.
There are some additional teaks, such as moving just the code needed for the
compatibility policy into _policybase, so that the library code can import
only _policybase. That way the new code that will be added for email6
will only get imported when a non-compatibility policy is imported.
This new interface will also allow for future planned enhancements
in control over the parser/generator without requiring any additional
complexity in the parser/generator API.
Patch reviewed by Éric Araujo and Barry Warsaw.
The rearranged code should do exactly what the old code did, but
the new code avoids a potentially costly re computation in the case
where a boundary already exists.
The tests that were failing on (some) windows machines, where the
msg_XX.txt files used native \r\n lineseps are now also run on machines
that use \n natively, and conversely the \n tests are run on Windows.
The failing tests revealed one place where linesep needed to be added
to a flatten call in generator. There was also another that the tests
didn't catch, so I added a test for that case as well.
The work on this is not 100% complete, but everything is present to
allow real-world testing of the code. The only remaining major todo
item is to (hopefully!) enhance the handling of non-ASCII bytes in headers
converted to unicode by RFC2047 encoding them rather than replacing them with
'?'s.
svn+ssh://pythondev@svn.python.org/python/trunk
........
r78274 | r.david.murray | 2010-02-20 23:23:00 -0500 (Sat, 20 Feb 2010) | 9 lines
Issue 7970: When email.Parser.Parser parses a MIME message of type
message/rfc822 it turns it into an object whose body consists of
a list containing a single Message object. HeaderParser, on the
other hand, just copies the body as a string. Generator.flatten
has a special handler for the message mime type that expected the
body to be the one item list. This fails if the message was parsed
by HeaderParser. So we now check to see if the body is a string
first, and if so just we just emit it.
........
svn+ssh://pythondev@svn.python.org/python/trunk
Merge adds an additional test for as_string with a maxheaderlen specified.
........
r77517 | r.david.murray | 2010-01-16 00:15:17 -0500 (Sat, 16 Jan 2010) | 6 lines
Issue #1670765: Prevent email.generator.Generator from re-wrapping
headers in multipart/signed MIME parts, which fixes one of the sources of
invalid modifications to such parts by Generator. Patch and tests by
Martin von Gagern.
........
r77525 | r.david.murray | 2010-01-16 11:08:32 -0500 (Sat, 16 Jan 2010) | 2 lines
Fix issue number in comment.
........
I replaced sys.maxint with sys.maxsize in Lib/*.py. Does anybody see a problem with the change on Win 64bit platforms? Win 64's long is just 32bit but the sys.maxsize is now 2**63-1 on every 64bit platform.
Also added docs for sys.maxsize.
This should restore the email package in the py3k branch to exactly what's in
the sandbox.
This wipes out 1-2 fixes made post-copy, which I'll re-apply shortly.
Completely get rid of StringIO.py and cStringIO.c.
I had to fix a few tests and modules beyond what Christian did, and
invent a few conventions. E.g. in elementtree, I chose to
write/return Unicode strings whe no encoding is given, but bytes when
an explicit encoding is given. Also mimetools was made to always
assume binary files.
There's one major and one minor category still unfixed:
doctests are the major category (and I hope to be able to augment the
refactoring tool to refactor bona fide doctests soon);
other code generating print statements in strings is the minor category.
(Oh, and I don't know if the compiler package works.)
number of tests, all because of the codecs/_multibytecodecs issue described
here (it's not a Py3K issue, just something Py3K discovers):
http://mail.python.org/pipermail/python-dev/2006-April/064051.html
Hye-Shik Chang promised to look for a fix, so no need to fix it here. The
tests that are expected to break are:
test_codecencodings_cn
test_codecencodings_hk
test_codecencodings_jp
test_codecencodings_kr
test_codecencodings_tw
test_codecs
test_multibytecodec
This merge fixes an actual test failure (test_weakref) in this branch,
though, so I believe merging is the right thing to do anyway.