In Python2, if a unicode string was assigned as the value of a header,
email would automatically CTE encode it using the UTF8 charset.
This capability was lost in the Python3 translation, and this patch
restores it.
Patch by Ali Ikinci, assisted by R. David Murray.
I also added a fix for the mailbox test that was depending (with a comment
that it was a bad idea to so depend) on non-ASCII causing message_from_string
to raise an error. It now uses support.patch to induce an error during
message serialization.
Analogous to the decode_header fix, this fix makes Header.append and
make_header correctly handle the unknown-8bit charset introduced by email5.1,
when the input to them is binary strings. Previous to this fix the
make_header(decode_header(x)) == x invariant was broken in the face of the
unknown-8bit charset.
Why I consider this a bug rather than an API change: the API change was
to Message, which didn't used to return Headers unless you added them
yourself. Now it does (for 8bit binary header input), so decode_header
needs to be able to handle them.
The fix is to charset.py, which was not doing the encoding to the
correct output character set when doing a body_encode for either
the shift-jis or euc-jp charsets. There's also a fix for handling
a bytes input in encoders.py.
Patch by Michael Henry, comment changes by me.
When a header was long enough to need to be split across lines, the
input charset name was used instead of the output charset name in
the encoded words. This make a difference only for the two charsets
above.
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r87136 | r.david.murray | 2010-12-08 17:53:00 -0500 (Wed, 08 Dec 2010) | 6 lines
Have script_helper._assert_python strip refcount strings from stderr.
This makes the output of the function and those that depend on it
independent of whether or not they are being run under a debug
build.
........
r87221 | r.david.murray | 2010-12-13 19:55:46 -0500 (Mon, 13 Dec 2010) | 4 lines
#10699: fix docstring for tzset: it does not take a parameter
Thanks to Garrett Cooper for the fix.
........
r87256 | r.david.murray | 2010-12-14 21:19:14 -0500 (Tue, 14 Dec 2010) | 2 lines
#10705: document what the values of debuglevel are and mean.
........
r87337 | r.david.murray | 2010-12-17 11:11:40 -0500 (Fri, 17 Dec 2010) | 2 lines
#10559: provide instructions for accessing sys.argv when first mentioned.
........
r87338 | r.david.murray | 2010-12-17 11:29:07 -0500 (Fri, 17 Dec 2010) | 2 lines
#10454: clarify the compileall docs and help messages.
[compileall.py changes not backported.]
........
r87571 | r.david.murray | 2010-12-29 14:06:48 -0500 (Wed, 29 Dec 2010) | 2 lines
Fix same typo in docs.
........
r87839 | r.david.murray | 2011-01-07 16:57:25 -0500 (Fri, 07 Jan 2011) | 9 lines
Fix formatting of values with embedded newlines when rfc2047 encoding
Before this patch if a value being encoded had an embedded newline,
the line following the newline would have no leading whitespace,
and the whitespace it did have was encoded into the word. Now
the existing whitespace gets turned into a blank, the way it does
in other header reformatting, and the _continuation_ws gets added
at the beginning of the encoded line.
........
r88164 | r.david.murray | 2011-01-24 14:34:58 -0500 (Mon, 24 Jan 2011) | 12 lines
#10960: fix 'stat' links, link to lstat from stat, general tidy of stat doc.
Original patch by Michal Nowikowski, with some additions and wording
fixes by me.
I changed the wording from 'Performs a stat system call' to 'Performs
the equivalent of a stat system call', since on Windows there are no
stat/lstat system calls involved. I also extended Michal's breakout
of the attributes into a list to the other paragraphs, and rearranged
the order of the paragraphs in the 'stat' docs to make it flow
better and put it in what I think is a more logical/useful order.
........