This also backs out the previous fixes for #14360, #1717, and #16564.
Those bugs were actually caused by the fact that set_payload didn't decode to
str, thus rendering the model inconsistent. This fix does mean the data
processed by the encoder functions goes through an extra encode/decode cycle,
but it means the model is always consistent. Future API updates will provide
a better way to encode payloads, which will bypass this minor de-optimization.
Tests by Vajrasky Kok.
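
A minimal sketch of the consistency described above, assuming a current
Python 3 standard library: after an encoder runs, the payload held by the
model is a str, and get_payload(decode=True) recovers the original bytes.
The sample bytes and variable names are illustrative only.

```python
from email import encoders
from email.message import Message

# Arbitrary binary data, including bytes that are not valid ASCII.
raw = b"\x00\x01binary\xff\xfe"

msg = Message()
msg["Content-Type"] = "application/octet-stream"
msg.set_payload(raw)           # bytes handed to the model
encoders.encode_base64(msg)    # rewrites the payload and sets the CTE header

stored = msg.get_payload()               # what the model now holds
decoded = msg.get_payload(decode=True)   # payload with the CTE reversed

print(type(stored).__name__, msg["Content-Transfer-Encoding"])
print(decoded == raw)
```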
There were no tests for the encoders module. encode_base64 worked
because it is the default and so got tested implicitly elsewhere, and
we use encode_7or8bit internally, so that worked, too. I previously
fixed encode_noop, so this fix means that everything in the encoders
module now works, hopefully correctly. Also added an explicit test
for encode_base64.
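
As a rough illustration of what such tests exercise, the sketch below runs
each encoder named above on a fresh message and records the resulting
Content-Transfer-Encoding header. The helper cte_after is hypothetical (it
is not part of the email package), and the results assume a current Python 3.

```python
from email import encoders
from email.message import Message


def cte_after(encoder, payload):
    """Hypothetical helper: apply one encoder and return the CTE header."""
    msg = Message()
    msg.set_payload(payload)
    encoder(msg)
    return msg["Content-Transfer-Encoding"]


# Pure ASCII payload: encode_7or8bit should choose 7bit.
ascii_cte = cte_after(encoders.encode_7or8bit, "plain ascii\n")
# Payload with a non-ASCII byte: encode_7or8bit should choose 8bit.
eight_cte = cte_after(encoders.encode_7or8bit, b"caf\xe9\n")
# encode_noop leaves the message alone, so no header is added.
noop_cte = cte_after(encoders.encode_noop, "untouched\n")
# encode_base64 always marks the part as base64.
b64_cte = cte_after(encoders.encode_base64, b"\x00\xff")

print(ascii_cte, eight_cte, noop_cte, b64_cte)
```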
The fix is to charset.py, which was not doing the encoding to the
correct output character set when doing a body_encode for either
the shift-jis or euc-jp charsets. There's also a fix for handling
a bytes input in encoders.py.
Patch by Michael Henry, comment changes by me.
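
A small sketch of the behavior this fix targets, assuming a current
Python 3 where both charsets map to an iso-2022-jp output charset: the
result of body_encode should be the text converted to the output character
set, which for iso-2022-jp is pure ASCII. The sample text is arbitrary.

```python
from email.charset import Charset

text = "文\n"
results = {}

for name in ("euc-jp", "shift_jis"):
    cs = Charset(name)
    # body_encode is expected to convert to the output charset, not the
    # input charset, before any transfer encoding is applied.
    results[name] = (cs.output_charset, cs.body_encode(text))

for name, (output_charset, body) in results.items():
    print(name, output_charset, repr(body))
```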
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r81685 | r.david.murray | 2010-06-04 12:11:08 -0400 (Fri, 04 Jun 2010) | 4 lines
#4768: store base64 encoded email body parts as text, not binary.
Patch and tests by Forest Bond.
........
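
For illustration, a sketch of what storing the encoded body as text means
in practice, assuming a current Python 3: a MIMEApplication part built from
bytes holds its base64 payload as str, so it can be flattened with
as_string(). The sample data is arbitrary.

```python
from email.mime.application import MIMEApplication

data = b"\xfa\xfb\xfc\xfd\xfe\xff"

# The default encoder for MIMEApplication is encode_base64.
part = MIMEApplication(data)

payload = part.get_payload()   # base64 text, not bytes
flat = part.as_string()        # flattening works because the payload is str

print(type(payload).__name__, part["Content-Transfer-Encoding"])
print(flat)
```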
svn+ssh://pythondev@svn.python.org/python/branches/py3k
................
r79996 | r.david.murray | 2010-04-12 10:48:58 -0400 (Mon, 12 Apr 2010) | 15 lines
Merged revisions 79994 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r79994 | r.david.murray | 2010-04-12 10:26:06 -0400 (Mon, 12 Apr 2010) | 9 lines
Issue #7472: ISO-2022 charsets now consistently use 7bit CTE.
Fixed a typo in the email.encoders module so that messages output using
an ISO-2022 character set will use a content-transfer-encoding of
7bit consistently. Previously if the input data had any eight bit
characters the output data would get marked as 8bit even though it
was actually 7bit.
........
................
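
A hedged sketch of the behavior described for Issue #7472, assuming a
current Python 3: a text part whose input contains non-ASCII characters but
whose output charset is iso-2022-jp should be labeled 7bit, since the
encoded form contains only ASCII escape sequences. The sample text is
arbitrary.

```python
from email.mime.text import MIMEText

# Non-ASCII input text, ISO-2022 output charset.
msg = MIMEText("こんにちは\n", _charset="iso-2022-jp")

cte = msg["Content-Transfer-Encoding"]
wire = msg.as_string()

print(cte)
print(wire.isascii())
```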
r80855 | r.david.murray | 2010-05-05 21:41:14 -0400 (Wed, 05 May 2010) | 24 lines
Merged revisions 80800 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
It turns out that email5 (py3k), because it is using unicode for the
payload, doesn't do the encoding to the output character set until later
in the process. Specifically, charset.body_encode no longer does the
input-to-output charset conversion. So the if test in the exception
clause in encoders.encode_7or8bit really is needed in email5.
So, this merge only merges the test, not the removal of the 'if'.
........
r80800 | r.david.murray | 2010-05-05 13:31:03 -0400 (Wed, 05 May 2010) | 9 lines
Issue #7472: remove unused code from email.encoders.encode_7or8bit.
Yukihiro Nakadaira noticed a typo in encode_7or8bit that was trying
to special case iso-2022 codecs. It turns out that the code in
question is never used, because whereas it was designed to trigger
if the payload encoding was eight bit but its output encoding was
7 bit, in practice the payload is always converted to the 7bit
encoding before encode_7or8bit is called. Patch by Shawat Anand.
........
................
MIMEApplication() requires a bytes object for its _data, so fix the tests.
We no longer need utils._identity() or utils._bdecode(). The former isn't
used anywhere AFAICT (where's "make test's" lint? <wink>) and the latter is a
kludge that is eliminated by base64.b64encode().
Current status: 5F/5E
number of tests, all because of the codecs/_multibytecodecs issue described
here (it's not a Py3K issue, just something Py3K discovers):
http://mail.python.org/pipermail/python-dev/2006-April/064051.html
Hye-Shik Chang promised to look for a fix, so no need to fix it here. The
tests that are expected to break are:
test_codecencodings_cn
test_codecencodings_hk
test_codecencodings_jp
test_codecencodings_kr
test_codecencodings_tw
test_codecs
test_multibytecodec
This merge fixes an actual test failure (test_weakref) in this branch,
though, so I believe merging is the right thing to do anyway.