Commit Graph

591 Commits

Author SHA1 Message Date
R David Murray 1424e7d688 Merge: #18437: fix comment typo. 2013-07-12 22:56:15 -04:00
R David Murray 037f65841c #18437: fix comment typo. 2013-07-12 22:55:43 -04:00
R David Murray 1f9d24a18d Merge: #18431: Decode encoded words in atoms in new email parser. 2013-07-12 16:01:10 -04:00
R David Murray 923512f327 #18431: Decode encoded words in atoms in new email parser.
There is more to be done here in terms of accepting RFC invalid
input that some mailers accept, but this covers the valid
RFC places where encoded words can occur in structured headers.
2013-07-12 16:00:28 -04:00
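Illustration (not part of the commit): a minimal sketch of what decoding an encoded word inside a structured header looks like from the caller's side, assuming a current Python 3 where `email.policy.default` selects the new parser. The address and display name are made-up sample data.

```python
from email import message_from_string, policy

# Hypothetical sample message: the display name in the To header is an
# RFC 2047 encoded word sitting in an atom position of a structured header.
raw = "To: =?utf-8?q?=C3=89ric?= <eric@example.com>\n\nbody\n"

# policy.default routes header parsing through the new email parser.
msg = message_from_string(raw, policy=policy.default)

# The parsed header exposes structured Address objects.
address = msg["To"].addresses[0]
print(address.display_name)
print(address.addr_spec)
```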
R David Murray 63194a774e Merge: #18044: Fix parsing of encoded words of the form =?utf8?q?=XX...?= 2013-07-11 15:58:07 -04:00
R David Murray 65171b28e7 #18044: Fix parsing of encoded words of the form =?utf8?q?=XX...?=
The problem was I was only checking for decimal digits after the third '?',
not for *hex* digits :(.

This changeset also fixes a couple of comment typos, deletes an unused
function relating to encoded word parsing, and removes an invalid
'if' test from the folding function that was revealed by the tests
written to validate this issue.
2013-07-11 15:52:57 -04:00
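Illustration (not part of the commit): a small sketch of the input shape named in the subject line, an encoded word whose payload starts with `=XX` where `XX` contains hex letters. It assumes a current Python 3 and the `email.policy.default` policy; the subject text is made-up sample data.

```python
from email import message_from_string, policy

# Hypothetical sample: the encoded text begins with "=C3", so the characters
# after the third '?' are hex digits, not only decimal digits.
raw = "Subject: =?utf8?q?=C3=A9t=C3=A9?=\n\nbody\n"

msg = message_from_string(raw, policy=policy.default)

# With the new parser the header value is returned already decoded.
subject = str(msg["Subject"])
print(subject)
```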
Ezio Melotti e0a39de647 #18380: merge with 3.3. 2013-07-06 17:17:45 +02:00
Ezio Melotti 2a99d5df63 #18380: pass regex flags to the right argument. Patch by Valentina Mukhamedzhanova. 2013-07-06 17:16:04 +02:00
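Illustration (not part of the commit): the log does not say which call was affected, so this is only a generic sketch of the class of bug the subject line describes. In `re.sub`, the fourth positional parameter is `count`, so flags passed there are silently treated as a substitution limit.

```python
import re

text = "Hello\nhello"
flags = re.IGNORECASE | re.MULTILINE

# Buggy form: the flags land in the 'count' parameter, so the pattern is
# compiled without IGNORECASE/MULTILINE and nothing matches here.
buggy = re.sub(r"^hello", "X", text, flags)

# Fixed form: pass the flags by keyword so they reach the right argument.
fixed = re.sub(r"^hello", "X", text, flags=flags)

print(repr(buggy))
print(repr(fixed))
```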
R David Murray c723da361a Merge #14360: make encoders.encode_quopri work. 2013-06-27 18:38:36 -04:00
R David Murray f6069f9f22 #14360: make encoders.encode_quopri work.
There were no tests for the encoders module.  encode_base64 worked
because it is the default and so got tested implicitly elsewhere, and
we use encode_7or8bit internally, so that worked, too.  I previously
fixed encode_noop, so this fix means that everything in the encoders
module now works, hopefully correctly.  Also added an explicit test
for encode_base64.
2013-06-27 18:37:00 -04:00
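Illustration (not part of the commit): a minimal usage sketch of `email.encoders.encode_quopri` on a current Python 3. The payload string is made-up sample data; the check at the end is a round trip through `get_payload(decode=True)`.

```python
from email import encoders
from email.message import Message

original = "hello world = test"

msg = Message()
msg.set_payload(original)

# Re-encode the payload as quoted-printable and set the matching
# Content-Transfer-Encoding header on the message.
encoders.encode_quopri(msg)

encoded = msg.get_payload()              # quoted-printable text
decoded = msg.get_payload(decode=True)   # bytes after undoing the encoding

print(msg["Content-Transfer-Encoding"])
print(encoded)
print(decoded)
```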
R David Murray b83ee30fc1 #11454: Reduce email module load time, improve surrogate check efficiency.
The new _has_surrogates code was suggested by Serhiy Storchaka.  See
the issue for timings, but it is far faster than any other alternative,
and also removes the load time that we previously incurred from compiling
the complex regex this replaces.
2013-06-26 12:06:21 -04:00
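Illustration (not part of the commit): a sketch of one regex-free way to detect surrogate-escaped characters, namely attempting a strict encode and catching the failure. The function name `has_surrogates` is illustrative; whether this matches the committed `_has_surrogates` line for line is an assumption.

```python
def has_surrogates(s):
    """Return True if s contains lone surrogate code points.

    A strict UTF-8 encode refuses lone surrogates, so a failed encode is
    used as the signal. No regular expression has to be compiled.
    """
    try:
        s.encode()
        return False
    except UnicodeEncodeError:
        return True


# Bytes that are not valid ASCII become lone surrogates under surrogateescape.
escaped = b"caf\xe9".decode("ascii", "surrogateescape")

print(has_surrogates("plain text"))
print(has_surrogates(escaped))
```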
Victor Stinner 765531d2d0 Issue #17516: use comment syntax for comments, instead of multiline string 2013-03-26 01:11:54 +01:00
R David Murray 2fab35877d Add missing FeedParser and BytesFeedParser to email.parser.__all__. 2013-03-15 21:00:48 -04:00
R David Murray 5efee58014 Merge: #17431: Fix missing import of BytesFeedParser in email.parser. 2013-03-15 20:45:11 -04:00
R David Murray 8093d6f822 Merge: #17431: Fix missing import of BytesFeedParser in email.parser. 2013-03-15 20:42:29 -04:00
R David Murray 612528d95d #17431: Fix missing import of BytesFeedParser in email.parser.
Initial patch contributed by Edmond Burnett.
2013-03-15 20:38:15 -04:00
Terry Jan Reedy 8b53559a89 Merge with 3.3, issue #17047: remove doubled words added in 3.3,
as reported by Serhiy Storchaka and Matthew Barnett.
2013-03-11 18:36:38 -04:00
Terry Jan Reedy 0f84764a09 Issue #17047: remove doubled words added in 3.3
as reported by Serhiy Storchaka and Matthew Barnett.
2013-03-11 18:34:00 -04:00
R David Murray 857b24b0d5 Merge: PEP8 fixup on previous patch, remove unused imports in test_email. 2013-03-07 18:17:19 -05:00
R David Murray 965794ed58 Merge: PEP8 fixup on previous patch, remove unused imports in test_email. 2013-03-07 18:16:47 -05:00
R David Murray b9534f4ed5 PEP8 fixup on previous patch, remove unused import in test_email. 2013-03-07 18:15:13 -05:00
R David Murray 2e78cd9b5e Merge: #14645: Generator now emits correct linesep for all parts.
Previously the parts of the message retained whatever linesep they had on
read, which means if the messages weren't read in universal newline mode, the
line endings could well be inconsistent.  In general sending it via smtplib
would result in them getting fixed, but it is better to generate them
correctly to begin with.  Also, the new send_message method of smtplib does
not do the fixup, so that method is producing rfc-invalid output without this
fix.
2013-03-07 17:31:21 -05:00
R David Murray addb0be63e Merge: #14645: Generator now emits correct linesep for all parts.
Previously the parts of the message retained whatever linesep they had on
read, which means if the messages weren't read in universal newline mode, the
line endings could well be inconsistent.  In general sending it via smtplib
would result in them getting fixed, but it is better to generate them
correctly to begin with.  Also, the new send_message method of smtplib does
not do the fixup, so that method is producing rfc-invalid output without this
fix.
2013-03-07 16:43:58 -05:00
R David Murray e67c6c545b #14645: Generator now emits correct linesep for all parts.
Previously the parts of the message retained whatever linesep they had on
read, which means if the messages weren't read in universal newline mode, the
line endings could well be inconsistent.  In general sending it via smtplib
would result in them getting fixed, but it is better to generate them
correctly to begin with.  Also, the new send_message method of smtplib does
not do the fixup, so that method is producing rfc-invalid output without this
fix.
2013-03-07 16:38:03 -05:00
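Illustration (not part of the commit): a minimal sketch of the behaviour described above, assuming a current Python 3. A message whose body mixes `\r\n` and `\n` is flattened with an explicit `linesep`, and every line ending in the output is then checked. The message text is made-up sample data.

```python
import io
from email import message_from_string
from email.generator import Generator

# Hypothetical input with inconsistent line endings in the body.
raw = "Subject: t\n\nline one\r\nline two\n"
msg = message_from_string(raw)

buf = io.StringIO()
# Ask the generator to emit CRLF for every part of the message.
Generator(buf).flatten(msg, linesep="\r\n")
out = buf.getvalue()

# After removing all CRLF pairs, no stray CR or LF should remain.
leftover = out.replace("\r\n", "")
print(repr(out))
print("\n" in leftover or "\r" in leftover)
```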
R David Murray 2940e71add #15220: simplify and speed up feedparser's line splitting.
Original patch submitted by QNX, modified for clarity by me (mostly comments).
QNX reports a 30% speed up in average email parsing time.
2013-02-13 21:17:13 -05:00
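Illustration (not part of the commit): a simplified sketch of chunked line splitting with a carried-over partial line. It is not the feedparser code; the class name `LineBuffer` and its methods are hypothetical. It assumes text input and relies on `str.splitlines(True)`, which also splits on a few less common separators beyond CR and LF.

```python
class LineBuffer:
    """Collect complete lines from data that arrives in arbitrary chunks."""

    def __init__(self):
        self._partial = ""   # text seen so far that is not yet a full line
        self.lines = []      # complete lines, line endings preserved

    def push(self, data):
        data = self._partial + data
        self._partial = ""
        parts = data.splitlines(True)
        # Hold back the last piece unless it ends in LF. This also covers a
        # trailing CR, whose matching LF may arrive in the next chunk.
        if parts and not parts[-1].endswith("\n"):
            self._partial = parts.pop()
        self.lines.extend(parts)

    def close(self):
        # Flush whatever is left as a final, possibly unterminated line.
        if self._partial:
            self.lines.append(self._partial)
            self._partial = ""


buf = LineBuffer()
for chunk in ("one\r", "\ntwo\nthr", "ee"):
    buf.push(chunk)
buf.close()
print(buf.lines)
```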
R David Murray 64634eb321 Merge: #17171: fix email.encoders.encode_7or8bit when applied to binary data. 2013-02-11 10:54:22 -05:00
R David Murray 66383b2e0a Merge: #17171: fix email.encoders.encode_7or8bit when applied to binary data. 2013-02-11 10:53:35 -05:00
R David Murray ec317a8985 #17171: fix email.encoders.encode_7or8bit when applied to binary data. 2013-02-11 10:51:28 -05:00
R David Murray c9b4e60683 Merge: #16564: Fix regression in use of encoders.encode_noop with binary data. 2013-02-09 13:13:14 -05:00
R David Murray 6cb1d67eb3 Merge: #16564: Fix regression in use of encoders.encode_noop with binary data. 2013-02-09 13:10:54 -05:00
R David Murray ceaa8b1d75 #16564: Fix regression in use of encoders.encode_noop with binary data. 2013-02-09 13:02:58 -05:00
R David Murray c44911f49a Merge: #16948: Fix quopri encoding of non-latin1 character sets. 2013-02-05 11:34:39 -05:00
R David Murray e201e9d584 Merge: #16948: Fix quopri encoding of non-latin1 character sets. 2013-02-05 10:55:27 -05:00
R David Murray f581b37200 #16948: Fix quopri encoding of non-latin1 character sets. 2013-02-05 10:49:49 -05:00
R David Murray 2c4f6e8693 Merge #16811: Fix folding of headers with no value in provisional policies. 2013-02-04 15:25:06 -05:00
R David Murray 844b0e6971 #16811: Fix folding of headers with no value in provisional policies. 2013-02-04 15:22:53 -05:00
Andrew Svetlov a191959849 Issue #16714: use 'raise' exceptions, don't 'throw'.
Patch by Serhiy Storchaka.
2012-12-18 21:27:16 +02:00
Andrew Svetlov 5b89840d9c Issue #16714: use 'raise' exceptions, don't 'throw'.
Patch by Serhiy Storchaka.
2012-12-18 21:26:36 +02:00
Andrew Svetlov 737fb89dd1 Issue #16714: use 'raise' exceptions, don't 'throw'.
Patch by Serhiy Storchaka.
2012-12-18 21:14:22 +02:00
Philip Jenvey 4993cc0a5b utilize yield from 2012-10-01 12:53:43 -07:00
Georg Brandl 1aca31e8f3 Closes #15925: fix regression in parsedate() and parsedate_tz() that should return None if unable to parse the argument. 2012-09-22 09:03:56 +02:00
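Illustration (not part of the commit): a short sketch of the contract restored here, assuming a current Python 3. Both functions return `None` for input they cannot parse, so callers can test for that instead of catching an exception. The date strings are sample data.

```python
from email.utils import parsedate, parsedate_tz

# Unparseable input yields None from both functions.
bad = parsedate("not a date")
bad_tz = parsedate_tz("not a date")

# A well-formed RFC 2822 date yields a tuple; for parsedate_tz the last
# item is the UTC offset in seconds.
good_tz = parsedate_tz("Sat, 22 Sep 2012 09:03:56 +0200")

print(bad, bad_tz)
print(good_tz)
```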
R David Murray ad2a7d528a Merge #15249: Mangle From lines correctly when body contains invalid bytes.
Fix by Colin Su.  Test by me, based on a test written by Petri Lehtinen.
2012-08-24 11:23:50 -04:00
R David Murray 638d40b433 #15249: Mangle From lines correctly when body contains invalid bytes.
Fix by Colin Su.  Test by me, based on a test written by Petri Lehtinen.
2012-08-24 11:14:13 -04:00
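Illustration (not part of the commit): a minimal sketch of From-line mangling on a body that also contains a byte that is not valid ASCII, assuming a current Python 3. The message bytes are made-up sample data.

```python
import io
from email import message_from_bytes
from email.generator import BytesGenerator

# Hypothetical message: a body line starting with "From " plus a raw 0xFF byte.
raw = b"Subject: t\n\nFrom here\n\xff\n"
msg = message_from_bytes(raw)

buf = io.BytesIO()
# mangle_from_=True asks the generator to escape body lines that start
# with "From " by prefixing them with ">".
BytesGenerator(buf, mangle_from_=True).flatten(msg)
out = buf.getvalue()

print(out)
```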
Alexander Belopolsky f9bd9141c5 Issue #665194: Added a small optimization 2012-08-22 23:02:36 -04:00
R David Murray 097a1208bc #665194: fix variable name in exception code path.
It was correct in the original patch and I foobared it
when I restructured part of the code.
2012-08-22 21:52:31 -04:00
R David Murray b8687df653 #665194: Update email.utils.localtime to use astimezone, and fix bug.
The new code correctly handles historic changes in UTC offsets.
A test for this should follow.

Original patch by Alexander Belopolsky.
2012-08-22 21:34:00 -04:00
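Illustration (not part of the commit): a small sketch of the `astimezone` approach the subject line mentions, assuming a current Python 3. An aware UTC datetime is converted to local time both through `email.utils.localtime` and through `datetime.astimezone()` directly; both results denote the same instant. The timestamp is sample data.

```python
import datetime
from email.utils import localtime

utc_dt = datetime.datetime(2012, 8, 22, 12, 0, tzinfo=datetime.timezone.utc)

# email.utils.localtime returns an aware datetime in the local zone.
local = localtime(utc_dt)

# The same conversion with the standard datetime method.
direct = utc_dt.astimezone()

print(local.isoformat())
print(direct.isoformat())
```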
R David Murray 970bef295d Merge #15232: correctly mangle From lines in MIME preamble and epilogue 2012-07-22 21:53:54 -04:00
R David Murray 6a31bc6d81 #15232: correctly mangle From lines in MIME preamble and epilogue 2012-07-22 21:47:53 -04:00
R David Murray 97f43c019f #15160: Extend the new email parser to handle MIME headers.
This code passes all the same tests that the existing RFC mime header
parser passes, plus a bunch of additional ones.

There are a couple of commented out tests where there are issues with the
folding.  The folding doesn't normally get invoked for headers parsed from
source, and the cases are marginal anyway (headers with invalid binary data)
so I'm not worried about them, but will fix them after the beta.

There are things that can be done to make this API even more convenient, but I
think this is a solid foundation worth having.  And the parser is a full RFC
parser, so it handles cases that the current parser doesn't.  (There are also
probably cases where it fails when the current parser doesn't, but I haven't
found them yet ;)

Oh, yeah, and there are some really ugly bits in the parser for handling some
'postel' cases that are unfortunately common.

I hope/plan to eventually refactor a lot of the code in the parser, which
should reduce the line count...but there is no escaping the fact that the
error recovery is a welter of special cases.
2012-06-24 05:03:27 -04:00
Alexander Belopolsky 76935b9c8c Issue #14653: email.utils.mktime_tz() no longer relies on system
mktime() when timezone offset is supplied.
2012-06-21 20:48:23 -04:00
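Illustration (not part of the commit): a sketch of computing a POSIX timestamp from a parsed date with an explicit offset without touching the system `mktime()`, assuming a current Python 3. The manual computation with `calendar.timegm` is shown for comparison and is an assumption about the approach, not a quote of the committed code.

```python
import calendar
from email.utils import mktime_tz, parsedate_tz

parsed = parsedate_tz("Thu, 21 Jun 2012 20:48:23 -0400")

# Library result: seconds since the epoch, in UTC.
timestamp = mktime_tz(parsed)

# Manual equivalent: treat the fields as UTC, then subtract the offset
# (in seconds) that parsedate_tz stored as the last tuple item.
manual = calendar.timegm(parsed[:6]) - parsed[9]

# 20:48:23 at UTC-4 is 00:48:23 UTC on the following day.
expected = calendar.timegm((2012, 6, 22, 0, 48, 23))

print(timestamp, manual, expected)
```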