cpython

Commit Graph

Author	SHA1	Message	Date
Barry Warsaw	15aefa94d0	Fixing some RFC 2231 related issues as reported in the Spambayes project, and with assistance from Oleg Broytmann. Specifically, get_param(), get_params(): Document that these methods may return parameter values that are either strings, or 3-tuples in the case of RFC 2231 encoded parameters. The application should be prepared to deal with such return values. get_boundary(): Be prepared to deal with RFC 2231 encoded boundary parameters. It makes little sense to have boundaries that are anything but ascii, so if we get back a 3-tuple from get_param() we will decode it into ascii and let any failures percolate up. get_content_charset(): New method which treats the charset parameter just like the boundary parameter in get_boundary(). Note that "get_charset()" was already taken to return the default Charset object. get_charsets(): Rewrite to use get_content_charset().	2002-09-26 17:19:34 +00:00
Barry Warsaw	6f30a8ab62	__version__: Bump to 2.4. Move the imports of Parser and Message inside the message_from_string() and message_from_file() functions. This way just "import email" won't suck in most of the submodules of the package. Note: this will break code that relied on "import email" giving you a bunch of the submodules, but that was never documented and should not have been relied on.	2002-09-25 22:07:50 +00:00
Barry Warsaw	40363b63f0	Open the test files in binary mode so the \r\n files won't cause failures on Windows. Closes SF bug # 609988.	2002-09-18 22:17:57 +00:00
Barry Warsaw	78170048f9	Bump to 2.3.1 to pick up the missing file.	2002-09-12 03:44:50 +00:00
Barry Warsaw	fbcde75c70	get_payload(): Document that calling it with no arguments returns a reference to the payload.	2002-09-11 14:11:35 +00:00
Barry Warsaw	bc6edac8df	test_utils_quote_unquote(): Test for unquote() properly de-backslash-ifying.	2002-09-11 02:31:24 +00:00
Barry Warsaw	184d55a897	rfc822.unquote() doesn't properly de-backslash-ify in Python prior to 2.3. This patch (adapted from Quinn Dunkan's SF patch #573204) fixes the problem and should get ported to rfc822.py.	2002-09-11 02:22:48 +00:00
Barry Warsaw	034b47acfe	_parsebody(): Instead of raising a BoundaryError when no start boundary could be found -- in a lax parser -- the entire body is assigned to the message payload.	2002-09-10 16:14:56 +00:00
Barry Warsaw	b1c1de3805	Import _isstring() from the compatibility layer. _handle_text(): Use _isstring() for stringiness test. _handle_multipart(): Add a test before the ListType test, checking for stringiness of the payload. String payloads for multitypes means a message with broken MIME chrome was parsed by a lax parser. Instead of raising a BoundaryError in those cases, the entire body is assigned to the message payload (but since the content type is still multipart/*, the Generator needs to be updated too).	2002-09-10 16:13:45 +00:00
Barry Warsaw	356afac41f	_isstring(): Factor out "stringiness" test, e.g. for StringType or UnicodeType, which is different between Python 2.1 and 2.2.	2002-09-10 16:09:06 +00:00
Barry Warsaw	45d9bde6c1	_ascii_split(): Don't lstrip continuation lines. Closes SF bug #601392 .	2002-09-10 15:57:29 +00:00
Barry Warsaw	24d45df3f2	test_splitting_first_line_only_is_long(): New test for SF bug #601392 , broken wrapping of long ASCII headers.	2002-09-10 15:46:44 +00:00
Barry Warsaw	dad90c202a	A sample message with broken MIME boundaries.	2002-09-10 15:43:30 +00:00
Barry Warsaw	e99e2f53e7	test_set_param(), test_del_param(): Test RFC 2231 encoding support by Oleg Broytmann in SF patch #600096. Whitespace normalized by Barry.	2002-09-06 03:56:26 +00:00
Barry Warsaw	3c25535dc8	_formatparam(), set_param(): RFC 2231 encoding support by Oleg Broytmann in SF patch #600096. Specifically, the former function now encodes the triplets, while the latter adds optional charset and language arguments.	2002-09-06 03:55:04 +00:00
Barry Warsaw	470288c54e	test_mondo_message(): "binary" is not a legal content type, so with the previous RFC 2045, $5.2 repair to get_content_type() this subpart's type will now be text/plain.	2002-09-06 03:41:27 +00:00
Barry Warsaw	58fb61cce5	test_replace_header(): New test for Message.replace_header().	2002-09-06 03:39:59 +00:00
Barry Warsaw	229727fa07	replace_header(): New method given by Skip Montanaro in SF patch #601959. Modified slightly by Barry (who added the KeyError in case the header is missing).	2002-09-06 03:38:12 +00:00
Barry Warsaw	a4ce1cf34c	_structure(): Use .get_content_type()	2002-09-01 21:04:43 +00:00
Barry Warsaw	1a1607546c	Whitespace normalization.	2002-08-27 22:38:50 +00:00
Barry Warsaw	48b0d36b4d	Typo	2002-08-27 22:34:44 +00:00
Tim Peters	280488b9a3	Whitespace normalization.	2002-08-23 18:19:30 +00:00
Barry Warsaw	4d5ef6aed6	Bump version number to 2.3	2002-08-20 14:51:34 +00:00
Barry Warsaw	3328136e3c	Added tests for SF patch #597593 , syntactically invalid Content-Type: headers.	2002-08-20 14:51:10 +00:00
Barry Warsaw	f36d804b3b	get_content_type(), get_content_maintype(), get_content_subtype(): RFC 2045, section 5.2 states that if the Content-Type: header is syntactically invalid, the default type should be text/plain. Implement minimal sanity checking of the header -- it must have exactly one slash in it. This closes SF patch #597593 by Skip, but in a different way. Note that these methods used to raise ValueError for invalid ctypes, but now they won't.	2002-08-20 14:50:09 +00:00
Barry Warsaw	dfea3b3963	_dispatch(): Use get_content_maintype() and get_content_subtype() to get the MIME main and sub types, instead of getting the whole ctype and splitting it here. The two more specific methods now correctly implement RFC 2045, section 5.2.	2002-08-20 14:47:30 +00:00
Barry Warsaw	b404bb7813	test_three_lines(): Test case reported by Andrew McNamara. Works in email 2.2 but fails in email 1.0.	2002-08-20 12:54:07 +00:00
Barry Warsaw	9e4e050c59	Use full package paths in imports.	2002-07-23 20:35:58 +00:00
Barry Warsaw	10d0d595e0	Added a couple of more tests for Header charset handling.	2002-07-23 19:46:35 +00:00
Barry Warsaw	04f357cffe	Get rid of relative imports in all unittests. Now anything that imports e.g. test_support must do so using an absolute package name such as "import test.test_support" or "from test import test_support". This also updates the README in Lib/test, and gets rid of the duplicate data directory in Lib/test/data (replaced by Lib/email/test/data). Now Tim and Jack can have at it. :)	2002-07-23 19:04:11 +00:00
Barry Warsaw	92825a9a52	append(): Bite the bullet and let charset be the string name of a character set, which we'll convert to a Charset instance. Sigh.	2002-07-23 06:08:10 +00:00
Barry Warsaw	15d3739446	make_header(): Watch out for charset is None, which decode_header() will return as the charset if implicit us-ascii is used.	2002-07-23 04:29:54 +00:00
Tim Peters	53d019cf5a	Changed import from "from test.test_support import TestSkipped, run_unittest" to "from test_support import TestSkipped, run_unittest". Otherwise, if the Japanese codecs aren't installed, regrtest doesn't believe the TestSkipped exception raised by this test matches the "except (ImportError, test_support.TestSkipped), msg:" it's looking for, and reports the skip as a crash failure instead of as a skipped test. I suppose this will make it harder to run this test outside of regrtest, but under the assumption only Barry does that, better to make it skip cleanly for everyone else.	2002-07-21 06:06:30 +00:00
Barry Warsaw	190390b026	The email package's tests live much better in a subpackage (i.e. email.test), so move the guts of them here from Lib/test. The latter directory will retain stubs to run the email.test tests using Python's standard regression test. test_email_torture.py is a torture tester which will not run under Python's test suite because I don't want to commit megs of data to that project (it will fail cleanly there). When run under the mimelib project it'll stress test the package with megs of message samples collected from various locations in the wild.	2002-07-19 22:31:10 +00:00
Barry Warsaw	629038093c	The email package's tests live much better in a subpackage (i.e. email.test), so move the guts of them here from Lib/test. The latter directory will retain stubs to run the email.test tests using Python's standard regression test. test_email_torture.py is a torture tester which will not run under Python's test suite because I don't want to commit megs of data to that project (it will fail cleanly there). When run under the mimelib project it'll stress test the package with megs of message samples collected from various locations in the wild. email/test/data is a copy of Lib/test/data. The fate of the latter is still undecided.	2002-07-19 22:29:49 +00:00
Barry Warsaw	d8e8e54c2b	message_from_string(), message_from_file(): The consensus on the mimelib-devel list is that non-strict parsing should be the default. Make it so.	2002-07-19 22:26:01 +00:00
Barry Warsaw	bb26b4530b	Parser.__init__(): The consensus on the mimelib-devel list is that non-strict parsing should be the default. Make it so.	2002-07-19 22:25:34 +00:00
Barry Warsaw	c10686426e	To better support default content types, fix an API wart, and preserve backwards compatibility, we're silently deprecating get_type(), get_subtype() and get_main_type(). We may eventually noisily deprecate these. For now, we'll just fix a bug in the splitting of the main and subtypes. get_content_type(), get_content_maintype(), get_content_subtype(): New methods which replace the above. These /always/ return a content type string and do not take a failobj, because an email message always at least has a default content type. set_default_type(): Someday there may be additional default content types, so don't hard code an assertion about the value of the ctype argument.	2002-07-19 22:24:55 +00:00
Barry Warsaw	d43857455e	_structure(): Take an optional `fp' argument which would be the object to print>> the structure to. Defaults to sys.stdout.	2002-07-19 22:21:47 +00:00
Barry Warsaw	1cecdc6bcb	_dispatch(): Use the new Message.get_content_type() method as hashed out on the mimelib-devel list.	2002-07-19 22:21:02 +00:00
Barry Warsaw	7aeac9180e	Anthony Baxter's cleanup patch. Python project SF patch # 583190, quoting: in non-strict mode, messages don't require a blank line at the end with a missing end-terminator. A single newline is sufficient now. Handle trailing whitespace at the end of a boundary. Had to switch from using string.split() to re.split() Handle whitespace on the end of a parameter list for Content-type. Handle whitespace on the end of a plain content-type header. Specifically, get_type(): Strip the content type string. _get_params_preserve(): Strip the parameter names and values on both sides. _parsebody(): Lots of changes as described above, with some stylistic changes by Barry (who hopefully didn't screw things up ;).	2002-07-18 23:09:09 +00:00
Barry Warsaw	2d2fc229a0	Anthony Baxter's patch to expose the parser's `strict' flag in these convenience functions. Closes SF # 583188 (python project).	2002-07-18 21:29:17 +00:00
Barry Warsaw	4ef1c7d85b	_structure(): Don't get the whole Content-Type: header, just get the type with get_type().	2002-07-11 20:24:36 +00:00
Barry Warsaw	f488b2c6d5	_dispatch(): Comment improvements.	2002-07-11 18:48:40 +00:00
Barry Warsaw	8da39aa56a	make_header(): New function to take the output of decode_header() and create a Header instance. Closes feature request #539481. Header.__init__(): Allow the initial string to be omitted. __eq__(), __ne__(): Support rich comparisons for equality of Header instances with Header instances or strings. Also, update a bunch of docstrings.	2002-07-09 16:33:47 +00:00
Barry Warsaw	f6caeba03a	Anthony Baxter's patch for non-strict parsing. This adds a `strict' argument to the constructor -- defaulting to true -- which is different than Anthony's approach of using global state. parse(), parsestr(): Grow a `headersonly' argument which stops parsing once the header block has been seen, i.e. it does /not/ parse or even read the body of the message. This is used for parsing message/rfc822 type messages. We need test cases for the non-strict parsing. Anthony will supply these. _parsebody(): We can get rid of the isdigest end-of-line kludges, although we still need to know if we're parsing a multipart/digest so we can set the default type accordingly.	2002-07-09 02:50:02 +00:00
Barry Warsaw	a0c8b9d4d5	Add the concept of a "default type". Normally the default type is text/plain but the RFCs state that inside a multipart/digest, the default type is message/rfc822. To preserve idempotency, we need a separate place to define the default type than the Content-Type: header. get_default_type(), set_default_type(): Accessor and mutator methods for the default type.	2002-07-09 02:46:12 +00:00
Barry Warsaw	bb493a7039	__init__(): Don't attach the subparts if its an empty tuple. If the boundary was given in the arguments, call set_boundary().	2002-07-09 02:44:26 +00:00
Barry Warsaw	93c40f0c3a	clone(): A new method for creating a clone of this generator (for recursive generation). _dispatch(): If the message object doesn't have a Content-Type: header, check its default type instead of assuming it's text/plain. This makes for correct generation of message/rfc822 containers. _handle_multipart(): We can get rid of the isdigest kludge. Just print the message as normal and everything will work out correctly. _handle_multipart_digest(): We don't need this anymore either.	2002-07-09 02:43:47 +00:00
Barry Warsaw	ed53bdb02d	__init__(): Be sure to set the default type to message/rfc822.	2002-07-09 02:40:35 +00:00
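
Several of the commits above describe behavior that can be checked from the public API: get_content_type() falling back to text/plain for a missing or syntactically invalid Content-Type: header (f36d804b3b, c10686426e), and replace_header() raising KeyError when the header is absent (229727fa07). The following is a minimal sketch, assuming the email package shipped with a current Python 3 still behaves as those messages describe; it is not taken from the commits themselves, and the two sample messages are made up for illustration.

```python
from email import message_from_string

# Hypothetical sample: a Content-Type value with no slash is syntactically invalid.
broken = message_from_string("Content-Type: binary\n\nsome payload\n")

# Hypothetical sample: a message with no Content-Type: header at all.
bare = message_from_string("Subject: hello\n\nsome payload\n")

# Both cases are expected to report the default type rather than raise.
broken_type = broken.get_content_type()
broken_maintype = broken.get_content_maintype()
bare_type = bare.get_content_type()

# replace_header() swaps the value of an existing header in place ...
bare.replace_header("Subject", "goodbye")

# ... and raises KeyError when there is no such header to replace.
try:
    bare.replace_header("X-Missing", "value")
    missing_raised = False
except KeyError:
    missing_raised = True

print(broken_type, broken_maintype, bare_type, bare["Subject"], missing_raised)
```

The default-parser convenience function message_from_string() is used here because it is the entry point the commits themselves mention; a Parser instance could be used the same way.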

112 Commits