cpython

Commit Graph

Author	SHA1	Message	Date
Barry Warsaw	f36d804b3b	get_content_type(), get_content_maintype(), get_content_subtype(): RFC 2045, section 5.2 states that if the Content-Type: header is syntactically invalid, the default type should be text/plain. Implement minimal sanity checking of the header -- it must have exactly one slash in it. This closes SF patch #597593 by Skip, but in a different way. Note that these methods used to raise ValueError for invalid ctypes, but now they won't.	2002-08-20 14:50:09 +00:00
Barry Warsaw	dfea3b3963	_dispatch(): Use get_content_maintype() and get_content_subtype() to get the MIME main and sub types, instead of getting the whole ctype and splitting it here. The two more specific methods now correctly implement RFC 2045, section 5.2.	2002-08-20 14:47:30 +00:00
Barry Warsaw	b404bb7813	test_three_lines(): Test case reported by Andrew McNamara. Works in email 2.2 but fails in email 1.0.	2002-08-20 12:54:07 +00:00
Barry Warsaw	9e4e050c59	Use full package paths in imports.	2002-07-23 20:35:58 +00:00
Barry Warsaw	10d0d595e0	Added a couple of more tests for Header charset handling.	2002-07-23 19:46:35 +00:00
Barry Warsaw	04f357cffe	Get rid of relative imports in all unittests. Now anything that imports e.g. test_support must do so using an absolute package name such as "import test.test_support" or "from test import test_support". This also updates the README in Lib/test, and gets rid of the duplicate data directory in Lib/test/data (replaced by Lib/email/test/data). Now Tim and Jack can have at it. :)	2002-07-23 19:04:11 +00:00
Barry Warsaw	92825a9a52	append(): Bite the bullet and let charset be the string name of a character set, which we'll convert to a Charset instance. Sigh.	2002-07-23 06:08:10 +00:00
Barry Warsaw	15d3739446	make_header(): Watch out for charset is None, which decode_header() will return as the charset if implicit us-ascii is used.	2002-07-23 04:29:54 +00:00
Tim Peters	53d019cf5a	Changed import from "from test.test_support import TestSkipped, run_unittest" to "from test_support import TestSkipped, run_unittest". Otherwise, if the Japanese codecs aren't installed, regrtest doesn't believe the TestSkipped exception raised by this test matches the "except (ImportError, test_support.TestSkipped), msg:" it's looking for, and reports the skip as a crash failure instead of as a skipped test. I suppose this will make it harder to run this test outside of regrtest, but under the assumption only Barry does that, better to make it skip cleanly for everyone else.	2002-07-21 06:06:30 +00:00
Barry Warsaw	190390b026	The email package's tests live much better in a subpackage (i.e. email.test), so move the guts of them here from Lib/test. The latter directory will retain stubs to run the email.test tests using Python's standard regression test. test_email_torture.py is a torture tester which will not run under Python's test suite because I don't want to commit megs of data to that project (it will fail cleanly there). When run under the mimelib project it'll stress test the package with megs of message samples collected from various locations in the wild.	2002-07-19 22:31:10 +00:00
Barry Warsaw	629038093c	The email package's tests live much better in a subpackage (i.e. email.test), so move the guts of them here from Lib/test. The latter directory will retain stubs to run the email.test tests using Python's standard regression test. test_email_torture.py is a torture tester which will not run under Python's test suite because I don't want to commit megs of data to that project (it will fail cleanly there). When run under the mimelib project it'll stress test the package with megs of message samples collected from various locations in the wild. email/test/data is a copy of Lib/test/data. The fate of the latter is still undecided.	2002-07-19 22:29:49 +00:00
Barry Warsaw	d8e8e54c2b	message_from_string(), message_from_file(): The consensus on the mimelib-devel list is that non-strict parsing should be the default. Make it so.	2002-07-19 22:26:01 +00:00
Barry Warsaw	bb26b4530b	Parser.__init__(): The consensus on the mimelib-devel list is that non-strict parsing should be the default. Make it so.	2002-07-19 22:25:34 +00:00
Barry Warsaw	c10686426e	To better support default content types, fix an API wart, and preserve backwards compatibility, we're silently deprecating get_type(), get_subtype() and get_main_type(). We may eventually noisily deprecate these. For now, we'll just fix a bug in the splitting of the main and subtypes. get_content_type(), get_content_maintype(), get_content_subtype(): New methods which replace the above. These /always/ return a content type string and do not take a failobj, because an email message always at least has a default content type. set_default_type(): Someday there may be additional default content types, so don't hard code an assertion about the value of the ctype argument.	2002-07-19 22:24:55 +00:00
Barry Warsaw	d43857455e	_structure(): Take an optional `fp' argument which would be the object to print>> the structure to. Defaults to sys.stdout.	2002-07-19 22:21:47 +00:00
Barry Warsaw	1cecdc6bcb	_dispatch(): Use the new Message.get_content_type() method as hashed out on the mimelib-devel list.	2002-07-19 22:21:02 +00:00
Barry Warsaw	7aeac9180e	Anthony Baxter's cleanup patch. Python project SF patch # 583190, quoting: in non-strict mode, messages don't require a blank line at the end with a missing end-terminator. A single newline is sufficient now. Handle trailing whitespace at the end of a boundary. Had to switch from using string.split() to re.split() Handle whitespace on the end of a parameter list for Content-type. Handle whitespace on the end of a plain content-type header. Specifically, get_type(): Strip the content type string. _get_params_preserve(): Strip the parameter names and values on both sides. _parsebody(): Lots of changes as described above, with some stylistic changes by Barry (who hopefully didn't screw things up ;).	2002-07-18 23:09:09 +00:00
Barry Warsaw	2d2fc229a0	Anthony Baxter's patch to expose the parser's `strict' flag in these convenience functions. Closes SF # 583188 (python project).	2002-07-18 21:29:17 +00:00
Barry Warsaw	4ef1c7d85b	_structure(): Don't get the whole Content-Type: header, just get the type with get_type().	2002-07-11 20:24:36 +00:00
Barry Warsaw	f488b2c6d5	_dispatch(): Comment improvements.	2002-07-11 18:48:40 +00:00
Barry Warsaw	8da39aa56a	make_header(): New function to take the output of decode_header() and create a Header instance. Closes feature request #539481. Header.__init__(): Allow the initial string to be omitted. __eq__(), __ne__(): Support rich comparisons for equality of Header instances with Header instances or strings. Also, update a bunch of docstrings.	2002-07-09 16:33:47 +00:00
Barry Warsaw	f6caeba03a	Anthony Baxter's patch for non-strict parsing. This adds a `strict' argument to the constructor -- defaulting to true -- which is different than Anthony's approach of using global state. parse(), parsestr(): Grow a `headersonly' argument which stops parsing once the header block has been seen, i.e. it does /not/ parse or even read the body of the message. This is used for parsing message/rfc822 type messages. We need test cases for the non-strict parsing. Anthony will supply these. _parsebody(): We can get rid of the isdigest end-of-line kludges, although we still need to know if we're parsing a multipart/digest so we can set the default type accordingly.	2002-07-09 02:50:02 +00:00
Barry Warsaw	a0c8b9d4d5	Add the concept of a "default type". Normally the default type is text/plain but the RFCs state that inside a multipart/digest, the default type is message/rfc822. To preserve idempotency, we need a separate place to define the default type than the Content-Type: header. get_default_type(), set_default_type(): Accessor and mutator methods for the default type.	2002-07-09 02:46:12 +00:00
Barry Warsaw	bb493a7039	__init__(): Don't attach the subparts if it's an empty tuple. If the boundary was given in the arguments, call set_boundary().	2002-07-09 02:44:26 +00:00
Barry Warsaw	93c40f0c3a	clone(): A new method for creating a clone of this generator (for recursive generation). _dispatch(): If the message object doesn't have a Content-Type: header, check its default type instead of assuming it's text/plain. This makes for correct generation of message/rfc822 containers. _handle_multipart(): We can get rid of the isdigest kludge. Just print the message as normal and everything will work out correctly. _handle_mulitpart_digest(): We don't need this anymore either.	2002-07-09 02:43:47 +00:00
Barry Warsaw	ed53bdb02d	__init__(): Be sure to set the default type to message/rfc822.	2002-07-09 02:40:35 +00:00
Barry Warsaw	8fa06b55f6	_structure(): A handy little debugging aid that I don't (yet) intend to make public, but that others might still find useful.	2002-07-09 02:39:07 +00:00
Barry Warsaw	27b168ca7c	With the addition of Oleg's support for RFC 2231, it's time to bump the version number to 2.1.	2002-07-09 02:13:10 +00:00
Barry Warsaw	6ee7156996	append(): Clarify the expected type of charset.	2002-07-03 05:04:04 +00:00
Barry Warsaw	12566a8826	Oleg Broytmann's support for RFC 2231 encoded parameters, SF patch #549133 Specifically, decode_rfc2231(), encode_rfc2231(): Functions to encode and decode RFC 2231 style parameters. decode_params(): Function to decode a list of parameters.	2002-06-29 05:58:04 +00:00
Barry Warsaw	908dc4bea8	Oleg Broytmann's support for RFC 2231 encoded parameters, SF patch #549133 Specifically, _formatparam(): Teach this about encoded `param' arguments, which are a 3-tuple of items (charset, language, value). language is ignored. _unquotevalue(): Handle both 3-tuple RFC 2231 values and unencoded values. _get_params_preserve(): Decode the parameters before returning them. get_params(), get_param(): Use _unquotevalue(). get_filename(), get_boundary(): Teach these about encoded (3-tuple) parameters.	2002-06-29 05:56:15 +00:00
Barry Warsaw	8e69bdac33	__unicode__(): Patch # 541263 by Mikhail Zabaluev, implementation modified by Barry.	2002-06-29 03:26:58 +00:00
Barry Warsaw	ba2577b7f1	_max_append(): When adding the string `s' to its own line, it should be lstrip'd so that old continuation whitespace is replaced by that specified in Header's continuation_ws parameter.	2002-06-28 23:48:23 +00:00
Barry Warsaw	766125080f	Teach this class about "highest-level syntactic breaks" but only for headers with no charset or 'us-ascii' charsets. Actually this is only partially true: we know about semicolons (but not true parameters) and we know about whitespace (but not technically folding whitespace). Still it should be good enough for all practical purposes. Other changes include: __init__(): Add a continuation_ws argument, which defaults to a single space. Set this to change the whitespace used for continuation lines when a header must be split. Also, changed the way header line lengths are calculated, so that they take into account continuation_ws (when tabs-expanded) and any provided header_name parameter. This should do much better on returning split headers for which the first and subsequent lines must fit into a specified width. guess_maxlinelen(): Removed. I don't think we need this method as part of the public API. encode_chunks() -> _encode_chunks(): I don't think we need this one as part of the public API either.	2002-06-28 23:46:53 +00:00
Barry Warsaw	062749ac57	_split_header(): The code here was terminally broken because it didn't know anything about RFC 2047 encoded headers. Fortunately we have a perfectly good header splitter in Header.encode(). So we just call that to give us a properly formatted and split header. Header.encode() didn't know about "highest-level syntactic breaks" but that's been fixed now too.	2002-06-28 23:41:42 +00:00
Barry Warsaw	69e18af968	_parsebody(): Fix for the new message/rfc822 tree structure (the parent is now a multipart with one element, the sub-message object).	2002-06-02 19:12:03 +00:00
Barry Warsaw	d2b2e533c0	header_encode(), encode(): Use _floordiv() from the appropriate compatibility module.	2002-06-02 19:08:31 +00:00
Barry Warsaw	21f77ac0bc	Use absolute import paths for intrapackage imports.	2002-06-02 19:07:16 +00:00
Barry Warsaw	8ba76e8929	Use absolute import paths for intrapackage imports. as_string(): Use Generator.flatten() for better performance.	2002-06-02 19:05:51 +00:00
Barry Warsaw	524af6f382	Use absolute import paths for intrapackage imports. Use MIMENonMultipart as the base class so that you can't attach() to these non-multipart message types.	2002-06-02 19:05:08 +00:00
Barry Warsaw	7dc865ad72	flatten(): Renamed from __call__() which is (silently) deprecated. __call__() can be 2-3x slower than the equivalent normal method. _handle_message(): The structure of message/rfc822 message has changed. Now parent's payload is a list of length 1, and the zeroth element is the Message sub-object. Adjust the printing of such message trees to reflect this change.	2002-06-02 19:02:37 +00:00
Barry Warsaw	ff49279f7c	_intdiv2() -> _floordiv(), merge of uncommitted changes.	2002-06-02 18:59:06 +00:00
Neal Norwitz	1fab9ee085	Get email test to pass. Barry, hope this is what you had in mind	2002-06-02 16:38:14 +00:00
Barry Warsaw	9d5e4aa414	Bump to version 2.0.5, and also use absolute import paths.	2002-06-01 06:03:09 +00:00
Barry Warsaw	2f514a806d	These two classes provide bases for more specific content type subclasses. MIMENonMultipart: Base class for non-multipart/* content type subclass specializations, e.g. image/gif. This class overrides attach() which raises an exception, since it makes no sense to attach a subpart to e.g. an image/gif message. MIMEMultipart: Base class for multipart/* content type subclass specializations, e.g. multipart/mixed. Does little more than provide a useful constructor.	2002-06-01 05:59:12 +00:00
Barry Warsaw	1c30aa2292	The _compat modules now export _floordiv() instead of _intdiv2() for better code reuse. _split() Use _floordiv().	2002-06-01 05:49:17 +00:00
Barry Warsaw	c5d1c045ab	Slightly better docstring	2002-06-01 05:45:37 +00:00
Barry Warsaw	bb98c8cff0	_is_unicode(): Use UnicodeType instead of the unicode builtin for Python 2.1 compatibility.	2002-06-01 03:56:07 +00:00
Guido van Rossum	ca948b40b4	Use floor division where appropriate.	2002-05-29 20:38:21 +00:00
Guido van Rossum	1a7ac359a0	Importing Charset should not fail when Unicode is disabled. (XXX Using Unicode-aware methods may still die with a NameError on unicode. Maybe there's a more elegant solution but I doubt anybody cares.)	2002-05-28 18:49:03 +00:00
Tim Peters	8ac1495a6a	Whitespace normalization.	2002-05-23 15:15:30 +00:00
Barry Warsaw	43193150ee	Bump to version 2.0.4	2002-05-22 01:52:33 +00:00
Barry Warsaw	4be9eccbc4	getaddresses(): Like the change in rfc822.py, this one needs to access the AddressList.addresslist attribute directly. Also, add a test case for the email.Utils.getaddresses() interface.	2002-05-22 01:52:10 +00:00
Barry Warsaw	7e21b6792b	I've thought about it some more, and I believe it is proper for the email package's Parser to handle the three common line endings. Certain protocols such as IMAP define CRLF line endings and it doesn't make sense for the client app to have to normalize the line endings before handing the message off to the Parser. _parsebody(): Be more flexible in the matching of line endings for finding the MIME separators. Accept any of \r, \n and \r\n. Note that we do /not/ change the line endings in the payloads, we just accept any of those three around MIME boundaries.	2002-05-19 23:51:50 +00:00
Barry Warsaw	812031b955	Fixed a bug in the splitting of lines, and improved the splitting for single byte character sets. Also fixed a semantic problem with the constructor's default arguments. Specifically, __init__(): Change the maxlinelen argument default to None instead of MAXLINELEN. The semantics should have been (and now are) that if maxlinelen is given it is always honored. If it isn't given, but header_name is given, then the maximum line length is calculated. If neither are given then the default 76 characters is used. _split(): If the character set is a single byte character set then we can split the line at the maxlinelen because we know that encoding the header won't increase its length. If the charset isn't a single byte charset then we use the quicker divide-and-conquer line splitting algorithm as before.	2002-05-19 23:47:53 +00:00
Barry Warsaw	8c1aac2476	Complete a merge of the mimelib project and the Python cvs codebases for the email package. The former is now just a shell project that has some extra files for packaging for independent use (e.g. setup.py and README). Added a compatibility layer so that the same API can be used in Python 2.1 and 2.2/2.3 with the major differences shuffled off into helper modules (_compat21.py and _compat22.py). Also bumped the package version number to 2.0.3 for some fixes to be checked in momentarily.	2002-05-19 23:44:19 +00:00
Barry Warsaw	24fd0252c4	parseaddr(): Don't use rfc822.parseaddr() because this now implies a double call to AddressList.getaddrlist(), and /that/ always returns an empty list for the second and subsequent calls. Instead, instantiate an AddressList directly, and get the parsed addresses out of the addresslist attribute.	2002-04-15 22:00:25 +00:00
Barry Warsaw	e1df15c401	AddrlistClass -> AddressList	2002-04-12 20:50:05 +00:00
Barry Warsaw	409a4c08b5	Sync'ing with standalone email package 2.0.1. This adds support for non-us-ascii character sets in headers and bodies. Some API changes (with DeprecationWarnings for the old APIs). Better RFC-compliant implementations of base64 and quoted-printable. Updated test cases. Documentation updates to follow (after I finish writing them ;).	2002-04-10 21:01:31 +00:00
Barry Warsaw	5833baa309	Removed two unused imports. Closes patch #525225 . 2.2.1 candidate (but not terribly important).	2002-03-03 22:46:46 +00:00
Barry Warsaw	15e9dc9eac	_parsebody(): When adding subparts to a multipart container, make sure that the first subpart added makes the payload a list object. Otherwise, a multipart/* with only one subpart will not have the proper structure.	2002-01-27 06:48:02 +00:00
Barry Warsaw	c44d2c52c9	decode(), encode(): Accepting the minor optimizations from SF patch #486375, but not the rest of it, since that changes the documented semantics of encode().	2001-12-03 19:26:40 +00:00
Barry Warsaw	bf7c52c233	More typo fixes.	2001-11-24 16:56:56 +00:00
Greg Ward	6253c2dd40	Docstring typo fix.	2001-11-24 15:49:53 +00:00
Barry Warsaw	e5739a69a7	formatdate(): Jason Mastaler correctly points out that divmod with a negative modulus won't return the right values. So always do positive modulus on an absolute value and twiddle the sign as appropriate after the fact.	2001-11-19 18:36:43 +00:00
Barry Warsaw	cd45a36959	formatdate(): The calculation of the minutes part of the zone was incorrect for "uneven" timezones. This algorithm should work for even timezones (e.g. America/New_York) and uneven timezones (e.g. Australia/Adelaide and America/St_Johns). Closes SF bug #483231.	2001-11-19 16:28:07 +00:00
Barry Warsaw	9aa6435398	Forgot to import time.	2001-11-09 17:45:48 +00:00
Barry Warsaw	9cff0e604a	formatdate(): A better docstring.	2001-11-09 17:07:28 +00:00
Barry Warsaw	aa79f4d492	formatdate(): An implementation to replace the one borrowed from rfc822.py. The old rfc822.formatdate() produced date strings using obsolete syntax. The new version produces the preferred RFC 2822 dates. Also, an optional argument `localtime' is added, which if true, produces a date relative to the local timezone, with daylight savings time properly taken into account.	2001-11-09 16:59:56 +00:00
Barry Warsaw	2a9e3852ee	walk(): Fix docstring; traversal is depth-first. Closes mimelib bug #477864.	2001-11-05 19:19:55 +00:00
Barry Warsaw	2539cf5aad	A fix for SF bug #472560 , extra newlines returned by get_param() when the separating semi-colon shows up on a continuation line (legal, but weird). Bug reported and fixed by Matthew Cowles. Test case and sample email included.	2001-10-25 22:43:46 +00:00
Barry Warsaw	856c32b5f4	Another merge from mimelib: _handle_multipart(): If there is an epilogue and the epilogue does not itself start with a newline, add a newline before writing the epilogue. Closes SF bug #472481.	2001-10-19 04:06:39 +00:00
Barry Warsaw	d1eeecbd43	Two merges from the mimelib project: _split_header(): Split on folding whitespace if the attempt to split on semi-colons failed. _split_header(): Patch by Matthew Cowles for fixing SF bug # 471918, Generator splitting long headers.	2001-10-17 20:51:42 +00:00
Barry Warsaw	0164b6bf22	typed_subpart_iterator(): When getting the main type use 'text' as the failobj, and when getting the subtype use 'plain' as the failobj. text/plain is supposed to be the default if the message contains no Content-Type: header.	2001-10-15 04:38:22 +00:00
Barry Warsaw	e552882960	HeaderParser: A new subclass of Parser which only parses the message headers. It does not parse the body of the message, instead simply assigning it as a string to the container's payload. This can be much faster when you're only interested in a message's header.	2001-10-11 15:43:00 +00:00
Barry Warsaw	53f8fe4232	An audio/* class, like MIMEImage, contributed by Anthony Baxter. Rewritten for style and the email package naming conventions by Barry.	2001-10-09 19:41:18 +00:00
Barry Warsaw	2ae0b0163a	Fix __all__ to the current list of exported modules (must pass the tests in test_email.py).	2001-10-09 19:14:59 +00:00
Barry Warsaw	9300a75c88	get_all(): We never returned failobj if we found no matching headers. Fix that, and also make the docstring describe failobj.	2001-10-09 15:48:29 +00:00
Barry Warsaw	e968ead1dd	Give me back my page breaks.	2001-10-04 17:05:11 +00:00
Tim Peters	527e64fd68	Whitespace normalization.	2001-10-04 05:36:56 +00:00
Barry Warsaw	66971fbca5	_parsebody(): Use get_boundary() and get_type(). Also, add a clause to the big-if to handle message/delivery-status content types. These create a message with subparts that are Message instances, which best represent the header blocks of this content type.	2001-09-26 05:44:09 +00:00
Barry Warsaw	beb5945c65	has_key(): Implement in terms of get(). get_type(): Use a compiled regular expression, which can be shared. _get_params_preserve(): A helper method which extracts the header's parameter list preserving value quoting. I'm not sure that this needs to be a public method. It's necessary because we want get_param() and friends to return the unquoted parameter value, however we want the quote-preserved form for set_boundary(). get_params(), get_param(), set_boundary(): Implement in terms of _get_params_preserve(). walk(): Yield ourself first, then recurse over our subparts (if any).	2001-09-26 05:41:51 +00:00
Barry Warsaw	76fac8ed0e	__init__(): Arguments major renamed to maintype and minor renamed to subtype for consistency with the rest of the package.	2001-09-26 05:36:36 +00:00
Barry Warsaw	57758e3a02	Updated docstrings. Also, typed_subpart_iterator(): Arguments major renamed to maintype and minor renamed to subtype for consistency with the rest of the package.	2001-09-26 05:35:47 +00:00
Barry Warsaw	3dd978dfff	Image.py and class Image => MIMEImage.py and MIMEImage Text.py and class Text => MIMEText.py and MIMEText MessageRFC822.py and class MessageRFC822 => MIMEMessage.py and MIMEMessage These are renamed so as to be more consistent; these are MIME specific derived classes for when creating the object model out of whole cloth.	2001-09-26 05:34:30 +00:00
Barry Warsaw	b384e01796	In class Generator: _handle_text(): If the payload is None, then just return (i.e. don't write anything). Subparts of message/delivery-status types will have this property since they are just blocks of headers. Also, when raising the TypeError, include the type of the payload in the error message. _handle_multipart(), _handle_message(): When creating a clone of self, pass in our _mangle_from_ and maxheaderlen flags so the clone has the same behavior. _handle_message_delivery_status(): New method to do the proper printing of message/delivery-status type messages. These have to be handled differently than other message/* types because their payloads are subparts containing just blocks of headers. In class DecodedGenerator: _dispatch(): Skip over multipart/* messages since we don't care about them, and don't want the non-text format to appear in the printed results.	2001-09-26 05:32:41 +00:00
Barry Warsaw	6f70c41923	cosmetic	2001-09-26 05:26:22 +00:00
Barry Warsaw	ba92580f01	The email package version 1.0, prototyped as mimelib <http://sf.net/projects/mimelib>. There /are/ API differences between mimelib and email, but most of the implementations are shared (except where cool Py2.2 stuff like generators are used).	2001-09-23 03:17:28 +00:00
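
Commits f36d804b3b and a0c8b9d4d5 in the log above describe how get_content_type() falls back to text/plain for a syntactically invalid Content-Type: header (anything without exactly one slash) and how a separate "default type" is kept apart from the header. The sketch below illustrates that behaviour. It assumes the email package bundled with a current Python 3 still behaves as those 2002 messages describe; content_type_of() is a hypothetical helper written for this illustration, not part of the package.

```python
from email import message_from_string
from email.message import Message


def content_type_of(header_block: str) -> str:
    """Parse a header block (plus a dummy body) and return its content type."""
    msg = message_from_string(header_block + "\n\nbody\n")
    return msg.get_content_type()


# A well-formed header comes back lowercased and without its parameters.
valid = content_type_of("Content-Type: Text/HTML; charset=utf-8")

# Zero slashes, or more than one: treated as invalid, so text/plain.
no_slash = content_type_of("Content-Type: bogus")
two_slashes = content_type_of("Content-Type: text/html/extra")

# No Content-Type: header at all: the message's default type is used.
missing = content_type_of("Subject: hello")

# The default type is stored separately from the header, so a container
# such as multipart/digest can change it for its subparts.
part = Message()
part.set_default_type("message/rfc822")
digest_default = part.get_content_type()
```

None of these calls raises ValueError, which matches the note in f36d804b3b that invalid content types no longer raise.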


688 Commits
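
Commits cd45a36959 and e5739a69a7 in the log above fix the zone part of formatdate(): minutes were wrong for "uneven" timezones, and divmod on a negative value does not give the hours and minutes one would expect. The following is only a sketch of the idea stated in e5739a69a7 (take the absolute value, then reapply the sign), not the code from those commits. Both function names are hypothetical, and the input is assumed to be an offset in seconds east of UTC.

```python
def naive_zone(offset_seconds: int) -> str:
    """Buggy variant: divmod on a negative value floors toward -infinity."""
    hours, minutes = divmod(offset_seconds // 60, 60)
    return f"{hours:+03d}{minutes:02d}"


def format_zone(offset_seconds: int) -> str:
    """Format a UTC offset as +HHMM / -HHMM, handling negative offsets."""
    # Work on the absolute value so divmod yields the true hours and
    # minutes, then put the sign back afterwards.
    sign = "-" if offset_seconds < 0 else "+"
    hours, minutes = divmod(abs(offset_seconds) // 60, 60)
    return f"{sign}{hours:02d}{minutes:02d}"


# Offsets roughly matching the zones named in cd45a36959 (standard time).
new_york = format_zone(-5 * 3600)            # even, negative
adelaide = format_zone(9 * 3600 + 30 * 60)   # uneven, positive
st_johns = format_zone(-(3 * 3600 + 30 * 60))  # uneven, negative
```

For the St. John's offset, naive_zone() yields "-0430" because -210 minutes floor-divides to -4 hours with a remainder of +30, while format_zone() yields "-0330".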