24 Commits

Author SHA1 Message Date
Barry Warsaw 15aefa94d0 Fixing some RFC 2231 related issues as reported in the Spambayes
project, and with assistance from Oleg Broytmann.  Specifically,

get_param(), get_params(): Document that these methods may return
parameter values that are either strings, or 3-tuples in the case of
RFC 2231 encoded parameters.  The application should be prepared to
deal with such return values.

get_boundary(): Be prepared to deal with RFC 2231 encoded boundary
parameters.  It makes little sense to have boundaries that are
anything but ascii, so if we get back a 3-tuple from get_param() we
will decode it into ascii and let any failures percolate up.

get_content_charset(): New method which treats the charset parameter
just like the boundary parameter in get_boundary().  Note that
"get_charset()" was already taken to return the default Charset
object.

get_charsets(): Rewrite to use get_content_charset().
2002-09-26 17:19:34 +00:00
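
As an illustration of the two return shapes this commit documents, here is a minimal sketch run against the email package of a current Python 3 standard library, not the 2002 code. The sample message, the name `raw`, and the helper `as_text()` are made up for the example, and `collapse_rfc2231_value()` is a later stdlib helper that is not part of this commit.

```python
from email import message_from_string
from email.utils import collapse_rfc2231_value

# Hypothetical sample message; filename* uses RFC 2231 extended syntax.
raw = (
    "Content-Type: text/plain; charset=us-ascii\n"
    "Content-Disposition: attachment; "
    "filename*=us-ascii'en'hello%20world.txt\n"
    "\n"
    "body\n"
)
msg = message_from_string(raw)

# An ordinary parameter comes back as a string.
plain = msg.get_param("charset")

# An RFC 2231 encoded parameter comes back as (charset, language, value).
encoded = msg.get_param("filename", header="content-disposition")


def as_text(value):
    """Accept either return shape: a plain string or a 3-tuple."""
    if isinstance(value, tuple):
        return collapse_rfc2231_value(value)
    return value


filename = as_text(encoded)
charset = msg.get_content_charset()
```

Application code that calls get_param() directly probably wants a wrapper along the lines of `as_text()`, since it cannot know in advance which shape a given message will produce.
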
Barry Warsaw fbcde75c70 get_payload(): Document that calling it with no arguments returns a
reference to the payload.
2002-09-11 14:11:35 +00:00
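
A small sketch of what "returns a reference" means in practice, assuming a current Python 3 email package and a multipart message built by hand. The variable names are illustrative only.

```python
from email.message import Message

container = Message()
container["Content-Type"] = "multipart/mixed"

first = Message()
first.set_payload("one")
container.attach(first)

# Called with no arguments, get_payload() hands back the message's own list.
payload = container.get_payload()

second = Message()
second.set_payload("two")
payload.append(second)  # this mutates the container itself, not a copy

count = len(container.get_payload())
same_object = payload is container.get_payload()
```

Callers that want to modify the list without touching the message would need to copy it first, for example with `list(container.get_payload())`.
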
Barry Warsaw 3c25535dc8 _formatparam(), set_param(): RFC 2231 encoding support by Oleg
Broytmann in SF patch #600096.  Specifically, the former function now
encodes the triplets, while the latter adds optional charset and
language arguments.
2002-09-06 03:55:04 +00:00
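
The optional charset and language arguments can be sketched as follows, assuming the set_param() signature of a current Python 3 email package (which may differ in detail from the 2002 version). The parameter name `title` and its value are invented for the example.

```python
from email.message import Message

msg = Message()
msg["Content-Type"] = "text/plain"

# Passing charset (and optionally language) asks for RFC 2231 encoding.
msg.set_param("title", "Hello World", charset="us-ascii", language="en")

header = msg["Content-Type"]

# Reading the parameter back yields the (charset, language, value) triplet.
triplet = msg.get_param("title")
```
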
Barry Warsaw 229727fa07 replace_header(): New method given by Skip Montanaro in SF patch
#601959.  Modified slightly by Barry (who added the KeyError in case
the header is missing).
2002-09-06 03:38:12 +00:00
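
A minimal sketch of the behaviour described above, assuming a current Python 3 email package. The header names and values are made up for the example.

```python
from email.message import Message

msg = Message()
msg["Subject"] = "first"
msg["X-Trace"] = "a"

# Replaces the value in place, so the header keeps its original position.
msg.replace_header("Subject", "second")
order = msg.keys()
subject = msg["Subject"]

# A header that is not present raises KeyError instead of being added.
try:
    msg.replace_header("X-Missing", "value")
    raised = False
except KeyError:
    raised = True
```

This differs from `del msg["Subject"]` followed by `msg["Subject"] = ...`, which would move the header to the end of the list.
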
Barry Warsaw 48b0d36b4d Typo 2002-08-27 22:34:44 +00:00
Tim Peters 280488b9a3 Whitespace normalization. 2002-08-23 18:19:30 +00:00
Barry Warsaw f36d804b3b get_content_type(), get_content_maintype(), get_content_subtype(): RFC
2045, section 5.2 states that if the Content-Type: header is
syntactically invalid, the default type should be text/plain.
Implement minimal sanity checking of the header -- it must have
exactly one slash in it.  This closes SF patch #597593 by Skip, but in
a different way.

Note that these methods used to raise ValueError for invalid ctypes,
but now they won't.
2002-08-20 14:50:09 +00:00
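
The one-slash sanity check can be seen with a few hand-written headers. This is a sketch against a current Python 3 email package; the sample header values are invented.

```python
from email import message_from_string

# No slash at all: syntactically invalid.
broken = message_from_string("Content-Type: bogus\n\nbody\n")
# More than one slash: also invalid.
too_many = message_from_string("Content-Type: a/b/c\n\nbody\n")
# Exactly one slash: accepted and lowercased.
valid = message_from_string("Content-Type: Text/HTML; charset=utf-8\n\nbody\n")

results = [m.get_content_type() for m in (broken, too_many, valid)]
maintype = broken.get_content_maintype()
subtype = broken.get_content_subtype()
```

None of these calls raises ValueError; the invalid headers simply fall back to text/plain.
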
Barry Warsaw c10686426e To better support default content types, fix an API wart, and preserve
backwards compatibility, we're silently deprecating get_type(),
get_subtype() and get_main_type().  We may eventually noisily
deprecate these.  For now, we'll just fix a bug in the splitting of
the main and subtypes.

get_content_type(), get_content_maintype(), get_content_subtype(): New
methods which replace the above.  These /always/ return a content type
string and do not take a failobj, because an email message always at
least has a default content type.

set_default_type(): Someday there may be additional default content
types, so don't hard code an assertion about the value of the ctype
argument.
2002-07-19 22:24:55 +00:00
Barry Warsaw 7aeac9180e Anthony Baxter's cleanup patch. Python project SF patch #583190,
quoting:

  in non-strict mode, messages don't require a blank line at the end
  with a missing end-terminator. A single newline is sufficient now.

  Handle trailing whitespace at the end of a boundary. Had to switch
  from using string.split() to re.split()

  Handle whitespace on the end of a parameter list for Content-type.

  Handle whitespace on the end of a plain content-type header.

Specifically,

get_type(): Strip the content type string.

_get_params_preserve(): Strip the parameter names and values on both
sides.

_parsebody(): Lots of changes as described above, with some stylistic
changes by Barry (who hopefully didn't screw things up ;).
2002-07-18 23:09:09 +00:00
Barry Warsaw a0c8b9d4d5 Add the concept of a "default type". Normally the default type is
text/plain but the RFCs state that inside a multipart/digest, the
default type is message/rfc822.  To preserve idempotency, we need a
separate place to define the default type than the Content-Type:
header.

get_default_type(), set_default_type(): Accessor and mutator methods
for the default type.
2002-07-09 02:46:12 +00:00
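
To show why the default type lives outside the Content-Type: header, here is a short sketch assuming a current Python 3 email package. The variable `part` stands in for a subpart of a multipart/digest that carries no Content-Type: header of its own.

```python
from email.message import Message

part = Message()

# With no header present, the default type is reported.
before = part.get_content_type()

# Inside a multipart/digest the default would be message/rfc822.
part.set_default_type("message/rfc822")
after = part.get_content_type()
default = part.get_default_type()

# No header was added, so flattening the message stays idempotent.
has_header = "Content-Type" in part
```
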
Barry Warsaw 908dc4bea8 Oleg Broytmann's support for RFC 2231 encoded parameters, SF patch #549133.
Specifically,

_formatparam(): Teach this about encoded `param' arguments, which are
a 3-tuple of items (charset, language, value).  language is ignored.

_unquotevalue(): Handle both 3-tuple RFC 2231 values and unencoded
values.

_get_params_preserve(): Decode the parameters before returning them.

get_params(), get_param(): Use _unquotevalue().

get_filename(), get_boundary(): Teach these about encoded (3-tuple)
parameters.
2002-06-29 05:56:15 +00:00
Barry Warsaw 8ba76e8929 Use absolute import paths for intrapackage imports.
as_string(): Use Generator.flatten() for better performance.
2002-06-02 19:05:51 +00:00
Tim Peters 8ac1495a6a Whitespace normalization. 2002-05-23 15:15:30 +00:00
Barry Warsaw 8c1aac2476 Complete a merge of the mimelib project and the Python cvs codebases
for the email package.  The former is now just a shell project that
has some extra files for packaging for independent use (e.g. setup.py
and README).

Added a compatibility layer so that the same API can be used in Python
2.1 and 2.2/2.3 with the major differences shuffled off into helper
modules (_compat21.py and _compat22.py).

Also bumped the package version number to 2.0.3 for some fixes to be
checked in momentarily.
2002-05-19 23:44:19 +00:00
Barry Warsaw 409a4c08b5 Sync'ing with standalone email package 2.0.1. This adds support for
non-us-ascii character sets in headers and bodies.  Some API changes
(with DeprecationWarnings for the old APIs).  Better RFC-compliant
implementations of base64 and quoted-printable.

Updated test cases.  Documentation updates to follow (after I finish
writing them ;).
2002-04-10 21:01:31 +00:00
Barry Warsaw bf7c52c233 More typo fixes. 2001-11-24 16:56:56 +00:00
Greg Ward 6253c2dd40 Docstring typo fix. 2001-11-24 15:49:53 +00:00
Barry Warsaw 2a9e3852ee walk(): Fix docstring; traversal is depth-first. Closes mimelib bug
#477864.
2001-11-05 19:19:55 +00:00
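
The depth-first order can be checked on a small nested message. This sketch assumes a current Python 3 email package and uses the MIME helper classes to build a hypothetical two-level structure.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

outer = MIMEMultipart("mixed")
inner = MIMEMultipart("alternative")
inner.attach(MIMEText("plain body", "plain"))
inner.attach(MIMEText("<p>html body</p>", "html"))
outer.attach(inner)
outer.attach(MIMEText("trailer", "plain"))

# walk() yields the message itself first, then descends into each
# subpart fully before moving on to the next sibling.
order = [part.get_content_type() for part in outer.walk()]
```

A breadth-first traversal would instead list the trailing text/plain part before the two children of multipart/alternative.
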
Barry Warsaw 2539cf5aad A fix for SF bug #472560, extra newlines returned by get_param() when
the separating semi-colon shows up on a continuation line (legal, but
weird).

Bug reported and fixed by Matthew Cowles.  Test case and sample email
included.
2001-10-25 22:43:46 +00:00
Barry Warsaw 9300a75c88 get_all(): We never returned failobj if we found no matching headers.
Fix that, and also make the docstring describe failobj.
2001-10-09 15:48:29 +00:00
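
A minimal sketch of the fixed failobj behaviour, assuming a current Python 3 email package. The header names are chosen only for illustration.

```python
from email.message import Message

msg = Message()
msg["Received"] = "from a"
msg["Received"] = "from b"

# Matching headers are returned in order of appearance.
found = msg.get_all("Received")

# With no match, failobj is returned; it defaults to None.
missing_default = msg.get_all("X-Absent")
missing_list = msg.get_all("X-Absent", [])
```

Passing an empty list as failobj lets callers iterate over the result without a separate None check.
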
Barry Warsaw e968ead1dd Give me back my page breaks. 2001-10-04 17:05:11 +00:00
Tim Peters 527e64fd68 Whitespace normalization. 2001-10-04 05:36:56 +00:00
Barry Warsaw beb5945c65 has_key(): Implement in terms of get().
get_type(): Use a compiled regular expression, which can be shared.

_get_params_preserve(): A helper method which extracts the header's
    parameter list preserving value quoting.  I'm not sure that this
    needs to be a public method.  It's necessary because we want
    get_param() and friends to return the unquoted parameter value,
    however we want the quote-preserved form for set_boundary().

get_params(), get_param(), set_boundary(): Implement in terms of
    _get_params_preserve().

walk(): Yield ourself first, then recurse over our subparts (if any).
2001-09-26 05:41:51 +00:00
Barry Warsaw ba92580f01 The email package version 1.0, prototyped as mimelib
<http://sf.net/projects/mimelib>.  There /are/ API differences between
mimelib and email, but most of the implementations are shared (except
where cool Py2.2 stuff like generators are used).
2001-09-23 03:17:28 +00:00