% Copyright (C) 2001 Python Software Foundation
% Author: barry@zope.com (Barry Warsaw)

\section{\module{email} --
         An email and MIME handling package}

\declaremodule{standard}{email}
\modulesynopsis{Package supporting the parsing, manipulation, and
    generation of email messages, including MIME documents.}
\moduleauthor{Barry A. Warsaw}{barry@zope.com}

\versionadded{2.2}

The \module{email} package is a library for managing email messages,
including MIME and other \rfc{2822}-based message documents. It
subsumes most of the functionality in several older standard modules
such as \module{rfc822}, \module{mimetools}, and \module{multifile}, as
well as non-standard packages such as \module{mimecntl}.

The primary distinguishing feature of the \module{email} package is
that it splits the parsing and generating of email messages from the
internal \emph{object model} representation of email. Applications
using the \module{email} package deal primarily with objects; you can
add sub-objects to messages, remove sub-objects from messages,
completely re-arrange the contents, etc. There is a separate parser
and a separate generator which handle the transformation from flat
text to the object model, and then back to flat text again. There
are also handy subclasses for some common MIME object types, and a few
miscellaneous utilities that help with such common tasks as extracting
and parsing message field values, creating RFC-compliant dates, etc.

The following sections describe the functionality of the
\module{email} package. The ordering follows a progression that
should be common in applications: an email message is read as flat
text from a file or other source, the text is parsed to produce an
object model representation of the email message, this model is
manipulated, and finally the model is rendered back into
flat text.

It is perfectly feasible to create the object model out of whole
cloth, i.e.\ completely from scratch. From there, a similar
progression can be followed as above.

Also included are detailed specifications of all the classes and
modules that the \module{email} package provides, the exception
classes you might encounter while using the \module{email} package,
some auxiliary utilities, and a few examples. For users of the older
\module{mimelib} package, from which the \module{email} package is
descended, a section on differences and porting is provided.

\subsection{Representing an email message}

The primary object in the \module{email} package is the
\class{Message} class, provided in the \refmodule{email.Message}
module. \class{Message} is the base class for the \module{email}
object model. It provides the core functionality for setting and
querying header fields, and for accessing message bodies.

Conceptually, a \class{Message} object consists of \emph{headers} and
\emph{payloads}. Headers are \rfc{2822} style field names and
values, where the field name and value are separated by a colon. The
colon is not part of either the field name or the field value.

Headers are stored and returned in case-preserving form but are
matched case-insensitively. There may also be a single
\emph{Unix-From} header, also known as the envelope header or the
\code{From_} header. The payload is either a string in the case of
simple message objects, a list of \class{Message} objects for
multipart MIME documents, or a single \class{Message} instance for
\code{message/rfc822} type objects.

\class{Message} objects provide a mapping style interface for
accessing the message headers, and an explicit interface for accessing
both the headers and the payload. They provide convenience methods for
generating a flat text representation of the message object tree, for
accessing commonly used header parameters, and for recursively walking
over the object tree.

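As a minimal sketch of the header and payload model described above,
the following builds a simple message by hand. It assumes a current
Python 3 interpreter, where the \class{Message} class is imported from
\module{email.message} (the lowercase spelling of the module documented
here); the subject and address are made up for illustration.

\begin{verbatim}
from email.message import Message

msg = Message()

# Headers are set through the mapping interface; case is preserved.
msg['Subject'] = 'Lunch on Friday?'
msg['To'] = 'bob@example.com'

# A simple (non-multipart) message carries a string payload.
msg.set_payload('Are you free around noon?\n')

# Lookups match the field name case-insensitively.
subject = msg['subject']
field_names = msg.keys()          # names in their original case
simple = not msg.is_multipart()
\end{verbatim}

Looking up \code{msg['subject']} and \code{msg['SUBJECT']} should
return the same value, while \method{keys()} reports the names as they
were originally written.
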
\subsection{Parsing email messages}
Message object trees can be created in one of two ways: they can be
created from whole cloth by instantiating \class{Message} objects and
stringing them together via \method{add_payload()} and
\method{set_payload()} calls, or they can be created by parsing a flat text
representation of the email message.

The \module{email} package provides a standard parser that understands
most email document structures, including MIME documents. You can
pass the parser a string or a file object, and the parser will return
to you the root \class{Message} instance of the object tree. For
simple, non-MIME messages the payload of this root object will likely
be a string (e.g. containing the text of the message). For MIME
messages, the root object will return 1 from its
\method{is_multipart()} method, and the subparts can be accessed via
the \method{get_payload()} and \method{walk()} methods.

Note that the parser can be extended in limited ways, and of course
you can implement your own parser completely from scratch. There is
no magical connection between the \module{email} package's bundled
parser and the
\class{Message} class, so your custom parser can create message object
trees in any way it finds necessary. The \module{email} package's
parser is described in detail in the \refmodule{email.Parser} module
documentation.

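The parsing step above can be sketched as follows, assuming a current
Python 3 interpreter. The message text, addresses, and boundary string
are invented for the example, and \method{get_content_type()} is used
in place of the \method{get_type()} method named in this document,
which current releases no longer provide.

\begin{verbatim}
import email

# A small multipart/mixed document; the boundary string is arbitrary.
raw = "\n".join([
    "From: alice@example.com",
    "To: bob@example.com",
    "Subject: Two parts",
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="XYZ"',
    "",
    "--XYZ",
    "Content-Type: text/plain",
    "",
    "first part",
    "--XYZ",
    "Content-Type: text/html",
    "",
    "<p>second part</p>",
    "--XYZ--",
    "",
])

# The parser returns the root Message instance of the object tree.
root = email.message_from_string(raw)

# For a multipart document the payload is a list of Message objects.
subparts = root.get_payload()

# walk() visits the root first, then every subpart, depth first.
types = [part.get_content_type() for part in root.walk()]
\end{verbatim}
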
\subsection{Generating MIME documents}
One of the most common tasks is to generate the flat text of the email
message represented by a message object tree. You will need to do
this if you want to send your message via the \refmodule{smtplib}
module or the \refmodule{nntplib} module, or print the message on the
console. Taking a message object tree and producing a flat text
document is the job of the \refmodule{email.Generator} module.

Again, as with the \refmodule{email.Parser} module, you aren't limited
to the functionality of the bundled generator; you could write one
from scratch yourself. However, the bundled generator knows how to
generate most email in a standards-compliant way, should handle MIME
and non-MIME email messages just fine, and is designed so that the
transformation from flat text, to an object tree via the
\class{Parser} class,
and back to flat text, is idempotent (the input is identical to the
output).

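A minimal sketch of the round trip described above, assuming a current
Python 3 interpreter in which the generator lives in
\module{email.generator}. The message text is invented; for a short,
plain message like this one the flattened output is expected to match
the input text.

\begin{verbatim}
import email
from email.generator import Generator
from io import StringIO

raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Round trip\n"
    "\n"
    "The same text should come back out.\n"
)

# Flat text -> object tree.
msg = email.message_from_string(raw)

# Object tree -> flat text, written to a file-like object.
buf = StringIO()
Generator(buf, mangle_from_=False).flatten(msg)
flat = buf.getvalue()
\end{verbatim}

For the common case, \method{as_string()} on the message object wraps
the same generator machinery in a single call.
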
\subsection{Creating email and MIME objects from scratch}

Ordinarily, you get a message object tree by passing some text to a
parser, which parses the text and returns the root of the message
object tree. However, you can also build a complete object tree from
scratch, or even individual \class{Message} objects by hand. In fact,
you can also take an existing tree and add new \class{Message}
objects, move them around, etc. This makes a very convenient
interface for slicing-and-dicing MIME messages.

You can create a new object tree by creating \class{Message}
instances, adding payloads and all the appropriate headers manually.
For MIME messages though, the \module{email} package provides some
convenient classes to make things easier. Each of these classes
should be imported from a module with the same name as the class, from
within the \module{email} package. E.g.:

\begin{verbatim}
# The class is then available as email.MIMEImage.MIMEImage
import email.MIMEImage
\end{verbatim}

or

\begin{verbatim}
from email.MIMEText import MIMEText
\end{verbatim}

Here are the classes:

\begin{classdesc}{MIMEBase}{_maintype, _subtype, **_params}
This is the base class for all the MIME-specific subclasses of
\class{Message}. Ordinarily you won't create instances specifically
of \class{MIMEBase}, although you could. \class{MIMEBase} is provided
primarily as a convenient base class for more specific MIME-aware
subclasses.

\var{_maintype} is the \code{Content-Type:} major type (e.g. \code{text} or
\code{image}), and \var{_subtype} is the \code{Content-Type:} minor type
(e.g. \code{plain} or \code{gif}). \var{_params} is a parameter
key/value dictionary and is passed directly to
\method{Message.add_header()}.

The \class{MIMEBase} class always adds a \code{Content-Type:} header
(based on \var{_maintype}, \var{_subtype}, and \var{_params}), and a
\code{MIME-Version:} header (always set to \code{1.0}).
\end{classdesc}

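As a small illustration of the \class{MIMEBase} constructor, assuming a
current Python 3 interpreter where the class is imported from
\module{email.mime.base} rather than \module{email.MIMEBase}; the
\code{name} parameter and its value are hypothetical:

\begin{verbatim}
from email.mime.base import MIMEBase

# Extra keyword arguments become Content-Type parameters.
part = MIMEBase('application', 'octet-stream', name='data.bin')

content_type = part.get_content_type()
name_param = part.get_param('name')
mime_version = part['MIME-Version']
\end{verbatim}
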
\begin{classdesc}{MIMEImage}{_imagedata\optional{, _subtype\optional{,
    _encoder\optional{, **_params}}}}

A subclass of \class{MIMEBase}, the \class{MIMEImage} class is used to
create MIME message objects of major type \code{image}.
\var{_imagedata} is a string containing the raw image data. If this
data can be decoded by the standard Python module \refmodule{imghdr},
then the subtype will be automatically included in the
\code{Content-Type:} header. Otherwise you can explicitly specify the
image subtype via the \var{_subtype} parameter. If the minor type could
not be guessed and \var{_subtype} was not given, then
\exception{TypeError} is raised.

Optional \var{_encoder} is a callable (i.e. function) which will
perform the actual encoding of the image data for transport. This
callable takes one argument, which is the \class{MIMEImage} instance.
It should use \method{get_payload()} and \method{set_payload()} to
change the payload to encoded form. It should also add any
\code{Content-Transfer-Encoding:} or other headers to the message
object as necessary. The default encoding is \emph{Base64}. See the
\refmodule{email.Encoders} module for a list of the built-in encoders.

\var{_params} are passed straight through to the \class{MIMEBase}
constructor.
\end{classdesc}

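A hedged sketch of \class{MIMEImage} with the default Base64 encoder,
assuming a current Python 3 interpreter (\module{email.mime.image}
rather than \module{email.MIMEImage}, and bytes rather than a string
for the image data). The data below is a placeholder, not a real
image, so the subtype is passed explicitly.

\begin{verbatim}
from email.mime.image import MIMEImage

# Placeholder bytes, not a real image, so the subtype is given
# explicitly instead of relying on detection from the data.
data = b'not really a PNG'
img = MIMEImage(data, _subtype='png')

content_type = img.get_content_type()
transfer_encoding = img['Content-Transfer-Encoding']

# decode=True reverses the transfer encoding and returns the raw bytes.
decoded = img.get_payload(decode=True)
\end{verbatim}
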
\begin{classdesc}{MIMEText}{_text\optional{, _subtype\optional{,
    _charset\optional{, _encoder}}}}
A subclass of \class{MIMEBase}, the \class{MIMEText} class is used to
create MIME objects of major type \code{text}. \var{_text} is the string
for the payload. \var{_subtype} is the minor type and defaults to
\code{plain}. \var{_charset} is the character set of the text and is
passed as a parameter to the \class{MIMEBase} constructor; it defaults
to \code{us-ascii}. No guessing or encoding is performed on the text
data, but a newline is appended to \var{_text} if it doesn't already
end with a newline.

The \var{_encoder} argument is as with the \class{MIMEImage} class
constructor, except that the default encoding for \class{MIMEText}
objects is one that doesn't actually modify the payload, but does set
the \code{Content-Transfer-Encoding:} header to \code{7bit} or
\code{8bit} as appropriate.
\end{classdesc}

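A brief sketch of \class{MIMEText}, assuming a current Python 3
interpreter, where the class is imported from \module{email.mime.text}.
Only the \var{_text} and \var{_subtype} arguments are used here, since
later releases changed the remaining constructor arguments; the text
itself is illustrative.

\begin{verbatim}
from email.mime.text import MIMEText

plain = MIMEText('Hello, world.\n')          # subtype defaults to plain
html = MIMEText('<p>Hello, world.</p>\n', 'html')

plain_type = plain.get_content_type()
charset = plain.get_param('charset')
transfer_encoding = plain['Content-Transfer-Encoding']
html_type = html.get_content_type()
\end{verbatim}
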
\begin{classdesc}{MIMEMessage}{_msg\optional{, _subtype}}
A subclass of \class{MIMEBase}, the \class{MIMEMessage} class is used to
create MIME objects of main type \code{message}. \var{_msg} is used as
the payload, and must be an instance of class \class{Message} (or a
subclass thereof), otherwise a \exception{TypeError} is raised.

Optional \var{_subtype} sets the subtype of the message; it defaults
to \code{rfc822}.
\end{classdesc}

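The following sketch wraps an existing message in a
\class{MIMEMessage}, assuming a current Python 3 interpreter
(\module{email.mime.message} and \module{email.message}). The subjects
and body are invented for the example.

\begin{verbatim}
from email.message import Message
from email.mime.message import MIMEMessage

inner = Message()
inner['Subject'] = 'Original message'
inner.set_payload('Original body\n')

# Wrap the existing message; the subtype defaults to rfc822.
outer = MIMEMessage(inner)
outer['Subject'] = 'Fwd: Original message'
outer_type = outer.get_content_type()

# Anything that is not a Message instance is rejected.
try:
    MIMEMessage('just a string')
    rejected = False
except TypeError:
    rejected = True
\end{verbatim}
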
\subsection{Encoders, Exceptions, Utilities, and Iterators}

The \module{email} package provides various encoders for safe
transport of binary payloads in \class{MIMEImage} and \class{MIMEText}
instances. See the \refmodule{email.Encoders} module for more
details.

All of the exception classes that the \module{email} package can raise
are available in the \refmodule{email.Errors} module.

Some miscellaneous utility functions are available in the
\refmodule{email.Utils} module.

Iterating over a message object tree is easy with the
\method{Message.walk()} method; some additional helper iterators are
available in the \refmodule{email.Iterators} module.

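As a rough sketch of the utilities and iterators, assuming a current
Python 3 interpreter where the modules are spelled
\module{email.utils} and \module{email.iterators}. The choice of
\function{parseaddr()} as the sample utility is an assumption of this
example, and the message text is invented.

\begin{verbatim}
import email
from email.iterators import body_line_iterator, typed_subpart_iterator
from email.utils import parseaddr

raw = "\n".join([
    "From: Alice Example <alice@example.com>",
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="XYZ"',
    "",
    "--XYZ",
    "Content-Type: text/plain",
    "",
    "first part",
    "--XYZ",
    "Content-Type: text/html",
    "",
    "<p>second part</p>",
    "--XYZ--",
    "",
])
msg = email.message_from_string(raw)

# Split an address header into display name and address.
display_name, address = parseaddr(msg['From'])

# Only the subparts whose type is text/plain.
plain_parts = list(typed_subpart_iterator(msg, 'text', 'plain'))

# Every line of every string payload in the tree, in walk() order.
lines = [line.rstrip('\n') for line in body_line_iterator(msg)]
\end{verbatim}
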
\subsection{Differences from \module{mimelib}}

The \module{email} package was originally prototyped as a separate
library called \module{mimelib}. Changes have been made so that
method names are more consistent, and some methods or modules have
either been added or removed. The semantics of some of the methods
have also changed. For the most part, any functionality available in
\module{mimelib} is still available in the \module{email} package,
albeit often in a different way.

Here is a brief description of the differences between the
\module{mimelib} and the \module{email} packages, along with hints on
how to port your applications.

Of course, the most visible difference between the two packages is
that the package name has been changed to \module{email}. In
addition, the top-level package has the following differences:

\begin{itemize}
\item \function{messageFromString()} has been renamed to
      \function{message_from_string()}.
\item \function{messageFromFile()} has been renamed to
      \function{message_from_file()}.
\end{itemize}

The \class{Message} class has the following differences:

\begin{itemize}
\item The method \method{asString()} was renamed to \method{as_string()}.
\item The method \method{ismultipart()} was renamed to
      \method{is_multipart()}.
\item The \method{get_payload()} method has grown a \var{decode}
      optional argument.
\item The method \method{getall()} was renamed to \method{get_all()}.
\item The method \method{addheader()} was renamed to \method{add_header()}.
\item The method \method{gettype()} was renamed to \method{get_type()}.
\item The method \method{getmaintype()} was renamed to
      \method{get_main_type()}.
\item The method \method{getsubtype()} was renamed to
      \method{get_subtype()}.
\item The method \method{getparams()} was renamed to
      \method{get_params()}.
      Also, whereas \method{getparams()} returned a list of strings,
      \method{get_params()} returns a list of 2-tuples, effectively
      the key/value pairs of the parameters, split on the \samp{=}
      sign.
\item The method \method{getparam()} was renamed to \method{get_param()}.
\item The method \method{getcharsets()} was renamed to
      \method{get_charsets()}.
\item The method \method{getfilename()} was renamed to
      \method{get_filename()}.
\item The method \method{getboundary()} was renamed to
      \method{get_boundary()}.
\item The method \method{setboundary()} was renamed to
      \method{set_boundary()}.
\item The method \method{getdecodedpayload()} was removed. To get
      similar functionality, pass the value 1 to the \var{decode} flag
      of the \method{get_payload()} method.
\item The method \method{getpayloadastext()} was removed. Similar
      functionality
      is supported by the \class{DecodedGenerator} class in the
      \refmodule{email.Generator} module.
\item The method \method{getbodyastext()} was removed. You can get
      similar functionality by creating an iterator with
      \function{typed_subpart_iterator()} in the
      \refmodule{email.Iterators} module.
\end{itemize}

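A few of the renamed \class{Message} methods listed above can be
sketched as follows, assuming a current Python 3 interpreter, in which
these method names are still available (decoded payloads come back as
bytes there). The header values and the Base64 body are invented for
the example; the old \module{mimelib} spellings appear only in the
comments.

\begin{verbatim}
import email

raw = (
    "Received: from a.example.com\n"
    "Received: from b.example.com\n"
    'Content-Type: text/plain; charset="us-ascii"; format=flowed\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "aGVsbG8gd29ybGQ=\n"
)
msg = email.message_from_string(raw)

received = msg.get_all('Received')     # was getall()
params = msg.get_params()              # was getparams(); now 2-tuples
charset = msg.get_param('charset')     # was getparam()
charsets = msg.get_charsets()          # was getcharsets()
body = msg.get_payload(decode=True)    # replaces getdecodedpayload()
\end{verbatim}
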
The \class{Parser} class has no differences in its public interface.
It does have some additional smarts to recognize
\code{message/delivery-status} type messages, which it represents as
a \class{Message} instance containing separate \class{Message}
subparts for each header block in the delivery status
notification\footnote{Delivery Status Notifications (DSN) are defined
in \rfc{1894}.}.

The \class{Generator} class has no differences in its public
interface. There is a new class in the \refmodule{email.Generator}
module though, called \class{DecodedGenerator}, which provides most of
the functionality previously available in the
\method{Message.getpayloadastext()} method.

The following modules and classes have been changed:

\begin{itemize}
\item The \class{MIMEBase} class constructor arguments \var{_major}
      and \var{_minor} have changed to \var{_maintype} and
      \var{_subtype} respectively.
\item The \code{Image} class/module has been renamed to
      \code{MIMEImage}. The \var{_minor} argument has been renamed to
      \var{_subtype}.
\item The \code{Text} class/module has been renamed to
      \code{MIMEText}. The \var{_minor} argument has been renamed to
      \var{_subtype}.
\item The \code{MessageRFC822} class/module has been renamed to
      \code{MIMEMessage}. Note that an earlier version of
      \module{mimelib} called this class/module \code{RFC822}, but
      that clashed with the Python standard library module
      \refmodule{rfc822} on some case-insensitive file systems.

      Also, the \class{MIMEMessage} class now represents any kind of
      MIME message with main type \code{message}. It takes an
      optional argument \var{_subtype} which is used to set the MIME
      subtype. \var{_subtype} defaults to \code{rfc822}.
\end{itemize}

\module{mimelib} provided some utility functions in its
\module{address} and \module{date} modules. All of these functions
have been moved to the \refmodule{email.Utils} module.

The \code{MsgReader} class/module has been removed. Its functionality
is most closely supported in the \function{body_line_iterator()}
function in the \refmodule{email.Iterators} module.

\subsection{Examples}

Coming soon...