\section{\module{mimetypes} ---
         Map filenames to MIME types}

\declaremodule{standard}{mimetypes}
\modulesynopsis{Mapping of filename extensions to MIME types.}
\sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org}

\indexii{MIME}{content type}

The \module{mimetypes} module converts between a filename or URL and
the MIME type associated with the filename extension.  Conversions are
provided from filename to MIME type and from MIME type to filename
extension; encodings are not supported for the latter conversion.

The module provides one class and a number of convenience functions.
The functions are the normal interface to this module, but some
applications may be interested in the class as well.

The functions described below provide the primary interface for this
module.  If the module has not been initialized, they will call
\function{init()} if they rely on the information \function{init()}
sets up.

\begin{funcdesc}{guess_type}{filename\optional{, strict}}
Guess the type of a file based on its filename or URL, given by
\var{filename}.  The return value is a tuple \code{(\var{type},
\var{encoding})} where \var{type} is \code{None} if the type can't be
guessed (missing or unknown suffix) or a string of the form
\code{'\var{type}/\var{subtype}'}, usable for a MIME
\mailheader{Content-Type} header\indexii{MIME}{headers}.

\var{encoding} is \code{None} for no encoding or the name of the
program used to encode (e.g. \program{compress} or \program{gzip}).
The encoding is suitable for use as a \mailheader{Content-Encoding}
header, \emph{not} as a \mailheader{Content-Transfer-Encoding} header.
The mappings are table driven.  Encoding suffixes are case sensitive;
type suffixes are first tried case sensitively, then case
insensitively.

Optional \var{strict} is a flag specifying whether the list of known
MIME types is limited to only the official types \ulink{registered
with IANA}{http://www.isi.edu/in-notes/iana/assignments/media-types}.
When \var{strict} is true (the default), only the
IANA types are supported; when \var{strict} is false, some additional
non-standard but commonly used MIME types are also recognized.
\end{funcdesc}
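
As an illustration of the \code{(\var{type}, \var{encoding})} return
value, the following minimal sketch calls \function{guess_type()} on
three made-up filenames.  It assumes a reasonably recent Python and the
default type tables; the filenames are hypothetical.

```python
import mimetypes

# A plain extension yields a type and no encoding.
html_type, html_encoding = mimetypes.guess_type("index.html")

# For a compressed archive the type comes from ".tar" and the
# encoding from ".gz".
tar_type, tar_encoding = mimetypes.guess_type("backup.tar.gz")

# A missing or unknown suffix yields (None, None).
unknown = mimetypes.guess_type("notes.no-such-suffix")

print((html_type, html_encoding))   # ('text/html', None)
print((tar_type, tar_encoding))     # ('application/x-tar', 'gzip')
print(unknown)                      # (None, None)
```

The encoding value is what would go into a
\mailheader{Content-Encoding} header; the type value is what would go
into a \mailheader{Content-Type} header.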

\begin{funcdesc}{guess_all_extensions}{type\optional{, strict}}
Guess the extensions for a file based on its MIME type, given by
\var{type}.
The return value is a list of strings giving all possible filename extensions,
including the leading dot (\character{.}).  The extensions are not guaranteed
to have been associated with any particular data stream, but would be mapped
to the MIME type \var{type} by \function{guess_type()}.  If no extension can
be guessed for \var{type}, \code{None} is returned.

Optional \var{strict} has the same meaning as with the
\function{guess_type()} function.
\end{funcdesc}
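
A short sketch of \function{guess_all_extensions()} follows.  Because
the result for an unknown type may be \code{None} as described above
(some Python versions are assumed to return an empty list instead), the
sketch normalizes the result with \code{or []} before using it.  The
type \code{application/x-no-such-type} is made up for illustration.

```python
import mimetypes

# Normalize a possible None result to an empty list so the value can
# always be iterated over.
html_exts = mimetypes.guess_all_extensions("text/html") or []
missing = mimetypes.guess_all_extensions("application/x-no-such-type") or []

# Every returned extension includes the leading dot.
for ext in html_exts:
    print(ext)

print(len(missing))   # 0
```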

\begin{funcdesc}{guess_extension}{type\optional{, strict}}
Guess the extension for a file based on its MIME type, given by
\var{type}.
The return value is a string giving a filename extension, including the
leading dot (\character{.}).  The extension is not guaranteed to have been
associated with any particular data stream, but would be mapped to the
MIME type \var{type} by \function{guess_type()}.  If no extension can
be guessed for \var{type}, \code{None} is returned.

Optional \var{strict} has the same meaning as with the
\function{guess_type()} function.
\end{funcdesc}
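
One common use of \function{guess_extension()} is choosing a filename
for data whose MIME type is already known.  The sketch below is only an
illustration: \code{filename_for()} and its \code{default_ext} parameter
are hypothetical names, not part of the module.

```python
import mimetypes

def filename_for(stem, mime_type, default_ext=".bin"):
    """Hypothetical helper: append an extension suited to mime_type."""
    ext = mimetypes.guess_extension(mime_type)
    if ext is None:
        # No extension is known for this type; use the fallback.
        ext = default_ext
    return stem + ext

report_name = filename_for("report", "application/pdf")
fallback_name = filename_for("blob", "application/x-no-such-type")

print(report_name)     # report.pdf
print(fallback_name)   # blob.bin
```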

Some additional functions and data items are available for controlling
the behavior of the module.

\begin{funcdesc}{init}{\optional{files}}
Initialize the internal data structures.  If given, \var{files} must
be a sequence of file names which should be used to augment the
default type map.  If omitted, the file names to use are taken from
\constant{knownfiles}.  Each file named in \var{files} or
\constant{knownfiles} takes precedence over those named before it.
Calling \function{init()} repeatedly is allowed.
\end{funcdesc}
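
The following sketch shows one way \function{init()} might be used to
augment the default type map from an application-specific file.  The
type \code{application/x-example} and the extension \file{.exinit} are
invented for the example, and a temporary file stands in for a real
\file{mime.types} file.

```python
import mimetypes
import os
import tempfile

# Write a one-line mime.types-style file: the MIME type followed by
# its extensions, given without the leading dot.
fd, path = tempfile.mkstemp(suffix=".types")
with os.fdopen(fd, "w") as f:
    f.write("application/x-example exinit\n")

try:
    # Re-initialize the module-level tables, adding the extra file.
    mimetypes.init([path])
    guessed = mimetypes.guess_type("data.exinit")
finally:
    os.remove(path)

print(guessed)   # ('application/x-example', None)
```

Note that this changes the tables shared by all users of the module in
the same process.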

\begin{funcdesc}{read_mime_types}{filename}
Load the type map given in the file \var{filename}, if it exists.  The
type map is returned as a dictionary mapping filename extensions,
including the leading dot (\character{.}), to strings of the form
\code{'\var{type}/\var{subtype}'}.  If the file \var{filename} does
not exist or cannot be read, \code{None} is returned.
\end{funcdesc}
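
A minimal sketch of \function{read_mime_types()}, again using a
temporary file with made-up entries (\code{application/x-example},
\file{.exread}, and \file{.exrd} are assumptions for illustration):

```python
import mimetypes
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".types")
with os.fdopen(fd, "w") as f:
    f.write("# hypothetical entries for illustration\n")
    f.write("application/x-example exread exrd\n")

try:
    type_map = mimetypes.read_mime_types(path)
finally:
    os.remove(path)

# Keys include the leading dot; values are 'type/subtype' strings.
print(type_map[".exread"])   # application/x-example

# The file is gone now, so a second attempt returns None instead of
# raising an exception.
missing = mimetypes.read_mime_types(path)
print(missing)               # None
```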

\begin{funcdesc}{add_type}{type, ext\optional{, strict}}
Add a mapping from the MIME type \var{type} to the extension \var{ext}.
When the extension is already known, the new type will replace the old
one.  When the type is already known, the extension will be added
to the list of known extensions.

When \var{strict} is true (the default), the mapping will be added to
the official MIME types; otherwise, it is added to the non-standard ones.
\end{funcdesc}
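
The effect of the \var{strict} flag can be seen by registering a
mapping as non-standard and then looking it up both ways.  In this
sketch the type \code{application/x-example} and the extension
\file{.exadd} are invented and assumed not to be registered anywhere
else.

```python
import mimetypes

# Register the mapping among the non-standard types only.
mimetypes.add_type("application/x-example", ".exadd", strict=False)

# A strict lookup (the default) ignores non-standard mappings.
strict_guess = mimetypes.guess_type("data.exadd")

# A non-strict lookup also consults the non-standard mappings.
lenient_guess = mimetypes.guess_type("data.exadd", strict=False)

print(strict_guess)    # (None, None)
print(lenient_guess)   # ('application/x-example', None)
```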

\begin{datadesc}{inited}
Flag indicating whether or not the global data structures have been
initialized.  This is set to true by \function{init()}.
\end{datadesc}

\begin{datadesc}{knownfiles}
List of type map file names commonly installed.  These files are
typically named \file{mime.types} and are installed in different
locations by different packages.\index{file!mime.types}
\end{datadesc}

\begin{datadesc}{suffix_map}
Dictionary mapping suffixes to suffixes.  This is used to allow
recognition of encoded files for which the encoding and the type are
indicated by the same extension.  For example, the \file{.tgz}
extension is mapped to \file{.tar.gz} to allow the encoding and type
to be recognized separately.
\end{datadesc}
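
The \file{.tgz} case can be checked directly.  This is a small sketch
assuming the default tables; the filenames are hypothetical.

```python
import mimetypes

# ".tgz" is rewritten to ".tar.gz" before the lookup, so the type and
# the encoding are found separately.
expanded = mimetypes.suffix_map[".tgz"]
short_form = mimetypes.guess_type("release.tgz")
long_form = mimetypes.guess_type("release.tar.gz")

print(expanded)                  # .tar.gz
print(short_form)                # ('application/x-tar', 'gzip')
print(short_form == long_form)   # True
```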

\begin{datadesc}{encodings_map}
Dictionary mapping filename extensions to encoding types.
\end{datadesc}

\begin{datadesc}{types_map}
Dictionary mapping filename extensions to MIME types.
\end{datadesc}

\begin{datadesc}{common_types}
Dictionary mapping filename extensions to non-standard, but commonly
found MIME types.
\end{datadesc}

The \class{MimeTypes} class may be useful for applications that
need more than one MIME-type database:

\begin{classdesc}{MimeTypes}{\optional{filenames}}
This class represents a MIME-types database.  By default, it
provides access to the same database as the rest of this module.
The initial database is a copy of that provided by the module, and
may be extended by loading additional \file{mime.types}-style files
into the database using the \method{read()} or \method{readfp()}
methods.  The mapping dictionaries may also be cleared before
loading additional data if the default data is not desired.

The optional \var{filenames} parameter can be used to cause
additional files to be loaded ``on top'' of the default database.

\versionadded{2.2}
\end{classdesc}
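
The sketch below illustrates keeping a private database alongside the
module-level one by passing \var{filenames} to the constructor.  The
temporary file, the type \code{application/x-example}, and the
extension \file{.exdb} are all assumptions made for the example.

```python
import mimetypes
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".types")
with os.fdopen(fd, "w") as f:
    f.write("application/x-example exdb\n")

try:
    # Load the extra file on top of a copy of the default database.
    private_db = mimetypes.MimeTypes([path])
finally:
    os.remove(path)

private_guess = private_db.guess_type("data.exdb")
module_guess = mimetypes.guess_type("data.exdb")

print(private_guess)   # ('application/x-example', None)
print(module_guess)    # (None, None)
```

Only the private instance knows the extra mapping; the module-level
functions continue to use the shared tables.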

\subsection{MimeTypes Objects \label{mimetypes-objects}}

\class{MimeTypes} instances provide an interface which is very similar
to that of the \refmodule{mimetypes} module.

\begin{datadesc}{suffix_map}
Dictionary mapping suffixes to suffixes.  This is used to allow
recognition of encoded files for which the encoding and the type are
indicated by the same extension.  For example, the \file{.tgz}
extension is mapped to \file{.tar.gz} to allow the encoding and type
to be recognized separately.  This is initially a copy of the global
\code{suffix_map} defined in the module.
\end{datadesc}

\begin{datadesc}{encodings_map}
Dictionary mapping filename extensions to encoding types.  This is
initially a copy of the global \code{encodings_map} defined in the
module.
\end{datadesc}

\begin{datadesc}{types_map}
Dictionary mapping filename extensions to MIME types.  This is
initially a copy of the global \code{types_map} defined in the
module.
\end{datadesc}

\begin{datadesc}{common_types}
Dictionary mapping filename extensions to non-standard, but commonly
found MIME types.  This is initially a copy of the global
\code{common_types} defined in the module.
\end{datadesc}

\begin{methoddesc}{guess_extension}{type\optional{, strict}}
Similar to the \function{guess_extension()} function, using the
tables stored as part of the object.
\end{methoddesc}

\begin{methoddesc}{guess_type}{url\optional{, strict}}
Similar to the \function{guess_type()} function, using the tables
stored as part of the object.
\end{methoddesc}

\begin{methoddesc}{read}{path}
Load MIME information from a file named \var{path}.  This uses
\method{readfp()} to parse the file.
\end{methoddesc}

\begin{methoddesc}{readfp}{file}
Load MIME type information from an open file.  The file must have
the format of the standard \file{mime.types} files.
\end{methoddesc}
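
Since \method{readfp()} accepts any open file, an in-memory file object
can be used as well.  The sketch below assumes Python 3, where
\code{io.StringIO} provides a text file object; the type
\code{application/x-example}, the extensions \file{.exfp} and
\file{.exfp2}, and the URL are made up for illustration.

```python
import io
import mimetypes

# Each line names a MIME type followed by its extensions, without the
# leading dot.  Text after "#" is treated as a comment.
data = io.StringIO(
    "# hypothetical entries for illustration\n"
    "application/x-example exfp exfp2\n"
)

db = mimetypes.MimeTypes()
db.readfp(data)

# The object's own tables are used for lookups in both directions.
guess = db.guess_type("http://www.example.com/files/data.exfp")
ext = db.guess_extension("application/x-example")

print(guess)   # ('application/x-example', None)
print(ext)     # .exfp
```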