\section{\module{xml.sax} ---
         Support for SAX2 parsers}

\declaremodule{standard}{xml.sax}
\modulesynopsis{Package containing SAX2 base classes and convenience
                functions.}
\moduleauthor{Lars Marius Garshol}{larsga@garshol.priv.no}
\sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org}
\sectionauthor{Martin v. L\"owis}{martin@v.loewis.de}

\versionadded{2.0}


The \module{xml.sax} package provides a number of modules which
implement the Simple API for XML (SAX) interface for Python.  The
package itself provides the SAX exceptions and the convenience
functions that users of the SAX API will use most often.

The convenience functions are:

\begin{funcdesc}{make_parser}{\optional{parser_list}}
  Create and return a SAX \class{XMLReader} object.  The first parser
  found will be used.  If \var{parser_list} is provided, it must be a
  sequence of strings which name modules that have a function named
  \function{create_parser()}.  Modules listed in \var{parser_list}
  will be used before modules in the default list of parsers.
\end{funcdesc}
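
As a minimal sketch of \function{make_parser()} in use, the following
obtains a reader from the default parser list and feeds it a small
document.  The handler class \class{ElementNames} and the sample
document are hypothetical and exist only for illustration; the sketch
assumes that at least one parser module is available in the
installation.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class ElementNames(ContentHandler):
    """Hypothetical handler that records the name of every start tag."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


# No parser_list is passed, so only the default list of parsers is searched.
reader = xml.sax.make_parser()

handler = ElementNames()
reader.setContentHandler(handler)

# The reader accepts a file-like object; this document is made up.
reader.parse(io.BytesIO(b"<doc><item/><item/></doc>"))
print(handler.names)
```

Passing a \var{parser_list} only changes which modules are tried
first; the returned object is used in the same way.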

\begin{funcdesc}{parse}{filename_or_stream, handler\optional{, error_handler}}
  Create a SAX parser and use it to parse a document.  The document,
  passed in as \var{filename_or_stream}, can be a filename or a file
  object.  The \var{handler} parameter needs to be a SAX
  \class{ContentHandler} instance.  If \var{error_handler} is given,
  it must be a SAX \class{ErrorHandler} instance; if omitted, 
  \exception{SAXParseException} will be raised on all errors.  There
  is no return value; all work must be done by the \var{handler}
  passed in.
\end{funcdesc}
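
Because \function{parse()} returns nothing, results have to be
collected on the handler.  The sketch below illustrates this with a
hypothetical \class{TextCollector} handler and an in-memory stream
standing in for a real file; both are assumptions made for the
example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.chunks = []

    def characters(self, content):
        # Character data may arrive in several pieces, so keep them all.
        self.chunks.append(content)


collector = TextCollector()

# A file name could be passed instead of the stream used here.
result = xml.sax.parse(io.BytesIO(b"<greeting>Hello, SAX</greeting>"),
                       collector)

text = "".join(collector.chunks)
print(text)
```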

\begin{funcdesc}{parseString}{string, handler\optional{, error_handler}}
  Similar to \function{parse()}, but parses from a buffer \var{string}
  received as a parameter.
\end{funcdesc}
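
A short sketch of \function{parseString()}, assuming the document is
already held in memory as a byte string.  The handler class
\class{AttributeDump} and the sample element are hypothetical.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class AttributeDump(ContentHandler):
    """Hypothetical handler that stores each element's attributes."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.seen = {}

    def startElement(self, name, attrs):
        # attrs follows the Attributes interface; copy it into a plain dict.
        self.seen[name] = dict(attrs.items())


handler = AttributeDump()
xml.sax.parseString(b'<user id="42" role="admin"/>', handler)
print(handler.seen)
```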

A typical SAX application uses three kinds of objects: readers,
handlers and input sources.  ``Reader'' in this context is another
term for parser, i.e.\ some piece of code that reads the bytes or
characters from the input source, and produces a sequence of events.
The events are then distributed to the handler objects, i.e.\ the
reader invokes a method on the handler.  A SAX application must
therefore obtain a reader object, create or open the input sources,
create the handlers, and connect these objects together.  As the
final step of preparation, the reader is called to parse the input.
During parsing, methods on the handler objects are called based on
structural and syntactic events from the input data.
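
The steps just described can be sketched as follows.  The handler
class \class{EventLog} and the sample document are hypothetical, and
an in-memory byte stream stands in for a real input file.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class EventLog(ContentHandler):
    """Hypothetical handler that logs start and end element events."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


# 1. Obtain a reader object.
reader = xml.sax.make_parser()

# 2. Create the input source (an in-memory stream in this sketch).
source = InputSource()
source.setByteStream(io.BytesIO(b"<a><b/></a>"))

# 3. Create the handler and connect it to the reader.
log = EventLog()
reader.setContentHandler(log)

# 4. Ask the reader to parse; it calls back into the handler.
reader.parse(source)
print(log.events)
```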

For these objects, only the interfaces are relevant; they are normally
not instantiated by the application itself.  Since Python does not have
an explicit notion of interface, they are formally introduced as
classes, but applications may use implementations which do not inherit
from the provided classes.  The \class{InputSource}, \class{Locator},
\class{Attributes}, \class{AttributesNS}, and
\class{XMLReader} interfaces are defined in the module
\refmodule{xml.sax.xmlreader}.  The handler interfaces are defined in
\refmodule{xml.sax.handler}.  For convenience, \class{InputSource}
(which is often instantiated directly) and the handler classes are
also available from \module{xml.sax}.  These interfaces are described
below.

In addition to these classes, \module{xml.sax} provides the following
exception classes.

\begin{excclassdesc}{SAXException}{msg\optional{, exception}}
  Encapsulate an XML error or warning.  This class can contain basic
  error or warning information from either the XML parser or the
  application; it can be subclassed to provide additional
  functionality or to add localization.  Note that although the
  handlers defined in the \class{ErrorHandler} interface receive
  instances of this exception, the exception does not have to be
  raised; it is also useful as a container for information.

  When instantiated, \var{msg} should be a human-readable description
  of the error.  The optional \var{exception} parameter, if given,
  should be \code{None} or an exception that was caught by the parsing
  code and is being passed along as information.

  This is the base class for the other SAX exception classes.
\end{excclassdesc}

\begin{excclassdesc}{SAXParseException}{msg, exception, locator}
  Subclass of \exception{SAXException} raised on parse errors.
  Instances of this class are passed to the methods of the SAX
  \class{ErrorHandler} interface to provide information about the
  parse error.  This class supports the SAX \class{Locator} interface
  as well as the \class{SAXException} interface.
\end{excclassdesc}
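
As an illustration, the sketch below parses a deliberately malformed
document without supplying an error handler, so the parse error is
raised as a \exception{SAXParseException}.  The document is made up,
and the wording of the message depends on the underlying parser.

```python
import xml.sax
from xml.sax.handler import ContentHandler

message = None
line = None
column = None

try:
    # The closing tag does not match the open <b> element.
    xml.sax.parseString(b"<a><b></a>", ContentHandler())
except xml.sax.SAXParseException as err:
    # SAXException interface:
    message = err.getMessage()
    # Locator interface:
    line = err.getLineNumber()
    column = err.getColumnNumber()

print(message, line, column)
```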

\begin{excclassdesc}{SAXNotRecognizedException}{msg\optional{, exception}}
  Subclass of \exception{SAXException} raised when a SAX
  \class{XMLReader} is confronted with an unrecognized feature or
  property.  SAX applications and extensions may use this class for
  similar purposes.
\end{excclassdesc}
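
A brief sketch of how an application might probe for a feature.  The
feature URI is invented for the example, and the sketch assumes that
the default reader does not recognize it.

```python
import xml.sax

reader = xml.sax.make_parser()

try:
    # Hypothetical feature name that no reader is expected to know.
    reader.getFeature("http://example.org/sax/features/no-such-feature")
    recognized = True
except xml.sax.SAXNotRecognizedException:
    recognized = False

print(recognized)
```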

\begin{excclassdesc}{SAXNotSupportedException}{msg\optional{, exception}}
  Subclass of \exception{SAXException} raised when a SAX
  \class{XMLReader} is asked to enable a feature that is not
  supported, or to set a property to a value that the implementation
  does not support.  SAX applications and extensions may use this
  class for similar purposes.
\end{excclassdesc}
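
The following sketch asks a reader to enable validation and falls back
when the request is refused.  It assumes that the default reader is
the non-validating Expat-based one; a validating reader would accept
the request instead.

```python
import xml.sax
from xml.sax.handler import feature_validation

reader = xml.sax.make_parser()

try:
    reader.setFeature(feature_validation, True)
    validating = True
except xml.sax.SAXNotSupportedException:
    # The feature is known to the reader, but it cannot be enabled.
    validating = False

print(validating)
```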


\begin{seealso}
  \seetitle[http://www.saxproject.org/]{SAX: The Simple API for
            XML}{This site is the focal point for the definition of
            the SAX API.  It provides a Java implementation and online
            documentation.  Links to implementations and historical
            information are also available.}

  \seemodule{xml.sax.handler}{Definitions of the interfaces for
             application-provided objects.}

  \seemodule{xml.sax.saxutils}{Convenience functions for use in SAX
             applications.}

  \seemodule{xml.sax.xmlreader}{Definitions of the interfaces for
             parser-provided objects.}
\end{seealso}


\subsection{SAXException Objects \label{sax-exception-objects}}

The \class{SAXException} exception class supports the following
methods:

\begin{methoddesc}[SAXException]{getMessage}{}
  Return a human-readable message describing the error condition.
\end{methoddesc}

\begin{methoddesc}[SAXException]{getException}{}
  Return an encapsulated exception object, or \code{None}.
\end{methoddesc}
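
A small sketch of both methods, in which the application wraps an
exception it has caught and passes it along as information.  The
message text and the failing conversion are made up for the example.

```python
from xml.sax import SAXException

try:
    int("not a number")
except ValueError as original:
    # Wrap the caught exception; nothing is raised here.
    wrapped = SAXException("could not convert attribute value", original)

message = wrapped.getMessage()
cause = wrapped.getException()

# Without the optional argument there is no encapsulated exception.
plain = SAXException("plain message")
no_cause = plain.getException()

print(message, type(cause).__name__, no_cause)
```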