cpython/Doc/lib/libwsgiref.tex

782 lines
27 KiB
TeX
Raw Normal View History

\section{\module{wsgiref} --- WSGI Utilities and Reference
Implementation}
\declaremodule{}{wsgiref}
\moduleauthor{Phillip J. Eby}{pje@telecommunity.com}
\sectionauthor{Phillip J. Eby}{pje@telecommunity.com}
\modulesynopsis{WSGI Utilities and Reference Implementation}
2006-06-11 02:45:47 -03:00
\versionadded{2.5}
The Web Server Gateway Interface (WSGI) is a standard interface
between web server software and web applications written in Python.
Having a standard interface makes it easy to use an application
that supports WSGI with a number of different web servers.
Only authors of web servers and programming frameworks need to know
every detail and corner case of the WSGI design. You don't need to
understand every detail of WSGI just to install a WSGI application or
to write a web application using an existing framework.
\module{wsgiref} is a reference implementation of the WSGI specification
that can be used to add WSGI support to a web server or framework. It
provides utilities for manipulating WSGI environment variables and
response headers, base classes for implementing WSGI servers, a demo
HTTP server that serves WSGI applications, and a validation tool that
checks WSGI servers and applications for conformance to the
WSGI specification (\pep{333}).
% XXX If you're just trying to write a web application...
% XXX should create a URL on python.org to point people to.
\subsection{\module{wsgiref.util} -- WSGI environment utilities}
\declaremodule{}{wsgiref.util}
This module provides a variety of utility functions for working with
WSGI environments. A WSGI environment is a dictionary containing
HTTP request variables as described in \pep{333}. All of the functions
taking an \var{environ} parameter expect a WSGI-compliant dictionary to
be supplied; please see \pep{333} for a detailed specification.
\begin{funcdesc}{guess_scheme}{environ}
Return a guess for whether \code{wsgi.url_scheme} should be ``http'' or
``https'', by checking for a \code{HTTPS} environment variable in the
\var{environ} dictionary. The return value is a string.
This function is useful when creating a gateway that wraps CGI or a
CGI-like protocol such as FastCGI. Typically, servers providing such
protocols will include a \code{HTTPS} variable with a value of ``1''
``yes'', or ``on'' when a request is received via SSL. So, this
function returns ``https'' if such a value is found, and ``http''
otherwise.
\end{funcdesc}
\begin{funcdesc}{request_uri}{environ \optional{, include_query=1}}
Return the full request URI, optionally including the query string,
using the algorithm found in the ``URL Reconstruction'' section of
\pep{333}. If \var{include_query} is false, the query string is
not included in the resulting URI.
\end{funcdesc}
\begin{funcdesc}{application_uri}{environ}
Similar to \function{request_uri}, except that the \code{PATH_INFO} and
\code{QUERY_STRING} variables are ignored. The result is the base URI
of the application object addressed by the request.
\end{funcdesc}
\begin{funcdesc}{shift_path_info}{environ}
Shift a single name from \code{PATH_INFO} to \code{SCRIPT_NAME} and
return the name. The \var{environ} dictionary is \emph{modified}
in-place; use a copy if you need to keep the original \code{PATH_INFO}
or \code{SCRIPT_NAME} intact.
If there are no remaining path segments in \code{PATH_INFO}, \code{None}
is returned.
Typically, this routine is used to process each portion of a request
URI path, for example to treat the path as a series of dictionary keys.
This routine modifies the passed-in environment to make it suitable for
invoking another WSGI application that is located at the target URI.
For example, if there is a WSGI application at \code{/foo}, and the
request URI path is \code{/foo/bar/baz}, and the WSGI application at
\code{/foo} calls \function{shift_path_info}, it will receive the string
``bar'', and the environment will be updated to be suitable for passing
to a WSGI application at \code{/foo/bar}. That is, \code{SCRIPT_NAME}
will change from \code{/foo} to \code{/foo/bar}, and \code{PATH_INFO}
will change from \code{/bar/baz} to \code{/baz}.
When \code{PATH_INFO} is just a ``/'', this routine returns an empty
string and appends a trailing slash to \code{SCRIPT_NAME}, even though
empty path segments are normally ignored, and \code{SCRIPT_NAME} doesn't
normally end in a slash. This is intentional behavior, to ensure that
an application can tell the difference between URIs ending in \code{/x}
from ones ending in \code{/x/} when using this routine to do object
traversal.
\end{funcdesc}
\begin{funcdesc}{setup_testing_defaults}{environ}
Update \var{environ} with trivial defaults for testing purposes.
This routine adds various parameters required for WSGI, including
\code{HTTP_HOST}, \code{SERVER_NAME}, \code{SERVER_PORT},
\code{REQUEST_METHOD}, \code{SCRIPT_NAME}, \code{PATH_INFO}, and all of
the \pep{333}-defined \code{wsgi.*} variables. It only supplies default
values, and does not replace any existing settings for these variables.
This routine is intended to make it easier for unit tests of WSGI
servers and applications to set up dummy environments. It should NOT
be used by actual WSGI servers or applications, since the data is fake!
\end{funcdesc}
In addition to the environment functions above, the
\module{wsgiref.util} module also provides these miscellaneous
utilities:
\begin{funcdesc}{is_hop_by_hop}{header_name}
Return true if 'header_name' is an HTTP/1.1 ``Hop-by-Hop'' header, as
defined by \rfc{2616}.
\end{funcdesc}
\begin{classdesc}{FileWrapper}{filelike \optional{, blksize=8192}}
A wrapper to convert a file-like object to an iterator. The resulting
objects support both \method{__getitem__} and \method{__iter__}
iteration styles, for compatibility with Python 2.1 and Jython.
As the object is iterated over, the optional \var{blksize} parameter
will be repeatedly passed to the \var{filelike} object's \method{read()}
method to obtain strings to yield. When \method{read()} returns an
empty string, iteration is ended and is not resumable.
If \var{filelike} has a \method{close()} method, the returned object
will also have a \method{close()} method, and it will invoke the
\var{filelike} object's \method{close()} method when called.
\end{classdesc}
\subsection{\module{wsgiref.headers} -- WSGI response header tools}
\declaremodule{}{wsgiref.headers}
This module provides a single class, \class{Headers}, for convenient
manipulation of WSGI response headers using a mapping-like interface.
\begin{classdesc}{Headers}{headers}
Create a mapping-like object wrapping \var{headers}, which must be a
list of header name/value tuples as described in \pep{333}. Any changes
made to the new \class{Headers} object will directly update the
\var{headers} list it was created with.
\class{Headers} objects support typical mapping operations including
\method{__getitem__}, \method{get}, \method{__setitem__},
\method{setdefault}, \method{__delitem__}, \method{__contains__} and
\method{has_key}. For each of these methods, the key is the header name
(treated case-insensitively), and the value is the first value
associated with that header name. Setting a header deletes any existing
values for that header, then adds a new value at the end of the wrapped
header list. Headers' existing order is generally maintained, with new
headers added to the end of the wrapped list.
Unlike a dictionary, \class{Headers} objects do not raise an error when
you try to get or delete a key that isn't in the wrapped header list.
Getting a nonexistent header just returns \code{None}, and deleting
a nonexistent header does nothing.
\class{Headers} objects also support \method{keys()}, \method{values()},
and \method{items()} methods. The lists returned by \method{keys()}
and \method{items()} can include the same key more than once if there
is a multi-valued header. The \code{len()} of a \class{Headers} object
is the same as the length of its \method{items()}, which is the same
as the length of the wrapped header list. In fact, the \method{items()}
method just returns a copy of the wrapped header list.
Calling \code{str()} on a \class{Headers} object returns a formatted
string suitable for transmission as HTTP response headers. Each header
is placed on a line with its value, separated by a colon and a space.
Each line is terminated by a carriage return and line feed, and the
string is terminated with a blank line.
In addition to their mapping interface and formatting features,
\class{Headers} objects also have the following methods for querying
and adding multi-valued headers, and for adding headers with MIME
parameters:
\begin{methoddesc}{get_all}{name}
Return a list of all the values for the named header.
The returned list will be sorted in the order they appeared in the
original header list or were added to this instance, and may contain
duplicates. Any fields deleted and re-inserted are always appended to
the header list. If no fields exist with the given name, returns an
empty list.
\end{methoddesc}
\begin{methoddesc}{add_header}{name, value, **_params}
Add a (possibly multi-valued) header, with optional MIME parameters
specified via keyword arguments.
\var{name} is the header field to add. Keyword arguments can be used to
set MIME parameters for the header field. Each parameter must be a
string or \code{None}. Underscores in parameter names are converted to
dashes, since dashes are illegal in Python identifiers, but many MIME
parameter names include dashes. If the parameter value is a string, it
is added to the header value parameters in the form \code{name="value"}.
If it is \code{None}, only the parameter name is added. (This is used
for MIME parameters without a value.) Example usage:
\begin{verbatim}
h.add_header('content-disposition', 'attachment', filename='bud.gif')
\end{verbatim}
The above will add a header that looks like this:
\begin{verbatim}
Content-Disposition: attachment; filename="bud.gif"
\end{verbatim}
\end{methoddesc}
\end{classdesc}
\subsection{\module{wsgiref.simple_server} -- a simple WSGI HTTP server}
\declaremodule[wsgiref.simpleserver]{}{wsgiref.simple_server}
This module implements a simple HTTP server (based on
\module{BaseHTTPServer}) that serves WSGI applications. Each server
instance serves a single WSGI application on a given host and port. If
you want to serve multiple applications on a single host and port, you
should create a WSGI application that parses \code{PATH_INFO} to select
which application to invoke for each request. (E.g., using the
\function{shift_path_info()} function from \module{wsgiref.util}.)
\begin{funcdesc}{make_server}{host, port, app
\optional{, server_class=\class{WSGIServer} \optional{,
handler_class=\class{WSGIRequestHandler}}}}
Create a new WSGI server listening on \var{host} and \var{port},
accepting connections for \var{app}. The return value is an instance of
the supplied \var{server_class}, and will process requests using the
specified \var{handler_class}. \var{app} must be a WSGI application
object, as defined by \pep{333}.
Example usage:
\begin{verbatim}from wsgiref.simple_server import make_server, demo_app
httpd = make_server('', 8000, demo_app)
print "Serving HTTP on port 8000..."
# Respond to requests until process is killed
httpd.serve_forever()
# Alternative: serve one request, then exit
##httpd.handle_request()
\end{verbatim}
\end{funcdesc}
\begin{funcdesc}{demo_app}{environ, start_response}
This function is a small but complete WSGI application that
returns a text page containing the message ``Hello world!''
and a list of the key/value pairs provided in the
\var{environ} parameter. It's useful for verifying that a WSGI server
(such as \module{wsgiref.simple_server}) is able to run a simple WSGI
application correctly.
\end{funcdesc}
\begin{classdesc}{WSGIServer}{server_address, RequestHandlerClass}
Create a \class{WSGIServer} instance. \var{server_address} should be
a \code{(host,port)} tuple, and \var{RequestHandlerClass} should be
the subclass of \class{BaseHTTPServer.BaseHTTPRequestHandler} that will
be used to process requests.
You do not normally need to call this constructor, as the
\function{make_server()} function can handle all the details for you.
\class{WSGIServer} is a subclass
of \class{BaseHTTPServer.HTTPServer}, so all of its methods (such as
\method{serve_forever()} and \method{handle_request()}) are available.
\class{WSGIServer} also provides these WSGI-specific methods:
\begin{methoddesc}{set_app}{application}
Sets the callable \var{application} as the WSGI application that will
receive requests.
\end{methoddesc}
\begin{methoddesc}{get_app}{}
Returns the currently-set application callable.
\end{methoddesc}
Normally, however, you do not need to use these additional methods, as
\method{set_app()} is normally called by \function{make_server()}, and
the \method{get_app()} exists mainly for the benefit of request handler
instances.
\end{classdesc}
\begin{classdesc}{WSGIRequestHandler}{request, client_address, server}
Create an HTTP handler for the given \var{request} (i.e. a socket),
\var{client_address} (a \code{(\var{host},\var{port})} tuple), and
\var{server} (\class{WSGIServer} instance).
You do not need to create instances of this class directly; they are
automatically created as needed by \class{WSGIServer} objects. You
can, however, subclass this class and supply it as a \var{handler_class}
to the \function{make_server()} function. Some possibly relevant
methods for overriding in subclasses:
\begin{methoddesc}{get_environ}{}
Returns a dictionary containing the WSGI environment for a request. The
default implementation copies the contents of the \class{WSGIServer}
object's \member{base_environ} dictionary attribute and then adds
various headers derived from the HTTP request. Each call to this method
should return a new dictionary containing all of the relevant CGI
environment variables as specified in \pep{333}.
\end{methoddesc}
\begin{methoddesc}{get_stderr}{}
Return the object that should be used as the \code{wsgi.errors} stream.
The default implementation just returns \code{sys.stderr}.
\end{methoddesc}
\begin{methoddesc}{handle}{}
Process the HTTP request. The default implementation creates a handler
instance using a \module{wsgiref.handlers} class to implement the actual
WSGI application interface.
\end{methoddesc}
\end{classdesc}
\subsection{\module{wsgiref.validate} -- WSGI conformance checker}
\declaremodule{}{wsgiref.validate}
When creating new WSGI application objects, frameworks, servers, or
middleware, it can be useful to validate the new code's conformance
using \module{wsgiref.validate}. This module provides a function that
creates WSGI application objects that validate communications between
a WSGI server or gateway and a WSGI application object, to check both
sides for protocol conformance.
Note that this utility does not guarantee complete \pep{333} compliance;
an absence of errors from this module does not necessarily mean that
errors do not exist. However, if this module does produce an error,
then it is virtually certain that either the server or application is
not 100\% compliant.
This module is based on the \module{paste.lint} module from Ian
Bicking's ``Python Paste'' library.
\begin{funcdesc}{validator}{application}
Wrap \var{application} and return a new WSGI application object. The
returned application will forward all requests to the original
\var{application}, and will check that both the \var{application} and
the server invoking it are conforming to the WSGI specification and to
RFC 2616.
Any detected nonconformance results in an \exception{AssertionError}
being raised; note, however, that how these errors are handled is
server-dependent. For example, \module{wsgiref.simple_server} and other
servers based on \module{wsgiref.handlers} (that don't override the
error handling methods to do something else) will simply output a
message that an error has occurred, and dump the traceback to
\code{sys.stderr} or some other error stream.
This wrapper may also generate output using the \module{warnings} module
to indicate behaviors that are questionable but which may not actually
be prohibited by \pep{333}. Unless they are suppressed using Python
command-line options or the \module{warnings} API, any such warnings
will be written to \code{sys.stderr} (\emph{not} \code{wsgi.errors},
unless they happen to be the same object).
\end{funcdesc}
\subsection{\module{wsgiref.handlers} -- server/gateway base classes}
\declaremodule{}{wsgiref.handlers}
This module provides base handler classes for implementing WSGI servers
and gateways. These base classes handle most of the work of
communicating with a WSGI application, as long as they are given a
CGI-like environment, along with input, output, and error streams.
\begin{classdesc}{CGIHandler}{}
CGI-based invocation via \code{sys.stdin}, \code{sys.stdout},
\code{sys.stderr} and \code{os.environ}. This is useful when you have
a WSGI application and want to run it as a CGI script. Simply invoke
\code{CGIHandler().run(app)}, where \code{app} is the WSGI application
object you wish to invoke.
This class is a subclass of \class{BaseCGIHandler} that sets
\code{wsgi.run_once} to true, \code{wsgi.multithread} to false, and
\code{wsgi.multiprocess} to true, and always uses \module{sys} and
\module{os} to obtain the necessary CGI streams and environment.
\end{classdesc}
\begin{classdesc}{BaseCGIHandler}{stdin, stdout, stderr, environ
\optional{, multithread=True \optional{, multiprocess=False}}}
Similar to \class{CGIHandler}, but instead of using the \module{sys} and
\module{os} modules, the CGI environment and I/O streams are specified
explicitly. The \var{multithread} and \var{multiprocess} values are
used to set the \code{wsgi.multithread} and \code{wsgi.multiprocess}
flags for any applications run by the handler instance.
This class is a subclass of \class{SimpleHandler} intended for use with
software other than HTTP ``origin servers''. If you are writing a
gateway protocol implementation (such as CGI, FastCGI, SCGI, etc.) that
uses a \code{Status:} header to send an HTTP status, you probably want
to subclass this instead of \class{SimpleHandler}.
\end{classdesc}
\begin{classdesc}{SimpleHandler}{stdin, stdout, stderr, environ
\optional{,multithread=True \optional{, multiprocess=False}}}
Similar to \class{BaseCGIHandler}, but designed for use with HTTP origin
servers. If you are writing an HTTP server implementation, you will
probably want to subclass this instead of \class{BaseCGIHandler}
This class is a subclass of \class{BaseHandler}. It overrides the
\method{__init__()}, \method{get_stdin()}, \method{get_stderr()},
\method{add_cgi_vars()}, \method{_write()}, and \method{_flush()}
methods to support explicitly setting the environment and streams via
the constructor. The supplied environment and streams are stored in
the \member{stdin}, \member{stdout}, \member{stderr}, and
\member{environ} attributes.
\end{classdesc}
\begin{classdesc}{BaseHandler}{}
This is an abstract base class for running WSGI applications. Each
instance will handle a single HTTP request, although in principle you
could create a subclass that was reusable for multiple requests.
\class{BaseHandler} instances have only one method intended for external
use:
\begin{methoddesc}{run}{app}
Run the specified WSGI application, \var{app}.
\end{methoddesc}
All of the other \class{BaseHandler} methods are invoked by this method
in the process of running the application, and thus exist primarily to
allow customizing the process.
The following methods MUST be overridden in a subclass:
\begin{methoddesc}{_write}{data}
Buffer the string \var{data} for transmission to the client. It's okay
if this method actually transmits the data; \class{BaseHandler}
just separates write and flush operations for greater efficiency
when the underlying system actually has such a distinction.
\end{methoddesc}
\begin{methoddesc}{_flush}{}
Force buffered data to be transmitted to the client. It's okay if this
method is a no-op (i.e., if \method{_write()} actually sends the data).
\end{methoddesc}
\begin{methoddesc}{get_stdin}{}
Return an input stream object suitable for use as the \code{wsgi.input}
of the request currently being processed.
\end{methoddesc}
\begin{methoddesc}{get_stderr}{}
Return an output stream object suitable for use as the
\code{wsgi.errors} of the request currently being processed.
\end{methoddesc}
\begin{methoddesc}{add_cgi_vars}{}
Insert CGI variables for the current request into the \member{environ}
attribute.
\end{methoddesc}
Here are some other methods and attributes you may wish to override.
This list is only a summary, however, and does not include every method
that can be overridden. You should consult the docstrings and source
code for additional information before attempting to create a customized
\class{BaseHandler} subclass.
Attributes and methods for customizing the WSGI environment:
\begin{memberdesc}{wsgi_multithread}
The value to be used for the \code{wsgi.multithread} environment
variable. It defaults to true in \class{BaseHandler}, but may have
a different default (or be set by the constructor) in the other
subclasses.
\end{memberdesc}
\begin{memberdesc}{wsgi_multiprocess}
The value to be used for the \code{wsgi.multiprocess} environment
variable. It defaults to true in \class{BaseHandler}, but may have
a different default (or be set by the constructor) in the other
subclasses.
\end{memberdesc}
\begin{memberdesc}{wsgi_run_once}
The value to be used for the \code{wsgi.run_once} environment
variable. It defaults to false in \class{BaseHandler}, but
\class{CGIHandler} sets it to true by default.
\end{memberdesc}
\begin{memberdesc}{os_environ}
The default environment variables to be included in every request's
WSGI environment. By default, this is a copy of \code{os.environ} at
the time that \module{wsgiref.handlers} was imported, but subclasses can
either create their own at the class or instance level. Note that the
dictionary should be considered read-only, since the default value is
shared between multiple classes and instances.
\end{memberdesc}
\begin{memberdesc}{server_software}
If the \member{origin_server} attribute is set, this attribute's value
is used to set the default \code{SERVER_SOFTWARE} WSGI environment
variable, and also to set a default \code{Server:} header in HTTP
responses. It is ignored for handlers (such as \class{BaseCGIHandler}
and \class{CGIHandler}) that are not HTTP origin servers.
\end{memberdesc}
\begin{methoddesc}{get_scheme}{}
Return the URL scheme being used for the current request. The default
implementation uses the \function{guess_scheme()} function from
\module{wsgiref.util} to guess whether the scheme should be ``http'' or
``https'', based on the current request's \member{environ} variables.
\end{methoddesc}
\begin{methoddesc}{setup_environ}{}
Set the \member{environ} attribute to a fully-populated WSGI
environment. The default implementation uses all of the above methods
and attributes, plus the \method{get_stdin()}, \method{get_stderr()},
and \method{add_cgi_vars()} methods and the \member{wsgi_file_wrapper}
attribute. It also inserts a \code{SERVER_SOFTWARE} key if not present,
as long as the \member{origin_server} attribute is a true value and the
\member{server_software} attribute is set.
\end{methoddesc}
Methods and attributes for customizing exception handling:
\begin{methoddesc}{log_exception}{exc_info}
Log the \var{exc_info} tuple in the server log. \var{exc_info} is a
\code{(\var{type}, \var{value}, \var{traceback})} tuple. The default
implementation simply writes the traceback to the request's
\code{wsgi.errors} stream and flushes it. Subclasses can override this
method to change the format or retarget the output, mail the traceback
to an administrator, or whatever other action may be deemed suitable.
\end{methoddesc}
\begin{memberdesc}{traceback_limit}
The maximum number of frames to include in tracebacks output by the
default \method{log_exception()} method. If \code{None}, all frames
are included.
\end{memberdesc}
\begin{methoddesc}{error_output}{environ, start_response}
This method is a WSGI application to generate an error page for the
user. It is only invoked if an error occurs before headers are sent
to the client.
This method can access the current error information using
\code{sys.exc_info()}, and should pass that information to
\var{start_response} when calling it (as described in the ``Error
Handling'' section of \pep{333}).
The default implementation just uses the \member{error_status},
\member{error_headers}, and \member{error_body} attributes to generate
an output page. Subclasses can override this to produce more dynamic
error output.
Note, however, that it's not recommended from a security perspective to
spit out diagnostics to any old user; ideally, you should have to do
something special to enable diagnostic output, which is why the default
implementation doesn't include any.
\end{methoddesc}
\begin{memberdesc}{error_status}
The HTTP status used for error responses. This should be a status
string as defined in \pep{333}; it defaults to a 500 code and message.
\end{memberdesc}
\begin{memberdesc}{error_headers}
The HTTP headers used for error responses. This should be a list of
WSGI response headers (\code{(\var{name}, \var{value})} tuples), as
described in \pep{333}. The default list just sets the content type
to \code{text/plain}.
\end{memberdesc}
\begin{memberdesc}{error_body}
The error response body. This should be an HTTP response body string.
It defaults to the plain text, ``A server error occurred. Please
contact the administrator.''
\end{memberdesc}
Methods and attributes for \pep{333}'s ``Optional Platform-Specific File
Handling'' feature:
\begin{memberdesc}{wsgi_file_wrapper}
A \code{wsgi.file_wrapper} factory, or \code{None}. The default value
of this attribute is the \class{FileWrapper} class from
\module{wsgiref.util}.
\end{memberdesc}
\begin{methoddesc}{sendfile}{}
Override to implement platform-specific file transmission. This method
is called only if the application's return value is an instance of
the class specified by the \member{wsgi_file_wrapper} attribute. It
should return a true value if it was able to successfully transmit the
file, so that the default transmission code will not be executed.
The default implementation of this method just returns a false value.
\end{methoddesc}
Miscellaneous methods and attributes:
\begin{memberdesc}{origin_server}
This attribute should be set to a true value if the handler's
\method{_write()} and \method{_flush()} are being used to communicate
directly to the client, rather than via a CGI-like gateway protocol that
wants the HTTP status in a special \code{Status:} header.
This attribute's default value is true in \class{BaseHandler}, but
false in \class{BaseCGIHandler} and \class{CGIHandler}.
\end{memberdesc}
\begin{memberdesc}{http_version}
If \member{origin_server} is true, this string attribute is used to
set the HTTP version of the response set to the client. It defaults to
\code{"1.0"}.
\end{memberdesc}
\end{classdesc}