294 lines
12 KiB
TeX
294 lines
12 KiB
TeX
\section{\module{SocketServer} ---
|
|
A framework for network servers}
|
|
|
|
\declaremodule{standard}{SocketServer}
|
|
\modulesynopsis{A framework for network servers.}
|
|
|
|
|
|
The \module{SocketServer} module simplifies the task of writing network
|
|
servers.
|
|
|
|
There are four basic server classes: \class{TCPServer} uses the
|
|
Internet TCP protocol, which provides for continuous streams of data
|
|
between the client and server. \class{UDPServer} uses datagrams, which
|
|
are discrete packets of information that may arrive out of order or be
|
|
lost while in transit. The more infrequently used
|
|
\class{UnixStreamServer} and \class{UnixDatagramServer} classes are
|
|
similar, but use \UNIX{} domain sockets; they're not available on
|
|
non-\UNIX{} platforms. For more details on network programming, consult
|
|
a book such as W. Richard Steven's \citetitle{UNIX Network Programming}
|
|
or Ralph Davis's \citetitle{Win32 Network Programming}.
|
|
|
|
These four classes process requests \dfn{synchronously}; each request
|
|
must be completed before the next request can be started. This isn't
|
|
suitable if each request takes a long time to complete, because it
|
|
requires a lot of computation, or because it returns a lot of data
|
|
which the client is slow to process. The solution is to create a
|
|
separate process or thread to handle each request; the
|
|
\class{ForkingMixIn} and \class{ThreadingMixIn} mix-in classes can be
|
|
used to support asynchronous behaviour.
|
|
|
|
Creating a server requires several steps. First, you must create a
|
|
request handler class by subclassing the \class{BaseRequestHandler}
|
|
class and overriding its \method{handle()} method; this method will
|
|
process incoming requests. Second, you must instantiate one of the
|
|
server classes, passing it the server's address and the request
|
|
handler class. Finally, call the \method{handle_request()} or
|
|
\method{serve_forever()} method of the server object to process one or
|
|
many requests.
|
|
|
|
When inheriting from \class{ThreadingMixIn} for threaded connection
|
|
behavior, you should explicitly declare how you want your threads
|
|
to behave on an abrupt shutdown. The \class{ThreadingMixIn} class
|
|
defines an attribute \var{daemon_threads}, which indicates whether
|
|
or not the server should wait for thread termination. You should
|
|
set the flag explicitly if you would like threads to behave
|
|
autonomously; the default is \constant{False}, meaning that Python
|
|
will not exit until all threads created by \class{ThreadingMixIn} have
|
|
exited.
|
|
|
|
Server classes have the same external methods and attributes, no
|
|
matter what network protocol they use:
|
|
|
|
\setindexsubitem{(SocketServer protocol)}
|
|
|
|
\subsection{Server Creation Notes}
|
|
|
|
There are five classes in an inheritance diagram, four of which represent
|
|
synchronous servers of four types:
|
|
|
|
\begin{verbatim}
|
|
+------------+
|
|
| BaseServer |
|
|
+------------+
|
|
|
|
|
v
|
|
+-----------+ +------------------+
|
|
| TCPServer |------->| UnixStreamServer |
|
|
+-----------+ +------------------+
|
|
|
|
|
v
|
|
+-----------+ +--------------------+
|
|
| UDPServer |------->| UnixDatagramServer |
|
|
+-----------+ +--------------------+
|
|
\end{verbatim}
|
|
|
|
Note that \class{UnixDatagramServer} derives from \class{UDPServer}, not
|
|
from \class{UnixStreamServer} -- the only difference between an IP and a
|
|
Unix stream server is the address family, which is simply repeated in both
|
|
unix server classes.
|
|
|
|
Forking and threading versions of each type of server can be created using
|
|
the \class{ForkingMixIn} and \class{ThreadingMixIn} mix-in classes. For
|
|
instance, a threading UDP server class is created as follows:
|
|
|
|
\begin{verbatim}
|
|
class ThreadingUDPServer(ThreadingMixIn, UDPServer): pass
|
|
\end{verbatim}
|
|
|
|
The mix-in class must come first, since it overrides a method defined in
|
|
\class{UDPServer}. Setting the various member variables also changes the
|
|
behavior of the underlying server mechanism.
|
|
|
|
To implement a service, you must derive a class from
|
|
\class{BaseRequestHandler} and redefine its \method{handle()} method. You
|
|
can then run various versions of the service by combining one of the server
|
|
classes with your request handler class. The request handler class must be
|
|
different for datagram or stream services. This can be hidden by using the
|
|
handler subclasses \class{StreamRequestHandler} or \class{DatagramRequestHandler}.
|
|
|
|
Of course, you still have to use your head! For instance, it makes no sense
|
|
to use a forking server if the service contains state in memory that can be
|
|
modified by different requests, since the modifications in the child process
|
|
would never reach the initial state kept in the parent process and passed to
|
|
each child. In this case, you can use a threading server, but you will
|
|
probably have to use locks to protect the integrity of the shared data.
|
|
|
|
On the other hand, if you are building an HTTP server where all data is
|
|
stored externally (for instance, in the file system), a synchronous class
|
|
will essentially render the service "deaf" while one request is being
|
|
handled -- which may be for a very long time if a client is slow to receive
|
|
all the data it has requested. Here a threading or forking server is
|
|
appropriate.
|
|
|
|
In some cases, it may be appropriate to process part of a request
|
|
synchronously, but to finish processing in a forked child depending on the
|
|
request data. This can be implemented by using a synchronous server and
|
|
doing an explicit fork in the request handler class \method{handle()}
|
|
method.
|
|
|
|
Another approach to handling multiple simultaneous requests in an
|
|
environment that supports neither threads nor \function{fork()} (or where
|
|
these are too expensive or inappropriate for the service) is to maintain an
|
|
explicit table of partially finished requests and to use \function{select()}
|
|
to decide which request to work on next (or whether to handle a new incoming
|
|
request). This is particularly important for stream services where each
|
|
client can potentially be connected for a long time (if threads or
|
|
subprocesses cannot be used).
|
|
|
|
%XXX should data and methods be intermingled, or separate?
|
|
% how should the distinction between class and instance variables be
|
|
% drawn?
|
|
|
|
\subsection{Server Objects}
|
|
|
|
\begin{funcdesc}{fileno}{}
|
|
Return an integer file descriptor for the socket on which the server
|
|
is listening. This function is most commonly passed to
|
|
\function{select.select()}, to allow monitoring multiple servers in the
|
|
same process.
|
|
\end{funcdesc}
|
|
|
|
\begin{funcdesc}{handle_request}{}
|
|
Process a single request. This function calls the following methods
|
|
in order: \method{get_request()}, \method{verify_request()}, and
|
|
\method{process_request()}. If the user-provided \method{handle()}
|
|
method of the handler class raises an exception, the server's
|
|
\method{handle_error()} method will be called.
|
|
\end{funcdesc}
|
|
|
|
\begin{funcdesc}{serve_forever}{}
|
|
Handle an infinite number of requests. This simply calls
|
|
\method{handle_request()} inside an infinite loop.
|
|
\end{funcdesc}
|
|
|
|
\begin{datadesc}{address_family}
|
|
The family of protocols to which the server's socket belongs.
|
|
\constant{socket.AF_INET} and \constant{socket.AF_UNIX} are two
|
|
possible values.
|
|
\end{datadesc}
|
|
|
|
\begin{datadesc}{RequestHandlerClass}
|
|
The user-provided request handler class; an instance of this class is
|
|
created for each request.
|
|
\end{datadesc}
|
|
|
|
\begin{datadesc}{server_address}
|
|
The address on which the server is listening. The format of addresses
|
|
varies depending on the protocol family; see the documentation for the
|
|
socket module for details. For Internet protocols, this is a tuple
|
|
containing a string giving the address, and an integer port number:
|
|
\code{('127.0.0.1', 80)}, for example.
|
|
\end{datadesc}
|
|
|
|
\begin{datadesc}{socket}
|
|
The socket object on which the server will listen for incoming requests.
|
|
\end{datadesc}
|
|
|
|
% XXX should class variables be covered before instance variables, or
|
|
% vice versa?
|
|
|
|
The server classes support the following class variables:
|
|
|
|
\begin{datadesc}{allow_reuse_address}
|
|
Whether the server will allow the reuse of an address. This defaults
|
|
to \constant{False}, and can be set in subclasses to change the policy.
|
|
\end{datadesc}
|
|
|
|
\begin{datadesc}{request_queue_size}
|
|
The size of the request queue. If it takes a long time to process a
|
|
single request, any requests that arrive while the server is busy are
|
|
placed into a queue, up to \member{request_queue_size} requests. Once
|
|
the queue is full, further requests from clients will get a
|
|
``Connection denied'' error. The default value is usually 5, but this
|
|
can be overridden by subclasses.
|
|
\end{datadesc}
|
|
|
|
\begin{datadesc}{socket_type}
|
|
The type of socket used by the server; \constant{socket.SOCK_STREAM}
|
|
and \constant{socket.SOCK_DGRAM} are two possible values.
|
|
\end{datadesc}
|
|
|
|
There are various server methods that can be overridden by subclasses
|
|
of base server classes like \class{TCPServer}; these methods aren't
|
|
useful to external users of the server object.
|
|
|
|
% should the default implementations of these be documented, or should
|
|
% it be assumed that the user will look at SocketServer.py?
|
|
|
|
\begin{funcdesc}{finish_request}{}
|
|
Actually processes the request by instantiating
|
|
\member{RequestHandlerClass} and calling its \method{handle()} method.
|
|
\end{funcdesc}
|
|
|
|
\begin{funcdesc}{get_request}{}
|
|
Must accept a request from the socket, and return a 2-tuple containing
|
|
the \emph{new} socket object to be used to communicate with the
|
|
client, and the client's address.
|
|
\end{funcdesc}
|
|
|
|
\begin{funcdesc}{handle_error}{request, client_address}
|
|
This function is called if the \member{RequestHandlerClass}'s
|
|
\method{handle()} method raises an exception. The default action is
|
|
to print the traceback to standard output and continue handling
|
|
further requests.
|
|
\end{funcdesc}
|
|
|
|
\begin{funcdesc}{process_request}{request, client_address}
|
|
Calls \method{finish_request()} to create an instance of the
|
|
\member{RequestHandlerClass}. If desired, this function can create a
|
|
new process or thread to handle the request; the \class{ForkingMixIn}
|
|
and \class{ThreadingMixIn} classes do this.
|
|
\end{funcdesc}
|
|
|
|
% Is there any point in documenting the following two functions?
|
|
% What would the purpose of overriding them be: initializing server
|
|
% instance variables, adding new network families?
|
|
|
|
\begin{funcdesc}{server_activate}{}
|
|
Called by the server's constructor to activate the server. The default
|
|
behavior just \method{listen}s to the server's socket.
|
|
May be overridden.
|
|
\end{funcdesc}
|
|
|
|
\begin{funcdesc}{server_bind}{}
|
|
Called by the server's constructor to bind the socket to the desired
|
|
address. May be overridden.
|
|
\end{funcdesc}
|
|
|
|
\begin{funcdesc}{verify_request}{request, client_address}
|
|
Must return a Boolean value; if the value is \constant{True}, the request will be
|
|
processed, and if it's \constant{False}, the request will be denied.
|
|
This function can be overridden to implement access controls for a server.
|
|
The default implementation always returns \constant{True}.
|
|
\end{funcdesc}
|
|
|
|
\subsection{RequestHandler Objects}
|
|
|
|
The request handler class must define a new \method{handle()} method,
|
|
and can override any of the following methods. A new instance is
|
|
created for each request.
|
|
|
|
\begin{funcdesc}{finish}{}
|
|
Called after the \method{handle()} method to perform any clean-up
|
|
actions required. The default implementation does nothing. If
|
|
\method{setup()} or \method{handle()} raise an exception, this
|
|
function will not be called.
|
|
\end{funcdesc}
|
|
|
|
\begin{funcdesc}{handle}{}
|
|
This function must do all the work required to service a request.
|
|
The default implementation does nothing.
|
|
Several instance attributes are available to it; the request is
|
|
available as \member{self.request}; the client address as
|
|
\member{self.client_address}; and the server instance as
|
|
\member{self.server}, in case it needs access to per-server
|
|
information.
|
|
|
|
The type of \member{self.request} is different for datagram or stream
|
|
services. For stream services, \member{self.request} is a socket
|
|
object; for datagram services, \member{self.request} is a string.
|
|
However, this can be hidden by using the request handler subclasses
|
|
\class{StreamRequestHandler} or \class{DatagramRequestHandler}, which
|
|
override the \method{setup()} and \method{finish()} methods, and
|
|
provide \member{self.rfile} and \member{self.wfile} attributes.
|
|
\member{self.rfile} and \member{self.wfile} can be read or written,
|
|
respectively, to get the request data or return data to the client.
|
|
\end{funcdesc}
|
|
|
|
\begin{funcdesc}{setup}{}
|
|
Called before the \method{handle()} method to perform any
|
|
initialization actions required. The default implementation does
|
|
nothing.
|
|
\end{funcdesc}
|