\documentclass{howto}
\usepackage{ltxmarkup}

\title{Documenting Python}

\input{boilerplate}

% Now override the stuff that includes author information;
% Guido did *not* write this one!

\author{Fred L. Drake, Jr.}
\authoraddress{
        PythonLabs \\
        E-mail: \email{fdrake@acm.org}
}

\begin{document}

\maketitle

\begin{abstract}
\noindent
The Python language has a substantial body of documentation, much of it contributed by various authors. The markup used for the Python documentation is based on \LaTeX{} and requires a significant set of macros written specifically for documenting Python. This document describes the macros introduced to support Python documentation and how they should be used to support a wide range of output formats.

This document describes the document classes and special markup used in the Python documentation. Authors may use this guide, in conjunction with the template files provided with the distribution, to create or maintain whole documents or sections.
\end{abstract}

\tableofcontents

\section{Introduction}

Python's documentation has long been considered to be good for a free programming language. There are a number of reasons for this, the most important being the early commitment of Python's creator, Guido van Rossum, to providing documentation on the language and its libraries, and the continuing involvement of the user community in providing assistance for creating and maintaining documentation.

The involvement of the community takes many forms, from authoring to bug reports to just plain complaining when the documentation could be more complete or easier to use. All of these forms of input from the community have proved useful during the time I've been involved in maintaining the documentation.

This document is aimed at authors and potential authors of documentation for Python. More specifically, it is for people contributing to the standard documentation and developing additional documents using the same tools as the standard documents.
This guide will be less useful for authors using the Python documentation tools for topics other than Python, and less useful still for authors not using the tools at all. The material in this guide is intended to assist authors using the Python documentation tools. It includes information on the source distribution of the standard documentation, a discussion of the document types, reference material on the markup defined in the document classes, a list of the external tools needed for processing documents, and reference material on the tools provided with the documentation resources. At the end, there is also a section discussing future directions for the Python documentation and where to turn for more information. \section{Directory Structure} The source distribution for the standard Python documentation contains a large number of directories. While third-party documents do not need to be placed into this structure or need to be placed within a similar structure, it can be helpful to know where to look for examples and tools when developing new documents using the Python documentation tools. This section describes this directory structure. The documentation sources are usually placed within the Python source distribution as the top-level directory \file{Doc/}, but are not dependent on the Python source distribution in any way. The \file{Doc/} directory contains a few files and several subdirectories. The files are mostly self-explanatory, including a \file{README} and a \file{Makefile}. The directories fall into three categories: \begin{definitions} \term{Document Sources} The \LaTeX{} sources for each document are placed in a separate directory. 
These directories are given short names which vaguely indicate the document in each:

\begin{tableii}{p{.75in}|p{3in}}{filenq}{Directory}{Document Title}
  \lineii{api/}
         {\citetitle[../api/api.html]{The Python/C API}}
  \lineii{dist/}
         {\citetitle[../dist/dist.html]{Distributing Python Modules}}
  \lineii{doc/}
         {\citetitle[../doc/doc.html]{Documenting Python}}
  \lineii{ext/}
         {\citetitle[../ext/ext.html]{Extending and Embedding the Python Interpreter}}
  \lineii{inst/}
         {\citetitle[../inst/inst.html]{Installing Python Modules}}
  \lineii{lib/}
         {\citetitle[../lib/lib.html]{Python Library Reference}}
  \lineii{mac/}
         {\citetitle[../mac/mac.html]{Macintosh Module Reference}}
  \lineii{ref/}
         {\citetitle[../ref/ref.html]{Python Reference Manual}}
  \lineii{tut/}
         {\citetitle[../tut/tut.html]{Python Tutorial}}
\end{tableii}

\term{Format-Specific Output}
Most output formats have a directory which contains a \file{Makefile} which controls the generation of that format and provides storage for the formatted documents. The only variation within this category is that the Portable Document Format (PDF) and PostScript versions are placed together in the directories \file{paper-a4/} and \file{paper-letter/} (this causes all the temporary files created by \LaTeX{} to be kept in the same place for each paper size, where they can be more easily ignored).

\begin{tableii}{p{.75in}|p{3in}}{filenq}{Directory}{Output Formats}
  \lineii{html/}{HTML output}
  \lineii{info/}{GNU info output}
  \lineii{paper-a4/}{PDF and PostScript, A4 paper}
  \lineii{paper-letter/}{PDF and PostScript, US-Letter paper}
\end{tableii}

\term{Supplemental Files}
Some additional directories are used to store supplemental files used for the various processes. Directories are included for the shared \LaTeX{} document classes, the \LaTeX2HTML support, template files for various document components, and the scripts used to perform various steps in the formatting processes.
\begin{tableii}{p{.75in}|p{3in}}{filenq}{Directory}{Contents} \lineii{perl/}{Support for \LaTeX2HTML processing} \lineii{templates/}{Example files for source documents} \lineii{texinputs/}{Style implementation for \LaTeX} \lineii{tools/}{Custom processing scripts} \end{tableii} \end{definitions} \section{\LaTeX{} Primer \label{latex-primer}} This section is a brief introduction to \LaTeX{} concepts and syntax, to provide authors enough information to author documents productively without having to become ``\TeX{}nicians.'' Perhaps the most important concept to keep in mind while marking up Python documentation is that while \TeX{} is unstructured, \LaTeX{} was designed as a layer on top of \TeX{} which specifically supports structured markup. The Python-specific markup is intended to extend the structure provided by standard \LaTeX{} document classes to support additional information specific to Python. \LaTeX{} documents contain two parts: the preamble and the body. The preamble is used to specify certain metadata about the document itself, such as the title, the list of authors, the date, and the \emph{class} the document belongs to. Additional information used to control index generation and the use of bibliographic databases can also be placed in the preamble. For most authors, the preamble can be most easily created by copying it from an existing document and modifying a few key pieces of information. The \dfn{class} of a document is used to place a document within a broad category of documents and set some fundamental formatting properties. For Python documentation, two classes are used: the \code{manual} class and the \code{howto} class. These classes also define the additional markup used to document Python concepts and structures. Specific information about these classes is provided in section \ref{classes}, ``Document Classes,'' below. The first thing in the preamble is the declaration of the document's class. 
After the class declaration, a number of \emph{macros} are used to provide further information about the document and set up any additional markup that is needed. No output is generated from the preamble; it is an error to include free text in the preamble because it would cause output.

The document body follows the preamble. This contains all the printed components of the document marked up structurally. Generic \LaTeX{} structures include hierarchical sections, numbered and bulleted lists, and special structures for the document abstract and indexes.

\subsection{Syntax}

There are some things that an author of Python documentation needs to know about \LaTeX{} syntax.

A \dfn{comment} is started by the ``percent'' character (\character{\%}) and continues through the end of the line and all leading whitespace on the following line. This is a little different from any programming language I know of, so an example is in order:

\begin{verbatim}
This is text.% comment
    This is more text.  % another comment
Still more text.
\end{verbatim}

The first non-comment character following the first comment is the letter \character{T} on the second line; the leading whitespace on that line is consumed as part of the first comment. This means that there is no space between the first and second sentences, so the period and letter \character{T} will be directly adjacent in the typeset document.

Note also that though the first non-comment character after the second comment is the letter \character{S}, there is whitespace preceding the comment, so the two sentences are separated as expected.

A \dfn{group} is an enclosure for a collection of text and commands which encloses the formatting context and constrains the scope of any changes to that context made by commands within the group. Groups can be nested hierarchically. The formatting context includes the font and the definition of additional macros (or overrides of macros defined in outer groups).
Syntactically, groups are enclosed in braces:

\begin{verbatim}
{text in a group}
\end{verbatim}

An alternate syntax for a group using brackets, \code{[...]}, is used by macros and environment constructors which take optional parameters; brackets do not normally hold syntactic significance. A degenerate group, containing only one atomic bit of content, does not need to have an explicit group, unless it is required to avoid ambiguity. Since Python tends toward the explicit, groups are also made explicit in the documentation markup.

Groups are used only sparingly in the Python documentation, except for their use in marking parameters to macros and environments.

A \dfn{macro} is usually a simple construct which is identified by name and can take some number of parameters. In normal \LaTeX{} usage, one of these can be optional. The markup is introduced using the backslash character (\character{\e}), and the name is given by alphabetic characters (no digits, hyphens, or underscores). Required parameters should be marked as a group, and optional parameters should be marked using the alternate syntax for a group.

For example, a macro named ``name'' which takes a single parameter would appear like this:

\begin{verbatim}
\name{parameter}
\end{verbatim}

A macro which takes an optional parameter would be typed like this when the optional parameter is given:

\begin{verbatim}
\name[optional]
\end{verbatim}

If both optional and required parameters are given, it looks like this:

\begin{verbatim}
\name[optional]{required}
\end{verbatim}

A macro name may be followed by a space or newline; a space between the macro name and any parameters will be consumed, but this usage is not practiced in the Python documentation.
Such a space is still consumed if there are no parameters to the macro, in which case inserting an empty group (\code{\{\}}) or explicit word space (\samp{\e\ }) immediately after the macro name helps to avoid running the expansion of the macro into the following text. Macros which take no parameters but which should not be followed by a word space do not need special treatment if the following character in the document source is not a name character (such as punctuation).

Each line of this example shows an appropriate way to write text which includes a macro which takes no parameters:

\begin{verbatim}
This \UNIX{} is followed by a space.
This \UNIX\ is also followed by a space.
\UNIX, followed by a comma, needs no additional markup.
\end{verbatim}

An \dfn{environment} is a larger construct than a macro, and can be used for things with more content than would conveniently fit in a macro parameter. Environments are primarily used when formatting parameters need to be changed before and after a large chunk of content, but the content itself needs to be highly flexible. Code samples are presented using an environment, and descriptions of functions, methods, and classes are also marked using environments.

Since the content of an environment is free-form and can consist of several paragraphs, environments are actually marked using a pair of macros: \macro{begin} and \macro{end}. These macros both take the name of the environment as a parameter. An example is the environment used to mark the abstract of a document:

\begin{verbatim}
\begin{abstract}
  This is the text of the abstract.  It concisely explains what
  information is found in the document.

  It can consist of multiple paragraphs.
\end{abstract}
\end{verbatim}

An environment can also have required and optional parameters of its own. These follow the parameter of the \macro{begin} macro.
This example shows an environment which takes a single required parameter: \begin{verbatim} \begin{datadesc}{controlnames} A 33-element string array that contains the \ASCII{} mnemonics for the thirty-two \ASCII{} control characters from 0 (NUL) to 0x1f (US), in order, plus the mnemonic \samp{SP} for the space character. \end{datadesc} \end{verbatim} There are a number of less-used marks in \LaTeX{} which are used to enter non-\ASCII{} characters, especially those used in European names. Given that these are often used adjacent to other characters, the markup required to produce the proper character may need to be followed by a space or an empty group, or the markup can be enclosed in a group. Some which are found in Python documentation are: \begin{tableii}{c|l}{textrm}{Character}{Markup} \lineii{\c c}{\code{\e c c}} \lineii{\"o}{\code{\e"o}} \lineii{\o}{\code{\e o}} \end{tableii} \subsection{Hierarchical Structure} \LaTeX{} expects documents to be arranged in a conventional, hierarchical way, with chapters, sections, sub-sections, appendixes, and the like. These are marked using macros rather than environments, probably because the end of a section can be safely inferred when a section of equal or higher level starts. There are six ``levels'' of sectioning in the document classes used for Python documentation, and the deepest two levels\footnote{The deepest levels have the highest numbers in the table.} are not used. The levels are: \begin{tableiii}{c|l|c}{textrm}{Level}{Macro Name}{Notes} \lineiii{1}{\macro{chapter}}{(1)} \lineiii{2}{\macro{section}}{} \lineiii{3}{\macro{subsection}}{} \lineiii{4}{\macro{subsubsection}}{} \lineiii{5}{\macro{paragraph}}{(2)} \lineiii{6}{\macro{subparagraph}}{} \end{tableiii} \noindent Notes: \begin{description} \item[(1)] Only used for the \code{manual} documents, as described in section \ref{classes}, ``Document Classes.'' \item[(2)] Not the same as a paragraph of text; nobody seems to use this. 
\end{description}


\section{Document Classes \label{classes}}

Two \LaTeX{} document classes are defined specifically for use with the Python documentation. The \code{manual} class is for large documents which are sectioned into chapters, and the \code{howto} class is for smaller documents.

The \code{manual} documents are larger and are used for most of the standard documents. This document class is based on the standard \LaTeX{} \code{report} class and is formatted very much like a long technical report. The \citetitle[../ref/ref.html]{Python Reference Manual} is a good example of a \code{manual} document, and the \citetitle[../lib/lib.html]{Python Library Reference} is a large example.

The \code{howto} documents are shorter, and don't have the large structure of the \code{manual} documents. This class is based on the standard \LaTeX{} \code{article} class and is formatted somewhat like the Linux Documentation Project's ``HOWTO'' series as done originally using the LinuxDoc software. The original intent for the document class was that it serve a role similar to that of the LDP's HOWTO series, but the applicability of the class turns out to be somewhat broader. This class is used for ``how-to'' documents (this document is an example) and for shorter reference manuals for small, fairly cohesive module libraries. Examples of the latter use include the standard \citetitle[../mac/mac.html]{Macintosh Library Modules} and \citetitle[http://starship.python.net/crew/fdrake/manuals/krb5py/krb5py.html]{Using Kerberos from Python}, which contains reference material for an extension package. These documents are roughly equivalent to a single chapter from a larger work.


\section{Special Markup Constructs}

The Python document classes define a lot of new environments and macros. This section contains the reference material for these facilities.
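
As a rough orientation before the reference entries, the following sketch shows how a small \code{howto} document might be laid out. It is modeled on the preamble of this document; the title, author, address, and version number are placeholders invented for illustration, and \file{boilerplate} is assumed to be the same file of shared settings that this document's own preamble reads using \macro{input}.

\begin{verbatim}
\documentclass{howto}

% Placeholder metadata; replace with real values.
\title{Using the Spam Package}
\input{boilerplate}          % shared settings, as in this document
\author{Jane Doe}
\authoraddress{
        E-mail: \email{jane.doe@frobnitz.org}
}
\release{1.0}                % hypothetical version number

\begin{document}
\maketitle

\begin{abstract}
\noindent
A short summary of what the document covers.
\end{abstract}

\tableofcontents

\section{Introduction}

Body text goes here.

\end{document}
\end{verbatim}

A \code{manual} document would presumably follow the same outline, with \macro{chapter} as the top level of sectioning.
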
\subsection{Markup for the Preamble \label{preamble-info}} \begin{macrodesc}{release}{\p{ver}} Set the version number for the software described in the document. \end{macrodesc} \begin{macrodesc}{setshortversion}{\p{sver}} Specify the ``short'' version number of the documented software to be \var{sver}. \end{macrodesc} \subsection{Meta-information Markup \label{meta-info}} \begin{macrodesc}{sectionauthor}{\p{author}\p{email}} Identifies the author of the current section. \var{author} should be the author's name such that it can be used for presentation (though it isn't), and \var{email} should be the author's email address. The domain name portion of the address should be lower case. No presentation is generated from this markup, but it is used to help keep track of contributions. \end{macrodesc} \subsection{Information Units \label{info-units}} XXX Explain terminology, or come up with something more ``lay.'' There are a number of environments used to describe specific features provided by modules. Each environment requires parameters needed to provide basic information about what is being described, and the environment content should be the description. Most of these environments make entries in the general index (if one is being produced for the document); if no index entry is desired, non-indexing variants are available for many of these environments. The environments have names of the form \code{\var{feature}desc}, and the non-indexing variants are named \code{\var{feature}descni}. The available variants are explicitly included in the list below. For each of these environments, the first parameter, \var{name}, provides the name by which the feature is accessed. Environments which describe features of objects within a module, such as object methods or data attributes, allow an optional \var{type name} parameter. 
When the feature is an attribute of class instances, \var{type name} only needs to be given if the class was not the most recently described class in the module; the \var{name} value from the most recent \env{classdesc} is implied. For features of built-in or extension types, the \var{type name} value should always be provided. Another special case includes methods and members of general ``protocols,'' such as the formatter and writer protocols described for the \module{formatter} module: these may be documented without any specific implementation classes, and will always require the \var{type name} parameter to be provided.

\begin{envdesc}{cfuncdesc}{\p{type}\p{name}\p{args}}
  Environment used to describe a C function. The \var{type} should be specified as a \keyword{typedef} name, \code{struct \var{tag}}, or the name of a primitive type. If it is a pointer type, the trailing asterisk should not be preceded by a space. \var{name} should be the name of the function (or function-like pre-processor macro), and \var{args} should give the types and names of the parameters. The names need to be given so they may be used in the description.
\end{envdesc}

\begin{envdesc}{ctypedesc}{\op{tag}\p{name}}
  Environment used to describe a C type. The \var{name} parameter should be the \keyword{typedef} name. If the type is defined as a \keyword{struct} without a \keyword{typedef}, \var{name} should have the form \code{struct \var{tag}}. \var{name} will be added to the index unless \var{tag} is provided, in which case \var{tag} will be used instead. \var{tag} should not be used for a \keyword{typedef} name.
\end{envdesc}

\begin{envdesc}{cvardesc}{\p{type}\p{name}}
  Description of a global C variable. \var{type} should be the \keyword{typedef} name, \code{struct \var{tag}}, or the name of a primitive type. If the variable has a pointer type, the trailing asterisk should \emph{not} be preceded by a space.
\end{envdesc}

\begin{envdesc}{datadesc}{\p{name}}
  This environment is used to document global data in a module, including both variables and values used as ``defined constants.'' Class and object attributes are not documented using this environment.
\end{envdesc}

\begin{envdesc}{datadescni}{\p{name}}
  Like \env{datadesc}, but without creating any index entries.
\end{envdesc}

\begin{envdesc}{excclassdesc}{\p{name}\p{constructor parameters}}
  Describe an exception defined by a class. \var{constructor parameters} should not include the \var{self} parameter or the parentheses used in the call syntax. To describe an exception class without describing the parameters to its constructor, use the \env{excdesc} environment.
\end{envdesc}

\begin{envdesc}{excdesc}{\p{name}}
  Describe an exception. This may be either a string exception or a class exception. In the case of class exceptions, the constructor parameters are not described; use \env{excclassdesc} to describe an exception class and its constructor.
\end{envdesc}

\begin{envdesc}{funcdesc}{\p{name}\p{parameters}}
  Describe a module-level function. \var{parameters} should not include the parentheses used in the call syntax. Object methods are not documented using this environment. Bound object methods placed in the module namespace as part of the public interface of the module are documented using this, as they are equivalent to normal functions for most purposes.

  The description should include information about the parameters required and how they are used (especially whether mutable objects passed as parameters are modified), side effects, and possible exceptions. A small example may be provided.
\end{envdesc}

\begin{envdesc}{funcdescni}{\p{name}\p{parameters}}
  Like \env{funcdesc}, but without creating any index entries.
\end{envdesc}

\begin{envdesc}{classdesc}{\p{name}\p{constructor parameters}}
  Describe a class and its constructor.
  \var{constructor parameters} should not include the \var{self} parameter or the parentheses used in the call syntax.
\end{envdesc}

\begin{envdesc}{classdesc*}{\p{name}}
  Describe a class without describing the constructor. This can be used to describe classes that are merely containers for attributes or which should never be instantiated or subclassed by user code.
\end{envdesc}

\begin{envdesc}{memberdesc}{\op{type name}\p{name}}
  Describe an object data attribute. The description should include information about the type of the data to be expected and whether it may be changed directly.
\end{envdesc}

\begin{envdesc}{memberdescni}{\op{type name}\p{name}}
  Like \env{memberdesc}, but without creating any index entries.
\end{envdesc}

\begin{envdesc}{methoddesc}{\op{type name}\p{name}\p{parameters}}
  Describe an object method. \var{parameters} should not include the \var{self} parameter or the parentheses used in the call syntax. The description should include similar information to that described for \env{funcdesc}.
\end{envdesc}

\begin{envdesc}{methoddescni}{\op{type name}\p{name}\p{parameters}}
  Like \env{methoddesc}, but without creating any index entries.
\end{envdesc}


\subsection{Showing Code Examples}

Examples of Python source code or interactive sessions are represented as \env{verbatim} environments. This environment is a standard part of \LaTeX{}. It is important to only use spaces for indentation in code examples since \TeX{} drops tabs instead of converting them to spaces.

Representing an interactive session requires including the prompts and output along with the Python code. No special markup is required for interactive sessions. After the last line of input or output presented, there should not be an ``unused'' primary prompt; this is an example of what \emph{not} to do:

\begin{verbatim}
>>> 1 + 1
2
>>>
\end{verbatim}

Within the \env{verbatim} environment, characters special to \LaTeX{} do not need to be specially marked in any way.
The entire example will be presented in a monospaced font; no attempt at ``pretty-printing'' is made, as the environment must work for non-Python code and non-code displays. There should be no blank lines at the top or bottom of any \env{verbatim} display. Longer displays of verbatim text may be included by storing the example text in an external file containing only plain text. The file may be included using the standard \macro{verbatiminput} macro; this macro takes a single argument naming the file containing the text. For example, to include the Python source file \file{example.py}, use: \begin{verbatim} \verbatiminput{example.py} \end{verbatim} Use of \macro{verbatiminput} allows easier use of special editing modes for the included file. The file should be placed in the same directory as the \LaTeX{} files for the document. The Python Documentation Special Interest Group has discussed a number of approaches to creating pretty-printed code displays and interactive sessions; see the Doc-SIG area on the Python Web site for more information on this topic. \subsection{Inline Markup} The macros described in this section are used to mark just about anything interesting in the document text. They may be used in headings (though anything involving hyperlinks should be avoided there) as well as in the body text. \begin{macrodesc}{bfcode}{\p{text}} Like \macro{code}, but also makes the font bold-face. \end{macrodesc} \begin{macrodesc}{cdata}{\p{name}} The name of a C-language variable. \end{macrodesc} \begin{macrodesc}{cfunction}{\p{name}} The name of a C-language function. \var{name} should include the function name and the trailing parentheses. \end{macrodesc} \begin{macrodesc}{character}{\p{char}} A character when discussing the character rather than a one-byte string value. The character will be typeset as with \macro{samp}. \end{macrodesc} \begin{macrodesc}{citetitle}{\op{url}\p{title}} A title for a referenced publication. 
If \var{url} is specified, the title will be made into a hyperlink when formatted as HTML. \end{macrodesc} \begin{macrodesc}{class}{\p{name}} A class name; a dotted name may be used. \end{macrodesc} \begin{macrodesc}{code}{\p{text}} A short code fragment or literal constant value. Typically, it should not include any spaces since no quotation marks are added. \end{macrodesc} \begin{macrodesc}{constant}{\p{name}} The name of a ``defined'' constant. This may be a C-language \code{\#define} or a Python variable that is not intended to be changed. \end{macrodesc} \begin{macrodesc}{ctype}{\p{name}} The name of a C \keyword{typedef} or structure. For structures defined without a \keyword{typedef}, use \code{\e ctype\{struct struct_tag\}} to make it clear that the \keyword{struct} is required. \end{macrodesc} \begin{macrodesc}{deprecated}{\p{version}\p{what to do}} Declare whatever is being described as being deprecated starting with release \var{version}. The text given as \var{what to do} should recommend something to use instead. \end{macrodesc} \begin{macrodesc}{dfn}{\p{term}} Mark the defining instance of \var{term} in the text. (No index entries are generated.) \end{macrodesc} \begin{macrodesc}{e}{} Produces a backslash. This is convenient in \macro{code} and similar macros, and is only defined there. To create a backslash in ordinary text (such as the contents of the \macro{file} macro), use the standard \macro{textbackslash} macro. \end{macrodesc} \begin{macrodesc}{email}{\p{address}} An email address. Note that this is \emph{not} hyperlinked in any of the possible output formats. The domain name portion of the address should be lower case. \end{macrodesc} \begin{macrodesc}{emph}{\p{text}} Emphasized text; this will be presented in an italic font. \end{macrodesc} \begin{macrodesc}{envvar}{\p{name}} An environment variable. Index entries are generated. \end{macrodesc} \begin{macrodesc}{exception}{\p{name}} The name of an exception. A dotted name may be used. 
\end{macrodesc} \begin{macrodesc}{file}{\p{file or dir}} The name of a file or directory. In the PDF and PostScript outputs, single quotes and a font change are used to indicate the file name, but no quotes are used in the HTML output. \strong{Warning:} The \macro{file} macro cannot be used in the content of a section title due to processing limitations. \end{macrodesc} \begin{macrodesc}{filenq}{\p{file or dir}} Like \macro{file}, but single quotes are never used. This can be used in conjunction with tables if a column will only contain file or directory names. \strong{Warning:} The \macro{filenq} macro cannot be used in the content of a section title due to processing limitations. \end{macrodesc} \begin{macrodesc}{function}{\p{name}} The name of a Python function; dotted names may be used. \end{macrodesc} \begin{macrodesc}{kbd}{\p{key sequence}} Mark a sequence of keystrokes. What form \var{key sequence} takes may depend on platform- or application-specific conventions. For example, an \program{xemacs} key sequence may be marked like \code{\e kbd\{C-x C-f\}}. \end{macrodesc} \begin{macrodesc}{keyword}{\p{name}} The name of a keyword in a programming language. \end{macrodesc} \begin{macrodesc}{makevar}{\p{name}} The name of a \program{make} variable. \end{macrodesc} \begin{macrodesc}{manpage}{\p{name}\p{section}} A reference to a \UNIX{} manual page. \end{macrodesc} \begin{macrodesc}{member}{\p{name}} The name of a data attribute of an object. \end{macrodesc} \begin{macrodesc}{method}{\p{name}} The name of a method of an object. \var{name} should include the method name and the trailing parentheses. A dotted name may be used. \end{macrodesc} \begin{macrodesc}{mimetype}{\p{name}} The name of a MIME type. \end{macrodesc} \begin{macrodesc}{module}{\p{name}} The name of a module; a dotted name may be used. This should also be used for package names. \end{macrodesc} \begin{macrodesc}{newsgroup}{\p{name}} The name of a USENET newsgroup. 
\end{macrodesc}

\begin{macrodesc}{program}{\p{name}}
  The name of an executable program. This may differ from the file name for the executable for some platforms. In particular, the \file{.exe} (or other) extension should be omitted for DOS and Windows programs.
\end{macrodesc}

\begin{macrodesc}{programopt}{\p{option}}
  A command-line option to an executable program. Use this only for ``short'' options, and include the leading hyphen.
\end{macrodesc}

\begin{macrodesc}{longprogramopt}{\p{option}}
  A long command-line option to an executable program. This should only be used for long option names which will be prefixed by two hyphens; the hyphens should not be provided as part of \var{option}.
\end{macrodesc}

\begin{macrodesc}{pep}{\p{number}}
  A reference to a Python Enhancement Proposal. This generates appropriate index entries. The text \samp{PEP \var{number}} is generated; in the HTML output, this text is a hyperlink to an online copy of the specified PEP.
\end{macrodesc}

\begin{macrodesc}{refmodule}{\op{key}\p{name}}
  Like \macro{module}, but create a hyperlink to the documentation for the named module. Note that the corresponding \macro{declaremodule} must be in the same document. If the \macro{declaremodule} defines a module key different from the module name, it must also be provided as \var{key} to the \macro{refmodule} macro.
\end{macrodesc}

\begin{macrodesc}{regexp}{\p{string}}
  Mark a regular expression.
\end{macrodesc}

\begin{macrodesc}{rfc}{\p{number}}
  A reference to an Internet Request for Comments. This generates appropriate index entries. The text \samp{RFC \var{number}} is generated; in the HTML output, this text is a hyperlink to an online copy of the specified RFC.
\end{macrodesc}

\begin{macrodesc}{samp}{\p{text}}
  A short code sample, but possibly longer than would be given using \macro{code}. Since quotation marks are added, spaces are acceptable.
\end{macrodesc} \begin{macrodesc}{shortversion}{} The ``short'' version number of the documented software, as specified using the \macro{setshortversion} macro in the preamble. For Python, the short version number for a release is the first three characters of the \code{sys.version} value. For example, versions 2.0b1 and 2.0.1 both have a short version of 2.0. This may not apply to all packages; if \macro{setshortversion} is not used, this produces an empty expansion. See also the \macro{version} macro. \end{macrodesc} \begin{macrodesc}{strong}{\p{text}} Strongly emphasized text; this will be presented using a bold font. \end{macrodesc} \begin{macrodesc}{url}{\p{url}} A URL (or URN). The URL will be presented as text. In the HTML and PDF formatted versions, the URL will also be a hyperlink. This can be used when referring to external resources. Note that many characters are special to \LaTeX{} and this macro does not always do the right thing. In particular, the tilde character (\character{\~}) is mis-handled; encoding it as a hex sequence does work, so use \samp{\%7e} in place of the tilde character. \end{macrodesc} \begin{macrodesc}{var}{\p{name}} The name of a variable or formal parameter in running text. \end{macrodesc} \begin{macrodesc}{version}{} The version number of the described software, as specified using \macro{release} in the preamble. See also the \macro{shortversion} macro. \end{macrodesc} \begin{macrodesc}{versionadded}{\op{explanation}\p{version}} The version of Python which added the described feature to the library or C API. \var{explanation} should be a \emph{brief} explanation of the change consisting of a capitalized sentence fragment; a period will be appended by the formatting process. This is typically added to the end of the first paragraph of the description before any availability notes. The location should be selected so the explanation makes sense and may vary as needed.
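A minimal sketch of both forms, without and with the optional explanation, might look like this; the version number and the explanation text are placeholders only:
\begin{verbatim}
\versionadded{2.1}
\versionadded[Hypothetical one-line explanation]{2.1}
\end{verbatim}
Because the formatting process appends the period, the explanation is written without one.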
\end{macrodesc} \begin{macrodesc}{versionchanged}{\op{explanation}\p{version}} The version of Python in which the named feature was changed in some way (new parameters, changed side effects, etc.). \var{explanation} should be a \emph{brief} explanation of the change consisting of a capitalized sentence fragment; a period will be appended by the formatting process. This is typically added to the end of the first paragraph of the description before any availability notes and after \macro{versionadded}. The location should be selected so the explanation makes sense and may vary as needed. \end{macrodesc} \subsection{Module-specific Markup} The markup described in this section is used to provide information about a module being documented. A typical use of this markup appears at the top of the section used to document a module. A typical example might look like this:
\begin{verbatim}
\section{\module{spam} ---
         Access to the SPAM facility}

\declaremodule{extension}{spam}
  \platform{Unix}
\modulesynopsis{Access to the SPAM facility of \UNIX{}.}
\moduleauthor{Jane Doe}{jane.doe@frobnitz.org}
\end{verbatim}
Python packages\index{packages} --- collections of modules that can be described as a unit --- are documented using the same markup as modules. The name for a module in a package should be typed in ``fully qualified'' form (i.e., it should include the package name). For example, a module ``foo'' in package ``bar'' should be marked as \samp{\e module\{bar.foo\}}, and the beginning of the reference section would appear as:
\begin{verbatim}
\section{\module{bar.foo} ---
         Module from the \module{bar} package}

\declaremodule{extension}{bar.foo}
\modulesynopsis{Nifty module from the \module{bar} package.}
\moduleauthor{Jane Doe}{jane.doe@frobnitz.org}
\end{verbatim}
Note that the name of a package is also marked using \macro{module}.
\begin{macrodesc}{declaremodule}{\op{key}\p{type}\p{name}} Requires two parameters: module type (\samp{standard}, \samp{builtin}, \samp{extension}, or \samp{}), and the module name. An optional parameter should be given as the basis for the module's ``key'' used for linking to or referencing the section. The ``key'' should only be given if the module's name contains any underscores, and should be the name with the underscores stripped. Note that the \var{type} parameter must be one of the values listed above or an error will be printed. For modules which are contained in packages, the fully-qualified name should be given as the \var{name} parameter. This should be the first thing after the \macro{section} used to introduce the module. \end{macrodesc} \begin{macrodesc}{platform}{\p{specifier}} Specifies the portability of the module. \var{specifier} is a comma-separated list of keys that specify what platforms the module is available on. The keys are short identifiers; examples that are in use include \samp{IRIX}, \samp{Mac}, \samp{Windows}, and \samp{Unix}. It is important to use a key which has already been used when applicable. This is used to provide annotations in the Module Index and the HTML and GNU info output. \end{macrodesc} \begin{macrodesc}{modulesynopsis}{\p{text}} The \var{text} is a short, ``one line'' description of the module that can be used as part of the chapter introduction. This must be placed after \macro{declaremodule}. The synopsis is used in building the contents of the table inserted as the \macro{localmoduletable}. No text is produced at the point of the markup. \end{macrodesc} \begin{macrodesc}{moduleauthor}{\p{name}\p{email}} This macro is used to encode information about who authored a module. This is currently not used to generate output, but can be used to help determine the origin of the module. \end{macrodesc} \subsection{Library-level Markup} This markup is used when describing a selection of modules.
For example, the \citetitle[../mac/mac.html]{Macintosh Library Modules} document uses this to help provide an overview of the modules in the collection, and many chapters in the \citetitle[../lib/lib.html]{Python Library Reference} use it for the same purpose. \begin{macrodesc}{localmoduletable}{} If a \file{.syn} file exists for the current chapter (or for the entire document in \code{howto} documents), a \env{synopsistable} is created with the contents loaded from the \file{.syn} file. \end{macrodesc} \subsection{Table Markup} There are three general-purpose table environments defined which should be used whenever possible. These environments are defined to provide tables of specific widths and some convenience for formatting. These environments are not meant to be general replacements for the standard \LaTeX{} table environments, but can be used to advantage when the documents are processed using the tools for Python documentation processing. In particular, the generated HTML looks good! There is also an advantage for the eventual conversion of the documentation to SGML (see section \ref{futures}, ``Future Directions''). Each environment is named \env{table\var{cols}}, where \var{cols} is the number of columns in the table specified in lower-case Roman numerals. Within each of these environments, an additional macro, \macro{line\var{cols}}, is defined, where \var{cols} matches the \var{cols} value of the corresponding table environment. These are supported for \var{cols} values of \code{ii}, \code{iii}, and \code{iv}. These environments are all built on top of the \env{tabular} environment. Variants based on the \env{longtable} environment are also provided. Note that all tables in the standard Python documentation use vertical lines between columns, and this must be specified in the markup for each table.
A general border around the outside of the table is not used, but would be the responsibility of the processor; the document markup should not include an exterior border. The \env{longtable}-based variants of the table environments are formatted with extra space before and after, so should only be used on tables which are long enough that splitting over multiple pages is reasonable; tables with fewer than twenty rows should never be marked using the long flavors of the table environments. The header row is repeated across the top of each part of the table. \begin{envdesc}{tableii}{\p{colspec}\p{col1font}\p{heading1}\p{heading2}} Create a two-column table using the \LaTeX{} column specifier \var{colspec}. The column specifier should indicate vertical bars between columns as appropriate for the specific table, but should not specify vertical bars on the outside of the table (that is considered a stylesheet issue). The \var{col1font} parameter is used as a stylistic treatment of the first column of the table: the first column is presented as \code{\e\var{col1font}\{column1\}}. To avoid treating the first column specially, \var{col1font} may be \samp{textrm}. The column headings are taken from the values \var{heading1} and \var{heading2}. \end{envdesc} \begin{envdesc}{longtableii}{\unspecified} Like \env{tableii}, but produces a table which may be broken across page boundaries. The parameters are the same as for \env{tableii}. \end{envdesc} \begin{macrodesc}{lineii}{\p{column1}\p{column2}} Create a single table row within a \env{tableii} or \env{longtableii} environment. The text for the first column will be generated by applying the macro named by the \var{col1font} value when the \env{tableii} was opened. \end{macrodesc} \begin{envdesc}{tableiii}{\p{colspec}\p{col1font}\p{heading1}\p{heading2}\p{heading3}} Like the \env{tableii} environment, but with a third column. The heading for the third column is given by \var{heading3}.
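As a sketch, a three-column table with the first column set using \macro{code} might be written as follows; the headings and rows are made up for illustration, and each row is produced with the \macro{lineiii} macro:
\begin{verbatim}
\begin{tableiii}{l|l|l}{code}{Name}{Type}{Description}
  \lineiii{spam}{integer}{Hypothetical first row.}
  \lineiii{eggs}{string}{Hypothetical second row.}
\end{tableiii}
\end{verbatim}
The column specifier places bars between the columns only, leaving any outside border to the processor.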
\end{envdesc} \begin{envdesc}{longtableiii}{\unspecified} Like \env{tableiii}, but produces a table which may be broken across page boundaries. The parameters are the same as for \env{tableiii}. \end{envdesc} \begin{macrodesc}{lineiii}{\p{column1}\p{column2}\p{column3}} Like the \macro{lineii} macro, but with a third column. The text for the third column is given by \var{column3}. \end{macrodesc} \begin{envdesc}{tableiv}{\p{colspec}\p{col1font}\p{heading1}\p{heading2}\p{heading3}\p{heading4}} Like the \env{tableiii} environment, but with a fourth column. The heading for the fourth column is given by \var{heading4}. \end{envdesc} \begin{envdesc}{longtableiv}{\unspecified} Like \env{tableiv}, but produces a table which may be broken across page boundaries. The parameters are the same as for \env{tableiv}. \end{envdesc} \begin{macrodesc}{lineiv}{\p{column1}\p{column2}\p{column3}\p{column4}} Like the \macro{lineiii} macro, but with a fourth column. The text for the fourth column is given by \var{column4}. \end{macrodesc} An additional table-like environment is \env{synopsistable}. The table generated by this environment contains two columns, and each row is defined by an alternate definition of \macro{modulesynopsis}. This environment is not normally used by authors, but is created by the \macro{localmoduletable} macro. \subsection{Reference List Markup \label{references}} Many sections include a list of references to module documentation or external documents. These lists are created using the \env{seealso} environment. This environment defines some additional macros to support creating reference entries in a reasonable manner. The \env{seealso} environment is typically placed in a section just before any sub-sections. This is done to ensure that reference links related to the section are not hidden in a subsection in the hypertext renditions of the documentation. 
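As a rough sketch, a reference list built from the macros described in the remainder of this section might look like the following. The module \code{spam} is hypothetical and would need a corresponding \macro{declaremodule} in the same document, and the explanatory sentences are illustrative only:
\begin{verbatim}
\begin{seealso}
  \seemodule{spam}{Hypothetical module providing related functionality.}
  \seepep{8}{Style Guide for Python Code}{Describes coding
          conventions for Python code.}
  \seerfc{822}{Standard for the Format of ARPA Internet Text
          Messages}{Defines the syntax of message headers.}
  \seeurl{http://www.python.org/}{The Python home page.}
\end{seealso}
\end{verbatim}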
\begin{envdesc}{seealso}{} This environment creates a ``See also:'' heading and defines the markup used to describe individual references. \end{envdesc} For each of the following macros, \var{why} should be one or more complete sentences, starting with a capital letter (unless it starts with an identifier, which should not be modified), and ending with the appropriate punctuation. These macros are only defined within the content of the \env{seealso} environment. \begin{macrodesc}{seemodule}{\op{key}\p{name}\p{why}} Refer to another module. \var{why} should be a brief explanation of why the reference may be interesting. The module name is given in \var{name}, with the link key given in \var{key} if necessary. In the HTML and PDF conversions, the module name will be a hyperlink to the referred-to module. \strong{Note:} The module must be documented in the same document (the corresponding \macro{declaremodule} is required). \end{macrodesc} \begin{macrodesc}{seepep}{\p{number}\p{title}\p{why}} Refer to a Python Enhancement Proposal (PEP). \var{number} should be the official number assigned by the PEP Editor, \var{title} should be the human-readable title of the PEP as found in the official copy of the document, and \var{why} should explain what's interesting about the PEP. This should be used to refer the reader to PEPs which specify interfaces or language features relevant to the material in the annotated section of the documentation. \end{macrodesc} \begin{macrodesc}{seerfc}{\p{number}\p{title}\p{why}} Refer to an IETF Request for Comments (RFC). Otherwise very similar to \macro{seepep}. This should be used to refer the reader to RFCs which specify protocols or data formats relevant to the material in the annotated section of the documentation. \end{macrodesc} \begin{macrodesc}{seetext}{\p{text}} Add arbitrary text \var{text} to the ``See also:'' list. This can be used to refer to off-line materials or on-line materials using the \macro{url} macro.
This should consist of one or more complete sentences. \end{macrodesc} \begin{macrodesc}{seetitle}{\op{url}\p{title}\p{why}} Add a reference to an external document named \var{title}. If \var{url} is given, the title is made a hyperlink in the HTML version of the documentation, and displayed below the title in the typeset versions of the documentation. \end{macrodesc} \begin{macrodesc}{seeurl}{\p{url}\p{why}} References to specific on-line resources should be given using the \macro{seeurl} macro. No title is associated with the reference, but the \var{why} text may include a title marked using the \macro{citetitle} macro. \end{macrodesc} \subsection{Index-generating Markup \label{indexing}} Effective index generation for technical documents can be very difficult, especially for someone familiar with the topic but not the creation of indexes. Much of the difficulty arises in the area of terminology: including the terms an expert would use for a concept is not sufficient. Coming up with the terms that a novice would look up is fairly difficult for an author who, typically, is an expert in the area she is writing on. The truly difficult aspects of index generation are not areas with which the documentation tools can help. However, ease of producing the index once content decisions are made is within the scope of the tools. Markup is provided which the processing software is able to use to generate a variety of kinds of index entry with minimal effort. Additionally, many of the environments described in section \ref{info-units}, ``Information Units,'' will generate appropriate entries into the general and module indexes. The following macro can be used to control the generation of index data, and should be used in the document preamble: \begin{macrodesc}{makemodindex}{} This should be used in the document preamble if a ``Module Index'' is desired for a document containing reference material on many modules. 
This causes a data file \code{lib\var{jobname}.idx} to be created from the \macro{declaremodule} macros. This file can be processed by the \program{makeindex} program to generate a file which can be \macro{input} into the document at the desired location of the module index. \end{macrodesc} There are a number of macros that are useful for adding index entries for particular concepts, many of which are specific to programming languages or even Python. \begin{macrodesc}{bifuncindex}{\p{name}} Add an index entry referring to a built-in function named \var{name}; parentheses should not be included after \var{name}. \end{macrodesc} \begin{macrodesc}{exindex}{\p{exception}} Add a reference to an exception named \var{exception}. The exception may be either string- or class-based. \end{macrodesc} \begin{macrodesc}{kwindex}{\p{keyword}} Add a reference to a language keyword (not a keyword parameter in a function or method call). \end{macrodesc} \begin{macrodesc}{obindex}{\p{object type}} Add an index entry for a built-in object type. \end{macrodesc} \begin{macrodesc}{opindex}{\p{operator}} Add a reference to an operator, such as \samp{+}. \end{macrodesc} \begin{macrodesc}{refmodindex}{\op{key}\p{module}} Add an index entry for module \var{module}; if \var{module} contains an underscore, the optional parameter \var{key} should be provided as the same string with underscores removed. An index entry ``\var{module} (module)'' will be generated. This is intended for use with non-standard modules implemented in Python. \end{macrodesc} \begin{macrodesc}{refexmodindex}{\op{key}\p{module}} As for \macro{refmodindex}, but the index entry will be ``\var{module} (extension module).'' This is intended for use with non-standard modules not implemented in Python. 
\end{macrodesc} \begin{macrodesc}{refbimodindex}{\op{key}\p{module}} As for \macro{refmodindex}, but the index entry will be ``\var{module} (built-in module).'' This is intended for use with standard modules not implemented in Python. \end{macrodesc} \begin{macrodesc}{refstmodindex}{\op{key}\p{module}} As for \macro{refmodindex}, but the index entry will be ``\var{module} (standard module).'' This is intended for use with standard modules implemented in Python. \end{macrodesc} \begin{macrodesc}{stindex}{\p{statement}} Add an index entry for a statement type, such as \keyword{print} or \keyword{try}/\keyword{finally}. XXX Need better examples of difference from \macro{kwindex}. \end{macrodesc} Additional macros are provided which are useful for conveniently creating general index entries which should appear at many places in the index by rotating a list of words. These are simple macros that use \macro{index} to build some number of index entries. Index entries built using these macros contain both primary and secondary text. \begin{macrodesc}{indexii}{\p{word1}\p{word2}} Build two index entries. This is exactly equivalent to using \code{\e index\{\var{word1}!\var{word2}\}} and \code{\e index\{\var{word2}!\var{word1}\}}. \end{macrodesc} \begin{macrodesc}{indexiii}{\p{word1}\p{word2}\p{word3}} Build three index entries. This is exactly equivalent to using \code{\e index\{\var{word1}!\var{word2} \var{word3}\}}, \code{\e index\{\var{word2}!\var{word3}, \var{word1}\}}, and \code{\e index\{\var{word3}!\var{word1} \var{word2}\}}. \end{macrodesc} \begin{macrodesc}{indexiv}{\p{word1}\p{word2}\p{word3}\p{word4}} Build four index entries. This is exactly equivalent to using \code{\e index\{\var{word1}!\var{word2} \var{word3} \var{word4}\}}, \code{\e index\{\var{word2}!\var{word3} \var{word4}, \var{word1}\}}, \code{\e index\{\var{word3}!\var{word4}, \var{word1} \var{word2}\}}, and \code{\e index\{\var{word4}!\var{word1} \var{word2} \var{word3}\}}.
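To make the rotation concrete, here is a worked instance following the pattern just given; the four words are arbitrary and chosen only for illustration. The single call
\begin{verbatim}
\indexiv{regular}{expression}{pattern}{matching}
\end{verbatim}
would stand in for these four entries:
\begin{verbatim}
\index{regular!expression pattern matching}
\index{expression!pattern matching, regular}
\index{pattern!matching, regular expression}
\index{matching!regular expression pattern}
\end{verbatim}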
\end{macrodesc} \section{Special Names} Many special names are used in the Python documentation, including the names of operating systems, programming languages, standards bodies, and the like. Many of these were assigned \LaTeX{} macros at some point in the distant past, and these macros lived on long past their usefulness. In the current markup, these entities are not assigned any special markup, but the preferred spellings are given here to aid authors in maintaining the consistency of presentation in the Python documentation. \begin{description} \item[POSIX] The name assigned to a particular group of standards. This is always uppercase. \item[Python] The name of our favorite programming language is always capitalized. \item[Unicode] The name of a character set and matching encoding. This is always written capitalized. \end{description} \section{Processing Tools} \subsection{External Tools} Many tools are needed to be able to process the Python documentation if all supported formats are required. This section lists the tools used and when each is required. Consult the \file{Doc/README} file to see if there are specific version requirements for any of these. \begin{description} \item[\program{dvips}] This program is a typical part of \TeX{} installations. It is used to generate PostScript from the ``device independent'' \file{.dvi} files. It is needed for the conversion to PostScript. \item[\program{emacs}] Emacs is the kitchen sink of programmers' editors, and a damn fine kitchen sink it is. It also comes with some of the processing needed to support the proper menu structures for Texinfo documents when an info conversion is desired. This is needed for the info conversion. Using \program{xemacs} instead of FSF \program{emacs} may lead to instability in the conversion, but that's because nobody seems to maintain the Emacs Texinfo code in a portable manner. \item[\program{latex}] \LaTeX{} is a large and extensible macro package by Leslie Lamport, based on \TeX, a world-class typesetter by Donald Knuth.
It is used for the conversion to PostScript, and is needed for the HTML conversion as well (\LaTeX2HTML requires one of the intermediate files it creates). \item[\program{latex2html}] Probably the longest Perl script anyone ever attempted to maintain. This converts \LaTeX{} documents to HTML documents, and does a pretty reasonable job. It is required for the conversions to HTML and GNU info. \item[\program{lynx}] This is a text-mode Web browser which includes an HTML-to-plain text conversion. This is used to convert \code{howto} documents to text. \item[\program{make}] Just about any version should work for the standard documents, but GNU \program{make} is required for the experimental processes in \file{Doc/tools/sgmlconv/}, at least while they're experimental. \item[\program{makeindex}] This is a standard program for converting \LaTeX{} index data to a formatted index; it should be included with all \LaTeX{} installations. It is needed for the PDF and PostScript conversions. \item[\program{makeinfo}] GNU \program{makeinfo} is used to convert Texinfo documents to GNU info files. Since Texinfo is used as an intermediate format in the info conversion, this program is needed in that conversion. \item[\program{pdflatex}] pdf\TeX{} is a relatively new variant of \TeX, and is used to generate the PDF version of the manuals. It is typically installed as part of most of the large \TeX{} distributions. \program{pdflatex} is pdf\TeX{} using the \LaTeX{} format. \item[\program{perl}] Perl is required for \LaTeX2HTML{} and one of the scripts used to post-process \LaTeX2HTML output, as well as the HTML-to-Texinfo conversion. This is required for the HTML and GNU info conversions. \item[\program{python}] Python is used for many of the scripts in the \file{Doc/tools/} directory; it is required for all conversions. This shouldn't be a problem if you're interested in writing documentation for Python! 
\end{description} \subsection{Internal Tools} This section describes the scripts that are used to implement the various stages of document processing or to orchestrate entire build sequences. Most of these tools are only useful in the context of building the standard documentation, but some are more general. \begin{description} \item[\program{mkhowto}] This is the primary script used to format third-party documents. It contains all the logic needed to ``get it right.'' The proper way to use this script is to make a symbolic link to it or run it in place; the actual script file must be stored as part of the documentation source tree, though it may be used to format documents outside the tree. Use \program{mkhowto} \longprogramopt{help} for a list of command line options. \program{mkhowto} can be used for both \code{howto} and \code{manual} class documents. (For the latter, be sure to get the latest version from the Python CVS repository rather than the version distributed in the \file{latex-1.5.2.tgz} source archive.) XXX Need more here. \end{description} \section{Future Directions \label{futures}} The history of the Python documentation is full of changes, most of which have been fairly small and evolutionary. There has been a great deal of discussion about making large changes in the markup languages and tools used to process the documentation. This section deals with the nature of the changes and what appears to be the most likely path of future development. \subsection{Structured Documentation \label{structured}} Most of the small changes to the \LaTeX{} markup have been made with an eye to divorcing the markup from the presentation, making both a bit more maintainable. Over the course of 1998, a large number of changes were made with exactly this in mind; previously, changes had been made but in a less systematic manner and with more concern for not needing to update the existing content.
The result has been a highly structured and semantically loaded markup language implemented in \LaTeX. With almost no basic \TeX{} or \LaTeX{} markup in use, however, the markup syntax is about the only evidence of \LaTeX{} in the actual document sources. One side effect of this is that while we've been able to use standard ``engines'' for manipulating the documents, such as \LaTeX{} and \LaTeX2HTML, most of the actual transformations have been created specifically for Python. The \LaTeX{} document classes and \LaTeX2HTML support are both complete implementations of the specific markup designed for these documents. Combining highly customized markup with the somewhat esoteric systems used to process the documents leads us to ask some questions: Can we do this more easily? and, Can we do this better? After a great deal of discussion with the community, we have determined that actively pursuing modern structured documentation systems is worth some investment of time. There appear to be two real contenders in this arena: the Standard Generalized Markup Language (SGML), and the Extensible Markup Language (XML). Both of these standards have advantages and disadvantages, and many advantages are shared. SGML offers advantages which may appeal most to authors, especially those using ordinary text editors. There are also additional abilities to define content models. A number of high-quality tools with demonstrated maturity are available, but most are not free; for those which are, portability issues remain a problem. The advantages of XML include the availability of a large number of evolving tools. Unfortunately, many of the associated standards are still evolving, and the tools will have to follow along. This means that developing a robust tool set that uses more than the basic XML 1.0 recommendation is not possible in the short term. The promised availability of a wide variety of high-quality tools which support some of the most important related standards is not immediate.
Many tools are likely to be free. XXX Eventual migration to SGML/XML. \subsection{Discussion Forums \label{discussion}} Discussion of the future of the Python documentation and related topics takes place in the Documentation Special Interest Group, or ``Doc-SIG.'' Information on the group, including mailing list archives and subscription information, is available at \url{http://www.python.org/sigs/doc-sig/}. The SIG is open to all interested parties. Comments and bug reports on the standard documents should be sent to \email{python-docs@python.org}. This may include comments about formatting, content, grammatical and spelling errors, or this document. You can also send comments on this document directly to the author at \email{fdrake@acm.org}. \end{document}