Delete the LaTeX doc tree.
202
Doc/ACKS
@@ -1,202 +0,0 @@
Contributors to the Python Documentation
----------------------------------------

This file lists people who have contributed in some way to the Python
documentation. It is probably not complete -- if you feel that you or
anyone else should be on this list, please let us know (send email to
docs@python.org), and we'll be glad to correct the problem.

It is only with the input and contributions of the Python community
that Python has such wonderful documentation -- Thank You!

In the official sources, this file is encoded in ISO-8859-1 (Latin-1).


-Fred


Aahz
Michael Abbott
Steve Alexander
Jim Ahlstrom
Fred Allen
A. Amoroso
Pehr Anderson
Oliver Andrich
Jesús Cea Avión
Daniel Barclay
Chris Barker
Don Bashford
Anthony Baxter
Bennett Benson
Jonathan Black
Robin Boerdijk
Michal Bozon
Aaron Brancotti
Keith Briggs
Lee Busby
Lorenzo M. Catucci
Mauro Cicognini
Gilles Civario
Mike Clarkson
Steve Clift
Dave Cole
Matthew Cowles
Jeremy Craven
Andrew Dalke
Ben Darnell
L. Peter Deutsch
Robert Donohue
Fred L. Drake, Jr.
Jeff Epler
Michael Ernst
Blame Andy Eskilsson
Carey Evans
Martijn Faassen
Carl Feynman
Hernán Martínez Foffani
Stefan Franke
Jim Fulton
Peter Funk
Lele Gaifax
Matthew Gallagher
Ben Gertzfield
Nadim Ghaznavi
Jonathan Giddy
Shelley Gooch
Nathaniel Gray
Grant Griffin
Thomas Guettler
Anders Hammarquist
Mark Hammond
Harald Hanche-Olsen
Manus Hand
Gerhard Häring
Travis B. Hartwell
Janko Hauser
Bernhard Herzog
Magnus L. Hetland
Konrad Hinsen
Stefan Hoffmeister
Albert Hofkamp
Gregor Hoffleit
Steve Holden
Thomas Holenstein
Gerrit Holl
Rob Hooft
Brian Hooper
Randall Hopper
Michael Hudson
Eric Huss
Jeremy Hylton
Roger Irwin
Jack Jansen
Philip H. Jensen
Pedro Diaz Jimenez
Kent Johnson
Lucas de Jonge
Andreas Jung
Robert Kern
Jim Kerr
Jan Kim
Greg Kochanski
Guido Kollerie
Peter A. Koren
Daniel Kozan
Andrew M. Kuchling
Dave Kuhlman
Erno Kuusela
Detlef Lannert
Piers Lauder
Glyph Lefkowitz
Marc-André Lemburg
Ulf A. Lindgren
Everett Lipman
Mirko Liss
Martin von Löwis
Fredrik Lundh
Jeff MacDonald
John Machin
Andrew MacIntyre
Vladimir Marangozov
Vincent Marchetti
Laura Matson
Daniel May
Doug Mennella
Paolo Milani
Skip Montanaro
Paul Moore
Ross Moore
Sjoerd Mullender
Dale Nagata
Ng Pheng Siong
Koray Oner
Tomas Oppelstrup
Denis S. Otkidach
Zooko O'Whielacronx
William Park
Joonas Paalasmaa
Harri Pasanen
Bo Peng
Tim Peters
Christopher Petrilli
Justin D. Pettit
Chris Phoenix
François Pinard
Paul Prescod
Eric S. Raymond
Edward K. Ream
Sean Reifschneider
Bernhard Reiter
Armin Rigo
Wes Rishel
Jim Roskind
Guido van Rossum
Donald Wallace Rouse II
Nick Russo
Chris Ryland
Constantina S.
Hugh Sasse
Bob Savage
Scott Schram
Neil Schemenauer
Barry Scott
Joakim Sernbrant
Justin Sheehy
Michael Simcich
Ionel Simionescu
Gregory P. Smith
Roy Smith
Clay Spence
Nicholas Spies
Tage Stabell-Kulo
Frank Stajano
Anthony Starks
Greg Stein
Peter Stoehr
Mark Summerfield
Reuben Sumner
Kalle Svensson
Jim Tittsler
Ville Vainio
Martijn Vries
Charles G. Waldman
Greg Ward
Barry Warsaw
Corran Webster
Glyn Webster
Bob Weiner
Eddy Welbourne
Mats Wichmann
Gerry Wiener
Timothy Wild
Collin Winter
Blake Winton
Dan Wolfe
Steven Work
Thomas Wouters
Ka-Ping Yee
Rory Yorke
Moshe Zadka
Milan Zamazal
Cheng Zhang
736
Doc/Makefile
@@ -1,736 +0,0 @@
# Makefile for Python documentation
# ---------------------------------
#
# See also the README file.
#
# This is a bit of a mess. The documents are identified by short names:
#   api -- Python/C API Reference Manual
#   doc -- Documenting Python
#   ext -- Extending and Embedding the Python Interpreter
#   lib -- Library Reference Manual
#   mac -- Macintosh Library Modules
#   ref -- Python Reference Manual
#   tut -- Python Tutorial
#   inst -- Installing Python Modules
#   dist -- Distributing Python Modules
#
# The LaTeX sources for each of these documents are in subdirectories
# with the three-letter designations above as the directory names.
#
# The main target creates HTML for each of the documents. You can
# also do "make lib" (etc.) to create the HTML versions of individual
# documents.
#
# The document classes and styles are in the texinputs/ directory.
# These define a number of macros that are similar in name and intent
# as macros in Texinfo (e.g. \code{...} and \emph{...}), as well as a
# number of environments for formatting function and data definitions.
# Documentation for the macros is included in "Documenting Python"; see
# http://www.python.org/doc/current/doc/doc.html, or the sources for
# this document in the doc/ directory.
#
# Everything is processed by LaTeX. See the file `README' for more
# information on the tools needed for processing.
#
# There's a problem with generating the index which has been solved by
# a sed command applied to the index file. The shell script fix_hack
# does this (the Makefile takes care of calling it).
#
# Additional targets attempt to convert selected LaTeX sources to
# various other formats. These are generally site specific because
# the tools used are all but universal. These targets are:
#
#   ps -- convert all documents from LaTeX to PostScript
#   pdf -- convert all documents from LaTeX to the
#          Portable Document Format
#
# See the README file for more information on these targets.
#
# The formatted output is located in subdirectories. For PDF and
# PostScript, look in the paper-$(PAPER)/ directory. For HTML, look in
# the html/ directory. If you want to fix the GNU info process, look
# in the info/ directory; please send patches to docs@python.org.

# This Makefile only includes information on how to perform builds; for
# dependency information, see Makefile.deps.

# Customization -- you *may* have to edit this

# You could set this to a4:
PAPER=letter

# Ideally, you shouldn't need to edit beyond this point

INFODIR= info
TOOLSDIR= tools

# This is the *documentation* release, and is used to construct the
# file names of the downloadable tarballs. It is initialized by the
# getversioninfo script to ensure that the right version number is
# used; the script will also write commontex/patchlevel.tex if that
# doesn't exist or needs to be changed. Documents which depend on the
# version number should use \input{patchlevel} and include
# commontex/patchlevel.tex in their dependencies.
RELEASE=$(shell $(PYTHON) tools/getversioninfo)

PYTHON= python
DVIPS= dvips -N0 -t $(PAPER)

# This is ugly! The issue here is that there are two different levels
# in the directory tree at which we execute mkhowto, so we can't
# define it just once using a relative path (at least not with the
# current implementation and Makefile structure). We use the GNUish
# $(shell) function here to work around that restriction by
# identifying mkhowto and the commontex/ directory using absolute paths.
#
# If your doc build fails immediately, you may need to switch to GNU make.
# (e.g. OpenBSD needs package gmake installed; use gmake instead of make)
PWD=$(shell pwd)

# (The trailing colon in the value is needed; TeX places its default
# set of paths at the location of the empty string in the path list.)
TEXINPUTS=$(PWD)/commontex:

# The mkhowto script can be run from the checkout using the first
# version of this variable definition, or from a preferred version
# using the second version. The standard documentation is typically
# built using the second flavor, where the preferred version is from
# the Python CVS trunk.
MKHOWTO= TEXINPUTS=$(TEXINPUTS) $(PYTHON) $(PWD)/tools/mkhowto

MKDVI= $(MKHOWTO) --paper=$(PAPER) --dvi
MKHTML= $(MKHOWTO) --html --about html/stdabout.dat \
	--iconserver ../icons --favicon ../icons/pyfav.png \
	--address $(PYTHONDOCS) --up-link ../index.html \
	--up-title "Python Documentation Index" \
	--global-module-index "../modindex.html" --dvips-safe
MKISILOHTML=$(MKHOWTO) --html --about html/stdabout.dat \
	--iconserver ../icons \
	--l2h-init perl/isilo.perl --numeric --split 1 \
	--dvips-safe
MKISILO= iSilo386 -U -y -rCR -d0
MKPDF= $(MKHOWTO) --paper=$(PAPER) --pdf
MKPS= $(MKHOWTO) --paper=$(PAPER) --ps

BUILDINDEX=$(TOOLSDIR)/buildindex.py

PYTHONDOCS="See <i><a href=\"about.html\">About this document...</a></i> for information on suggesting changes."
HTMLBASE= file:`pwd`

# The emacs binary used to build the info docs. GNU Emacs 21 is required.
EMACS= emacs

# The end of this should reflect the major/minor version numbers of
# the release:
WHATSNEW=whatsnew26

# what's what
MANDVIFILES= paper-$(PAPER)/api.dvi paper-$(PAPER)/ext.dvi \
	paper-$(PAPER)/lib.dvi paper-$(PAPER)/mac.dvi \
	paper-$(PAPER)/ref.dvi paper-$(PAPER)/tut.dvi
HOWTODVIFILES= paper-$(PAPER)/doc.dvi paper-$(PAPER)/inst.dvi \
	paper-$(PAPER)/dist.dvi paper-$(PAPER)/$(WHATSNEW).dvi

MANPDFFILES= paper-$(PAPER)/api.pdf paper-$(PAPER)/ext.pdf \
	paper-$(PAPER)/lib.pdf paper-$(PAPER)/mac.pdf \
	paper-$(PAPER)/ref.pdf paper-$(PAPER)/tut.pdf
HOWTOPDFFILES= paper-$(PAPER)/doc.pdf paper-$(PAPER)/inst.pdf \
	paper-$(PAPER)/dist.pdf paper-$(PAPER)/$(WHATSNEW).pdf

MANPSFILES= paper-$(PAPER)/api.ps paper-$(PAPER)/ext.ps \
	paper-$(PAPER)/lib.ps paper-$(PAPER)/mac.ps \
	paper-$(PAPER)/ref.ps paper-$(PAPER)/tut.ps
HOWTOPSFILES= paper-$(PAPER)/doc.ps paper-$(PAPER)/inst.ps \
	paper-$(PAPER)/dist.ps paper-$(PAPER)/$(WHATSNEW).ps

DVIFILES= $(MANDVIFILES) $(HOWTODVIFILES)
PDFFILES= $(MANPDFFILES) $(HOWTOPDFFILES)
PSFILES= $(MANPSFILES) $(HOWTOPSFILES)

HTMLCSSFILES=html/api/api.css \
	html/doc/doc.css \
	html/ext/ext.css \
	html/lib/lib.css \
	html/mac/mac.css \
	html/ref/ref.css \
	html/tut/tut.css \
	html/inst/inst.css \
	html/dist/dist.css

ISILOCSSFILES=isilo/api/api.css \
	isilo/doc/doc.css \
	isilo/ext/ext.css \
	isilo/lib/lib.css \
	isilo/mac/mac.css \
	isilo/ref/ref.css \
	isilo/tut/tut.css \
	isilo/inst/inst.css \
	isilo/dist/dist.css

ALLCSSFILES=$(HTMLCSSFILES) $(ISILOCSSFILES)

INDEXFILES=html/api/api.html \
	html/doc/doc.html \
	html/ext/ext.html \
	html/lib/lib.html \
	html/mac/mac.html \
	html/ref/ref.html \
	html/tut/tut.html \
	html/inst/inst.html \
	html/dist/dist.html \
	html/whatsnew/$(WHATSNEW).html

ALLHTMLFILES=$(INDEXFILES) html/index.html html/modindex.html html/acks.html

COMMONPERL= perl/manual.perl perl/python.perl perl/l2hinit.perl

ANNOAPI=api/refcounts.dat tools/anno-api.py

include Makefile.deps

# These must be declared phony since there
# are directories with matching names:
.PHONY: api doc ext lib mac ref tut inst dist
.PHONY: html info isilo


# Main target
default: html
all: html dvi ps pdf isilo

dvi: $(DVIFILES)
pdf: $(PDFFILES)
ps: $(PSFILES)

world: ps pdf html distfiles


# Rules to build PostScript and PDF formats
.SUFFIXES: .dvi .ps

.dvi.ps:
	$(DVIPS) -o $@ $<


# Targets for each document:
# Python/C API Reference Manual
paper-$(PAPER)/api.dvi: $(ANNOAPIFILES)
	cd paper-$(PAPER) && $(MKDVI) api.tex

paper-$(PAPER)/api.pdf: $(ANNOAPIFILES)
	cd paper-$(PAPER) && $(MKPDF) api.tex

paper-$(PAPER)/api.tex: api/api.tex
	cp api/api.tex $@

paper-$(PAPER)/abstract.tex: api/abstract.tex $(ANNOAPI)
	$(PYTHON) $(TOOLSDIR)/anno-api.py -o $@ api/abstract.tex

paper-$(PAPER)/concrete.tex: api/concrete.tex $(ANNOAPI)
	$(PYTHON) $(TOOLSDIR)/anno-api.py -o $@ api/concrete.tex

paper-$(PAPER)/exceptions.tex: api/exceptions.tex $(ANNOAPI)
	$(PYTHON) $(TOOLSDIR)/anno-api.py -o $@ api/exceptions.tex

paper-$(PAPER)/init.tex: api/init.tex $(ANNOAPI)
	$(PYTHON) $(TOOLSDIR)/anno-api.py -o $@ api/init.tex

paper-$(PAPER)/intro.tex: api/intro.tex
	cp api/intro.tex $@

paper-$(PAPER)/memory.tex: api/memory.tex $(ANNOAPI)
	$(PYTHON) $(TOOLSDIR)/anno-api.py -o $@ api/memory.tex

paper-$(PAPER)/newtypes.tex: api/newtypes.tex $(ANNOAPI)
	$(PYTHON) $(TOOLSDIR)/anno-api.py -o $@ api/newtypes.tex

paper-$(PAPER)/refcounting.tex: api/refcounting.tex $(ANNOAPI)
	$(PYTHON) $(TOOLSDIR)/anno-api.py -o $@ api/refcounting.tex

paper-$(PAPER)/utilities.tex: api/utilities.tex $(ANNOAPI)
	$(PYTHON) $(TOOLSDIR)/anno-api.py -o $@ api/utilities.tex

paper-$(PAPER)/veryhigh.tex: api/veryhigh.tex $(ANNOAPI)
	$(PYTHON) $(TOOLSDIR)/anno-api.py -o $@ api/veryhigh.tex

# Distributing Python Modules
paper-$(PAPER)/dist.dvi: $(DISTFILES)
	cd paper-$(PAPER) && $(MKDVI) ../dist/dist.tex

paper-$(PAPER)/dist.pdf: $(DISTFILES)
	cd paper-$(PAPER) && $(MKPDF) ../dist/dist.tex

# Documenting Python
paper-$(PAPER)/doc.dvi: $(DOCFILES)
	cd paper-$(PAPER) && $(MKDVI) ../doc/doc.tex

paper-$(PAPER)/doc.pdf: $(DOCFILES)
	cd paper-$(PAPER) && $(MKPDF) ../doc/doc.tex

# Extending and Embedding the Python Interpreter
paper-$(PAPER)/ext.dvi: $(EXTFILES)
	cd paper-$(PAPER) && $(MKDVI) ../ext/ext.tex

paper-$(PAPER)/ext.pdf: $(EXTFILES)
	cd paper-$(PAPER) && $(MKPDF) ../ext/ext.tex

# Installing Python Modules
paper-$(PAPER)/inst.dvi: $(INSTFILES)
	cd paper-$(PAPER) && $(MKDVI) ../inst/inst.tex

paper-$(PAPER)/inst.pdf: $(INSTFILES)
	cd paper-$(PAPER) && $(MKPDF) ../inst/inst.tex

# Python Library Reference
paper-$(PAPER)/lib.dvi: $(LIBFILES)
	cd paper-$(PAPER) && $(MKDVI) ../lib/lib.tex

paper-$(PAPER)/lib.pdf: $(LIBFILES)
	cd paper-$(PAPER) && $(MKPDF) ../lib/lib.tex

# Macintosh Library Modules
paper-$(PAPER)/mac.dvi: $(MACFILES)
	cd paper-$(PAPER) && $(MKDVI) ../mac/mac.tex

paper-$(PAPER)/mac.pdf: $(MACFILES)
	cd paper-$(PAPER) && $(MKPDF) ../mac/mac.tex

# Python Reference Manual
paper-$(PAPER)/ref.dvi: $(REFFILES)
	cd paper-$(PAPER) && $(MKDVI) ../ref/ref.tex

paper-$(PAPER)/ref.pdf: $(REFFILES)
	cd paper-$(PAPER) && $(MKPDF) ../ref/ref.tex

# Python Tutorial
paper-$(PAPER)/tut.dvi: $(TUTFILES)
	cd paper-$(PAPER) && $(MKDVI) ../tut/tut.tex

paper-$(PAPER)/tut.pdf: $(TUTFILES)
	cd paper-$(PAPER) && $(MKPDF) ../tut/tut.tex

# What's New in Python X.Y
paper-$(PAPER)/$(WHATSNEW).dvi: whatsnew/$(WHATSNEW).tex
	cd paper-$(PAPER) && $(MKDVI) ../whatsnew/$(WHATSNEW).tex

paper-$(PAPER)/$(WHATSNEW).pdf: whatsnew/$(WHATSNEW).tex
	cd paper-$(PAPER) && $(MKPDF) ../whatsnew/$(WHATSNEW).tex

# The remaining part of the Makefile is concerned with various
# conversions, as described above. See also the README file.

info:
	cd $(INFODIR) && $(MAKE) EMACS=$(EMACS) WHATSNEW=$(WHATSNEW)

# Targets to convert the manuals to HTML using Nikos Drakos' LaTeX to
# HTML converter. For more info on this program, see
# <URL:http://cbl.leeds.ac.uk/nikos/tex2html/doc/latex2html/latex2html.html>.

# Note that LaTeX2HTML inserts references to an icons directory in
# each page that it generates. I have placed a copy of this directory
# in the distribution to simplify the process of creating a
# self-contained HTML distribution; for this purpose I have also added
# a (trivial) index.html. Change the definition of $ICONSERVER in
# perl/l2hinit.perl to use a different location for the icons directory.

# If you have the standard LaTeX2HTML icons installed, the versions shipped
# with this documentation should be stored in a separate directory and used
# instead. The standard set does *not* include all the icons used in the
# Python documentation.

$(ALLCSSFILES): html/style.css
	cp $< $@

$(INDEXFILES): $(COMMONPERL) html/stdabout.dat tools/node2label.pl

html/acks.html: ACKS $(TOOLSDIR)/support.py $(TOOLSDIR)/mkackshtml
	$(PYTHON) $(TOOLSDIR)/mkackshtml --address $(PYTHONDOCS) \
		--favicon icons/pyfav.png \
		--output html/acks.html <ACKS


# html/index.html is dependent on $(INDEXFILES) since we want the date
# on the front index to be updated whenever any of the child documents
# are updated and boilerplate.tex uses \today as the date. The index
# files are not used to actually generate content.

BOILERPLATE=commontex/boilerplate.tex
html/index.html: $(INDEXFILES)
html/index.html: html/index.html.in $(BOILERPLATE) tools/rewrite.py
	$(PYTHON) tools/rewrite.py $(BOILERPLATE) \
		RELEASE=$(RELEASE) WHATSNEW=$(WHATSNEW) \
		<$< >$@

html/modindex.html: $(TOOLSDIR)/support.py $(TOOLSDIR)/mkmodindex
html/modindex.html: html/dist/dist.html
html/modindex.html: html/lib/lib.html html/mac/mac.html
	cd html && \
	$(PYTHON) ../$(TOOLSDIR)/mkmodindex --columns 3 \
		--output modindex.html --address $(PYTHONDOCS) \
		--favicon icons/pyfav.png \
		dist/modindex.html \
		lib/modindex.html mac/modindex.html

html: $(ALLHTMLFILES) $(HTMLCSSFILES)

api: html/api/api.html html/api/api.css
html/api/api.html: $(APIFILES) api/refcounts.dat
	$(MKHTML) --dir html/api api/api.tex

doc: html/doc/doc.html html/doc/doc.css
html/doc/doc.html: $(DOCFILES)
	$(MKHTML) --dir html/doc doc/doc.tex

ext: html/ext/ext.html html/ext/ext.css
html/ext/ext.html: $(EXTFILES)
	$(MKHTML) --dir html/ext ext/ext.tex

lib: html/lib/lib.html html/lib/lib.css
html/lib/lib.html: $(LIBFILES)
	$(MKHTML) --dir html/lib lib/lib.tex

mac: html/mac/mac.html html/mac/mac.css
html/mac/mac.html: $(MACFILES)
	$(MKHTML) --dir html/mac mac/mac.tex

ref: html/ref/ref.html html/ref/ref.css
html/ref/ref.html: $(REFFILES)
	$(MKHTML) --dir html/ref ref/ref.tex

tut: html/tut/tut.html html/tut/tut.css
html/tut/tut.html: $(TUTFILES)
	$(MKHTML) --dir html/tut --numeric --split 3 tut/tut.tex

inst: html/inst/inst.html html/inst/inst.css
html/inst/inst.html: $(INSTFILES) perl/distutils.perl
	$(MKHTML) --dir html/inst --split 4 inst/inst.tex

dist: html/dist/dist.html html/dist/dist.css
html/dist/dist.html: $(DISTFILES) perl/distutils.perl
	$(MKHTML) --dir html/dist --split 4 dist/dist.tex

whatsnew: html/whatsnew/$(WHATSNEW).html
html/whatsnew/$(WHATSNEW).html: whatsnew/$(WHATSNEW).tex
	$(MKHTML) --dir html/whatsnew --split 4 whatsnew/$(WHATSNEW).tex


# The iSilo format is used by the iSilo document reader for PalmOS devices.

ISILOINDEXFILES=isilo/api/api.html \
	isilo/doc/doc.html \
	isilo/ext/ext.html \
	isilo/lib/lib.html \
	isilo/mac/mac.html \
	isilo/ref/ref.html \
	isilo/tut/tut.html \
	isilo/inst/inst.html \
	isilo/dist/dist.html \
	isilo/whatsnew/$(WHATSNEW).html

$(ISILOINDEXFILES): $(COMMONPERL) html/stdabout.dat perl/isilo.perl

isilo: isilo/python-api.pdb \
	isilo/python-doc.pdb \
	isilo/python-ext.pdb \
	isilo/python-lib.pdb \
	isilo/python-mac.pdb \
	isilo/python-ref.pdb \
	isilo/python-tut.pdb \
	isilo/python-dist.pdb \
	isilo/python-inst.pdb \
	isilo/python-whatsnew.pdb

isilo/python-api.pdb: isilo/api/api.html isilo/api/api.css
	$(MKISILO) "-iPython/C API Reference Manual" \
		isilo/api/api.html $@

isilo/python-doc.pdb: isilo/doc/doc.html isilo/doc/doc.css
	$(MKISILO) "-iDocumenting Python" \
		isilo/doc/doc.html $@

isilo/python-ext.pdb: isilo/ext/ext.html isilo/ext/ext.css
	$(MKISILO) "-iExtending & Embedding Python" \
		isilo/ext/ext.html $@

isilo/python-lib.pdb: isilo/lib/lib.html isilo/lib/lib.css
	$(MKISILO) "-iPython Library Reference" \
		isilo/lib/lib.html $@

isilo/python-mac.pdb: isilo/mac/mac.html isilo/mac/mac.css
	$(MKISILO) "-iPython/C API Reference Manual" \
		isilo/mac/mac.html $@

isilo/python-ref.pdb: isilo/ref/ref.html isilo/ref/ref.css
	$(MKISILO) "-iPython Reference Manual" \
		isilo/ref/ref.html $@

isilo/python-tut.pdb: isilo/tut/tut.html isilo/tut/tut.css
	$(MKISILO) "-iPython Tutorial" \
		isilo/tut/tut.html $@

isilo/python-dist.pdb: isilo/dist/dist.html isilo/dist/dist.css
	$(MKISILO) "-iDistributing Python Modules" \
		isilo/dist/dist.html $@

isilo/python-inst.pdb: isilo/inst/inst.html isilo/inst/inst.css
	$(MKISILO) "-iInstalling Python Modules" \
		isilo/inst/inst.html $@

isilo/python-whatsnew.pdb: isilo/whatsnew/$(WHATSNEW).html isilo/whatsnew/$(WHATSNEW).css
	$(MKISILO) "-iWhat's New in Python X.Y" \
		isilo/whatsnew/$(WHATSNEW).html $@

isilo/api/api.html: $(APIFILES) api/refcounts.dat
	$(MKISILOHTML) --dir isilo/api api/api.tex

isilo/doc/doc.html: $(DOCFILES)
	$(MKISILOHTML) --dir isilo/doc doc/doc.tex

isilo/ext/ext.html: $(EXTFILES)
	$(MKISILOHTML) --dir isilo/ext ext/ext.tex

isilo/lib/lib.html: $(LIBFILES)
	$(MKISILOHTML) --dir isilo/lib lib/lib.tex

isilo/mac/mac.html: $(MACFILES)
	$(MKISILOHTML) --dir isilo/mac mac/mac.tex

isilo/ref/ref.html: $(REFFILES)
	$(MKISILOHTML) --dir isilo/ref ref/ref.tex

isilo/tut/tut.html: $(TUTFILES)
	$(MKISILOHTML) --dir isilo/tut tut/tut.tex

isilo/inst/inst.html: $(INSTFILES) perl/distutils.perl
	$(MKISILOHTML) --dir isilo/inst inst/inst.tex

isilo/dist/dist.html: $(DISTFILES) perl/distutils.perl
	$(MKISILOHTML) --dir isilo/dist dist/dist.tex

isilo/whatsnew/$(WHATSNEW).html: whatsnew/$(WHATSNEW).tex
	$(MKISILOHTML) --dir isilo/whatsnew whatsnew/$(WHATSNEW).tex

# These are useful if you need to transport the iSilo-ready HTML to
# another machine to perform the conversion:

isilozip: isilo-html-$(RELEASE).zip

isilo-html-$(RELEASE).zip: $(ISILOINDEXFILES)
	rm -f $@
	cd isilo && \
		zip -q -9 ../$@ */*.css */*.html */*.txt


# webchecker needs an extra flag to process the huge index from the libref
WEBCHECKER=$(PYTHON) ../Tools/webchecker/webchecker.py
HTMLBASE= file:`pwd`/html

webcheck: $(ALLHTMLFILES)
	$(WEBCHECKER) $(HTMLBASE)/api/
	$(WEBCHECKER) $(HTMLBASE)/doc/
	$(WEBCHECKER) $(HTMLBASE)/ext/
	$(WEBCHECKER) -m290000 $(HTMLBASE)/lib/
	$(WEBCHECKER) $(HTMLBASE)/mac/
	$(WEBCHECKER) $(HTMLBASE)/ref/
	$(WEBCHECKER) $(HTMLBASE)/tut/
	$(WEBCHECKER) $(HTMLBASE)/dist/
	$(WEBCHECKER) $(HTMLBASE)/inst/
	$(WEBCHECKER) $(HTMLBASE)/whatsnew/

fastwebcheck: $(ALLHTMLFILES)
	$(WEBCHECKER) -x $(HTMLBASE)/api/
	$(WEBCHECKER) -x $(HTMLBASE)/doc/
	$(WEBCHECKER) -x $(HTMLBASE)/ext/
	$(WEBCHECKER) -x -m290000 $(HTMLBASE)/lib/
	$(WEBCHECKER) -x $(HTMLBASE)/mac/
	$(WEBCHECKER) -x $(HTMLBASE)/ref/
	$(WEBCHECKER) -x $(HTMLBASE)/tut/
	$(WEBCHECKER) -x $(HTMLBASE)/dist/
	$(WEBCHECKER) -x $(HTMLBASE)/inst/
	$(WEBCHECKER) -x $(HTMLBASE)/whatsnew/


# Release packaging targets:

paper-$(PAPER)/README: $(PSFILES) $(TOOLSDIR)/getpagecounts
	cd paper-$(PAPER) && ../$(TOOLSDIR)/getpagecounts -r $(RELEASE) >../$@

info-$(RELEASE).tgz: info
	cd $(INFODIR) && tar cf - README python.dir python-*.info* \
		| gzip -9 >../$@

info-$(RELEASE).tar.bz2: info
	cd $(INFODIR) && tar cf - README python.dir python-*.info* \
		| bzip2 -9 >../$@

latex-$(RELEASE).tgz:
	$(PYTHON) $(TOOLSDIR)/mksourcepkg --gzip $(RELEASE)

latex-$(RELEASE).tar.bz2:
	$(PYTHON) $(TOOLSDIR)/mksourcepkg --bzip2 $(RELEASE)

latex-$(RELEASE).zip:
	rm -f $@
	$(PYTHON) $(TOOLSDIR)/mksourcepkg --zip $(RELEASE)

pdf-$(PAPER)-$(RELEASE).tar: $(PDFFILES)
	rm -f $@
	mkdir Python-Docs-$(RELEASE)
	cp paper-$(PAPER)/*.pdf Python-Docs-$(RELEASE)
	tar cf $@ Python-Docs-$(RELEASE)
	rm -r Python-Docs-$(RELEASE)

pdf-$(PAPER)-$(RELEASE).tgz: pdf-$(PAPER)-$(RELEASE).tar
	gzip -9 <$? >$@

pdf-$(PAPER)-$(RELEASE).tar.bz2: pdf-$(PAPER)-$(RELEASE).tar
	bzip2 -9 <$? >$@

pdf-$(PAPER)-$(RELEASE).zip: pdf
	rm -f $@
	mkdir Python-Docs-$(RELEASE)
	cp paper-$(PAPER)/*.pdf Python-Docs-$(RELEASE)
	zip -q -r -9 $@ Python-Docs-$(RELEASE)
	rm -r Python-Docs-$(RELEASE)

postscript-$(PAPER)-$(RELEASE).tar: $(PSFILES) paper-$(PAPER)/README
	rm -f $@
	mkdir Python-Docs-$(RELEASE)
	cp paper-$(PAPER)/*.ps Python-Docs-$(RELEASE)
	cp paper-$(PAPER)/README Python-Docs-$(RELEASE)
	tar cf $@ Python-Docs-$(RELEASE)
	rm -r Python-Docs-$(RELEASE)

postscript-$(PAPER)-$(RELEASE).tar.bz2: postscript-$(PAPER)-$(RELEASE).tar
	bzip2 -9 <$< >$@

postscript-$(PAPER)-$(RELEASE).tgz: postscript-$(PAPER)-$(RELEASE).tar
	gzip -9 <$< >$@

postscript-$(PAPER)-$(RELEASE).zip: $(PSFILES) paper-$(PAPER)/README
	rm -f $@
	mkdir Python-Docs-$(RELEASE)
	cp paper-$(PAPER)/*.ps Python-Docs-$(RELEASE)
	cp paper-$(PAPER)/README Python-Docs-$(RELEASE)
	zip -q -r -9 $@ Python-Docs-$(RELEASE)
	rm -r Python-Docs-$(RELEASE)

HTMLPKGFILES=*.html */*.css */*.html */*.gif */*.png */*.txt

html-$(RELEASE).tar: $(ALLHTMLFILES) $(HTMLCSSFILES)
	mkdir Python-Docs-$(RELEASE)
	-find html -name '*.gif' -size 0 | xargs rm -f
	cd html && tar cf ../temp.tar $(HTMLPKGFILES)
	cd Python-Docs-$(RELEASE) && tar xf ../temp.tar
	rm temp.tar
	tar cf html-$(RELEASE).tar Python-Docs-$(RELEASE)
	rm -r Python-Docs-$(RELEASE)

html-$(RELEASE).tgz: html-$(RELEASE).tar
	gzip -9 <$? >$@

html-$(RELEASE).tar.bz2: html-$(RELEASE).tar
	bzip2 -9 <$? >$@

html-$(RELEASE).zip: $(ALLHTMLFILES) $(HTMLCSSFILES)
	rm -f $@
	mkdir Python-Docs-$(RELEASE)
	cd html && tar cf ../temp.tar $(HTMLPKGFILES)
	cd Python-Docs-$(RELEASE) && tar xf ../temp.tar
	rm temp.tar
	zip -q -r -9 $@ Python-Docs-$(RELEASE)
	rm -r Python-Docs-$(RELEASE)

isilo-$(RELEASE).zip: isilo
	rm -f $@
	mkdir Python-Docs-$(RELEASE)
	cp isilo/python-*.pdb Python-Docs-$(RELEASE)
	zip -q -r -9 $@ Python-Docs-$(RELEASE)
	rm -r Python-Docs-$(RELEASE)


# convenience targets:

tarhtml: html-$(RELEASE).tgz
tarinfo: info-$(RELEASE).tgz
tarps: postscript-$(PAPER)-$(RELEASE).tgz
tarpdf: pdf-$(PAPER)-$(RELEASE).tgz
tarlatex: latex-$(RELEASE).tgz

tarballs: tarpdf tarps tarhtml

ziphtml: html-$(RELEASE).zip
zipps: postscript-$(PAPER)-$(RELEASE).zip
zippdf: pdf-$(PAPER)-$(RELEASE).zip
ziplatex: latex-$(RELEASE).zip
zipisilo: isilo-$(RELEASE).zip

zips: zippdf zipps ziphtml

bziphtml: html-$(RELEASE).tar.bz2
bzipinfo: info-$(RELEASE).tar.bz2
bzipps: postscript-$(PAPER)-$(RELEASE).tar.bz2
bzippdf: pdf-$(PAPER)-$(RELEASE).tar.bz2
bziplatex: latex-$(RELEASE).tar.bz2

bzips: bzippdf bzipps bziphtml

disthtml: bziphtml ziphtml
distinfo: bzipinfo
distps: bzipps zipps
distpdf: bzippdf zippdf
distlatex: bziplatex ziplatex

# We use the "pkglist" target at the end of these to ensure the
# package list is updated after building either of these; this seems a
# reasonable compromise between only building it for distfiles or
# having to build it manually. Doing it here allows the packages for
# distribution to be built using either of
#     make distfiles && make PAPER=a4 paperdist
#     make paperdist && make PAPER=a4 distfiles
# The small amount of additional work is a small price to pay for not
# having to remember which order to do it in. ;)
paperdist: distpdf distps pkglist
edist: disthtml pkglist

# The pkglist.html file is used as part of the download.html page on
# python.org; it is not used as intermediate input here or as part of
# the packages created.
pkglist:
	$(TOOLSDIR)/mkpkglist >pkglist.html

distfiles: paperdist edist
	$(TOOLSDIR)/mksourcepkg --bzip2 --zip $(RELEASE)
	$(TOOLSDIR)/mkpkglist >pkglist.html


# Housekeeping targets

# Remove temporary files; all except the following:
# - sources: .tex, .bib, .sty, *.cls
# - useful results: .dvi, .pdf, .ps, .texi, .info
clean:
	rm -f html-$(RELEASE).tar
	cd $(INFODIR) && $(MAKE) clean

# Remove temporaries as well as final products
clobber:
	rm -f html-$(RELEASE).tar
	rm -f html-$(RELEASE).tgz info-$(RELEASE).tgz
	rm -f pdf-$(RELEASE).tgz postscript-$(RELEASE).tgz
	rm -f latex-$(RELEASE).tgz html-$(RELEASE).zip
	rm -f pdf-$(RELEASE).zip postscript-$(RELEASE).zip
	rm -f $(DVIFILES) $(PSFILES) $(PDFFILES)
	cd $(INFODIR) && $(MAKE) clobber
	rm -f paper-$(PAPER)/*.tex paper-$(PAPER)/*.ind paper-$(PAPER)/*.idx
	rm -f paper-$(PAPER)/*.l2h paper-$(PAPER)/*.how paper-$(PAPER)/README
	rm -rf html/index.html html/modindex.html html/acks.html
	rm -rf html/api/ html/doc/ html/ext/ html/lib/ html/mac/
	rm -rf html/ref/ html/tut/ html/inst/ html/dist/
	rm -rf html/whatsnew/
	rm -rf isilo/api/ isilo/doc/ isilo/ext/ isilo/lib/ isilo/mac/
	rm -rf isilo/ref/ isilo/tut/ isilo/inst/ isilo/dist/
	rm -rf isilo/whatsnew/
	rm -f isilo/python-*.pdb isilo-$(RELEASE).zip

realclean distclean: clobber
|
@ -1,382 +0,0 @@
|
|||
# LaTeX source dependencies.
|
||||
|
||||
COMMONSTYLES= texinputs/python.sty \
|
||||
texinputs/pypaper.sty
|
||||
|
||||
INDEXSTYLES=texinputs/python.ist
|
||||
|
||||
COMMONTEX=commontex/copyright.tex \
|
||||
commontex/license.tex \
|
||||
commontex/patchlevel.tex \
|
||||
commontex/boilerplate.tex
|
||||
|
||||
MANSTYLES= texinputs/fncychap.sty \
|
||||
texinputs/manual.cls \
|
||||
$(COMMONSTYLES)
|
||||
|
||||
HOWTOSTYLES= texinputs/howto.cls \
|
||||
$(COMMONSTYLES)
|
||||
|
||||
|
||||
APIFILES= $(MANSTYLES) $(INDEXSTYLES) $(COMMONTEX) \
|
||||
api/api.tex \
|
||||
api/abstract.tex \
|
||||
api/concrete.tex \
|
||||
api/exceptions.tex \
|
||||
api/init.tex \
|
||||
api/intro.tex \
|
||||
api/memory.tex \
|
||||
api/newtypes.tex \
|
||||
api/refcounting.tex \
|
||||
api/utilities.tex \
|
||||
api/veryhigh.tex \
|
||||
commontex/typestruct.h \
|
||||
commontex/reportingbugs.tex
|
||||
|
||||
# These files are generated from those listed above, and are used to
|
||||
# generate the typeset versions of the manuals. The list is defined
|
||||
# here to make it easier to ensure parallelism.
|
||||
ANNOAPIFILES= $(MANSTYLES) $(INDEXSTYLES) $(COMMONTEX) api/refcounts.dat \
|
||||
paper-$(PAPER)/api.tex \
|
||||
paper-$(PAPER)/abstract.tex \
|
||||
paper-$(PAPER)/concrete.tex \
|
||||
paper-$(PAPER)/exceptions.tex \
|
||||
paper-$(PAPER)/init.tex \
|
||||
paper-$(PAPER)/intro.tex \
|
||||
paper-$(PAPER)/memory.tex \
|
||||
paper-$(PAPER)/newtypes.tex \
|
||||
paper-$(PAPER)/refcounting.tex \
|
||||
paper-$(PAPER)/utilities.tex \
|
||||
paper-$(PAPER)/veryhigh.tex \
|
||||
commontex/reportingbugs.tex
|
||||
|
||||
DOCFILES= $(HOWTOSTYLES) \
|
||||
commontex/boilerplate.tex \
|
||||
texinputs/ltxmarkup.sty \
|
||||
doc/doc.tex
|
||||
|
||||
EXTFILES= ext/ext.tex $(MANSTYLES) $(INDEXSTYLES) $(COMMONTEX) \
|
||||
ext/extending.tex \
|
||||
ext/newtypes.tex \
|
||||
ext/building.tex \
|
||||
ext/windows.tex \
|
||||
ext/embedding.tex \
|
||||
ext/noddy.c \
|
||||
ext/noddy2.c \
|
||||
ext/noddy3.c \
|
||||
ext/noddy4.c \
|
||||
ext/run-func.c \
|
||||
commontex/typestruct.h \
|
||||
commontex/reportingbugs.tex
|
||||
|
||||
TUTFILES= tut/tut.tex tut/glossary.tex $(MANSTYLES) $(COMMONTEX)
|
||||
|
||||
# LaTeX source files for the Python Reference Manual
|
||||
REFFILES= $(MANSTYLES) $(INDEXSTYLES) $(COMMONTEX) \
|
||||
ref/ref.tex \
|
||||
ref/ref1.tex \
|
||||
ref/ref2.tex \
|
||||
ref/ref3.tex \
|
||||
ref/ref4.tex \
|
||||
ref/ref5.tex \
|
||||
ref/ref6.tex \
|
||||
ref/ref7.tex \
|
||||
ref/ref8.tex
|
||||
|
||||
# LaTeX source files for the Python Library Reference
|
||||
LIBFILES= $(MANSTYLES) $(INDEXSTYLES) $(COMMONTEX) \
|
||||
commontex/reportingbugs.tex \
|
||||
lib/lib.tex \
|
||||
lib/asttable.tex \
|
||||
lib/compiler.tex \
|
||||
lib/distutils.tex \
|
||||
lib/email.tex \
|
||||
lib/emailencoders.tex \
|
||||
lib/emailexc.tex \
|
||||
lib/emailgenerator.tex \
|
||||
lib/emailiter.tex \
|
||||
lib/emailmessage.tex \
|
||||
lib/emailparser.tex \
|
||||
lib/emailutil.tex \
|
||||
lib/libintro.tex \
|
||||
lib/libobjs.tex \
|
||||
lib/libstdtypes.tex \
|
||||
lib/libexcs.tex \
|
||||
lib/libconsts.tex \
|
||||
lib/libfuncs.tex \
|
||||
lib/libpython.tex \
|
||||
lib/libsys.tex \
|
||||
lib/libplatform.tex \
|
||||
lib/libfpectl.tex \
|
||||
lib/libgc.tex \
|
||||
lib/libsets.tex \
|
||||
lib/libweakref.tex \
|
||||
lib/libinspect.tex \
|
||||
lib/libpydoc.tex \
|
||||
lib/libdifflib.tex \
|
||||
lib/libdoctest.tex \
|
||||
lib/libunittest.tex \
|
||||
lib/libtest.tex \
|
||||
lib/libtypes.tex \
|
||||
lib/libtraceback.tex \
|
||||
lib/libpickle.tex \
|
||||
lib/libshelve.tex \
|
||||
lib/libcopy.tex \
|
||||
lib/libmarshal.tex \
|
||||
lib/libwarnings.tex \
|
||||
lib/libimp.tex \
|
||||
lib/libzipimport.tex \
|
||||
lib/librunpy.tex \
|
||||
lib/libpkgutil.tex \
|
||||
lib/libparser.tex \
|
||||
lib/libbltin.tex \
|
||||
lib/libmain.tex \
|
||||
lib/libfuture.tex \
|
||||
lib/libstrings.tex \
|
||||
lib/libstring.tex \
|
||||
lib/libtextwrap.tex \
|
||||
lib/libcodecs.tex \
|
||||
lib/libunicodedata.tex \
|
||||
lib/libstringprep.tex \
|
||||
lib/libstruct.tex \
|
||||
lib/libmisc.tex \
|
||||
lib/libmath.tex \
|
||||
lib/libdecimal.tex \
|
||||
lib/libarray.tex \
|
||||
lib/liballos.tex \
|
||||
lib/libos.tex \
|
||||
lib/libdatetime.tex \
|
||||
lib/tzinfo-examples.py \
|
||||
lib/libtime.tex \
|
||||
lib/libgetopt.tex \
|
||||
lib/liboptparse.tex \
|
||||
lib/caseless.py \
|
||||
lib/required_1.py \
|
||||
lib/required_2.py \
|
||||
lib/libtempfile.tex \
|
||||
lib/liberrno.tex \
|
||||
lib/libctypes.tex \
|
||||
lib/libsomeos.tex \
|
||||
lib/libsignal.tex \
|
||||
lib/libsocket.tex \
|
||||
lib/libselect.tex \
|
||||
lib/libthread.tex \
|
||||
lib/libdummythread.tex \
|
||||
lib/libunix.tex \
|
||||
lib/libposix.tex \
|
||||
lib/libposixpath.tex \
|
||||
lib/libpwd.tex \
|
||||
lib/libspwd.tex \
|
||||
lib/libgrp.tex \
|
||||
lib/libcrypt.tex \
|
||||
lib/libdbm.tex \
|
||||
lib/libgdbm.tex \
|
||||
lib/libtermios.tex \
|
||||
lib/libfcntl.tex \
|
||||
lib/libposixfile.tex \
|
||||
lib/libsyslog.tex \
|
||||
lib/liblogging.tex \
|
||||
lib/libpdb.tex \
|
||||
lib/libprofile.tex \
|
||||
lib/libhotshot.tex \
|
||||
lib/libtimeit.tex \
|
||||
lib/libtrace.tex \
|
||||
lib/libcgi.tex \
|
||||
lib/libcgitb.tex \
|
||||
lib/liburllib.tex \
|
||||
lib/liburllib2.tex \
|
||||
lib/libhttplib.tex \
|
||||
lib/libftplib.tex \
|
||||
lib/libnntplib.tex \
|
||||
lib/liburlparse.tex \
|
||||
lib/libhtmlparser.tex \
|
||||
lib/libhtmllib.tex \
|
||||
lib/libsgmllib.tex \
|
||||
lib/librfc822.tex \
|
||||
lib/libmimetools.tex \
|
||||
lib/libmimewriter.tex \
|
||||
lib/libbinascii.tex \
|
||||
lib/libmm.tex \
|
||||
lib/libaudioop.tex \
|
||||
lib/libimageop.tex \
|
||||
lib/libaifc.tex \
|
||||
lib/libjpeg.tex \
|
||||
lib/libossaudiodev.tex \
|
||||
lib/libcrypto.tex \
|
||||
lib/libhashlib.tex \
|
||||
lib/libmd5.tex \
|
||||
lib/libsha.tex \
|
||||
lib/libhmac.tex \
|
||||
lib/libsgi.tex \
|
||||
lib/libal.tex \
|
||||
lib/libcd.tex \
|
||||
lib/libfl.tex \
|
||||
lib/libfm.tex \
|
||||
lib/libgl.tex \
|
||||
lib/libimgfile.tex \
|
||||
lib/libsun.tex \
|
||||
lib/libxdrlib.tex \
|
||||
lib/libimghdr.tex \
|
||||
lib/librestricted.tex \
|
||||
lib/librexec.tex \
|
||||
lib/libbastion.tex \
|
||||
lib/libformatter.tex \
|
||||
lib/liboperator.tex \
|
||||
lib/libresource.tex \
|
||||
lib/libstat.tex \
|
||||
lib/libstringio.tex \
|
||||
lib/libtoken.tex \
|
||||
lib/libkeyword.tex \
|
||||
lib/libundoc.tex \
|
||||
lib/libmailcap.tex \
|
||||
lib/libglob.tex \
|
||||
lib/libuser.tex \
|
||||
lib/libanydbm.tex \
|
||||
lib/libbsddb.tex \
|
||||
lib/libdumbdbm.tex \
|
||||
lib/libdbhash.tex \
|
||||
lib/librandom.tex \
|
||||
lib/libsite.tex \
|
||||
lib/libwhichdb.tex \
|
||||
lib/libbase64.tex \
|
||||
lib/libfnmatch.tex \
|
||||
lib/libquopri.tex \
|
||||
lib/libzlib.tex \
|
||||
lib/libsocksvr.tex \
|
||||
lib/libmailbox.tex \
|
||||
lib/libcommands.tex \
|
||||
lib/libcmath.tex \
|
||||
lib/libgzip.tex \
|
||||
lib/libbz2.tex \
|
||||
lib/libzipfile.tex \
|
||||
lib/libpprint.tex \
|
||||
lib/libcode.tex \
|
||||
lib/libmimify.tex \
|
||||
lib/libre.tex \
|
||||
lib/libuserdict.tex \
|
||||
lib/libdis.tex \
|
||||
lib/libxmlrpclib.tex \
|
||||
lib/libsimplexmlrpc.tex \
|
||||
lib/libdocxmlrpc.tex \
|
||||
lib/libpyexpat.tex \
|
||||
lib/libfunctools.tex \
|
||||
lib/xmldom.tex \
|
||||
lib/xmldomminidom.tex \
|
||||
lib/xmldompulldom.tex \
|
||||
lib/xmlsax.tex \
|
||||
lib/xmlsaxhandler.tex \
|
||||
lib/xmlsaxutils.tex \
|
||||
lib/xmlsaxreader.tex \
|
||||
lib/libetree.tex \
|
||||
lib/libqueue.tex \
|
||||
lib/liblocale.tex \
|
||||
lib/libgettext.tex \
|
||||
lib/libbasehttp.tex \
|
||||
lib/libcookie.tex \
|
||||
lib/libcookielib.tex \
|
||||
lib/libcopyreg.tex \
|
||||
lib/libsymbol.tex \
|
||||
lib/libbinhex.tex \
|
||||
lib/libuu.tex \
|
||||
lib/libsunaudio.tex \
|
||||
lib/libfileinput.tex \
|
||||
lib/libimaplib.tex \
|
||||
lib/libpoplib.tex \
|
||||
lib/libcalendar.tex \
|
||||
lib/libpopen2.tex \
|
||||
lib/libbisect.tex \
|
||||
lib/libcollections.tex \
|
||||
lib/libheapq.tex \
|
||||
lib/libmimetypes.tex \
|
||||
lib/libsmtplib.tex \
|
||||
lib/libsmtpd.tex \
|
||||
lib/libcmd.tex \
|
||||
lib/libmultifile.tex \
|
||||
lib/libthreading.tex \
|
||||
lib/libdummythreading.tex \
|
||||
lib/libwebbrowser.tex \
|
||||
lib/internet.tex \
|
||||
lib/netdata.tex \
|
||||
lib/markup.tex \
|
||||
lib/language.tex \
|
||||
lib/libpycompile.tex \
|
||||
lib/libcompileall.tex \
|
||||
lib/libshlex.tex \
|
||||
lib/libnetrc.tex \
|
||||
lib/librobotparser.tex \
|
||||
lib/libgetpass.tex \
|
||||
lib/libshutil.tex \
|
||||
lib/librepr.tex \
|
||||
lib/libmsilib.tex \
|
||||
lib/libmsvcrt.tex \
|
||||
lib/libwinreg.tex \
|
||||
lib/libwinsound.tex \
|
||||
lib/windows.tex \
|
||||
lib/libpyclbr.tex \
|
||||
lib/libtokenize.tex \
|
||||
lib/libtabnanny.tex \
|
||||
lib/libmhlib.tex \
|
||||
lib/libtelnetlib.tex \
|
||||
lib/libcolorsys.tex \
|
||||
lib/libfpformat.tex \
|
||||
lib/libcgihttp.tex \
|
||||
lib/libsimplehttp.tex \
|
||||
lib/liblinecache.tex \
|
||||
lib/libnew.tex \
|
||||
lib/libdircache.tex \
|
||||
lib/libfilecmp.tex \
|
||||
lib/libsunau.tex \
|
||||
lib/libwave.tex \
|
||||
lib/libchunk.tex \
|
||||
lib/libcodeop.tex \
|
||||
lib/libcurses.tex \
|
||||
lib/libcursespanel.tex \
|
||||
lib/libascii.tex \
|
||||
lib/libdl.tex \
|
||||
lib/libmutex.tex \
|
||||
lib/libnis.tex \
|
||||
lib/libpipes.tex \
|
||||
lib/libpty.tex \
|
||||
lib/libreadline.tex \
|
||||
lib/librlcompleter.tex \
|
||||
lib/libsched.tex \
|
||||
lib/libstatvfs.tex \
|
||||
lib/libtty.tex \
|
||||
lib/libasyncore.tex \
|
||||
lib/libasynchat.tex \
|
||||
lib/libatexit.tex \
|
||||
lib/libmmap.tex \
|
||||
lib/tkinter.tex \
|
||||
lib/libturtle.tex \
|
||||
lib/libtarfile.tex \
|
||||
lib/libcsv.tex \
|
||||
lib/libcfgparser.tex \
|
||||
lib/libsqlite3.tex
|
||||
|
||||
# LaTeX source files for Macintosh Library Modules.
|
||||
MACFILES= $(HOWTOSTYLES) $(INDEXSTYLES) $(COMMONTEX) \
|
||||
mac/mac.tex \
|
||||
mac/using.tex \
|
||||
mac/scripting.tex \
|
||||
mac/toolbox.tex \
|
||||
mac/undoc.tex \
|
||||
mac/libcolorpicker.tex \
|
||||
mac/libmac.tex \
|
||||
mac/libgensuitemodule.tex \
|
||||
mac/libaetools.tex \
|
||||
mac/libaepack.tex \
|
||||
mac/libaetypes.tex \
|
||||
mac/libmacos.tex \
|
||||
mac/libmacostools.tex \
|
||||
mac/libmacui.tex \
|
||||
mac/libmacic.tex \
|
||||
mac/libframework.tex \
|
||||
mac/libautogil.tex \
|
||||
mac/libminiae.tex \
|
||||
mac/libscrap.tex
|
||||
|
||||
INSTFILES = $(HOWTOSTYLES) inst/inst.tex
|
||||
|
||||
DISTFILES = $(HOWTOSTYLES) \
|
||||
dist/dist.tex \
|
||||
dist/sysconfig.tex
|
246
Doc/README
|
@ -1,246 +0,0 @@
|
|||
Python standard documentation -- in LaTeX
|
||||
-----------------------------------------
|
||||
|
||||
This directory contains the LaTeX sources to the Python documentation
|
||||
and tools required to support the formatting process. The documents
|
||||
now require LaTeX2e; LaTeX 2.09 compatibility has been dropped.
|
||||
|
||||
If you don't have LaTeX, or if you'd rather not format the
|
||||
documentation yourself, you can ftp a tar file containing HTML, PDF,
|
||||
or PostScript versions of all documents. Additional formats may be
|
||||
available. These should be in the same place where you fetched the
|
||||
main Python distribution (try <http://www.python.org/> or
|
||||
<ftp://ftp.python.org/pub/python/>).
|
||||
|
||||
The following are the LaTeX source files:
|
||||
|
||||
api/*.tex Python/C API Reference Manual
|
||||
doc/*.tex Documenting Python
|
||||
ext/*.tex Extending and Embedding the Python Interpreter
|
||||
lib/*.tex Python Library Reference
|
||||
mac/*.tex Macintosh Library Modules
|
||||
ref/*.tex Python Reference Manual
|
||||
tut/*.tex Python Tutorial
|
||||
inst/*.tex Installing Python Modules
|
||||
dist/*.tex Distributing Python Modules
|
||||
|
||||
Most use the "manual" document class and "python" package, derived from
|
||||
the old "myformat.sty" style file. The Macintosh Library Modules
|
||||
document uses the "howto" document class instead. These contain many
|
||||
macro definitions useful in documenting Python, and set some style
|
||||
parameters.
|
||||
|
||||
There's a Makefile to call LaTeX and the other utilities in the right
|
||||
order and the right number of times. By default, it will build the
|
||||
HTML version of the documentation, but DVI, PDF, and PostScript can
|
||||
also be made. To view the generated HTML, point your favorite browser
|
||||
at the top-level index (html/index.html) after running "make".
|
||||
|
||||
The Makefile can also produce DVI files for each document made; to
|
||||
preview them, use xdvi. PostScript is produced by the same Makefile
|
||||
target that produces the DVI files. This uses the dvips tool.
|
||||
Printing depends on local conventions; at our site, we use lpr. For
|
||||
example:
|
||||
|
||||
make paper-letter/lib.ps # create lib.dvi and lib.ps
|
||||
xdvi paper-letter/lib.dvi # preview lib.dvi
|
||||
lpr paper-letter/lib.ps # print on default printer
|
||||
|
||||
|
||||
What if I find a bug?
|
||||
---------------------
|
||||
|
||||
First, check that the bug is present in the development version of the
|
||||
documentation at <http://www.python.org/dev/doc/devel/>; we may
|
||||
have already fixed it.
|
||||
|
||||
If we haven't, tell us about it. We'd like the documentation to be
|
||||
complete and accurate, but have limited time. If you discover any
|
||||
inconsistencies between the documentation and implementation, or just
|
||||
have suggestions as to how to improve the documentation, let us know!
|
||||
Specific bugs and patches should be reported using our bug & patch
|
||||
databases at:
|
||||
|
||||
http://sourceforge.net/projects/python
|
||||
|
||||
Other suggestions or questions should be sent to the Python
|
||||
Documentation Team:
|
||||
|
||||
docs@python.org
|
||||
|
||||
Thanks!
|
||||
|
||||
|
||||
What tools do I need?
|
||||
---------------------
|
||||
|
||||
You need to install Python; some of the scripts used to produce the
|
||||
documentation are written in Python. You don't need this
|
||||
documentation to install Python; instructions are included in the
|
||||
README file in the Python distribution.
|
||||
|
||||
The simplest way to get the rest of the tools in the configuration we
|
||||
used is to install the teTeX TeX distribution, version 0.9 or newer.
|
||||
More information is available on teTeX at <http://www.tug.org/tetex/>.
|
||||
This is a Unix-only TeX distribution at this time. This documentation
|
||||
release was tested with the 1.0.7 release, but there have been no
|
||||
substantial changes since late in the 0.9 series, which we used
|
||||
extensively for previous versions without any difficulty.
|
||||
|
||||
If you don't want to get teTeX, here is what you'll need:
|
||||
|
||||
To create DVI, PDF, or PostScript files:
|
||||
|
||||
- LaTeX2e, 1995/12/01 or newer. Older versions are likely to
|
||||
choke.
|
||||
|
||||
- makeindex. This is used to produce the indexes for the
|
||||
library reference and Python/C API reference.
|
||||
|
||||
To create PDF files:
|
||||
|
||||
- pdflatex. We used the one in the teTeX distribution (pdfTeX
|
||||
version 3.14159-13d (Web2C 7.3.1) at the time of this
|
||||
writing). Versions even a couple of patchlevels earlier are
|
||||
highly likely to fail due to syntax changes for some of the
|
||||
pdftex primitives.
|
||||
|
||||
To create PostScript files:
|
||||
|
||||
- dvips. Most TeX installations include this. If you don't
|
||||
have one, check CTAN (<ftp://ctan.tug.org/tex-archive/>).
|
||||
|
||||
To create info files:
|
||||
|
||||
Note that info support is currently being revised using new
|
||||
conversion tools by Michael Ernst <mernst@cs.washington.edu>.
|
||||
|
||||
- makeinfo. This is available from any GNU mirror.
|
||||
|
||||
- emacs or xemacs. Emacs is available from the same place as
|
||||
makeinfo, and xemacs is available from ftp.xemacs.org.
|
||||
|
||||
- Perl. Find the software at
|
||||
<http://language.perl.com/info/software.html>.
|
||||
|
||||
- HTML::Element. If you don't have this installed, you can get
|
||||
this from CPAN. Use the command:
|
||||
|
||||
perl -e 'use CPAN; CPAN::install("HTML::Element");'
|
||||
|
||||
You may need to be root to do this.
|
||||
|
||||
To create HTML files:
|
||||
|
||||
- Perl 5.6.0 or newer. Find the software at
|
||||
<http://language.perl.com/info/software.html>.
|
||||
|
||||
- LaTeX2HTML 99.2b8 or newer. Older versions are not
|
||||
supported; each version changes enough that supporting
|
||||
multiple versions is not likely to work. Many older
|
||||
versions don't work with Perl 5.6 as well. This also screws
|
||||
up code fragments. ;-( Releases are available at:
|
||||
<http://www.latex2html.org/>.
|
||||
|
||||
|
||||
I got a make error: "make: don't know how to make commontex/patchlevel.tex."
|
||||
----------------------------------------------------------------------------
|
||||
|
||||
Your version of make doesn't support the 'shell' function. You will need to
|
||||
use a version which does, e.g. GNU make.
|
||||
|
||||
|
||||
LaTeX (or pdfLaTeX) ran out of memory; how can I fix it?
|
||||
--------------------------------------------------------
|
||||
|
||||
This is known to be a problem at least on Mac OS X, but it has been
|
||||
observed on other systems in the past.
|
||||
|
||||
On some systems, the default sizes of some of the memory pools
|
||||
allocated by TeX need to be changed; this is a configuration setting
|
||||
for installations based on web2c (most if not all installations).
|
||||
This is usually set in a file named texmf/web2c/texmf.cnf (where the
|
||||
top-level texmf/ directory is part of the TeX installation). If you
|
||||
get a "buffer overflow" warning from LaTeX, open that configuration
|
||||
file and look for the "main_memory.pdflatex" setting. If there is not
|
||||
one, you can add a line with the setting. The value 1500000 seems to
|
||||
be sufficient for formatting the Python documentation.
|
||||
|
||||
|
||||
What if Times fonts are not available?
|
||||
--------------------------------------
|
||||
|
||||
As distributed, the LaTeX documents use PostScript Times fonts. This
|
||||
is done since they are much better looking and produce smaller
|
||||
PostScript files. If, however, your TeX installation does not support
|
||||
them, they may be easily disabled. Edit the file
|
||||
texinputs/pypaper.sty and comment out the line that starts
|
||||
"\RequirePackage{times}" by inserting a "%" character at the beginning
|
||||
of the line. If you're formatting the docs for A4 paper instead of
|
||||
US-Letter paper, change paper-a4/pypaper.sty instead. An alternative
|
||||
is to install the right fonts and LaTeX style file.
|
||||
|
||||
|
||||
What if I want to use A4 paper?
|
||||
-------------------------------
|
||||
|
||||
Instead of building the PostScript by giving the command "make ps",
|
||||
give the command "make PAPER=a4 ps"; the output will be produced in
|
||||
the paper-a4/ subdirectory. (You can use "make PAPER=a4 pdf" if you'd
|
||||
rather have PDF output.)
|
||||
|
||||
|
||||
Making HTML files
|
||||
-----------------
|
||||
|
||||
The LaTeX documents can be converted to HTML using Nikos Drakos'
|
||||
LaTeX2HTML converter. See the Makefile; after some twiddling, "make"
|
||||
should do the trick.
|
||||
|
||||
|
||||
What else is in here?
|
||||
---------------------
|
||||
|
||||
There is a new LaTeX document class called "howto". This is used for
|
||||
the new series of Python HOWTO documents which is being coordinated by
|
||||
Andrew Kuchling <akuchlin@mems-exchange.org>. The file
|
||||
templates/howto.tex is a commented example which may be used as a
|
||||
template. A Python script to "do the right thing" to format a howto
|
||||
document is included as tools/mkhowto. These documents can be
|
||||
formatted as HTML, PDF, PostScript, or ASCII files. Use "mkhowto
|
||||
--help" for information on using the formatting tool.
|
||||
|
||||
For authors of module documentation, there is a file
|
||||
templates/module.tex which may be used as a template for a module
|
||||
section. This may be used in conjunction with either the howto or
|
||||
manual document class. Create the documentation for a new module by
|
||||
copying the template to lib<mymodule>.tex and editing according to the
|
||||
instructions in the comments.
|
||||
|
||||
Documentation on authoring Python documentation, including
|
||||
information about both style and markup, is available in the
|
||||
"Documenting Python" manual.
|
||||
|
||||
|
||||
Copyright notice
|
||||
================
|
||||
|
||||
The Python source is copyrighted, but you can freely use and copy it
|
||||
as long as you don't change or remove the copyright notice:
|
||||
|
||||
----------------------------------------------------------------------
|
||||
Copyright (c) 2000-2007 Python Software Foundation.
|
||||
All rights reserved.
|
||||
|
||||
Copyright (c) 2000 BeOpen.com.
|
||||
All rights reserved.
|
||||
|
||||
Copyright (c) 1995-2000 Corporation for National Research Initiatives.
|
||||
All rights reserved.
|
||||
|
||||
Copyright (c) 1991-1995 Stichting Mathematisch Centrum.
|
||||
All rights reserved.
|
||||
|
||||
See the file "commontex/license.tex" for information on usage and
|
||||
redistribution of this file, and for a DISCLAIMER OF ALL WARRANTIES.
|
||||
----------------------------------------------------------------------
|
74
Doc/TODO
|
@ -1,74 +0,0 @@
|
|||
PYTHON DOCUMENTATION TO-DO LIST -*- indented-text -*-
|
||||
===============================
|
||||
|
||||
General
|
||||
-------
|
||||
|
||||
* Figure out HTMLHelp generation for the Windows world.
|
||||
|
||||
|
||||
Python/C API
|
||||
------------
|
||||
|
||||
* The "Very High Level Interface" in the API document has been
|
||||
requested; I guess it wouldn't hurt to fill in a bit there. Request
|
||||
by Albert Hofkamp <a.hofkamp@wtb.tue.nl>. (Partly done.)
|
||||
|
||||
* Describe implementing types in C, including use of the 'self'
|
||||
parameter to the method implementation function. (Missing material
|
||||
mentioned in the Extending & Embedding manual, section 1.1; problem
|
||||
reported by Clay Spence <cspence@sarnoff.com>.) Heavily impacts one
|
||||
chapter of the Python/C API manual.
|
||||
|
||||
* Missing PyArg_ParseTuple(), PyArg_ParseTupleAndKeywords(),
|
||||
Py_BuildValue(). Information requested by Greg Kochanski
|
||||
<gpk@bell-labs.com>. PyEval_EvalCode() has also been requested.
|
||||
|
||||
Extending & Embedding
|
||||
---------------------
|
||||
|
||||
* More information is needed about building dynamically linked
|
||||
extensions in C++. Specifically, the extensions must be linked
|
||||
against the C++ libraries (and possibly runtime). Also noted by
|
||||
Albert Hofkamp <a.hofkamp@wtb.tue.nl>.
|
||||
|
||||
Reference Manual
|
||||
----------------
|
||||
|
||||
* Document the Extended Call Syntax in the language reference.
|
||||
[Jeremy Hylton]
|
||||
|
||||
* Document new comparison support for recursive objects (lang. ref.?
|
||||
library ref.? cmp() function). [Jeremy Hylton]
|
||||
|
||||
Library Reference
|
||||
-----------------
|
||||
|
||||
* Update the pickle documentation to describe all of the current
|
||||
behavior; only a subset is described. __reduce__, etc. Partial
|
||||
update submitted by Jim Kerr <jbkerr@sr.hp.com>.
|
||||
|
||||
* Update the httplib documentation to match Greg Stein's HTTP/1.1
|
||||
support and new classes. (Greg, this is yours!)
|
||||
|
||||
Tutorial
|
||||
--------
|
||||
|
||||
* Update tutorial to use string methods and talk about backward
|
||||
compatibility of same.
|
||||
|
||||
|
||||
NOT WORTH THE TROUBLE
|
||||
---------------------
|
||||
|
||||
* In the indexes, some subitem entries are separated from the item
|
||||
entries by column- or page-breaks. Reported by Lorenzo M. Catucci
|
||||
<lorenzo@argon.roma2.infn.it>. This one will be hard; probably not
|
||||
really worth the pain. (Only an issue at all when a header-letter
|
||||
and the first index entry get separated -- can change as soon as we
|
||||
change the index entries in the text.) Also only a problem in the
|
||||
print version.
|
||||
|
||||
* Fix problem with howto documents getting the last module synopsis
|
||||
twice (in \localmoduletable) so we can get rid of the ugly 'uniq'
|
||||
hack in tools/mkhowto. (Probably not worth the trouble of fixing.)
|
1057
Doc/api/abstract.tex
|
@ -1,60 +0,0 @@
|
|||
\documentclass{manual}
|
||||
|
||||
\title{Python/C API Reference Manual}
|
||||
|
||||
\input{boilerplate}
|
||||
|
||||
\makeindex % tell \index to actually write the .idx file
|
||||
|
||||
|
||||
\begin{document}
|
||||
|
||||
\maketitle
|
||||
|
||||
\ifhtml
|
||||
\chapter*{Front Matter\label{front}}
|
||||
\fi
|
||||
|
||||
\input{copyright}
|
||||
|
||||
\begin{abstract}
|
||||
|
||||
\noindent
|
||||
This manual documents the API used by C and \Cpp{} programmers who
|
||||
want to write extension modules or embed Python. It is a companion to
|
||||
\citetitle[../ext/ext.html]{Extending and Embedding the Python
|
||||
Interpreter}, which describes the general principles of extension
|
||||
writing but does not document the API functions in detail.
|
||||
|
||||
\warning{The current version of this document is incomplete. I hope
|
||||
that it is nevertheless useful. I will continue to work on it, and
|
||||
release new versions from time to time, independent from Python source
|
||||
code releases.}
|
||||
|
||||
\end{abstract}
|
||||
|
||||
\tableofcontents
|
||||
|
||||
|
||||
\input{intro}
|
||||
\input{veryhigh}
|
||||
\input{refcounting}
|
||||
\input{exceptions}
|
||||
\input{utilities}
|
||||
\input{abstract}
|
||||
\input{concrete}
|
||||
\input{init}
|
||||
\input{memory}
|
||||
\input{newtypes}
|
||||
|
||||
|
||||
\appendix
|
||||
\chapter{Reporting Bugs}
|
||||
\input{reportingbugs}
|
||||
|
||||
\chapter{History and License}
|
||||
\input{license}
|
||||
|
||||
\input{api.ind} % Index -- must be last
|
||||
|
||||
\end{document}
|
3238
Doc/api/concrete.tex
|
@ -1,442 +0,0 @@
|
|||
\chapter{Exception Handling \label{exceptionHandling}}
|
||||
|
||||
The functions described in this chapter will let you handle and raise Python
|
||||
exceptions. It is important to understand some of the basics of
|
||||
Python exception handling. It works somewhat like the
|
||||
\UNIX{} \cdata{errno} variable: there is a global indicator (per
|
||||
thread) of the last error that occurred. Most functions don't clear
|
||||
this on success, but will set it to indicate the cause of the error on
|
||||
failure. Most functions also return an error indicator, usually
|
||||
\NULL{} if they are supposed to return a pointer, or \code{-1} if they
|
||||
return an integer (exception: the \cfunction{PyArg_*()} functions
|
||||
return \code{1} for success and \code{0} for failure).
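
The analogy with \cdata{errno} can be seen in plain C. The following is a minimal sketch, not part of the Python/C API: the helper name \code{open_for_reading} is hypothetical, and it assumes a POSIX-like system where \code{fopen()} sets \code{errno} on failure. The return value signals that something went wrong; the global indicator only says why.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open a file for reading.
 * Returns NULL on failure and leaves errno describing the cause,
 * mirroring the "error return value plus global indicator" convention. */
static FILE *
open_for_reading(const char *path)
{
    assert(path != NULL);

    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        /* fprintf() may itself modify errno, so preserve the original cause. */
        int saved_errno = errno;
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(saved_errno));
        errno = saved_errno;
    }
    return fp;
}
```

A caller checks the return value first and consults \code{errno} only after a failure, since a successful call does not necessarily reset it.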
|
||||
|
||||
When a function must fail because some function it called failed, it
|
||||
generally doesn't set the error indicator; the function it called
|
||||
already set it. It is responsible for either handling the error and
|
||||
clearing the exception or returning after cleaning up any resources it
|
||||
holds (such as object references or memory allocations); it should
|
||||
\emph{not} continue normally if it is not prepared to handle the
|
||||
error. If returning due to an error, it is important to indicate to
|
||||
the caller that an error has been set. If the error is not handled or
|
||||
carefully propagated, additional calls into the Python/C API may not
|
||||
behave as intended and may fail in mysterious ways.
|
||||
|
||||
The error indicator consists of three Python objects corresponding to
|
||||
\withsubitem{(in module sys)}{
|
||||
\ttindex{exc_type}\ttindex{exc_value}\ttindex{exc_traceback}}
|
||||
the Python variables \code{sys.exc_type}, \code{sys.exc_value} and
|
||||
\code{sys.exc_traceback}. API functions exist to interact with the
|
||||
error indicator in various ways. There is a separate error indicator
|
||||
for each thread.
|
||||
|
||||
% XXX Order of these should be more thoughtful.
|
||||
% Either alphabetical or some kind of structure.
|
||||
|
||||
\begin{cfuncdesc}{void}{PyErr_Print}{}
|
||||
Print a standard traceback to \code{sys.stderr} and clear the error
|
||||
indicator. Call this function only when the error indicator is
|
||||
set. (Otherwise it will cause a fatal error!)
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyErr_Occurred}{}
|
||||
Test whether the error indicator is set. If set, return the
|
||||
exception \emph{type} (the first argument to the last call to one of
|
||||
the \cfunction{PyErr_Set*()} functions or to
|
||||
\cfunction{PyErr_Restore()}). If not set, return \NULL. You do
|
||||
not own a reference to the return value, so you do not need to
|
||||
\cfunction{Py_DECREF()} it. \note{Do not compare the return value
|
||||
to a specific exception; use \cfunction{PyErr_ExceptionMatches()}
|
||||
instead, shown below. (The comparison could easily fail since the
|
||||
exception may be an instance instead of a class, in the case of a
|
||||
class exception, or it may be a subclass of the expected
|
||||
exception.)}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyErr_ExceptionMatches}{PyObject *exc}
|
||||
Equivalent to \samp{PyErr_GivenExceptionMatches(PyErr_Occurred(),
|
||||
\var{exc})}. This should only be called when an exception is
|
||||
actually set; a memory access violation will occur if no exception
|
||||
has been raised.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyErr_GivenExceptionMatches}{PyObject *given, PyObject *exc}
|
||||
Return true if the \var{given} exception matches the exception in
|
||||
\var{exc}. If \var{exc} is a class object, this also returns true
|
||||
when \var{given} is an instance of a subclass. If \var{exc} is a
|
||||
tuple, all exceptions in the tuple (and recursively in subtuples)
|
||||
are searched for a match. If \var{given} is \NULL, a memory access
|
||||
violation will occur.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyErr_NormalizeException}{PyObject**exc, PyObject**val, PyObject**tb}
|
||||
Under certain circumstances, the values returned by
|
||||
\cfunction{PyErr_Fetch()} below can be ``unnormalized'', meaning
|
||||
that \code{*\var{exc}} is a class object but \code{*\var{val}} is
|
||||
not an instance of the same class. This function can be used to
|
||||
instantiate the class in that case. If the values are already
|
||||
normalized, nothing happens. The delayed normalization is
|
||||
implemented to improve performance.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyErr_Clear}{}
|
||||
Clear the error indicator. If the error indicator is not set, there
|
||||
is no effect.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyErr_Fetch}{PyObject **ptype, PyObject **pvalue,
|
||||
PyObject **ptraceback}
|
||||
Retrieve the error indicator into three variables whose addresses
|
||||
are passed. If the error indicator is not set, set all three
|
||||
variables to \NULL. If it is set, it will be cleared and you own a
|
||||
reference to each object retrieved. The value and traceback object
|
||||
may be \NULL{} even when the type object is not. \note{This
|
||||
function is normally only used by code that needs to handle
|
||||
exceptions or by code that needs to save and restore the error
|
||||
indicator temporarily.}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyErr_Restore}{PyObject *type, PyObject *value,
|
||||
PyObject *traceback}
|
||||
Set the error indicator from the three objects. If the error
|
||||
indicator is already set, it is cleared first. If the objects are
|
||||
\NULL, the error indicator is cleared. Do not pass a \NULL{} type
|
||||
and non-\NULL{} value or traceback. The exception type should be a
|
||||
class. Do not pass an invalid exception type or value.
|
||||
(Violating these rules will cause subtle problems later.) This call
|
||||
takes away a reference to each object: you must own a reference to
|
||||
each object before the call and after the call you no longer own
|
||||
these references. (If you don't understand this, don't use this
|
||||
function. I warned you.) \note{This function is normally only used
|
||||
by code that needs to save and restore the error indicator
|
||||
temporarily; use \cfunction{PyErr_Fetch()} to save the current
|
||||
exception state.}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyErr_SetString}{PyObject *type, const char *message}
|
||||
This is the most common way to set the error indicator. The first
|
||||
argument specifies the exception type; it is normally one of the
|
||||
standard exceptions, e.g. \cdata{PyExc_RuntimeError}. You need not
|
||||
increment its reference count. The second argument is an error
|
||||
message; it is converted to a string object.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyErr_SetObject}{PyObject *type, PyObject *value}
|
||||
This function is similar to \cfunction{PyErr_SetString()} but lets
|
||||
you specify an arbitrary Python object for the ``value'' of the
|
||||
exception.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyErr_Format}{PyObject *exception,
|
||||
const char *format, \moreargs}
|
||||
This function sets the error indicator and returns \NULL.
|
||||
\var{exception} should be a Python exception (class, not
|
||||
an instance). \var{format} should be a string, containing format
|
||||
codes, similar to \cfunction{printf()}. The \code{width.precision}
|
||||
before a format code is parsed, but the width part is ignored.
|
||||
|
||||
% This should be exactly the same as the table in PyString_FromFormat.
|
||||
% One should just refer to the other.
|
||||
|
||||
% The descriptions for %zd and %zu are wrong, but the truth is complicated
|
||||
% because not all compilers support the %z width modifier -- we fake it
|
||||
% when necessary via interpolating PY_FORMAT_SIZE_T.
|
||||
|
||||
% %u, %lu, %zu should have "new in Python 2.5" blurbs.
|
||||
|
||||
\begin{tableiii}{l|l|l}{member}{Format Characters}{Type}{Comment}
|
||||
\lineiii{\%\%}{\emph{n/a}}{The literal \% character.}
|
||||
\lineiii{\%c}{int}{A single character, represented as an C int.}

|
||||
\lineiii{\%d}{int}{Exactly equivalent to \code{printf("\%d")}.}
|
||||
\lineiii{\%u}{unsigned int}{Exactly equivalent to \code{printf("\%u")}.}
|
||||
\lineiii{\%ld}{long}{Exactly equivalent to \code{printf("\%ld")}.}
|
||||
\lineiii{\%lu}{unsigned long}{Exactly equivalent to \code{printf("\%lu")}.}
|
||||
\lineiii{\%zd}{Py_ssize_t}{Exactly equivalent to \code{printf("\%zd")}.}
|
||||
\lineiii{\%zu}{size_t}{Exactly equivalent to \code{printf("\%zu")}.}
|
||||
\lineiii{\%i}{int}{Exactly equivalent to \code{printf("\%i")}.}
|
||||
\lineiii{\%x}{int}{Exactly equivalent to \code{printf("\%x")}.}
|
||||
\lineiii{\%s}{char*}{A null-terminated C character array.}
|
||||
\lineiii{\%p}{void*}{The hex representation of a C pointer.
|
||||
Mostly equivalent to \code{printf("\%p")} except that it is
|
||||
guaranteed to start with the literal \code{0x} regardless of
|
||||
what the platform's \code{printf} yields.}
|
||||
\end{tableiii}
|
||||
|
||||
An unrecognized format character causes all the rest of the format
|
||||
string to be copied as-is to the result string, and any extra
|
||||
arguments discarded.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyErr_SetNone}{PyObject *type}
|
||||
This is a shorthand for \samp{PyErr_SetObject(\var{type},
|
||||
Py_None)}.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyErr_BadArgument}{}
|
||||
This is a shorthand for \samp{PyErr_SetString(PyExc_TypeError,
|
||||
\var{message})}, where \var{message} indicates that a built-in
|
||||
operation was invoked with an illegal argument. It is mostly for
|
||||
internal use.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyErr_NoMemory}{}
|
||||
This is a shorthand for \samp{PyErr_SetNone(PyExc_MemoryError)}; it
|
||||
returns \NULL{} so an object allocation function can write
|
||||
\samp{return PyErr_NoMemory();} when it runs out of memory.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyErr_SetFromErrno}{PyObject *type}
|
||||
This is a convenience function to raise an exception when a C
|
||||
library function has returned an error and set the C variable
|
||||
\cdata{errno}. It constructs a tuple object whose first item is the
|
||||
integer \cdata{errno} value and whose second item is the
|
||||
corresponding error message (gotten from
|
||||
\cfunction{strerror()}\ttindex{strerror()}), and then calls
|
||||
\samp{PyErr_SetObject(\var{type}, \var{object})}. On \UNIX, when
|
||||
the \cdata{errno} value is \constant{EINTR}, indicating an
|
||||
interrupted system call, this calls
|
||||
\cfunction{PyErr_CheckSignals()}, and if that set the error
|
||||
indicator, leaves it set to that. The function always returns
|
||||
\NULL, so a wrapper function around a system call can write
|
||||
\samp{return PyErr_SetFromErrno(\var{type});} when the system call
|
||||
returns an error.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyErr_SetFromErrnoWithFilename}{PyObject *type,
|
||||
const char *filename}
|
||||
Similar to \cfunction{PyErr_SetFromErrno()}, with the additional
|
||||
behavior that if \var{filename} is not \NULL, it is passed to the
|
||||
constructor of \var{type} as a third parameter. In the case of
|
||||
exceptions such as \exception{IOError} and \exception{OSError}, this
|
||||
is used to define the \member{filename} attribute of the exception
|
||||
instance.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyErr_SetFromWindowsErr}{int ierr}
|
||||
This is a convenience function to raise \exception{WindowsError}.
|
||||
If called with \var{ierr} of \cdata{0}, the error code returned by a
|
||||
call to \cfunction{GetLastError()} is used instead. It calls the
|
||||
Win32 function \cfunction{FormatMessage()} to retrieve the Windows
|
||||
description of the error code given by \var{ierr} or
|
||||
\cfunction{GetLastError()}, then it constructs a tuple object whose
|
||||
first item is the \var{ierr} value and whose second item is the
|
||||
corresponding error message (gotten from
|
||||
\cfunction{FormatMessage()}), and then calls
|
||||
\samp{PyErr_SetObject(\var{PyExc_WindowsError}, \var{object})}.
|
||||
This function always returns \NULL.
|
||||
Availability: Windows.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyErr_SetExcFromWindowsErr}{PyObject *type,
|
||||
int ierr}
|
||||
Similar to \cfunction{PyErr_SetFromWindowsErr()}, with an additional
|
||||
parameter specifying the exception type to be raised.
|
||||
Availability: Windows.
|
||||
\versionadded{2.3}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyErr_SetFromWindowsErrWithFilename}{int ierr,
|
||||
const char *filename}
|
||||
Similar to \cfunction{PyErr_SetFromWindowsErr()}, with the
|
||||
additional behavior that if \var{filename} is not \NULL, it is
|
||||
passed to the constructor of \exception{WindowsError} as a third
|
||||
parameter.
|
||||
Availability: Windows.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyErr_SetExcFromWindowsErrWithFilename}
|
||||
{PyObject *type, int ierr, char *filename}
|
||||
Similar to \cfunction{PyErr_SetFromWindowsErrWithFilename()}, with
|
||||
an additional parameter specifying the exception type to be raised.
|
||||
Availability: Windows.
|
||||
\versionadded{2.3}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyErr_BadInternalCall}{}
|
||||
This is a shorthand for \samp{PyErr_SetString(PyExc_SystemError,
|
||||
\var{message})}, where \var{message} indicates that an internal
|
||||
operation (e.g. a Python/C API function) was invoked with an illegal
|
||||
argument. It is mostly for internal use.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyErr_WarnEx}{PyObject *category, char *message, int stacklevel}
|
||||
Issue a warning message. The \var{category} argument is a warning
|
||||
category (see below) or \NULL; the \var{message} argument is a
|
||||
message string. \var{stacklevel} is a positive number giving a
|
||||
number of stack frames; the warning will be issued from the
|
||||
currently executing line of code in that stack frame. A \var{stacklevel}
|
||||
of 1 is the function calling \cfunction{PyErr_WarnEx()}, 2 is
|
||||
the function above that, and so forth.
|
||||
|
||||
This function normally prints a warning message to \code{sys.stderr};
|
||||
however, it is also possible that the user has specified that
|
||||
warnings are to be turned into errors, and in that case this will
|
||||
raise an exception. It is also possible that the function raises an
|
||||
exception because of a problem with the warning machinery (the
|
||||
implementation imports the \module{warnings} module to do the heavy
|
||||
lifting). The return value is \code{0} if no exception is raised,
|
||||
or \code{-1} if an exception is raised. (It is not possible to
|
||||
determine whether a warning message is actually printed, nor what
|
||||
the reason is for the exception; this is intentional.) If an
|
||||
exception is raised, the caller should do its normal exception
|
||||
handling (for example, \cfunction{Py_DECREF()} owned references and
|
||||
return an error value).
|
||||
|
||||
Warning categories must be subclasses of \cdata{Warning}; the
|
||||
default warning category is \cdata{RuntimeWarning}. The standard
|
||||
Python warning categories are available as global variables whose
|
||||
names are \samp{PyExc_} followed by the Python exception name.
|
||||
These have the type \ctype{PyObject*}; they are all class objects.
|
||||
Their names are \cdata{PyExc_Warning}, \cdata{PyExc_UserWarning},
|
||||
\cdata{PyExc_UnicodeWarning}, \cdata{PyExc_DeprecationWarning},
|
||||
\cdata{PyExc_SyntaxWarning}, \cdata{PyExc_RuntimeWarning}, and
|
||||
\cdata{PyExc_FutureWarning}. \cdata{PyExc_Warning} is a subclass of
|
||||
\cdata{PyExc_Exception}; the other warning categories are subclasses
|
||||
of \cdata{PyExc_Warning}.
|
||||
|
||||
For information about warning control, see the documentation for the
|
||||
\module{warnings} module and the \programopt{-W} option in the
|
||||
command line documentation. There is no C API for warning control.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyErr_Warn}{PyObject *category, char *message}
|
||||
Issue a warning message. The \var{category} argument is a warning
|
||||
category (see below) or \NULL; the \var{message} argument is a
|
||||
message string. The warning will appear to be issued from the function
|
||||
calling \cfunction{PyErr_Warn()}, equivalent to calling
|
||||
\cfunction{PyErr_WarnEx()} with a \var{stacklevel} of 1.
|
||||
|
||||
Deprecated; use \cfunction{PyErr_WarnEx()} instead.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyErr_WarnExplicit}{PyObject *category,
|
||||
const char *message, const char *filename, int lineno,
|
||||
const char *module, PyObject *registry}
|
||||
Issue a warning message with explicit control over all warning
|
||||
attributes. This is a straightforward wrapper around the Python
|
||||
function \function{warnings.warn_explicit()}; see there for more
|
||||
information. The \var{module} and \var{registry} arguments may be
|
||||
set to \NULL{} to get the default effect described there.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyErr_CheckSignals}{}
|
||||
This function interacts with Python's signal handling. It checks
|
||||
whether a signal has been sent to the process and, if so, invokes
|
||||
the corresponding signal handler. If the
|
||||
\module{signal}\refbimodindex{signal} module is supported, this can
|
||||
invoke a signal handler written in Python. In all cases, the
|
||||
default effect for \constant{SIGINT}\ttindex{SIGINT} is to raise the
|
||||
\withsubitem{(built-in exception)}{\ttindex{KeyboardInterrupt}}
|
||||
\exception{KeyboardInterrupt} exception. If an exception is raised
|
||||
the error indicator is set and the function returns \code{-1};
|
||||
otherwise the function returns \code{0}. The error indicator may or
|
||||
may not be cleared if it was previously set.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyErr_SetInterrupt}{}
|
||||
This function simulates the effect of a
|
||||
\constant{SIGINT}\ttindex{SIGINT} signal arriving --- the next time
|
||||
\cfunction{PyErr_CheckSignals()} is called,
|
||||
\withsubitem{(built-in exception)}{\ttindex{KeyboardInterrupt}}
|
||||
\exception{KeyboardInterrupt} will be raised. It may be called
|
||||
without holding the interpreter lock.
|
||||
% XXX This was described as obsolete, but is used in
|
||||
% thread.interrupt_main() (used from IDLE), so it's still needed.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyErr_NewException}{char *name,
|
||||
PyObject *base,
|
||||
PyObject *dict}
|
||||
This utility function creates and returns a new exception object.
|
||||
The \var{name} argument must be the name of the new exception, a C
|
||||
string of the form \code{module.class}. The \var{base} and
|
||||
\var{dict} arguments are normally \NULL. This creates a class
|
||||
object derived from \exception{Exception} (accessible in C as
|
||||
\cdata{PyExc_Exception}).
|
||||
|
||||
The \member{__module__} attribute of the new class is set to the
|
||||
first part (up to the last dot) of the \var{name} argument, and the
|
||||
class name is set to the last part (after the last dot). The
|
||||
\var{base} argument can be used to specify alternate base classes;
|
||||
it can either be only one class or a tuple of classes.
|
||||
The \var{dict} argument can be used to specify a dictionary of class
|
||||
variables and methods.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyErr_WriteUnraisable}{PyObject *obj}
|
||||
This utility function prints a warning message to \code{sys.stderr}
|
||||
when an exception has been set but it is impossible for the
|
||||
interpreter to actually raise the exception. It is used, for
|
||||
example, when an exception occurs in an \method{__del__()} method.
|
||||
|
||||
The function is called with a single argument \var{obj} that
|
||||
identifies the context in which the unraisable exception occurred.
|
||||
The repr of \var{obj} will be printed in the warning message.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\section{Standard Exceptions \label{standardExceptions}}
|
||||
|
||||
All standard Python exceptions are available as global variables whose
|
||||
names are \samp{PyExc_} followed by the Python exception name. These
|
||||
have the type \ctype{PyObject*}; they are all class objects. For
|
||||
completeness, here are all the variables:
|
||||
|
||||
\begin{tableiii}{l|l|c}{cdata}{C Name}{Python Name}{Notes}
|
||||
\lineiii{PyExc_BaseException\ttindex{PyExc_BaseException}}{\exception{BaseException}}{(1), (4)}
|
||||
\lineiii{PyExc_Exception\ttindex{PyExc_Exception}}{\exception{Exception}}{(1)}
|
||||
\lineiii{PyExc_StandardError\ttindex{PyExc_StandardError}}{\exception{StandardError}}{(1)}
|
||||
\lineiii{PyExc_ArithmeticError\ttindex{PyExc_ArithmeticError}}{\exception{ArithmeticError}}{(1)}
|
||||
\lineiii{PyExc_LookupError\ttindex{PyExc_LookupError}}{\exception{LookupError}}{(1)}
|
||||
\lineiii{PyExc_AssertionError\ttindex{PyExc_AssertionError}}{\exception{AssertionError}}{}
|
||||
\lineiii{PyExc_AttributeError\ttindex{PyExc_AttributeError}}{\exception{AttributeError}}{}
|
||||
\lineiii{PyExc_EOFError\ttindex{PyExc_EOFError}}{\exception{EOFError}}{}
|
||||
\lineiii{PyExc_EnvironmentError\ttindex{PyExc_EnvironmentError}}{\exception{EnvironmentError}}{(1)}
|
||||
\lineiii{PyExc_FloatingPointError\ttindex{PyExc_FloatingPointError}}{\exception{FloatingPointError}}{}
|
||||
\lineiii{PyExc_IOError\ttindex{PyExc_IOError}}{\exception{IOError}}{}
|
||||
\lineiii{PyExc_ImportError\ttindex{PyExc_ImportError}}{\exception{ImportError}}{}
|
||||
\lineiii{PyExc_IndexError\ttindex{PyExc_IndexError}}{\exception{IndexError}}{}
|
||||
\lineiii{PyExc_KeyError\ttindex{PyExc_KeyError}}{\exception{KeyError}}{}
|
||||
\lineiii{PyExc_KeyboardInterrupt\ttindex{PyExc_KeyboardInterrupt}}{\exception{KeyboardInterrupt}}{}
|
||||
\lineiii{PyExc_MemoryError\ttindex{PyExc_MemoryError}}{\exception{MemoryError}}{}
|
||||
\lineiii{PyExc_NameError\ttindex{PyExc_NameError}}{\exception{NameError}}{}
|
||||
\lineiii{PyExc_NotImplementedError\ttindex{PyExc_NotImplementedError}}{\exception{NotImplementedError}}{}
|
||||
\lineiii{PyExc_OSError\ttindex{PyExc_OSError}}{\exception{OSError}}{}
|
||||
\lineiii{PyExc_OverflowError\ttindex{PyExc_OverflowError}}{\exception{OverflowError}}{}
|
||||
\lineiii{PyExc_ReferenceError\ttindex{PyExc_ReferenceError}}{\exception{ReferenceError}}{(2)}
|
||||
\lineiii{PyExc_RuntimeError\ttindex{PyExc_RuntimeError}}{\exception{RuntimeError}}{}
|
||||
\lineiii{PyExc_SyntaxError\ttindex{PyExc_SyntaxError}}{\exception{SyntaxError}}{}
|
||||
\lineiii{PyExc_SystemError\ttindex{PyExc_SystemError}}{\exception{SystemError}}{}
|
||||
\lineiii{PyExc_SystemExit\ttindex{PyExc_SystemExit}}{\exception{SystemExit}}{}
|
||||
\lineiii{PyExc_TypeError\ttindex{PyExc_TypeError}}{\exception{TypeError}}{}
|
||||
\lineiii{PyExc_ValueError\ttindex{PyExc_ValueError}}{\exception{ValueError}}{}
|
||||
\lineiii{PyExc_WindowsError\ttindex{PyExc_WindowsError}}{\exception{WindowsError}}{(3)}
|
||||
\lineiii{PyExc_ZeroDivisionError\ttindex{PyExc_ZeroDivisionError}}{\exception{ZeroDivisionError}}{}
|
||||
\end{tableiii}
|
||||
|
||||
\noindent
|
||||
Notes:
|
||||
\begin{description}
|
||||
\item[(1)]
|
||||
This is a base class for other standard exceptions.
|
||||
|
||||
\item[(2)]
|
||||
This is the same as \exception{weakref.ReferenceError}.
|
||||
|
||||
\item[(3)]
|
||||
Only defined on Windows; protect code that uses this by testing that
|
||||
the preprocessor macro \code{MS_WINDOWS} is defined.
|
||||
|
||||
\item[(4)]
|
||||
\versionadded{2.5}
|
||||
\end{description}
|
||||
|
||||
|
||||
\section{Deprecation of String Exceptions}
|
||||
|
||||
All exceptions built into Python or provided in the standard library
|
||||
are derived from \exception{BaseException}.
|
||||
\withsubitem{(built-in exception)}{\ttindex{BaseException}}
|
||||
|
||||
String exceptions are still supported in the interpreter to allow
|
||||
existing code to run unmodified, but this will also change in a future
|
||||
release.
|
884
Doc/api/init.tex
|
@ -1,884 +0,0 @@
|
|||
\chapter{Initialization, Finalization, and Threads
|
||||
\label{initialization}}
|
||||
|
||||
\begin{cfuncdesc}{void}{Py_Initialize}{}
|
||||
Initialize the Python interpreter. In an application embedding
|
||||
Python, this should be called before using any other Python/C API
|
||||
functions; with the exception of
|
||||
\cfunction{Py_SetProgramName()}\ttindex{Py_SetProgramName()},
|
||||
\cfunction{PyEval_InitThreads()}\ttindex{PyEval_InitThreads()},
|
||||
\cfunction{PyEval_ReleaseLock()}\ttindex{PyEval_ReleaseLock()},
|
||||
and \cfunction{PyEval_AcquireLock()}\ttindex{PyEval_AcquireLock()}.
|
||||
This initializes the table of loaded modules (\code{sys.modules}),
|
||||
and\withsubitem{(in module sys)}{\ttindex{modules}\ttindex{path}}
|
||||
creates the fundamental modules
|
||||
\module{__builtin__}\refbimodindex{__builtin__},
|
||||
\module{__main__}\refbimodindex{__main__} and
|
||||
\module{sys}\refbimodindex{sys}. It also initializes the module
|
||||
search\indexiii{module}{search}{path} path (\code{sys.path}).
|
||||
It does not set \code{sys.argv}; use
|
||||
\cfunction{PySys_SetArgv()}\ttindex{PySys_SetArgv()} for that. This
|
||||
is a no-op when called for a second time (without calling
|
||||
\cfunction{Py_Finalize()}\ttindex{Py_Finalize()} first). There is
|
||||
no return value; it is a fatal error if the initialization fails.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{Py_InitializeEx}{int initsigs}
|
||||
This function works like \cfunction{Py_Initialize()} if
|
||||
\var{initsigs} is 1. If \var{initsigs} is 0, it skips
|
||||
the registration of signal handlers, which
|
||||
might be useful when Python is embedded. \versionadded{2.4}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{Py_IsInitialized}{}
|
||||
Return true (nonzero) when the Python interpreter has been
|
||||
initialized, false (zero) if not. After \cfunction{Py_Finalize()}
|
||||
is called, this returns false until \cfunction{Py_Initialize()} is
|
||||
called again.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{Py_Finalize}{}
|
||||
Undo all initializations made by \cfunction{Py_Initialize()} and
|
||||
subsequent use of Python/C API functions, and destroy all
|
||||
sub-interpreters (see \cfunction{Py_NewInterpreter()} below) that
|
||||
were created and not yet destroyed since the last call to
|
||||
\cfunction{Py_Initialize()}. Ideally, this frees all memory
|
||||
allocated by the Python interpreter. This is a no-op when called
|
||||
for a second time (without calling \cfunction{Py_Initialize()} again
|
||||
first). There is no return value; errors during finalization are
|
||||
ignored.
|
||||
|
||||
This function is provided for a number of reasons. An embedding
|
||||
application might want to restart Python without having to restart
|
||||
the application itself. An application that has loaded the Python
|
||||
interpreter from a dynamically loadable library (or DLL) might want
|
||||
to free all memory allocated by Python before unloading the
|
||||
DLL. During a hunt for memory leaks in an application a developer
|
||||
might want to free all memory allocated by Python before exiting
|
||||
from the application.
|
||||
|
||||
\strong{Bugs and caveats:} The destruction of modules and objects in
|
||||
modules is done in random order; this may cause destructors
|
||||
(\method{__del__()} methods) to fail when they depend on other
|
||||
objects (even functions) or modules. Dynamically loaded extension
|
||||
modules loaded by Python are not unloaded. Small amounts of memory
|
||||
allocated by the Python interpreter may not be freed (if you find a
|
||||
leak, please report it). Memory tied up in circular references
|
||||
between objects is not freed. Some memory allocated by extension
|
||||
modules may not be freed. Some extensions may not work properly if
|
||||
their initialization routine is called more than once; this can
|
||||
happen if an application calls \cfunction{Py_Initialize()} and
|
||||
\cfunction{Py_Finalize()} more than once.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyThreadState*}{Py_NewInterpreter}{}
|
||||
Create a new sub-interpreter. This is an (almost) totally separate
|
||||
environment for the execution of Python code. In particular, the
|
||||
new interpreter has separate, independent versions of all imported
|
||||
modules, including the fundamental modules
|
||||
\module{__builtin__}\refbimodindex{__builtin__},
|
||||
\module{__main__}\refbimodindex{__main__} and
|
||||
\module{sys}\refbimodindex{sys}. The table of loaded modules
|
||||
(\code{sys.modules}) and the module search path (\code{sys.path})
|
||||
are also separate. The new environment has no \code{sys.argv}
|
||||
variable. It has new standard I/O stream file objects
|
||||
\code{sys.stdin}, \code{sys.stdout} and \code{sys.stderr} (however
|
||||
these refer to the same underlying \ctype{FILE} structures in the C
|
||||
library).
|
||||
\withsubitem{(in module sys)}{
|
||||
\ttindex{stdout}\ttindex{stderr}\ttindex{stdin}}
|
||||
|
||||
The return value points to the first thread state created in the new
|
||||
sub-interpreter. This thread state is made the current thread
|
||||
state. Note that no actual thread is created; see the discussion of
|
||||
thread states below. If creation of the new interpreter is
|
||||
unsuccessful, \NULL{} is returned; no exception is set since the
|
||||
exception state is stored in the current thread state and there may
|
||||
not be a current thread state. (Like all other Python/C API
|
||||
functions, the global interpreter lock must be held before calling
|
||||
this function and is still held when it returns; however, unlike
|
||||
most other Python/C API functions, there needn't be a current thread
|
||||
state on entry.)
|
||||
|
||||
Extension modules are shared between (sub-)interpreters as follows:
|
||||
the first time a particular extension is imported, it is initialized
|
||||
normally, and a (shallow) copy of its module's dictionary is
|
||||
squirreled away. When the same extension is imported by another
|
||||
(sub-)interpreter, a new module is initialized and filled with the
|
||||
contents of this copy; the extension's \code{init} function is not
|
||||
called. Note that this is different from what happens when an
|
||||
extension is imported after the interpreter has been completely
|
||||
re-initialized by calling
|
||||
\cfunction{Py_Finalize()}\ttindex{Py_Finalize()} and
|
||||
\cfunction{Py_Initialize()}\ttindex{Py_Initialize()}; in that case,
|
||||
the extension's \code{init\var{module}} function \emph{is} called
|
||||
again.
|
||||
|
||||
\strong{Bugs and caveats:} Because sub-interpreters (and the main
|
||||
interpreter) are part of the same process, the insulation between
|
||||
them isn't perfect --- for example, using low-level file operations
|
||||
like \withsubitem{(in module os)}{\ttindex{close()}}
|
||||
\function{os.close()} they can (accidentally or maliciously) affect
|
||||
each other's open files. Because of the way extensions are shared
|
||||
between (sub-)interpreters, some extensions may not work properly;
|
||||
this is especially likely when the extension makes use of (static)
|
||||
global variables, or when the extension manipulates its module's
|
||||
dictionary after its initialization. It is possible to insert
|
||||
objects created in one sub-interpreter into a namespace of another
|
||||
sub-interpreter; this should be done with great care to avoid
|
||||
sharing user-defined functions, methods, instances or classes
|
||||
between sub-interpreters, since import operations executed by such
|
||||
objects may affect the wrong (sub-)interpreter's dictionary of
|
||||
loaded modules. (XXX This is a hard-to-fix bug that will be
|
||||
addressed in a future release.)
|
||||
|
||||
Also note that the use of this functionality is incompatible with
|
||||
extension modules such as PyObjC and ctypes that use the
|
||||
\cfunction{PyGILState_*} APIs (and this is inherent in the way the
|
||||
\cfunction{PyGILState_*} functions work). Simple things may work,
|
||||
but confusing behavior will always be near.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{Py_EndInterpreter}{PyThreadState *tstate}
|
||||
Destroy the (sub-)interpreter represented by the given thread state.
|
||||
The given thread state must be the current thread state. See the
|
||||
discussion of thread states below. When the call returns, the
|
||||
current thread state is \NULL. All thread states associated with
|
||||
this interpreter are destroyed. (The global interpreter lock must
|
||||
be held before calling this function and is still held when it
|
||||
returns.) \cfunction{Py_Finalize()}\ttindex{Py_Finalize()} will
|
||||
destroy all sub-interpreters that haven't been explicitly destroyed
|
||||
at that point.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{Py_SetProgramName}{char *name}
|
||||
This function should be called before
|
||||
\cfunction{Py_Initialize()}\ttindex{Py_Initialize()} is called
|
||||
for the first time, if it is called at all. It tells the
|
||||
interpreter the value of the \code{argv[0]} argument to the
|
||||
\cfunction{main()}\ttindex{main()} function of the program. This is
|
||||
used by \cfunction{Py_GetPath()}\ttindex{Py_GetPath()} and some
|
||||
other functions below to find the Python run-time libraries relative
|
||||
to the interpreter executable. The default value is
|
||||
\code{'python'}. The argument should point to a zero-terminated
|
||||
character string in static storage whose contents will not change
|
||||
for the duration of the program's execution. No code in the Python
|
||||
interpreter will change the contents of this storage.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{char*}{Py_GetProgramName}{}
|
||||
Return the program name set with
|
||||
\cfunction{Py_SetProgramName()}\ttindex{Py_SetProgramName()}, or the
|
||||
default. The returned string points into static storage; the caller
|
||||
should not modify its value.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{char*}{Py_GetPrefix}{}
|
||||
Return the \emph{prefix} for installed platform-independent files.
|
||||
This is derived through a number of complicated rules from the
|
||||
program name set with \cfunction{Py_SetProgramName()} and some
|
||||
environment variables; for example, if the program name is
|
||||
\code{'/usr/local/bin/python'}, the prefix is \code{'/usr/local'}.
|
||||
The returned string points into static storage; the caller should
|
||||
not modify its value. This corresponds to the \makevar{prefix}
|
||||
variable in the top-level \file{Makefile} and the
|
||||
\longprogramopt{prefix} argument to the \program{configure} script
|
||||
at build time. The value is available to Python code as
|
||||
\code{sys.prefix}. It is only useful on \UNIX{}. See also the next
|
||||
function.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{char*}{Py_GetExecPrefix}{}
|
||||
Return the \emph{exec-prefix} for installed
|
||||
platform-\emph{de}pendent files. This is derived through a number
|
||||
of complicated rules from the program name set with
|
||||
\cfunction{Py_SetProgramName()} and some environment variables; for
|
||||
example, if the program name is \code{'/usr/local/bin/python'}, the
|
||||
exec-prefix is \code{'/usr/local'}. The returned string points into
|
||||
static storage; the caller should not modify its value. This
|
||||
corresponds to the \makevar{exec_prefix} variable in the top-level
|
||||
\file{Makefile} and the \longprogramopt{exec-prefix} argument to the
|
||||
\program{configure} script at build time. The value is available
|
||||
to Python code as \code{sys.exec_prefix}. It is only useful on
|
||||
\UNIX.
|
||||
|
||||
Background: The exec-prefix differs from the prefix when platform
|
||||
dependent files (such as executables and shared libraries) are
|
||||
installed in a different directory tree. In a typical installation,
|
||||
platform dependent files may be installed in the
|
||||
\file{/usr/local/plat} subtree while platform independent files may be
|
||||
installed in \file{/usr/local}.
|
||||
|
||||
Generally speaking, a platform is a combination of hardware and
|
||||
software families, e.g. Sparc machines running the Solaris 2.x
|
||||
operating system are considered the same platform, but Intel
|
||||
machines running Solaris 2.x are another platform, and Intel
|
||||
machines running Linux are yet another platform. Different major
|
||||
revisions of the same operating system generally also form different
|
||||
platforms. Non-\UNIX{} operating systems are a different story; the
|
||||
installation strategies on those systems are so different that the
|
||||
prefix and exec-prefix are meaningless, and set to the empty string.
|
||||
Note that compiled Python bytecode files are platform independent
|
||||
(but not independent from the Python version by which they were
|
||||
compiled!).
|
||||
|
||||
System administrators will know how to configure the \program{mount}
|
||||
or \program{automount} programs to share \file{/usr/local} between
|
||||
platforms while having \file{/usr/local/plat} be a different
|
||||
filesystem for each platform.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{char*}{Py_GetProgramFullPath}{}
|
||||
Return the full program name of the Python executable; this is
|
||||
computed as a side-effect of deriving the default module search path
|
||||
from the program name (set by
|
||||
\cfunction{Py_SetProgramName()}\ttindex{Py_SetProgramName()} above).
|
||||
The returned string points into static storage; the caller should
|
||||
not modify its value. The value is available to Python code as
|
||||
\code{sys.executable}.
|
||||
\withsubitem{(in module sys)}{\ttindex{executable}}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{char*}{Py_GetPath}{}
|
||||
\indexiii{module}{search}{path}
|
||||
Return the default module search path; this is computed from the
|
||||
program name (set by \cfunction{Py_SetProgramName()} above) and some
|
||||
environment variables. The returned string consists of a series of
|
||||
directory names separated by a platform dependent delimiter
|
||||
character. The delimiter character is \character{:} on \UNIX{} and Mac OS X,
|
||||
and \character{;} on Windows. The returned string points into
|
||||
static storage; the caller should not modify its value. The value
|
||||
is available to Python code as the list
|
||||
\code{sys.path}\withsubitem{(in module sys)}{\ttindex{path}}, which
|
||||
may be modified to change the future search path for loaded
|
||||
modules.
|
||||
|
||||
% XXX should give the exact rules
|
||||
\end{cfuncdesc}
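
Because the string returned by \cfunction{Py_GetPath()} must not be modified, a caller that wants the individual directories has to scan it read-only or work on a copy. The sketch below shows one way to do that. Both helper names are hypothetical, and the delimiter is passed in explicitly rather than detected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the entries of a delimiter-separated search
 * path without writing to it. An empty string has zero entries. */
size_t count_path_entries(const char *path, char delim)
{
    assert(path != NULL);
    if (*path == '\0')
        return 0;
    size_t n = 1;
    for (const char *p = path; *p != '\0'; p++) {
        if (*p == delim)
            n++;
    }
    return n;
}

/* Hypothetical helper: copy entry number `index` (0-based) into `out`.
 * Returns 0 on success, -1 if the index is out of range or `out` is too
 * small. The input string is only read, never modified. */
int copy_path_entry(const char *path, char delim, size_t index,
                    char *out, size_t outsize)
{
    assert(path != NULL && out != NULL);
    const char *start = path;
    for (;;) {
        const char *end = strchr(start, delim);
        size_t len = end ? (size_t)(end - start) : strlen(start);
        if (index == 0) {
            if (len >= outsize)
                return -1;
            memcpy(out, start, len);
            out[len] = '\0';
            return 0;
        }
        if (end == NULL)
            return -1;                  /* ran out of entries */
        index--;
        start = end + 1;
    }
}
```

On \UNIX{} the delimiter argument would be \code{':'}; on Windows it would be \code{';'}.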
|
||||
|
||||
\begin{cfuncdesc}{const char*}{Py_GetVersion}{}
|
||||
Return the version of this Python interpreter. This is a string
|
||||
that looks something like
|
||||
|
||||
\begin{verbatim}
|
||||
"1.5 (#67, Dec 31 1997, 22:34:28) [GCC 2.7.2.2]"
|
||||
\end{verbatim}
|
||||
|
||||
The first word (up to the first space character) is the current
|
||||
Python version; the first three characters are the major and minor
|
||||
version separated by a period. The returned string points into
|
||||
static storage; the caller should not modify its value. The value
|
||||
is available to Python code as \code{sys.version}.
|
||||
\withsubitem{(in module sys)}{\ttindex{version}}
|
||||
\end{cfuncdesc}
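
The major and minor version can be read from the start of such a string with standard C alone. The following is a minimal sketch; \cfunction{parse_major_minor()} is a hypothetical helper, and it assumes the string begins with two decimal numbers separated by a period, as in the example for \cfunction{Py_GetVersion()}.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: extract the leading "major.minor" pair from a
 * version string shaped like the one Py_GetVersion() is documented to
 * return. Returns 0 on success, -1 if the string does not start with
 * two numbers separated by a period. */
int parse_major_minor(const char *version, int *major, int *minor)
{
    assert(version != NULL && major != NULL && minor != NULL);
    return sscanf(version, "%d.%d", major, minor) == 2 ? 0 : -1;
}
```

Parsing the numbers this way avoids relying on each component being a single character.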
|
||||
|
||||
\begin{cfuncdesc}{const char*}{Py_GetBuildNumber}{}
|
||||
Return a string representing the Subversion revision that this Python
|
||||
executable was built from. This number is a string because it may contain a
|
||||
trailing 'M' if Python was built from a mixed revision source tree.
|
||||
\versionadded{2.5}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{const char*}{Py_GetPlatform}{}
|
||||
Return the platform identifier for the current platform. On \UNIX,
|
||||
this is formed from the ``official'' name of the operating system,
|
||||
converted to lower case, followed by the major revision number;
|
||||
e.g., for Solaris 2.x, which is also known as SunOS 5.x, the value
|
||||
is \code{'sunos5'}. On Mac OS X, it is \code{'darwin'}. On Windows,
|
||||
it is \code{'win32'}. The returned string points into static storage;
|
||||
the caller should not modify its value. The value is available to
|
||||
Python code as \code{sys.platform}.
|
||||
\withsubitem{(in module sys)}{\ttindex{platform}}
|
||||
\end{cfuncdesc}
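
The \UNIX{} naming rule described for \cfunction{Py_GetPlatform()} can be sketched as follows. In the real build the identifier is fixed when Python is configured; this hypothetical helper, \cfunction{sketch_platform()}, only mimics the rule, taking the operating system name and release string as arguments (values such as those reported by \program{uname}).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: lower-case the OS name and append the leading
 * digits of the release string (the major revision number).
 * Returns 0 on success, -1 if `out` is too small. */
int sketch_platform(const char *sysname, const char *release,
                    char *out, size_t outsize)
{
    assert(sysname != NULL && release != NULL && out != NULL);
    size_t name_len = strlen(sysname);
    size_t digits = 0;
    while (isdigit((unsigned char)release[digits]))
        digits++;                       /* length of the major revision */
    if (name_len + digits + 1 > outsize)
        return -1;
    for (size_t i = 0; i < name_len; i++)
        out[i] = (char)tolower((unsigned char)sysname[i]);
    memcpy(out + name_len, release, digits);
    out[name_len + digits] = '\0';
    return 0;
}
```

With \code{"SunOS"} and \code{"5.8"} as input, the result is \code{"sunos5"}, matching the Solaris example above.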
|
||||
|
||||
\begin{cfuncdesc}{const char*}{Py_GetCopyright}{}
|
||||
Return the official copyright string for the current Python version,
|
||||
for example
|
||||
|
||||
\code{'Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam'}
|
||||
|
||||
The returned string points into static storage; the caller should
|
||||
not modify its value. The value is available to Python code as
|
||||
\code{sys.copyright}.
|
||||
\withsubitem{(in module sys)}{\ttindex{copyright}}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{const char*}{Py_GetCompiler}{}
|
||||
Return an indication of the compiler used to build the current
|
||||
Python version, in square brackets, for example:
|
||||
|
||||
\begin{verbatim}
|
||||
"[GCC 2.7.2.2]"
|
||||
\end{verbatim}
|
||||
|
||||
The returned string points into static storage; the caller should
|
||||
not modify its value. The value is available to Python code as part
|
||||
of the variable \code{sys.version}.
|
||||
\withsubitem{(in module sys)}{\ttindex{version}}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{const char*}{Py_GetBuildInfo}{}
|
||||
Return information about the sequence number and build date and time
|
||||
of the current Python interpreter instance, for example
|
||||
|
||||
\begin{verbatim}
|
||||
"#67, Aug 1 1997, 22:34:28"
|
||||
\end{verbatim}
|
||||
|
||||
The returned string points into static storage; the caller should
|
||||
not modify its value. The value is available to Python code as part
|
||||
of the variable \code{sys.version}.
|
||||
\withsubitem{(in module sys)}{\ttindex{version}}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PySys_SetArgv}{int argc, char **argv}
|
||||
Set \code{sys.argv} based on \var{argc} and \var{argv}. These
|
||||
parameters are similar to those passed to the program's
|
||||
\cfunction{main()}\ttindex{main()} function with the difference that
|
||||
the first entry should refer to the script file to be executed
|
||||
rather than the executable hosting the Python interpreter. If there
|
||||
isn't a script that will be run, the first entry in \var{argv} can
|
||||
be an empty string. If this function fails to initialize
|
||||
\code{sys.argv}, a fatal condition is signalled using
|
||||
\cfunction{Py_FatalError()}\ttindex{Py_FatalError()}.
|
||||
\withsubitem{(in module sys)}{\ttindex{argv}}
|
||||
% XXX impl. doesn't seem consistent in allowing 0/NULL for the params;
|
||||
% check w/ Guido.
|
||||
\end{cfuncdesc}
|
||||
|
||||
% XXX Other PySys thingies (doesn't really belong in this chapter)
|
||||
|
||||
\section{Thread State and the Global Interpreter Lock
|
||||
\label{threads}}
|
||||
|
||||
\index{global interpreter lock}
|
||||
\index{interpreter lock}
|
||||
\index{lock, interpreter}
|
||||
|
||||
The Python interpreter is not fully thread safe. In order to support
|
||||
multi-threaded Python programs, there's a global lock that must be
|
||||
held by the current thread before it can safely access Python objects.
|
||||
Without the lock, even the simplest operations could cause problems in
|
||||
a multi-threaded program: for example, when two threads simultaneously
|
||||
increment the reference count of the same object, the reference count
|
||||
could end up being incremented only once instead of twice.
|
||||
|
||||
Therefore, the rule exists that only the thread that has acquired the
|
||||
global interpreter lock may operate on Python objects or call Python/C
|
||||
API functions. In order to support multi-threaded Python programs,
|
||||
the interpreter regularly releases and reacquires the lock --- by
|
||||
default, every 100 bytecode instructions (this can be changed with
|
||||
\withsubitem{(in module sys)}{\ttindex{setcheckinterval()}}
|
||||
\function{sys.setcheckinterval()}). The lock is also released and
|
||||
reacquired around potentially blocking I/O operations like reading or
|
||||
writing a file, so that other threads can run while the thread that
|
||||
requests the I/O is waiting for the I/O operation to complete.
|
||||
|
||||
The Python interpreter needs to keep some bookkeeping information
|
||||
separate per thread --- for this it uses a data structure called
|
||||
\ctype{PyThreadState}\ttindex{PyThreadState}. There's one global
|
||||
variable, however: the pointer to the current
|
||||
\ctype{PyThreadState}\ttindex{PyThreadState} structure. While most
|
||||
thread packages have a way to store ``per-thread global data,''
|
||||
Python's internal platform independent thread abstraction doesn't
|
||||
support this yet. Therefore, the current thread state must be
|
||||
manipulated explicitly.
|
||||
|
||||
This is easy enough in most cases. Most code manipulating the global
|
||||
interpreter lock has the following simple structure:
|
||||
|
||||
\begin{verbatim}
|
||||
Save the thread state in a local variable.
|
||||
Release the interpreter lock.
|
||||
...Do some blocking I/O operation...
|
||||
Reacquire the interpreter lock.
|
||||
Restore the thread state from the local variable.
|
||||
\end{verbatim}
|
||||
|
||||
This is so common that a pair of macros exists to simplify it:
|
||||
|
||||
\begin{verbatim}
|
||||
Py_BEGIN_ALLOW_THREADS
|
||||
...Do some blocking I/O operation...
|
||||
Py_END_ALLOW_THREADS
|
||||
\end{verbatim}
|
||||
|
||||
The
|
||||
\csimplemacro{Py_BEGIN_ALLOW_THREADS}\ttindex{Py_BEGIN_ALLOW_THREADS}
|
||||
macro opens a new block and declares a hidden local variable; the
|
||||
\csimplemacro{Py_END_ALLOW_THREADS}\ttindex{Py_END_ALLOW_THREADS}
|
||||
macro closes the block. Another advantage of using these two macros
|
||||
is that when Python is compiled without thread support, they are
|
||||
defined empty, thus saving the thread state and lock manipulations.
|
||||
|
||||
When thread support is enabled, the block above expands to the
|
||||
following code:
|
||||
|
||||
\begin{verbatim}
|
||||
PyThreadState *_save;
|
||||
|
||||
_save = PyEval_SaveThread();
|
||||
...Do some blocking I/O operation...
|
||||
PyEval_RestoreThread(_save);
|
||||
\end{verbatim}
|
||||
|
||||
Using even lower level primitives, we can get roughly the same effect
|
||||
as follows:
|
||||
|
||||
\begin{verbatim}
|
||||
PyThreadState *_save;
|
||||
|
||||
_save = PyThreadState_Swap(NULL);
|
||||
PyEval_ReleaseLock();
|
||||
...Do some blocking I/O operation...
|
||||
PyEval_AcquireLock();
|
||||
PyThreadState_Swap(_save);
|
||||
\end{verbatim}
|
||||
|
||||
There are some subtle differences; in particular,
|
||||
\cfunction{PyEval_RestoreThread()}\ttindex{PyEval_RestoreThread()} saves
|
||||
and restores the value of the global variable
|
||||
\cdata{errno}\ttindex{errno}, since the lock manipulation does not
|
||||
guarantee that \cdata{errno} is left alone. Also, when thread support
|
||||
is disabled,
|
||||
\cfunction{PyEval_SaveThread()}\ttindex{PyEval_SaveThread()} and
|
||||
\cfunction{PyEval_RestoreThread()} don't manipulate the lock; in this
|
||||
case, \cfunction{PyEval_ReleaseLock()}\ttindex{PyEval_ReleaseLock()} and
|
||||
\cfunction{PyEval_AcquireLock()}\ttindex{PyEval_AcquireLock()} are not
|
||||
available. This is done so that dynamically loaded extensions
|
||||
compiled with thread support enabled can be loaded by an interpreter
|
||||
that was compiled with disabled thread support.
|
||||
|
||||
The global interpreter lock is used to protect the pointer to the
|
||||
current thread state. When releasing the lock and saving the thread
|
||||
state, the current thread state pointer must be retrieved before the
|
||||
lock is released (since another thread could immediately acquire the
|
||||
lock and store its own thread state in the global variable).
|
||||
Conversely, when acquiring the lock and restoring the thread state,
|
||||
the lock must be acquired before storing the thread state pointer.
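
The ordering rule can be made concrete with a single-threaded toy model. The sketch below is not how the interpreter is implemented: the lock is a plain flag, and \code{toy_tstate}, \code{toy_save_thread()} and \code{toy_restore_thread()} are hypothetical stand-ins for \ctype{PyThreadState}, \cfunction{PyEval_SaveThread()} and \cfunction{PyEval_RestoreThread()}.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int id; } toy_tstate;    /* stand-in for PyThreadState */

static int toy_lock_held = 0;             /* stand-in for the interpreter lock */
static toy_tstate *toy_current = NULL;    /* stand-in for the current-state pointer */

/* Save: read and clear the pointer while the lock is still held, and
 * only then release the lock. */
toy_tstate *toy_save_thread(void)
{
    assert(toy_lock_held);
    toy_tstate *saved = toy_current;      /* 1. retrieve the pointer */
    toy_current = NULL;
    toy_lock_held = 0;                    /* 2. release the lock */
    return saved;
}

/* Restore: acquire the lock first, and only then store the pointer. */
void toy_restore_thread(toy_tstate *ts)
{
    assert(ts != NULL && !toy_lock_held);
    toy_lock_held = 1;                    /* 1. acquire the lock */
    toy_current = ts;                     /* 2. store the pointer */
}
```

In both functions the shared pointer is touched only while the lock is held, which is the invariant the paragraph above describes.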
|
||||
|
||||
These details matter because when
|
||||
threads are created from C, they don't have the global interpreter
|
||||
lock, nor is there a thread state data structure for them. Such
|
||||
threads must bootstrap themselves into existence, by first creating a
|
||||
thread state data structure, then acquiring the lock, and finally
|
||||
storing their thread state pointer, before they can start using the
|
||||
Python/C API. When they are done, they should reset the thread state
|
||||
pointer, release the lock, and finally free their thread state data
|
||||
structure.
|
||||
|
||||
Beginning with version 2.3, threads can now take advantage of the
|
||||
\cfunction{PyGILState_*()} functions to do all of the above
|
||||
automatically. The typical idiom for calling into Python from a C
|
||||
thread is now:
|
||||
|
||||
\begin{verbatim}
|
||||
PyGILState_STATE gstate;
|
||||
gstate = PyGILState_Ensure();
|
||||
|
||||
/* Perform Python actions here. */
|
||||
result = CallSomeFunction();
|
||||
/* evaluate result */
|
||||
|
||||
/* Release the thread. No Python API allowed beyond this point. */
|
||||
PyGILState_Release(gstate);
|
||||
\end{verbatim}
|
||||
|
||||
Note that the \cfunction{PyGILState_*()} functions assume there is
|
||||
only one global interpreter (created automatically by
|
||||
\cfunction{Py_Initialize()}). Python still supports the creation of
|
||||
additional interpreters (using \cfunction{Py_NewInterpreter()}), but
|
||||
mixing multiple interpreters and the \cfunction{PyGILState_*()} API is
|
||||
unsupported.
|
||||
|
||||
\begin{ctypedesc}{PyInterpreterState}
|
||||
This data structure represents the state shared by a number of
|
||||
cooperating threads. Threads belonging to the same interpreter
|
||||
share their module administration and a few other internal items.
|
||||
There are no public members in this structure.
|
||||
|
||||
Threads belonging to different interpreters initially share nothing,
|
||||
except process state like available memory, open file descriptors
|
||||
and such. The global interpreter lock is also shared by all
|
||||
threads, regardless of which interpreter they belong to.
|
||||
\end{ctypedesc}
|
||||
|
||||
\begin{ctypedesc}{PyThreadState}
|
||||
This data structure represents the state of a single thread. The
|
||||
only public data member is \ctype{PyInterpreterState
|
||||
*}\member{interp}, which points to this thread's interpreter state.
|
||||
\end{ctypedesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyEval_InitThreads}{}
|
||||
Initialize and acquire the global interpreter lock. It should be
|
||||
called in the main thread before creating a second thread or
|
||||
engaging in any other thread operations such as
|
||||
\cfunction{PyEval_ReleaseLock()}\ttindex{PyEval_ReleaseLock()} or
|
||||
\code{PyEval_ReleaseThread(\var{tstate})}\ttindex{PyEval_ReleaseThread()}.
|
||||
It is not needed before calling
|
||||
\cfunction{PyEval_SaveThread()}\ttindex{PyEval_SaveThread()} or
|
||||
\cfunction{PyEval_RestoreThread()}\ttindex{PyEval_RestoreThread()}.
|
||||
|
||||
This is a no-op when called for a second time. It is safe to call
|
||||
this function before calling
|
||||
\cfunction{Py_Initialize()}\ttindex{Py_Initialize()}.
|
||||
|
||||
When only the main thread exists, no lock operations are needed.
|
||||
This is a common situation (most Python programs do not use
|
||||
threads), and the lock operations slow the interpreter down a bit.
|
||||
Therefore, the lock is not created initially. This situation is
|
||||
equivalent to having acquired the lock: when there is only a single
|
||||
thread, all object accesses are safe. Therefore, when this function
|
||||
initializes the lock, it also acquires it. Before the Python
|
||||
\module{thread}\refbimodindex{thread} module creates a new thread,
|
||||
knowing that either it has the lock or the lock hasn't been created
|
||||
yet, it calls \cfunction{PyEval_InitThreads()}. When this call
|
||||
returns, it is guaranteed that the lock has been created and that the
|
||||
calling thread has acquired it.
|
||||
|
||||
It is \strong{not} safe to call this function when it is unknown
|
||||
which thread (if any) currently has the global interpreter lock.
|
||||
|
||||
This function is not available when thread support is disabled at
|
||||
compile time.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyEval_ThreadsInitialized}{}
|
||||
Returns a non-zero value if \cfunction{PyEval_InitThreads()} has been
|
||||
called. This function can be called without holding the lock, and
|
||||
therefore can be used to avoid calls to the locking API when running
|
||||
single-threaded. This function is not available when thread support
|
||||
is disabled at compile time. \versionadded{2.4}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyEval_AcquireLock}{}
|
||||
Acquire the global interpreter lock. The lock must have been
|
||||
created earlier. If this thread already has the lock, a deadlock
|
||||
ensues. This function is not available when thread support is
|
||||
disabled at compile time.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyEval_ReleaseLock}{}
|
||||
Release the global interpreter lock. The lock must have been
|
||||
created earlier. This function is not available when thread support
|
||||
is disabled at compile time.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyEval_AcquireThread}{PyThreadState *tstate}
|
||||
Acquire the global interpreter lock and set the current thread
|
||||
state to \var{tstate}, which should not be \NULL. The lock must
|
||||
have been created earlier. If this thread already has the lock,
|
||||
deadlock ensues. This function is not available when thread support
|
||||
is disabled at compile time.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyEval_ReleaseThread}{PyThreadState *tstate}
|
||||
Reset the current thread state to \NULL{} and release the global
|
||||
interpreter lock. The lock must have been created earlier and must
|
||||
be held by the current thread. The \var{tstate} argument, which
|
||||
must not be \NULL, is only used to check that it represents the
|
||||
current thread state --- if it isn't, a fatal error is reported.
|
||||
This function is not available when thread support is disabled at
|
||||
compile time.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyThreadState*}{PyEval_SaveThread}{}
|
||||
Release the interpreter lock (if it has been created and thread
|
||||
support is enabled) and reset the thread state to \NULL, returning
|
||||
the previous thread state (which is not \NULL). If the lock has
|
||||
been created, the current thread must have acquired it. (This
|
||||
function is available even when thread support is disabled at
|
||||
compile time.)
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyEval_RestoreThread}{PyThreadState *tstate}
|
||||
Acquire the interpreter lock (if it has been created and thread
|
||||
support is enabled) and set the thread state to \var{tstate}, which
|
||||
must not be \NULL. If the lock has been created, the current thread
|
||||
must not have acquired it, otherwise deadlock ensues. (This
|
||||
function is available even when thread support is disabled at
|
||||
compile time.)
|
||||
\end{cfuncdesc}
|
||||
|
||||
The following macros are normally used without a trailing semicolon;
|
||||
look for example usage in the Python source distribution.
|
||||
|
||||
\begin{csimplemacrodesc}{Py_BEGIN_ALLOW_THREADS}
|
||||
This macro expands to
|
||||
\samp{\{ PyThreadState *_save; _save = PyEval_SaveThread();}.
|
||||
Note that it contains an opening brace; it must be matched with a
|
||||
following \csimplemacro{Py_END_ALLOW_THREADS} macro. See above for
|
||||
further discussion of this macro. It is a no-op when thread support
|
||||
is disabled at compile time.
|
||||
\end{csimplemacrodesc}
|
||||
|
||||
\begin{csimplemacrodesc}{Py_END_ALLOW_THREADS}
|
||||
This macro expands to \samp{PyEval_RestoreThread(_save); \}}.
|
||||
Note that it contains a closing brace; it must be matched with an
|
||||
earlier \csimplemacro{Py_BEGIN_ALLOW_THREADS} macro. See above for
|
||||
further discussion of this macro. It is a no-op when thread support
|
||||
is disabled at compile time.
|
||||
\end{csimplemacrodesc}
|
||||
|
||||
\begin{csimplemacrodesc}{Py_BLOCK_THREADS}
|
||||
This macro expands to \samp{PyEval_RestoreThread(_save);}: it is
|
||||
equivalent to \csimplemacro{Py_END_ALLOW_THREADS} without the
|
||||
closing brace. It is a no-op when thread support is disabled at
|
||||
compile time.
|
||||
\end{csimplemacrodesc}
|
||||
|
||||
\begin{csimplemacrodesc}{Py_UNBLOCK_THREADS}
|
||||
This macro expands to \samp{_save = PyEval_SaveThread();}: it is
|
||||
equivalent to \csimplemacro{Py_BEGIN_ALLOW_THREADS} without the
|
||||
opening brace and variable declaration. It is a no-op when thread
|
||||
support is disabled at compile time.
|
||||
\end{csimplemacrodesc}
|
||||
|
||||
All of the following functions are only available when thread support
|
||||
is enabled at compile time, and must be called only when the
|
||||
interpreter lock has been created.
|
||||
|
||||
\begin{cfuncdesc}{PyInterpreterState*}{PyInterpreterState_New}{}
|
||||
Create a new interpreter state object. The interpreter lock need
|
||||
not be held, but may be held if it is necessary to serialize calls
|
||||
to this function.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyInterpreterState_Clear}{PyInterpreterState *interp}
|
||||
Reset all information in an interpreter state object. The
|
||||
interpreter lock must be held.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyInterpreterState_Delete}{PyInterpreterState *interp}
|
||||
Destroy an interpreter state object. The interpreter lock need not
|
||||
be held. The interpreter state must have been reset with a previous
|
||||
call to \cfunction{PyInterpreterState_Clear()}.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyThreadState*}{PyThreadState_New}{PyInterpreterState *interp}
|
||||
Create a new thread state object belonging to the given interpreter
|
||||
object. The interpreter lock need not be held, but may be held if
|
||||
it is necessary to serialize calls to this function.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyThreadState_Clear}{PyThreadState *tstate}
|
||||
Reset all information in a thread state object. The interpreter lock
|
||||
must be held.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyThreadState_Delete}{PyThreadState *tstate}
|
||||
Destroy a thread state object. The interpreter lock need not be
|
||||
held. The thread state must have been reset with a previous call to
|
||||
\cfunction{PyThreadState_Clear()}.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyThreadState*}{PyThreadState_Get}{}
|
||||
Return the current thread state. The interpreter lock must be
|
||||
held. When the current thread state is \NULL, this issues a fatal
|
||||
error (so that the caller needn't check for \NULL).
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyThreadState*}{PyThreadState_Swap}{PyThreadState *tstate}
|
||||
Swap the current thread state with the thread state given by the
|
||||
argument \var{tstate}, which may be \NULL. The interpreter lock
|
||||
must be held.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyThreadState_GetDict}{}
|
||||
Return a dictionary in which extensions can store thread-specific
|
||||
state information. Each extension should use a unique key to
|
||||
store state in the dictionary. It is okay to call this function
|
||||
when no current thread state is available.
|
||||
If this function returns \NULL, no exception has been raised and the
|
||||
caller should assume no current thread state is available.
|
||||
\versionchanged[Previously this could only be called when a current
|
||||
thread is active, and \NULL{} meant that an exception was raised]{2.3}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyThreadState_SetAsyncExc}{long id, PyObject *exc}
|
||||
Asynchronously raise an exception in a thread.
|
||||
The \var{id} argument is the thread id of the target thread;
|
||||
\var{exc} is the exception object to be raised.
|
||||
This function does not steal any references to \var{exc}.
|
||||
To prevent naive misuse, you must write your own C extension
|
||||
to call this. Must be called with the GIL held.
|
||||
Returns the number of thread states modified; this is normally one, but
|
||||
will be zero if the thread id isn't found. If \var{exc} is
|
||||
\constant{NULL}, the pending exception (if any) for the thread is cleared.
|
||||
This raises no exceptions.
|
||||
\versionadded{2.3}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyGILState_STATE}{PyGILState_Ensure}{}
|
||||
Ensure that the current thread is ready to call the Python C API
|
||||
regardless of the current state of Python, or of its thread lock.
|
||||
This may be called as many times as desired by a thread as long as
|
||||
each call is matched with a call to \cfunction{PyGILState_Release()}.
|
||||
In general, other thread-related APIs may be used between
|
||||
\cfunction{PyGILState_Ensure()} and \cfunction{PyGILState_Release()}
|
||||
calls as long as the thread state is restored to its previous state
|
||||
before the Release(). For example, normal usage of the
|
||||
\csimplemacro{Py_BEGIN_ALLOW_THREADS} and
|
||||
\csimplemacro{Py_END_ALLOW_THREADS} macros is acceptable.
|
||||
|
||||
The return value is an opaque ``handle'' to the thread state when
|
||||
\cfunction{PyGILState_Ensure()} was called, and must be passed to
|
||||
\cfunction{PyGILState_Release()} to ensure Python is left in the same
|
||||
state. Even though recursive calls are allowed, these handles
|
||||
\emph{cannot} be shared; each unique call to
|
||||
\cfunction{PyGILState_Ensure()} must save the handle for its call to
|
||||
\cfunction{PyGILState_Release()}.
|
||||
|
||||
When the function returns, the current thread will hold the GIL.
|
||||
Failure is a fatal error.
|
||||
\versionadded{2.3}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyGILState_Release}{PyGILState_STATE}
|
||||
Release any resources previously acquired. After this call, Python's
|
||||
state will be the same as it was prior to the corresponding
|
||||
\cfunction{PyGILState_Ensure()} call (but generally this state will be
|
||||
unknown to the caller, hence the use of the GILState API).
|
||||
|
||||
Every call to \cfunction{PyGILState_Ensure()} must be matched by a call to
|
||||
\cfunction{PyGILState_Release()} on the same thread.
|
||||
\versionadded{2.3}
|
||||
\end{cfuncdesc}
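
The following toy model shows why each call to \cfunction{PyGILState_Ensure()} needs its own handle. It is a single-threaded sketch and not the real implementation. The lock is a plain flag, and the names \code{toy_gilstate}, \code{toy_ensure()} and \code{toy_release()} are hypothetical.

```c
#include <assert.h>

/* The handle records whether the lock was already held before the call. */
typedef enum { TOY_LOCKED, TOY_UNLOCKED } toy_gilstate;

static int toy_gil_held = 0;    /* stand-in for "this thread holds the GIL" */

/* Make sure the lock is held and report the state it was in before. */
toy_gilstate toy_ensure(void)
{
    toy_gilstate before = toy_gil_held ? TOY_LOCKED : TOY_UNLOCKED;
    toy_gil_held = 1;
    return before;
}

/* Put the lock back into the state recorded in the handle. */
void toy_release(toy_gilstate handle)
{
    assert(toy_gil_held);
    if (handle == TOY_UNLOCKED)
        toy_gil_held = 0;       /* only the outermost call drops the lock */
}
```

In this model, passing the outer handle to the inner release would drop the lock while the outer caller still expects to hold it.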
|
||||
|
||||
|
||||
\section{Profiling and Tracing \label{profiling}}
|
||||
|
||||
\sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org}
|
||||
|
||||
The Python interpreter provides some low-level support for attaching
|
||||
profiling and execution tracing facilities. These are used for
|
||||
profiling, debugging, and coverage analysis tools.
|
||||
|
||||
Starting with Python 2.2, the implementation of this facility was
|
||||
substantially revised, and an interface from C was added. This C
|
||||
interface allows the profiling or tracing code to avoid the overhead
|
||||
of calling through Python-level callable objects, making a direct C
|
||||
function call instead. The essential attributes of the facility have
|
||||
not changed; the interface allows trace functions to be installed
|
||||
per-thread, and the basic events reported to the trace function are
|
||||
the same as had been reported to the Python-level trace functions in
|
||||
previous versions.
|
||||
|
||||
\begin{ctypedesc}[Py_tracefunc]{int (*Py_tracefunc)(PyObject *obj,
|
||||
PyFrameObject *frame, int what,
|
||||
PyObject *arg)}
|
||||
The type of the trace function registered using
|
||||
\cfunction{PyEval_SetProfile()} and \cfunction{PyEval_SetTrace()}.
|
||||
The first parameter is the object passed to the registration
|
||||
function as \var{obj}, \var{frame} is the frame object to which the
|
||||
event pertains, \var{what} is one of the constants
|
||||
\constant{PyTrace_CALL}, \constant{PyTrace_EXCEPTION},
|
||||
\constant{PyTrace_LINE}, \constant{PyTrace_RETURN},
|
||||
\constant{PyTrace_C_CALL}, \constant{PyTrace_C_EXCEPTION},
|
||||
or \constant{PyTrace_C_RETURN}, and \var{arg}
|
||||
depends on the value of \var{what}:
|
||||
|
||||
\begin{tableii}{l|l}{constant}{Value of \var{what}}{Meaning of \var{arg}}
|
||||
\lineii{PyTrace_CALL}{Always \NULL.}
|
||||
\lineii{PyTrace_EXCEPTION}{Exception information as returned by
|
||||
\function{sys.exc_info()}.}
|
||||
\lineii{PyTrace_LINE}{Always \NULL.}
|
||||
\lineii{PyTrace_RETURN}{Value being returned to the caller.}
|
||||
\lineii{PyTrace_C_CALL}{Name of function being called.}
|
||||
\lineii{PyTrace_C_EXCEPTION}{Always \NULL.}
|
||||
\lineii{PyTrace_C_RETURN}{Always \NULL.}
|
||||
\end{tableii}
|
||||
\end{ctypedesc}
|
||||
|
||||
\begin{cvardesc}{int}{PyTrace_CALL}
|
||||
The value of the \var{what} parameter to a \ctype{Py_tracefunc}
|
||||
function when a new call to a function or method is being reported,
|
||||
or a new entry into a generator. Note that the creation of the
|
||||
iterator for a generator function is not reported as there is no
|
||||
control transfer to the Python bytecode in the corresponding frame.
|
||||
\end{cvardesc}
|
||||
|
||||
\begin{cvardesc}{int}{PyTrace_EXCEPTION}
|
||||
The value of the \var{what} parameter to a \ctype{Py_tracefunc}
|
||||
function when an exception has been raised. The callback function
|
||||
is called with this value for \var{what} after any bytecode is
|
||||
processed after which the exception becomes set within the frame
|
||||
being executed. The effect of this is that as exception propagation
|
||||
causes the Python stack to unwind, the callback is called upon
|
||||
return to each frame as the exception propagates. Only trace
|
||||
functions receive these events; they are not needed by the
|
||||
profiler.
|
||||
\end{cvardesc}
|
||||
|
||||
\begin{cvardesc}{int}{PyTrace_LINE}
|
||||
The value passed as the \var{what} parameter to a trace function
|
||||
(but not a profiling function) when a line-number event is being
|
||||
reported.
|
||||
\end{cvardesc}
|
||||
|
||||
\begin{cvardesc}{int}{PyTrace_RETURN}
|
||||
The value for the \var{what} parameter to \ctype{Py_tracefunc}
|
||||
functions when a call is returning without propagating an exception.
|
||||
\end{cvardesc}
|
||||
|
||||
\begin{cvardesc}{int}{PyTrace_C_CALL}
|
||||
The value for the \var{what} parameter to \ctype{Py_tracefunc}
|
||||
functions when a C function is about to be called.
|
||||
\end{cvardesc}
|
||||
|
||||
\begin{cvardesc}{int}{PyTrace_C_EXCEPTION}
|
||||
The value for the \var{what} parameter to \ctype{Py_tracefunc}
|
||||
functions when a C function has raised an exception.
|
||||
\end{cvardesc}
|
||||
|
||||
\begin{cvardesc}{int}{PyTrace_C_RETURN}
|
||||
The value for the \var{what} parameter to \ctype{Py_tracefunc}
|
||||
functions when a C function has returned.
|
||||
\end{cvardesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyEval_SetProfile}{Py_tracefunc func, PyObject *obj}
|
||||
Set the profiler function to \var{func}. The \var{obj} parameter is
|
||||
passed to the function as its first parameter, and may be any Python
|
||||
object, or \NULL. If the profile function needs to maintain state,
|
||||
using a different value for \var{obj} for each thread provides a
|
||||
convenient and thread-safe place to store it. The profile function
|
||||
is called for all monitored events except the line-number events.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyEval_SetTrace}{Py_tracefunc func, PyObject *obj}
|
||||
Set the tracing function to \var{func}. This is similar to
|
||||
\cfunction{PyEval_SetProfile()}, except the tracing function does
|
||||
receive line-number events.
|
||||
\end{cfuncdesc}
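
The difference between the two hooks can also be observed from Python
code: in CPython, \function{sys.setprofile()} and
\function{sys.settrace()} install Python-level callbacks that receive
the same kinds of events. The following is only a sketch, assuming a
current CPython 3 interpreter; \code{record_events()},
\code{callback()} and \code{work()} are hypothetical names introduced
for the illustration and are not part of the C API.

\begin{verbatim}
import sys

def record_events(install):
    """Run a small function under `install` and return the events seen."""
    seen = []

    def callback(frame, event, arg):
        # Only record events that belong to the function under test.
        if frame.f_code.co_name == "work":
            seen.append(event)
        # A trace function returns the local trace function to keep
        # tracing this frame; a profile function's result is ignored.
        return callback

    def work():
        x = 1
        return x + 1

    install(callback)
    try:
        work()
    finally:
        install(None)  # always remove the hook again
    return seen

profile_events = record_events(sys.setprofile)
trace_events = record_events(sys.settrace)
\end{verbatim}

Under these assumptions, \code{profile_events} holds only the call and
return events for \code{work()}, while \code{trace_events} also
contains line events.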
|
||||
|
||||
|
||||
\section{Advanced Debugger Support \label{advanced-debugging}}
|
||||
\sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org}
|
||||
|
||||
These functions are only intended to be used by advanced debugging
|
||||
tools.
|
||||
|
||||
\begin{cfuncdesc}{PyInterpreterState*}{PyInterpreterState_Head}{}
|
||||
Return the interpreter state object at the head of the list of all
|
||||
such objects.
|
||||
\versionadded{2.2}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyInterpreterState*}{PyInterpreterState_Next}{PyInterpreterState *interp}
|
||||
Return the next interpreter state object after \var{interp} from the
|
||||
list of all such objects.
|
||||
\versionadded{2.2}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyThreadState *}{PyInterpreterState_ThreadHead}{PyInterpreterState *interp}
|
||||
Return a pointer to the first \ctype{PyThreadState} object in
|
||||
the list of threads associated with the interpreter \var{interp}.
|
||||
\versionadded{2.2}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyThreadState*}{PyThreadState_Next}{PyThreadState *tstate}
|
||||
Return the next thread state object after \var{tstate} from the list
|
||||
of all such objects belonging to the same \ctype{PyInterpreterState}
|
||||
object.
|
||||
\versionadded{2.2}
|
||||
\end{cfuncdesc}
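
These functions have no direct Python equivalent. A rough Python-level
analogue of walking the thread states of the current interpreter is
\function{sys._current_frames()}, which maps thread identifiers to the
frame each thread is executing. The sketch below assumes a current
CPython 3 interpreter; \code{snapshot_threads()}, \code{worker()} and
the thread name \code{worker-1} are hypothetical names chosen for the
example.

\begin{verbatim}
import sys
import threading

def snapshot_threads():
    """Map each live thread's name to the function it is running."""
    frames = sys._current_frames()      # thread id -> topmost frame
    names = {t.ident: t.name for t in threading.enumerate()}
    return {names.get(tid, "?"): frame.f_code.co_name
            for tid, frame in frames.items()}

ready = threading.Event()
done = threading.Event()

def worker():
    ready.set()
    done.wait()   # block so the thread is still alive when inspected

t = threading.Thread(target=worker, name="worker-1")
t.start()
ready.wait()
snapshot = snapshot_threads()
done.set()
t.join()
\end{verbatim}

A debugger written in C would obtain comparable information by
following the interpreter and thread state lists directly.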
|
|
@ -1,627 +0,0 @@
|
|||
\chapter{Introduction \label{intro}}
|
||||
|
||||
|
||||
The Application Programmer's Interface to Python gives C and
|
||||
\Cpp{} programmers access to the Python interpreter at a variety of
|
||||
levels. The API is equally usable from \Cpp, but for brevity it is
|
||||
generally referred to as the Python/C API. There are two
|
||||
fundamentally different reasons for using the Python/C API. The first
|
||||
reason is to write \emph{extension modules} for specific purposes;
|
||||
these are C modules that extend the Python interpreter. This is
|
||||
probably the most common use. The second reason is to use Python as a
|
||||
component in a larger application; this technique is generally
|
||||
referred to as \dfn{embedding} Python in an application.
|
||||
|
||||
Writing an extension module is a relatively well-understood process,
|
||||
where a ``cookbook'' approach works well. There are several tools
|
||||
that automate the process to some extent. While people have embedded
|
||||
Python in other applications since its early existence, the process of
|
||||
embedding Python is less straightforward than writing an extension.
|
||||
|
||||
Many API functions are useful independent of whether you're embedding
|
||||
or extending Python; moreover, most applications that embed Python
|
||||
will need to provide a custom extension as well, so it's probably a
|
||||
good idea to become familiar with writing an extension before
|
||||
attempting to embed Python in a real application.
|
||||
|
||||
|
||||
\section{Include Files \label{includes}}
|
||||
|
||||
All function, type and macro definitions needed to use the Python/C
|
||||
API are included in your code by the following line:
|
||||
|
||||
\begin{verbatim}
|
||||
#include "Python.h"
|
||||
\end{verbatim}
|
||||
|
||||
This implies inclusion of the following standard headers:
|
||||
\code{<stdio.h>}, \code{<string.h>}, \code{<errno.h>},
|
||||
\code{<limits.h>}, and \code{<stdlib.h>} (if available).
|
||||
|
||||
\begin{notice}[warning]
|
||||
Since Python may define some pre-processor definitions which affect
|
||||
the standard headers on some systems, you \emph{must} include
|
||||
\file{Python.h} before any standard headers are included.
|
||||
\end{notice}
|
||||
|
||||
All user-visible names defined by \file{Python.h} (except those defined by
|
||||
the included standard headers) have one of the prefixes \samp{Py} or
|
||||
\samp{_Py}. Names beginning with \samp{_Py} are for internal use by
|
||||
the Python implementation and should not be used by extension writers.
|
||||
Structure member names do not have a reserved prefix.
|
||||
|
||||
\strong{Important:} user code should never define names that begin
|
||||
with \samp{Py} or \samp{_Py}. This confuses the reader, and
|
||||
jeopardizes the portability of the user code to future Python
|
||||
versions, which may define additional names beginning with one of
|
||||
these prefixes.
|
||||
|
||||
The header files are typically installed with Python. On \UNIX, these
|
||||
are located in the directories
|
||||
\file{\envvar{prefix}/include/python\var{version}/} and
|
||||
\file{\envvar{exec_prefix}/include/python\var{version}/}, where
|
||||
\envvar{prefix} and \envvar{exec_prefix} are defined by the
|
||||
corresponding parameters to Python's \program{configure} script and
|
||||
\var{version} is \code{sys.version[:3]}. On Windows, the headers are
|
||||
installed in \file{\envvar{prefix}/include}, where \envvar{prefix} is
|
||||
the installation directory specified to the installer.
|
||||
|
||||
To include the headers, place both directories (if different) on your
|
||||
compiler's search path for includes. Do \emph{not} place the parent
|
||||
directories on the search path and then use
|
||||
\samp{\#include <python\shortversion/Python.h>}; this will break on
|
||||
multi-platform builds since the platform independent headers under
|
||||
\envvar{prefix} include the platform specific headers from
|
||||
\envvar{exec_prefix}.
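
On a current CPython 3 installation, the two header directories
described above can be queried from the \module{sysconfig} module
instead of being assembled by hand. This is a minimal sketch, assuming
the \code{include} and \code{platinclude} path names that
\module{sysconfig} provides; \code{include_dirs} and
\code{compiler_flags} are names chosen for this example, and the
\code{-I} flag syntax is an assumption about the compiler in use.

\begin{verbatim}
import sysconfig

paths = sysconfig.get_paths()
include_dirs = []
for key in ("include", "platinclude"):
    directory = paths[key]
    if directory not in include_dirs:   # the two are often identical
        include_dirs.append(directory)

# One -I option per distinct header directory.
compiler_flags = ["-I" + d for d in include_dirs]
\end{verbatim}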
|
||||
|
||||
\Cpp{} users should note that though the API is defined entirely using
|
||||
C, the header files do properly declare the entry points to be
|
||||
\code{extern "C"}, so there is no need to do anything special to use
|
||||
the API from \Cpp.
|
||||
|
||||
|
||||
\section{Objects, Types and Reference Counts \label{objects}}
|
||||
|
||||
Most Python/C API functions have one or more arguments as well as a
|
||||
return value of type \ctype{PyObject*}. This type is a pointer
|
||||
to an opaque data type representing an arbitrary Python
|
||||
object. Since all Python object types are treated the same way by the
|
||||
Python language in most situations (e.g., assignments, scope rules,
|
||||
and argument passing), it is only fitting that they should be
|
||||
represented by a single C type. Almost all Python objects live on the
|
||||
heap: you never declare an automatic or static variable of type
|
||||
\ctype{PyObject}, only pointer variables of type \ctype{PyObject*} can
|
||||
be declared. The sole exceptions are the type objects\obindex{type};
|
||||
since these must never be deallocated, they are typically static
|
||||
\ctype{PyTypeObject} objects.
|
||||
|
||||
All Python objects (even Python integers) have a \dfn{type} and a
|
||||
\dfn{reference count}. An object's type determines what kind of object
|
||||
it is (e.g., an integer, a list, or a user-defined function; there are
|
||||
many more as explained in the \citetitle[../ref/ref.html]{Python
|
||||
Reference Manual}). For each of the well-known types there is a macro
|
||||
to check whether an object is of that type; for instance,
|
||||
\samp{PyList_Check(\var{a})} is true if (and only if) the object
|
||||
pointed to by \var{a} is a Python list.
|
||||
|
||||
|
||||
\subsection{Reference Counts \label{refcounts}}
|
||||
|
||||
The reference count is important because today's computers have a
|
||||
finite (and often severely limited) memory size; it counts how many
|
||||
different places there are that have a reference to an object. Such a
|
||||
place could be another object, or a global (or static) C variable, or
|
||||
a local variable in some C function. When an object's reference count
|
||||
becomes zero, the object is deallocated. If it contains references to
|
||||
other objects, their reference count is decremented. Those other
|
||||
objects may be deallocated in turn, if this decrement makes their
|
||||
reference count become zero, and so on. (There's an obvious problem
|
||||
with objects that reference each other here; for now, the solution is
|
||||
``don't do that.'')
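
The cascade described above can be watched from Python with a weak
reference, which does not itself keep its target alive. This is a
sketch only, assuming CPython's reference counting (other
implementations may delay deallocation); \code{Node}, \code{leaf} and
\code{root} are hypothetical names.

\begin{verbatim}
import weakref

class Node:
    def __init__(self, child=None):
        self.child = child

leaf = Node()
root = Node(leaf)              # root now holds a reference to leaf
leaf_ref = weakref.ref(leaf)   # observes leaf without owning it

del leaf                       # root.child still refers to the leaf
still_there = leaf_ref() is not None

del root                       # root is deallocated, releasing its child
gone = leaf_ref() is None
\end{verbatim}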
|
||||
|
||||
Reference counts are always manipulated explicitly. The normal way is
|
||||
to use the macro \cfunction{Py_INCREF()}\ttindex{Py_INCREF()} to
|
||||
increment an object's reference count by one, and
|
||||
\cfunction{Py_DECREF()}\ttindex{Py_DECREF()} to decrement it by
|
||||
one. The \cfunction{Py_DECREF()} macro is considerably more complex
|
||||
than the incref one, since it must check whether the reference count
|
||||
becomes zero and then cause the object's deallocator to be called.
|
||||
The deallocator is a function pointer contained in the object's type
|
||||
structure. The type-specific deallocator takes care of decrementing
|
||||
the reference counts for other objects contained in the object if this
|
||||
is a compound object type, such as a list, as well as performing any
|
||||
additional finalization that's needed. There's no chance that the
|
||||
reference count can overflow; at least as many bits are used to hold
|
||||
the reference count as there are distinct memory locations in virtual
|
||||
memory (assuming \code{sizeof(long) >= sizeof(char*)}). Thus, the
|
||||
reference count increment is a simple operation.
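
From Python, the effect of these increments and decrements can be
observed with \function{sys.getrefcount()}. The sketch below assumes
CPython; only the differences between the values are meaningful, since
the absolute numbers include temporary references made by the call
itself.

\begin{verbatim}
import sys

obj = object()
before = sys.getrefcount(obj)

holder = [obj]                  # the list takes one more reference
during = sys.getrefcount(obj)

del holder[:]                   # the list releases its reference
after = sys.getrefcount(obj)
\end{verbatim}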
|
||||
|
||||
It is not necessary to increment an object's reference count for every
|
||||
local variable that contains a pointer to an object. In theory, the
|
||||
object's reference count goes up by one when the variable is made to
|
||||
point to it and it goes down by one when the variable goes out of
|
||||
scope. However, these two cancel each other out, so at the end the
|
||||
reference count hasn't changed. The only real reason to use the
|
||||
reference count is to prevent the object from being deallocated as
|
||||
long as our variable is pointing to it. If we know that there is at
|
||||
least one other reference to the object that lives at least as long as
|
||||
our variable, there is no need to increment the reference count
|
||||
temporarily. An important situation where this arises is in objects
|
||||
that are passed as arguments to C functions in an extension module
|
||||
that are called from Python; the call mechanism guarantees to hold a
|
||||
reference to every argument for the duration of the call.
|
||||
|
||||
However, a common pitfall is to extract an object from a list and
|
||||
hold on to it for a while without incrementing its reference count.
|
||||
Some other operation might conceivably remove the object from the
|
||||
list, decrementing its reference count and possibly deallocating it.
|
||||
The real danger is that innocent-looking operations may invoke
|
||||
arbitrary Python code which could do this; there is a code path which
|
||||
allows control to flow back to the user from a \cfunction{Py_DECREF()},
|
||||
so almost any operation is potentially dangerous.
|
||||
|
||||
A safe approach is to always use the generic operations (functions
|
||||
whose name begins with \samp{PyObject_}, \samp{PyNumber_},
|
||||
\samp{PySequence_} or \samp{PyMapping_}). These operations always
|
||||
increment the reference count of the object they return. This leaves
|
||||
the caller with the responsibility to call
|
||||
\cfunction{Py_DECREF()} when they are done with the result; this soon
|
||||
becomes second nature.
|
||||
|
||||
|
||||
\subsubsection{Reference Count Details \label{refcountDetails}}
|
||||
|
||||
The reference count behavior of functions in the Python/C API is best
|
||||
explained in terms of \emph{ownership of references}. Ownership
|
||||
pertains to references, never to objects (objects are not owned: they
|
||||
are always shared). ``Owning a reference'' means being responsible for
|
||||
calling \cfunction{Py_DECREF()} on it when the reference is no longer needed.
|
||||
Ownership can also be transferred, meaning that the code that receives
|
||||
ownership of the reference then becomes responsible for eventually
|
||||
decref'ing it by calling \cfunction{Py_DECREF()} or
|
||||
\cfunction{Py_XDECREF()} when it's no longer needed---or passing on
|
||||
this responsibility (usually to its caller).
|
||||
When a function passes ownership of a reference on to its caller, the
|
||||
caller is said to receive a \emph{new} reference. When no ownership
|
||||
is transferred, the caller is said to \emph{borrow} the reference.
|
||||
Nothing needs to be done for a borrowed reference.
|
||||
|
||||
Conversely, when a calling function passes in a reference to an
|
||||
object, there are two possibilities: the function \emph{steals} a
|
||||
reference to the object, or it does not. \emph{Stealing a reference}
|
||||
means that when you pass a reference to a function, that function
|
||||
assumes that it now owns that reference, and you are not responsible
|
||||
for it any longer.
|
||||
|
||||
Few functions steal references; the two notable exceptions are
|
||||
\cfunction{PyList_SetItem()}\ttindex{PyList_SetItem()} and
|
||||
\cfunction{PyTuple_SetItem()}\ttindex{PyTuple_SetItem()}, which
|
||||
steal a reference to the item (but not to the tuple or list into which
|
||||
the item is put!). These functions were designed to steal a reference
|
||||
because of a common idiom for populating a tuple or list with newly
|
||||
created objects; for example, the code to create the tuple \code{(1,
|
||||
2, "three")} could look like this (forgetting about error handling for
|
||||
the moment; a better way to code this is shown below):
|
||||
|
||||
\begin{verbatim}
|
||||
PyObject *t;
|
||||
|
||||
t = PyTuple_New(3);
|
||||
PyTuple_SetItem(t, 0, PyInt_FromLong(1L));
|
||||
PyTuple_SetItem(t, 1, PyInt_FromLong(2L));
|
||||
PyTuple_SetItem(t, 2, PyString_FromString("three"));
|
||||
\end{verbatim}
|
||||
|
||||
Here, \cfunction{PyInt_FromLong()} returns a new reference which is
|
||||
immediately stolen by \cfunction{PyTuple_SetItem()}. When you want to
|
||||
keep using an object although the reference to it will be stolen,
|
||||
use \cfunction{Py_INCREF()} to grab another reference before calling the
|
||||
reference-stealing function.
|
||||
|
||||
Incidentally, \cfunction{PyTuple_SetItem()} is the \emph{only} way to
|
||||
set tuple items; \cfunction{PySequence_SetItem()} and
|
||||
\cfunction{PyObject_SetItem()} refuse to do this since tuples are an
|
||||
immutable data type. You should only use
|
||||
\cfunction{PyTuple_SetItem()} for tuples that you are creating
|
||||
yourself.
|
||||
|
||||
Equivalent code for populating a list can be written using
|
||||
\cfunction{PyList_New()} and \cfunction{PyList_SetItem()}.
|
||||
|
||||
However, in practice, you will rarely use these ways of
|
||||
creating and populating a tuple or list. There's a generic function,
|
||||
\cfunction{Py_BuildValue()}, that can create most common objects from
|
||||
C values, directed by a \dfn{format string}. For example, the
|
||||
above two blocks of code could be replaced by the following (which
|
||||
also takes care of the error checking):
|
||||
|
||||
\begin{verbatim}
|
||||
PyObject *tuple, *list;
|
||||
|
||||
tuple = Py_BuildValue("(iis)", 1, 2, "three");
|
||||
list = Py_BuildValue("[iis]", 1, 2, "three");
|
||||
\end{verbatim}
|
||||
|
||||
It is much more common to use \cfunction{PyObject_SetItem()} and
|
||||
friends with items whose references you are only borrowing, like
|
||||
arguments that were passed in to the function you are writing. In
|
||||
that case, their behaviour regarding reference counts is much saner,
|
||||
since you don't have to increment a reference count so you can give a
|
||||
reference away (``have it be stolen''). For example, this function
|
||||
sets all items of a list (actually, any mutable sequence) to a given
|
||||
item:
|
||||
|
||||
\begin{verbatim}
|
||||
int
|
||||
set_all(PyObject *target, PyObject *item)
|
||||
{
|
||||
int i, n;
|
||||
|
||||
n = PyObject_Length(target);
|
||||
if (n < 0)
|
||||
return -1;
|
||||
for (i = 0; i < n; i++) {
|
||||
PyObject *index = PyInt_FromLong(i);
|
||||
if (!index)
|
||||
return -1;
|
||||
if (PyObject_SetItem(target, index, item) < 0) {
|
||||
Py_DECREF(index); /* Don't leak the index on failure */
|
||||
return -1;
|
||||
}
|
||||
Py_DECREF(index);
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
\end{verbatim}
|
||||
\ttindex{set_all()}
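
For comparison, here is a sketch of the same operation written in
Python, where the reference bookkeeping done by hand in the C version
is implicit. The name \code{set_all} simply mirrors the C function.

\begin{verbatim}
def set_all(target, item):
    """Set every element of a mutable sequence to `item`."""
    for i in range(len(target)):
        target[i] = item

values = [1, 2, 3]
set_all(values, "x")
\end{verbatim}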
|
||||
|
||||
The situation is slightly different for function return values.
|
||||
While passing a reference to most functions does not change your
|
||||
ownership responsibilities for that reference, many functions that
|
||||
return a reference to an object give you ownership of the reference.
|
||||
The reason is simple: in many cases, the returned object is created
|
||||
on the fly, and the reference you get is the only reference to the
|
||||
object. Therefore, the generic functions that return object
|
||||
references, like \cfunction{PyObject_GetItem()} and
|
||||
\cfunction{PySequence_GetItem()}, always return a new reference (the
|
||||
caller becomes the owner of the reference).
|
||||
|
||||
It is important to realize that whether you own a reference returned
|
||||
by a function depends on which function you call only --- \emph{the
|
||||
plumage} (the type of the object passed as an
|
||||
argument to the function) \emph{doesn't enter into it!} Thus, if you
|
||||
extract an item from a list using \cfunction{PyList_GetItem()}, you
|
||||
don't own the reference --- but if you obtain the same item from the
|
||||
same list using \cfunction{PySequence_GetItem()} (which happens to
|
||||
take exactly the same arguments), you do own a reference to the
|
||||
returned object.
|
||||
|
||||
Here is an example of how you could write a function that computes the
|
||||
sum of the items in a list of integers; once using
|
||||
\cfunction{PyList_GetItem()}\ttindex{PyList_GetItem()}, and once using
|
||||
\cfunction{PySequence_GetItem()}\ttindex{PySequence_GetItem()}.
|
||||
|
||||
\begin{verbatim}
|
||||
long
|
||||
sum_list(PyObject *list)
|
||||
{
|
||||
int i, n;
|
||||
long total = 0;
|
||||
PyObject *item;
|
||||
|
||||
n = PyList_Size(list);
|
||||
if (n < 0)
|
||||
return -1; /* Not a list */
|
||||
for (i = 0; i < n; i++) {
|
||||
item = PyList_GetItem(list, i); /* Can't fail */
|
||||
if (!PyInt_Check(item)) continue; /* Skip non-integers */
|
||||
total += PyInt_AsLong(item);
|
||||
}
|
||||
return total;
|
||||
}
|
||||
\end{verbatim}
|
||||
\ttindex{sum_list()}
|
||||
|
||||
\begin{verbatim}
|
||||
long
|
||||
sum_sequence(PyObject *sequence)
|
||||
{
|
||||
int i, n;
|
||||
long total = 0;
|
||||
PyObject *item;
|
||||
n = PySequence_Length(sequence);
|
||||
if (n < 0)
|
||||
return -1; /* Has no length */
|
||||
for (i = 0; i < n; i++) {
|
||||
item = PySequence_GetItem(sequence, i);
|
||||
if (item == NULL)
|
||||
return -1; /* Not a sequence, or other failure */
|
||||
if (PyInt_Check(item))
|
||||
total += PyInt_AsLong(item);
|
||||
Py_DECREF(item); /* Discard reference ownership */
|
||||
}
|
||||
return total;
|
||||
}
|
||||
\end{verbatim}
|
||||
\ttindex{sum_sequence()}
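
A Python sketch of what \cfunction{sum_sequence()} computes may help
when reading the C version. It assumes that ``integer'' means an
instance of \class{int}; every \code{sequence[i]} lookup corresponds to
a new reference that the C code must release explicitly.

\begin{verbatim}
def sum_sequence(sequence):
    """Add up the integer items of a sequence, skipping everything else."""
    total = 0
    for i in range(len(sequence)):
        item = sequence[i]
        if isinstance(item, int):
            total += item
    return total

result = sum_sequence([1, "two", 3.5, 4])
\end{verbatim}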
|
||||
|
||||
|
||||
\subsection{Types \label{types}}
|
||||
|
||||
There are few other data types that play a significant role in
|
||||
the Python/C API; most are simple C types such as \ctype{int},
|
||||
\ctype{long}, \ctype{double} and \ctype{char*}. A few structure types
|
||||
are used to describe static tables used to list the functions exported
|
||||
by a module or the data attributes of a new object type, and another
|
||||
is used to describe the value of a complex number. These will
|
||||
be discussed together with the functions that use them.
|
||||
|
||||
|
||||
\section{Exceptions \label{exceptions}}
|
||||
|
||||
The Python programmer only needs to deal with exceptions if specific
|
||||
error handling is required; unhandled exceptions are automatically
|
||||
propagated to the caller, then to the caller's caller, and so on, until
|
||||
they reach the top-level interpreter, where they are reported to the
|
||||
user accompanied by a stack traceback.
|
||||
|
||||
For C programmers, however, error checking always has to be explicit.
|
||||
All functions in the Python/C API can raise exceptions, unless an
|
||||
explicit claim is made otherwise in a function's documentation. In
|
||||
general, when a function encounters an error, it sets an exception,
|
||||
discards any object references that it owns, and returns an
|
||||
error indicator --- usually \NULL{} or \code{-1}. A few functions
|
||||
return a Boolean true/false result, with false indicating an error.
|
||||
Very few functions return no explicit error indicator or have an
|
||||
ambiguous return value, and require explicit testing for errors with
|
||||
\cfunction{PyErr_Occurred()}\ttindex{PyErr_Occurred()}.
|
||||
|
||||
Exception state is maintained in per-thread storage (this is
|
||||
equivalent to using global storage in an unthreaded application). A
|
||||
thread can be in one of two states: an exception has occurred, or not.
|
||||
The function \cfunction{PyErr_Occurred()} can be used to check for
|
||||
this: it returns a borrowed reference to the exception type object
|
||||
when an exception has occurred, and \NULL{} otherwise. There are a
|
||||
number of functions to set the exception state:
|
||||
\cfunction{PyErr_SetString()}\ttindex{PyErr_SetString()} is the most
|
||||
common (though not the most general) function to set the exception
|
||||
state, and \cfunction{PyErr_Clear()}\ttindex{PyErr_Clear()} clears the
|
||||
exception state.
|
||||
|
||||
The full exception state consists of three objects (all of which can
|
||||
be \NULL): the exception type, the corresponding exception
|
||||
value, and the traceback. These have the same meanings as the Python
|
||||
\withsubitem{(in module sys)}{
|
||||
\ttindex{exc_type}\ttindex{exc_value}\ttindex{exc_traceback}}
|
||||
objects \code{sys.exc_type}, \code{sys.exc_value}, and
|
||||
\code{sys.exc_traceback}; however, they are not the same: the Python
|
||||
objects represent the last exception being handled by a Python
|
||||
\keyword{try} \ldots\ \keyword{except} statement, while the C level
|
||||
exception state only exists while an exception is being passed on
|
||||
between C functions until it reaches the Python bytecode interpreter's
|
||||
main loop, which takes care of transferring it to \code{sys.exc_type}
|
||||
and friends.
|
||||
|
||||
Note that starting with Python 1.5, the preferred, thread-safe way to
|
||||
access the exception state from Python code is to call the function
|
||||
\withsubitem{(in module sys)}{\ttindex{exc_info()}}
|
||||
\function{sys.exc_info()}, which returns the per-thread exception state
|
||||
for Python code. Also, the semantics of both ways to access the
|
||||
exception state have changed so that a function which catches an
|
||||
exception will save and restore its thread's exception state so as to
|
||||
preserve the exception state of its caller. This prevents common bugs
|
||||
in exception handling code caused by an innocent-looking function
|
||||
overwriting the exception being handled; it also reduces the often
|
||||
unwanted lifetime extension for objects that are referenced by the
|
||||
stack frames in the traceback.
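
The save-and-restore behaviour can be checked from Python with a short
sketch, assuming a current CPython 3 interpreter; \code{helper()} and
\code{active_type} are hypothetical names. The helper raises and
handles its own exception, yet the caller's exception state is intact
afterwards.

\begin{verbatim}
import sys

def helper():
    """Raise and handle an unrelated exception internally."""
    try:
        {}["missing"]
    except KeyError:
        pass

try:
    raise ValueError("outer")
except ValueError:
    helper()                         # handles its own KeyError
    active_type = sys.exc_info()[0]  # state as seen by the caller
\end{verbatim}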
|
||||
|
||||
As a general principle, a function that calls another function to
|
||||
perform some task should check whether the called function raised an
|
||||
exception, and if so, pass the exception state on to its caller. It
|
||||
should discard any object references that it owns, and return an
|
||||
error indicator, but it should \emph{not} set another exception ---
|
||||
that would overwrite the exception that was just raised, and lose
|
||||
important information about the exact cause of the error.
|
||||
|
||||
A simple example of detecting exceptions and passing them on is shown
|
||||
in the \cfunction{sum_sequence()}\ttindex{sum_sequence()} example
|
||||
above. It so happens that this example doesn't need to clean up any
|
||||
owned references when it detects an error. The following example
|
||||
function shows some error cleanup. First, to remind you why you like
|
||||
Python, we show the equivalent Python code:
|
||||
|
||||
\begin{verbatim}
|
||||
def incr_item(dict, key):
|
||||
try:
|
||||
item = dict[key]
|
||||
except KeyError:
|
||||
item = 0
|
||||
dict[key] = item + 1
|
||||
\end{verbatim}
|
||||
\ttindex{incr_item()}
|
||||
|
||||
Here is the corresponding C code, in all its glory:
|
||||
|
||||
\begin{verbatim}
|
||||
int
|
||||
incr_item(PyObject *dict, PyObject *key)
|
||||
{
|
||||
/* Objects all initialized to NULL for Py_XDECREF */
|
||||
PyObject *item = NULL, *const_one = NULL, *incremented_item = NULL;
|
||||
int rv = -1; /* Return value initialized to -1 (failure) */
|
||||
|
||||
item = PyObject_GetItem(dict, key);
|
||||
if (item == NULL) {
|
||||
/* Handle KeyError only: */
|
||||
if (!PyErr_ExceptionMatches(PyExc_KeyError))
|
||||
goto error;
|
||||
|
||||
/* Clear the error and use zero: */
|
||||
PyErr_Clear();
|
||||
item = PyInt_FromLong(0L);
|
||||
if (item == NULL)
|
||||
goto error;
|
||||
}
|
||||
const_one = PyInt_FromLong(1L);
|
||||
if (const_one == NULL)
|
||||
goto error;
|
||||
|
||||
incremented_item = PyNumber_Add(item, const_one);
|
||||
if (incremented_item == NULL)
|
||||
goto error;
|
||||
|
||||
if (PyObject_SetItem(dict, key, incremented_item) < 0)
|
||||
goto error;
|
||||
rv = 0; /* Success */
|
||||
/* Continue with cleanup code */
|
||||
|
||||
error:
|
||||
/* Cleanup code, shared by success and failure path */
|
||||
|
||||
/* Use Py_XDECREF() to ignore NULL references */
|
||||
Py_XDECREF(item);
|
||||
Py_XDECREF(const_one);
|
||||
Py_XDECREF(incremented_item);
|
||||
|
||||
return rv; /* -1 for error, 0 for success */
|
||||
}
|
||||
\end{verbatim}
|
||||
\ttindex{incr_item()}
|
||||
|
||||
This example represents an endorsed use of the \keyword{goto} statement
|
||||
in C! It illustrates the use of
|
||||
\cfunction{PyErr_ExceptionMatches()}\ttindex{PyErr_ExceptionMatches()} and
|
||||
\cfunction{PyErr_Clear()}\ttindex{PyErr_Clear()} to
|
||||
handle specific exceptions, and the use of
|
||||
\cfunction{Py_XDECREF()}\ttindex{Py_XDECREF()} to
|
||||
dispose of owned references that may be \NULL{} (note the
|
||||
\character{X} in the name; \cfunction{Py_DECREF()} would crash when
|
||||
confronted with a \NULL{} reference). It is important that the
|
||||
variables used to hold owned references are initialized to \NULL{} for
|
||||
this to work; likewise, the proposed return value is initialized to
|
||||
\code{-1} (failure) and only set to success after the final call made
|
||||
is successful.
|
||||
|
||||
|
||||
\section{Embedding Python \label{embedding}}
|
||||
|
||||
The one important task that only embedders (as opposed to extension
|
||||
writers) of the Python interpreter have to worry about is the
|
||||
initialization, and possibly the finalization, of the Python
|
||||
interpreter. Most functionality of the interpreter can only be used
|
||||
after the interpreter has been initialized.
|
||||
|
||||
The basic initialization function is
|
||||
\cfunction{Py_Initialize()}\ttindex{Py_Initialize()}.
|
||||
This initializes the table of loaded modules, and creates the
|
||||
fundamental modules \module{__builtin__}\refbimodindex{__builtin__},
|
||||
\module{__main__}\refbimodindex{__main__}, \module{sys}\refbimodindex{sys},
|
||||
and \module{exceptions}.\refbimodindex{exceptions} It also initializes
|
||||
the module search path (\code{sys.path}).%
|
||||
\indexiii{module}{search}{path}
|
||||
\withsubitem{(in module sys)}{\ttindex{path}}
|
||||
|
||||
\cfunction{Py_Initialize()} does not set the ``script argument list''
|
||||
(\code{sys.argv}). If this variable is needed by Python code that
|
||||
will be executed later, it must be set explicitly with a call to
|
||||
\code{PySys_SetArgv(\var{argc},
|
||||
\var{argv})}\ttindex{PySys_SetArgv()} subsequent to the call to
|
||||
\cfunction{Py_Initialize()}.
|
||||
|
||||
On most systems (in particular, on \UNIX{} and Windows, although the
|
||||
details are slightly different),
|
||||
\cfunction{Py_Initialize()} calculates the module search path based
|
||||
upon its best guess for the location of the standard Python
|
||||
interpreter executable, assuming that the Python library is found in a
|
||||
fixed location relative to the Python interpreter executable. In
|
||||
particular, it looks for a directory named
|
||||
\file{lib/python\shortversion} relative to the parent directory where
|
||||
the executable named \file{python} is found on the shell command
|
||||
search path (the environment variable \envvar{PATH}).
|
||||
|
||||
For instance, if the Python executable is found in
|
||||
\file{/usr/local/bin/python}, it will assume that the libraries are in
|
||||
\file{/usr/local/lib/python\shortversion}. (In fact, this particular path
|
||||
is also the ``fallback'' location, used when no executable file named
|
||||
\file{python} is found along \envvar{PATH}.) The user can override
|
||||
this behavior by setting the environment variable \envvar{PYTHONHOME},
|
||||
or insert additional directories in front of the standard path by
|
||||
setting \envvar{PYTHONPATH}.
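
The outcome of this search can be inspected from Python once the
interpreter is running. The following sketch assumes a current CPython
3 interpreter and uses the \module{sysconfig} module to locate the
standard library; the \code{report} dictionary and its keys are
hypothetical names for the example. Whether the standard library
directory appears verbatim on \code{sys.path} depends on the
installation, so the sketch only records the answer.

\begin{verbatim}
import os
import sys
import sysconfig

def normalize(path):
    return os.path.normcase(os.path.abspath(path))

stdlib_dir = sysconfig.get_path("stdlib")   # standard library location
search_path = [normalize(p) for p in sys.path if p]

report = {
    "executable": sys.executable,
    "prefix": sys.prefix,
    "exec_prefix": sys.exec_prefix,
    "stdlib": stdlib_dir,
    "stdlib_on_path": normalize(stdlib_dir) in search_path,
}
\end{verbatim}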
|
||||
|
||||
The embedding application can steer the search by calling
|
||||
\code{Py_SetProgramName(\var{file})}\ttindex{Py_SetProgramName()} \emph{before} calling
|
||||
\cfunction{Py_Initialize()}. Note that \envvar{PYTHONHOME} still
|
||||
overrides this and \envvar{PYTHONPATH} is still inserted in front of
|
||||
the standard path. An application that requires total control has to
|
||||
provide its own implementation of
|
||||
\cfunction{Py_GetPath()}\ttindex{Py_GetPath()},
|
||||
\cfunction{Py_GetPrefix()}\ttindex{Py_GetPrefix()},
|
||||
\cfunction{Py_GetExecPrefix()}\ttindex{Py_GetExecPrefix()}, and
|
||||
\cfunction{Py_GetProgramFullPath()}\ttindex{Py_GetProgramFullPath()} (all
|
||||
defined in \file{Modules/getpath.c}).
|
||||
|
||||
Sometimes, it is desirable to ``uninitialize'' Python. For instance,
|
||||
the application may want to start over (make another call to
|
||||
\cfunction{Py_Initialize()}) or the application is simply done with its
|
||||
use of Python and wants to free memory allocated by Python. This
|
||||
can be accomplished by calling \cfunction{Py_Finalize()}. The function
|
||||
\cfunction{Py_IsInitialized()}\ttindex{Py_IsInitialized()} returns
|
||||
true if Python is currently in the initialized state. More
|
||||
information about these functions is given in a later chapter.
|
||||
Notice that \cfunction{Py_Finalize()} does \emph{not} free all memory
|
||||
allocated by the Python interpreter, e.g. memory allocated by extension
|
||||
modules currently cannot be released.
|
||||
|
||||
|
||||
\section{Debugging Builds \label{debugging}}
|
||||
|
||||
Python can be built with several macros to enable extra checks of the
|
||||
interpreter and extension modules. These checks tend to add a large
|
||||
amount of overhead to the runtime so they are not enabled by default.
|
||||
|
||||
A full list of the various types of debugging builds is in the file
|
||||
\file{Misc/SpecialBuilds.txt} in the Python source distribution.
|
||||
Builds are available that support tracing of reference counts,
|
||||
debugging the memory allocator, or low-level profiling of the main
|
||||
interpreter loop. Only the most frequently-used builds will be
|
||||
described in the remainder of this section.
|
||||
|
||||
Compiling the interpreter with the \csimplemacro{Py_DEBUG} macro
|
||||
defined produces what is generally meant by ``a debug build'' of Python.
|
||||
\csimplemacro{Py_DEBUG} is enabled in the \UNIX{} build by adding
|
||||
\longprogramopt{with-pydebug} to the \file{configure} command. It is also
|
||||
implied by the presence of the not-Python-specific
|
||||
\csimplemacro{_DEBUG} macro. When \csimplemacro{Py_DEBUG} is enabled
|
||||
in the \UNIX{} build, compiler optimization is disabled.
|
||||
|
||||
In addition to the reference count debugging described below, the
|
||||
following extra checks are performed:
|
||||
|
||||
\begin{itemize}
|
||||
\item Extra checks are added to the object allocator.
|
||||
\item Extra checks are added to the parser and compiler.
|
||||
\item Downcasts from wide types to narrow types are checked for
|
||||
loss of information.
|
||||
\item A number of assertions are added to the dictionary and set
|
||||
implementations. In addition, the set object acquires a
|
||||
\method{test_c_api} method.
|
||||
\item Sanity checks of the input arguments are added to frame
|
||||
creation.
|
||||
\item The storage for long ints is initialized with a known
|
||||
invalid pattern to catch reference to uninitialized
|
||||
digits.
|
||||
\item Low-level tracing and extra exception checking are added
|
||||
to the runtime virtual machine.
|
||||
\item Extra checks are added to the memory arena implementation.
|
||||
\item Extra debugging is added to the thread module.
|
||||
\end{itemize}
|
||||
|
||||
There may be additional checks not mentioned here.
|
||||
|
||||
Defining \csimplemacro{Py_TRACE_REFS} enables reference tracing. When
|
||||
defined, a circular doubly linked list of active objects is maintained
|
||||
by adding two extra fields to every \ctype{PyObject}. Total
|
||||
allocations are tracked as well. Upon exit, all existing references
|
||||
are printed. (In interactive mode this happens after every statement
|
||||
run by the interpreter.) Implied by \csimplemacro{Py_DEBUG}.
|
||||
|
||||
Please refer to \file{Misc/SpecialBuilds.txt} in the Python source
|
||||
distribution for more detailed information.
|
|
@ -1,204 +0,0 @@
|
|||
\chapter{Memory Management \label{memory}}
|
||||
\sectionauthor{Vladimir Marangozov}{Vladimir.Marangozov@inrialpes.fr}
|
||||
|
||||
|
||||
\section{Overview \label{memoryOverview}}
|
||||
|
||||
Memory management in Python involves a private heap containing all
|
||||
Python objects and data structures. The management of this private
|
||||
heap is ensured internally by the \emph{Python memory manager}. The
|
||||
Python memory manager has different components which deal with various
|
||||
dynamic storage management aspects, like sharing, segmentation,
|
||||
preallocation or caching.
|
||||
|
||||
At the lowest level, a raw memory allocator ensures that there is
|
||||
enough room in the private heap for storing all Python-related data
|
||||
by interacting with the memory manager of the operating system. On top
|
||||
of the raw memory allocator, several object-specific allocators
|
||||
operate on the same heap and implement distinct memory management
|
||||
policies adapted to the peculiarities of every object type. For
|
||||
example, integer objects are managed differently within the heap than
|
||||
strings, tuples or dictionaries because integers imply different
|
||||
storage requirements and speed/space tradeoffs. The Python memory
|
||||
manager thus delegates some of the work to the object-specific
|
||||
allocators, but ensures that the latter operate within the bounds of
|
||||
the private heap.
|
||||
|
||||
It is important to understand that the management of the Python heap
|
||||
is performed by the interpreter itself and that the user has no
|
||||
control over it, even if she regularly manipulates object pointers to
|
||||
memory blocks inside that heap. The allocation of heap space for
|
||||
Python objects and other internal buffers is performed on demand by
|
||||
the Python memory manager through the Python/C API functions listed in
|
||||
this document.
|
||||
|
||||
To avoid memory corruption, extension writers should never try to
|
||||
operate on Python objects with the functions exported by the C
|
||||
library: \cfunction{malloc()}\ttindex{malloc()},
|
||||
\cfunction{calloc()}\ttindex{calloc()},
|
||||
\cfunction{realloc()}\ttindex{realloc()} and
|
||||
\cfunction{free()}\ttindex{free()}. This will result in
|
||||
mixed calls between the C allocator and the Python memory manager
|
||||
with fatal consequences, because they implement different algorithms
|
||||
and operate on different heaps. However, one may safely allocate and
|
||||
release memory blocks with the C library allocator for individual
|
||||
purposes, as shown in the following example:
|
||||
|
||||
\begin{verbatim}
|
||||
PyObject *res;
|
||||
char *buf = (char *) malloc(BUFSIZ); /* for I/O */
|
||||
|
||||
if (buf == NULL)
|
||||
return PyErr_NoMemory();
|
||||
/* ...Do some I/O operation involving buf... */
|
||||
res = PyString_FromString(buf);
|
||||
free(buf); /* malloc'ed */
|
||||
return res;
|
||||
\end{verbatim}
|
||||
|
||||
In this example, the memory request for the I/O buffer is handled by
|
||||
the C library allocator. The Python memory manager is involved only
|
||||
in the allocation of the string object returned as a result.
|
||||
|
||||
In most situations, however, it is recommended to allocate memory from
|
||||
the Python heap specifically because the latter is under control of
|
||||
the Python memory manager. For example, this is required when the
|
||||
interpreter is extended with new object types written in C. Another
|
||||
reason for using the Python heap is the desire to \emph{inform} the
|
||||
Python memory manager about the memory needs of the extension module.
|
||||
Even when the requested memory is used exclusively for internal,
|
||||
highly-specific purposes, delegating all memory requests to the Python
|
||||
memory manager causes the interpreter to have a more accurate image of
|
||||
its memory footprint as a whole. Consequently, under certain
|
||||
circumstances, the Python memory manager may or may not trigger
|
||||
appropriate actions, like garbage collection, memory compaction or
|
||||
other preventive procedures. Note that by using the C library
|
||||
allocator as shown in the previous example, the allocated memory for
|
||||
the I/O buffer completely escapes the Python memory manager.
|
||||
|
||||
|
||||
\section{Memory Interface \label{memoryInterface}}
|
||||
|
||||
The following function sets, modeled after the ANSI C standard,
|
||||
but specifying behavior when requesting zero bytes,
|
||||
are available for allocating and releasing memory from the Python heap:
|
||||
|
||||
|
||||
\begin{cfuncdesc}{void*}{PyMem_Malloc}{size_t n}
|
||||
Allocates \var{n} bytes and returns a pointer of type \ctype{void*}
|
||||
to the allocated memory, or \NULL{} if the request fails.
|
||||
Requesting zero bytes returns a distinct non-\NULL{} pointer if
|
||||
possible, as if \cfunction{PyMem_Malloc(1)} had been called instead.
|
||||
The memory will not have been initialized in any way.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void*}{PyMem_Realloc}{void *p, size_t n}
|
||||
Resizes the memory block pointed to by \var{p} to \var{n} bytes.
|
||||
The contents will be unchanged up to the minimum of the old and the new
|
||||
sizes. If \var{p} is \NULL, the call is equivalent to
|
||||
\cfunction{PyMem_Malloc(\var{n})}; else if \var{n} is equal to zero, the
|
||||
memory block is resized but is not freed, and the returned pointer
|
||||
is non-\NULL. Unless \var{p} is \NULL, it must have been
|
||||
returned by a previous call to \cfunction{PyMem_Malloc()} or
|
||||
\cfunction{PyMem_Realloc()}. If the request fails,
|
||||
\cfunction{PyMem_Realloc()} returns \NULL{} and \var{p} remains a
|
||||
valid pointer to the previous memory area.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyMem_Free}{void *p}
|
||||
Frees the memory block pointed to by \var{p}, which must have been
|
||||
returned by a previous call to \cfunction{PyMem_Malloc()} or
|
||||
\cfunction{PyMem_Realloc()}. Otherwise, or if
|
||||
\cfunction{PyMem_Free(p)} has been called before, undefined
|
||||
behavior occurs. If \var{p} is \NULL, no operation is performed.
|
||||
\end{cfuncdesc}
|
||||
|
||||
The following type-oriented macros are provided for convenience. Note
|
||||
that \var{TYPE} refers to any C type.
|
||||
|
||||
\begin{cfuncdesc}{\var{TYPE}*}{PyMem_New}{TYPE, size_t n}
|
||||
Same as \cfunction{PyMem_Malloc()}, but allocates \code{(\var{n} *
|
||||
sizeof(\var{TYPE}))} bytes of memory. Returns a pointer cast to
|
||||
\ctype{\var{TYPE}*}. The memory will not have been initialized in
|
||||
any way.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{\var{TYPE}*}{PyMem_Resize}{void *p, TYPE, size_t n}
|
||||
Same as \cfunction{PyMem_Realloc()}, but the memory block is resized
|
||||
to \code{(\var{n} * sizeof(\var{TYPE}))} bytes. Returns a pointer
|
||||
cast to \ctype{\var{TYPE}*}. On return, \var{p} will be a pointer to
|
||||
the new memory area, or \NULL{} in the event of failure.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{PyMem_Del}{void *p}
|
||||
Same as \cfunction{PyMem_Free()}.
|
||||
\end{cfuncdesc}
|
||||
|
||||
In addition, the following macro sets are provided for calling the
|
||||
Python memory allocator directly, without involving the C API functions
|
||||
listed above. However, note that their use does not preserve binary
|
||||
compatibility across Python versions and is therefore deprecated in
|
||||
extension modules.
|
||||
|
||||
\cfunction{PyMem_MALLOC()}, \cfunction{PyMem_REALLOC()}, \cfunction{PyMem_FREE()}.
|
||||
|
||||
\cfunction{PyMem_NEW()}, \cfunction{PyMem_RESIZE()}, \cfunction{PyMem_DEL()}.
|
||||
|
||||
|
||||
\section{Examples \label{memoryExamples}}
|
||||
|
||||
Here is the example from section \ref{memoryOverview}, rewritten so
|
||||
that the I/O buffer is allocated from the Python heap by using the
|
||||
first function set:
|
||||
|
||||
\begin{verbatim}
|
||||
PyObject *res;
|
||||
char *buf = (char *) PyMem_Malloc(BUFSIZ); /* for I/O */
|
||||
|
||||
if (buf == NULL)
|
||||
return PyErr_NoMemory();
|
||||
/* ...Do some I/O operation involving buf... */
|
||||
res = PyString_FromString(buf);
|
||||
PyMem_Free(buf); /* allocated with PyMem_Malloc */
|
||||
return res;
|
||||
\end{verbatim}
|
||||
|
||||
The same code using the type-oriented function set:
|
||||
|
||||
\begin{verbatim}
|
||||
PyObject *res;
|
||||
char *buf = PyMem_New(char, BUFSIZ); /* for I/O */
|
||||
|
||||
if (buf == NULL)
|
||||
return PyErr_NoMemory();
|
||||
/* ...Do some I/O operation involving buf... */
|
||||
res = PyString_FromString(buf);
|
||||
PyMem_Del(buf); /* allocated with PyMem_New */
|
||||
return res;
|
||||
\end{verbatim}
|
||||
|
||||
Note that in the two examples above, the buffer is always
|
||||
manipulated via functions belonging to the same set. Indeed, it
|
||||
is required to use the same memory API family for a given
|
||||
memory block, so that the risk of mixing different allocators is
|
||||
reduced to a minimum. The following code sequence contains two errors,
|
||||
one of which is labeled as \emph{fatal} because it mixes two different
|
||||
allocators operating on different heaps.
|
||||
|
||||
\begin{verbatim}
|
||||
char *buf1 = PyMem_New(char, BUFSIZ);
|
||||
char *buf2 = (char *) malloc(BUFSIZ);
|
||||
char *buf3 = (char *) PyMem_Malloc(BUFSIZ);
|
||||
/* ... */
|
||||
PyMem_Del(buf3); /* Wrong -- should be PyMem_Free() */
|
||||
free(buf2); /* Right -- allocated via malloc() */
|
||||
free(buf1); /* Fatal -- should be PyMem_Del() */
|
||||
\end{verbatim}
|
||||
|
||||
In addition to the functions aimed at handling raw memory blocks from
|
||||
the Python heap, objects in Python are allocated and released with
|
||||
\cfunction{PyObject_New()}, \cfunction{PyObject_NewVar()} and
|
||||
\cfunction{PyObject_Del()}.
|
||||
|
||||
These will be explained in the next chapter on defining and
|
||||
implementing new object types in C.
|
1780
Doc/api/newtypes.tex
|
@ -1,69 +0,0 @@
|
|||
\chapter{Reference Counting \label{countingRefs}}
|
||||
|
||||
|
||||
The macros in this section are used for managing reference counts
|
||||
of Python objects.
|
||||
|
||||
|
||||
\begin{cfuncdesc}{void}{Py_INCREF}{PyObject *o}
|
||||
Increment the reference count for object \var{o}. The object must
|
||||
not be \NULL; if you aren't sure that it isn't \NULL, use
|
||||
\cfunction{Py_XINCREF()}.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{Py_XINCREF}{PyObject *o}
|
||||
Increment the reference count for object \var{o}. The object may be
|
||||
\NULL, in which case the macro has no effect.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{Py_DECREF}{PyObject *o}
|
||||
Decrement the reference count for object \var{o}. The object must
|
||||
not be \NULL; if you aren't sure that it isn't \NULL, use
|
||||
\cfunction{Py_XDECREF()}. If the reference count reaches zero, the
|
||||
object's type's deallocation function (which must not be \NULL) is
|
||||
invoked.
|
||||
|
||||
\warning{The deallocation function can cause arbitrary Python code
|
||||
to be invoked (e.g. when a class instance with a \method{__del__()}
|
||||
method is deallocated). While exceptions in such code are not
|
||||
propagated, the executed code has free access to all Python global
|
||||
variables. This means that any object that is reachable from a
|
||||
global variable should be in a consistent state before
|
||||
\cfunction{Py_DECREF()} is invoked. For example, code to delete an
|
||||
object from a list should copy a reference to the deleted object in
|
||||
a temporary variable, update the list data structure, and then call
|
||||
\cfunction{Py_DECREF()} for the temporary variable.}
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{Py_XDECREF}{PyObject *o}
|
||||
Decrement the reference count for object \var{o}. The object may be
|
||||
\NULL, in which case the macro has no effect; otherwise the effect
|
||||
is the same as for \cfunction{Py_DECREF()}, and the same warning
|
||||
applies.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{void}{Py_CLEAR}{PyObject *o}
|
||||
Decrement the reference count for object \var{o}. The object may be
|
||||
\NULL, in which case the macro has no effect; otherwise the effect
|
||||
is the same as for \cfunction{Py_DECREF()}, except that the argument
|
||||
is also set to \NULL. The warning for \cfunction{Py_DECREF()} does
|
||||
not apply with respect to the object passed because the macro
|
||||
carefully uses a temporary variable and sets the argument to \NULL
|
||||
before decrementing its reference count.
|
||||
|
||||
It is a good idea to use this macro whenever decrementing the reference
|
||||
count of an object that might be traversed during garbage collection.
|
||||
|
||||
\versionadded{2.4}
|
||||
\end{cfuncdesc}
|
||||
|
||||
|
||||
The following functions are for runtime dynamic embedding of Python:
|
||||
\cfunction{Py_IncRef(PyObject *o)}, \cfunction{Py_DecRef(PyObject *o)}.
|
||||
They are simply exported function versions of \cfunction{Py_XINCREF()} and
|
||||
\cfunction{Py_XDECREF()}, respectively.
|
||||
|
||||
The following functions or macros are only for use within the
|
||||
interpreter core: \cfunction{_Py_Dealloc()},
|
||||
\cfunction{_Py_ForgetReference()}, \cfunction{_Py_NewReference()}, as
|
||||
well as the global variable \cdata{_Py_RefTotal}.
|
|
@ -1,287 +0,0 @@
|
|||
\chapter{The Very High Level Layer \label{veryhigh}}
|
||||
|
||||
|
||||
The functions in this chapter will let you execute Python source code
|
||||
given in a file or a buffer, but they will not let you interact in a
|
||||
more detailed way with the interpreter.
|
||||
|
||||
Several of these functions accept a start symbol from the grammar as a
|
||||
parameter. The available start symbols are \constant{Py_eval_input},
|
||||
\constant{Py_file_input}, and \constant{Py_single_input}. These are
|
||||
described following the functions which accept them as parameters.
|
||||
|
||||
Note also that several of these functions take \ctype{FILE*}
|
||||
parameters. One particular issue which needs to be handled carefully
|
||||
is that the \ctype{FILE} structure for different C libraries can be
|
||||
different and incompatible. Under Windows (at least), it is possible
|
||||
for dynamically linked extensions to actually use different libraries,
|
||||
so care should be taken that \ctype{FILE*} parameters are only passed
|
||||
to these functions if it is certain that they were created by the same
|
||||
library that the Python runtime is using.
|
||||
|
||||
|
||||
\begin{cfuncdesc}{int}{Py_Main}{int argc, char **argv}
|
||||
The main program for the standard interpreter. This is made
|
||||
available for programs which embed Python. The \var{argc} and
|
||||
\var{argv} parameters should be prepared exactly as those which are
|
||||
passed to a C program's \cfunction{main()} function. It is
|
||||
important to note that the argument list may be modified (but the
|
||||
contents of the strings pointed to by the argument list are not).
|
||||
The return value will be the integer passed to the
|
||||
\function{sys.exit()} function, \code{1} if the interpreter exits
|
||||
due to an exception, or \code{2} if the parameter list does not
|
||||
represent a valid Python command line.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyRun_AnyFile}{FILE *fp, const char *filename}
|
||||
This is a simplified interface to \cfunction{PyRun_AnyFileExFlags()}
|
||||
below, leaving \var{closeit} set to \code{0} and \var{flags} set to \NULL.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyRun_AnyFileFlags}{FILE *fp, const char *filename,
|
||||
PyCompilerFlags *flags}
|
||||
This is a simplified interface to \cfunction{PyRun_AnyFileExFlags()}
|
||||
below, leaving the \var{closeit} argument set to \code{0}.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyRun_AnyFileEx}{FILE *fp, const char *filename,
|
||||
int closeit}
|
||||
This is a simplified interface to \cfunction{PyRun_AnyFileExFlags()}
|
||||
below, leaving the \var{flags} argument set to \NULL.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyRun_AnyFileExFlags}{FILE *fp, const char *filename,
|
||||
int closeit,
|
||||
PyCompilerFlags *flags}
|
||||
If \var{fp} refers to a file associated with an interactive device
|
||||
(console or terminal input or \UNIX{} pseudo-terminal), return the
|
||||
value of \cfunction{PyRun_InteractiveLoop()}, otherwise return the
|
||||
result of \cfunction{PyRun_SimpleFile()}. If \var{filename} is
|
||||
\NULL, this function uses \code{"???"} as the filename.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyRun_SimpleString}{const char *command}
|
||||
This is a simplified interface to \cfunction{PyRun_SimpleStringFlags()}
|
||||
below, leaving the \var{flags} argument set to \NULL.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyRun_SimpleStringFlags}{const char *command,
|
||||
PyCompilerFlags *flags}
|
||||
Executes the Python source code from \var{command} in the
|
||||
\module{__main__} module according to the \var{flags} argument.
|
||||
If \module{__main__} does not already exist, it is created. Returns
|
||||
\code{0} on success or \code{-1} if an exception was raised. If there
|
||||
was an error, there is no way to get the exception information.
|
||||
For the meaning of \var{flags}, see below.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyRun_SimpleFile}{FILE *fp, const char *filename}
|
||||
This is a simplified interface to \cfunction{PyRun_SimpleFileExFlags()}
|
||||
below, leaving \var{closeit} set to \code{0} and \var{flags} set to
|
||||
\NULL.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyRun_SimpleFileFlags}{FILE *fp, const char *filename,
|
||||
PyCompilerFlags *flags}
|
||||
This is a simplified interface to \cfunction{PyRun_SimpleFileExFlags()}
|
||||
below, leaving \var{closeit} set to \code{0}.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyRun_SimpleFileEx}{FILE *fp, const char *filename,
|
||||
int closeit}
|
||||
This is a simplified interface to \cfunction{PyRun_SimpleFileExFlags()}
|
||||
below, leaving \var{flags} set to \NULL.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyRun_SimpleFileExFlags}{FILE *fp, const char *filename,
|
||||
int closeit,
|
||||
PyCompilerFlags *flags}
|
||||
Similar to \cfunction{PyRun_SimpleStringFlags()}, but the Python source
|
||||
code is read from \var{fp} instead of an in-memory string.
|
||||
\var{filename} should be the name of the file. If \var{closeit} is
|
||||
true, the file is closed before \cfunction{PyRun_SimpleFileExFlags()} returns.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyRun_InteractiveOne}{FILE *fp, const char *filename}
|
||||
This is a simplified interface to \cfunction{PyRun_InteractiveOneFlags()}
|
||||
below, leaving \var{flags} set to \NULL.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyRun_InteractiveOneFlags}{FILE *fp,
|
||||
const char *filename,
|
||||
PyCompilerFlags *flags}
|
||||
Read and execute a single statement from a file associated with an
|
||||
interactive device according to the \var{flags} argument. If
|
||||
\var{filename} is \NULL, \code{"???"} is used instead. The user will
|
||||
be prompted using \code{sys.ps1} and \code{sys.ps2}. Returns \code{0}
|
||||
when the input was executed successfully, \code{-1} if there was an
|
||||
exception, or an error code from the \file{errcode.h} include file
|
||||
distributed as part of Python if there was a parse error. (Note that
|
||||
\file{errcode.h} is not included by \file{Python.h}, so must be included
|
||||
specifically if needed.)
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyRun_InteractiveLoop}{FILE *fp, const char *filename}
|
||||
This is a simplified interface to \cfunction{PyRun_InteractiveLoopFlags()}
|
||||
below, leaving \var{flags} set to \NULL.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{int}{PyRun_InteractiveLoopFlags}{FILE *fp,
|
||||
const char *filename,
|
||||
PyCompilerFlags *flags}
|
||||
Read and execute statements from a file associated with an
|
||||
interactive device until \EOF{} is reached. If \var{filename} is
|
||||
\NULL, \code{"???"} is used instead. The user will be prompted
|
||||
using \code{sys.ps1} and \code{sys.ps2}. Returns \code{0} at \EOF.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{struct _node*}{PyParser_SimpleParseString}{const char *str,
|
||||
int start}
|
||||
This is a simplified interface to
|
||||
\cfunction{PyParser_SimpleParseStringFlagsFilename()} below, leaving
|
||||
\var{filename} set to \NULL{} and \var{flags} set to \code{0}.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{struct _node*}{PyParser_SimpleParseStringFlags}{
|
||||
const char *str, int start, int flags}
|
||||
This is a simplified interface to
|
||||
\cfunction{PyParser_SimpleParseStringFlagsFilename()} below, leaving
|
||||
\var{filename} set to \NULL.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{struct _node*}{PyParser_SimpleParseStringFlagsFilename}{
|
||||
const char *str, const char *filename,
|
||||
int start, int flags}
|
||||
Parse Python source code from \var{str} using the start token
|
||||
\var{start} according to the \var{flags} argument. The result can
|
||||
be used to create a code object which can be evaluated efficiently.
|
||||
This is useful if a code fragment must be evaluated many times.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{struct _node*}{PyParser_SimpleParseFile}{FILE *fp,
|
||||
const char *filename, int start}
|
||||
This is a simplified interface to \cfunction{PyParser_SimpleParseFileFlags()}
|
||||
below, leaving \var{flags} set to \code{0}.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{struct _node*}{PyParser_SimpleParseFileFlags}{FILE *fp,
|
||||
const char *filename, int start, int flags}
|
||||
Similar to \cfunction{PyParser_SimpleParseStringFlagsFilename()}, but
|
||||
the Python source code is read from \var{fp} instead of an in-memory
|
||||
string.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyRun_String}{const char *str, int start,
|
||||
PyObject *globals,
|
||||
PyObject *locals}
|
||||
This is a simplified interface to \cfunction{PyRun_StringFlags()} below,
|
||||
leaving \var{flags} set to \NULL.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyRun_StringFlags}{const char *str, int start,
|
||||
PyObject *globals,
|
||||
PyObject *locals,
|
||||
PyCompilerFlags *flags}
|
||||
Execute Python source code from \var{str} in the context specified
|
||||
by the dictionaries \var{globals} and \var{locals} with the compiler
|
||||
flags specified by \var{flags}. The parameter \var{start} specifies
|
||||
the start token that should be used to parse the source code.
|
||||
|
||||
Returns the result of executing the code as a Python object, or
|
||||
\NULL{} if an exception was raised.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyRun_File}{FILE *fp, const char *filename,
|
||||
int start, PyObject *globals,
|
||||
PyObject *locals}
|
||||
This is a simplified interface to \cfunction{PyRun_FileExFlags()} below,
|
||||
leaving \var{closeit} set to \code{0} and \var{flags} set to \NULL.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyRun_FileEx}{FILE *fp, const char *filename,
|
||||
int start, PyObject *globals,
|
||||
PyObject *locals, int closeit}
|
||||
This is a simplified interface to \cfunction{PyRun_FileExFlags()} below,
|
||||
leaving \var{flags} set to \NULL.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyRun_FileFlags}{FILE *fp, const char *filename,
|
||||
int start, PyObject *globals,
|
||||
PyObject *locals,
|
||||
PyCompilerFlags *flags}
|
||||
This is a simplified interface to \cfunction{PyRun_FileExFlags()} below,
|
||||
leaving \var{closeit} set to \code{0}.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{PyRun_FileExFlags}{FILE *fp, const char *filename,
|
||||
int start, PyObject *globals,
|
||||
PyObject *locals, int closeit,
|
||||
PyCompilerFlags *flags}
|
||||
Similar to \cfunction{PyRun_StringFlags()}, but the Python source code is
|
||||
read from \var{fp} instead of an in-memory string.
|
||||
\var{filename} should be the name of the file.
|
||||
If \var{closeit} is true, the file is closed before
|
||||
\cfunction{PyRun_FileExFlags()} returns.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{Py_CompileString}{const char *str,
|
||||
const char *filename,
|
||||
int start}
|
||||
This is a simplified interface to \cfunction{Py_CompileStringFlags()} below,
|
||||
leaving \var{flags} set to \NULL.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cfuncdesc}{PyObject*}{Py_CompileStringFlags}{const char *str,
|
||||
const char *filename,
|
||||
int start,
|
||||
PyCompilerFlags *flags}
|
||||
Parse and compile the Python source code in \var{str}, returning the
|
||||
resulting code object. The start token is given by \var{start};
|
||||
this can be used to constrain the code which can be compiled and should
|
||||
be \constant{Py_eval_input}, \constant{Py_file_input}, or
|
||||
\constant{Py_single_input}. The filename specified by
|
||||
\var{filename} is used to construct the code object and may appear
|
||||
in tracebacks or \exception{SyntaxError} exception messages. This
|
||||
returns \NULL{} if the code cannot be parsed or compiled.
|
||||
\end{cfuncdesc}
|
||||
|
||||
\begin{cvardesc}{int}{Py_eval_input}
|
||||
The start symbol from the Python grammar for isolated expressions;
|
||||
for use with
|
||||
\cfunction{Py_CompileString()}\ttindex{Py_CompileString()}.
|
||||
\end{cvardesc}
|
||||
|
||||
\begin{cvardesc}{int}{Py_file_input}
|
||||
The start symbol from the Python grammar for sequences of statements
|
||||
as read from a file or other source; for use with
|
||||
\cfunction{Py_CompileString()}\ttindex{Py_CompileString()}. This is
|
||||
the symbol to use when compiling arbitrarily long Python source code.
|
||||
\end{cvardesc}
|
||||
|
||||
\begin{cvardesc}{int}{Py_single_input}
|
||||
The start symbol from the Python grammar for a single statement; for
|
||||
use with \cfunction{Py_CompileString()}\ttindex{Py_CompileString()}.
|
||||
This is the symbol used for the interactive interpreter loop.
|
||||
\end{cvardesc}
|
||||
|
||||
\begin{ctypedesc}[PyCompilerFlags]{struct PyCompilerFlags}
|
||||
This is the structure used to hold compiler flags. In cases where
|
||||
code is only being compiled, it is passed as \code{int flags}, and in
|
||||
cases where code is being executed, it is passed as
|
||||
\code{PyCompilerFlags *flags}. In this case, \code{from __future__
|
||||
import} can modify \var{flags}.
|
||||
|
||||
Whenever \code{PyCompilerFlags *flags} is \NULL, \member{cf_flags}
|
||||
is treated as equal to \code{0}, and any modification due to
|
||||
\code{from __future__ import} is discarded.
|
||||
\begin{verbatim}
|
||||
struct PyCompilerFlags {
|
||||
int cf_flags;
|
||||
};
|
||||
\end{verbatim}
|
||||
\end{ctypedesc}
|
||||
|
||||
\begin{cvardesc}{int}{CO_FUTURE_DIVISION}
|
||||
This bit can be set in \var{flags} to cause division operator \code{/}
|
||||
to be interpreted as ``true division'' according to \pep{238}.
|
||||
\end{cvardesc}
|
|
@ -1,9 +0,0 @@
|
|||
\author{Guido van Rossum\\
|
||||
Fred L. Drake, Jr., editor}
|
||||
\authoraddress{
|
||||
\strong{Python Software Foundation}\\
|
||||
Email: \email{docs@python.org}
|
||||
}
|
||||
|
||||
\date{\today} % XXX update before final release!
|
||||
\input{patchlevel} % include Python version information
|
|
@ -1,14 +0,0 @@
|
|||
Copyright \copyright{} 2001-2007 Python Software Foundation.
|
||||
All rights reserved.
|
||||
|
||||
Copyright \copyright{} 2000 BeOpen.com.
|
||||
All rights reserved.
|
||||
|
||||
Copyright \copyright{} 1995-2000 Corporation for National Research Initiatives.
|
||||
All rights reserved.
|
||||
|
||||
Copyright \copyright{} 1991-1995 Stichting Mathematisch Centrum.
|
||||
All rights reserved.
|
||||
|
||||
See the end of this document for complete license and permissions
|
||||
information.
|
|
@ -1,674 +0,0 @@
|
|||
\section{History of the software}
|
||||
|
||||
Python was created in the early 1990s by Guido van Rossum at Stichting
|
||||
Mathematisch Centrum (CWI, see \url{http://www.cwi.nl/}) in the Netherlands
|
||||
as a successor of a language called ABC. Guido remains Python's
|
||||
principal author, although it includes many contributions from others.
|
||||
|
||||
In 1995, Guido continued his work on Python at the Corporation for
|
||||
National Research Initiatives (CNRI, see \url{http://www.cnri.reston.va.us/})
|
||||
in Reston, Virginia where he released several versions of the
|
||||
software.
|
||||
|
||||
In May 2000, Guido and the Python core development team moved to
|
||||
BeOpen.com to form the BeOpen PythonLabs team. In October of the same
|
||||
year, the PythonLabs team moved to Digital Creations (now Zope
|
||||
Corporation; see \url{http://www.zope.com/}). In 2001, the Python
|
||||
Software Foundation (PSF, see \url{http://www.python.org/psf/}) was
|
||||
formed, a non-profit organization created specifically to own
|
||||
Python-related Intellectual Property. Zope Corporation is a
|
||||
sponsoring member of the PSF.
|
||||
|
||||
All Python releases are Open Source (see
|
||||
\url{http://www.opensource.org/} for the Open Source Definition).
|
||||
Historically, most, but not all, Python releases have also been
|
||||
GPL-compatible; the table below summarizes the various releases.
|
||||
|
||||
\begin{tablev}{c|c|c|c|c}{textrm}%
|
||||
{Release}{Derived from}{Year}{Owner}{GPL compatible?}
|
||||
\linev{0.9.0 thru 1.2}{n/a}{1991-1995}{CWI}{yes}
|
||||
\linev{1.3 thru 1.5.2}{1.2}{1995-1999}{CNRI}{yes}
|
||||
\linev{1.6}{1.5.2}{2000}{CNRI}{no}
|
||||
\linev{2.0}{1.6}{2000}{BeOpen.com}{no}
|
||||
\linev{1.6.1}{1.6}{2001}{CNRI}{no}
|
||||
\linev{2.1}{2.0+1.6.1}{2001}{PSF}{no}
|
||||
\linev{2.0.1}{2.0+1.6.1}{2001}{PSF}{yes}
|
||||
\linev{2.1.1}{2.1+2.0.1}{2001}{PSF}{yes}
|
||||
\linev{2.2}{2.1.1}{2001}{PSF}{yes}
|
||||
\linev{2.1.2}{2.1.1}{2002}{PSF}{yes}
|
||||
\linev{2.1.3}{2.1.2}{2002}{PSF}{yes}
|
||||
\linev{2.2.1}{2.2}{2002}{PSF}{yes}
|
||||
\linev{2.2.2}{2.2.1}{2002}{PSF}{yes}
|
||||
\linev{2.2.3}{2.2.2}{2002-2003}{PSF}{yes}
|
||||
\linev{2.3}{2.2.2}{2002-2003}{PSF}{yes}
|
||||
\linev{2.3.1}{2.3}{2002-2003}{PSF}{yes}
|
||||
\linev{2.3.2}{2.3.1}{2003}{PSF}{yes}
|
||||
\linev{2.3.3}{2.3.2}{2003}{PSF}{yes}
|
||||
\linev{2.3.4}{2.3.3}{2004}{PSF}{yes}
|
||||
\linev{2.3.5}{2.3.4}{2005}{PSF}{yes}
|
||||
\linev{2.4}{2.3}{2004}{PSF}{yes}
|
||||
\linev{2.4.1}{2.4}{2005}{PSF}{yes}
|
||||
\linev{2.4.2}{2.4.1}{2005}{PSF}{yes}
|
||||
\linev{2.4.3}{2.4.2}{2006}{PSF}{yes}
|
||||
\linev{2.4.4}{2.4.3}{2006}{PSF}{yes}
|
||||
\linev{2.5}{2.4}{2006}{PSF}{yes}
|
||||
\linev{2.5.1}{2.5}{2007}{PSF}{yes}
|
||||
\end{tablev}
|
||||
|
||||
\note{GPL-compatible doesn't mean that we're distributing
|
||||
Python under the GPL. All Python licenses, unlike the GPL, let you
|
||||
distribute a modified version without making your changes open source.
|
||||
The GPL-compatible licenses make it possible to combine Python with
|
||||
other software that is released under the GPL; the others don't.}
|
||||
|
||||
Thanks to the many outside volunteers who have worked under Guido's
|
||||
direction to make these releases possible.
|
||||
|
||||
|
||||
\section{Terms and conditions for accessing or otherwise using Python}
|
||||
|
||||
\centerline{\strong{PSF LICENSE AGREEMENT FOR PYTHON \version}}
|
||||
|
||||
\begin{enumerate}
|
||||
\item
|
||||
This LICENSE AGREEMENT is between the Python Software Foundation
|
||||
(``PSF''), and the Individual or Organization (``Licensee'') accessing
|
||||
and otherwise using Python \version{} software in source or binary
|
||||
form and its associated documentation.
|
||||
|
||||
\item
|
||||
Subject to the terms and conditions of this License Agreement, PSF
|
||||
hereby grants Licensee a nonexclusive, royalty-free, world-wide
|
||||
license to reproduce, analyze, test, perform and/or display publicly,
|
||||
prepare derivative works, distribute, and otherwise use Python
|
||||
\version{} alone or in any derivative version, provided, however, that
|
||||
PSF's License Agreement and PSF's notice of copyright, i.e.,
|
||||
``Copyright \copyright{} 2001-2007 Python Software Foundation; All
|
||||
Rights Reserved'' are retained in Python \version{} alone or in any
|
||||
derivative version prepared by Licensee.
|
||||
|
||||
\item
|
||||
In the event Licensee prepares a derivative work that is based on
|
||||
or incorporates Python \version{} or any part thereof, and wants to
|
||||
make the derivative work available to others as provided herein, then
|
||||
Licensee hereby agrees to include in any such work a brief summary of
|
||||
the changes made to Python \version.
|
||||
|
||||
\item
|
||||
PSF is making Python \version{} available to Licensee on an ``AS IS''
|
||||
basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
|
||||
IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND
|
||||
DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
|
||||
FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON \version{} WILL
|
||||
NOT INFRINGE ANY THIRD PARTY RIGHTS.
|
||||
|
||||
\item
|
||||
PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON
|
||||
\version{} FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR
|
||||
LOSS AS A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON
|
||||
\version, OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE
|
||||
POSSIBILITY THEREOF.
|
||||
|
||||
\item
|
||||
This License Agreement will automatically terminate upon a material
|
||||
breach of its terms and conditions.
|
||||
|
||||
\item
|
||||
Nothing in this License Agreement shall be deemed to create any
|
||||
relationship of agency, partnership, or joint venture between PSF and
|
||||
Licensee. This License Agreement does not grant permission to use PSF
|
||||
trademarks or trade name in a trademark sense to endorse or promote
|
||||
products or services of Licensee, or any third party.
|
||||
|
||||
\item
|
||||
By copying, installing or otherwise using Python \version, Licensee
|
||||
agrees to be bound by the terms and conditions of this License
|
||||
Agreement.
|
||||
\end{enumerate}
|
||||
|
||||
|
||||
\centerline{\strong{BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0}}
|
||||
|
||||
\centerline{\strong{BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1}}
|
||||
|
||||
\begin{enumerate}
|
||||
\item
|
||||
This LICENSE AGREEMENT is between BeOpen.com (``BeOpen''), having an
|
||||
office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the
|
||||
Individual or Organization (``Licensee'') accessing and otherwise
|
||||
using this software in source or binary form and its associated
|
||||
documentation (``the Software'').
|
||||
|
||||
\item
|
||||
Subject to the terms and conditions of this BeOpen Python License
|
||||
Agreement, BeOpen hereby grants Licensee a non-exclusive,
|
||||
royalty-free, world-wide license to reproduce, analyze, test, perform
|
||||
and/or display publicly, prepare derivative works, distribute, and
|
||||
otherwise use the Software alone or in any derivative version,
|
||||
provided, however, that the BeOpen Python License is retained in the
|
||||
Software, alone or in any derivative version prepared by Licensee.
|
||||
|
||||
\item
|
||||
BeOpen is making the Software available to Licensee on an ``AS IS''
|
||||
basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
|
||||
IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND
|
||||
DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
|
||||
FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT
|
||||
INFRINGE ANY THIRD PARTY RIGHTS.
|
||||
|
||||
\item
|
||||
BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE
|
||||
SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS
|
||||
AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY
|
||||
DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
|
||||
|
||||
\item
|
||||
This License Agreement will automatically terminate upon a material
|
||||
breach of its terms and conditions.
|
||||
|
||||
\item
|
||||
This License Agreement shall be governed by and interpreted in all
|
||||
respects by the law of the State of California, excluding conflict of
|
||||
law provisions. Nothing in this License Agreement shall be deemed to
|
||||
create any relationship of agency, partnership, or joint venture
|
||||
between BeOpen and Licensee. This License Agreement does not grant
|
||||
permission to use BeOpen trademarks or trade names in a trademark
|
||||
sense to endorse or promote products or services of Licensee, or any
|
||||
third party. As an exception, the ``BeOpen Python'' logos available
|
||||
at http://www.pythonlabs.com/logos.html may be used according to the
|
||||
permissions granted on that web page.
|
||||
|
||||
\item
|
||||
By copying, installing or otherwise using the software, Licensee
|
||||
agrees to be bound by the terms and conditions of this License
|
||||
Agreement.
|
||||
\end{enumerate}
|
||||
|
||||
|
||||
\centerline{\strong{CNRI LICENSE AGREEMENT FOR PYTHON 1.6.1}}
|
||||
|
||||
\begin{enumerate}
|
||||
\item
|
||||
This LICENSE AGREEMENT is between the Corporation for National
|
||||
Research Initiatives, having an office at 1895 Preston White Drive,
|
||||
Reston, VA 20191 (``CNRI''), and the Individual or Organization
|
||||
(``Licensee'') accessing and otherwise using Python 1.6.1 software in
|
||||
source or binary form and its associated documentation.
|
||||
|
||||
\item
|
||||
Subject to the terms and conditions of this License Agreement, CNRI
|
||||
hereby grants Licensee a nonexclusive, royalty-free, world-wide
|
||||
license to reproduce, analyze, test, perform and/or display publicly,
|
||||
prepare derivative works, distribute, and otherwise use Python 1.6.1
|
||||
alone or in any derivative version, provided, however, that CNRI's
|
||||
License Agreement and CNRI's notice of copyright, i.e., ``Copyright
|
||||
\copyright{} 1995-2001 Corporation for National Research Initiatives;
|
||||
All Rights Reserved'' are retained in Python 1.6.1 alone or in any
|
||||
derivative version prepared by Licensee. Alternately, in lieu of
|
||||
CNRI's License Agreement, Licensee may substitute the following text
|
||||
(omitting the quotes): ``Python 1.6.1 is made available subject to the
|
||||
terms and conditions in CNRI's License Agreement. This Agreement
|
||||
together with Python 1.6.1 may be located on the Internet using the
|
||||
following unique, persistent identifier (known as a handle):
|
||||
1895.22/1013. This Agreement may also be obtained from a proxy server
|
||||
on the Internet using the following URL:
|
||||
\url{http://hdl.handle.net/1895.22/1013}.''
|
||||
|
||||
\item
|
||||
In the event Licensee prepares a derivative work that is based on
|
||||
or incorporates Python 1.6.1 or any part thereof, and wants to make
|
||||
the derivative work available to others as provided herein, then
|
||||
Licensee hereby agrees to include in any such work a brief summary of
|
||||
the changes made to Python 1.6.1.
|
||||
|
||||
\item
|
||||
CNRI is making Python 1.6.1 available to Licensee on an ``AS IS''
|
||||
basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
|
||||
IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND
|
||||
DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
|
||||
FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT
|
||||
INFRINGE ANY THIRD PARTY RIGHTS.
|
||||
|
||||
\item
|
||||
CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON
|
||||
1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS
|
||||
A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1,
|
||||
OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
|
||||
|
||||
\item
|
||||
This License Agreement will automatically terminate upon a material
|
||||
breach of its terms and conditions.
|
||||
|
||||
\item
|
||||
This License Agreement shall be governed by the federal
|
||||
intellectual property law of the United States, including without
|
||||
limitation the federal copyright law, and, to the extent such
|
||||
U.S. federal law does not apply, by the law of the Commonwealth of
|
||||
Virginia, excluding Virginia's conflict of law provisions.
|
||||
Notwithstanding the foregoing, with regard to derivative works based
|
||||
on Python 1.6.1 that incorporate non-separable material that was
|
||||
previously distributed under the GNU General Public License (GPL), the
|
||||
law of the Commonwealth of Virginia shall govern this License
|
||||
Agreement only as to issues arising under or with respect to
|
||||
Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this
|
||||
License Agreement shall be deemed to create any relationship of
|
||||
agency, partnership, or joint venture between CNRI and Licensee. This
|
||||
License Agreement does not grant permission to use CNRI trademarks or
|
||||
trade name in a trademark sense to endorse or promote products or
|
||||
services of Licensee, or any third party.
|
||||
|
||||
\item
|
||||
By clicking on the ``ACCEPT'' button where indicated, or by copying,
|
||||
installing or otherwise using Python 1.6.1, Licensee agrees to be
|
||||
bound by the terms and conditions of this License Agreement.
|
||||
\end{enumerate}
|
||||
|
||||
\centerline{ACCEPT}
|
||||
|
||||
|
||||
|
||||
\centerline{\strong{CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2}}
|
||||
|
||||
Copyright \copyright{} 1991 - 1995, Stichting Mathematisch Centrum
|
||||
Amsterdam, The Netherlands. All rights reserved.
|
||||
|
||||
Permission to use, copy, modify, and distribute this software and its
|
||||
documentation for any purpose and without fee is hereby granted,
|
||||
provided that the above copyright notice appear in all copies and that
|
||||
both that copyright notice and this permission notice appear in
|
||||
supporting documentation, and that the name of Stichting Mathematisch
|
||||
Centrum or CWI not be used in advertising or publicity pertaining to
|
||||
distribution of the software without specific, written prior
|
||||
permission.
|
||||
|
||||
STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO
|
||||
THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
|
||||
FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE
|
||||
FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
|
||||
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
|
||||
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
|
||||
OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
|
||||
|
||||
|
||||
\section{Licenses and Acknowledgements for Incorporated Software}
|
||||
|
||||
This section is an incomplete but growing list of licenses and
|
||||
acknowledgements for third-party software incorporated in the
|
||||
Python distribution.
|
||||
|
||||
|
||||
\subsection{Mersenne Twister}
|
||||
|
||||
The \module{_random} module includes code based on a download from
|
||||
\url{http://www.math.keio.ac.jp/~matumoto/MT2002/emt19937ar.html}.
|
||||
The following are the verbatim comments from the original code:
|
||||
|
||||
\begin{verbatim}
|
||||
A C-program for MT19937, with initialization improved 2002/1/26.
|
||||
Coded by Takuji Nishimura and Makoto Matsumoto.
|
||||
|
||||
Before using, initialize the state by using init_genrand(seed)
|
||||
or init_by_array(init_key, key_length).
|
||||
|
||||
Copyright (C) 1997 - 2002, Makoto Matsumoto and Takuji Nishimura,
|
||||
All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
2. Redistributions in binary form must reproduce the above copyright
|
||||
notice, this list of conditions and the following disclaimer in the
|
||||
documentation and/or other materials provided with the distribution.
|
||||
|
||||
3. The names of its contributors may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
|
||||
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
|
||||
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
|
||||
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
|
||||
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
|
||||
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
|
||||
Any feedback is very welcome.
|
||||
http://www.math.keio.ac.jp/matumoto/emt.html
|
||||
email: matumoto@math.keio.ac.jp
|
||||
\end{verbatim}
|
||||
|
||||
|
||||
|
||||
\subsection{Sockets}
|
||||
|
||||
The \module{socket} module uses the functions \function{getaddrinfo}
|
||||
and \function{getnameinfo}, which are coded in separate source files
|
||||
from the WIDE Project, \url{http://www.wide.ad.jp/about/index.html}.
|
||||
|
||||
\begin{verbatim}
|
||||
Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project.
|
||||
All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
2. Redistributions in binary form must reproduce the above copyright
|
||||
notice, this list of conditions and the following disclaimer in the
|
||||
documentation and/or other materials provided with the distribution.
|
||||
3. Neither the name of the project nor the names of its contributors
|
||||
may be used to endorse or promote products derived from this software
|
||||
without specific prior written permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND
|
||||
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE
|
||||
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
||||
OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
||||
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
||||
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
||||
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
||||
SUCH DAMAGE.
|
||||
\end{verbatim}
|
||||
|
||||
|
||||
|
||||
\subsection{Floating point exception control}
|
||||
|
||||
The source for the \module{fpectl} module includes the following notice:
|
||||
|
||||
\begin{verbatim}
|
||||
---------------------------------------------------------------------
|
||||
/ Copyright (c) 1996. \
|
||||
| The Regents of the University of California. |
|
||||
| All rights reserved. |
|
||||
| |
|
||||
| Permission to use, copy, modify, and distribute this software for |
|
||||
| any purpose without fee is hereby granted, provided that this en- |
|
||||
| tire notice is included in all copies of any software which is or |
|
||||
| includes a copy or modification of this software and in all |
|
||||
| copies of the supporting documentation for such software. |
|
||||
| |
|
||||
| This work was produced at the University of California, Lawrence |
|
||||
| Livermore National Laboratory under contract no. W-7405-ENG-48 |
|
||||
| between the U.S. Department of Energy and The Regents of the |
|
||||
| University of California for the operation of UC LLNL. |
|
||||
| |
|
||||
| DISCLAIMER |
|
||||
| |
|
||||
| This software was prepared as an account of work sponsored by an |
|
||||
| agency of the United States Government. Neither the United States |
|
||||
| Government nor the University of California nor any of their em- |
|
||||
| ployees, makes any warranty, express or implied, or assumes any |
|
||||
| liability or responsibility for the accuracy, completeness, or |
|
||||
| usefulness of any information, apparatus, product, or process |
|
||||
| disclosed, or represents that its use would not infringe |
|
||||
| privately-owned rights. Reference herein to any specific commer- |
|
||||
| cial products, process, or service by trade name, trademark, |
|
||||
| manufacturer, or otherwise, does not necessarily constitute or |
|
||||
| imply its endorsement, recommendation, or favoring by the United |
|
||||
| States Government or the University of California. The views and |
|
||||
| opinions of authors expressed herein do not necessarily state or |
|
||||
| reflect those of the United States Government or the University |
|
||||
| of California, and shall not be used for advertising or product |
|
||||
\ endorsement purposes. /
|
||||
---------------------------------------------------------------------
|
||||
\end{verbatim}
|
||||
|
||||
|
||||
|
||||
\subsection{MD5 message digest algorithm}
|
||||
|
||||
The source code for the \module{md5} module contains the following notice:
|
||||
|
||||
\begin{verbatim}
|
||||
Copyright (C) 1999, 2002 Aladdin Enterprises. All rights reserved.
|
||||
|
||||
This software is provided 'as-is', without any express or implied
|
||||
warranty. In no event will the authors be held liable for any damages
|
||||
arising from the use of this software.
|
||||
|
||||
Permission is granted to anyone to use this software for any purpose,
|
||||
including commercial applications, and to alter it and redistribute it
|
||||
freely, subject to the following restrictions:
|
||||
|
||||
1. The origin of this software must not be misrepresented; you must not
|
||||
claim that you wrote the original software. If you use this software
|
||||
in a product, an acknowledgment in the product documentation would be
|
||||
appreciated but is not required.
|
||||
2. Altered source versions must be plainly marked as such, and must not be
|
||||
misrepresented as being the original software.
|
||||
3. This notice may not be removed or altered from any source distribution.
|
||||
|
||||
L. Peter Deutsch
|
||||
ghost@aladdin.com
|
||||
|
||||
Independent implementation of MD5 (RFC 1321).
|
||||
|
||||
This code implements the MD5 Algorithm defined in RFC 1321, whose
|
||||
text is available at
|
||||
http://www.ietf.org/rfc/rfc1321.txt
|
||||
The code is derived from the text of the RFC, including the test suite
|
||||
(section A.5) but excluding the rest of Appendix A. It does not include
|
||||
any code or documentation that is identified in the RFC as being
|
||||
copyrighted.
|
||||
|
||||
The original and principal author of md5.h is L. Peter Deutsch
|
||||
<ghost@aladdin.com>. Other authors are noted in the change history
|
||||
that follows (in reverse chronological order):
|
||||
|
||||
2002-04-13 lpd Removed support for non-ANSI compilers; removed
|
||||
references to Ghostscript; clarified derivation from RFC 1321;
|
||||
now handles byte order either statically or dynamically.
|
||||
1999-11-04 lpd Edited comments slightly for automatic TOC extraction.
|
||||
1999-10-18 lpd Fixed typo in header comment (ansi2knr rather than md5);
|
||||
added conditionalization for C++ compilation from Martin
|
||||
Purschke <purschke@bnl.gov>.
|
||||
1999-05-03 lpd Original version.
|
||||
\end{verbatim}
|
||||
|
||||
|
||||
|
||||
\subsection{Asynchronous socket services}
|
||||
|
||||
The \module{asynchat} and \module{asyncore} modules contain the
|
||||
following notice:
|
||||
|
||||
\begin{verbatim}
|
||||
Copyright 1996 by Sam Rushing
|
||||
|
||||
All Rights Reserved
|
||||
|
||||
Permission to use, copy, modify, and distribute this software and
|
||||
its documentation for any purpose and without fee is hereby
|
||||
granted, provided that the above copyright notice appear in all
|
||||
copies and that both that copyright notice and this permission
|
||||
notice appear in supporting documentation, and that the name of Sam
|
||||
Rushing not be used in advertising or publicity pertaining to
|
||||
distribution of the software without specific, written prior
|
||||
permission.
|
||||
|
||||
SAM RUSHING DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
|
||||
INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN
|
||||
NO EVENT SHALL SAM RUSHING BE LIABLE FOR ANY SPECIAL, INDIRECT OR
|
||||
CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS
|
||||
OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT,
|
||||
NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
|
||||
CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
|
||||
\end{verbatim}
|
||||
|
||||
|
||||
\subsection{Cookie management}
|
||||
|
||||
The \module{Cookie} module contains the following notice:
|
||||
|
||||
\begin{verbatim}
|
||||
Copyright 2000 by Timothy O'Malley <timo@alum.mit.edu>
|
||||
|
||||
All Rights Reserved
|
||||
|
||||
Permission to use, copy, modify, and distribute this software
|
||||
and its documentation for any purpose and without fee is hereby
|
||||
granted, provided that the above copyright notice appear in all
|
||||
copies and that both that copyright notice and this permission
|
||||
notice appear in supporting documentation, and that the name of
|
||||
Timothy O'Malley not be used in advertising or publicity
|
||||
pertaining to distribution of the software without specific, written
|
||||
prior permission.
|
||||
|
||||
Timothy O'Malley DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS
|
||||
SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
|
||||
AND FITNESS, IN NO EVENT SHALL Timothy O'Malley BE LIABLE FOR
|
||||
ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
|
||||
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
|
||||
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
|
||||
ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
|
||||
PERFORMANCE OF THIS SOFTWARE.
|
||||
\end{verbatim}
|
||||
|
||||
|
||||
|
||||
\subsection{Profiling}
|
||||
|
||||
The \module{profile} and \module{pstats} modules contain
|
||||
the following notice:
|
||||
|
||||
\begin{verbatim}
|
||||
Copyright 1994, by InfoSeek Corporation, all rights reserved.
|
||||
Written by James Roskind
|
||||
|
||||
Permission to use, copy, modify, and distribute this Python software
|
||||
and its associated documentation for any purpose (subject to the
|
||||
restriction in the following sentence) without fee is hereby granted,
|
||||
provided that the above copyright notice appears in all copies, and
|
||||
that both that copyright notice and this permission notice appear in
|
||||
supporting documentation, and that the name of InfoSeek not be used in
|
||||
advertising or publicity pertaining to distribution of the software
|
||||
without specific, written prior permission. This permission is
|
||||
explicitly restricted to the copying and modification of the software
|
||||
to remain in Python, compiled Python, or other languages (such as C)
|
||||
wherein the modified or derived code is exclusively imported into a
|
||||
Python module.
|
||||
|
||||
INFOSEEK CORPORATION DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS
|
||||
SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
|
||||
FITNESS. IN NO EVENT SHALL INFOSEEK CORPORATION BE LIABLE FOR ANY
|
||||
SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER
|
||||
RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF
|
||||
CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
|
||||
CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
|
||||
\end{verbatim}
|
||||
|
||||
|
||||
|
||||
\subsection{Execution tracing}
|
||||
|
||||
The \module{trace} module contains the following notice:
|
||||
|
||||
\begin{verbatim}
|
||||
portions copyright 2001, Autonomous Zones Industries, Inc., all rights...
|
||||
err... reserved and offered to the public under the terms of the
|
||||
Python 2.2 license.
|
||||
Author: Zooko O'Whielacronx
|
||||
http://zooko.com/
|
||||
mailto:zooko@zooko.com
|
||||
|
||||
Copyright 2000, Mojam Media, Inc., all rights reserved.
|
||||
Author: Skip Montanaro
|
||||
|
||||
Copyright 1999, Bioreason, Inc., all rights reserved.
|
||||
Author: Andrew Dalke
|
||||
|
||||
Copyright 1995-1997, Automatrix, Inc., all rights reserved.
|
||||
Author: Skip Montanaro
|
||||
|
||||
Copyright 1991-1995, Stichting Mathematisch Centrum, all rights reserved.
|
||||
|
||||
|
||||
Permission to use, copy, modify, and distribute this Python software and
|
||||
its associated documentation for any purpose without fee is hereby
|
||||
granted, provided that the above copyright notice appears in all copies,
|
||||
and that both that copyright notice and this permission notice appear in
|
||||
supporting documentation, and that the name of neither Automatrix,
|
||||
Bioreason or Mojam Media be used in advertising or publicity pertaining to
|
||||
distribution of the software without specific, written prior permission.
|
||||
\end{verbatim}
|
||||
|
||||
|
||||
|
||||
\subsection{UUencode and UUdecode functions}
|
||||
|
||||
The \module{uu} module contains the following notice:
|
||||
|
||||
\begin{verbatim}
|
||||
Copyright 1994 by Lance Ellinghouse
|
||||
Cathedral City, California Republic, United States of America.
|
||||
All Rights Reserved
|
||||
Permission to use, copy, modify, and distribute this software and its
|
||||
documentation for any purpose and without fee is hereby granted,
|
||||
provided that the above copyright notice appear in all copies and that
|
||||
both that copyright notice and this permission notice appear in
|
||||
supporting documentation, and that the name of Lance Ellinghouse
|
||||
not be used in advertising or publicity pertaining to distribution
|
||||
of the software without specific, written prior permission.
|
||||
LANCE ELLINGHOUSE DISCLAIMS ALL WARRANTIES WITH REGARD TO
|
||||
THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
|
||||
FITNESS, IN NO EVENT SHALL LANCE ELLINGHOUSE CENTRUM BE LIABLE
|
||||
FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
|
||||
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
|
||||
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
|
||||
OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
|
||||
|
||||
Modified by Jack Jansen, CWI, July 1995:
|
||||
- Use binascii module to do the actual line-by-line conversion
|
||||
between ascii and binary. This results in a 1000-fold speedup. The C
|
||||
version is still 5 times faster, though.
|
||||
- Arguments more compliant with python standard
|
||||
\end{verbatim}
|
||||
|
||||
|
||||
|
||||
\subsection{XML Remote Procedure Calls}
|
||||
|
||||
The \module{xmlrpclib} module contains the following notice:
|
||||
|
||||
\begin{verbatim}
|
||||
The XML-RPC client interface is
|
||||
|
||||
Copyright (c) 1999-2002 by Secret Labs AB
|
||||
Copyright (c) 1999-2002 by Fredrik Lundh
|
||||
|
||||
By obtaining, using, and/or copying this software and/or its
|
||||
associated documentation, you agree that you have read, understood,
|
||||
and will comply with the following terms and conditions:
|
||||
|
||||
Permission to use, copy, modify, and distribute this software and
|
||||
its associated documentation for any purpose and without fee is
|
||||
hereby granted, provided that the above copyright notice appears in
|
||||
all copies, and that both that copyright notice and this permission
|
||||
notice appear in supporting documentation, and that the name of
|
||||
Secret Labs AB or the author not be used in advertising or publicity
|
||||
pertaining to distribution of the software without specific, written
|
||||
prior permission.
|
||||
|
||||
SECRET LABS AB AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD
|
||||
TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANT-
|
||||
ABILITY AND FITNESS. IN NO EVENT SHALL SECRET LABS AB OR THE AUTHOR
|
||||
BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY
|
||||
DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
|
||||
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
|
||||
ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE
|
||||
OF THIS SOFTWARE.
|
||||
\end{verbatim}
|
|
@ -1,61 +0,0 @@
|
|||
\label{reporting-bugs}
|
||||
|
||||
Python is a mature programming language which has established a
|
||||
reputation for stability. In order to maintain this reputation, the
|
||||
developers would like to know of any deficiencies you find in Python
|
||||
or its documentation.
|
||||
|
||||
Before submitting a report, you will be required to log into SourceForge;
|
||||
this will make it possible for the developers to contact you
|
||||
for additional information if needed. It is not possible to submit a
|
||||
bug report anonymously.
|
||||
|
||||
All bug reports should be submitted via the Python Bug Tracker on
|
||||
SourceForge (\url{http://sourceforge.net/bugs/?group_id=5470}). The
|
||||
bug tracker offers a Web form which allows pertinent information to be
|
||||
entered and submitted to the developers.
|
||||
|
||||
The first step in filing a report is to determine whether the problem
|
||||
has already been reported. The advantage in doing so, aside from
|
||||
saving the developers time, is that you learn what has been done to
|
||||
fix it; it may be that the problem has already been fixed for the next
|
||||
release, or additional information is needed (in which case you are
|
||||
welcome to provide it if you can!). To do this, search the bug
|
||||
database using the search box on the left side of the page.
|
||||
|
||||
If the problem you're reporting is not already in the bug tracker, go
|
||||
back to the Python Bug Tracker
|
||||
(\url{http://sourceforge.net/bugs/?group_id=5470}). Select the
|
||||
``Submit a Bug'' link at the top of the page to open the bug reporting
|
||||
form.
|
||||
|
||||
The submission form has a number of fields. The only fields that are
|
||||
required are the ``Summary'' and ``Details'' fields. For the summary,
|
||||
enter a \emph{very} short description of the problem; fewer than ten
|
||||
words is good. In the Details field, describe the problem in detail,
|
||||
including what you expected to happen and what did happen. Be sure to
|
||||
include the version of Python you used, whether any extension modules
|
||||
were involved, and what hardware and software platform you were using
|
||||
(including version information as appropriate).
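
The version and platform details requested above can be collected from a running interpreter. The following is a minimal sketch, not part of the bug tracker's own workflow; the helper name \code{bug_report_environment} is hypothetical, and it relies only on the standard \module{sys} and \module{platform} modules:

```python
import platform
import sys


def bug_report_environment():
    """Return a short text block describing the interpreter and platform.

    Hypothetical helper for illustration only; its output is meant to be
    pasted into the Details field of a report.
    """
    lines = [
        # sys.version may span several lines; flatten it to one.
        "Python version: " + sys.version.replace("\n", " "),
        "Platform: " + platform.platform(),
        "Machine: " + platform.machine(),
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(bug_report_environment())
```

Extension modules involved in the problem still have to be listed by hand, since the interpreter cannot tell which of them are relevant.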
|
||||
|
||||
The only other field that you may want to set is the ``Category''
|
||||
field, which allows you to place the bug report into a broad category
|
||||
(such as ``Documentation'' or ``Library'').
|
||||
|
||||
Each bug report will be assigned to a developer who will determine
|
||||
what needs to be done to correct the problem. You will
|
||||
receive an update each time action is taken on the bug.
|
||||
|
||||
|
||||
\begin{seealso}
|
||||
\seetitle[http://www-mice.cs.ucl.ac.uk/multimedia/software/documentation/ReportingBugs.html]{How
|
||||
to Report Bugs Effectively}{Article which goes into some
|
||||
detail about how to create a useful bug report. This
|
||||
describes what kind of information is useful and why it is
|
||||
useful.}
|
||||
|
||||
\seetitle[http://www.mozilla.org/quality/bug-writing-guidelines.html]{Bug
|
||||
Writing Guidelines}{Information about writing a good bug
|
||||
report. Some of this is specific to the Mozilla project, but
|
||||
describes general good practices.}
|
||||
\end{seealso}
|
|
@ -1,76 +0,0 @@
|
|||
typedef struct _typeobject {
|
||||
PyObject_VAR_HEAD
|
||||
char *tp_name; /* For printing, in format "<module>.<name>" */
|
||||
int tp_basicsize, tp_itemsize; /* For allocation */
|
||||
|
||||
/* Methods to implement standard operations */
|
||||
|
||||
destructor tp_dealloc;
|
||||
printfunc tp_print;
|
||||
getattrfunc tp_getattr;
|
||||
setattrfunc tp_setattr;
|
||||
cmpfunc tp_compare;
|
||||
reprfunc tp_repr;
|
||||
|
||||
/* Method suites for standard classes */
|
||||
|
||||
PyNumberMethods *tp_as_number;
|
||||
PySequenceMethods *tp_as_sequence;
|
||||
PyMappingMethods *tp_as_mapping;
|
||||
|
||||
/* More standard operations (here for binary compatibility) */
|
||||
|
||||
hashfunc tp_hash;
|
||||
ternaryfunc tp_call;
|
||||
reprfunc tp_str;
|
||||
getattrofunc tp_getattro;
|
||||
setattrofunc tp_setattro;
|
||||
|
||||
/* Functions to access object as input/output buffer */
|
||||
PyBufferProcs *tp_as_buffer;
|
||||
|
||||
/* Flags to define presence of optional/expanded features */
|
||||
long tp_flags;
|
||||
|
||||
char *tp_doc; /* Documentation string */
|
||||
|
||||
/* Assigned meaning in release 2.0 */
|
||||
/* call function for all accessible objects */
|
||||
traverseproc tp_traverse;
|
||||
|
||||
/* delete references to contained objects */
|
||||
inquiry tp_clear;
|
||||
|
||||
/* Assigned meaning in release 2.1 */
|
||||
/* rich comparisons */
|
||||
richcmpfunc tp_richcompare;
|
||||
|
||||
/* weak reference enabler */
|
||||
long tp_weaklistoffset;
|
||||
|
||||
/* Added in release 2.2 */
|
||||
/* Iterators */
|
||||
getiterfunc tp_iter;
|
||||
iternextfunc tp_iternext;
|
||||
|
||||
/* Attribute descriptor and subclassing stuff */
|
||||
struct PyMethodDef *tp_methods;
|
||||
struct PyMemberDef *tp_members;
|
||||
struct PyGetSetDef *tp_getset;
|
||||
struct _typeobject *tp_base;
|
||||
PyObject *tp_dict;
|
||||
descrgetfunc tp_descr_get;
|
||||
descrsetfunc tp_descr_set;
|
||||
long tp_dictoffset;
|
||||
initproc tp_init;
|
||||
allocfunc tp_alloc;
|
||||
newfunc tp_new;
|
||||
freefunc tp_free; /* Low-level free-memory routine */
|
||||
inquiry tp_is_gc; /* For PyObject_IS_GC */
|
||||
PyObject *tp_bases;
|
||||
PyObject *tp_mro; /* method resolution order */
|
||||
PyObject *tp_cache;
|
||||
PyObject *tp_subclasses;
|
||||
PyObject *tp_weaklist;
|
||||
|
||||
} PyTypeObject;
|
|
@ -1,113 +0,0 @@
|
|||
\section{\module{distutils.sysconfig} ---
|
||||
System configuration information}
|
||||
|
||||
\declaremodule{standard}{distutils.sysconfig}
|
||||
\modulesynopsis{Low-level access to configuration information of the
|
||||
Python interpreter.}
|
||||
\moduleauthor{Fred L. Drake, Jr.}{fdrake@acm.org}
|
||||
\moduleauthor{Greg Ward}{gward@python.net}
|
||||
\sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org}
|
||||
|
||||
|
||||
The \module{distutils.sysconfig} module provides access to Python's
|
||||
low-level configuration information. The specific configuration
|
||||
variables available depend heavily on the platform and configuration.
|
||||
The specific variables depend on the build process for the specific
|
||||
version of Python being run; the variables are those found in the
|
||||
\file{Makefile} and configuration header that are installed with
|
||||
Python on \UNIX{} systems. The configuration header is called
|
||||
\file{pyconfig.h} for Python versions starting with 2.2, and
|
||||
\file{config.h} for earlier versions of Python.
|
||||
|
||||
Some additional functions are provided which perform some useful
|
||||
manipulations for other parts of the \module{distutils} package.
|
||||
|
||||
|
||||
\begin{datadesc}{PREFIX}
|
||||
The result of \code{os.path.normpath(sys.prefix)}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{EXEC_PREFIX}
|
||||
The result of \code{os.path.normpath(sys.exec_prefix)}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{funcdesc}{get_config_var}{name}
|
||||
Return the value of a single variable. This is equivalent to
|
||||
\code{get_config_vars().get(\var{name})}.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{get_config_vars}{\moreargs}
|
||||
Return a set of variable definitions. If there are no arguments,
|
||||
this returns a dictionary mapping names of configuration variables
|
||||
to values. If arguments are provided, they should be strings, and
|
||||
the return value will be a sequence giving the associated values.
|
||||
If a given name does not have a corresponding value, \code{None}
|
||||
will be included for that variable.
|
||||
\end{funcdesc}
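
As a rough illustration of the two lookup functions above, the sketch below queries one variable in each of the three ways described. The variable name \code{prefix} and the deliberately unknown name \code{NO_SUCH_VARIABLE} are assumptions chosen for the example. On interpreters that no longer ship \module{distutils}, the sketch falls back to the standard \module{sysconfig} module, which is assumed here to offer functions with the same names and behavior:

```python
try:
    # The module documented in this section (not shipped with newer
    # Python releases).
    from distutils import sysconfig
except ImportError:
    # Assumption: the standard-library sysconfig module provides
    # get_config_var() and get_config_vars() with the same behavior.
    import sysconfig

# No arguments: a dictionary mapping variable names to values.
all_vars = sysconfig.get_config_vars()

# With arguments: a sequence of values in the same order as the names;
# a name without a value yields None.
prefix, missing = sysconfig.get_config_vars("prefix", "NO_SUCH_VARIABLE")

# Single-variable lookup, equivalent to get_config_vars().get(name).
same_prefix = sysconfig.get_config_var("prefix")

print("prefix:", prefix)
print("unknown variable:", missing)
print("lookups agree:", same_prefix == all_vars.get("prefix"))
```

Because unknown names produce \code{None} rather than an exception, callers that need a particular variable should check the result before using it.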
|
||||
|
||||
\begin{funcdesc}{get_config_h_filename}{}
|
||||
Return the full path name of the configuration header. For \UNIX,
|
||||
this will be the header generated by the \program{configure} script;
|
||||
for other platforms the header will have been supplied directly by
|
||||
the Python source distribution. The file is a platform-specific
|
||||
text file.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{get_makefile_filename}{}
|
||||
Return the full path name of the \file{Makefile} used to build
|
||||
Python. For \UNIX, this will be a file generated by the
|
||||
\program{configure} script; the meaning for other platforms will
|
||||
vary. The file is a platform-specific text file, if it exists.
|
||||
This function is only useful on \POSIX{} platforms.
|
||||
\end{funcdesc}
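
The two file-name functions can be exercised as follows. This is only a sketch: it reports the paths and whether the files exist, since, as noted above, the \file{Makefile} may be absent on non-\POSIX{} platforms. The fallback to the standard \module{sysconfig} module is an assumption for interpreters without \module{distutils}:

```python
import os

try:
    from distutils import sysconfig
except ImportError:
    # Assumption: the standard-library sysconfig module offers the same
    # two functions when distutils is not available.
    import sysconfig

header = sysconfig.get_config_h_filename()
makefile = sysconfig.get_makefile_filename()

# Both values are path names; the files themselves may not be installed.
for label, path in (("config header", header), ("Makefile", makefile)):
    status = "present" if os.path.exists(path) else "missing"
    print("%s: %s (%s)" % (label, path, status))
```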
|
||||
|
||||
\begin{funcdesc}{get_python_inc}{\optional{plat_specific\optional{, prefix}}}
|
||||
Return the directory for either the general or platform-dependent C
|
||||
include files. If \var{plat_specific} is true, the
|
||||
platform-dependent include directory is returned; if false or
|
||||
omitted, the platform-independent directory is returned. If
|
||||
\var{prefix} is given, it is used as either the prefix instead of
|
||||
\constant{PREFIX}, or as the exec-prefix instead of
|
||||
\constant{EXEC_PREFIX} if \var{plat_specific} is true.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{get_python_lib}{\optional{plat_specific\optional{,
|
||||
standard_lib\optional{, prefix}}}}
|
||||
Return the directory for either the general or platform-dependent
|
||||
library installation. If \var{plat_specific} is true, the
|
||||
platform-dependent library directory is returned; if false or
|
||||
omitted, the platform-independent directory is returned. If
|
||||
\var{prefix} is given, it is used as either the prefix instead of
|
||||
\constant{PREFIX}, or as the exec-prefix instead of
|
||||
\constant{EXEC_PREFIX} if \var{plat_specific} is true. If
|
||||
\var{standard_lib} is true, the directory for the standard library
|
||||
is returned rather than the directory for the installation of
|
||||
third-party extensions.
|
||||
\end{funcdesc}
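
The sketch below prints the directories selected by the different argument combinations of \function{get_python_inc()} and \function{get_python_lib()}. The replacement functions in the \code{except} branch are hypothetical stand-ins, built on \code{sysconfig.get_path()}, for interpreters where \module{distutils} is unavailable; they only approximate the behavior described above and ignore the \var{prefix} argument:

```python
try:
    from distutils.sysconfig import get_python_inc, get_python_lib
except ImportError:
    # Assumption: approximate the two functions with the standard
    # sysconfig module when distutils cannot be imported.
    import sysconfig

    def get_python_inc(plat_specific=0):
        name = "platinclude" if plat_specific else "include"
        return sysconfig.get_path(name)

    def get_python_lib(plat_specific=0, standard_lib=0):
        if standard_lib:
            name = "platstdlib" if plat_specific else "stdlib"
        else:
            name = "platlib" if plat_specific else "purelib"
        return sysconfig.get_path(name)

print("C headers (platform-independent):", get_python_inc())
print("C headers (platform-dependent):  ", get_python_inc(1))
print("third-party modules:             ", get_python_lib())
print("third-party extension modules:   ", get_python_lib(1))
print("standard library:                ", get_python_lib(0, 1))
```

On many installations the platform-independent and platform-dependent directories coincide; they differ only when Python was configured with separate prefix and exec-prefix values.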
|
||||
|
||||
|
||||
The following function is only intended for use within the
|
||||
\module{distutils} package.
|
||||
|
||||
\begin{funcdesc}{customize_compiler}{compiler}
|
||||
Do any platform-specific customization of a
|
||||
\class{distutils.ccompiler.CCompiler} instance.
|
||||
|
||||
This function is only needed on \UNIX{} at this time, but should be
|
||||
called consistently to support forward-compatibility. It inserts
|
||||
the information that varies across \UNIX{} flavors and is stored in
|
||||
Python's \file{Makefile}. This information includes the selected
|
||||
compiler, compiler and linker options, and the extension used by the
|
||||
linker for shared objects.
|
||||
\end{funcdesc}
|
||||
|
||||
|
||||
This function is even more special-purpose, and should only be used
|
||||
from Python's own build procedures.
|
||||
|
||||
\begin{funcdesc}{set_python_build}{}
|
||||
Inform the \module{distutils.sysconfig} module that it is being used
|
||||
as part of the build process for Python. This changes a lot of
|
||||
relative locations for files, allowing them to be located in the
|
||||
build area rather than in an installed Python.
|
||||
\end{funcdesc}
|
2129
Doc/doc/doc.tex
|
@ -1,143 +0,0 @@
|
|||
\chapter{Building C and \Cpp{} Extensions with distutils
|
||||
\label{building}}
|
||||
|
||||
\sectionauthor{Martin v. L\"owis}{martin@v.loewis.de}
|
||||
|
||||
Starting in Python 1.4, Python provides, on \UNIX{}, a special make
|
||||
file for generating make files that build dynamically-linked
|
||||
extensions and custom interpreters. Starting with Python 2.0, this
|
||||
mechanism (based on \file{Makefile.pre.in} and \file{Setup} files) is no
|
||||
longer supported. Building custom interpreters was rarely used, and
|
||||
extension modules can be built using distutils.
|
||||
|
||||
Building an extension module using distutils requires that distutils
|
||||
is installed on the build machine, which is included in Python 2.x and
|
||||
available separately for Python 1.5. Since distutils also supports
|
||||
creation of binary packages, users don't necessarily need a compiler
|
||||
and distutils to install the extension.
|
||||
|
||||
A distutils package contains a driver script, \file{setup.py}. This is
|
||||
a plain Python file, which, in the simplest case, could look like
|
||||
this:
|
||||
|
||||
\begin{verbatim}
|
||||
from distutils.core import setup, Extension
|
||||
|
||||
module1 = Extension('demo',
|
||||
sources = ['demo.c'])
|
||||
|
||||
setup (name = 'PackageName',
|
||||
version = '1.0',
|
||||
description = 'This is a demo package',
|
||||
ext_modules = [module1])
|
||||
|
||||
\end{verbatim}
|
||||
|
||||
With this \file{setup.py}, and a file \file{demo.c}, running
|
||||
|
||||
\begin{verbatim}
|
||||
python setup.py build
|
||||
\end{verbatim}
|
||||
|
||||
will compile \file{demo.c}, and produce an extension module named
|
||||
\samp{demo} in the \file{build} directory. Depending on the system,
|
||||
the module file will end up in a subdirectory \file{build/lib.system},
|
||||
and may have a name like \file{demo.so} or \file{demo.pyd}.
|
||||
|
||||
In the \file{setup.py}, all execution is performed by calling the
|
||||
\samp{setup} function. This takes a variable number of keyword
|
||||
arguments, of which the example above uses only a
|
||||
subset. Specifically, the example specifies meta-information to build
|
||||
packages, and it specifies the contents of the package. Normally, a
|
||||
package will contain additional modules, like Python source modules,
|
||||
documentation, subpackages, etc. Please refer to the distutils
|
||||
documentation in \citetitle[../dist/dist.html]{Distributing Python
|
||||
Modules} to learn more about the features of distutils; this section
|
||||
explains building extension modules only.
|
||||
|
||||
It is common to pre-compute arguments to \function{setup}, to better
|
||||
structure the driver script. In the example above,
|
||||
the \samp{ext_modules} argument to \function{setup} is a list of
|
||||
extension modules, each of which is an instance of the
|
||||
\class{Extension} class. In the example, the instance defines an extension
|
||||
named \samp{demo} which is built by compiling a single source file,
|
||||
\file{demo.c}.
|
||||
|
||||
In many cases, building an extension is more complex, since additional
|
||||
preprocessor defines and libraries may be needed. This is demonstrated
|
||||
in the example below.
|
||||
|
||||
\begin{verbatim}
|
||||
from distutils.core import setup, Extension
|
||||
|
||||
module1 = Extension('demo',
|
||||
define_macros = [('MAJOR_VERSION', '1'),
|
||||
('MINOR_VERSION', '0')],
|
||||
include_dirs = ['/usr/local/include'],
|
||||
libraries = ['tcl83'],
|
||||
library_dirs = ['/usr/local/lib'],
|
||||
sources = ['demo.c'])
|
||||
|
||||
setup (name = 'PackageName',
|
||||
version = '1.0',
|
||||
description = 'This is a demo package',
|
||||
author = 'Martin v. Loewis',
|
||||
author_email = 'martin@v.loewis.de',
|
||||
url = 'http://www.python.org/doc/current/ext/building.html',
|
||||
long_description = '''
|
||||
This is really just a demo package.
|
||||
''',
|
||||
ext_modules = [module1])
|
||||
|
||||
\end{verbatim}
|
||||
|
||||
In this example, \function{setup} is called with additional
|
||||
meta-information, which is recommended when distribution packages have
|
||||
to be built. For the extension itself, it specifies preprocessor
|
||||
defines, include directories, library directories, and libraries.
|
||||
Depending on the compiler, distutils passes this information in
|
||||
different ways to the compiler. For example, on \UNIX{}, this may
|
||||
result in the compilation commands
|
||||
|
||||
\begin{verbatim}
|
||||
gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -DMAJOR_VERSION=1 -DMINOR_VERSION=0 -I/usr/local/include -I/usr/local/include/python2.2 -c demo.c -o build/temp.linux-i686-2.2/demo.o
|
||||
|
||||
gcc -shared build/temp.linux-i686-2.2/demo.o -L/usr/local/lib -ltcl83 -o build/lib.linux-i686-2.2/demo.so
|
||||
\end{verbatim}
|
||||
|
||||
These lines are for demonstration purposes only; distutils users
|
||||
should trust that distutils gets the invocations right.
|
||||
|
||||
\section{Distributing your extension modules
|
||||
\label{distributing}}
|
||||
|
||||
When an extension has been successfully built, there are three ways to
|
||||
use it.
|
||||
|
||||
End-users will typically want to install the module; they do so by
|
||||
running
|
||||
|
||||
\begin{verbatim}
|
||||
python setup.py install
|
||||
\end{verbatim}
|
||||
|
||||
Module maintainers should produce source packages; to do so, they run
|
||||
|
||||
\begin{verbatim}
|
||||
python setup.py sdist
|
||||
\end{verbatim}
|
||||
|
||||
In some cases, additional files need to be included in a source
|
||||
distribution; this is done through a \file{MANIFEST.in} file; see the
|
||||
distutils documentation for details.
|
||||
|
||||
If the source distribution has been built successfully, maintainers
|
||||
can also create binary distributions. Depending on the platform, one
|
||||
of the following commands can be used to do so.
|
||||
|
||||
\begin{verbatim}
|
||||
python setup.py bdist_wininst
|
||||
python setup.py bdist_rpm
|
||||
python setup.py bdist_dumb
|
||||
\end{verbatim}
|
||||
|
|
@ -1,316 +0,0 @@
|
|||
\chapter{Embedding Python in Another Application
|
||||
\label{embedding}}
|
||||
|
||||
The previous chapters discussed how to extend Python, that is, how to
|
||||
extend the functionality of Python by attaching a library of C
|
||||
functions to it. It is also possible to do it the other way around:
|
||||
enrich your C/\Cpp{} application by embedding Python in it. Embedding
|
||||
provides your application with the ability to implement some of the
|
||||
functionality of your application in Python rather than C or \Cpp.
|
||||
This can be used for many purposes; one example would be to allow
|
||||
users to tailor the application to their needs by writing some scripts
|
||||
in Python. You can also use it yourself if some of the functionality
|
||||
can be written in Python more easily.
|
||||
|
||||
Embedding Python is similar to extending it, but not quite. The
|
||||
difference is that when you extend Python, the main program of the
|
||||
application is still the Python interpreter, while if you embed
|
||||
Python, the main program may have nothing to do with Python ---
|
||||
instead, some parts of the application occasionally call the Python
|
||||
interpreter to run some Python code.
|
||||
|
||||
So if you are embedding Python, you are providing your own main
|
||||
program. One of the things this main program has to do is initialize
|
||||
the Python interpreter. At the very least, you have to call the
|
||||
function \cfunction{Py_Initialize()} (on Mac OS, call
|
||||
\cfunction{PyMac_Initialize()} instead). There are optional calls to
|
||||
pass command line arguments to Python. Then later you can call the
|
||||
interpreter from any part of the application.
|
||||
|
||||
There are several different ways to call the interpreter: you can pass
|
||||
a string containing Python statements to
|
||||
\cfunction{PyRun_SimpleString()}, or you can pass a stdio file pointer
|
||||
and a file name (for identification in error messages only) to
|
||||
\cfunction{PyRun_SimpleFile()}. You can also call the lower-level
|
||||
operations described in the previous chapters to construct and use
|
||||
Python objects.
|
||||
|
||||
A simple demo of embedding Python can be found in the directory
|
||||
\file{Demo/embed/} of the source distribution.
|
||||
|
||||
|
||||
\begin{seealso}
|
||||
\seetitle[../api/api.html]{Python/C API Reference Manual}{The
|
||||
details of Python's C interface are given in this manual.
|
||||
A great deal of necessary information can be found here.}
|
||||
\end{seealso}
|
||||
|
||||
|
||||
\section{Very High Level Embedding
|
||||
\label{high-level-embedding}}
|
||||
|
||||
The simplest form of embedding Python is the use of the very
|
||||
high level interface. This interface is intended to execute a
|
||||
Python script without needing to interact with the application
|
||||
directly. This can for example be used to perform some operation
|
||||
on a file.
|
||||
|
||||
\begin{verbatim}
|
||||
#include <Python.h>
|
||||
|
||||
int
|
||||
main(int argc, char *argv[])
|
||||
{
|
||||
Py_Initialize();
|
||||
PyRun_SimpleString("from time import time,ctime\n"
|
||||
"print 'Today is',ctime(time())\n");
|
||||
Py_Finalize();
|
||||
return 0;
|
||||
}
|
||||
\end{verbatim}
|
||||
|
||||
The above code first initializes the Python interpreter with
|
||||
\cfunction{Py_Initialize()}, followed by the execution of a hard-coded
|
||||
Python script that prints the date and time. Afterwards, the
|
||||
\cfunction{Py_Finalize()} call shuts the interpreter down, followed by
|
||||
the end of the program. In a real program, you may want to get the
|
||||
Python script from another source, perhaps a text-editor routine, a
|
||||
file, or a database. Getting the Python code from a file can better
|
||||
be done by using the \cfunction{PyRun_SimpleFile()} function, which
|
||||
saves you the trouble of allocating memory space and loading the file
|
||||
contents.
|
||||
|
||||
|
||||
\section{Beyond Very High Level Embedding: An overview
|
||||
\label{lower-level-embedding}}
|
||||
|
||||
The high level interface gives you the ability to execute
|
||||
arbitrary pieces of Python code from your application, but
|
||||
exchanging data values is quite cumbersome to say the least. If
|
||||
you want that, you should use lower level calls. At the cost of
|
||||
having to write more C code, you can achieve almost anything.
|
||||
|
||||
It should be noted that extending Python and embedding Python
|
||||
are quite the same activity, despite the different intent. Most
|
||||
topics discussed in the previous chapters are still valid. To
|
||||
show this, consider what the extension code from Python to C
|
||||
really does:
|
||||
|
||||
\begin{enumerate}
|
||||
\item Convert data values from Python to C,
|
||||
\item Perform a function call to a C routine using the
|
||||
converted values, and
|
||||
\item Convert the data values from the call from C to Python.
|
||||
\end{enumerate}
|
||||
|
||||
When embedding Python, the interface code does:
|
||||
|
||||
\begin{enumerate}
|
||||
\item Convert data values from C to Python,
|
||||
\item Perform a function call to a Python interface routine
|
||||
using the converted values, and
|
||||
\item Convert the data values from the call from Python to C.
|
||||
\end{enumerate}
|
||||
|
||||
As you can see, the data conversion steps are simply swapped to
|
||||
accommodate the different direction of the cross-language transfer.
|
||||
The only difference is the routine that you call between both
|
||||
data conversions. When extending, you call a C routine; when
|
||||
embedding, you call a Python routine.
|
||||
|
||||
This chapter will not discuss how to convert data from Python
|
||||
to C and vice versa. Also, proper use of references and dealing
|
||||
with errors is assumed to be understood. Since these aspects do not
|
||||
differ from extending the interpreter, you can refer to earlier
|
||||
chapters for the required information.
|
||||
|
||||
|
||||
\section{Pure Embedding
|
||||
\label{pure-embedding}}
|
||||
|
||||
The first program aims to execute a function in a Python
|
||||
script. As in the section about the very high level interface,
|
||||
the Python interpreter does not directly interact with the
|
||||
application (but that will change in the next section).
|
||||
|
||||
The code to run a function defined in a Python script is:
|
||||
|
||||
\verbatiminput{run-func.c}
|
||||
|
||||
This code loads a Python script using \code{argv[1]}, and calls the
|
||||
function named in \code{argv[2]}. Its integer arguments are the other
|
||||
values of the \code{argv} array. If you compile and link this
|
||||
program (let's call the finished executable \program{call}), and use
|
||||
it to execute a Python script, such as:
|
||||
|
||||
\begin{verbatim}
def multiply(a,b):
    print "Will compute", a, "times", b
    c = 0
    for i in range(0, a):
        c = c + b
    return c
\end{verbatim}
|
||||
|
||||
then the result should be:
|
||||
|
||||
\begin{verbatim}
|
||||
$ call multiply multiply 3 2
|
||||
Will compute 3 times 2
|
||||
Result of call: 6
|
||||
\end{verbatim} % $
|
||||
|
||||
Although the program is quite large for its functionality, most of the
|
||||
code is for data conversion between Python and C, and for error
|
||||
reporting. The interesting part with respect to embedding Python
|
||||
starts with
|
||||
|
||||
\begin{verbatim}
|
||||
Py_Initialize();
|
||||
pName = PyString_FromString(argv[1]);
|
||||
/* Error checking of pName left out */
|
||||
pModule = PyImport_Import(pName);
|
||||
\end{verbatim}
|
||||
|
||||
After initializing the interpreter, the script is loaded using
|
||||
\cfunction{PyImport_Import()}. This routine needs a Python string
|
||||
as its argument, which is constructed using the
|
||||
\cfunction{PyString_FromString()} data conversion routine.
|
||||
|
||||
\begin{verbatim}
|
||||
pFunc = PyObject_GetAttrString(pModule, argv[2]);
|
||||
/* pFunc is a new reference */
|
||||
|
||||
if (pFunc && PyCallable_Check(pFunc)) {
|
||||
...
|
||||
}
|
||||
Py_XDECREF(pFunc);
|
||||
\end{verbatim}
|
||||
|
||||
Once the script is loaded, the name we're looking for is retrieved
|
||||
using \cfunction{PyObject_GetAttrString()}. If the name exists, and
|
||||
the object returned is callable, you can safely assume that it is a
|
||||
function. The program then proceeds by constructing a tuple of
|
||||
arguments as normal. The call to the Python function is then made
|
||||
with:
|
||||
|
||||
\begin{verbatim}
|
||||
pValue = PyObject_CallObject(pFunc, pArgs);
|
||||
\end{verbatim}
|
||||
|
||||
Upon return of the function, \code{pValue} is either \NULL{} or it
|
||||
contains a reference to the return value of the function. Be sure to
|
||||
release the reference after examining the value.
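
For readers more at home in Python than in C, the sequence of C API calls in this section can be mirrored in pure Python. The sketch below is only an analogue under stated assumptions, not the \file{run-func.c} program: the helper name \code{call_by_name} is hypothetical, and a stand-in \code{multiply} module is built in memory so that the example is self-contained. Each step is annotated with the C function it corresponds to:

```python
import importlib
import sys
import types


def call_by_name(module_name, func_name, *int_args):
    """Hypothetical Python analogue of the C call sequence shown above."""
    module = importlib.import_module(module_name)  # PyImport_Import()
    func = getattr(module, func_name, None)        # PyObject_GetAttrString()
    if func is None or not callable(func):         # PyCallable_Check()
        raise TypeError("cannot find function %r" % func_name)
    return func(*int_args)                         # PyObject_CallObject()


# Stand-in for a multiply.py script on the module search path
# (an assumption made so the sketch runs without extra files).
_source = (
    "def multiply(a, b):\n"
    "    c = 0\n"
    "    for i in range(0, a):\n"
    "        c = c + b\n"
    "    return c\n"
)
_module = types.ModuleType("multiply")
exec(_source, _module.__dict__)
sys.modules["multiply"] = _module

result = call_by_name("multiply", "multiply", 3, 2)
print("Result of call:", result)
```

What the Python version hides is exactly what the C program must do by hand: converting arguments and results between C and Python values, checking each call for failure, and releasing references.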
|
||||
|
||||
|
||||
\section{Extending Embedded Python
|
||||
\label{extending-with-embedding}}
|
||||
|
||||
Until now, the embedded Python interpreter had no access to
|
||||
functionality from the application itself. The Python API allows this
|
||||
by extending the embedded interpreter. That is, the embedded
|
||||
interpreter gets extended with routines provided by the application.
|
||||
While it sounds complex, it is not so bad. Simply forget for a while
|
||||
that the application starts the Python interpreter. Instead, consider
|
||||
the application to be a set of subroutines, and write some glue code
|
||||
that gives Python access to those routines, just like you would write
|
||||
a normal Python extension. For example:
|
||||
|
||||
\begin{verbatim}
|
||||
static int numargs=0;
|
||||
|
||||
/* Return the number of arguments of the application command line */
|
||||
static PyObject*
|
||||
emb_numargs(PyObject *self, PyObject *args)
|
||||
{
|
||||
if(!PyArg_ParseTuple(args, ":numargs"))
|
||||
return NULL;
|
||||
return Py_BuildValue("i", numargs);
|
||||
}
|
||||
|
||||
static PyMethodDef EmbMethods[] = {
|
||||
{"numargs", emb_numargs, METH_VARARGS,
|
||||
"Return the number of arguments received by the process."},
|
||||
{NULL, NULL, 0, NULL}
|
||||
};
|
||||
\end{verbatim}
|
||||
|
||||
Insert the above code just above the \cfunction{main()} function.
|
||||
Also, insert the following two statements directly after
|
||||
\cfunction{Py_Initialize()}:
|
||||
|
||||
\begin{verbatim}
|
||||
numargs = argc;
|
||||
Py_InitModule("emb", EmbMethods);
|
||||
\end{verbatim}
|
||||
|
||||
These two lines initialize the \code{numargs} variable, and make the
|
||||
\function{emb.numargs()} function accessible to the embedded Python
|
||||
interpreter. With these extensions, the Python script can do things
|
||||
like
|
||||
|
||||
\begin{verbatim}
|
||||
import emb
|
||||
print "Number of arguments", emb.numargs()
|
||||
\end{verbatim}
|
||||
|
||||
In a real application, the methods will expose an API of the
|
||||
application to Python.
|
||||
|
||||
|
||||
%\section{For the future}
|
||||
%
|
||||
%You don't happen to have a nice library to get textual
|
||||
%equivalents of numeric values do you :-) ?
|
||||
%Callbacks here ? (I may be using information from that section
|
||||
%?!)
|
||||
%threads
|
||||
%code examples do not really behave well if errors happen
|
||||
% (what to watch out for)
|
||||
|
||||
|
||||
\section{Embedding Python in \Cpp
|
||||
\label{embeddingInCplusplus}}
|
||||
|
||||
It is also possible to embed Python in a \Cpp{} program; precisely how this
|
||||
is done will depend on the details of the \Cpp{} system used; in general you
|
||||
will need to write the main program in \Cpp, and use the \Cpp{} compiler
|
||||
to compile and link your program. There is no need to recompile Python
|
||||
itself using \Cpp.
|
||||
|
||||
|
||||
\section{Linking Requirements
|
||||
\label{link-reqs}}
|
||||
|
||||
While the \program{configure} script shipped with the Python sources
|
||||
will correctly build Python to export the symbols needed by
|
||||
dynamically linked extensions, this is not automatically inherited by
|
||||
applications which embed the Python library statically, at least on
|
||||
\UNIX. This is an issue when the application is linked to the static
|
||||
runtime library (\file{libpython.a}) and needs to load dynamic
|
||||
extensions (implemented as \file{.so} files).
|
||||
|
||||
The problem is that some entry points are defined by the Python
|
||||
runtime solely for extension modules to use. If the embedding
|
||||
application does not use any of these entry points, some linkers will
|
||||
not include those entries in the symbol table of the finished
|
||||
executable. Some additional options are needed to inform the linker
|
||||
not to remove these symbols.
|
||||
|
||||
Determining the right options to use for any given platform can be
|
||||
quite difficult, but fortunately the Python configuration already has
|
||||
those values. To retrieve them from an installed Python interpreter,
|
||||
start an interactive interpreter and have a short session like this:
|
||||
|
||||
\begin{verbatim}
|
||||
>>> import distutils.sysconfig
|
||||
>>> distutils.sysconfig.get_config_var('LINKFORSHARED')
|
||||
'-Xlinker -export-dynamic'
|
||||
\end{verbatim}
|
||||
\refstmodindex{distutils.sysconfig}
|
||||
|
||||
The contents of the string presented will be the options that should
|
||||
be used. If the string is empty, there's no need to add any
|
||||
additional options. The \constant{LINKFORSHARED} definition
|
||||
corresponds to the variable of the same name in Python's top-level
|
||||
\file{Makefile}.
|
|
@ -1,67 +0,0 @@
|
|||
\documentclass{manual}
|
||||
|
||||
% XXX PM explain how to add new types to Python
|
||||
|
||||
\title{Extending and Embedding the Python Interpreter}
|
||||
|
||||
\input{boilerplate}
|
||||
|
||||
% Tell \index to actually write the .idx file
|
||||
\makeindex
|
||||
|
||||
\begin{document}
|
||||
|
||||
\maketitle
|
||||
|
||||
\ifhtml
|
||||
\chapter*{Front Matter\label{front}}
|
||||
\fi
|
||||
|
||||
\input{copyright}
|
||||
|
||||
|
||||
\begin{abstract}
|
||||
|
||||
\noindent
|
||||
Python is an interpreted, object-oriented programming language. This
|
||||
document describes how to write modules in C or \Cpp{} to extend the
|
||||
Python interpreter with new modules. Those modules can define new
|
||||
functions but also new object types and their methods. The document
|
||||
also describes how to embed the Python interpreter in another
|
||||
application, for use as an extension language. Finally, it shows how
|
||||
to compile and link extension modules so that they can be loaded
|
||||
dynamically (at run time) into the interpreter, if the underlying
|
||||
operating system supports this feature.
|
||||
|
||||
This document assumes basic knowledge about Python. For an informal
|
||||
introduction to the language, see the
|
||||
\citetitle[../tut/tut.html]{Python Tutorial}. The
|
||||
\citetitle[../ref/ref.html]{Python Reference Manual} gives a more
|
||||
formal definition of the language. The
|
||||
\citetitle[../lib/lib.html]{Python Library Reference} documents the
|
||||
existing object types, functions and modules (both built-in and
|
||||
written in Python) that give the language its wide application range.
|
||||
|
||||
For a detailed description of the whole Python/C API, see the separate
|
||||
\citetitle[../api/api.html]{Python/C API Reference Manual}.
|
||||
|
||||
\end{abstract}
|
||||
|
||||
\tableofcontents
|
||||
|
||||
|
||||
\input{extending}
|
||||
\input{newtypes}
|
||||
\input{building}
|
||||
\input{windows}
|
||||
\input{embedding}
|
||||
|
||||
|
||||
\appendix
|
||||
\chapter{Reporting Bugs}
|
||||
\input{reportingbugs}
|
||||
|
||||
\chapter{History and License}
|
||||
\input{license}
|
||||
|
||||
\end{document}
|
1765
Doc/ext/newtypes.tex
|
@ -1,54 +0,0 @@
|
|||
#include <Python.h>
|
||||
|
||||
typedef struct {
|
||||
PyObject_HEAD
|
||||
/* Type-specific fields go here. */
|
||||
} noddy_NoddyObject;
|
||||
|
||||
static PyTypeObject noddy_NoddyType = {
|
||||
PyObject_HEAD_INIT(NULL)
|
||||
0, /*ob_size*/
|
||||
"noddy.Noddy", /*tp_name*/
|
||||
sizeof(noddy_NoddyObject), /*tp_basicsize*/
|
||||
0, /*tp_itemsize*/
|
||||
0, /*tp_dealloc*/
|
||||
0, /*tp_print*/
|
||||
0, /*tp_getattr*/
|
||||
0, /*tp_setattr*/
|
||||
0, /*tp_compare*/
|
||||
0, /*tp_repr*/
|
||||
0, /*tp_as_number*/
|
||||
0, /*tp_as_sequence*/
|
||||
0, /*tp_as_mapping*/
|
||||
0, /*tp_hash */
|
||||
0, /*tp_call*/
|
||||
0, /*tp_str*/
|
||||
0, /*tp_getattro*/
|
||||
0, /*tp_setattro*/
|
||||
0, /*tp_as_buffer*/
|
||||
Py_TPFLAGS_DEFAULT, /*tp_flags*/
|
||||
"Noddy objects", /* tp_doc */
|
||||
};
|
||||
|
||||
static PyMethodDef noddy_methods[] = {
|
||||
{NULL} /* Sentinel */
|
||||
};
|
||||
|
||||
#ifndef PyMODINIT_FUNC /* declarations for DLL import/export */
|
||||
#define PyMODINIT_FUNC void
|
||||
#endif
|
||||
PyMODINIT_FUNC
|
||||
initnoddy(void)
|
||||
{
|
||||
PyObject* m;
|
||||
|
||||
noddy_NoddyType.tp_new = PyType_GenericNew;
|
||||
if (PyType_Ready(&noddy_NoddyType) < 0)
|
||||
return;
|
||||
|
||||
m = Py_InitModule3("noddy", noddy_methods,
|
||||
"Example module that creates an extension type.");
|
||||
|
||||
Py_INCREF(&noddy_NoddyType);
|
||||
PyModule_AddObject(m, "Noddy", (PyObject *)&noddy_NoddyType);
|
||||
}
|
190
Doc/ext/noddy2.c
|
@ -1,190 +0,0 @@
|
|||
#include <Python.h>
|
||||
#include "structmember.h"
|
||||
|
||||
typedef struct {
|
||||
PyObject_HEAD
|
||||
PyObject *first; /* first name */
|
||||
PyObject *last; /* last name */
|
||||
int number;
|
||||
} Noddy;
|
||||
|
||||
static void
|
||||
Noddy_dealloc(Noddy* self)
|
||||
{
|
||||
Py_XDECREF(self->first);
|
||||
Py_XDECREF(self->last);
|
||||
self->ob_type->tp_free((PyObject*)self);
|
||||
}
|
||||
|
||||
static PyObject *
|
||||
Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
|
||||
{
|
||||
Noddy *self;
|
||||
|
||||
self = (Noddy *)type->tp_alloc(type, 0);
|
||||
if (self != NULL) {
|
||||
self->first = PyString_FromString("");
|
||||
if (self->first == NULL)
|
||||
{
|
||||
Py_DECREF(self);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
self->last = PyString_FromString("");
|
||||
if (self->last == NULL)
|
||||
{
|
||||
Py_DECREF(self);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
self->number = 0;
|
||||
}
|
||||
|
||||
return (PyObject *)self;
|
||||
}
|
||||
|
||||
static int
|
||||
Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
|
||||
{
|
||||
PyObject *first=NULL, *last=NULL, *tmp;
|
||||
|
||||
static char *kwlist[] = {"first", "last", "number", NULL};
|
||||
|
||||
if (! PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
|
||||
&first, &last,
|
||||
&self->number))
|
||||
return -1;
|
||||
|
||||
if (first) {
|
||||
tmp = self->first;
|
||||
Py_INCREF(first);
|
||||
self->first = first;
|
||||
Py_XDECREF(tmp);
|
||||
}
|
||||
|
||||
if (last) {
|
||||
tmp = self->last;
|
||||
Py_INCREF(last);
|
||||
self->last = last;
|
||||
Py_XDECREF(tmp);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
||||
static PyMemberDef Noddy_members[] = {
|
||||
{"first", T_OBJECT_EX, offsetof(Noddy, first), 0,
|
||||
"first name"},
|
||||
{"last", T_OBJECT_EX, offsetof(Noddy, last), 0,
|
||||
"last name"},
|
||||
{"number", T_INT, offsetof(Noddy, number), 0,
|
||||
"noddy number"},
|
||||
{NULL} /* Sentinel */
|
||||
};
|
||||
|
||||
static PyObject *
|
||||
Noddy_name(Noddy* self)
|
||||
{
|
||||
static PyObject *format = NULL;
|
||||
PyObject *args, *result;
|
||||
|
||||
if (format == NULL) {
|
||||
format = PyString_FromString("%s %s");
|
||||
if (format == NULL)
|
||||
return NULL;
|
||||
}
|
||||
|
||||
if (self->first == NULL) {
|
||||
PyErr_SetString(PyExc_AttributeError, "first");
|
||||
return NULL;
|
||||
}
|
||||
|
||||
if (self->last == NULL) {
|
||||
PyErr_SetString(PyExc_AttributeError, "last");
|
||||
return NULL;
|
||||
}
|
||||
|
||||
args = Py_BuildValue("OO", self->first, self->last);
|
||||
if (args == NULL)
|
||||
return NULL;
|
||||
|
||||
result = PyString_Format(format, args);
|
||||
Py_DECREF(args);
|
||||
|
||||
return result;
|
||||
}
|
||||
|
||||
static PyMethodDef Noddy_methods[] = {
|
||||
{"name", (PyCFunction)Noddy_name, METH_NOARGS,
|
||||
"Return the name, combining the first and last name"
|
||||
},
|
||||
{NULL} /* Sentinel */
|
||||
};
|
||||
|
||||
static PyTypeObject NoddyType = {
|
||||
PyObject_HEAD_INIT(NULL)
|
||||
0, /*ob_size*/
|
||||
"noddy.Noddy", /*tp_name*/
|
||||
sizeof(Noddy), /*tp_basicsize*/
|
||||
0, /*tp_itemsize*/
|
||||
(destructor)Noddy_dealloc, /*tp_dealloc*/
|
||||
0, /*tp_print*/
|
||||
0, /*tp_getattr*/
|
||||
0, /*tp_setattr*/
|
||||
0, /*tp_compare*/
|
||||
0, /*tp_repr*/
|
||||
0, /*tp_as_number*/
|
||||
0, /*tp_as_sequence*/
|
||||
0, /*tp_as_mapping*/
|
||||
0, /*tp_hash */
|
||||
0, /*tp_call*/
|
||||
0, /*tp_str*/
|
||||
0, /*tp_getattro*/
|
||||
0, /*tp_setattro*/
|
||||
0, /*tp_as_buffer*/
|
||||
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/
|
||||
"Noddy objects", /* tp_doc */
|
||||
0, /* tp_traverse */
|
||||
0, /* tp_clear */
|
||||
0, /* tp_richcompare */
|
||||
0, /* tp_weaklistoffset */
|
||||
0, /* tp_iter */
|
||||
0, /* tp_iternext */
|
||||
Noddy_methods, /* tp_methods */
|
||||
Noddy_members, /* tp_members */
|
||||
0, /* tp_getset */
|
||||
0, /* tp_base */
|
||||
0, /* tp_dict */
|
||||
0, /* tp_descr_get */
|
||||
0, /* tp_descr_set */
|
||||
0, /* tp_dictoffset */
|
||||
(initproc)Noddy_init, /* tp_init */
|
||||
0, /* tp_alloc */
|
||||
Noddy_new, /* tp_new */
|
||||
};
|
||||
|
||||
static PyMethodDef module_methods[] = {
|
||||
{NULL} /* Sentinel */
|
||||
};
|
||||
|
||||
#ifndef PyMODINIT_FUNC /* declarations for DLL import/export */
|
||||
#define PyMODINIT_FUNC void
|
||||
#endif
|
||||
PyMODINIT_FUNC
|
||||
initnoddy2(void)
|
||||
{
|
||||
PyObject* m;
|
||||
|
||||
if (PyType_Ready(&NoddyType) < 0)
|
||||
return;
|
||||
|
||||
m = Py_InitModule3("noddy2", module_methods,
|
||||
"Example module that creates an extension type.");
|
||||
|
||||
if (m == NULL)
|
||||
return;
|
||||
|
||||
Py_INCREF(&NoddyType);
|
||||
PyModule_AddObject(m, "Noddy", (PyObject *)&NoddyType);
|
||||
}
|
243
Doc/ext/noddy3.c
|
@ -1,243 +0,0 @@
|
|||
#include <Python.h>
|
||||
#include "structmember.h"
|
||||
|
||||
typedef struct {
|
||||
PyObject_HEAD
|
||||
PyObject *first;
|
||||
PyObject *last;
|
||||
int number;
|
||||
} Noddy;
|
||||
|
||||
static void
|
||||
Noddy_dealloc(Noddy* self)
|
||||
{
|
||||
Py_XDECREF(self->first);
|
||||
Py_XDECREF(self->last);
|
||||
self->ob_type->tp_free((PyObject*)self);
|
||||
}
|
||||
|
||||
static PyObject *
|
||||
Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
|
||||
{
|
||||
Noddy *self;
|
||||
|
||||
self = (Noddy *)type->tp_alloc(type, 0);
|
||||
if (self != NULL) {
|
||||
self->first = PyString_FromString("");
|
||||
if (self->first == NULL)
|
||||
{
|
||||
Py_DECREF(self);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
self->last = PyString_FromString("");
|
||||
if (self->last == NULL)
|
||||
{
|
||||
Py_DECREF(self);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
self->number = 0;
|
||||
}
|
||||
|
||||
return (PyObject *)self;
|
||||
}
|
||||
|
||||
static int
|
||||
Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
|
||||
{
|
||||
PyObject *first=NULL, *last=NULL, *tmp;
|
||||
|
||||
static char *kwlist[] = {"first", "last", "number", NULL};
|
||||
|
||||
if (! PyArg_ParseTupleAndKeywords(args, kwds, "|SSi", kwlist,
|
||||
&first, &last,
|
||||
&self->number))
|
||||
return -1;
|
||||
|
||||
if (first) {
|
||||
tmp = self->first;
|
||||
Py_INCREF(first);
|
||||
self->first = first;
|
||||
Py_DECREF(tmp);
|
||||
}
|
||||
|
||||
if (last) {
|
||||
tmp = self->last;
|
||||
Py_INCREF(last);
|
||||
self->last = last;
|
||||
Py_DECREF(tmp);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static PyMemberDef Noddy_members[] = {
|
||||
{"number", T_INT, offsetof(Noddy, number), 0,
|
||||
"noddy number"},
|
||||
{NULL} /* Sentinel */
|
||||
};
|
||||
|
||||
static PyObject *
|
||||
Noddy_getfirst(Noddy *self, void *closure)
|
||||
{
|
||||
Py_INCREF(self->first);
|
||||
return self->first;
|
||||
}
|
||||
|
||||
static int
|
||||
Noddy_setfirst(Noddy *self, PyObject *value, void *closure)
|
||||
{
|
||||
if (value == NULL) {
|
||||
PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
|
||||
return -1;
|
||||
}
|
||||
|
||||
if (! PyString_Check(value)) {
|
||||
PyErr_SetString(PyExc_TypeError,
|
||||
"The first attribute value must be a string");
|
||||
return -1;
|
||||
}
|
||||
|
||||
Py_DECREF(self->first);
|
||||
Py_INCREF(value);
|
||||
self->first = value;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static PyObject *
|
||||
Noddy_getlast(Noddy *self, void *closure)
|
||||
{
|
||||
Py_INCREF(self->last);
|
||||
return self->last;
|
||||
}
|
||||
|
||||
static int
|
||||
Noddy_setlast(Noddy *self, PyObject *value, void *closure)
|
||||
{
|
||||
if (value == NULL) {
|
||||
PyErr_SetString(PyExc_TypeError, "Cannot delete the last attribute");
|
||||
return -1;
|
||||
}
|
||||
|
||||
if (! PyString_Check(value)) {
|
||||
PyErr_SetString(PyExc_TypeError,
|
||||
"The last attribute value must be a string");
|
||||
return -1;
|
||||
}
|
||||
|
||||
Py_DECREF(self->last);
|
||||
Py_INCREF(value);
|
||||
self->last = value;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static PyGetSetDef Noddy_getseters[] = {
|
||||
{"first",
|
||||
(getter)Noddy_getfirst, (setter)Noddy_setfirst,
|
||||
"first name",
|
||||
NULL},
|
||||
{"last",
|
||||
(getter)Noddy_getlast, (setter)Noddy_setlast,
|
||||
"last name",
|
||||
NULL},
|
||||
{NULL} /* Sentinel */
|
||||
};
|
||||
|
||||
static PyObject *
|
||||
Noddy_name(Noddy* self)
|
||||
{
|
||||
static PyObject *format = NULL;
|
||||
PyObject *args, *result;
|
||||
|
||||
if (format == NULL) {
|
||||
format = PyString_FromString("%s %s");
|
||||
if (format == NULL)
|
||||
return NULL;
|
||||
}
|
||||
|
||||
args = Py_BuildValue("OO", self->first, self->last);
|
||||
if (args == NULL)
|
||||
return NULL;
|
||||
|
||||
result = PyString_Format(format, args);
|
||||
Py_DECREF(args);
|
||||
|
||||
return result;
|
||||
}
|
||||
|
||||
static PyMethodDef Noddy_methods[] = {
|
||||
{"name", (PyCFunction)Noddy_name, METH_NOARGS,
|
||||
"Return the name, combining the first and last name"
|
||||
},
|
||||
{NULL} /* Sentinel */
|
||||
};
|
||||
|
||||
static PyTypeObject NoddyType = {
|
||||
PyObject_HEAD_INIT(NULL)
|
||||
0, /*ob_size*/
|
||||
"noddy.Noddy", /*tp_name*/
|
||||
sizeof(Noddy), /*tp_basicsize*/
|
||||
0, /*tp_itemsize*/
|
||||
(destructor)Noddy_dealloc, /*tp_dealloc*/
|
||||
0, /*tp_print*/
|
||||
0, /*tp_getattr*/
|
||||
0, /*tp_setattr*/
|
||||
0, /*tp_compare*/
|
||||
0, /*tp_repr*/
|
||||
0, /*tp_as_number*/
|
||||
0, /*tp_as_sequence*/
|
||||
0, /*tp_as_mapping*/
|
||||
0, /*tp_hash */
|
||||
0, /*tp_call*/
|
||||
0, /*tp_str*/
|
||||
0, /*tp_getattro*/
|
||||
0, /*tp_setattro*/
|
||||
0, /*tp_as_buffer*/
|
||||
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/
|
||||
"Noddy objects", /* tp_doc */
|
||||
0, /* tp_traverse */
|
||||
0, /* tp_clear */
|
||||
0, /* tp_richcompare */
|
||||
0, /* tp_weaklistoffset */
|
||||
0, /* tp_iter */
|
||||
0, /* tp_iternext */
|
||||
Noddy_methods, /* tp_methods */
|
||||
Noddy_members, /* tp_members */
|
||||
Noddy_getseters, /* tp_getset */
|
||||
0, /* tp_base */
|
||||
0, /* tp_dict */
|
||||
0, /* tp_descr_get */
|
||||
0, /* tp_descr_set */
|
||||
0, /* tp_dictoffset */
|
||||
(initproc)Noddy_init, /* tp_init */
|
||||
0, /* tp_alloc */
|
||||
Noddy_new, /* tp_new */
|
||||
};
|
||||
|
||||
static PyMethodDef module_methods[] = {
|
||||
{NULL} /* Sentinel */
|
||||
};
|
||||
|
||||
#ifndef PyMODINIT_FUNC /* declarations for DLL import/export */
|
||||
#define PyMODINIT_FUNC void
|
||||
#endif
|
||||
PyMODINIT_FUNC
|
||||
initnoddy3(void)
|
||||
{
|
||||
PyObject* m;
|
||||
|
||||
if (PyType_Ready(&NoddyType) < 0)
|
||||
return;
|
||||
|
||||
m = Py_InitModule3("noddy3", module_methods,
|
||||
"Example module that creates an extension type.");
|
||||
|
||||
if (m == NULL)
|
||||
return;
|
||||
|
||||
Py_INCREF(&NoddyType);
|
||||
PyModule_AddObject(m, "Noddy", (PyObject *)&NoddyType);
|
||||
}
|
224
Doc/ext/noddy4.c
|
@ -1,224 +0,0 @@
|
|||
#include <Python.h>
|
||||
#include "structmember.h"
|
||||
|
||||
typedef struct {
|
||||
PyObject_HEAD
|
||||
PyObject *first;
|
||||
PyObject *last;
|
||||
int number;
|
||||
} Noddy;
|
||||
|
||||
static int
|
||||
Noddy_traverse(Noddy *self, visitproc visit, void *arg)
|
||||
{
|
||||
int vret;
|
||||
|
||||
if (self->first) {
|
||||
vret = visit(self->first, arg);
|
||||
if (vret != 0)
|
||||
return vret;
|
||||
}
|
||||
if (self->last) {
|
||||
vret = visit(self->last, arg);
|
||||
if (vret != 0)
|
||||
return vret;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int
|
||||
Noddy_clear(Noddy *self)
|
||||
{
|
||||
PyObject *tmp;
|
||||
|
||||
tmp = self->first;
|
||||
self->first = NULL;
|
||||
Py_XDECREF(tmp);
|
||||
|
||||
tmp = self->last;
|
||||
self->last = NULL;
|
||||
Py_XDECREF(tmp);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void
|
||||
Noddy_dealloc(Noddy* self)
|
||||
{
|
||||
Noddy_clear(self);
|
||||
self->ob_type->tp_free((PyObject*)self);
|
||||
}
|
||||
|
||||
static PyObject *
|
||||
Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
|
||||
{
|
||||
Noddy *self;
|
||||
|
||||
self = (Noddy *)type->tp_alloc(type, 0);
|
||||
if (self != NULL) {
|
||||
self->first = PyString_FromString("");
|
||||
if (self->first == NULL)
|
||||
{
|
||||
Py_DECREF(self);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
self->last = PyString_FromString("");
|
||||
if (self->last == NULL)
|
||||
{
|
||||
Py_DECREF(self);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
self->number = 0;
|
||||
}
|
||||
|
||||
return (PyObject *)self;
|
||||
}
|
||||
|
||||
static int
|
||||
Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
|
||||
{
|
||||
PyObject *first=NULL, *last=NULL, *tmp;
|
||||
|
||||
static char *kwlist[] = {"first", "last", "number", NULL};
|
||||
|
||||
if (! PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
|
||||
&first, &last,
|
||||
&self->number))
|
||||
return -1;
|
||||
|
||||
if (first) {
|
||||
tmp = self->first;
|
||||
Py_INCREF(first);
|
||||
self->first = first;
|
||||
Py_XDECREF(tmp);
|
||||
}
|
||||
|
||||
if (last) {
|
||||
tmp = self->last;
|
||||
Py_INCREF(last);
|
||||
self->last = last;
|
||||
Py_XDECREF(tmp);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
||||
static PyMemberDef Noddy_members[] = {
|
||||
{"first", T_OBJECT_EX, offsetof(Noddy, first), 0,
|
||||
"first name"},
|
||||
{"last", T_OBJECT_EX, offsetof(Noddy, last), 0,
|
||||
"last name"},
|
||||
{"number", T_INT, offsetof(Noddy, number), 0,
|
||||
"noddy number"},
|
||||
{NULL} /* Sentinel */
|
||||
};
|
||||
|
||||
static PyObject *
|
||||
Noddy_name(Noddy* self)
|
||||
{
|
||||
static PyObject *format = NULL;
|
||||
PyObject *args, *result;
|
||||
|
||||
if (format == NULL) {
|
||||
format = PyString_FromString("%s %s");
|
||||
if (format == NULL)
|
||||
return NULL;
|
||||
}
|
||||
|
||||
if (self->first == NULL) {
|
||||
PyErr_SetString(PyExc_AttributeError, "first");
|
||||
return NULL;
|
||||
}
|
||||
|
||||
if (self->last == NULL) {
|
||||
PyErr_SetString(PyExc_AttributeError, "last");
|
||||
return NULL;
|
||||
}
|
||||
|
||||
args = Py_BuildValue("OO", self->first, self->last);
|
||||
if (args == NULL)
|
||||
return NULL;
|
||||
|
||||
result = PyString_Format(format, args);
|
||||
Py_DECREF(args);
|
||||
|
||||
return result;
|
||||
}
|
||||
|
||||
static PyMethodDef Noddy_methods[] = {
|
||||
{"name", (PyCFunction)Noddy_name, METH_NOARGS,
|
||||
"Return the name, combining the first and last name"
|
||||
},
|
||||
{NULL} /* Sentinel */
|
||||
};
|
||||
|
||||
static PyTypeObject NoddyType = {
|
||||
PyObject_HEAD_INIT(NULL)
|
||||
0, /*ob_size*/
|
||||
"noddy.Noddy", /*tp_name*/
|
||||
sizeof(Noddy), /*tp_basicsize*/
|
||||
0, /*tp_itemsize*/
|
||||
(destructor)Noddy_dealloc, /*tp_dealloc*/
|
||||
0, /*tp_print*/
|
||||
0, /*tp_getattr*/
|
||||
0, /*tp_setattr*/
|
||||
0, /*tp_compare*/
|
||||
0, /*tp_repr*/
|
||||
0, /*tp_as_number*/
|
||||
0, /*tp_as_sequence*/
|
||||
0, /*tp_as_mapping*/
|
||||
0, /*tp_hash */
|
||||
0, /*tp_call*/
|
||||
0, /*tp_str*/
|
||||
0, /*tp_getattro*/
|
||||
0, /*tp_setattro*/
|
||||
0, /*tp_as_buffer*/
|
||||
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC, /*tp_flags*/
|
||||
"Noddy objects", /* tp_doc */
|
||||
(traverseproc)Noddy_traverse, /* tp_traverse */
|
||||
(inquiry)Noddy_clear, /* tp_clear */
|
||||
0, /* tp_richcompare */
|
||||
0, /* tp_weaklistoffset */
|
||||
0, /* tp_iter */
|
||||
0, /* tp_iternext */
|
||||
Noddy_methods, /* tp_methods */
|
||||
Noddy_members, /* tp_members */
|
||||
0, /* tp_getset */
|
||||
0, /* tp_base */
|
||||
0, /* tp_dict */
|
||||
0, /* tp_descr_get */
|
||||
0, /* tp_descr_set */
|
||||
0, /* tp_dictoffset */
|
||||
(initproc)Noddy_init, /* tp_init */
|
||||
0, /* tp_alloc */
|
||||
Noddy_new, /* tp_new */
|
||||
};
|
||||
|
||||
static PyMethodDef module_methods[] = {
|
||||
{NULL} /* Sentinel */
|
||||
};
|
||||
|
||||
#ifndef PyMODINIT_FUNC /* declarations for DLL import/export */
|
||||
#define PyMODINIT_FUNC void
|
||||
#endif
|
||||
PyMODINIT_FUNC
|
||||
initnoddy4(void)
|
||||
{
|
||||
PyObject* m;
|
||||
|
||||
if (PyType_Ready(&NoddyType) < 0)
|
||||
return;
|
||||
|
||||
m = Py_InitModule3("noddy4", module_methods,
|
||||
"Example module that creates an extension type.");
|
||||
|
||||
if (m == NULL)
|
||||
return;
|
||||
|
||||
Py_INCREF(&NoddyType);
|
||||
PyModule_AddObject(m, "Noddy", (PyObject *)&NoddyType);
|
||||
}
|
|
@ -1,68 +0,0 @@
|
|||
#include <Python.h>
|
||||
|
||||
int
|
||||
main(int argc, char *argv[])
|
||||
{
|
||||
PyObject *pName, *pModule, *pDict, *pFunc;
|
||||
PyObject *pArgs, *pValue;
|
||||
int i;
|
||||
|
||||
if (argc < 3) {
|
||||
fprintf(stderr,"Usage: call pythonfile funcname [args]\n");
|
||||
return 1;
|
||||
}
|
||||
|
||||
Py_Initialize();
|
||||
pName = PyString_FromString(argv[1]);
|
||||
/* Error checking of pName left out */
|
||||
|
||||
pModule = PyImport_Import(pName);
|
||||
Py_DECREF(pName);
|
||||
|
||||
if (pModule != NULL) {
|
||||
pFunc = PyObject_GetAttrString(pModule, argv[2]);
|
||||
/* pFunc is a new reference */
|
||||
|
||||
if (pFunc && PyCallable_Check(pFunc)) {
|
||||
pArgs = PyTuple_New(argc - 3);
|
||||
for (i = 0; i < argc - 3; ++i) {
|
||||
pValue = PyInt_FromLong(atoi(argv[i + 3]));
|
||||
if (!pValue) {
|
||||
Py_DECREF(pArgs);
|
||||
Py_DECREF(pModule);
|
||||
fprintf(stderr, "Cannot convert argument\n");
|
||||
return 1;
|
||||
}
|
||||
/* pValue reference stolen here: */
|
||||
PyTuple_SetItem(pArgs, i, pValue);
|
||||
}
|
||||
pValue = PyObject_CallObject(pFunc, pArgs);
|
||||
Py_DECREF(pArgs);
|
||||
if (pValue != NULL) {
|
||||
printf("Result of call: %ld\n", PyInt_AsLong(pValue));
|
||||
Py_DECREF(pValue);
|
||||
}
|
||||
else {
|
||||
Py_DECREF(pFunc);
|
||||
Py_DECREF(pModule);
|
||||
PyErr_Print();
|
||||
fprintf(stderr,"Call failed\n");
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
else {
|
||||
if (PyErr_Occurred())
|
||||
PyErr_Print();
|
||||
fprintf(stderr, "Cannot find function \"%s\"\n", argv[2]);
|
||||
}
|
||||
Py_XDECREF(pFunc);
|
||||
Py_DECREF(pModule);
|
||||
}
|
||||
else {
|
||||
PyErr_Print();
|
||||
fprintf(stderr, "Failed to load \"%s\"\n", argv[1]);
|
||||
return 1;
|
||||
}
|
||||
Py_Finalize();
|
||||
return 0;
|
||||
}
|
|
@ -1,8 +0,0 @@
|
|||
from distutils.core import setup, Extension
|
||||
setup(name="noddy", version="1.0",
|
||||
ext_modules=[
|
||||
Extension("noddy", ["noddy.c"]),
|
||||
Extension("noddy2", ["noddy2.c"]),
|
||||
Extension("noddy3", ["noddy3.c"]),
|
||||
Extension("noddy4", ["noddy4.c"]),
|
||||
])
|
|
@ -1,91 +0,0 @@
|
|||
#include <Python.h>
|
||||
|
||||
typedef struct {
|
||||
PyListObject list;
|
||||
int state;
|
||||
} Shoddy;
|
||||
|
||||
|
||||
static PyObject *
|
||||
Shoddy_increment(Shoddy *self, PyObject *unused)
|
||||
{
|
||||
self->state++;
|
||||
return PyInt_FromLong(self->state);
|
||||
}
|
||||
|
||||
|
||||
static PyMethodDef Shoddy_methods[] = {
|
||||
{"increment", (PyCFunction)Shoddy_increment, METH_NOARGS,
|
||||
PyDoc_STR("increment state counter")},
|
||||
{NULL, NULL},
|
||||
};
|
||||
|
||||
static int
|
||||
Shoddy_init(Shoddy *self, PyObject *args, PyObject *kwds)
|
||||
{
|
||||
if (PyList_Type.tp_init((PyObject *)self, args, kwds) < 0)
|
||||
return -1;
|
||||
self->state = 0;
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
||||
static PyTypeObject ShoddyType = {
|
||||
PyObject_HEAD_INIT(NULL)
|
||||
0, /* ob_size */
|
||||
"shoddy.Shoddy", /* tp_name */
|
||||
sizeof(Shoddy), /* tp_basicsize */
|
||||
0, /* tp_itemsize */
|
||||
0, /* tp_dealloc */
|
||||
0, /* tp_print */
|
||||
0, /* tp_getattr */
|
||||
0, /* tp_setattr */
|
||||
0, /* tp_compare */
|
||||
0, /* tp_repr */
|
||||
0, /* tp_as_number */
|
||||
0, /* tp_as_sequence */
|
||||
0, /* tp_as_mapping */
|
||||
0, /* tp_hash */
|
||||
0, /* tp_call */
|
||||
0, /* tp_str */
|
||||
0, /* tp_getattro */
|
||||
0, /* tp_setattro */
|
||||
0, /* tp_as_buffer */
|
||||
Py_TPFLAGS_DEFAULT |
|
||||
Py_TPFLAGS_BASETYPE, /* tp_flags */
|
||||
0, /* tp_doc */
|
||||
0, /* tp_traverse */
|
||||
0, /* tp_clear */
|
||||
0, /* tp_richcompare */
|
||||
0, /* tp_weaklistoffset */
|
||||
0, /* tp_iter */
|
||||
0, /* tp_iternext */
|
||||
Shoddy_methods, /* tp_methods */
|
||||
0, /* tp_members */
|
||||
0, /* tp_getset */
|
||||
0, /* tp_base */
|
||||
0, /* tp_dict */
|
||||
0, /* tp_descr_get */
|
||||
0, /* tp_descr_set */
|
||||
0, /* tp_dictoffset */
|
||||
(initproc)Shoddy_init, /* tp_init */
|
||||
0, /* tp_alloc */
|
||||
0, /* tp_new */
|
||||
};
|
||||
|
||||
PyMODINIT_FUNC
|
||||
initshoddy(void)
|
||||
{
|
||||
PyObject *m;
|
||||
|
||||
ShoddyType.tp_base = &PyList_Type;
|
||||
if (PyType_Ready(&ShoddyType) < 0)
|
||||
return;
|
||||
|
||||
m = Py_InitModule3("shoddy", NULL, "Shoddy module");
|
||||
if (m == NULL)
|
||||
return;
|
||||
|
||||
Py_INCREF(&ShoddyType);
|
||||
PyModule_AddObject(m, "Shoddy", (PyObject *) &ShoddyType);
|
||||
}
|
213
Doc/ext/test.py
|
@ -1,213 +0,0 @@
|
|||
"""Test module for the noddy examples
|
||||
|
||||
Noddy 1:
|
||||
|
||||
>>> import noddy
|
||||
>>> n1 = noddy.Noddy()
|
||||
>>> n2 = noddy.Noddy()
|
||||
>>> del n1
|
||||
>>> del n2
|
||||
|
||||
|
||||
Noddy 2
|
||||
|
||||
>>> import noddy2
|
||||
>>> n1 = noddy2.Noddy('jim', 'fulton', 42)
|
||||
>>> n1.first
|
||||
'jim'
|
||||
>>> n1.last
|
||||
'fulton'
|
||||
>>> n1.number
|
||||
42
|
||||
>>> n1.name()
|
||||
'jim fulton'
|
||||
>>> n1.first = 'will'
|
||||
>>> n1.name()
|
||||
'will fulton'
|
||||
>>> n1.last = 'tell'
|
||||
>>> n1.name()
|
||||
'will tell'
|
||||
>>> del n1.first
|
||||
>>> n1.name()
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
AttributeError: first
|
||||
>>> n1.first
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
AttributeError: first
|
||||
>>> n1.first = 'drew'
|
||||
>>> n1.first
|
||||
'drew'
|
||||
>>> del n1.number
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
TypeError: can't delete numeric/char attribute
|
||||
>>> n1.number=2
|
||||
>>> n1.number
|
||||
2
|
||||
>>> n1.first = 42
|
||||
>>> n1.name()
|
||||
'42 tell'
|
||||
>>> n2 = noddy2.Noddy()
|
||||
>>> n2.name()
|
||||
' '
|
||||
>>> n2.first
|
||||
''
|
||||
>>> n2.last
|
||||
''
|
||||
>>> del n2.first
|
||||
>>> n2.first
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
AttributeError: first
|
||||
>>> n2.first
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
AttributeError: first
|
||||
>>> n2.name()
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in ?
|
||||
AttributeError: first
|
||||
>>> n2.number
|
||||
0
|
||||
>>> n3 = noddy2.Noddy('jim', 'fulton', 'waaa')
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in ?
|
||||
TypeError: an integer is required
|
||||
>>> del n1
|
||||
>>> del n2
|
||||
|
||||
|
||||
Noddy 3
|
||||
|
||||
>>> import noddy3
|
||||
>>> n1 = noddy3.Noddy('jim', 'fulton', 42)
|
||||
>>> n1 = noddy3.Noddy('jim', 'fulton', 42)
|
||||
>>> n1.name()
|
||||
'jim fulton'
|
||||
>>> del n1.first
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in ?
|
||||
TypeError: Cannot delete the first attribute
|
||||
>>> n1.first = 42
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in ?
|
||||
TypeError: The first attribute value must be a string
|
||||
>>> n1.first = 'will'
|
||||
>>> n1.name()
|
||||
'will fulton'
|
||||
>>> n2 = noddy3.Noddy()
|
||||
>>> n2 = noddy3.Noddy()
|
||||
>>> n2 = noddy3.Noddy()
|
||||
>>> n3 = noddy3.Noddy('jim', 'fulton', 'waaa')
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in ?
|
||||
TypeError: an integer is required
|
||||
>>> del n1
|
||||
>>> del n2
|
||||
|
||||
Noddy 4
|
||||
|
||||
>>> import noddy4
|
||||
>>> n1 = noddy4.Noddy('jim', 'fulton', 42)
|
||||
>>> n1.first
|
||||
'jim'
|
||||
>>> n1.last
|
||||
'fulton'
|
||||
>>> n1.number
|
||||
42
|
||||
>>> n1.name()
|
||||
'jim fulton'
|
||||
>>> n1.first = 'will'
|
||||
>>> n1.name()
|
||||
'will fulton'
|
||||
>>> n1.last = 'tell'
|
||||
>>> n1.name()
|
||||
'will tell'
|
||||
>>> del n1.first
|
||||
>>> n1.name()
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
AttributeError: first
|
||||
>>> n1.first
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
AttributeError: first
|
||||
>>> n1.first = 'drew'
|
||||
>>> n1.first
|
||||
'drew'
|
||||
>>> del n1.number
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
TypeError: can't delete numeric/char attribute
|
||||
>>> n1.number=2
|
||||
>>> n1.number
|
||||
2
|
||||
>>> n1.first = 42
|
||||
>>> n1.name()
|
||||
'42 tell'
|
||||
>>> n2 = noddy4.Noddy()
|
||||
>>> n2 = noddy4.Noddy()
|
||||
>>> n2 = noddy4.Noddy()
|
||||
>>> n2 = noddy4.Noddy()
|
||||
>>> n2.name()
|
||||
' '
|
||||
>>> n2.first
|
||||
''
|
||||
>>> n2.last
|
||||
''
|
||||
>>> del n2.first
|
||||
>>> n2.first
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
AttributeError: first
|
||||
>>> n2.first
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
AttributeError: first
|
||||
>>> n2.name()
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in ?
|
||||
AttributeError: first
|
||||
>>> n2.number
|
||||
0
|
||||
>>> n3 = noddy4.Noddy('jim', 'fulton', 'waaa')
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in ?
|
||||
TypeError: an integer is required
|
||||
|
||||
|
||||
Test cyclic gc(?)
|
||||
|
||||
>>> import gc
|
||||
>>> gc.disable()
|
||||
|
||||
>>> x = []
|
||||
>>> l = [x]
|
||||
>>> n2.first = l
|
||||
>>> n2.first
|
||||
[[]]
|
||||
>>> l.append(n2)
|
||||
>>> del l
|
||||
>>> del n1
|
||||
>>> del n2
|
||||
>>> sys.getrefcount(x)
|
||||
3
|
||||
>>> ignore = gc.collect()
|
||||
>>> sys.getrefcount(x)
|
||||
2
|
||||
|
||||
>>> gc.enable()
|
||||
"""
|
||||
|
||||
import os
|
||||
import sys
|
||||
from distutils.util import get_platform
|
||||
PLAT_SPEC = "%s-%s" % (get_platform(), sys.version[0:3])
|
||||
src = os.path.join("build", "lib.%s" % PLAT_SPEC)
|
||||
sys.path.append(src)
|
||||
|
||||
if __name__ == "__main__":
|
||||
    import doctest, __main__
|
||||
    doctest.testmod(__main__)
|
|
@ -1,320 +0,0 @@
|
|||
\chapter{Building C and \Cpp{} Extensions on Windows%
|
||||
\label{building-on-windows}}
|
||||
|
||||
|
||||
This chapter briefly explains how to create a Windows extension module
|
||||
for Python using Microsoft Visual \Cpp, and follows with more
|
||||
detailed background information on how it works. The explanatory
|
||||
material is useful for both the Windows programmer learning to build
|
||||
Python extensions and the \UNIX{} programmer interested in producing
|
||||
software which can be successfully built on both \UNIX{} and Windows.
|
||||
|
||||
Module authors are encouraged to use the distutils approach for
|
||||
building extension modules, instead of the one described in this
|
||||
section. You will still need the C compiler that was used to build
|
||||
Python, typically Microsoft Visual \Cpp.
|
||||
|
||||
\begin{notice}
|
||||
This chapter mentions a number of filenames that include an encoded
|
||||
Python version number. These filenames are represented with the
|
||||
version number shown as \samp{XY}; in practice, \character{X} will
|
||||
be the major version number and \character{Y} will be the minor
|
||||
version number of the Python release you're working with. For
|
||||
example, if you are using Python 2.2.1, \samp{XY} will actually be
|
||||
\samp{22}.
|
||||
\end{notice}
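
As an illustration of the \samp{XY} convention, the tag for a given interpreter can be derived from its version information. This is a small sketch and the helper name \code{version_tag} is hypothetical. It simply concatenates the major and minor version numbers as described in the notice above.

```python
import sys


def version_tag(version_info=None):
    """Return the 'XY' tag (major then minor) for a Python version.

    Hypothetical helper; defaults to the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    return "%d%d" % (major, minor)


if __name__ == "__main__":
    # Python 2.2.1 gives the tag "22", as in the notice above.
    print(version_tag((2, 2, 1)))
    # Tag for the interpreter running this script.
    print(version_tag())
```

The micro version (the final \samp{1} in 2.2.1) does not take part in the tag, so all 2.2.x releases share the same encoded filenames.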
|
||||
|
||||
|
||||
\section{A Cookbook Approach \label{win-cookbook}}
|
||||
|
||||
There are two approaches to building extension modules on Windows,
|
||||
just as there are on \UNIX: use the
|
||||
\ulink{\module{distutils}}{../lib/module-distutils.html} package to
|
||||
control the build process, or do things manually. The distutils
|
||||
approach works well for most extensions; documentation on using
|
||||
\ulink{\module{distutils}}{../lib/module-distutils.html} to build and
|
||||
package extension modules is available in
|
||||
\citetitle[../dist/dist.html]{Distributing Python Modules}. This
|
||||
section describes the manual approach to building Python extensions
|
||||
written in C or \Cpp.
|
||||
|
||||
To build extensions using these instructions, you need to have a copy
|
||||
of the Python sources of the same version as your installed Python.
|
||||
You will need Microsoft Visual \Cpp{} ``Developer Studio''; project
|
||||
files are supplied for V\Cpp{} version 7.1, but you can use older
|
||||
versions of V\Cpp. Notice that you should use the same version of
|
||||
V\Cpp{} that was used to build Python itself. The example files
|
||||
described here are distributed with the Python sources in the
|
||||
\file{PC\textbackslash example_nt\textbackslash} directory.
|
||||
|
||||
\begin{enumerate}
|
||||
\item
|
||||
\strong{Copy the example files}\\
|
||||
The \file{example_nt} directory is a subdirectory of the \file{PC}
|
||||
directory, in order to keep all the PC-specific files under the
|
||||
same directory in the source distribution. However, the
|
||||
\file{example_nt} directory can't actually be used from this
|
||||
location. You first need to copy or move it up one level, so that
|
||||
\file{example_nt} is a sibling of the \file{PC} and \file{Include}
|
||||
directories. Do all your work from within this new location.
|
||||
|
||||
\item
|
||||
\strong{Open the project}\\
|
||||
From V\Cpp, use the \menuselection{File \sub Open Solution}
|
||||
dialog (not \menuselection{File \sub Open}!). Navigate to and
|
||||
select the file \file{example.sln}, in the \emph{copy} of the
|
||||
\file{example_nt} directory you made above. Click Open.
|
||||
|
||||
\item
|
||||
\strong{Build the example DLL}\\
|
||||
In order to check that everything is set up right, try building:
|
||||
|
||||
\begin{enumerate}
|
||||
\item
|
||||
Select a configuration. This step is optional. Choose
|
||||
\menuselection{Build \sub Configuration Manager \sub Active
|
||||
Solution Configuration} and select either \guilabel{Release}
|
||||
or \guilabel{Debug}. If you skip this step,
|
||||
V\Cpp{} will use the Debug configuration by default.
|
||||
|
||||
\item
|
||||
Build the DLL. Choose \menuselection{Build \sub Build
|
||||
Solution}. This creates all intermediate and result files in
|
||||
a subdirectory called either \file{Debug} or \file{Release},
|
||||
depending on which configuration you selected in the preceding
|
||||
step.
|
||||
\end{enumerate}
|
||||
|
||||
\item
|
||||
\strong{Testing the debug-mode DLL}\\
|
||||
Once the Debug build has succeeded, bring up a DOS box, and change
|
||||
to the \file{example_nt\textbackslash Debug} directory. You
|
||||
should now be able to repeat the following session (\code{C>} is
|
||||
the DOS prompt, \code{>>>} is the Python prompt; note that
|
||||
build information and various debug output from Python may not
|
||||
match this screen dump exactly):
|
||||
|
||||
\begin{verbatim}
|
||||
C>..\..\PCbuild\python_d
|
||||
Adding parser accelerators ...
|
||||
Done.
|
||||
Python 2.2 (#28, Dec 19 2001, 23:26:37) [MSC 32 bit (Intel)] on win32
|
||||
Type "copyright", "credits" or "license" for more information.
|
||||
>>> import example
|
||||
[4897 refs]
|
||||
>>> example.foo()
|
||||
Hello, world
|
||||
[4903 refs]
|
||||
>>>
|
||||
\end{verbatim}
|
||||
|
||||
Congratulations! You've successfully built your first Python
|
||||
extension module.
|
||||
|
||||
\item
|
||||
\strong{Creating your own project}\\
|
||||
Choose a name and create a directory for it. Copy your C sources
|
||||
into it. Note that the module source file name does not
|
||||
necessarily have to match the module name, but the name of the
|
||||
initialization function should match the module name --- you can
|
||||
only import a module \module{spam} if its initialization function
|
||||
is called \cfunction{initspam()}, and it should call
|
||||
\cfunction{Py_InitModule()} with the string \code{"spam"} as its
|
||||
first argument (use the minimal \file{example.c} in this directory
|
||||
as a guide). By convention, it lives in a file called
|
||||
\file{spam.c} or \file{spammodule.c}. The output file should be
|
||||
called \file{spam.dll} or \file{spam.pyd} (the latter is supported
|
||||
to avoid confusion with a system library \file{spam.dll} to which
|
||||
your module could be a Python interface) in Release mode, or
|
||||
\file{spam_d.dll} or \file{spam_d.pyd} in Debug mode.
|
||||
|
||||
Now your options are:
|
||||
|
||||
\begin{enumerate}
|
||||
\item Copy \file{example.sln} and \file{example.vcproj}, rename
|
||||
them to \file{spam.*}, and edit them by hand, or
|
||||
\item Create a brand new project; instructions are below.
|
||||
\end{enumerate}
|
||||
|
||||
In either case, copy \file{example_nt\textbackslash example.def}
|
||||
to \file{spam\textbackslash spam.def}, and edit the new
|
||||
\file{spam.def} so its second line contains the string
|
||||
`\code{initspam}'. If you created a new project yourself, add the
|
||||
file \file{spam.def} to the project now. (This is an annoying
|
||||
little file with only two lines. An alternative approach is to
|
||||
forget about the \file{.def} file, and add the option
|
||||
\programopt{/export:initspam} somewhere to the Link settings, by
|
||||
manually editing the setting in the Project Properties dialog.)
|
||||
|
||||
\item
|
||||
\strong{Creating a brand new project}\\
|
||||
Use the \menuselection{File \sub New \sub Project} dialog to
|
||||
create a new Project Workspace. Select \guilabel{Visual C++
|
||||
Projects/Win32/ Win32 Project}, enter the name (\samp{spam}), and
|
||||
make sure the Location is set to the parent of the \file{spam}
|
||||
directory you have created (which should be a direct subdirectory
|
||||
of the Python build tree, a sibling of \file{Include} and
|
||||
\file{PC}). Select Win32 as the platform (in my version, this is
|
||||
the only choice). Make sure the Create new workspace radio button
|
||||
is selected. Click OK.
|
||||
|
||||
You should now create the file \file{spam.def} as instructed in
|
||||
the previous section. Add the source files to the project, using
|
||||
\menuselection{Project \sub Add Existing Item}. Set the pattern to
|
||||
\code{*.*} and select both \file{spam.c} and \file{spam.def} and
|
||||
click OK. (Inserting them one by one is fine too.)
|
||||
|
||||
Now open the \menuselection{Project \sub spam properties} dialog.
|
||||
You only need to change a few settings. Make sure \guilabel{All
|
||||
Configurations} is selected from the \guilabel{Settings for:}
|
||||
dropdown list. Select the C/\Cpp{} tab. Choose the General
|
||||
category in the popup menu at the top. Type the following text in
|
||||
the entry box labeled \guilabel{Additional Include Directories}:
|
||||
|
||||
\begin{verbatim}
|
||||
..\Include,..\PC
|
||||
\end{verbatim}
|
||||
|
||||
Then, choose the General category in the Linker tab, and enter
|
||||
|
||||
\begin{verbatim}
|
||||
..\PCbuild
|
||||
\end{verbatim}
|
||||
|
||||
in the text box labeled \guilabel{Additional Library Directories}.
|
||||
|
||||
Now you need to add some mode-specific settings:
|
||||
|
||||
Select \guilabel{Release} in the \guilabel{Configuration}
|
||||
dropdown list. Choose the \guilabel{Linker} tab, choose the
|
||||
\guilabel{Input} category, and append \code{pythonXY.lib} to the
|
||||
list in the \guilabel{Additional Dependencies} box.
|
||||
|
||||
Select \guilabel{Debug} in the \guilabel{Configuration} dropdown
|
||||
list, and append \code{pythonXY_d.lib} to the list in the
|
||||
\guilabel{Additional Dependencies} box. Then click the C/\Cpp{}
|
||||
tab, select \guilabel{Code Generation}, and select
|
||||
\guilabel{Multi-threaded Debug DLL} from the \guilabel{Runtime
|
||||
library} dropdown list.
|
||||
|
||||
Select \guilabel{Release} again from the \guilabel{Configuration}
|
||||
dropdown list. Select \guilabel{Multi-threaded DLL} from the
|
||||
\guilabel{Runtime library} dropdown list.
|
||||
\end{enumerate}
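
The naming rules from the ``Creating your own project'' step can be summarized in a short sketch. The function names \code{extension_names} and \code{def_file_text} are hypothetical helpers written for this illustration. The \code{EXPORTS} first line of the \file{.def} file is an assumption: the text above only states that the second line names the initialization function.

```python
def extension_names(module, debug=False):
    """Names the build is expected to use for an extension called `module`."""
    suffix = "_d" if debug else ""
    return {
        # The initialization function must match the module name.
        "init_function": "init" + module,
        # Either extension is accepted; Debug builds carry a _d suffix.
        "output_files": [module + suffix + ".dll", module + suffix + ".pyd"],
    }


def def_file_text(module):
    # Assumption: the first line is an EXPORTS statement; the second line
    # names the initialization function, as the text describes.
    return "EXPORTS\n\tinit" + module + "\n"


print(extension_names("spam"))
print(extension_names("spam", debug=True)["output_files"])
print(def_file_text("spam"))
```

This is only a checklist in code form; it does not drive V\Cpp{} or create any files.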
|
||||
|
||||
|
||||
If your module creates a new type, you may have trouble with this line:
|
||||
|
||||
\begin{verbatim}
|
||||
PyObject_HEAD_INIT(&PyType_Type)
|
||||
\end{verbatim}
|
||||
|
||||
Change it to:
|
||||
|
||||
\begin{verbatim}
|
||||
PyObject_HEAD_INIT(NULL)
|
||||
\end{verbatim}
|
||||
|
||||
and add the following to the module initialization function:
|
||||
|
||||
\begin{verbatim}
|
||||
MyObject_Type.ob_type = &PyType_Type;
|
||||
\end{verbatim}
|
||||
|
||||
Refer to section~3 of the
|
||||
\citetitle[http://www.python.org/doc/FAQ.html]{Python FAQ} for details
|
||||
on why you must do this.
|
||||
|
||||
|
||||
\section{Differences Between \UNIX{} and Windows
|
||||
\label{dynamic-linking}}
|
||||
\sectionauthor{Chris Phoenix}{cphoenix@best.com}
|
||||
|
||||
|
||||
\UNIX{} and Windows use completely different paradigms for run-time
|
||||
loading of code. Before you try to build a module that can be
|
||||
dynamically loaded, be aware of how your system works.
|
||||
|
||||
In \UNIX, a shared object (\file{.so}) file contains code to be used by the
|
||||
program, and also the names of functions and data that it expects to
|
||||
find in the program. When the file is joined to the program, all
|
||||
references to those functions and data in the file's code are changed
|
||||
to point to the actual locations in the program where the functions
|
||||
and data are placed in memory. This is basically a link operation.
|
||||
|
||||
In Windows, a dynamic-link library (\file{.dll}) file has no dangling
|
||||
references. Instead, an access to functions or data goes through a
|
||||
lookup table. So the DLL code does not have to be fixed up at runtime
|
||||
to refer to the program's memory; instead, the code already uses the
|
||||
DLL's lookup table, and the lookup table is modified at runtime to
|
||||
point to the functions and data.
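
The lookup-table idea can be modeled in a few lines of Python. This is a toy analogy only, not how the Windows loader is implemented; the class \code{ToyDll} and the name \code{format_greeting} are invented for the illustration.

```python
class ToyDll:
    """Toy model of a DLL whose external calls go through a lookup table."""

    def __init__(self, imported_names):
        # One slot per imported name; the slots stay empty until load time.
        self.lookup_table = dict.fromkeys(imported_names)

    def load(self, program_exports):
        # At load time only the table is patched; greet() below is untouched.
        for name in self.lookup_table:
            self.lookup_table[name] = program_exports[name]

    def greet(self):
        # The "compiled code" always calls through the table.
        return self.lookup_table["format_greeting"]("world")


dll = ToyDll(["format_greeting"])
dll.load({"format_greeting": lambda who: "Hello, " + who})
print(dll.greet())  # Hello, world
```

The point of the model is that \code{greet()} never changes: loading fills in table entries, whereas the \UNIX{} approach described earlier rewrites the references in the code itself.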
|
||||
|
||||
In \UNIX, there is only one type of library file (\file{.a}) which
|
||||
contains code from several object files (\file{.o}). During the link
|
||||
step to create a shared object file (\file{.so}), the linker may find
|
||||
that it doesn't know where an identifier is defined. The linker will
|
||||
look for it in the object files in the libraries; if it finds it, it
|
||||
will include all the code from that object file.
|
||||
|
||||
In Windows, there are two types of library, a static library and an
|
||||
import library (both called \file{.lib}). A static library is like a
|
||||
\UNIX{} \file{.a} file; it contains code to be included as necessary.
|
||||
An import library is basically used only to reassure the linker that a
|
||||
certain identifier is legal, and will be present in the program when
|
||||
the DLL is loaded. So the linker uses the information from the
|
||||
import library to build the lookup table for using identifiers that
|
||||
are not included in the DLL. When an application or a DLL is linked,
|
||||
an import library may be generated, which will need to be used for all
|
||||
future DLLs that depend on the symbols in the application or DLL.
|
||||
|
||||
Suppose you are building two dynamic-load modules, B and C, which should
|
||||
share another block of code A. On \UNIX, you would \emph{not} pass
|
||||
\file{A.a} to the linker for \file{B.so} and \file{C.so}; that would
|
||||
cause it to be included twice, so that B and C would each have their
|
||||
own copy. In Windows, building \file{A.dll} will also build
|
||||
\file{A.lib}. You \emph{do} pass \file{A.lib} to the linker for B and
|
||||
C. \file{A.lib} does not contain code; it just contains information
|
||||
which will be used at runtime to access A's code.
|
||||
|
||||
In Windows, using an import library is sort of like using \samp{import
|
||||
spam}; it gives you access to spam's names, but does not create a
|
||||
separate copy. On \UNIX, linking with a library is more like
|
||||
\samp{from spam import *}; it does create a separate copy.
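
The Python half of this analogy can be checked directly. The sketch below builds a throwaway module object named \code{spam} (an assumption made so the example is self-contained) and mimics the two import styles with plain dictionaries.

```python
import types

# A stand-in module so the example needs no file on disk.
spam = types.ModuleType("spam")
spam.counter = 0

# "import spam": the importer holds a reference to the one shared module.
importer = {"spam": spam}

# "from spam import *": the importer gets its own copy of each public name.
star_importer = {name: value for name, value in vars(spam).items()
                 if not name.startswith("_")}

# Rebind the name in the original module afterwards.
spam.counter = 1

print(importer["spam"].counter)  # 1, the shared module is seen
print(star_importer["counter"])  # 0, the copied binding is stale
```

In the same way, code linked against an import library keeps using the single copy in the DLL, while code linked statically carries its own copy.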
|
||||
|
||||
|
||||
\section{Using DLLs in Practice \label{win-dlls}}
|
||||
\sectionauthor{Chris Phoenix}{cphoenix@best.com}
|
||||
|
||||
Windows Python is built in Microsoft Visual \Cpp; using other
|
||||
compilers may or may not work (though Borland seems to). The rest of
|
||||
this section is MSV\Cpp{} specific.
|
||||
|
||||
When creating DLLs in Windows, you must pass \file{pythonXY.lib} to
|
||||
the linker. To build two DLLs, spam and ni (which uses C functions
|
||||
found in spam), you could use these commands:
|
||||
|
||||
\begin{verbatim}
|
||||
cl /LD /I/python/include spam.c ../libs/pythonXY.lib
|
||||
cl /LD /I/python/include ni.c spam.lib ../libs/pythonXY.lib
|
||||
\end{verbatim}
|
||||
|
||||
The first command created three files: \file{spam.obj},
|
||||
\file{spam.dll} and \file{spam.lib}. \file{Spam.dll} does not contain
|
||||
any Python functions (such as \cfunction{PyArg_ParseTuple()}), but it
|
||||
does know how to find the Python code thanks to \file{pythonXY.lib}.
|
||||
|
||||
The second command created \file{ni.dll} (and \file{.obj} and
|
||||
\file{.lib}), which knows how to find the necessary functions from
|
||||
spam, and also from the Python executable.
|
||||
|
||||
Not every identifier is exported to the lookup table. If you want any
|
||||
other modules (including Python) to be able to see your identifiers,
|
||||
you have to say \samp{_declspec(dllexport)}, as in \samp{void
|
||||
_declspec(dllexport) initspam(void)} or \samp{PyObject
|
||||
_declspec(dllexport) *NiGetSpamData(void)}.
|
||||
|
||||
Developer Studio will throw in a lot of import libraries that you do
|
||||
not really need, adding about 100K to your executable. To get rid of
|
||||
them, use the Project Settings dialog, Link tab, to specify
|
||||
\emph{ignore default libraries}. Add the correct
|
||||
\file{msvcrt\var{xx}.lib} to the list of libraries.
|
|
@ -1,84 +0,0 @@
|
|||
# Makefile for the HOWTO directory
|
||||
# LaTeX HOWTOs can be turned into HTML, PDF, PS, DVI or plain text output.
|
||||
# reST HOWTOs can only be turned into HTML.
|
||||
|
||||
# Variables to change
|
||||
|
||||
# Paper size for non-HTML formats (letter or a4)
|
||||
PAPER=letter
|
||||
|
||||
# Arguments to rst2html.py, and location of the script
|
||||
RSTARGS = --input-encoding=utf-8
|
||||
RST2HTML = rst2html.py
|
||||
|
||||
# List of HOWTOs that aren't to be processed. This should contain the
|
||||
# base name of the HOWTO without any extension (e.g. 'advocacy',
|
||||
# 'unicode').
|
||||
REMOVE_HOWTOS =
|
||||
|
||||
MKHOWTO=../tools/mkhowto
|
||||
WEBDIR=.
|
||||
PAPERDIR=../paper-$(PAPER)
|
||||
HTMLDIR=../html
|
||||
|
||||
# Determine list of files to be built
|
||||
TEX_SOURCES = $(wildcard *.tex)
|
||||
RST_SOURCES = $(wildcard *.rst)
|
||||
TEX_NAMES = $(filter-out $(REMOVE_HOWTOS),$(patsubst %.tex,%,$(TEX_SOURCES)))
|
||||
|
||||
PAPER_PATHS=$(addprefix $(PAPERDIR)/,$(TEX_NAMES))
|
||||
DVI =$(addsuffix .dvi,$(PAPER_PATHS))
|
||||
PDF =$(addsuffix .pdf,$(PAPER_PATHS))
|
||||
PS =$(addsuffix .ps,$(PAPER_PATHS))
|
||||
|
||||
ALL_HOWTO_NAMES = $(TEX_NAMES) $(patsubst %.rst,%,$(RST_SOURCES))
|
||||
HOWTO_NAMES = $(filter-out $(REMOVE_HOWTOS),$(ALL_HOWTO_NAMES))
|
||||
HTML = $(addprefix $(HTMLDIR)/,$(HOWTO_NAMES))
|
||||
|
||||
# Rules for building various formats
|
||||
|
||||
# reST to HTML
|
||||
$(HTMLDIR)/%: %.rst
	if [ ! -d $@ ] ; then mkdir $@ ; fi
	$(RST2HTML) $(RSTARGS) $< >$@/index.html
|
||||
|
||||
# LaTeX to various output formats
|
||||
$(PAPERDIR)/%.dvi : %.tex
	$(MKHOWTO) --dvi $<
	mv $*.dvi $@
|
||||
|
||||
$(PAPERDIR)/%.pdf : %.tex
	$(MKHOWTO) --pdf $<
	mv $*.pdf $@
|
||||
|
||||
$(PAPERDIR)/%.ps : %.tex
	$(MKHOWTO) --ps $<
	mv $*.ps $@
|
||||
|
||||
$(HTMLDIR)/% : %.tex
	$(MKHOWTO) --html --iconserver="." --dir $@ $<
|
||||
|
||||
# Rule that isn't actually used -- we no longer support the 'txt' target.
|
||||
$(PAPERDIR)/%.txt : %.tex
	$(MKHOWTO) --text $<
	mv $@ txt
|
||||
|
||||
default:
	@echo "'all' -- build all files"
	@echo "'dvi', 'pdf', 'ps', 'html' -- build one format"
|
||||
|
||||
all: dvi pdf ps html
|
||||
|
||||
.PHONY : dvi pdf ps html
|
||||
dvi: $(DVI)
|
||||
pdf: $(PDF)
|
||||
ps: $(PS)
|
||||
html: $(HTML)
|
||||
|
||||
clean:
	rm -f *~ *.log *.ind *.l2h *.aux *.toc *.how *.bkm
	rm -f *.dvi *.pdf *.ps
|
||||
|
||||
clobber:
	rm -rf $(HTML)
	rm -rf $(DVI) $(PDF) $(PS)
|
|
@ -1,13 +0,0 @@
|
|||
|
||||
Short-term tasks:
|
||||
Quick revision pass to make HOWTOs match the current state of Python
|
||||
doanddont regex sockets
|
||||
|
||||
Medium-term tasks:
|
||||
Revisit the regex howto.
|
||||
* Add exercises with answers for each section
|
||||
* More examples?
|
||||
|
||||
Long-term tasks:
|
||||
Integrate with other Python docs?
|
||||
|
|
@ -1,411 +0,0 @@
|
|||
|
||||
\documentclass{howto}
|
||||
|
||||
\title{Python Advocacy HOWTO}
|
||||
|
||||
\release{0.03}
|
||||
|
||||
\author{A.M. Kuchling}
|
||||
\authoraddress{\email{amk@amk.ca}}
|
||||
|
||||
\begin{document}
|
||||
\maketitle
|
||||
|
||||
\begin{abstract}
|
||||
\noindent
|
||||
It's usually difficult to get your management to accept open source
|
||||
software, and Python is no exception to this rule. This document
|
||||
discusses reasons to use Python, strategies for winning acceptance,
|
||||
facts and arguments you can use, and cases where you \emph{shouldn't}
|
||||
try to use Python.
|
||||
|
||||
This document is available from the Python HOWTO page at
|
||||
\url{http://www.python.org/doc/howto}.
|
||||
|
||||
\end{abstract}
|
||||
|
||||
\tableofcontents
|
||||
|
||||
\section{Reasons to Use Python}
|
||||
|
||||
There are several reasons to incorporate a scripting language into
|
||||
your development process, and this section will discuss them, and why
|
||||
Python has some properties that make it a particularly good choice.
|
||||
|
||||
\subsection{Programmability}
|
||||
|
||||
Programs are often organized in a modular fashion. Lower-level
|
||||
operations are grouped together, and called by higher-level functions,
|
||||
which may in turn be used as basic operations by still further upper
|
||||
levels.
|
||||
|
||||
For example, the lowest level might define a very low-level
|
||||
set of functions for accessing a hash table. The next level might use
|
||||
hash tables to store the headers of a mail message, mapping a header
|
||||
name like \samp{Date} to a value such as \samp{Tue, 13 May 1997
|
||||
20:00:54 -0400}. A yet higher level may operate on message objects,
|
||||
without knowing or caring that message headers are stored in a hash
|
||||
table, and so forth.
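
The layering described above might look like this in Python. It is a minimal sketch: \code{parse_headers} and \code{Message} are names invented for the illustration, the built-in dictionary plays the role of the hash table, and the date conversion relies on the standard \module{email.utils} module.

```python
from email.utils import parsedate_to_datetime


def parse_headers(raw_headers):
    """Middle level: store 'Name: value' lines in a hash table (a dict)."""
    headers = {}
    for line in raw_headers.splitlines():
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers


class Message:
    """Higher level: callers ask about the message, not about dict entries."""

    def __init__(self, raw_headers):
        self._headers = parse_headers(raw_headers)

    def sent_at(self):
        # Convert the Date header string into a datetime object.
        return parsedate_to_datetime(self._headers["Date"])


msg = Message("Date: Tue, 13 May 1997 20:00:54 -0400\nSubject: Hello")
print(msg.sent_at().year)  # 1997
```

Code that uses \code{Message} never sees the dictionary, so the storage could later be replaced without touching the callers.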
|
||||
|
||||
Often, the lowest levels do very simple things; they implement a data
|
||||
structure such as a binary tree or hash table, or they perform some
|
||||
simple computation, such as converting a date string to a number. The
|
||||
higher levels then contain logic connecting these primitive
|
||||
operations. Using this approach, the primitives can be seen as basic
|
||||
building blocks which are then glued together to produce the complete
|
||||
product.
|
||||
|
||||
Why is this design approach relevant to Python? Because Python is
|
||||
well suited to functioning as such a glue language. A common approach
|
||||
is to write a Python module that implements the lower level
|
||||
operations; for the sake of speed, the implementation might be in C,
|
||||
Java, or even Fortran. Once the primitives are available to Python
|
||||
programs, the logic underlying higher level operations is written in
|
||||
the form of Python code. The high-level logic is then more
|
||||
understandable, and easier to modify.
|
||||
|
||||
John Ousterhout wrote a paper that explains this idea at greater
|
||||
length, entitled ``Scripting: Higher Level Programming for the 21st
|
||||
Century''. I recommend that you read this paper; see the references
|
||||
for the URL. Ousterhout is the inventor of the Tcl language, and
|
||||
therefore argues that Tcl should be used for this purpose; he only
|
||||
briefly refers to other languages such as Python, Perl, and
|
||||
Lisp/Scheme, but in reality, Ousterhout's argument applies to
|
||||
scripting languages in general, since you could equally write
|
||||
extensions for any of the languages mentioned above.
|
||||
|
||||
\subsection{Prototyping}
|
||||
|
||||
In \emph{The Mythical Man-Month}, Frederick Brooks suggests the
|
||||
following rule when planning software projects: ``Plan to throw one
|
||||
away; you will anyway.'' Brooks is saying that the first attempt at a
|
||||
software design often turns out to be wrong; unless the problem is
|
||||
very simple or you're an extremely good designer, you'll find that new
|
||||
requirements and features become apparent once development has
|
||||
actually started. If these new requirements can't be cleanly
|
||||
incorporated into the program's structure, you're presented with two
|
||||
unpleasant choices: hammer the new features into the program somehow,
|
||||
or scrap everything and write a new version of the program, taking the
|
||||
new features into account from the beginning.
|
||||
|
||||
Python provides you with a good environment for quickly developing an
|
||||
initial prototype. That lets you get the overall program structure
|
||||
and logic right, and you can fine-tune small details in the fast
|
||||
development cycle that Python provides. Once you're satisfied with
|
||||
the GUI interface or program output, you can translate the Python code
|
||||
into C++, Fortran, Java, or some other compiled language.
|
||||
|
||||
Prototyping means you have to be careful not to use too many Python
|
||||
features that are hard to implement in your other language. Using
|
||||
\code{eval()}, or regular expressions, or the \module{pickle} module,
|
||||
means that you're going to need C or Java libraries for formula
|
||||
evaluation, regular expressions, and serialization, for example. But
|
||||
it's not hard to avoid such tricky code, and in the end the
|
||||
translation usually isn't very difficult. The resulting code can be
|
||||
rapidly debugged, because any serious logical errors will have been
|
||||
removed from the prototype, leaving only more minor slip-ups in the
|
||||
translation to track down.
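
To make the point about hard-to-translate features concrete, here is a short sketch that uses all three features mentioned above. The sample formula and strings are invented for the illustration; each line would need a supporting library in a C or Java translation.

```python
import pickle
import re

# Formula evaluation: one built-in call in Python.
formula = "2 * (3 + 4)"
value = eval(formula)

# Regular expressions: pull two integers out of a size string.
match = re.match(r"(\d+)x(\d+)", "4000x4000")
dims = tuple(int(group) for group in match.groups())

# Serialization: round-trip a data structure through pickle.
restored = pickle.loads(pickle.dumps({"dims": dims, "value": value}))

print(value)     # 14
print(dims)      # (4000, 4000)
print(restored)  # {'dims': (4000, 4000), 'value': 14}
```

A prototype that relies heavily on lines like these is cheap to write but commits the translation to finding equivalent libraries.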
|
||||
|
||||
This strategy builds on the earlier discussion of programmability.
|
||||
Using Python as glue to connect lower-level components has obvious
|
||||
relevance for constructing prototype systems. In this way Python can
|
||||
help you with development, even if end users never come in contact
|
||||
with Python code at all. If the performance of the Python version is
|
||||
adequate and corporate politics allow it, you may not need to do a
|
||||
translation into C or Java, but it can still be faster to develop a
|
||||
prototype and then translate it, instead of attempting to produce the
|
||||
final version immediately.
|
||||
|
||||
One example of this development strategy is Microsoft Merchant Server.
|
||||
Version 1.0 was written in pure Python, by a company that subsequently
|
||||
was purchased by Microsoft. Version 2.0 began to translate the code
|
||||
into \Cpp, shipping with some \Cpp{} code and some Python code. Version
|
||||
3.0 didn't contain any Python at all; all the code had been translated
|
||||
into \Cpp. Even though the product doesn't contain a Python
|
||||
interpreter, the Python language has still served a useful purpose by
|
||||
speeding up development.
|
||||
|
||||
This is a very common use for Python. Past conference papers have
|
||||
also described this approach for developing high-level numerical
|
||||
algorithms; see David M. Beazley and Peter S. Lomdahl's paper
|
||||
``Feeding a Large-scale Physics Application to Python'' in the
|
||||
references for a good example. If an algorithm's basic operations are
|
||||
things like ``Take the inverse of this 4000x4000 matrix'', and are
|
||||
implemented in some lower-level language, then Python has almost no
|
||||
additional performance cost; the extra time required for Python to
|
||||
evaluate an expression like \code{m.invert()} is dwarfed by the cost
|
||||
of the actual computation. It's particularly good for applications
|
||||
where seemingly endless tweaking is required to get things right. GUI
|
||||
interfaces and Web sites are prime examples.
|
||||
|
||||
The Python code is also shorter and faster to write (once you're
|
||||
familiar with Python), so it's easier to throw it away if you decide
|
||||
your approach was wrong; if you'd spent two weeks working on it
|
||||
instead of just two hours, you might waste time trying to patch up
|
||||
what you've got out of a natural reluctance to admit that those two
|
||||
weeks were wasted. Truthfully, those two weeks haven't been wasted,
|
||||
since you've learnt something about the problem and the technology
|
||||
you're using to solve it, but it's human nature to view this as a
|
||||
failure of some sort.
|
||||
|
||||
\subsection{Simplicity and Ease of Understanding}
|
||||
|
||||
Python is definitely \emph{not} a toy language that's only usable for
|
||||
small tasks. The language features are general and powerful enough to
|
||||
enable it to be used for many different purposes. It's useful at the
|
||||
small end, for 10- or 20-line scripts, but it also scales up to larger
|
||||
systems that contain thousands of lines of code.
|
||||
|
||||
However, this expressiveness doesn't come at the cost of an obscure or
|
||||
tricky syntax. While Python has some dark corners that can lead to
|
||||
obscure code, there are relatively few such corners, and proper design
|
||||
can isolate their use to only a few classes or modules. It's
|
||||
certainly possible to write confusing code by using too many features
|
||||
with too little concern for clarity, but most Python code can look a
|
||||
lot like a slightly-formalized version of human-understandable
|
||||
pseudocode.
|
||||
|
||||
In \emph{The New Hacker's Dictionary}, Eric S. Raymond gives the following
|
||||
definition for ``compact'':
|
||||
|
||||
\begin{quotation}
|
||||
Compact \emph{adj.} Of a design, describes the valuable property
|
||||
that it can all be apprehended at once in one's head. This
|
||||
generally means the thing created from the design can be used
|
||||
with greater facility and fewer errors than an equivalent tool
|
||||
that is not compact. Compactness does not imply triviality or
|
||||
lack of power; for example, C is compact and FORTRAN is not,
|
||||
but C is more powerful than FORTRAN. Designs become
|
||||
non-compact through accreting features and cruft that don't
|
||||
merge cleanly into the overall design scheme (thus, some fans
|
||||
of Classic C maintain that ANSI C is no longer compact).
|
||||
\end{quotation}
|
||||
|
||||
(From \url{http://www.catb.org/~esr/jargon/html/C/compact.html})
|
||||
|
||||
In this sense of the word, Python is quite compact, because the
|
||||
language has just a few ideas, which are used in lots of places. Take
|
||||
namespaces, for example. Import a module with \code{import math}, and
|
||||
you create a new namespace called \samp{math}. Classes are also
|
||||
namespaces that share many of the properties of modules, and have a
|
||||
few of their own; for example, you can create instances of a class.
|
||||
Instances? They're yet another namespace. Namespaces are currently
|
||||
implemented as Python dictionaries, so they have the same methods as
|
||||
the standard dictionary data type: \code{.keys()} returns all the keys, and
|
||||
so forth.
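
A short sketch shows the three kinds of namespace side by side. The class \code{Point} is invented for the illustration, and the example assumes ordinary classes in a current CPython, where \code{vars()} exposes the underlying dictionary (or a read-only view of it, for classes).

```python
import math


class Point:
    dimensions = 2

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# Module namespace
print("pi" in vars(math).keys())           # True
# Class namespace
print("dimensions" in vars(Point).keys())  # True
# Instance namespace
print(sorted(vars(p).keys()))              # ['x', 'y']
```

The same \code{.keys()} call works on all three, which is the sense in which one idea is reused in several places.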
|
||||
|
||||
This simplicity arises from Python's development history. The
|
||||
language syntax derives from different sources; ABC, a relatively
|
||||
obscure teaching language, is one primary influence, and Modula-3 is
|
||||
another. (For more information about ABC and Modula-3, consult their
|
||||
respective Web sites at \url{http://www.cwi.nl/~steven/abc/} and
|
||||
\url{http://www.m3.org}.) Other features have come from C, Icon,
|
||||
Algol-68, and even Perl. Python hasn't really innovated very much,
|
||||
but instead has tried to keep the language small and easy to learn,
|
||||
building on ideas that have been tried in other languages and found
|
||||
useful.
|
||||
|
||||
Simplicity is a virtue that should not be underestimated. It lets you
|
||||
learn the language more quickly, and then rapidly write code, code
|
||||
that often works the first time you run it.
|
||||
|
||||
\subsection{Java Integration}
|
||||
|
||||
If you're working with Java, Jython
|
||||
(\url{http://www.jython.org/}) is definitely worth your
|
||||
attention. Jython is a re-implementation of Python in Java that
|
||||
compiles Python code into Java bytecodes. The resulting environment
|
||||
has very tight, almost seamless, integration with Java. It's trivial
|
||||
to access Java classes from Python, and you can write Python classes
|
||||
that subclass Java classes. Jython can be used for prototyping Java
|
||||
applications in much the same way CPython is used, and it can also be
|
||||
used for test suites for Java code, or embedded in a Java application
|
||||
to add scripting capabilities.
|
||||
|
||||
\section{Arguments and Rebuttals}
|
||||
|
||||
Let's say that you've decided upon Python as the best choice for your
|
||||
application. How can you convince your management, or your fellow
|
||||
developers, to use Python? This section lists some common arguments
|
||||
against using Python, and provides some possible rebuttals.
|
||||
|
||||
\emph{Python is freely available software that doesn't cost anything.
|
||||
How good can it be?}
|
||||
|
||||
Very good, indeed. These days Linux and Apache, two other pieces of
|
||||
open source software, are becoming more respected as alternatives to
|
||||
commercial software, but Python hasn't had all the publicity.
|
||||
|
||||
Python has been around for several years, with many users and
|
||||
developers. Accordingly, the interpreter has been used by many
|
||||
people, and has gotten most of the bugs shaken out of it. While bugs
|
||||
are still discovered at intervals, they're usually either quite
|
||||
obscure (they'd have to be, for no one to have run into them before)
|
||||
or they involve interfaces to external libraries. The internals of
|
||||
the language itself are quite stable.
|
||||
|
||||
Having the source code should be viewed as making the software
|
||||
available for peer review; people can examine the code, suggest (and
|
||||
implement) improvements, and track down bugs. To find out more about
|
||||
the idea of open source code, along with arguments and case studies
|
||||
supporting it, go to \url{http://www.opensource.org}.
|
||||
|
||||
\emph{Who's going to support it?}
|
||||
|
||||
Python has a sizable community of developers, and the number is still
|
||||
growing. The Internet community surrounding the language is an active
|
||||
one, and is worth being considered another one of Python's advantages.
|
||||
Most questions posted to the comp.lang.python newsgroup are quickly
|
||||
answered by someone.
|
||||
|
||||
Should you need to dig into the source code, you'll find it's clear
|
||||
and well-organized, so it's not very difficult to write extensions and
|
||||
track down bugs yourself. If you'd prefer to pay for support, there
|
||||
are companies and individuals who offer commercial support for Python.
|
||||
|
||||
\emph{Who uses Python for serious work?}
|
||||
|
||||
Lots of people; one interesting thing about Python is the surprising
|
||||
diversity of applications that it's been used for. People are using
|
||||
Python to:
|
||||
|
||||
\begin{itemize}
|
||||
\item Run Web sites
|
||||
\item Write GUI interfaces
|
||||
\item Control
|
||||
number-crunching code on supercomputers
|
||||
\item Make a commercial application scriptable by embedding the Python
|
||||
interpreter inside it
|
||||
\item Process large XML data sets
|
||||
\item Build test suites for C or Java code
|
||||
\end{itemize}
|
||||
|
||||
Whatever your application domain is, there's probably someone who's
|
||||
used Python for something similar. Yet, despite being usable for
|
||||
such high-end applications, Python's still simple enough to use for
|
||||
little jobs.
|
||||
|
||||
See \url{http://wiki.python.org/moin/OrganizationsUsingPython} for a list of some of the
|
||||
organizations that use Python.
|
||||
|
||||
\emph{What are the restrictions on Python's use?}
|
||||
|
||||
They're practically nonexistent. Consult the \file{Misc/COPYRIGHT}
|
||||
file in the source distribution, or
|
||||
\url{http://www.python.org/doc/Copyright.html} for the full language,
|
||||
but it boils down to three conditions.
|
||||
|
||||
\begin{itemize}
|
||||
|
||||
\item You have to leave the copyright notice on the software; if you
|
||||
don't include the source code in a product, you have to put the
|
||||
copyright notice in the supporting documentation.
|
||||
|
||||
\item Don't claim that the institutions that have developed Python
|
||||
endorse your product in any way.
|
||||
|
||||
\item If something goes wrong, you can't sue for damages. Practically
|
||||
all software licences contain this condition.
|
||||
|
||||
\end{itemize}
|
||||
|
||||
Notice that you don't have to provide source code for anything that
|
||||
contains Python or is built with it. Also, the Python interpreter and
|
||||
accompanying documentation can be modified and redistributed in any
|
||||
way you like, and you don't have to pay anyone any licensing fees at
|
||||
all.
|
||||
|
||||
\emph{Why should we use an obscure language like Python instead of
|
||||
well-known language X?}
|
||||
|
||||
I hope this HOWTO, and the documents listed in the final section, will
|
||||
help convince you that Python isn't obscure, and has a healthily
|
||||
growing user base. One word of advice: always present Python's
|
||||
positive advantages, instead of concentrating on language X's
|
||||
failings. People want to know why a solution is good, rather than why
|
||||
all the other solutions are bad. So instead of attacking a competing
|
||||
solution on various grounds, simply show how Python's virtues can
|
||||
help.
|
||||
|
||||
|
||||
\section{Useful Resources}
|
||||
|
||||
\begin{definitions}
|
||||
|
||||
|
||||
\term{\url{http://www.pythonology.com/success}}
|
||||
|
||||
The Python Success Stories are a collection of stories from successful
|
||||
users of Python, with the emphasis on business and corporate users.
|
||||
|
||||
%\term{\url{http://www.fsbassociates.com/books/pythonchpt1.htm}}
|
||||
|
||||
%The first chapter of \emph{Internet Programming with Python} also
|
||||
%examines some of the reasons for using Python. The book is well worth
|
||||
%buying, but the publishers have made the first chapter available on
|
||||
%the Web.
|
||||
|
||||
\term{\url{http://home.pacbell.net/ouster/scripting.html}}
|
||||
|
||||
John Ousterhout's white paper on scripting is a good argument for the
|
||||
utility of scripting languages, though naturally enough, he emphasizes
|
||||
Tcl, the language he developed. Most of the arguments would apply to
|
||||
any scripting language.
|
||||
|
||||
\term{\url{http://www.python.org/workshops/1997-10/proceedings/beazley.html}}
|
||||
|
||||
The authors, David M. Beazley and Peter S. Lomdahl,
|
||||
describe their use of Python at Los Alamos National Laboratory.
|
||||
It's another good example of how Python can help get real work done.
|
||||
This quotation from the paper has been echoed by many people:
|
||||
|
||||
\begin{quotation}
|
||||
Originally developed as a large monolithic application for
|
||||
massively parallel processing systems, we have used Python to
|
||||
transform our application into a flexible, highly modular, and
|
||||
extremely powerful system for performing simulation, data
|
||||
analysis, and visualization. In addition, we describe how Python
|
||||
has solved a number of important problems related to the
|
||||
development, debugging, deployment, and maintenance of scientific
|
||||
software.
|
||||
\end{quotation}
|
||||
|
||||
\term{\url{http://pythonjournal.cognizor.com/pyj1/Everitt-Feit_interview98-V1.html}}
|
||||
|
||||
This interview with Andy Feit, discussing Infoseek's use of Python, can be
|
||||
used to show that choosing Python didn't introduce any difficulties
|
||||
into a company's development process, and provided some substantial benefits.
|
||||
|
||||
%\term{\url{http://www.python.org/psa/Commercial.html}}
|
||||
|
||||
%Robin Friedrich wrote this document on how to support Python's use in
|
||||
%commercial projects.
|
||||
|
||||
\term{\url{http://www.python.org/workshops/1997-10/proceedings/stein.ps}}
|
||||
|
||||
For the 6th Python conference, Greg Stein presented a paper that
|
||||
traced Python's adoption and usage at a startup called eShop, and
|
||||
later at Microsoft.
|
||||
|
||||
\term{\url{http://www.opensource.org}}
|
||||
|
||||
Management may be doubtful of the reliability and usefulness of
|
||||
software that wasn't written commercially. This site presents
|
||||
arguments that show how open source software can have considerable
|
||||
advantages over closed-source software.
|
||||
|
||||
\term{\url{http://sunsite.unc.edu/LDP/HOWTO/mini/Advocacy.html}}
|
||||
|
||||
The Linux Advocacy mini-HOWTO was the inspiration for this document,
|
||||
and is also well worth reading for general suggestions on winning
|
||||
acceptance for a new technology, such as Linux or Python. In general,
|
||||
you won't make much progress by simply attacking existing systems and
|
||||
complaining about their inadequacies; this often ends up looking like
|
||||
unfocused whining. It's much better to point out some of the many
|
||||
areas where Python is an improvement over other systems.
|
||||
|
||||
\end{definitions}
|
||||
|
||||
\end{document}
|
||||
|
||||
|
|
@ -1,486 +0,0 @@
|
|||
\documentclass{howto}
|
||||
|
||||
\title{Curses Programming with Python}
|
||||
|
||||
\release{2.02}
|
||||
|
||||
\author{A.M. Kuchling, Eric S. Raymond}
|
||||
\authoraddress{\email{amk@amk.ca}, \email{esr@thyrsus.com}}
|
||||
|
||||
\begin{document}
|
||||
\maketitle
|
||||
|
||||
\begin{abstract}
|
||||
\noindent
|
||||
This document describes how to write text-mode programs with Python 2.x,
|
||||
using the \module{curses} extension module to control the display.
|
||||
|
||||
This document is available from the Python HOWTO page at
|
||||
\url{http://www.python.org/doc/howto}.
|
||||
\end{abstract}
|
||||
|
||||
\tableofcontents
|
||||
|
||||
\section{What is curses?}
|
||||
|
||||
The curses library supplies a terminal-independent screen-painting and
|
||||
keyboard-handling facility for text-based terminals; such terminals
|
||||
include VT100s, the Linux console, and the simulated terminal provided
|
||||
by X11 programs such as xterm and rxvt. Display terminals support
|
||||
various control codes to perform common operations such as moving the
|
||||
cursor, scrolling the screen, and erasing areas. Different terminals
|
||||
use widely differing codes, and often have their own minor quirks.
|
||||
|
||||
In a world of X displays, one might ask ``why bother''? It's true
|
||||
that character-cell display terminals are an obsolete technology, but
|
||||
there are niches in which being able to do fancy things with them is
|
||||
still valuable. One is on small-footprint or embedded Unixes that
|
||||
don't carry an X server. Another is for tools like OS installers
|
||||
and kernel configurators that may have to run before X is available.
|
||||
|
||||
The curses library hides all the details of different terminals, and
|
||||
provides the programmer with an abstraction of a display, containing
|
||||
multiple non-overlapping windows. The contents of a window can be
|
||||
changed in various ways--adding text, erasing it, changing its
|
||||
appearance--and the curses library will automagically figure out what
|
||||
control codes need to be sent to the terminal to produce the right
|
||||
output.
|
||||
|
||||
The curses library was originally written for BSD Unix; the later System V
|
||||
versions of Unix from AT\&T added many enhancements and new functions.
|
||||
BSD curses is no longer maintained, having been replaced by ncurses,
|
||||
which is an open-source implementation of the AT\&T interface. If you're
|
||||
using an open-source Unix such as Linux or FreeBSD, your system almost
|
||||
certainly uses ncurses. Since most current commercial Unix versions
|
||||
are based on System V code, all the functions described here will
|
||||
probably be available. The older versions of curses carried by some
|
||||
proprietary Unixes may not support everything, though.
|
||||
|
||||
No one has made a Windows port of the curses module. On a Windows
|
||||
platform, try the Console module written by Fredrik Lundh. The
|
||||
Console module provides cursor-addressable text output, plus full
|
||||
support for mouse and keyboard input, and is available from
|
||||
\url{http://effbot.org/efflib/console}.
|
||||
|
||||
\subsection{The Python curses module}
|
||||
|
||||
The Python module is a fairly simple wrapper over the C functions
|
||||
provided by curses; if you're already familiar with curses programming
|
||||
in C, it's really easy to transfer that knowledge to Python. The
|
||||
biggest difference is that the Python interface makes things simpler,
|
||||
by merging different C functions such as \function{addstr},
|
||||
\function{mvaddstr}, \function{mvwaddstr}, into a single
|
||||
\method{addstr()} method. You'll see this covered in more detail
|
||||
later.
|
||||
|
||||
This HOWTO is simply an introduction to writing text-mode programs
|
||||
with curses and Python. It doesn't attempt to be a complete guide to
|
||||
the curses API; for that, see the Python library guide's section on
|
||||
ncurses, and the C manual pages for ncurses. It will, however, give
|
||||
you the basic ideas.
|
||||
|
||||
\section{Starting and ending a curses application}
|
||||
|
||||
Before doing anything, curses must be initialized. This is done by
|
||||
calling the \function{initscr()} function, which will determine the
|
||||
terminal type, send any required setup codes to the terminal, and
|
||||
create various internal data structures. If successful,
|
||||
\function{initscr()} returns a window object representing the entire
|
||||
screen; this is usually called \code{stdscr}, after the name of the
|
||||
corresponding C
|
||||
variable.
|
||||
|
||||
\begin{verbatim}
|
||||
import curses
|
||||
stdscr = curses.initscr()
|
||||
\end{verbatim}
|
||||
|
||||
Usually curses applications turn off automatic echoing of keys to the
|
||||
screen, in order to be able to read keys and only display them under
|
||||
certain circumstances. This requires calling the \function{noecho()}
|
||||
function.
|
||||
|
||||
\begin{verbatim}
|
||||
curses.noecho()
|
||||
\end{verbatim}
|
||||
|
||||
Applications will also commonly need to react to keys instantly,
|
||||
without requiring the Enter key to be pressed; this is called cbreak
|
||||
mode, as opposed to the usual buffered input mode.
|
||||
|
||||
\begin{verbatim}
|
||||
curses.cbreak()
|
||||
\end{verbatim}
|
||||
|
||||
Terminals usually return special keys, such as the cursor keys or
|
||||
navigation keys such as Page Up and Home, as a multibyte escape
|
||||
sequence. While you could write your application to expect such
|
||||
sequences and process them accordingly, curses can do it for you,
|
||||
returning a special value such as \constant{curses.KEY_LEFT}. To get
|
||||
curses to do the job, you'll have to enable keypad mode.
|
||||
|
||||
\begin{verbatim}
|
||||
stdscr.keypad(1)
|
||||
\end{verbatim}
|
||||
|
||||
Terminating a curses application is much easier than starting one.
|
||||
You'll need to call
|
||||
|
||||
\begin{verbatim}
|
||||
curses.nocbreak(); stdscr.keypad(0); curses.echo()
|
||||
\end{verbatim}
|
||||
|
||||
to reverse the curses-friendly terminal settings. Then call the
|
||||
\function{endwin()} function to restore the terminal to its original
|
||||
operating mode.
|
||||
|
||||
\begin{verbatim}
|
||||
curses.endwin()
|
||||
\end{verbatim}
|
||||
|
||||
A common problem when debugging a curses application is to get your
|
||||
terminal messed up when the application dies without restoring the
|
||||
terminal to its previous state. In Python this commonly happens when
|
||||
your code is buggy and raises an uncaught exception. Keys are no
|
||||
longer echoed to the screen when you type them, for example, which
|
||||
makes using the shell difficult.
|
||||
|
||||
In Python you can avoid these complications and make debugging much
|
||||
easier by importing the module \module{curses.wrapper}. It supplies a
|
||||
\function{wrapper()} function that takes a callable. It does the
|
||||
initializations described above, and also initializes colors if color
|
||||
support is present. It then runs your provided callable and finally
|
||||
deinitializes appropriately. The callable is called inside a try-except
|
||||
clause which catches exceptions, performs curses deinitialization, and
|
||||
then passes the exception upwards. Thus, your terminal won't be left
|
||||
in a funny state on exception.
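As a rough sketch of what such a wrapper does, the helper below performs
the setup steps by hand and restores the terminal in a \keyword{finally}
clause. The name \function{run_curses()} is hypothetical and used only
for illustration; in real programs the library's \function{wrapper()}
function is normally the better choice, since it also initializes colors.

\begin{verbatim}
import curses

def run_curses(func):
    """Call func(stdscr) and always restore the terminal afterwards."""
    stdscr = curses.initscr()
    try:
        curses.noecho()      # don't echo typed keys
        curses.cbreak()      # react to keys without waiting for Enter
        stdscr.keypad(1)     # translate special keys to KEY_* values
        return func(stdscr)
    finally:
        # Runs whether func() returned normally or raised an exception.
        stdscr.keypad(0)
        curses.nocbreak()
        curses.echo()
        curses.endwin()
\end{verbatim}

Because the cleanup sits in the \keyword{finally} clause, an exception
raised by the callable still propagates to the caller, but only after the
terminal has been returned to its normal mode.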
|
||||
|
||||
\section{Windows and Pads}
|
||||
|
||||
Windows are the basic abstraction in curses. A window object
|
||||
represents a rectangular area of the screen, and supports various
|
||||
methods to display text, erase it, allow the user to input strings,
|
||||
and so forth.
|
||||
|
||||
The \code{stdscr} object returned by the \function{initscr()} function
|
||||
is a window object that covers the entire screen. Many programs may
|
||||
need only this single window, but you might wish to divide the screen
|
||||
into smaller windows, in order to redraw or clear them separately.
|
||||
The \function{newwin()} function creates a new window of a given size,
|
||||
returning the new window object.
|
||||
|
||||
\begin{verbatim}
|
||||
begin_x = 20 ; begin_y = 7
|
||||
height = 5 ; width = 40
|
||||
win = curses.newwin(height, width, begin_y, begin_x)
|
||||
\end{verbatim}
|
||||
|
||||
A word about the coordinate system used in curses: coordinates are
|
||||
always passed in the order \emph{y,x}, and the top-left corner of a
|
||||
window is coordinate (0,0). This breaks a common convention for
|
||||
handling coordinates, where the \emph{x} coordinate usually comes
|
||||
first. This is an unfortunate difference from most other computer
|
||||
applications, but it's been part of curses since it was first written,
|
||||
and it's too late to change things now.
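Since the \emph{y,x} order is easy to get wrong, it can help to keep the
arithmetic in one small function. The following is a minimal sketch; the
helper name \function{centered_origin()} is an assumption made for this
example and is not part of the curses module.

\begin{verbatim}
def centered_origin(screen_height, screen_width, height, width):
    """Return (begin_y, begin_x) that centers a height x width window."""
    begin_y = max((screen_height - height) // 2, 0)
    begin_x = max((screen_width - width) // 2, 0)
    # Rows (y) come first, columns (x) second, as curses expects.
    return begin_y, begin_x
\end{verbatim}

The result can be passed straight to \function{newwin()}, for example
\code{curses.newwin(5, 40, *centered_origin(max_y, max_x, 5, 40))}, where
\code{max_y, max_x = stdscr.getmaxyx()}.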
|
||||
|
||||
When you call a method to display or erase text, the effect doesn't
|
||||
immediately show up on the display. This is because curses was
|
||||
originally written with slow 300-baud terminal connections in mind;
|
||||
with these terminals, minimizing the time required to redraw the
|
||||
screen is very important. This lets curses accumulate changes to the
|
||||
screen, and display them in the most efficient manner. For example,
|
||||
if your program displays some characters in a window, and then clears
|
||||
the window, there's no need to send the original characters because
|
||||
they'd never be visible.
|
||||
|
||||
Accordingly, curses requires that you explicitly tell it to redraw
|
||||
windows, using the \function{refresh()} method of window objects. In
|
||||
practice, this doesn't really complicate programming with curses much.
|
||||
Most programs go into a flurry of activity, and then pause waiting for
|
||||
a keypress or some other action on the part of the user. All you have
|
||||
to do is to be sure that the screen has been redrawn before pausing to
|
||||
wait for user input, by simply calling \code{stdscr.refresh()} or the
|
||||
\function{refresh()} method of some other relevant window.
|
||||
|
||||
A pad is a special case of a window; it can be larger than the actual
|
||||
display screen, and only a portion of it is displayed at a time.
|
||||
Creating a pad simply requires the pad's height and width, while
|
||||
refreshing a pad requires giving the coordinates of the on-screen
|
||||
area where a subsection of the pad will be displayed.
|
||||
|
||||
\begin{verbatim}
|
||||
pad = curses.newpad(100, 100)
|
||||
# These loops fill the pad with letters; this is
|
||||
# explained in the next section
|
||||
for y in range(0, 100):
|
||||
    for x in range(0, 100):
|
||||
        try: pad.addch(y,x, ord('a') + (x*x+y*y) % 26 )
|
||||
        except curses.error: pass
|
||||
|
||||
# Displays a section of the pad in the middle of the screen
|
||||
pad.refresh( 0,0, 5,5, 20,75)
|
||||
\end{verbatim}
|
||||
|
||||
The \function{refresh()} call displays a section of the pad in the
|
||||
rectangle extending from coordinate (5,5) to coordinate (20,75) on the
|
||||
screen; the upper left corner of the displayed section is coordinate
|
||||
(0,0) on the pad. Beyond that difference, pads are exactly like
|
||||
ordinary windows and support the same methods.
|
||||
|
||||
If you have multiple windows and pads on screen there is a more
|
||||
efficient way to go, which will prevent annoying screen flicker at
|
||||
refresh time. Use the \method{noutrefresh()} method
|
||||
of each window to update the data structure
|
||||
representing the desired state of the screen; then change the physical
|
||||
screen to match the desired state in one go with the function
|
||||
\function{doupdate()}. The normal \method{refresh()} method calls
|
||||
\function{doupdate()} as its last act.
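A minimal sketch of that pattern for ordinary windows follows. The helper
name \function{refresh_all()} is hypothetical; pads are left out because
their \method{noutrefresh()} method needs the same coordinate arguments as
their \method{refresh()} method.

\begin{verbatim}
import curses

def refresh_all(windows):
    """Update several windows with a single physical screen update."""
    for win in windows:
        # Only updates curses' internal picture of the screen.
        win.noutrefresh()
    # Sends the accumulated changes to the terminal in one go.
    curses.doupdate()
\end{verbatim}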
|
||||
|
||||
\section{Displaying Text}
|
||||
|
||||
{}From a C programmer's point of view, curses may sometimes look like
|
||||
a twisty maze of functions, all subtly different. For example,
|
||||
\function{addstr()} displays a string at the current cursor location
|
||||
in the \code{stdscr} window, while \function{mvaddstr()} moves to a
|
||||
given y,x coordinate first before displaying the string.
|
||||
\function{waddstr()} is just like \function{addstr()}, but allows
|
||||
specifying a window to use, instead of using \code{stdscr} by default.
|
||||
\function{mvwaddstr()} follows similarly.
|
||||
|
||||
Fortunately the Python interface hides all these details;
|
||||
\code{stdscr} is a window object like any other, and methods like
|
||||
\function{addstr()} accept multiple argument forms. Usually there are
|
||||
four different forms.
|
||||
|
||||
\begin{tableii}{|c|l|}{textrm}{Form}{Description}
|
||||
\lineii{\var{str} or \var{ch}}{Display the string \var{str} or
|
||||
character \var{ch} at the current position}
|
||||
\lineii{\var{str} or \var{ch}, \var{attr}}{Display the string \var{str} or
|
||||
character \var{ch}, using attribute \var{attr} at the current position}
|
||||
\lineii{\var{y}, \var{x}, \var{str} or \var{ch}}
|
||||
{Move to position \var{y,x} within the window, and display \var{str}
|
||||
or \var{ch}}
|
||||
\lineii{\var{y}, \var{x}, \var{str} or \var{ch}, \var{attr}}
|
||||
{Move to position \var{y,x} within the window, and display \var{str}
|
||||
or \var{ch}, using attribute \var{attr}}
|
||||
\end{tableii}
|
||||
|
||||
Attributes allow displaying text in highlighted forms, such as in
|
||||
boldface, underline, reverse video, or in color. They'll be explained
|
||||
in more detail in the next subsection.
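The four argument forms can be seen side by side in the short sketch
below. The function name \function{show_forms()} and the strings are made
up for this example; \var{win} is assumed to be any window object, such as
\code{stdscr}.

\begin{verbatim}
import curses

def show_forms(win):
    """Exercise the four argument forms of addstr() on one window."""
    win.addstr("at the cursor")                      # str
    win.addstr(" in bold", curses.A_BOLD)            # str, attr
    win.addstr(2, 0, "at row 2, column 0")           # y, x, str
    win.addstr(3, 0, "underlined", curses.A_UNDERLINE)  # y, x, str, attr
    win.refresh()
\end{verbatim}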
|
||||
|
||||
The \function{addstr()} function takes a Python string as the value to
|
||||
be displayed, while the \function{addch()} functions take a character,
|
||||
which can be either a Python string of length 1 or an integer. If
|
||||
it's a string, you're limited to displaying characters between 0 and
|
||||
255. SVr4 curses provides constants for extension characters; these
|
||||
constants are integers greater than 255. For example,
|
||||
\constant{ACS_PLMINUS} is a +/- symbol, and \constant{ACS_ULCORNER} is
|
||||
the upper left corner of a box (handy for drawing borders).
|
||||
|
||||
Windows remember where the cursor was left after the last operation,
|
||||
so if you leave out the \var{y,x} coordinates, the string or character
|
||||
will be displayed wherever the last operation left off. You can also
|
||||
move the cursor with the \function{move(\var{y,x})} method. Because
|
||||
some terminals always display a flashing cursor, you may want to
|
||||
ensure that the cursor is positioned in some location where it won't
|
||||
be distracting; it can be confusing to have the cursor blinking at
|
||||
some apparently random location.
|
||||
|
||||
If your application doesn't need a blinking cursor at all, you can
|
||||
call \function{curs_set(0)} to make it invisible. Equivalently, and
|
||||
for compatibility with older curses versions, there's a
|
||||
\function{leaveok(\var{bool})} function. When \var{bool} is true, the
|
||||
curses library will attempt to suppress the flashing cursor, and you
|
||||
won't need to worry about leaving it in odd locations.
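Not every terminal can hide its cursor, in which case
\function{curs_set()} raises \exception{curses.error}. One possible way to
combine the two approaches is sketched below; the helper name
\function{hide_cursor()} is an assumption made for this example.

\begin{verbatim}
import curses

def hide_cursor(win):
    """Hide the cursor, or at least stop caring where it is left."""
    try:
        curses.curs_set(0)
    except curses.error:
        # The terminal can't make the cursor invisible; fall back to
        # the older leaveok() mechanism on this window.
        win.leaveok(1)
\end{verbatim}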
|
||||
|
||||
\subsection{Attributes and Color}
|
||||
|
||||
Characters can be displayed in different ways. Status lines in a
|
||||
text-based application are commonly shown in reverse video; a text
|
||||
viewer may need to highlight certain words. curses supports this by
|
||||
allowing you to specify an attribute for each cell on the screen.
|
||||
|
||||
An attribute is an integer, each bit representing a different
|
||||
attribute. You can try to display text with multiple attribute bits
|
||||
set, but curses doesn't guarantee that all the possible combinations
|
||||
are available, or that they're all visually distinct. That depends on
|
||||
the ability of the terminal being used, so it's safest to stick to the
|
||||
most commonly available attributes, listed here.
|
||||
|
||||
\begin{tableii}{|c|l|}{constant}{Attribute}{Description}
|
||||
\lineii{A_BLINK}{Blinking text}
|
||||
\lineii{A_BOLD}{Extra bright or bold text}
|
||||
\lineii{A_DIM}{Half bright text}
|
||||
\lineii{A_REVERSE}{Reverse-video text}
|
||||
\lineii{A_STANDOUT}{The best highlighting mode available}
|
||||
\lineii{A_UNDERLINE}{Underlined text}
|
||||
\end{tableii}
|
||||
|
||||
So, to display a reverse-video status line on the top line of the
|
||||
screen,
|
||||
you could code:
|
||||
|
||||
\begin{verbatim}
|
||||
stdscr.addstr(0, 0, "Current mode: Typing mode",
|
||||
curses.A_REVERSE)
|
||||
stdscr.refresh()
|
||||
\end{verbatim}
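Because each attribute is a separate bit, attributes are combined with the
bitwise OR operator and tested with bitwise AND. The sketch below shows
both; the helper \function{has_attribute()} and the variable name are
illustrative only, and whether a given combination is actually rendered
still depends on the terminal.

\begin{verbatim}
import curses

def has_attribute(attr, flag):
    """Return True if every bit of flag is set in attr."""
    return attr & flag == flag

# Reverse video and bold at the same time.
status_attr = curses.A_REVERSE | curses.A_BOLD
\end{verbatim}

The combined value is passed wherever a single attribute is accepted, for
example as the last argument of \method{addstr()}.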
|
||||
|
||||
The curses library also supports color on those terminals that
|
||||
provide it. The most common such terminal is probably the Linux
|
||||
console, followed by color xterms.
|
||||
|
||||
To use color, you must call the \function{start_color()} function soon
|
||||
after calling \function{initscr()}, to initialize the default color
|
||||
set (the \function{curses.wrapper.wrapper()} function does this
|
||||
automatically). Once that's done, the \function{has_colors()}
|
||||
function returns TRUE if the terminal in use can actually display
|
||||
color. (Note: curses uses the American spelling 'color', instead of
|
||||
the Canadian/British spelling 'colour'. If you're used to the British
|
||||
spelling, you'll have to resign yourself to misspelling it for the
|
||||
sake of these functions.)
|
||||
|
||||
The curses library maintains a finite number of color pairs,
|
||||
containing a foreground (or text) color and a background color. You
|
||||
can get the attribute value corresponding to a color pair with the
|
||||
\function{color_pair()} function; this can be bitwise-OR'ed with other
|
||||
attributes such as \constant{A_REVERSE}, but again, such combinations
|
||||
are not guaranteed to work on all terminals.
|
||||
|
||||
An example, which displays a line of text using color pair 1:
|
||||
|
||||
\begin{verbatim}
|
||||
stdscr.addstr( "Pretty text", curses.color_pair(1) )
|
||||
stdscr.refresh()
|
||||
\end{verbatim}
|
||||
|
||||
As I said before, a color pair consists of a foreground and
|
||||
background color. \function{start_color()} initializes 8 basic
|
||||
colors when it activates color mode. They are: 0:black, 1:red,
|
||||
2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and 7:white. The curses
|
||||
module defines named constants for each of these colors:
|
||||
\constant{curses.COLOR_BLACK}, \constant{curses.COLOR_RED}, and so
|
||||
forth.
|
||||
|
||||
The \function{init_pair(\var{n, f, b})} function changes the
|
||||
definition of color pair \var{n}, to foreground color \var{f} and
|
||||
background color \var{b}. Color pair 0 is hard-wired to white on black,
|
||||
and cannot be changed.
|
||||
|
||||
Let's put all this together. To change color pair 1 to red
|
||||
text on a white background, you would call:
|
||||
|
||||
\begin{verbatim}
|
||||
curses.init_pair(1, curses.COLOR_RED, curses.COLOR_WHITE)
|
||||
\end{verbatim}
|
||||
|
||||
When you change a color pair, any text already displayed using that
|
||||
color pair will change to the new colors. You can also display new
|
||||
text in this color with:
|
||||
|
||||
\begin{verbatim}
|
||||
stdscr.addstr(0,0, "RED ALERT!", curses.color_pair(1) )
|
||||
\end{verbatim}
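On a terminal without color the calls above are of little use, so a
program may want a fallback. The following sketch assumes that
\function{start_color()} has already been called; the function name
\function{show_alert()} and the choice of reverse video as the fallback
are assumptions made for this example.

\begin{verbatim}
import curses

def show_alert(stdscr, message="RED ALERT!"):
    """Show message in red on white, or in reverse video without color."""
    if curses.has_colors():
        curses.init_pair(1, curses.COLOR_RED, curses.COLOR_WHITE)
        attr = curses.color_pair(1)
    else:
        attr = curses.A_REVERSE
    stdscr.addstr(0, 0, message, attr)
    stdscr.refresh()
\end{verbatim}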
|
||||
|
||||
Very fancy terminals can change the definitions of the actual colors
|
||||
to a given RGB value. This lets you change color 1, which is usually
|
||||
red, to purple or blue or any other color you like. Unfortunately,
|
||||
the Linux console doesn't support this, so I'm unable to try it out,
|
||||
and can't provide any examples. You can check if your terminal can do
|
||||
this by calling \function{can_change_color()}, which returns TRUE if
|
||||
the capability is there. If you're lucky enough to have such a
|
||||
talented terminal, consult your system's man pages for more
|
||||
information.
|
||||
|
||||
\section{User Input}
|
||||
|
||||
The curses library itself offers only very simple input mechanisms.
|
||||
Python's support adds a text-input widget that makes up for some of the
|
||||
lack.
|
||||
|
||||
The most common way to get input to a window is to use its
|
||||
\method{getch()} method. \method{getch()} pauses and waits for the
|
||||
user to hit a key, displaying it if \function{echo()} has been called
|
||||
earlier. You can optionally specify a coordinate to which the cursor
|
||||
should be moved before pausing.
|
||||
|
||||
It's possible to change this behavior with the method
|
||||
\method{nodelay()}. After \method{nodelay(1)}, \method{getch()} for
|
||||
the window becomes non-blocking and returns \code{curses.ERR} (a value
|
||||
of -1) when no input is ready. There's also a \function{halfdelay()}
|
||||
function, which can be used to (in effect) set a timer on each
|
||||
\method{getch()}; if no input becomes available within the number of
|
||||
tenths of a second specified as the argument to \function{halfdelay()},
|
||||
curses raises an exception.
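A minimal sketch of non-blocking input with \method{nodelay()} is shown
below. The helper name \function{poll_key()} and the convention of
returning \code{None} when nothing was typed are assumptions made for this
example.

\begin{verbatim}
import curses

def poll_key(win):
    """Return the next key code, or None if no key is waiting."""
    win.nodelay(1)           # make getch() return immediately
    c = win.getch()
    if c == curses.ERR:      # -1: no input was ready
        return None
    return c
\end{verbatim}

A program that animates the screen can call such a function once per frame
and carry on drawing when it returns \code{None}.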
|
||||
|
||||
The \method{getch()} method returns an integer; if it's between 0 and
|
||||
255, it represents the ASCII code of the key pressed. Values greater
|
||||
than 255 are special keys such as Page Up, Home, or the cursor keys.
|
||||
You can compare the value returned to constants such as
|
||||
\constant{curses.KEY_PPAGE}, \constant{curses.KEY_HOME}, or
|
||||
\constant{curses.KEY_LEFT}. Usually the main loop of your program
|
||||
will look something like this:
|
||||
|
||||
\begin{verbatim}
|
||||
while 1:
|
||||
    c = stdscr.getch()
|
||||
    if c == ord('p'): PrintDocument()
|
||||
    elif c == ord('q'): break # Exit the while()
|
||||
    elif c == curses.KEY_HOME: x = y = 0
|
||||
\end{verbatim}
|
||||
|
||||
The \module{curses.ascii} module supplies ASCII class membership
|
||||
functions that take either integer or 1-character-string
|
||||
arguments; these may be useful in writing more readable tests for
|
||||
your command interpreters. It also supplies conversion functions
|
||||
that take either integer or 1-character-string arguments and return
|
||||
the same type. For example, \function{curses.ascii.ctrl()} returns
|
||||
the control character corresponding to its argument.
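For instance, a command interpreter might classify the integer returned by
\method{getch()} along the following lines. This is only a sketch: the
function name \function{describe_key()}, the category names, and the
choice of Control-X as an exit key are all made up for illustration.

\begin{verbatim}
import curses.ascii

def describe_key(c):
    """Classify an integer key code as returned by getch()."""
    if c == curses.ascii.ctrl(ord('x')):    # Control-X
        return "exit"
    if curses.ascii.isdigit(c):
        return "digit"
    if curses.ascii.isprint(c):
        return "printable"
    return "other"
\end{verbatim}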
|
||||
|
||||
There's also a method to retrieve an entire string,
|
||||
\method{getstr()}. It isn't used very often, because its
|
||||
functionality is quite limited; the only editing keys available are
|
||||
the backspace key and the Enter key, which terminates the string. It
|
||||
can optionally be limited to a fixed number of characters.
|
||||
|
||||
\begin{verbatim}
|
||||
curses.echo() # Enable echoing of characters
|
||||
|
||||
# Get a 15-character string, with the cursor on the top line
|
||||
s = stdscr.getstr(0,0, 15)
|
||||
\end{verbatim}
|
||||
|
||||
The Python \module{curses.textpad} module supplies something better.
|
||||
With it, you can turn a window into a text box that supports an
|
||||
Emacs-like set of keybindings. Various methods of the \class{Textbox}
|
||||
class support editing with input validation and gathering the edit
|
||||
results either with or without trailing spaces. See the library
|
||||
documentation on \module{curses.textpad} for the details.
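As a hedged illustration, the sketch below turns a one-line window into a
simple input field. The function name \function{prompt_line()} and its
parameters are assumptions made for this example; it relies only on
\function{newwin()}, the \class{Textbox} class, and the box's
\method{edit()} method, which returns the edited text.

\begin{verbatim}
import curses
import curses.textpad

def prompt_line(stdscr, y, x, width):
    """Let the user edit one line of text at (y, x); return the result."""
    stdscr.refresh()                       # show anything drawn so far
    win = curses.newwin(1, width, y, x)    # one row tall, width columns
    box = curses.textpad.Textbox(win)
    # edit() runs until the user finishes, then returns the contents.
    return box.edit().strip()
\end{verbatim}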
|
||||
|
||||
\section{For More Information}
|
||||
|
||||
This HOWTO didn't cover some advanced topics, such as screen-scraping
|
||||
or capturing mouse events from an xterm instance. But the Python
|
||||
library page for the curses modules is now pretty complete. You
|
||||
should browse it next.
|
||||
|
||||
If you're in doubt about the detailed behavior of any of the ncurses
|
||||
entry points, consult the manual pages for your curses implementation,
|
||||
whether it's ncurses or a proprietary Unix vendor's. The manual pages
|
||||
will document any quirks, and provide complete lists of all the
|
||||
functions, attributes, and \constant{ACS_*} characters available to
|
||||
you.
|
||||
|
||||
Because the curses API is so large, some functions aren't supported in
|
||||
the Python interface, not because they're difficult to implement, but
|
||||
because no one has needed them yet. Feel free to add them and then
|
||||
submit a patch. Also, we don't yet have support for the menus or
|
||||
panels libraries associated with ncurses; feel free to add that.
|
||||
|
||||
If you write an interesting little program, feel free to contribute it
|
||||
as another demo. We can always use more of them!
|
||||
|
||||
The ncurses FAQ: \url{http://dickey.his.com/ncurses/ncurses.faq.html}
|
||||
|
||||
\end{document}
|
|
@ -1,344 +0,0 @@
|
|||
\documentclass{howto}
|
||||
|
||||
\title{Idioms and Anti-Idioms in Python}
|
||||
|
||||
\release{0.00}
|
||||
|
||||
\author{Moshe Zadka}
|
||||
\authoraddress{howto@zadka.site.co.il}
|
||||
|
||||
\begin{document}
|
||||
\maketitle
|
||||
|
||||
This document is placed in the public domain.
|
||||
|
||||
\begin{abstract}
|
||||
\noindent
|
||||
This document can be considered a companion to the tutorial. It
|
||||
shows how to use Python, and even more importantly, how {\em not}
|
||||
to use Python.
|
||||
\end{abstract}
|
||||
|
||||
\tableofcontents
|
||||
|
||||
\section{Language Constructs You Should Not Use}
|
||||
|
||||
While Python has relatively few gotchas compared to other languages, it
|
||||
still has some constructs which are only useful in corner cases, or are
|
||||
plain dangerous.
|
||||
|
||||
\subsection{from module import *}
|
||||
|
||||
\subsubsection{Inside Function Definitions}
|
||||
|
||||
\code{from module import *} is {\em invalid} inside function definitions.
|
||||
While many versions of Python do not check for the invalidity, it does not
|
||||
make it more valid, no more than having a smart lawyer makes a man innocent.
|
||||
Do not use it like that ever. Even in versions where it was accepted, it made
|
||||
the function execution slower, because the compiler could not be certain
|
||||
which names are local and which are global. In Python 2.1 this construct
|
||||
causes warnings, and sometimes even errors.
|
||||
|
||||
\subsubsection{At Module Level}
|
||||
|
||||
While it is valid to use \code{from module import *} at module level it
|
||||
is usually a bad idea. For one, this loses an important property Python
|
||||
otherwise has --- you can know where each toplevel name is defined by
|
||||
a simple "search" function in your favourite editor. You also open yourself
|
||||
to trouble in the future, if some module grows additional functions or
|
||||
classes.
|
||||
|
||||
One of the most awful questions asked on the newsgroup is why this code:
|
||||
|
||||
\begin{verbatim}
|
||||
f = open("www")
|
||||
f.read()
|
||||
\end{verbatim}
|
||||
|
||||
does not work. Of course, it works just fine (assuming you have a file
|
||||
called "www".) But it does not work if somewhere in the module, the
|
||||
statement \code{from os import *} is present. The \module{os} module
|
||||
has a function called \function{open()} which returns an integer. While
|
||||
it is very useful, shadowing builtins is one of its least useful properties.
|
||||
|
||||
Remember, you can never know for sure what names a module exports, so either
|
||||
take what you need --- \code{from module import name1, name2}, or keep them in
|
||||
the module and access on a per-need basis ---
|
||||
\code{import module;print module.name}.
|
||||
|
||||
\subsubsection{When It Is Just Fine}
|
||||
|
||||
There are situations in which \code{from module import *} is just fine:
|
||||
|
||||
\begin{itemize}
|
||||
|
||||
\item The interactive prompt. For example, \code{from math import *} makes
|
||||
Python an amazing scientific calculator.
|
||||
|
||||
\item When extending a module in C with a module in Python.
|
||||
|
||||
\item When the module advertises itself as \code{from import *} safe.
|
||||
|
||||
\end{itemize}
|
||||
|
||||
\subsection{Unadorned \keyword{exec}, \function{execfile} and friends}
|
||||
|
||||
The word ``unadorned'' refers to the use without an explicit dictionary,
|
||||
in which case those constructs evaluate code in the {\em current} environment.
|
||||
This is dangerous for the same reasons \code{from import *} is dangerous ---
|
||||
it might step over variables you are counting on and mess up things for
|
||||
the rest of your code. Simply do not do that.
|
||||
|
||||
Bad examples:
|
||||
|
||||
\begin{verbatim}
|
||||
>>> for name in sys.argv[1:]:
...     exec "%s=1" % name
>>> def func(s, **kw):
...     for var, val in kw.items():
...         exec "s.%s=val" % var  # invalid!
>>> execfile("handler.py")
>>> handle()
|
||||
\end{verbatim}
|
||||
|
||||
Good examples:
|
||||
|
||||
\begin{verbatim}
|
||||
>>> d = {}
>>> for name in sys.argv[1:]:
...     d[name] = 1
>>> def func(s, **kw):
...     for var, val in kw.items():
...         setattr(s, var, val)
>>> d = {}
>>> execfile("handler.py", d, d)
>>> handle = d['handle']
>>> handle()
|
||||
\end{verbatim}
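
For readers on Python 3, where \code{exec} is a function and \function{execfile()} no longer exists, the same idea can be sketched as follows. The string \code{source} stands in for the contents of a handler file and is purely illustrative:

```python
# Stand-in for the text of a handler file; hypothetical content.
source = "def handle():\n    return 'handled'\n"

# Execute the code in its own dictionary instead of the current namespace.
d = {}
exec(source, d)

# Only the names we ask for explicitly cross over into our own code.
handle = d["handle"]
result = handle()
```

Whatever else the executed code defines stays inside \code{d} and cannot step on the caller's variables.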
|
||||
|
||||
\subsection{from module import name1, name2}
|
||||
|
||||
This is a ``don't'' which is much weaker than the previous ``don't''s
|
||||
but is still something you should not do if you don't have good reasons
|
||||
to do that. The reason it is usually a bad idea is because you suddenly
|
||||
have an object which lives in two separate namespaces. When the binding
|
||||
in one namespace changes, the binding in the other will not, so there
|
||||
will be a discrepancy between them. This happens when, for example,
|
||||
one module is reloaded, or changes the definition of a function at runtime.
|
||||
|
||||
Bad example:
|
||||
|
||||
\begin{verbatim}
|
||||
# foo.py
a = 1

# bar.py
from foo import a
if something():
    a = 2 # danger: foo.a != a
|
||||
\end{verbatim}
|
||||
|
||||
Good example:
|
||||
|
||||
\begin{verbatim}
|
||||
# foo.py
a = 1

# bar.py
import foo
if something():
    foo.a = 2
|
||||
\end{verbatim}
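
The discrepancy between the two bindings can be observed directly. This is a small sketch in Python 3 syntax; instead of a real \file{foo.py} it builds a throwaway module in memory with \code{types.ModuleType}, which is an assumption made only to keep the example self-contained:

```python
import sys
import types

# Build a throwaway module named "foo" in memory (a stand-in for foo.py).
foo = types.ModuleType("foo")
foo.a = 1
sys.modules["foo"] = foo

from foo import a   # copies the current binding into this namespace

foo.a = 2           # rebinding the name inside the module ...
print(a, foo.a)     # ... leaves the copy made by the import unchanged
```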
|
||||
|
||||
\subsection{except:}
|
||||
|
||||
Python has the \code{except:} clause, which catches all exceptions.
|
||||
Since {\em every} error in Python raises an exception, this makes many
|
||||
programming errors look like runtime problems, and hinders
|
||||
the debugging process.
|
||||
|
||||
The following code shows a great example:
|
||||
|
||||
\begin{verbatim}
|
||||
try:
    foo = opne("file") # misspelled "open"
except:
    sys.exit("could not open file!")
|
||||
\end{verbatim}
|
||||
|
||||
The second line triggers a \exception{NameError} which is caught by the
|
||||
except clause. The program will exit, and you will have no idea that
|
||||
this has nothing to do with the readability of \code{"file"}.
|
||||
|
||||
The example above is better written as:
|
||||
|
||||
\begin{verbatim}
|
||||
try:
    foo = opne("file") # will be changed to "open" as soon as we run it
except IOError:
    sys.exit("could not open file")
|
||||
\end{verbatim}
|
||||
|
||||
There are some situations in which the \code{except:} clause is useful:
|
||||
for example, in a framework when running callbacks, it is good not to
|
||||
let any callback disturb the framework.
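
A callback runner of that kind might look like the sketch below (Python 3 syntax). The names \code{run_callbacks}, \code{good} and \code{bad} are hypothetical. Note that it catches \exception{Exception} rather than using a bare \code{except:}, so that \exception{KeyboardInterrupt} and \exception{SystemExit} still pass through, and that it records the traceback so the bug stays visible:

```python
import traceback

def run_callbacks(callbacks):
    """Call every callback; report failures instead of propagating them."""
    failures = []
    for callback in callbacks:
        try:
            callback()
        except Exception:
            # Record the traceback so the bug stays visible, then carry on.
            failures.append(traceback.format_exc())
    return failures

def good():
    pass

def bad():
    raise ValueError("broken callback")

failures = run_callbacks([good, bad, good])
```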
|
||||
|
||||
\section{Exceptions}
|
||||
|
||||
Exceptions are a useful feature of Python. You should learn to raise
|
||||
them whenever something unexpected occurs, and catch them only where
|
||||
you can do something about them.
|
||||
|
||||
The following is a very popular anti-idiom:
|
||||
|
||||
\begin{verbatim}
|
||||
def get_status(file):
    if not os.path.exists(file):
        print "file not found"
        sys.exit(1)
    return open(file).readline()
|
||||
\end{verbatim}
|
||||
|
||||
Consider the case where the file gets deleted between the time the call to
|
||||
\function{os.path.exists} is made and the time \function{open} is called.
|
||||
That means the last line will throw an \exception{IOError}. The same would
|
||||
happen if \var{file} exists but has no read permission. Since testing this
|
||||
on a normal machine on existing and non-existing files makes it seem bugless,
|
||||
that means in testing the results will seem fine, and the code will get
|
||||
shipped. Then an unhandled \exception{IOError} escapes to the user, who
|
||||
has to watch the ugly traceback.
|
||||
|
||||
Here is a better way to do it.
|
||||
|
||||
\begin{verbatim}
|
||||
def get_status(file):
    try:
        return open(file).readline()
    except (IOError, OSError):
        print "file not found"
        sys.exit(1)
|
||||
\end{verbatim}
|
||||
|
||||
In this version, {\em either} the file gets opened and the line is read
|
||||
(so it works even on flaky NFS or SMB connections), or the message
|
||||
is printed and the application aborted.
|
||||
|
||||
Still, \function{get_status} makes too many assumptions --- that it
|
||||
will only be used in a short running script, and not, say, in a long
|
||||
running server. Sure, the caller could do something like
|
||||
|
||||
\begin{verbatim}
|
||||
try:
    status = get_status(log)
except SystemExit:
    status = None
|
||||
\end{verbatim}
|
||||
|
||||
So, try to use as few \code{except} clauses in your code as possible --- those will
|
||||
usually be a catch-all in the \function{main}, or inside calls which
|
||||
should always succeed.
|
||||
|
||||
So, the best version is probably
|
||||
|
||||
\begin{verbatim}
|
||||
def get_status(file):
    return open(file).readline()
|
||||
\end{verbatim}
|
||||
|
||||
The caller can deal with the exception if it wants (for example, if it
|
||||
tries several files in a loop), or just let the exception filter upwards
|
||||
to {\em its} caller.
|
||||
|
||||
The last version is not very good either --- due to implementation details,
|
||||
the file would not be closed when an exception is raised until the handler
|
||||
finishes, and perhaps not at all in non-C implementations (e.g., Jython).
A more robust version closes the file explicitly:
|
||||
|
||||
\begin{verbatim}
|
||||
def get_status(file):
    fp = open(file)
    try:
        return fp.readline()
    finally:
        fp.close()
|
||||
\end{verbatim}
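
A caller that tries several files in a loop, as mentioned earlier, could then handle the exception itself. The sketch below uses Python 3 syntax; \code{first_status} and the temporary file names are hypothetical and exist only for the demonstration:

```python
import os
import tempfile

def get_status(file):
    fp = open(file)
    try:
        return fp.readline()
    finally:
        fp.close()

def first_status(candidates):
    """Return the status line of the first readable file, or None."""
    for name in candidates:
        try:
            return get_status(name)
        except (IOError, OSError):
            continue   # this file is unusable; try the next one
    return None

# Demonstration with one missing file and one real file.
workdir = tempfile.mkdtemp()
present = os.path.join(workdir, "status.log")
with open(present, "w") as out:
    out.write("ok\n")
missing = os.path.join(workdir, "missing.log")

status = first_status([missing, present])
```

Here \function{get_status()} stays free of policy: it neither prints nor exits, and the decision about what a missing file means is left to the code that knows the context.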
|
||||
|
||||
\section{Using the Batteries}
|
||||
|
||||
Every so often, people seem to be writing stuff in the Python library
|
||||
again, usually poorly. While the occasional module has a poor interface,
|
||||
it is usually much better to use the rich standard library and data
|
||||
types that come with Python than to invent your own.
|
||||
|
||||
A useful module very few people know about is \module{os.path}. It
|
||||
always has the correct path arithmetic for your operating system, and
|
||||
will usually be much better than whatever you come up with yourself.
|
||||
|
||||
Compare:
|
||||
|
||||
\begin{verbatim}
|
||||
# ugh!
|
||||
return dir+"/"+file
|
||||
# better
|
||||
return os.path.join(dir, file)
|
||||
\end{verbatim}
|
||||
|
||||
More useful functions in \module{os.path}: \function{basename},
|
||||
\function{dirname} and \function{splitext}.
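
A short sketch of these three functions at work, in Python 3 syntax; the path components are made up for illustration:

```python
import os.path

# Build the path with join() so the right separator is used everywhere.
path = os.path.join("logs", "2001", "server.log")

directory = os.path.dirname(path)      # everything before the last separator
filename = os.path.basename(path)      # the last path component
stem, extension = os.path.splitext(filename)   # split off the extension
```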
|
||||
|
||||
There are also many useful builtin functions people seem not to be
|
||||
aware of for some reason: \function{min()} and \function{max()} can
|
||||
find the minimum/maximum of any sequence with comparable semantics,
|
||||
for example, yet many people write their own
|
||||
\function{max()}/\function{min()}. Another highly useful function is
|
||||
\function{reduce()}. A classical use of \function{reduce()}
|
||||
is something like
|
||||
|
||||
\begin{verbatim}
|
||||
import sys, operator
|
||||
nums = map(float, sys.argv[1:])
|
||||
print reduce(operator.add, nums)/len(nums)
|
||||
\end{verbatim}
|
||||
|
||||
This cute little script prints the average of all numbers given on the
|
||||
command line. The \function{reduce()} adds up all the numbers, and
|
||||
the rest is just some pre- and postprocessing.
|
||||
|
||||
Similarly, note that \function{float()}, \function{int()} and
|
||||
\function{long()} all accept arguments of type string, and so are
|
||||
suited to parsing --- assuming you are ready to deal with the
|
||||
\exception{ValueError} they raise.
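
Dealing with the \exception{ValueError} can be as simple as the following sketch (Python 3 syntax). The helper name \code{parse_numbers} is hypothetical:

```python
def parse_numbers(words):
    """Convert each string to a float, collecting the ones that fail."""
    numbers, rejected = [], []
    for word in words:
        try:
            numbers.append(float(word))
        except ValueError:
            rejected.append(word)
    return numbers, rejected

numbers, rejected = parse_numbers(["1", "2.5", "spam", "4e1"])
average = sum(numbers) / len(numbers)
```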
|
||||
|
||||
\section{Using Backslash to Continue Statements}
|
||||
|
||||
Since Python treats a newline as a statement terminator,
|
||||
and since statements are often longer than is comfortable to put
|
||||
in one line, many people do:
|
||||
|
||||
\begin{verbatim}
|
||||
if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
   calculate_number(10, 20) != forbulate(500, 360):
    pass
|
||||
\end{verbatim}
|
||||
|
||||
You should realize that this is dangerous: a stray space after the
|
||||
\code{\e} would make this line wrong, and stray spaces are notoriously
|
||||
hard to see in editors. In this case, at least it would be a syntax
|
||||
error, but if the code was:
|
||||
|
||||
\begin{verbatim}
|
||||
value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
|
||||
+ calculate_number(10, 20)*forbulate(500, 360)
|
||||
\end{verbatim}
|
||||
|
||||
then it would just be subtly wrong.
|
||||
|
||||
It is usually much better to use the implicit continuation inside parentheses.
|
||||
|
||||
This version is bulletproof:
|
||||
|
||||
\begin{verbatim}
|
||||
value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
|
||||
+ calculate_number(10, 20)*forbulate(500, 360))
|
||||
\end{verbatim}
|
||||
|
||||
\end{document}
|
1477
Doc/howto/regex.tex
|
@ -1,465 +0,0 @@
|
|||
\documentclass{howto}
|
||||
|
||||
\title{Socket Programming HOWTO}
|
||||
|
||||
\release{0.00}
|
||||
|
||||
\author{Gordon McMillan}
|
||||
\authoraddress{\email{gmcm@hypernet.com}}
|
||||
|
||||
\begin{document}
|
||||
\maketitle
|
||||
|
||||
\begin{abstract}
|
||||
\noindent
|
||||
Sockets are used nearly everywhere, but are one of the most severely
|
||||
misunderstood technologies around. This is a 10,000 foot overview of
|
||||
sockets. It's not really a tutorial - you'll still have work to do in
|
||||
getting things operational. It doesn't cover the fine points (and there
|
||||
are a lot of them), but I hope it will give you enough background to
|
||||
begin using them decently.
|
||||
|
||||
This document is available from the Python HOWTO page at
|
||||
\url{http://www.python.org/doc/howto}.
|
||||
|
||||
\end{abstract}
|
||||
|
||||
\tableofcontents
|
||||
|
||||
\section{Sockets}
|
||||
|
||||
Sockets are used nearly everywhere, but are one of the most severely
|
||||
misunderstood technologies around. This is a 10,000 foot overview of
|
||||
sockets. It's not really a tutorial - you'll still have work to do in
|
||||
getting things working. It doesn't cover the fine points (and there
|
||||
are a lot of them), but I hope it will give you enough background to
|
||||
begin using them decently.
|
||||
|
||||
I'm only going to talk about INET sockets, but they account for at
|
||||
least 99\% of the sockets in use. And I'll only talk about STREAM
|
||||
sockets - unless you really know what you're doing (in which case this
|
||||
HOWTO isn't for you!), you'll get better behavior and performance from
|
||||
a STREAM socket than anything else. I will try to clear up the mystery
|
||||
of what a socket is, as well as some hints on how to work with
|
||||
blocking and non-blocking sockets. But I'll start by talking about
|
||||
blocking sockets. You'll need to know how they work before dealing
|
||||
with non-blocking sockets.
|
||||
|
||||
Part of the trouble with understanding these things is that "socket"
|
||||
can mean a number of subtly different things, depending on context. So
|
||||
first, let's make a distinction between a "client" socket - an
|
||||
endpoint of a conversation, and a "server" socket, which is more like
|
||||
a switchboard operator. The client application (your browser, for
|
||||
example) uses "client" sockets exclusively; the web server it's
|
||||
talking to uses both "server" sockets and "client" sockets.
|
||||
|
||||
|
||||
\subsection{History}
|
||||
|
||||
Of the various forms of IPC (\emph{Inter Process Communication}),
|
||||
sockets are by far the most popular. On any given platform, there are
|
||||
likely to be other forms of IPC that are faster, but for
|
||||
cross-platform communication, sockets are about the only game in town.
|
||||
|
||||
They were invented in Berkeley as part of the BSD flavor of Unix. They
|
||||
spread like wildfire with the Internet. With good reason --- the
|
||||
combination of sockets with INET makes talking to arbitrary machines
|
||||
around the world unbelievably easy (at least compared to other
|
||||
schemes).
|
||||
|
||||
\section{Creating a Socket}
|
||||
|
||||
Roughly speaking, when you clicked on the link that brought you to
|
||||
this page, your browser did something like the following:
|
||||
|
||||
\begin{verbatim}
|
||||
#create an INET, STREAMing socket
|
||||
s = socket.socket(
|
||||
socket.AF_INET, socket.SOCK_STREAM)
|
||||
#now connect to the web server on port 80
|
||||
# - the normal http port
|
||||
s.connect(("www.mcmillan-inc.com", 80))
|
||||
\end{verbatim}
|
||||
|
||||
When the \code{connect} completes, the socket \code{s} can
|
||||
now be used to send in a request for the text of this page. The same
|
||||
socket will read the reply, and then be destroyed. That's right -
|
||||
destroyed. Client sockets are normally only used for one exchange (or
|
||||
a small set of sequential exchanges).
|
||||
|
||||
What happens in the web server is a bit more complex. First, the web
|
||||
server creates a "server socket".
|
||||
|
||||
\begin{verbatim}
|
||||
#create an INET, STREAMing socket
|
||||
serversocket = socket.socket(
|
||||
socket.AF_INET, socket.SOCK_STREAM)
|
||||
#bind the socket to a public host,
|
||||
# and a well-known port
|
||||
serversocket.bind((socket.gethostname(), 80))
|
||||
#become a server socket
|
||||
serversocket.listen(5)
|
||||
\end{verbatim}
|
||||
|
||||
A couple of things to notice: we used \code{socket.gethostname()}
|
||||
so that the socket would be visible to the outside world. If we had
|
||||
used \code{s.bind(('localhost', 80))} or \code{s.bind(('127.0.0.1', 80))}
we would still have a "server" socket, but one that was only visible
within the same machine. By contrast, \code{s.bind(('', 80))} makes the
socket reachable on any address the machine happens to have.
|
||||
|
||||
A second thing to note: low number ports are usually reserved for
|
||||
"well known" services (HTTP, SNMP etc). If you're playing around, use
|
||||
a nice high number (4 digits).
|
||||
|
||||
Finally, the argument to \code{listen} tells the socket library that
|
||||
we want it to queue up as many as 5 connect requests (the normal max)
|
||||
before refusing outside connections. If the rest of the code is
|
||||
written properly, that should be plenty.
|
||||
|
||||
OK, now we have a "server" socket, listening on port 80. Now we enter
|
||||
the mainloop of the web server:
|
||||
|
||||
\begin{verbatim}
|
||||
while 1:
    #accept connections from outside
    (clientsocket, address) = serversocket.accept()
    #now do something with the clientsocket
    #in this case, we'll pretend this is a threaded server
    ct = client_thread(clientsocket)
    ct.run()
|
||||
\end{verbatim}
|
||||
|
||||
There are actually 3 general ways in which this loop could work -
dispatching a thread to handle \code{clientsocket}, creating a new
process to handle \code{clientsocket}, or restructuring this app
to use non-blocking sockets, and multiplex between our "server" socket
|
||||
and any active \code{clientsocket}s using
|
||||
\code{select}. More about that later. The important thing to
|
||||
understand now is this: this is \emph{all} a "server" socket
|
||||
does. It doesn't send any data. It doesn't receive any data. It just
|
||||
produces "client" sockets. Each \code{clientsocket} is created
|
||||
in response to some \emph{other} "client" socket doing a
|
||||
\code{connect()} to the host and port we're bound to. As soon as
|
||||
we've created that \code{clientsocket}, we go back to listening
|
||||
for more connections. The two "clients" are free to chat it up - they
|
||||
are using some dynamically allocated port which will be recycled when
|
||||
the conversation ends.
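
To see the two kinds of socket cooperate, here is a minimal sketch in Python 3 syntax that runs both ends in one process over the loopback interface. The helper \code{serve_one}, the use of port 0 (letting the operating system pick a free port) and the echo behaviour are assumptions made for the demonstration; \code{shutdown} is covered later in this document:

```python
import socket
import threading

def serve_one(serversocket):
    """Accept a single connection and echo back whatever arrives."""
    clientsocket, address = serversocket.accept()
    try:
        while True:
            chunk = clientsocket.recv(4096)
            if not chunk:          # peer shut down its sending side
                break
            clientsocket.sendall(chunk)
    finally:
        clientsocket.close()

# "server" socket: bound to the loopback interface, port chosen by the OS
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind(("127.0.0.1", 0))
serversocket.listen(5)
port = serversocket.getsockname()[1]

worker = threading.Thread(target=serve_one, args=(serversocket,))
worker.start()

# "client" socket: connect, send, signal end of request, read the echo
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("127.0.0.1", port))
s.sendall(b"hello")
s.shutdown(socket.SHUT_WR)
reply = b""
while True:
    chunk = s.recv(4096)
    if not chunk:
        break
    reply += chunk
s.close()
worker.join()
serversocket.close()
```

The "server" socket itself never carries a byte of the conversation; all traffic flows between \code{s} and the \code{clientsocket} returned by \code{accept}.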
|
||||
|
||||
\subsection{IPC}

If you need fast IPC between two processes
|
||||
on one machine, you should look into whatever form of shared memory
|
||||
the platform offers. A simple protocol based around shared memory and
|
||||
locks or semaphores is by far the fastest technique.
|
||||
|
||||
If you do decide to use sockets, bind the "server" socket to
|
||||
\code{'localhost'}. On most platforms, this will take a shortcut
|
||||
around a couple of layers of network code and be quite a bit faster.
|
||||
|
||||
|
||||
\section{Using a Socket}
|
||||
|
||||
The first thing to note is that the web browser's "client" socket and
|
||||
the web server's "client" socket are identical beasts. That is, this
|
||||
is a "peer to peer" conversation. Or to put it another way, \emph{as the
|
||||
designer, you will have to decide what the rules of etiquette are for
|
||||
a conversation}. Normally, the \code{connect}ing socket
|
||||
starts the conversation, by sending in a request, or perhaps a
|
||||
signon. But that's a design decision - it's not a rule of sockets.
|
||||
|
||||
Now there are two sets of verbs to use for communication. You can use
|
||||
\code{send} and \code{recv}, or you can transform your
|
||||
client socket into a file-like beast and use \code{read} and
|
||||
\code{write}. The latter is the way Java presents its
|
||||
sockets. I'm not going to talk about it here, except to warn you that
|
||||
you need to use \code{flush} on sockets. These are buffered
|
||||
"files", and a common mistake is to \code{write} something, and
|
||||
then \code{read} for a reply. Without a \code{flush} in
|
||||
there, you may wait forever for the reply, because the request may
|
||||
still be in your output buffer.
|
||||
|
||||
Now we come to the major stumbling block of sockets - \code{send}
|
||||
and \code{recv} operate on the network buffers. They do not
|
||||
necessarily handle all the bytes you hand them (or expect from them),
|
||||
because their major focus is handling the network buffers. In general,
|
||||
they return when the associated network buffers have been filled
|
||||
(\code{send}) or emptied (\code{recv}). They then tell you
|
||||
how many bytes they handled. It is \emph{your} responsibility to call
|
||||
them again until your message has been completely dealt with.
|
||||
|
||||
When a \code{recv} returns 0 bytes, it means the other side has
|
||||
closed (or is in the process of closing) the connection. You will not
|
||||
receive any more data on this connection. Ever. You may be able to
|
||||
send data successfully; I'll talk more about this later.
|
||||
|
||||
A protocol like HTTP uses a socket for only one transfer. The client
|
||||
sends a request, then reads a reply. That's it. The socket is
|
||||
discarded. This means that a client can detect the end of the reply by
|
||||
receiving 0 bytes.
|
||||
|
||||
But if you plan to reuse your socket for further transfers, you need
|
||||
to realize that \emph{there is no "EOT" (End of Transfer) on a
|
||||
socket.} I repeat: if a socket \code{send} or
|
||||
\code{recv} returns after handling 0 bytes, the connection has
|
||||
been broken. If the connection has \emph{not} been broken, you may
|
||||
wait on a \code{recv} forever, because the socket will
|
||||
\emph{not} tell you that there's nothing more to read (for now). Now
|
||||
if you think about that a bit, you'll come to realize a fundamental
|
||||
truth of sockets: \emph{messages must either be fixed length} (yuck),
|
||||
\emph{or be delimited} (shrug), \emph{or indicate how long they are}
|
||||
(much better), \emph{or end by shutting down the connection}. The
|
||||
choice is entirely yours, (but some ways are righter than others).
|
||||
|
||||
Assuming you don't want to end the connection, the simplest solution
|
||||
is a fixed length message:
|
||||
|
||||
\begin{verbatim}
|
||||
import socket

# MSGLEN is the fixed message length both ends have agreed on;
# define it before using this class.

class mysocket:
    '''demonstration class only
      - coded for clarity, not efficiency
    '''

    def __init__(self, sock=None):
        if sock is None:
            self.sock = socket.socket(
                socket.AF_INET, socket.SOCK_STREAM)
        else:
            self.sock = sock

    def connect(self, host, port):
        self.sock.connect((host, port))

    def mysend(self, msg):
        totalsent = 0
        while totalsent < MSGLEN:
            sent = self.sock.send(msg[totalsent:])
            if sent == 0:
                raise RuntimeError("socket connection broken")
            totalsent = totalsent + sent

    def myreceive(self):
        msg = ''
        while len(msg) < MSGLEN:
            chunk = self.sock.recv(MSGLEN-len(msg))
            if chunk == '':
                raise RuntimeError("socket connection broken")
            msg = msg + chunk
        return msg
|
||||
\end{verbatim}
|
||||
|
||||
The sending code here is usable for almost any messaging scheme - in
|
||||
Python you send strings, and you can use \code{len()} to
|
||||
determine its length (even if it has embedded \code{\e 0}
|
||||
characters). It's mostly the receiving code that gets more
|
||||
complex. (And in C, it's not much worse, except you can't use
|
||||
\code{strlen} if the message has embedded \code{\e 0}s.)
|
||||
|
||||
The easiest enhancement is to make the first character of the message
|
||||
an indicator of message type, and have the type determine the
|
||||
length. Now you have two \code{recv}s - the first to get (at
|
||||
least) that first character so you can look up the length, and the
|
||||
second in a loop to get the rest. If you decide to go the delimited
|
||||
route, you'll be receiving in some arbitrary chunk size, (4096 or 8192
|
||||
is frequently a good match for network buffer sizes), and scanning
|
||||
what you've received for a delimiter.
|
||||
|
||||
One complication to be aware of: if your conversational protocol
|
||||
allows multiple messages to be sent back to back (without some kind of
|
||||
reply), and you pass \code{recv} an arbitrary chunk size, you
|
||||
may end up reading the start of a following message. You'll need to
|
||||
put that aside and hold onto it, until it's needed.
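
One way to hold onto the leftover bytes is to keep them in a buffer that lives as long as the connection. The sketch below (Python 3 syntax, where sockets carry bytes) assumes newline-delimited messages; the class name \code{LineReader} is hypothetical, and \code{socket.socketpair()} is used only to get two connected sockets without a network:

```python
import socket

class LineReader:
    """Hypothetical helper: split a byte stream into delimited messages."""

    def __init__(self, sock, delimiter=b"\n"):
        self.sock = sock
        self.delimiter = delimiter
        self.pending = b""     # bytes received but not yet handed out

    def next_message(self):
        # Keep receiving until the buffer holds at least one full message.
        while self.delimiter not in self.pending:
            chunk = self.sock.recv(4096)
            if chunk == b"":
                raise RuntimeError("socket connection broken")
            self.pending += chunk
        # Hand out one message; whatever follows stays in the buffer.
        message, _, self.pending = self.pending.partition(self.delimiter)
        return message

# Two messages sent back to back may well arrive in a single recv.
left, right = socket.socketpair()
left.sendall(b"first\nsecond\n")
reader = LineReader(right)
messages = [reader.next_message(), reader.next_message()]
left.close()
right.close()
```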
|
||||
|
||||
Prefixing the message with its length (say, as 5 numeric characters)
|
||||
gets more complex, because (believe it or not), you may not get all 5
|
||||
characters in one \code{recv}. In playing around, you'll get
|
||||
away with it; but in high network loads, your code will very quickly
|
||||
break unless you use two \code{recv} loops - the first to
|
||||
determine the length, the second to get the data part of the
|
||||
message. Nasty. This is also when you'll discover that
|
||||
\code{send} does not always manage to get rid of everything in
|
||||
one pass. And despite having read this, you will eventually get bit by
|
||||
it!
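
For the impatient, the two-loop approach can be sketched as follows in Python 3 syntax. The function names are hypothetical, the 5-digit prefix follows the example in the text, and \code{sendall} is used so that the library does the send loop:

```python
import socket

HEADER_LEN = 5   # length prefix: 5 ASCII digits, as in the text above

def recv_exactly(sock, count):
    """Loop on recv until exactly `count` bytes have arrived."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if chunk == b"":
            raise RuntimeError("socket connection broken")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def send_message(sock, payload):
    if len(payload) >= 10 ** HEADER_LEN:
        raise ValueError("payload too long for a 5-digit length prefix")
    header = ("%0*d" % (HEADER_LEN, len(payload))).encode("ascii")
    sock.sendall(header + payload)   # sendall repeats send() until done

def recv_message(sock):
    header = recv_exactly(sock, HEADER_LEN)          # first loop: the length
    length = int(header.decode("ascii"))
    return recv_exactly(sock, length)                # second loop: the data

left, right = socket.socketpair()
send_message(left, b"hello")
send_message(left, b"")
received = [recv_message(right), recv_message(right)]
left.close()
right.close()
```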
|
||||
|
||||
In the interests of space, building your character, (and preserving my
|
||||
competitive position), these enhancements are left as an exercise for
|
||||
the reader. Let's move on to cleaning up.
|
||||
|
||||
\subsection{Binary Data}
|
||||
|
||||
It is perfectly possible to send binary data over a socket. The major
|
||||
problem is that not all machines use the same formats for binary
|
||||
data. For example, a Motorola chip will represent a 16 bit integer
|
||||
with the value 1 as the two hex bytes 00 01. Intel and DEC, however,
|
||||
are byte-reversed - that same 1 is 01 00. Socket libraries have calls
|
||||
for converting 16 and 32 bit integers - \code{ntohl, htonl, ntohs,
|
||||
htons} where "n" means \emph{network} and "h" means \emph{host},
|
||||
"s" means \emph{short} and "l" means \emph{long}. Where network order
|
||||
is host order, these do nothing, but where the machine is
|
||||
byte-reversed, these swap the bytes around appropriately.
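
The byte orders described above can be inspected from Python. This sketch (Python 3 syntax) uses the standard \module{struct} module alongside the \module{socket} conversion functions; the variable names are arbitrary:

```python
import socket
import struct

# The 16 bit integer 1 in both byte orders.
big_endian = struct.pack(">H", 1)      # network order: 00 01
little_endian = struct.pack("<H", 1)   # byte-reversed: 01 00

# htons/ntohs undo each other whatever the host order is.
round_trip = socket.ntohs(socket.htons(1))

# Writing htons(1) out in native order always yields network order bytes.
network_bytes = struct.pack("=H", socket.htons(1))
```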
|
||||
|
||||
In these days of 32 bit machines, the ascii representation of binary
|
||||
data is frequently smaller than the binary representation. That's
|
||||
because a surprising amount of the time, all those longs have the
|
||||
value 0, or maybe 1. The string "0" would be two bytes, while binary
|
||||
is four. Of course, this doesn't fit well with fixed-length
|
||||
messages. Decisions, decisions.
|
||||
|
||||
\section{Disconnecting}
|
||||
|
||||
Strictly speaking, you're supposed to use \code{shutdown} on a
|
||||
socket before you \code{close} it. The \code{shutdown} is
|
||||
an advisory to the socket at the other end. Depending on the argument
|
||||
you pass it, it can mean "I'm not going to send anymore, but I'll
|
||||
still listen", or "I'm not listening, good riddance!". Most socket
|
||||
libraries, however, are so used to programmers neglecting to use this
|
||||
piece of etiquette that normally a \code{close} is the same as
|
||||
\code{shutdown(); close()}. So in most situations, an explicit
|
||||
\code{shutdown} is not needed.
|
||||
|
||||
One way to use \code{shutdown} effectively is in an HTTP-like
|
||||
exchange. The client sends a request and then does a
|
||||
\code{shutdown(1)}. This tells the server "This client is done
|
||||
sending, but can still receive." The server can detect "EOF" by a
|
||||
receive of 0 bytes. It can assume it has the complete request. The
|
||||
server sends a reply. If the \code{send} completes successfully
|
||||
then, indeed, the client was still receiving.
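
The exchange just described can be sketched as follows in Python 3 syntax. Both ends run in one process over \code{socket.socketpair()} to keep the example self-contained, the request and reply strings are made up, and \code{socket.SHUT_WR} is the symbolic name for the argument written as 1 above:

```python
import socket

def read_until_eof(sock):
    """Collect data until recv returns 0 bytes (the peer is done sending)."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if chunk == b"":
            break
        chunks.append(chunk)
    return b"".join(chunks)

client, server = socket.socketpair()

# Client: send the request, then announce "done sending, still listening".
client.sendall(b"GET /status")
client.shutdown(socket.SHUT_WR)

# Server: the 0 byte receive marks the end of the request.
request = read_until_eof(server)
server.sendall(b"200 OK")
server.close()

# Client: the reply ends when the server closes its side.
reply = read_until_eof(client)
client.close()
```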
|
||||
|
||||
Python takes the automatic shutdown a step further, and says that when a socket is garbage collected, it will automatically do a \code{close} if it's needed. But relying on this is a very bad habit. If your socket just disappears without doing a \code{close}, the socket at the other end may hang indefinitely, thinking you're just being slow. \emph{Please} \code{close} your sockets when you're done.
|
||||
|
||||
|
||||
\subsection{When Sockets Die}
|
||||
|
||||
Probably the worst thing about using blocking sockets is what happens
|
||||
when the other side comes down hard (without doing a
|
||||
\code{close}). Your socket is likely to hang. TCP is a
|
||||
reliable protocol, and it will wait a long, long time before giving up
|
||||
on a connection. If you're using threads, the entire thread is
|
||||
essentially dead. There's not much you can do about it. As long as you
|
||||
aren't doing something dumb, like holding a lock while doing a
|
||||
blocking read, the thread isn't really consuming much in the way of
|
||||
resources. Do \emph{not} try to kill the thread - part of the reason
|
||||
that threads are more efficient than processes is that they avoid the
|
||||
overhead associated with the automatic recycling of resources. In
|
||||
other words, if you do manage to kill the thread, your whole process
|
||||
is likely to be screwed up.
|
||||
|
||||
\section{Non-blocking Sockets}
|
||||
|
||||
If you've understood the preceding, you already know most of what you
|
||||
need to know about the mechanics of using sockets. You'll still use
|
||||
the same calls, in much the same ways. It's just that, if you do it
|
||||
right, your app will be almost inside-out.
|
||||
|
||||
In Python, you use \code{socket.setblocking(0)} to make it
|
||||
non-blocking. In C, it's more complex, (for one thing, you'll need to
|
||||
choose between the Posix flavor \code{O_NONBLOCK} and the almost
indistinguishable older flavor \code{O_NDELAY}, which is
|
||||
completely different from \code{TCP_NODELAY}), but it's the
|
||||
exact same idea. You do this after creating the socket, but before
|
||||
using it. (Actually, if you're nuts, you can switch back and forth.)
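
What "non-blocking" means in practice can be seen in this small sketch (Python 3 syntax, where a call that would have blocked raises \exception{BlockingIOError}). The connected pair from \code{socket.socketpair()} is used only so that the example needs no network:

```python
import socket

a, b = socket.socketpair()
a.setblocking(False)      # same effect as setblocking(0)

# Nothing has been sent to "a", so a blocking recv would wait here.
# The non-blocking socket reports that fact immediately instead.
try:
    a.recv(4096)
    would_block = False
except BlockingIOError:
    would_block = True

a.close()
b.close()
```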
|
||||
|
||||
The major mechanical difference is that \code{send},
|
||||
\code{recv}, \code{connect} and \code{accept} can
|
||||
return without having done anything. You have (of course) a number of
|
||||
choices. You can check return code and error codes and generally drive
|
||||
yourself crazy. If you don't believe me, try it sometime. Your app
|
||||
will grow large, buggy and suck CPU. So let's skip the brain-dead
|
||||
solutions and do it right.
|
||||
|
||||
Use \code{select}.
|
||||
|
||||
In C, coding \code{select} is fairly complex. In Python, it's a
|
||||
piece of cake, but it's close enough to the C version that if you
|
||||
understand \code{select} in Python, you'll have little trouble
|
||||
with it in C.
|
||||
|
||||
\begin{verbatim}
ready_to_read, ready_to_write, in_error = \
               select.select(
                  potential_readers,
                  potential_writers,
                  potential_errs,
                  timeout)
|
||||
\end{verbatim}
|
||||
|
||||
You pass \code{select} three lists: the first contains all
|
||||
sockets that you might want to try reading; the second all the sockets
|
||||
you might want to try writing to, and the last (normally left empty)
|
||||
those that you want to check for errors. You should note that a
|
||||
socket can go into more than one list. The \code{select} call is
|
||||
blocking, but you can give it a timeout. This is generally a sensible
|
||||
thing to do - give it a nice long timeout (say a minute) unless you
|
||||
have good reason to do otherwise.
|
||||
|
||||
In return, you will get three lists. They have the sockets that are
|
||||
actually readable, writable and in error. Each of these lists is a
|
||||
subset (possibly empty) of the corresponding list you passed in. A
socket that you put in more than one input list can show up in more
than one output list, for example as both readable and writable.
|
||||
|
||||
If a socket is in the output readable list, you can be
|
||||
as-close-to-certain-as-we-ever-get-in-this-business that a
|
||||
\code{recv} on that socket will return \emph{something}. Same
|
||||
idea for the writable list. You'll be able to send
|
||||
\emph{something}. Maybe not all you want to, but \emph{something} is
|
||||
better than nothing. (Actually, any reasonably healthy socket will
|
||||
return as writable - it just means outbound network buffer space is
|
||||
available.)
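
A minimal sketch of a \code{select} call and its results, in Python 3 syntax. It uses \code{socket.socketpair()} for a connected pair; the timeouts are arbitrary:

```python
import select
import socket

a, b = socket.socketpair()

# Nothing has been sent yet: "a" is writable but not readable.
readable, writable, in_error = select.select([a], [a], [], 0)
before = (a in readable, a in writable)

# Send something to "a" and wait (up to 5 seconds) until it can be read.
b.sendall(b"data")
select.select([a], [], [], 5.0)

# Now "a" appears in both output lists.
readable, writable, in_error = select.select([a], [a], [], 0)
after = (a in readable, a in writable)

a.close()
b.close()
```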
|
||||
|
||||
If you have a "server" socket, put it in the \code{potential_readers}
|
||||
list. If it comes out in the readable list, your \code{accept}
|
||||
will (almost certainly) work. If you have created a new socket to
|
||||
\code{connect} to someone else, put it in the \code{potential_writers}
|
||||
list. If it shows up in the writable list, you have a decent chance
|
||||
that it has connected.
|
||||
|
||||
One very nasty problem with \code{select}: if somewhere in those
|
||||
input lists of sockets is one which has died a nasty death, the
|
||||
\code{select} will fail. You then need to loop through every
|
||||
single damn socket in all those lists and do a
|
||||
\code{select([sock],[],[],0)} until you find the bad one. That
|
||||
timeout of 0 means it won't take long, but it's ugly.
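
That search might be sketched like this in Python 3 syntax. The function name \code{find_bad_sockets} is hypothetical, and the exact exception raised for a dead socket varies, so the sketch catches both \exception{OSError} and \exception{ValueError}; here the "bad" socket is simply one that has already been closed:

```python
import select
import socket

def find_bad_sockets(sockets):
    """Probe each socket on its own with a zero timeout; return the failures."""
    bad = []
    for sock in sockets:
        try:
            select.select([sock], [], [], 0)
        except (OSError, ValueError):
            bad.append(sock)
    return bad

a, b = socket.socketpair()
b.close()                       # "b" is now unusable in select
bad = find_bad_sockets([a, b])
a.close()
```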
|
||||
|
||||
Actually, \code{select} can be handy even with blocking sockets.
|
||||
It's one way of determining whether you will block - the socket
|
||||
returns as readable when there's something in the buffers. However,
|
||||
this still doesn't help with the problem of determining whether the
|
||||
other end is done, or just busy with something else.
|
||||
|
||||
\textbf{Portability alert}: On Unix, \code{select} works with both
sockets and files. Don't try this on Windows. On Windows,
|
||||
\code{select} works with sockets only. Also note that in C, many
|
||||
of the more advanced socket options are done differently on
|
||||
Windows. In fact, on Windows I usually use threads (which work very,
|
||||
very well) with my sockets. Face it, if you want any kind of
|
||||
performance, your code will look very different on Windows than on
|
||||
Unix. (I haven't the foggiest how you do this stuff on a Mac.)
|
||||
|
||||
\subsection{Performance}
|
||||
|
||||
There's no question that the fastest sockets code uses non-blocking
|
||||
sockets and select to multiplex them. You can put together something
|
||||
that will saturate a LAN connection without putting any strain on the
|
||||
CPU. The trouble is that an app written this way can't do much of
|
||||
anything else - it needs to be ready to shuffle bytes around at all
|
||||
times.
|
||||
|
||||
Assuming that your app is actually supposed to do something more than
|
||||
that, threading is the optimal solution (and using non-blocking
|
||||
sockets will be faster than using blocking sockets). Unfortunately,
|
||||
threading support in Unixes varies both in API and quality. So the
|
||||
normal Unix solution is to fork a subprocess to deal with each
|
||||
connection. The overhead for this is significant (and don't do this on
|
||||
Windows - the overhead of process creation is enormous there). It also
|
||||
means that unless each subprocess is completely independent, you'll
|
||||
need to use another form of IPC, say a pipe, or shared memory and
|
||||
semaphores, to communicate between the parent and child processes.
|
||||
|
||||
Finally, remember that even though blocking sockets are somewhat
|
||||
slower than non-blocking, in many cases they are the "right"
|
||||
solution. After all, if your app is driven by the data it receives
|
||||
over a socket, there's not much sense in complicating the logic just
|
||||
so your app can wait on \code{select} instead of
|
||||
\code{recv}.
|
||||
|
||||
\end{document}
|
|
@ -1,766 +0,0 @@
|
|||
Unicode HOWTO
|
||||
================
|
||||
|
||||
**Version 1.02**
|
||||
|
||||
This HOWTO discusses Python's support for Unicode, and explains various
|
||||
problems that people commonly encounter when trying to work with Unicode.
|
||||
|
||||
Introduction to Unicode
|
||||
------------------------------
|
||||
|
||||
History of Character Codes
|
||||
''''''''''''''''''''''''''''''
|
||||
|
||||
In 1968, the American Standard Code for Information Interchange,
|
||||
better known by its acronym ASCII, was standardized. ASCII defined
|
||||
numeric codes for various characters, with the numeric values running from 0 to
|
||||
127. For example, the lowercase letter 'a' is assigned 97 as its code
|
||||
value.
|
||||
|
||||
ASCII was an American-developed standard, so it only defined
|
||||
unaccented characters. There was an 'e', but no 'é' or 'Í'. This
|
||||
meant that languages which required accented characters couldn't be
|
||||
faithfully represented in ASCII. (Actually the missing accents matter
|
||||
for English, too, which contains words such as 'naïve' and 'café', and some
|
||||
publications have house styles which require spellings such as
|
||||
'coöperate'.)
|
||||
|
||||
For a while people just wrote programs that didn't display accents. I
|
||||
remember looking at Apple ][ BASIC programs, published in French-language
|
||||
publications in the mid-1980s, that had lines like these::
|
||||
|
||||
PRINT "FICHIER EST COMPLETE."
|
||||
PRINT "CARACTERE NON ACCEPTE."
|
||||
|
||||
Those messages should contain accents, and they just look wrong to
|
||||
someone who can read French.
|
||||
|
||||
In the 1980s, almost all personal computers were 8-bit, meaning that
|
||||
bytes could hold values ranging from 0 to 255. ASCII codes only went
|
||||
up to 127, so some machines assigned values between 128 and 255 to
|
||||
accented characters. Different machines had different codes, however,
|
||||
which led to problems exchanging files. Eventually various commonly
|
||||
used sets of values for the 128-255 range emerged. Some were true
|
||||
standards, defined by the International Organization for Standardization, and
|
||||
some were **de facto** conventions that were invented by one company
|
||||
or another and managed to catch on.
|
||||
|
||||
256 characters aren't very many. For example, you can't fit
|
||||
both the accented characters used in Western Europe and the Cyrillic
|
||||
alphabet used for Russian into the 128-255 range because there are more than
|
||||
128 such characters.
|
||||
|
||||
You could write files using different codes (all your Russian
|
||||
files in a coding system called KOI8, all your French files in
|
||||
a different coding system called Latin1), but what if you wanted
|
||||
to write a French document that quotes some Russian text? In the
|
||||
1980s people began to want to solve this problem, and the Unicode
|
||||
standardization effort began.
|
||||
|
||||
Unicode started out using 16-bit characters instead of 8-bit characters. 16
|
||||
bits means you have 2^16 = 65,536 distinct values available, making it
|
||||
possible to represent many different characters from many different
|
||||
alphabets; an initial goal was to have Unicode contain the alphabets for
|
||||
every single human language. It turns out that even 16 bits isn't enough to
|
||||
meet that goal, and the modern Unicode specification uses a wider range of
|
||||
codes, 0-1,114,111 (0x10ffff in base-16).
|
||||
|
||||
There's a related ISO standard, ISO 10646. Unicode and ISO 10646 were
|
||||
originally separate efforts, but the specifications were merged with
|
||||
the 1.1 revision of Unicode.
|
||||
|
||||
(This discussion of Unicode's history is highly simplified. I don't
|
||||
think the average Python programmer needs to worry about the
|
||||
historical details; consult the Unicode consortium site listed in the
|
||||
References for more information.)
|
||||
|
||||
|
||||
Definitions
|
||||
''''''''''''''''''''''''
|
||||
|
||||
A **character** is the smallest possible component of a text. 'A',
|
||||
'B', 'C', etc., are all different characters. So are 'È' and
|
||||
'Í'. Characters are abstractions, and vary depending on the
|
||||
language or context you're talking about. For example, the symbol for
|
||||
ohms (Ω) is usually drawn much like the capital letter
|
||||
omega (Ω) in the Greek alphabet (they may even be the same in
|
||||
some fonts), but these are two different characters that have
|
||||
different meanings.
|
||||
|
||||
The Unicode standard describes how characters are represented by
|
||||
**code points**. A code point is an integer value, usually denoted in
|
||||
base 16. In the standard, a code point is written using the notation
|
||||
U+12ca to mean the character with value 0x12ca (4810 decimal). The
|
||||
Unicode standard contains a lot of tables listing characters and their
|
||||
corresponding code points::
|
||||
|
||||
0061 'a'; LATIN SMALL LETTER A
|
||||
0062 'b'; LATIN SMALL LETTER B
|
||||
0063 'c'; LATIN SMALL LETTER C
|
||||
...
|
||||
007B '{'; LEFT CURLY BRACKET
|
||||
|
||||
Strictly, these definitions imply that it's meaningless to say 'this is
|
||||
character U+12ca'. U+12ca is a code point, which represents some particular
|
||||
character; in this case, it represents the character 'ETHIOPIC SYLLABLE WI'.
|
||||
In informal contexts, this distinction between code points and characters will
|
||||
sometimes be forgotten.
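
To make the notation concrete, the following sketch looks up the code point used in the text. It is written for Python 3.3 or later (the ``u`` prefix is kept only to match the rest of this document) and relies on the standard ``unicodedata`` module.

```python
import unicodedata

code_point = 0x12ca                 # written U+12CA in the standard
char = u'\u12ca'                    # the character at that code point

# ord() maps a one-character string back to its integer code point.
same_value = ord(char) == code_point == 4810

label = 'U+%04X %s' % (code_point, unicodedata.name(char))
print(label)
```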
|
||||
|
||||
A character is represented on a screen or on paper by a set of graphical
|
||||
elements that's called a **glyph**. The glyph for an uppercase A, for
|
||||
example, is two diagonal strokes and a horizontal stroke, though the exact
|
||||
details will depend on the font being used. Most Python code doesn't need
|
||||
to worry about glyphs; figuring out the correct glyph to display is
|
||||
generally the job of a GUI toolkit or a terminal's font renderer.
|
||||
|
||||
|
||||
Encodings
|
||||
'''''''''
|
||||
|
||||
To summarize the previous section:
|
||||
a Unicode string is a sequence of code points, which are
|
||||
numbers from 0 to 0x10ffff. This sequence needs to be represented as
|
||||
a sequence of bytes (that is, values from 0 to 255) in memory. The rules for
|
||||
translating a Unicode string into a sequence of bytes are called an
|
||||
**encoding**.
|
||||
|
||||
The first encoding you might think of is an array of 32-bit integers.
|
||||
In this representation, the string "Python" would look like this::
|
||||
|
||||
   P           y           t           h           o           n
0x 50 00 00 00 79 00 00 00 74 00 00 00 68 00 00 00 6f 00 00 00 6e 00 00 00
   0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23
|
||||
|
||||
This representation is straightforward but using
|
||||
it presents a number of problems.
|
||||
|
||||
1. It's not portable; different processors order the bytes
|
||||
differently.
|
||||
|
||||
2. It's very wasteful of space. In most texts, the majority of the code
|
||||
points are less than 127, or less than 255, so a lot of space is occupied
|
||||
by zero bytes. The above string takes 24 bytes compared to the 6
|
||||
bytes needed for an ASCII representation. Increased RAM usage doesn't
|
||||
matter too much (desktop computers have megabytes of RAM, and strings
|
||||
aren't usually that large), but expanding our usage of disk and
|
||||
network bandwidth by a factor of 4 is intolerable.
|
||||
|
||||
3. It's not compatible with existing C functions such as ``strlen()``,
|
||||
so a new family of wide string functions would need to be used.
|
||||
|
||||
4. Many Internet standards are defined in terms of textual data, and
|
||||
can't handle content with embedded zero bytes.
|
||||
|
||||
Generally people don't use this encoding, instead choosing other encodings
|
||||
that are more efficient and convenient.
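
The size and byte-order problems listed above can be seen with a short sketch. It assumes Python 3.3 or later and uses the ``utf-32-le`` and ``utf-32-be`` codecs as a stand-in for "an array of 32-bit integers"; these codec names are not mentioned in the text above.

```python
text = u'Python'

as_utf32_le = text.encode('utf-32-le')   # little-endian 32-bit units
as_utf32_be = text.encode('utf-32-be')   # same code points, other byte order
as_ascii = text.encode('ascii')

sizes = (len(as_utf32_le), len(as_ascii))      # 4 bytes vs. 1 byte per character
zero_bytes = as_utf32_le.count(b'\x00')        # padding bytes carrying no information
first_le, first_be = as_utf32_le[:4], as_utf32_be[:4]
```

The two byte orders produce different byte strings for the same text, which is the portability problem in point 1.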
|
||||
|
||||
Encodings don't have to handle every possible Unicode character, and
|
||||
most encodings don't. For example, Python's default encoding is the
|
||||
'ascii' encoding. The rules for converting a Unicode string into the
|
||||
ASCII encoding are simple; for each code point:
|
||||
|
||||
1. If the code point is <128, each byte is the same as the value of the
|
||||
code point.
|
||||
|
||||
2. If the code point is 128 or greater, the Unicode string can't
|
||||
be represented in this encoding. (Python raises a
|
||||
``UnicodeEncodeError`` exception in this case.)
|
||||
|
||||
Latin-1, also known as ISO-8859-1, is a similar encoding. Unicode
|
||||
code points 0-255 are identical to the Latin-1 values, so converting
|
||||
to this encoding simply requires converting code points to byte
|
||||
values; if a code point larger than 255 is encountered, the string
|
||||
can't be encoded into Latin-1.
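
The Latin-1 rule is simple enough to write out by hand. The function below is a hypothetical illustration for Python 3.3 or later, not how the real codec is implemented; it just copies each code point into one byte and gives up on anything above 255.

```python
def encode_latin1(text):
    """Encode text as Latin-1 by copying each code point into one byte."""
    out = bytearray()
    for ch in text:
        code_point = ord(ch)
        if code_point > 255:
            raise ValueError('U+%04X cannot be encoded in Latin-1' % code_point)
        out.append(code_point)
    return bytes(out)

encoded = encode_latin1(u'caf\xe9')
matches_builtin = encoded == u'caf\xe9'.encode('latin-1')
```

Replacing the 255 limit with 127 would give the ASCII rule described earlier.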
|
||||
|
||||
Encodings don't have to be simple one-to-one mappings like Latin-1.
|
||||
Consider IBM's EBCDIC, which was used on IBM mainframes. Letter
|
||||
values weren't in one block: 'a' through 'i' had values from 129 to
|
||||
137, but 'j' through 'r' were 145 through 153. If you wanted to use
|
||||
EBCDIC as an encoding, you'd probably use some sort of lookup table to
|
||||
perform the conversion, but this is largely an internal detail.
|
||||
|
||||
UTF-8 is one of the most commonly used encodings. UTF stands for
|
||||
"Unicode Transformation Format", and the '8' means that 8-bit numbers
|
||||
are used in the encoding. (There's also a UTF-16 encoding, but it's
|
||||
less frequently used than UTF-8.) UTF-8 uses the following rules:
|
||||
|
||||
1. If the code point is <128, it's represented by the corresponding byte value.
|
||||
2. If the code point is between 128 and 0x7ff, it's turned into two byte values
|
||||
between 128 and 255.
|
||||
3. Code points >0x7ff are turned into three- or four-byte sequences, where
|
||||
each byte of the sequence is between 128 and 255.
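
The three rules can be checked with a few sample characters. This sketch assumes Python 3.3 or later; the sample characters are arbitrary choices, one from each length class.

```python
# One sample per rule: below 128, up to 0x7ff, and two above 0x7ff.
samples = [u'a', u'\xe9', u'\u20ac', u'\U00010348']

encoded = [ch.encode('utf-8') for ch in samples]
lengths = [len(b) for b in encoded]

# Every byte of a multi-byte sequence lies in the range 128-255.
multibyte_ok = all(byte >= 128
                   for b in encoded[1:]
                   for byte in bytearray(b))
```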
|
||||
|
||||
UTF-8 has several convenient properties:
|
||||
|
||||
1. It can handle any Unicode code point.
|
||||
2. A Unicode string is turned into a string of bytes containing no embedded zero bytes. This avoids byte-ordering issues, and means UTF-8 strings can be processed by C functions such as ``strcpy()`` and sent through protocols that can't handle zero bytes.
|
||||
3. A string of ASCII text is also valid UTF-8 text.
|
||||
4. UTF-8 is fairly compact; code points less than 128 occupy only a single byte, and many other commonly used characters need just two.
|
||||
5. If bytes are corrupted or lost, it's possible to determine the start of the next UTF-8-encoded code point and resynchronize. It's also unlikely that random 8-bit data will look like valid UTF-8.
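
Properties 3 and 5 are easy to demonstrate. The sketch below, written for Python 3.3 or later, deliberately drops one byte from an encoded string and decodes the result with the ``'replace'`` error handler; the choice of test string is arbitrary.

```python
good = u'\u20acabc'.encode('utf-8')      # three bytes for the euro sign, then 'abc'
damaged = good[:2] + good[3:]            # lose the last byte of the euro sign

# The decoder flags the broken sequence and picks up again at 'a'.
recovered = damaged.decode('utf-8', 'replace')

# Pure ASCII text is byte-for-byte identical in UTF-8.
ascii_same = u'abc'.encode('utf-8') == u'abc'.encode('ascii')
```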
|
||||
|
||||
|
||||
|
||||
References
|
||||
''''''''''''''
|
||||
|
||||
The Unicode Consortium site at <http://www.unicode.org> has character
|
||||
charts, a glossary, and PDF versions of the Unicode specification. Be
|
||||
prepared for some difficult reading.
|
||||
<http://www.unicode.org/history/> is a chronology of the origin and
|
||||
development of Unicode.
|
||||
|
||||
To help understand the standard, Jukka Korpela has written an
|
||||
introductory guide to reading the Unicode character tables,
|
||||
available at <http://www.cs.tut.fi/~jkorpela/unicode/guide.html>.
|
||||
|
||||
Roman Czyborra wrote another explanation of Unicode's basic principles;
|
||||
it's at <http://czyborra.com/unicode/characters.html>.
|
||||
Czyborra has written a number of other Unicode-related documents,
|
||||
available from <http://www.cyzborra.com>.
|
||||
|
||||
Two other good introductory articles were written by Joel Spolsky
|
||||
<http://www.joelonsoftware.com/articles/Unicode.html> and Jason
|
||||
Orendorff <http://www.jorendorff.com/articles/unicode/>. If this
|
||||
introduction didn't make things clear to you, you should try reading
|
||||
one of these alternate articles before continuing.
|
||||
|
||||
Wikipedia entries are often helpful; see the entries for "character
|
||||
encoding" <http://en.wikipedia.org/wiki/Character_encoding> and UTF-8
|
||||
<http://en.wikipedia.org/wiki/UTF-8>, for example.
|
||||
|
||||
|
||||
Python's Unicode Support
|
||||
------------------------
|
||||
|
||||
Now that you've learned the rudiments of Unicode, we can look at
|
||||
Python's Unicode features.
|
||||
|
||||
|
||||
The Unicode Type
|
||||
'''''''''''''''''''
|
||||
|
||||
Unicode strings are expressed as instances of the ``unicode`` type,
|
||||
one of Python's repertoire of built-in types. It derives from an
|
||||
abstract type called ``basestring``, which is also an ancestor of the
|
||||
``str`` type; you can therefore check if a value is a string type with
|
||||
``isinstance(value, basestring)``. Under the hood, Python represents
|
||||
Unicode strings as either 16- or 32-bit integers, depending on how the
|
||||
Python interpreter was compiled.
|
||||
|
||||
The ``unicode()`` constructor has the signature ``unicode(string[, encoding, errors])``.
|
||||
All of its arguments should be 8-bit strings. The first argument is converted
|
||||
to Unicode using the specified encoding; if you leave off the ``encoding`` argument,
|
||||
the ASCII encoding is used for the conversion, so characters greater than 127 will
|
||||
be treated as errors::
|
||||
|
||||
>>> unicode('abcdef')
|
||||
u'abcdef'
|
||||
>>> s = unicode('abcdef')
|
||||
>>> type(s)
|
||||
<type 'unicode'>
|
||||
>>> unicode('abcdef' + chr(255))
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in ?
|
||||
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 6:
|
||||
ordinal not in range(128)
|
||||
|
||||
The ``errors`` argument specifies the response when the input string can't be converted according to the encoding's rules. Legal values for this argument
|
||||
are 'strict' (raise a ``UnicodeDecodeError`` exception),
|
||||
'replace' (add U+FFFD, 'REPLACEMENT CHARACTER'),
|
||||
or 'ignore' (just leave the character out of the Unicode result).
|
||||
The following examples show the differences::
|
||||
|
||||
>>> unicode('\x80abc', errors='strict')
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in ?
|
||||
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0:
|
||||
ordinal not in range(128)
|
||||
>>> unicode('\x80abc', errors='replace')
|
||||
u'\ufffdabc'
|
||||
>>> unicode('\x80abc', errors='ignore')
|
||||
u'abc'
|
||||
|
||||
Encodings are specified as strings containing the encoding's name.
|
||||
Python 2.4 comes with roughly 100 different encodings; see the Python
|
||||
Library Reference at
|
||||
<http://docs.python.org/lib/standard-encodings.html> for a list. Some
|
||||
encodings have multiple names; for example, 'latin-1', 'iso_8859_1'
|
||||
and '8859' are all synonyms for the same encoding.
|
||||
|
||||
One-character Unicode strings can also be created with the
|
||||
``unichr()`` built-in function, which takes integers and returns a
|
||||
Unicode string of length 1 that contains the corresponding code point.
|
||||
The reverse operation is the built-in ``ord()`` function that takes a
|
||||
one-character Unicode string and returns the code point value::
|
||||
|
||||
>>> unichr(40960)
|
||||
u'\ua000'
|
||||
>>> ord(u'\ua000')
|
||||
40960
|
||||
|
||||
Instances of the ``unicode`` type have many of the same methods as
|
||||
the 8-bit string type for operations such as searching and formatting::
|
||||
|
||||
>>> s = u'Was ever feather so lightly blown to and fro as this multitude?'
|
||||
>>> s.count('e')
|
||||
5
|
||||
>>> s.find('feather')
|
||||
9
|
||||
>>> s.find('bird')
|
||||
-1
|
||||
>>> s.replace('feather', 'sand')
|
||||
u'Was ever sand so lightly blown to and fro as this multitude?'
|
||||
>>> s.upper()
|
||||
u'WAS EVER FEATHER SO LIGHTLY BLOWN TO AND FRO AS THIS MULTITUDE?'
|
||||
|
||||
Note that the arguments to these methods can be Unicode strings or 8-bit strings.
|
||||
8-bit strings will be converted to Unicode before carrying out the operation;
|
||||
Python's default ASCII encoding will be used, so characters greater than 127 will cause an exception::
|
||||
|
||||
>>> s.find('Was\x9f')
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in ?
|
||||
UnicodeDecodeError: 'ascii' codec can't decode byte 0x9f in position 3: ordinal not in range(128)
|
||||
>>> s.find(u'Was\x9f')
|
||||
-1
|
||||
|
||||
Much Python code that operates on strings will therefore work with
|
||||
Unicode strings without requiring any changes to the code. (Input and
|
||||
output code needs more updating for Unicode; more on this later.)
|
||||
|
||||
Another important method is ``.encode([encoding], [errors='strict'])``,
|
||||
which returns an 8-bit string version of the
|
||||
Unicode string, encoded in the requested encoding. The ``errors``
|
||||
parameter is the same as the parameter of the ``unicode()``
|
||||
constructor, with one additional possibility; as well as 'strict',
|
||||
'ignore', and 'replace', you can also pass 'xmlcharrefreplace' which
|
||||
uses XML's character references. The following example shows the
|
||||
different results::
|
||||
|
||||
>>> u = unichr(40960) + u'abcd' + unichr(1972)
|
||||
>>> u.encode('utf-8')
|
||||
'\xea\x80\x80abcd\xde\xb4'
|
||||
>>> u.encode('ascii')
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in ?
|
||||
UnicodeEncodeError: 'ascii' codec can't encode character '\ua000' in position 0: ordinal not in range(128)
|
||||
>>> u.encode('ascii', 'ignore')
|
||||
'abcd'
|
||||
>>> u.encode('ascii', 'replace')
|
||||
'?abcd?'
|
||||
>>> u.encode('ascii', 'xmlcharrefreplace')
|
||||
'&#40960;abcd&#1972;'
|
||||
|
||||
Python's 8-bit strings have a ``.decode([encoding], [errors])`` method
|
||||
that interprets the string using the given encoding::
|
||||
|
||||
>>> u = unichr(40960) + u'abcd' + unichr(1972) # Assemble a string
|
||||
>>> utf8_version = u.encode('utf-8') # Encode as UTF-8
|
||||
>>> type(utf8_version), utf8_version
|
||||
(<type 'str'>, '\xea\x80\x80abcd\xde\xb4')
|
||||
>>> u2 = utf8_version.decode('utf-8') # Decode using UTF-8
|
||||
>>> u == u2 # The two strings match
|
||||
True
|
||||
|
||||
The low-level routines for registering and accessing the available
|
||||
encodings are found in the ``codecs`` module. However, the encoding
|
||||
and decoding functions returned by this module are usually more
|
||||
low-level than is comfortable, so I'm not going to describe the
|
||||
``codecs`` module here. If you need to implement a completely new
|
||||
encoding, you'll need to learn about the ``codecs`` module interfaces,
|
||||
but implementing encodings is a specialized task that also won't be
|
||||
covered here. Consult the Python documentation to learn more about
|
||||
this module.
|
||||
|
||||
The most commonly used part of the ``codecs`` module is the
|
||||
``codecs.open()`` function which will be discussed in the section
|
||||
on input and output.
|
||||
|
||||
|
||||
Unicode Literals in Python Source Code
|
||||
''''''''''''''''''''''''''''''''''''''''''
|
||||
|
||||
In Python source code, Unicode literals are written as strings
|
||||
prefixed with the 'u' or 'U' character: ``u'abcdefghijk'``. Specific
|
||||
code points can be written using the ``\u`` escape sequence, which is
|
||||
followed by four hex digits giving the code point. The ``\U`` escape
|
||||
sequence is similar, but expects 8 hex digits, not 4.
|
||||
|
||||
Unicode literals can also use the same escape sequences as 8-bit
|
||||
strings, including ``\x``, but ``\x`` only takes two hex digits so it
|
||||
can't express an arbitrary code point. Octal escapes can go up to
|
||||
U+01ff, which is octal 777.
|
||||
|
||||
::
|
||||
|
||||
>>> s = u"a\xac\u1234\u20ac\U00008000"
|
||||
           ^^^^ two-digit hex escape
               ^^^^^^ four-digit Unicode escape
                           ^^^^^^^^^^ eight-digit Unicode escape
|
||||
>>> for c in s: print ord(c),
|
||||
...
|
||||
97 172 4660 8364 32768
|
||||
|
||||
Using escape sequences for code points greater than 127 is fine in
|
||||
small doses, but becomes an annoyance if you're using many accented
|
||||
characters, as you would in a program with messages in French or some
|
||||
other accent-using language. You can also assemble strings using the
|
||||
``unichr()`` built-in function, but this is even more tedious.
|
||||
|
||||
Ideally, you'd want to be able to write literals in your language's
|
||||
natural encoding. You could then edit Python source code with your
|
||||
favorite editor which would display the accented characters naturally,
|
||||
and have the right characters used at runtime.
|
||||
|
||||
Python supports writing Unicode literals in any encoding, but you have
|
||||
to declare the encoding being used. This is done by including a
|
||||
special comment as either the first or second line of the source
|
||||
file::
|
||||
|
||||
#!/usr/bin/env python
|
||||
# -*- coding: latin-1 -*-
|
||||
|
||||
u = u'abcdé'
|
||||
print ord(u[-1])
|
||||
|
||||
The syntax is inspired by Emacs's notation for specifying variables local to a file.
|
||||
Emacs supports many different variables, but Python only supports 'coding'.
|
||||
The ``-*-`` symbols indicate that the comment is special; within them,
|
||||
you must supply the name ``coding`` and the name of your chosen encoding,
|
||||
separated by ``':'``.
|
||||
|
||||
If you don't include such a comment, the default encoding used will be
|
||||
ASCII. Versions of Python before 2.4 were Euro-centric and assumed
|
||||
Latin-1 as a default encoding for string literals; in Python 2.4,
|
||||
characters greater than 127 still work but result in a warning. For
|
||||
example, the following program has no encoding declaration::
|
||||
|
||||
#!/usr/bin/env python
|
||||
u = u'abcdé'
|
||||
print ord(u[-1])
|
||||
|
||||
When you run it with Python 2.4, it will output the following warning::
|
||||
|
||||
amk:~$ python p263.py
|
||||
sys:1: DeprecationWarning: Non-ASCII character '\xe9'
|
||||
in file p263.py on line 2, but no encoding declared;
|
||||
see http://www.python.org/peps/pep-0263.html for details
|
||||
|
||||
|
||||
Unicode Properties
|
||||
'''''''''''''''''''
|
||||
|
||||
The Unicode specification includes a database of information about
|
||||
code points. For each code point that's defined, the information
|
||||
includes the character's name, its category, the numeric value if
|
||||
applicable (Unicode has characters representing the Roman numerals and
|
||||
fractions such as one-third and four-fifths). There are also
|
||||
properties related to the code point's use in bidirectional text and
|
||||
other display-related properties.
|
||||
|
||||
The following program displays some information about several
|
||||
characters, and prints the numeric value of one particular character::
|
||||
|
||||
import unicodedata
|
||||
|
||||
u = unichr(233) + unichr(0x0bf2) + unichr(3972) + unichr(6000) + unichr(13231)
|
||||
|
||||
for i, c in enumerate(u):
    print i, '%04x' % ord(c), unicodedata.category(c),
    print unicodedata.name(c)
|
||||
|
||||
# Get numeric value of second character
|
||||
print unicodedata.numeric(u[1])
|
||||
|
||||
When run, this prints::
|
||||
|
||||
0 00e9 Ll LATIN SMALL LETTER E WITH ACUTE
|
||||
1 0bf2 No TAMIL NUMBER ONE THOUSAND
|
||||
2 0f84 Mn TIBETAN MARK HALANTA
|
||||
3 1770 Lo TAGBANWA LETTER SA
|
||||
4 33af So SQUARE RAD OVER S SQUARED
|
||||
1000.0
|
||||
|
||||
The category codes are abbreviations describing the nature of the
|
||||
character. These are grouped into categories such as "Letter",
|
||||
"Number", "Punctuation", or "Symbol", which in turn are broken up into
|
||||
subcategories. To take the codes from the above output, ``'Ll'``
|
||||
means 'Letter, lowercase', ``'No'`` means "Number, other", ``'Mn'`` is
|
||||
"Mark, nonspacing", and ``'So'`` is "Symbol, other". See
|
||||
<http://www.unicode.org/Public/UNIDATA/UCD.html#General_Category_Values>
|
||||
for a list of category codes.
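
Because the first letter of a category code gives the major category, a small lookup table is enough to turn codes into readable names. The sketch below assumes Python 3.3 or later; the ``MAJOR`` table and the ``describe()`` helper are made up for this example and cover only the top-level groups.

```python
import unicodedata

# Hypothetical lookup table keyed on the first letter of the category code.
MAJOR = {'L': 'Letter', 'M': 'Mark', 'N': 'Number', 'P': 'Punctuation',
         'S': 'Symbol', 'Z': 'Separator', 'C': 'Other'}

def describe(ch):
    """Return (category code, major category name) for one character."""
    code = unicodedata.category(ch)
    return code, MAJOR[code[0]]

results = [describe(ch) for ch in u'\xe9\u0bf2\u0f84\u33af']
```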
|
||||
|
||||
References
|
||||
''''''''''''''
|
||||
|
||||
The Unicode and 8-bit string types are described in the Python library
|
||||
reference at <http://docs.python.org/lib/typesseq.html>.
|
||||
|
||||
The documentation for the ``unicodedata`` module is at
|
||||
<http://docs.python.org/lib/module-unicodedata.html>.
|
||||
|
||||
The documentation for the ``codecs`` module is at
|
||||
<http://docs.python.org/lib/module-codecs.html>.
|
||||
|
||||
Marc-André Lemburg gave a presentation at EuroPython 2002
|
||||
titled "Python and Unicode". A PDF version of his slides
|
||||
is available at <http://www.egenix.com/files/python/Unicode-EPC2002-Talk.pdf>,
|
||||
and is an excellent overview of the design of Python's Unicode features.
|
||||
|
||||
|
||||
Reading and Writing Unicode Data
|
||||
----------------------------------------
|
||||
|
||||
Once you've written some code that works with Unicode data, the next
|
||||
problem is input/output. How do you get Unicode strings into your
|
||||
program, and how do you convert Unicode into a form suitable for
|
||||
storage or transmission?
|
||||
|
||||
It's possible that you may not need to do anything depending on your
|
||||
input sources and output destinations; you should check whether the
|
||||
libraries used in your application support Unicode natively. XML
|
||||
parsers often return Unicode data, for example. Many relational
|
||||
databases also support Unicode-valued columns and can return Unicode
|
||||
values from an SQL query.
|
||||
|
||||
Unicode data is usually converted to a particular encoding before it
|
||||
gets written to disk or sent over a socket. It's possible to do all
|
||||
the work yourself: open a file, read an 8-bit string from it, and
|
||||
convert the string with ``unicode(str, encoding)``. However, the
|
||||
manual approach is not recommended.
|
||||
|
||||
One problem is the multi-byte nature of encodings; one Unicode
|
||||
character can be represented by several bytes. If you want to read
|
||||
the file in arbitrary-sized chunks (say, 1K or 4K), you need to write
|
||||
error-handling code to catch the case where only part of the bytes
|
||||
encoding a single Unicode character are read at the end of a chunk.
|
||||
One solution would be to read the entire file into memory and then
|
||||
perform the decoding, but that prevents you from working with files
|
||||
that are extremely large; if you need to read a 2 GB file, you need 2 GB
|
||||
of RAM. (More, really, since for at least a moment you'd need to have
|
||||
both the encoded string and its Unicode version in memory.)
|
||||
|
||||
The solution would be to use the low-level decoding interface to catch
|
||||
the case of partial coding sequences. The work of implementing this
|
||||
has already been done for you: the ``codecs`` module includes a
|
||||
version of the ``open()`` function that returns a file-like object
|
||||
that assumes the file's contents are in a specified encoding and
|
||||
accepts Unicode parameters for methods such as ``.read()`` and
|
||||
``.write()``.
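
To see the chunking problem and the low-level fix in miniature, the sketch below splits an encoded string in the middle of a multi-byte character. It assumes Python 3.3 or later and uses ``codecs.getincrementaldecoder()``, which is one possible low-level interface and is not named in the text above; the split position is chosen by hand.

```python
import codecs

text = u'caf\xe9 \u20ac'
data = text.encode('utf-8')              # 9 bytes in total
chunks = [data[:7], data[7:]]            # split inside the 3-byte euro sign

# Decoding a chunk on its own fails at the truncated character.
try:
    chunks[0].decode('utf-8')
    naive_failed = False
except UnicodeDecodeError:
    naive_failed = True

# An incremental decoder holds back the partial bytes until the rest arrives.
decoder = codecs.getincrementaldecoder('utf-8')()
pieces = [decoder.decode(chunk) for chunk in chunks]
pieces.append(decoder.decode(b'', final=True))   # flush; errors if bytes remain
result = u''.join(pieces)
```

A file-like wrapper saves you from writing this loop yourself, which is why it is the recommended route.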
|
||||
|
||||
The function's parameters are
|
||||
``open(filename, mode='rb', encoding=None, errors='strict', buffering=1)``. ``mode`` can be
|
||||
``'r'``, ``'w'``, or ``'a'``, just like the corresponding parameter to the
|
||||
regular built-in ``open()`` function; add a ``'+'`` to
|
||||
update the file. ``buffering`` is similarly
|
||||
parallel to the standard function's parameter.
|
||||
``encoding`` is a string giving
|
||||
the encoding to use; if it's left as ``None``, a regular Python file
|
||||
object that accepts 8-bit strings is returned. Otherwise, a wrapper
|
||||
object is returned, and data written to or read from the wrapper
|
||||
object will be converted as needed. ``errors`` specifies the action
|
||||
for encoding errors and can be one of the usual values of 'strict',
|
||||
'ignore', and 'replace'.
|
||||
|
||||
Reading Unicode from a file is therefore simple::
|
||||
|
||||
import codecs
|
||||
f = codecs.open('unicode.rst', encoding='utf-8')
|
||||
for line in f:
    print repr(line)
|
||||
|
||||
It's also possible to open files in update mode,
|
||||
allowing both reading and writing::
|
||||
|
||||
f = codecs.open('test', encoding='utf-8', mode='w+')
|
||||
f.write(u'\u4500 blah blah blah\n')
|
||||
f.seek(0)
|
||||
print repr(f.readline()[:1])
|
||||
f.close()
|
||||
|
||||
Unicode character U+FEFF is used as a byte-order mark (BOM),
|
||||
and is often written as the first character of a file in order
|
||||
to assist with autodetection of the file's byte ordering.
|
||||
Some encodings, such as UTF-16, expect a BOM to be present at
|
||||
the start of a file; when such an encoding is used,
|
||||
the BOM will be automatically written as the first character
|
||||
and will be silently dropped when the file is read. There are
|
||||
variants of these encodings, such as 'utf-16-le' and 'utf-16-be'
|
||||
for little-endian and big-endian encodings, that specify
|
||||
one particular byte ordering and don't
|
||||
skip the BOM.
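
The BOM behaviour can be checked without touching a file. This sketch assumes Python 3.3 or later and uses the ``codecs.BOM_UTF16_LE`` and ``codecs.BOM_UTF16_BE`` constants, which are not mentioned in the text above.

```python
import codecs

with_bom = u'a'.encode('utf-16')         # BOM written automatically
without_bom = u'a'.encode('utf-16-le')   # fixed byte order, no BOM

starts_with_bom = with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

raw = codecs.BOM_UTF16_LE + without_bom
dropped = raw.decode('utf-16')           # BOM consumed silently
kept = raw.decode('utf-16-le')           # BOM comes back as U+FEFF
```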
|
||||
|
||||
|
||||
Unicode filenames
|
||||
'''''''''''''''''''''''''
|
||||
|
||||
Most of the operating systems in common use today support filenames
|
||||
that contain arbitrary Unicode characters. Usually this is
|
||||
implemented by converting the Unicode string into some encoding that
|
||||
varies depending on the system. For example, Mac OS X uses UTF-8 while
|
||||
Windows uses a configurable encoding; on Windows, Python uses the name
|
||||
"mbcs" to refer to whatever the currently configured encoding is. On
|
||||
Unix systems, there will only be a filesystem encoding if you've set
|
||||
the ``LANG`` or ``LC_CTYPE`` environment variables; if you haven't,
|
||||
the default encoding is ASCII.
|
||||
|
||||
The ``sys.getfilesystemencoding()`` function returns the encoding to
|
||||
use on your current system, in case you want to do the encoding
|
||||
manually, but there's not much reason to bother. When opening a file
|
||||
for reading or writing, you can usually just provide the Unicode
|
||||
string as the filename, and it will be automatically converted to the
|
||||
right encoding for you::
|
||||
|
||||
filename = u'filename\u4500abc'
|
||||
f = open(filename, 'w')
|
||||
f.write('blah\n')
|
||||
f.close()
|
||||
|
||||
Functions in the ``os`` module such as ``os.stat()`` will also accept
|
||||
Unicode filenames.
|
||||
|
||||
``os.listdir()``, which returns filenames, raises an issue: should it
|
||||
return the Unicode version of filenames, or should it return 8-bit
|
||||
strings containing the encoded versions? ``os.listdir()`` will do
|
||||
both, depending on whether you provided the directory path as an 8-bit
|
||||
string or a Unicode string. If you pass a Unicode string as the path,
|
||||
filenames will be decoded using the filesystem's encoding and a list
|
||||
of Unicode strings will be returned, while passing an 8-bit path will
|
||||
return the 8-bit versions of the filenames. For example, assuming the
|
||||
default filesystem encoding is UTF-8, running the following program::
|
||||
|
||||
fn = u'filename\u4500abc'
|
||||
f = open(fn, 'w')
|
||||
f.close()
|
||||
|
||||
import os
|
||||
print os.listdir('.')
|
||||
print os.listdir(u'.')
|
||||
|
||||
will produce the following output::
|
||||
|
||||
amk:~$ python t.py
|
||||
['.svn', 'filename\xe4\x94\x80abc', ...]
|
||||
[u'.svn', u'filename\u4500abc', ...]
|
||||
|
||||
The first list contains UTF-8-encoded filenames, and the second list
|
||||
contains the Unicode versions.
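The same distinction can be reproduced in a scratch directory. The sketch below is an illustration under stated assumptions: the temporary directory, the ASCII file name ``example.txt`` and the variable names are hypothetical, and an ASCII name is used so that the result does not depend on the filesystem encoding.

```python
import os
import shutil
import sys
import tempfile

# Hypothetical scratch directory holding one file with an ASCII name.
directory = tempfile.mkdtemp()
open(os.path.join(directory, 'example.txt'), 'w').close()

text_path = directory
if isinstance(text_path, bytes):          # Python 2 returns an 8-bit string
    text_path = text_path.decode(sys.getfilesystemencoding())
byte_path = text_path.encode(sys.getfilesystemencoding())

as_text = os.listdir(text_path)    # Unicode path -> Unicode names
as_bytes = os.listdir(byte_path)   # 8-bit path   -> 8-bit names

shutil.rmtree(directory)
```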
|
||||
|
||||
|
||||
|
||||
Tips for Writing Unicode-aware Programs
|
||||
''''''''''''''''''''''''''''''''''''''''''''
|
||||
|
||||
This section provides some suggestions on writing software that
|
||||
deals with Unicode.
|
||||
|
||||
The most important tip is:
|
||||
|
||||
Software should only work with Unicode strings internally,
|
||||
converting to a particular encoding on output.
|
||||
|
||||
If you attempt to write processing functions that accept both
|
||||
Unicode and 8-bit strings, you will find your program vulnerable to
|
||||
bugs wherever you combine the two different kinds of strings. Python's
|
||||
default encoding is ASCII, so whenever a character outside the ASCII range (value >127)
|
||||
is in the input data, you'll get a ``UnicodeDecodeError``
|
||||
because that character can't be handled by the ASCII encoding.
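The failure can be reproduced with an explicit ``decode('ascii')``, which is the conversion that mixing the two string types triggers implicitly. The sketch below uses made-up sample bytes and variable names; it decodes once on input, works on Unicode internally, and encodes again only on output.

```python
data = b'caf\xc3\xa9'            # UTF-8 bytes for u'caf\xe9'

try:
    data.decode('ascii')         # what the implicit conversion attempts
    failed = False
except UnicodeDecodeError:
    failed = True

text = data.decode('utf-8')      # decode once, at the input boundary
output = text.upper().encode('utf-8')   # encode again only on output
```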
|
||||
|
||||
It's easy to miss such problems if you only test your software
|
||||
with data that doesn't contain any
|
||||
accents; everything will seem to work, but there's actually a bug in your
|
||||
program waiting for the first user who attempts to use characters >127.
|
||||
A second tip, therefore, is:
|
||||
|
||||
Include characters >127 and, even better, characters >255 in your
|
||||
test data.
|
||||
|
||||
When using data coming from a web browser or some other untrusted source,
|
||||
a common technique is to check for illegal characters in a string
|
||||
before using the string in a generated command line or storing it in a
|
||||
database. If you're doing this, be careful to check
|
||||
the string once it's in the form that will be used or stored; it's
|
||||
possible for encodings to be used to disguise characters. This is especially
|
||||
true if the input data also specifies the encoding;
|
||||
many encodings leave the commonly checked-for characters alone,
|
||||
but Python includes some encodings such as ``'base64'``
|
||||
that modify every single character.
|
||||
|
||||
For example, let's say you have a content management system that takes a
|
||||
Unicode filename, and you want to disallow paths with a '/' character.
|
||||
You might write this code::
|
||||
|
||||
def read_file(filename, encoding):
|
||||
if '/' in filename:
|
||||
raise ValueError("'/' not allowed in filenames")
|
||||
unicode_name = filename.decode(encoding)
|
||||
f = open(unicode_name, 'r')
|
||||
# ... return contents of file ...
|
||||
|
||||
However, if an attacker could specify the ``'base64'`` encoding,
|
||||
they could pass ``'L2V0Yy9wYXNzd2Q='``, which is the base-64
|
||||
encoded form of the string ``'/etc/passwd'``, to read a
|
||||
system file. The above code looks for ``'/'`` characters
|
||||
in the encoded form and misses the dangerous character
|
||||
in the resulting decoded form.
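One possible way to avoid the problem is to decode first and validate the decoded form. The helper below is a hypothetical sketch, not the HOWTO's own code: ``decode_and_check`` is an invented name, and ``codecs.decode()`` is used (an assumption) so that byte-to-byte codecs such as ``'base64'`` are handled the same way as text encodings.

```python
import codecs

def decode_and_check(raw_name, encoding):
    """Decode raw_name, then reject a '/' in the decoded form."""
    decoded = codecs.decode(raw_name, encoding)
    # Codecs such as 'base64' return bytes; text encodings return Unicode.
    slash = b'/' if isinstance(decoded, bytes) else u'/'
    if slash in decoded:
        raise ValueError("'/' not allowed in filenames")
    return decoded

safe = decode_and_check(b'report.txt', 'utf-8')

try:
    decode_and_check(b'L2V0Yy9wYXNzd2Q=', 'base64')
    rejected = False
except ValueError:
    rejected = True
```

In a real application it would usually be safer still to accept only a short list of known text encodings.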
|
||||
|
||||
References
|
||||
''''''''''''''
|
||||
|
||||
The PDF slides for Marc-André Lemburg's presentation "Writing
|
||||
Unicode-aware Applications in Python" are available at
|
||||
<http://www.egenix.com/files/python/LSM2005-Developing-Unicode-aware-applications-in-Python.pdf>
|
||||
and discuss questions of character encodings as well as how to
|
||||
internationalize and localize an application.
|
||||
|
||||
|
||||
Revision History and Acknowledgements
|
||||
------------------------------------------
|
||||
|
||||
Thanks to the following people who have noted errors or offered
|
||||
suggestions on this article: Nicholas Bastin,
|
||||
Marius Gedminas, Kent Johnson, Ken Krugler,
|
||||
Marc-André Lemburg, Martin von Löwis, Chad Whitacre.
|
||||
|
||||
Version 1.0: posted August 5 2005.
|
||||
|
||||
Version 1.01: posted August 7 2005. Corrects factual and markup
|
||||
errors; adds several links.
|
||||
|
||||
Version 1.02: posted August 16 2005. Corrects factual errors.
|
||||
|
||||
|
||||
.. comment Additional topic: building Python w/ UCS2 or UCS4 support
|
||||
.. comment Describe obscure -U switch somewhere?
|
||||
.. comment Describe use of codecs.StreamRecoder and StreamReaderWriter
|
||||
|
||||
.. comment
|
||||
Original outline:
|
||||
|
||||
- [ ] Unicode introduction
|
||||
- [ ] ASCII
|
||||
- [ ] Terms
|
||||
- [ ] Character
|
||||
- [ ] Code point
|
||||
- [ ] Encodings
|
||||
- [ ] Common encodings: ASCII, Latin-1, UTF-8
|
||||
- [ ] Unicode Python type
|
||||
- [ ] Writing unicode literals
|
||||
- [ ] Obscurity: -U switch
|
||||
- [ ] Built-ins
|
||||
- [ ] unichr()
|
||||
- [ ] ord()
|
||||
- [ ] unicode() constructor
|
||||
- [ ] Unicode type
|
||||
- [ ] encode(), decode() methods
|
||||
- [ ] Unicodedata module for character properties
|
||||
- [ ] I/O
|
||||
- [ ] Reading/writing Unicode data into files
|
||||
- [ ] Byte-order marks
|
||||
- [ ] Unicode filenames
|
||||
- [ ] Writing Unicode programs
|
||||
- [ ] Do everything in Unicode
|
||||
- [ ] Declaring source code encodings (PEP 263)
|
||||
- [ ] Other issues
|
||||
- [ ] Building Python (UCS2, UCS4)
|
|
@ -1,603 +0,0 @@
|
|||
==============================================
|
||||
HOWTO Fetch Internet Resources Using urllib2
|
||||
==============================================
|
||||
----------------------------
|
||||
Fetching URLs With Python
|
||||
----------------------------
|
||||
|
||||
|
||||
.. note::
|
||||
|
||||
There is a French translation of an earlier revision of this
|
||||
HOWTO, available at `urllib2 - Le Manuel manquant
|
||||
<http://www.voidspace.org.uk/python/articles/urllib2_francais.shtml>`_.
|
||||
|
||||
.. contents:: urllib2 Tutorial
|
||||
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
.. sidebar:: Related Articles
|
||||
|
||||
You may also find useful the following article on fetching web
|
||||
resources with Python:
|
||||
|
||||
* `Basic Authentication <http://www.voidspace.org.uk/python/articles/authentication.shtml>`_
|
||||
|
||||
A tutorial on *Basic Authentication*, with examples in Python.
|
||||
|
||||
This HOWTO is written by `Michael Foord
|
||||
<http://www.voidspace.org.uk/python/index.shtml>`_.
|
||||
|
||||
**urllib2** is a `Python <http://www.python.org>`_ module for fetching URLs
|
||||
(Uniform Resource Locators). It offers a very simple interface, in the form of
|
||||
the *urlopen* function. This is capable of fetching URLs using a variety
|
||||
of different protocols. It also offers a slightly more complex
|
||||
interface for handling common situations - like basic authentication,
|
||||
cookies, proxies and so on. These are provided by objects called
|
||||
handlers and openers.
|
||||
|
||||
urllib2 supports fetching URLs for many "URL schemes" (identified by the string
|
||||
before the ":" in the URL - for example "ftp" is the URL scheme of
|
||||
"ftp://python.org/") using their associated network protocols (e.g. FTP, HTTP).
|
||||
This tutorial focuses on the most common case, HTTP.
|
||||
|
||||
For straightforward situations *urlopen* is very easy to use. But as
|
||||
soon as you encounter errors or non-trivial cases when opening HTTP
|
||||
URLs, you will need some understanding of the HyperText Transfer
|
||||
Protocol. The most comprehensive and authoritative reference to HTTP
|
||||
is :RFC:`2616`. This is a technical document and not intended to be
|
||||
easy to read. This HOWTO aims to illustrate using *urllib2*, with
|
||||
enough detail about HTTP to help you through. It is not intended to
|
||||
replace the `urllib2 docs <http://docs.python.org/lib/module-urllib2.html>`_ ,
|
||||
but is supplementary to them.
|
||||
|
||||
|
||||
Fetching URLs
|
||||
=============
|
||||
|
||||
The simplest way to use urllib2 is as follows::
|
||||
|
||||
import urllib2
|
||||
response = urllib2.urlopen('http://python.org/')
|
||||
html = response.read()
|
||||
|
||||
Many uses of urllib2 will be that simple (note that instead of an
|
||||
'http:' URL we could have used a URL starting with 'ftp:', 'file:',
|
||||
etc.). However, it's the purpose of this tutorial to explain the more
|
||||
complicated cases, concentrating on HTTP.
|
||||
|
||||
HTTP is based on requests and responses - the client makes requests
|
||||
and servers send responses. urllib2 mirrors this with a ``Request``
|
||||
object which represents the HTTP request you are making. In its
|
||||
simplest form you create a Request object that specifies the URL you
|
||||
want to fetch. Calling ``urlopen`` with this Request object returns a
|
||||
response object for the URL requested. This response is a file-like
|
||||
object, which means you can, for example, call ``.read()`` on the response:
|
||||
::
|
||||
|
||||
import urllib2
|
||||
|
||||
req = urllib2.Request('http://www.voidspace.org.uk')
|
||||
response = urllib2.urlopen(req)
|
||||
the_page = response.read()
|
||||
|
||||
Note that urllib2 makes use of the same Request interface to handle
|
||||
all URL schemes. For example, you can make an FTP request like so::
|
||||
|
||||
req = urllib2.Request('ftp://example.com/')
|
||||
|
||||
In the case of HTTP, there are two extra things that Request objects
|
||||
allow you to do: First, you can pass data to be sent to the server.
|
||||
Second, you can pass extra information ("metadata") *about* the data
|
||||
or about the request itself, to the server - this information is sent
|
||||
as HTTP "headers". Let's look at each of these in turn.
|
||||
|
||||
Data
|
||||
----
|
||||
|
||||
Sometimes you want to send data to a URL (often the URL will refer to
|
||||
a CGI (Common Gateway Interface) script [#]_ or other web
|
||||
application). With HTTP, this is often done using what's known as a
|
||||
**POST** request. This is often what your browser does when you submit
|
||||
an HTML form that you filled in on the web. Not all POSTs have to come
|
||||
from forms: you can use a POST to transmit arbitrary data to your own
|
||||
application. In the common case of HTML forms, the data needs to be
|
||||
encoded in a standard way, and then passed to the Request object as
|
||||
the ``data`` argument. The encoding is done using a function from the
|
||||
``urllib`` library, *not* from ``urllib2``. ::
|
||||
|
||||
import urllib
|
||||
import urllib2
|
||||
|
||||
url = 'http://www.someserver.com/cgi-bin/register.cgi'
|
||||
values = {'name' : 'Michael Foord',
|
||||
'location' : 'Northampton',
|
||||
'language' : 'Python' }
|
||||
|
||||
data = urllib.urlencode(values)
|
||||
req = urllib2.Request(url, data)
|
||||
response = urllib2.urlopen(req)
|
||||
the_page = response.read()
|
||||
|
||||
Note that other encodings are sometimes required (e.g. for file upload
|
||||
from HTML forms - see
|
||||
`HTML Specification, Form Submission <http://www.w3.org/TR/REC-html40/interact/forms.html#h-17.13>`_
|
||||
for more details).
|
||||
|
||||
If you do not pass the ``data`` argument, urllib2 uses a **GET**
|
||||
request. One way in which GET and POST requests differ is that POST
|
||||
requests often have "side-effects": they change the state of the
|
||||
system in some way (for example by placing an order with the website
|
||||
for a hundredweight of tinned spam to be delivered to your door).
|
||||
Though the HTTP standard makes it clear that POSTs are intended to
|
||||
*always* cause side-effects, and GET requests *never* to cause
|
||||
side-effects, nothing prevents a GET request from having side-effects,
|
||||
nor a POST request from having no side-effects. Data can also be
|
||||
passed in an HTTP GET request by encoding it in the URL itself.
|
||||
|
||||
This is done as follows::
|
||||
|
||||
>>> import urllib2
|
||||
>>> import urllib
|
||||
>>> data = {}
|
||||
>>> data['name'] = 'Somebody Here'
|
||||
>>> data['location'] = 'Northampton'
|
||||
>>> data['language'] = 'Python'
|
||||
>>> url_values = urllib.urlencode(data)
|
||||
>>> print url_values
|
||||
name=Somebody+Here&language=Python&location=Northampton
|
||||
>>> url = 'http://www.example.com/example.cgi'
|
||||
>>> full_url = url + '?' + url_values
|
||||
>>> data = urllib2.urlopen(full_url)
|
||||
|
||||
Notice that the full URL is created by adding a ``?`` to the URL, followed by
|
||||
the encoded values.
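The effect of the ``data`` argument on the request method can be inspected without touching the network. This is a sketch under assumptions: the ``try``/``except`` import falls back to the Python 3 module names, which are not part of this HOWTO, and the data is encoded to bytes so that the same lines run on both versions.

```python
try:
    from urllib2 import Request            # Python 2
    from urllib import urlencode
except ImportError:
    from urllib.request import Request     # Python 3 names (assumption)
    from urllib.parse import urlencode

url = 'http://www.example.com/example.cgi'
encoded = urlencode({'name': 'Somebody Here'})

get_req = Request(url + '?' + encoded)             # data in the URL
post_req = Request(url, encoded.encode('ascii'))   # data in the body

get_method = get_req.get_method()
post_method = post_req.get_method()
full_url = get_req.get_full_url()
```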
|
||||
|
||||
Headers
|
||||
-------
|
||||
|
||||
We'll discuss here one particular HTTP header, to illustrate how to
|
||||
add headers to your HTTP request.
|
||||
|
||||
Some websites [#]_ dislike being browsed by programs, or send
|
||||
different versions to different browsers [#]_ . By default urllib2
|
||||
identifies itself as ``Python-urllib/x.y`` (where ``x`` and ``y`` are
|
||||
the major and minor version numbers of the Python release,
|
||||
e.g. ``Python-urllib/2.5``), which may confuse the site, or just plain
|
||||
not work. The way a browser identifies itself is through the
|
||||
``User-Agent`` header [#]_. When you create a Request object you can
|
||||
pass a dictionary of headers in. The following example makes the same
|
||||
request as above, but identifies itself as a version of Internet
|
||||
Explorer [#]_. ::
|
||||
|
||||
import urllib
|
||||
import urllib2
|
||||
|
||||
url = 'http://www.someserver.com/cgi-bin/register.cgi'
|
||||
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
|
||||
values = {'name' : 'Michael Foord',
|
||||
'location' : 'Northampton',
|
||||
'language' : 'Python' }
|
||||
headers = { 'User-Agent' : user_agent }
|
||||
|
||||
data = urllib.urlencode(values)
|
||||
req = urllib2.Request(url, data, headers)
|
||||
response = urllib2.urlopen(req)
|
||||
the_page = response.read()
|
||||
|
||||
The response also has two useful methods. See the section on `info and
|
||||
geturl`_ which comes after we have a look at what happens when things
|
||||
go wrong.
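Headers can also be examined, or added one at a time, on the Request object before anything is sent. The sketch below is illustrative only: the ``Accept-Language`` header and the variable names are assumptions, and the import falls back to the Python 3 module name, which this HOWTO does not cover.

```python
try:
    from urllib2 import Request            # Python 2
except ImportError:
    from urllib.request import Request     # Python 3 name (assumption)

user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
req = Request('http://www.example.com/',
              headers={'User-Agent': user_agent})

# add_header() is an alternative to the headers dictionary.
req.add_header('Accept-Language', 'en')

# Header names are stored capitalised: 'User-agent', 'Accept-language'.
stored_agent = req.get_header('User-agent')
stored_names = sorted(name for name, value in req.header_items())
```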
|
||||
|
||||
|
||||
Handling Exceptions
|
||||
===================
|
||||
|
||||
*urlopen* raises ``URLError`` when it cannot handle a response (though
|
||||
as usual with Python APIs, builtin exceptions such as ValueError,
|
||||
TypeError etc. may also be raised).
|
||||
|
||||
``HTTPError`` is the subclass of ``URLError`` raised in the specific
|
||||
case of HTTP URLs.
|
||||
|
||||
URLError
|
||||
--------
|
||||
|
||||
Often, URLError is raised because there is no network connection (no
|
||||
route to the specified server), or the specified server doesn't exist.
|
||||
In this case, the exception raised will have a 'reason' attribute,
|
||||
which is a tuple containing an error code and a text error message.
|
||||
|
||||
e.g. ::
|
||||
|
||||
>>> req = urllib2.Request('http://www.pretend_server.org')
|
||||
>>> try: urllib2.urlopen(req)
|
||||
>>> except urllib2.URLError, e:
|
||||
>>> print e.reason
|
||||
>>>
|
||||
(4, 'getaddrinfo failed')
|
||||
|
||||
|
||||
HTTPError
|
||||
---------
|
||||
|
||||
Every HTTP response from the server contains a numeric "status
|
||||
code". Sometimes the status code indicates that the server is unable
|
||||
to fulfil the request. The default handlers will handle some of these
|
||||
responses for you (for example, if the response is a "redirection"
|
||||
that requests the client fetch the document from a different URL,
|
||||
urllib2 will handle that for you). For those it can't handle, urlopen
|
||||
will raise an ``HTTPError``. Typical errors include '404' (page not
|
||||
found), '403' (request forbidden), and '401' (authentication
|
||||
required).
|
||||
|
||||
See section 10 of RFC 2616 for a reference on all the HTTP error
|
||||
codes.
|
||||
|
||||
The ``HTTPError`` instance raised will have an integer 'code'
|
||||
attribute, which corresponds to the error sent by the server.
|
||||
|
||||
Error Codes
|
||||
~~~~~~~~~~~
|
||||
|
||||
Because the default handlers handle redirects (codes in the 300
|
||||
range), and codes in the 100-299 range indicate success, you will
|
||||
usually only see error codes in the 400-599 range.
|
||||
|
||||
``BaseHTTPServer.BaseHTTPRequestHandler.responses`` is a useful
|
||||
dictionary of response codes that shows all the response codes used
|
||||
by RFC 2616. The dictionary is reproduced here for convenience::
|
||||
|
||||
# Table mapping response codes to messages; entries have the
|
||||
# form {code: (shortmessage, longmessage)}.
|
||||
responses = {
|
||||
100: ('Continue', 'Request received, please continue'),
|
||||
101: ('Switching Protocols',
|
||||
'Switching to new protocol; obey Upgrade header'),
|
||||
|
||||
200: ('OK', 'Request fulfilled, document follows'),
|
||||
201: ('Created', 'Document created, URL follows'),
|
||||
202: ('Accepted',
|
||||
'Request accepted, processing continues off-line'),
|
||||
203: ('Non-Authoritative Information', 'Request fulfilled from cache'),
|
||||
204: ('No Content', 'Request fulfilled, nothing follows'),
|
||||
205: ('Reset Content', 'Clear input form for further input.'),
|
||||
206: ('Partial Content', 'Partial content follows.'),
|
||||
|
||||
300: ('Multiple Choices',
|
||||
'Object has several resources -- see URI list'),
|
||||
301: ('Moved Permanently', 'Object moved permanently -- see URI list'),
|
||||
302: ('Found', 'Object moved temporarily -- see URI list'),
|
||||
303: ('See Other', 'Object moved -- see Method and URL list'),
|
||||
304: ('Not Modified',
|
||||
'Document has not changed since given time'),
|
||||
305: ('Use Proxy',
|
||||
'You must use proxy specified in Location to access this '
|
||||
'resource.'),
|
||||
307: ('Temporary Redirect',
|
||||
'Object moved temporarily -- see URI list'),
|
||||
|
||||
400: ('Bad Request',
|
||||
'Bad request syntax or unsupported method'),
|
||||
401: ('Unauthorized',
|
||||
'No permission -- see authorization schemes'),
|
||||
402: ('Payment Required',
|
||||
'No payment -- see charging schemes'),
|
||||
403: ('Forbidden',
|
||||
'Request forbidden -- authorization will not help'),
|
||||
404: ('Not Found', 'Nothing matches the given URI'),
|
||||
405: ('Method Not Allowed',
|
||||
'Specified method is invalid for this server.'),
|
||||
406: ('Not Acceptable', 'URI not available in preferred format.'),
|
||||
407: ('Proxy Authentication Required', 'You must authenticate with '
|
||||
'this proxy before proceeding.'),
|
||||
408: ('Request Timeout', 'Request timed out; try again later.'),
|
||||
409: ('Conflict', 'Request conflict.'),
|
||||
410: ('Gone',
|
||||
'URI no longer exists and has been permanently removed.'),
|
||||
411: ('Length Required', 'Client must specify Content-Length.'),
|
||||
412: ('Precondition Failed', 'Precondition in headers is false.'),
|
||||
413: ('Request Entity Too Large', 'Entity is too large.'),
|
||||
414: ('Request-URI Too Long', 'URI is too long.'),
|
||||
415: ('Unsupported Media Type', 'Entity body in unsupported format.'),
|
||||
416: ('Requested Range Not Satisfiable',
|
||||
'Cannot satisfy request range.'),
|
||||
417: ('Expectation Failed',
|
||||
'Expect condition could not be satisfied.'),
|
||||
|
||||
500: ('Internal Server Error', 'Server got itself in trouble'),
|
||||
501: ('Not Implemented',
|
||||
'Server does not support this operation'),
|
||||
502: ('Bad Gateway', 'Invalid responses from another server/proxy.'),
|
||||
503: ('Service Unavailable',
|
||||
'The server cannot process the request due to a high load'),
|
||||
504: ('Gateway Timeout',
|
||||
'The gateway server did not receive a timely response'),
|
||||
505: ('HTTP Version Not Supported', 'Cannot fulfill request.'),
|
||||
}
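The table can also be consulted at run time instead of being copied. The helper below is a hypothetical sketch: ``describe`` is an invented name, and the import falls back to the Python 3 location of the class, which is an assumption outside this HOWTO.

```python
try:
    from BaseHTTPServer import BaseHTTPRequestHandler   # Python 2
except ImportError:
    from http.server import BaseHTTPRequestHandler      # Python 3 name (assumption)

def describe(code):
    """Return the short message for an HTTP status code, if known."""
    entry = BaseHTTPRequestHandler.responses.get(code)
    if entry is None:
        return 'Unknown status %d' % code
    return entry[0]

not_found = describe(404)
unknown = describe(799)
```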
|
||||
|
||||
When an error is raised the server responds by returning an HTTP error
|
||||
code *and* an error page. You can use the ``HTTPError`` instance as a
|
||||
response on the page returned. This means that as well as the code
|
||||
attribute, it also has ``read``, ``geturl``, and ``info`` methods. ::
|
||||
|
||||
>>> req = urllib2.Request('http://www.python.org/fish.html')
|
||||
>>> try:
|
||||
>>> urllib2.urlopen(req)
|
||||
>>> except urllib2.HTTPError, e:
|
||||
>>> print e.code
|
||||
>>> print e.read()
|
||||
>>>
|
||||
404
|
||||
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
|
||||
"http://www.w3.org/TR/html4/loose.dtd">
|
||||
<?xml-stylesheet href="./css/ht2html.css"
|
||||
type="text/css"?>
|
||||
<html><head><title>Error 404: File Not Found</title>
|
||||
...... etc...
|
||||
|
||||
Wrapping it Up
|
||||
--------------
|
||||
|
||||
So if you want to be prepared for ``HTTPError`` *or* ``URLError``
|
||||
there are two basic approaches. I prefer the second approach.
|
||||
|
||||
Number 1
|
||||
~~~~~~~~
|
||||
|
||||
::
|
||||
|
||||
|
||||
from urllib2 import Request, urlopen, URLError, HTTPError
|
||||
req = Request(someurl)
|
||||
try:
|
||||
response = urlopen(req)
|
||||
except HTTPError, e:
|
||||
print 'The server couldn\'t fulfill the request.'
|
||||
print 'Error code: ', e.code
|
||||
except URLError, e:
|
||||
print 'We failed to reach a server.'
|
||||
print 'Reason: ', e.reason
|
||||
else:
|
||||
pass  # everything is fine
|
||||
|
||||
|
||||
.. note::
|
||||
|
||||
The ``except HTTPError`` *must* come first, otherwise ``except URLError``
|
||||
will *also* catch an ``HTTPError``.
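The reason for this ordering is that ``HTTPError`` is a subclass of ``URLError``. The sketch below shows the same ordering with ``isinstance`` checks on hand-built exceptions, so it runs without a network; the ``classify`` helper, the sample URL and the reason text are hypothetical, and the import falls back to the Python 3 module name as an assumption.

```python
import io

try:
    from urllib2 import URLError, HTTPError          # Python 2
except ImportError:
    from urllib.error import URLError, HTTPError     # Python 3 names (assumption)

def classify(error):
    # Test the subclass first, mirroring the order of the except clauses.
    if isinstance(error, HTTPError):
        return 'Error code: %d' % error.code
    if isinstance(error, URLError):
        return 'Reason: %s' % (error.reason,)
    return 'not a urllib2 error'

# Hand-built exceptions standing in for real failures (hypothetical values).
http_error = HTTPError('http://www.example.com/fish.html', 404,
                       'Not Found', {}, io.BytesIO(b''))
url_error = URLError('no route to host')

http_result = classify(http_error)
url_result = classify(url_error)
```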
|
||||
|
||||
Number 2
|
||||
~~~~~~~~
|
||||
|
||||
::
|
||||
|
||||
from urllib2 import Request, urlopen, URLError
|
||||
req = Request(someurl)
|
||||
try:
|
||||
response = urlopen(req)
|
||||
except URLError, e:
|
||||
if hasattr(e, 'reason'):
|
||||
print 'We failed to reach a server.'
|
||||
print 'Reason: ', e.reason
|
||||
elif hasattr(e, 'code'):
|
||||
print 'The server couldn\'t fulfill the request.'
|
||||
print 'Error code: ', e.code
|
||||
else:
|
||||
pass  # everything is fine
|
||||
|
||||
|
||||
info and geturl
|
||||
===============
|
||||
|
||||
The response returned by urlopen (or the ``HTTPError`` instance) has
|
||||
two useful methods, ``info`` and ``geturl``.
|
||||
|
||||
**geturl** - this returns the real URL of the page fetched. This is
|
||||
useful because ``urlopen`` (or the opener object used) may have
|
||||
followed a redirect. The URL of the page fetched may not be the same
|
||||
as the URL requested.
|
||||
|
||||
**info** - this returns a dictionary-like object that describes the
|
||||
page fetched, particularly the headers sent by the server. It is
|
||||
currently an ``httplib.HTTPMessage`` instance.
|
||||
|
||||
Typical headers include 'Content-length', 'Content-type', and so
|
||||
on. See the
|
||||
`Quick Reference to HTTP Headers <http://www.cs.tut.fi/~jkorpela/http.html>`_
|
||||
for a useful listing of HTTP headers with brief explanations of their meaning
|
||||
and use.
|
||||
|
||||
|
||||
Openers and Handlers
|
||||
====================
|
||||
|
||||
When you fetch a URL you use an opener (an instance of the perhaps
|
||||
confusingly-named ``urllib2.OpenerDirector``). Normally we have been using
|
||||
the default opener - via ``urlopen`` - but you can create custom
|
||||
openers. Openers use handlers. All the "heavy lifting" is done by the
|
||||
handlers. Each handler knows how to open URLs for a particular URL
|
||||
scheme (http, ftp, etc.), or how to handle an aspect of URL opening,
|
||||
for example HTTP redirections or HTTP cookies.
|
||||
|
||||
You will want to create openers if you want to fetch URLs with
|
||||
specific handlers installed, for example to get an opener that handles
|
||||
cookies, or to get an opener that does not handle redirections.
|
||||
|
||||
To create an opener, instantiate an OpenerDirector, and then call
|
||||
.add_handler(some_handler_instance) repeatedly.
|
||||
|
||||
Alternatively, you can use ``build_opener``, which is a convenience
|
||||
function for creating opener objects with a single function call.
|
||||
``build_opener`` adds several handlers by default, but provides a
|
||||
quick way to add more and/or override the default handlers.
|
||||
|
||||
Other sorts of handlers you might want can handle proxies,
|
||||
authentication, and other common but slightly specialised
|
||||
situations.
|
||||
|
||||
``install_opener`` can be used to make an ``opener`` object the
|
||||
(global) default opener. This means that calls to ``urlopen`` will use
|
||||
the opener you have installed.
|
||||
|
||||
Opener objects have an ``open`` method, which can be called directly
|
||||
to fetch urls in the same way as the ``urlopen`` function: there's no
|
||||
need to call ``install_opener``, except as a convenience.
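As a rough illustration, an opener that also handles cookies might be built as follows. This is a sketch under assumptions: it reads the ``handlers`` attribute of the opener, which is an implementation detail rather than documented API, and the import falls back to the Python 3 module name, which this HOWTO does not cover.

```python
try:
    import urllib2                       # Python 2
except ImportError:
    import urllib.request as urllib2     # Python 3 name (assumption)

# build_opener() returns an OpenerDirector with the default handlers
# plus the ones passed in; here a cookie handler is added.
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())

# 'handlers' is an implementation detail, used here only for inspection.
handler_names = [handler.__class__.__name__ for handler in opener.handlers]

# The opener could now be used directly, e.g. opener.open(some_url),
# or made the global default with urllib2.install_opener(opener).
```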
|
||||
|
||||
|
||||
Basic Authentication
|
||||
====================
|
||||
|
||||
To illustrate creating and installing a handler we will use the
|
||||
``HTTPBasicAuthHandler``. For a more detailed discussion of this
|
||||
subject - including an explanation of how Basic Authentication works -
|
||||
see the `Basic Authentication Tutorial <http://www.voidspace.org.uk/python/articles/authentication.shtml>`_.
|
||||
|
||||
When authentication is required, the server sends a header (as well as
|
||||
the 401 error code) requesting authentication. This specifies the
|
||||
authentication scheme and a 'realm'. The header looks like:
|
||||
``Www-authenticate: SCHEME realm="REALM"``.
|
||||
|
||||
e.g. ::
|
||||
|
||||
Www-authenticate: Basic realm="cPanel Users"
|
||||
|
||||
|
||||
The client should then retry the request with the appropriate name and
|
||||
password for the realm included as a header in the request. This is
|
||||
'basic authentication'. In order to simplify this process we can
|
||||
create an instance of ``HTTPBasicAuthHandler`` and an opener to use
|
||||
this handler.
|
||||
|
||||
The ``HTTPBasicAuthHandler`` uses an object called a password manager
|
||||
to handle the mapping of URLs and realms to passwords and
|
||||
usernames. If you know what the realm is (from the authentication
|
||||
header sent by the server), then you can use a
|
||||
``HTTPPasswordMgr``. Frequently one doesn't care what the realm is. In
|
||||
that case, it is convenient to use
|
||||
``HTTPPasswordMgrWithDefaultRealm``. This allows you to specify a
|
||||
default username and password for a URL. This will be supplied in the
|
||||
absence of you providing an alternative combination for a specific
|
||||
realm. We indicate this by providing ``None`` as the realm argument to
|
||||
the ``add_password`` method.
|
||||
|
||||
The top-level URL is the first URL that requires authentication. URLs
|
||||
"deeper" than the URL you pass to .add_password() will also match. ::
|
||||
|
||||
# create a password manager
|
||||
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
|
||||
|
||||
# Add the username and password.
|
||||
# If we knew the realm, we could use it instead of ``None``.
|
||||
top_level_url = "http://example.com/foo/"
|
||||
password_mgr.add_password(None, top_level_url, username, password)
|
||||
|
||||
handler = urllib2.HTTPBasicAuthHandler(password_mgr)
|
||||
|
||||
# create "opener" (OpenerDirector instance)
|
||||
opener = urllib2.build_opener(handler)
|
||||
|
||||
# use the opener to fetch a URL
|
||||
opener.open(a_url)
|
||||
|
||||
# Install the opener.
|
||||
# Now all calls to urllib2.urlopen use our opener.
|
||||
urllib2.install_opener(opener)
|
||||
|
||||
.. note::
|
||||
|
||||
In the above example we only supplied our ``HTTPBasicAuthHandler``
|
||||
to ``build_opener``. By default openers have the handlers for
|
||||
normal situations - ``ProxyHandler``, ``UnknownHandler``,
|
||||
``HTTPHandler``, ``HTTPDefaultErrorHandler``,
|
||||
``HTTPRedirectHandler``, ``FTPHandler``, ``FileHandler``,
|
||||
``HTTPErrorProcessor``.
|
||||
|
||||
top_level_url is in fact *either* a full URL (including the 'http:'
|
||||
scheme component and the hostname and optionally the port number)
|
||||
e.g. "http://example.com/" *or* an "authority" (i.e. the hostname,
|
||||
optionally including the port number) e.g. "example.com" or
|
||||
"example.com:8080" (the latter example includes a port number). The
|
||||
authority, if present, must NOT contain the "userinfo" component - for
|
||||
example "joe:password@example.com" is not correct.
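The matching rules of the password manager can be tried out without contacting a server. In the sketch below the credentials ``joe`` and ``secret``, the realm name and the URLs are placeholders, and the import falls back to the Python 3 module name as an assumption.

```python
try:
    from urllib2 import HTTPPasswordMgrWithDefaultRealm          # Python 2
except ImportError:
    from urllib.request import HTTPPasswordMgrWithDefaultRealm   # Python 3 name (assumption)

password_mgr = HTTPPasswordMgrWithDefaultRealm()
# 'joe' and 'secret' are placeholder credentials.
password_mgr.add_password(None, 'http://example.com/foo/', 'joe', 'secret')

# A URL below the top-level URL matches, whatever realm the server names.
deeper = password_mgr.find_user_password('Some Realm',
                                         'http://example.com/foo/bar.html')
# A URL outside it does not.
outside = password_mgr.find_user_password('Some Realm',
                                          'http://example.com/other/')
```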
|
||||
|
||||
|
||||
Proxies
|
||||
=======
|
||||
|
||||
**urllib2** will auto-detect your proxy settings and use those. This
|
||||
is through the ``ProxyHandler`` which is part of the normal handler
|
||||
chain. Normally that's a good thing, but there are occasions when it
|
||||
may not be helpful [#]_. One way to bypass it is to set up our own
|
||||
``ProxyHandler``, with no proxies defined. This is done using similar
|
||||
steps to setting up a `Basic Authentication`_ handler::
|
||||
|
||||
>>> proxy_support = urllib2.ProxyHandler({})
|
||||
>>> opener = urllib2.build_opener(proxy_support)
|
||||
>>> urllib2.install_opener(opener)
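The result can be inspected before installing the opener globally. This sketch relies on two assumptions: the ``proxies`` attribute of the handler and the ``handlers`` attribute of the opener are implementation details rather than documented API, and the import falls back to the Python 3 module name.

```python
try:
    import urllib2                       # Python 2
except ImportError:
    import urllib.request as urllib2     # Python 3 name (assumption)

# An explicit, empty mapping replaces the auto-detecting default handler.
proxy_support = urllib2.ProxyHandler({})
opener = urllib2.build_opener(proxy_support)

configured = proxy_support.proxies
handler_names = [handler.__class__.__name__ for handler in opener.handlers]
```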
|
||||
|
||||
.. note::
|
||||
|
||||
Currently ``urllib2`` *does not* support fetching of ``https``
|
||||
locations through a proxy. However, this can be enabled by extending
|
||||
urllib2 as shown in the recipe [#]_.
|
||||
|
||||
|
||||
Sockets and Layers
|
||||
==================
|
||||
|
||||
The Python support for fetching resources from the web is
|
||||
layered. urllib2 uses the httplib library, which in turn uses the
|
||||
socket library.
|
||||
|
||||
As of Python 2.3 you can specify how long a socket should wait for a
|
||||
response before timing out. This can be useful in applications which
|
||||
have to fetch web pages. By default the socket module has *no timeout*
|
||||
and can hang. Currently, the socket timeout is not exposed at the
|
||||
httplib or urllib2 levels. However, you can set the default timeout
|
||||
globally for all sockets using::
|
||||
|
||||
import socket
|
||||
import urllib2
|
||||
|
||||
# timeout in seconds
|
||||
timeout = 10
|
||||
socket.setdefaulttimeout(timeout)
|
||||
|
||||
# this call to urllib2.urlopen now uses the default timeout
|
||||
# we have set in the socket module
|
||||
req = urllib2.Request('http://www.voidspace.org.uk')
|
||||
response = urllib2.urlopen(req)
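Because the setting is global, it may be worth saving and restoring the previous value around the code that needs it. The following is a minimal sketch with illustrative variable names; it creates a socket only to show that new sockets pick up the default, and opens no connection.

```python
import socket

previous = socket.getdefaulttimeout()   # None means "no timeout"

socket.setdefaulttimeout(10)            # seconds, for sockets created from now on
current = socket.getdefaulttimeout()

probe = socket.socket()                 # a new socket picks up the default
probe_timeout = probe.gettimeout()
probe.close()

socket.setdefaulttimeout(previous)      # restore the earlier setting
```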
|
||||
|
||||
|
||||
-------
|
||||
|
||||
|
||||
Footnotes
|
||||
=========
|
||||
|
||||
This document was reviewed and revised by John Lee.
|
||||
|
||||
.. [#] For an introduction to the CGI protocol see
|
||||
`Writing Web Applications in Python <http://www.pyzine.com/Issue008/Section_Articles/article_CGIOne.html>`_.
|
||||
.. [#] Like Google for example. The *proper* way to use Google from a program
|
||||
is to use `PyGoogle <http://pygoogle.sourceforge.net>`_ of course. See
|
||||
`Voidspace Google <http://www.voidspace.org.uk/python/recipebook.shtml#google>`_
|
||||
for some examples of using the Google API.
|
||||
.. [#] Browser sniffing is a very bad practice for website design - building
|
||||
sites using web standards is much more sensible. Unfortunately a lot of
|
||||
sites still send different versions to different browsers.
|
||||
.. [#] The user agent for MSIE 6 is
|
||||
*'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)'*
|
||||
.. [#] For details of more HTTP request headers, see
|
||||
`Quick Reference to HTTP Headers`_.
|
||||
.. [#] In my case I have to use a proxy to access the internet at work. If you
|
||||
attempt to fetch *localhost* URLs through this proxy it blocks them. IE
|
||||
is set to use the proxy, which urllib2 picks up on. In order to test
|
||||
scripts with a localhost server, I have to prevent urllib2 from using
|
||||
the proxy.
|
||||
.. [#] urllib2 opener for SSL proxy (CONNECT method): `ASPN Cookbook Recipe
|
||||
<http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/456195>`_.
|
||||
|
|
@ -1,24 +0,0 @@
|
|||
<p> This document was generated using the <a
|
||||
href="http://saftsack.fs.uni-bayreuth.de/;SPMtilde;latex2ht/">
|
||||
<strong>LaTeX</strong>2<tt>HTML</tt></a> translator.
|
||||
</p>
|
||||
|
||||
<p> <a
|
||||
href="http://saftsack.fs.uni-bayreuth.de/;SPMtilde;latex2ht/">
|
||||
<strong>LaTeX</strong>2<tt>HTML</tt></a> is Copyright ©
|
||||
1993, 1994, 1995, 1996, 1997, <a
|
||||
href="http://cbl.leeds.ac.uk/nikos/personal.html">Nikos
|
||||
Drakos</a>, Computer Based Learning Unit, University of
|
||||
Leeds, and Copyright © 1997, 1998, <a
|
||||
href="http://www.maths.mq.edu.au/;SPMtilde;ross/">Ross
|
||||
Moore</a>, Mathematics Department, Macquarie University,
|
||||
Sydney.
|
||||
</p>
|
||||
|
||||
<p> The application of <a
|
||||
href="http://saftsack.fs.uni-bayreuth.de/;SPMtilde;latex2ht/">
|
||||
<strong>LaTeX</strong>2<tt>HTML</tt></a> to the Python
|
||||
documentation has been heavily tailored by Fred L. Drake,
|
||||
Jr. Original navigation icons were contributed by Christopher
|
||||
Petrilli.
|
||||
</p>
|
|
@ -1,84 +0,0 @@
|
|||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
|
||||
<html>
|
||||
<head>
|
||||
<title>About the Python Documentation</title>
|
||||
<meta name="description"
|
||||
content="Overview information about the Python documentation">
|
||||
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
|
||||
<link rel="contents" href="index.html" title="Python Documentation Index">
|
||||
<link rel="index" href="modindex.html" title="Global Module Index">
|
||||
<link rel="start" href="index.html" title="Python Documentation Index">
|
||||
<link rel="up" href="index.html" title="Python Documentation Index">
|
||||
<link rel="SHORTCUT ICON" href="icons/pyfav.png" type="image/png">
|
||||
<link rel="STYLESHEET" href="lib/lib.css">
|
||||
</head>
|
||||
<body>
|
||||
<div class="navigation">
|
||||
<table width="100%" cellpadding="0" cellspacing="2">
|
||||
<tr>
|
||||
<td><img width="32" height="32" align="bottom" border="0" alt=""
|
||||
src="icons/blank.png"></td>
|
||||
<td><a href="index.html"
|
||||
title="Python Documentation Index"><img width="32" height="32"
|
||||
align="bottom" border="0" alt="up"
|
||||
src="icons/up.png"></a></td>
|
||||
<td><img width="32" height="32" align="bottom" border="0" alt=""
|
||||
src="icons/blank.png"></td>
|
||||
<td align="center" width="100%">About the Python Documentation</td>
|
||||
<td><img width="32" height="32" align="bottom" border="0" alt=""
|
||||
src="icons/blank.png"></td>
|
||||
<td><img width="32" height="32" align="bottom" border="0" alt=""
|
||||
src="icons/blank.png"></td>
|
||||
<td><img width="32" height="32" align="bottom" border="0" alt=""
|
||||
src="icons/blank.png"></td>
|
||||
</tr>
|
||||
</table>
|
||||
<b class="navlabel">Up:</b>
|
||||
<span class="sectref">
|
||||
<a href="index.html" title="Python Documentation Index">
|
||||
Python Documentation Index</A></span>
|
||||
<br>
|
||||
</div>
|
||||
<hr>
|
||||
|
||||
<h2>About the Python Documentation</h2>
|
||||
|
||||
<p>The Python documentation was originally written by Guido van
|
||||
Rossum, but has increasingly become a community effort over the
|
||||
past several years. This growing collection of documents is
|
||||
available in several formats, including typeset versions in PDF
|
||||
and PostScript for printing, from the <a
|
||||
href="http://www.python.org/">Python Web site</a>.
|
||||
|
||||
<p>A <a href="acks.html">list of contributors</a> is available.
|
||||
|
||||
<h2>Comments and Questions</h2>
|
||||
|
||||
<p> General comments and questions regarding this document should
|
||||
be sent by email to <a href="mailto:docs@python.org"
|
||||
>docs@python.org</a>. If you find specific errors in
|
||||
this document, please report the bug at the <a
|
||||
href="http://sourceforge.net/bugs/?group_id=5470">Python Bug
|
||||
Tracker</a> at <a href="http://sourceforge.net/">SourceForge</a>.
|
||||
If you are able to provide suggested text, either to replace
|
||||
existing incorrect or unclear material, or additional text to
|
||||
supplement what's already available, we'd appreciate the
|
||||
contribution. There's no need to worry about text markup; our
|
||||
documentation team will gladly take care of that.
|
||||
</p>
|
||||
|
||||
<p> Questions regarding how to use the information in this
|
||||
document should be sent to the Python news group, <a
|
||||
href="news:comp.lang.python">comp.lang.python</a>, or the <a
|
||||
href="http://www.python.org/mailman/listinfo/python-list"
|
||||
>Python mailing list</a> (which is gated to the newsgroup and
|
||||
carries the same content).
|
||||
</p>
|
||||
|
||||
<p> For any of these channels, please be sure not to send HTML email.
|
||||
Thanks.
|
||||
</p>
|
||||
|
||||
<hr>
|
||||
</body>
|
||||
</html>
|
|
@ -1,140 +0,0 @@
|
|||
<html>
|
||||
<head>
|
||||
<title>Python @RELEASE@ Documentation - @DATE@</title>
|
||||
<meta name="aesop" content="links">
|
||||
<meta name="description"
|
||||
content="Top-level index to the standard documentation for
|
||||
Python @RELEASE@.">
|
||||
<link rel="SHORTCUT ICON" href="icons/pyfav.png" type="image/png">
|
||||
<link rel="STYLESHEET" href="lib/lib.css" type="text/css">
|
||||
<link rel="author" href="acks.html" title="Acknowledgements">
|
||||
<link rel="help" href="about.html" title="About the Python Documentation">
|
||||
<link rel="index" href="modindex.html" title="Global Module Index">
|
||||
<style type="text/css">
|
||||
a.title { font-weight: bold; font-size: 110%; }
|
||||
ul { margin-left: 1em; padding: 0pt; border: 0pt; }
|
||||
ul li { margin-top: 0.2em; }
|
||||
td.left-column { padding-right: 1em; }
|
||||
td.right-column { padding-left: 1em; }
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="navigation">
|
||||
<table align="center" width="100%" cellpadding="0" cellspacing="2">
|
||||
<tr>
|
||||
<td><img width="32" height="32" align="bottom" border="0" alt=""
|
||||
src="icons/blank.png"></td>
|
||||
<td><img width="32" height="32" align="bottom" border="0" alt=""
|
||||
src="icons/blank.png"></td>
|
||||
<td><img width="32" height="32" align="bottom" border="0" alt=""
|
||||
src="icons/blank.png"></td>
|
||||
<td align="center" width="100%">
|
||||
<b class="title">Python Documentation</b></td>
|
||||
<td><img width="32" height="32" align="bottom" border="0" alt=""
|
||||
src="icons/blank.png"></td>
|
||||
<td><a href="modindex.html"><img width="32" height="32"
|
||||
align="bottom" border="0" alt="Module Index"
|
||||
src="icons/modules.png"></a></td>
|
||||
<td><img width="32" height="32" align="bottom" border="0" alt=""
|
||||
src="icons/blank.png"></td>
|
||||
</tr>
|
||||
</table>
|
||||
<hr>
|
||||
</div>
|
||||
<div align="center" class="titlepage">
|
||||
<h1>Python Documentation</h1>
|
||||
|
||||
<p>
|
||||
<strong>Release @RELEASE@</strong>
|
||||
<br>
|
||||
<strong>@DATE@</strong>
|
||||
</p>
|
||||
</div>
|
||||
|
||||
<table align="center">
|
||||
<tbody>
|
||||
<tr>
|
||||
<td class="left-column">
|
||||
<ul>
|
||||
<li> <a href="tut/tut.html" class="title">Tutorial</a>
|
||||
<br>(start here)
|
||||
</ul>
|
||||
</td>
|
||||
<td class="right-column">
|
||||
<ul>
|
||||
<li> <a href="whatsnew/@WHATSNEW@.html" class="title"
|
||||
>What's New in Python</a>
|
||||
<br>(changes since the last major release)
|
||||
</ul>
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td valign="baseline" class="left-column">
|
||||
|
||||
<ul>
|
||||
<li> <a href="modindex.html" class="title">Global Module Index</a>
|
||||
<br>(for quick access to all documentation)
|
||||
|
||||
<li> <a href="lib/lib.html" class="title">Library Reference</a>
|
||||
<br>(keep this under your pillow)
|
||||
|
||||
<li> <a href="mac/mac.html" class="title">Macintosh Module
|
||||
Reference</a>
|
||||
<br>(this too, if you use a Macintosh)
|
||||
|
||||
<li> <a href="inst/inst.html" class="title">Installing
|
||||
Python Modules</a>
|
||||
<br>(for administrators)
|
||||
|
||||
<li> <a href="dist/dist.html" class="title">Distributing
|
||||
Python Modules</a>
|
||||
<br>(for developers and packagers)
|
||||
</ul>
|
||||
</td>
|
||||
<td valign="baseline" class="right-column">
|
||||
|
||||
<ul>
|
||||
<li> <a href="ref/ref.html" class="title">Language Reference</a>
|
||||
<br>(for language lawyers)
|
||||
|
||||
<li> <a href="ext/ext.html" class="title">Extending and
|
||||
Embedding</a>
|
||||
<br>(tutorial for C/C++ programmers)
|
||||
|
||||
<li> <a href="api/api.html" class="title">Python/C API</a>
|
||||
<br>(reference for C/C++ programmers)
|
||||
|
||||
<li> <a href="doc/doc.html" class="title">Documenting Python</a>
|
||||
<br>(information for documentation authors)
|
||||
</ul>
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td valign="baseline" class="left-column">
|
||||
|
||||
<ul>
|
||||
<li> <a href="http://www.python.org/doc/" class="title"
|
||||
>Documentation Central</a>
|
||||
<br>(for everyone)
|
||||
</ul>
|
||||
</td>
|
||||
<td valign="baseline" class="right-column">
|
||||
|
||||
<ul>
|
||||
<li> <a href="http://www.python.org/doc/howto/" class="title"
|
||||
>Python How-To Guides</a>
|
||||
<br>(special topics)
|
||||
</ul>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<p>
|
||||
|
||||
<address>
|
||||
<hr>
|
||||
See <i><a href="about.html">About the Python Documentation</a></i>
|
||||
for information on suggesting changes.
|
||||
</address>
|
||||
</body>
|
||||
</html>
|
|
@ -1,54 +0,0 @@
|
|||
<p> This document was generated using the <a
|
||||
href="http://saftsack.fs.uni-bayreuth.de/;SPMtilde;latex2ht/">
|
||||
<strong>LaTeX</strong>2<tt>HTML</tt></a> translator.
|
||||
</p>
|
||||
|
||||
<p> <a
|
||||
href="http://saftsack.fs.uni-bayreuth.de/;SPMtilde;latex2ht/">
|
||||
<strong>LaTeX</strong>2<tt>HTML</tt></a> is Copyright ©
|
||||
1993, 1994, 1995, 1996, 1997, <a
|
||||
href="http://cbl.leeds.ac.uk/nikos/personal.html">Nikos
|
||||
Drakos</a>, Computer Based Learning Unit, University of
|
||||
Leeds, and Copyright © 1997, 1998, <a
|
||||
href="http://www.maths.mq.edu.au/;SPMtilde;ross/">Ross
|
||||
Moore</a>, Mathematics Department, Macquarie University,
|
||||
Sydney.
|
||||
</p>
|
||||
|
||||
<p> The application of <a
|
||||
href="http://saftsack.fs.uni-bayreuth.de/;SPMtilde;latex2ht/">
|
||||
<strong>LaTeX</strong>2<tt>HTML</tt></a> to the Python
|
||||
documentation has been heavily tailored by Fred L. Drake,
|
||||
Jr. Original navigation icons were contributed by Christopher
|
||||
Petrilli.
|
||||
</p>
|
||||
|
||||
<hr>
|
||||
|
||||
<h2>Comments and Questions</h2>
|
||||
|
||||
<p> General comments and questions regarding this document should
|
||||
be sent by email to <a href="mailto:docs@python.org"
|
||||
>docs@python.org</a>. If you find specific errors in
|
||||
this document, either in the content or the presentation, please
|
||||
report the bug at the <a
|
||||
href="http://sourceforge.net/bugs/?group_id=5470">Python Bug
|
||||
Tracker</a> at <a href="http://sourceforge.net/">SourceForge</a>.
|
||||
If you are able to provide suggested text, either to replace
|
||||
existing incorrect or unclear material, or additional text to
|
||||
supplement what's already available, we'd appreciate the
|
||||
contribution. There's no need to worry about text markup; our
|
||||
documentation team will gladly take care of that.
|
||||
</p>
|
||||
|
||||
<p> Questions regarding how to use the information in this
|
||||
document should be sent to the Python news group, <a
|
||||
href="news:comp.lang.python">comp.lang.python</a>, or the <a
|
||||
href="http://www.python.org/mailman/listinfo/python-list"
|
||||
>Python mailing list</a> (which is gated to the newsgroup and
|
||||
carries the same content).
|
||||
</p>
|
||||
|
||||
<p> For any of these channels, please be sure not to send HTML email.
|
||||
Thanks.
|
||||
</p>
|
|
@ -1,243 +0,0 @@
|
|||
/*
|
||||
* The first part of this is the standard CSS generated by LaTeX2HTML,
|
||||
* with the "empty" declarations removed.
|
||||
*/
|
||||
|
||||
/* Century Schoolbook font is very similar to Computer Modern Math: cmmi */
|
||||
.math { font-family: "Century Schoolbook", serif; }
|
||||
.math i { font-family: "Century Schoolbook", serif;
|
||||
font-weight: bold }
|
||||
.boldmath { font-family: "Century Schoolbook", serif;
|
||||
font-weight: bold }
|
||||
|
||||
/*
|
||||
* Implement both fixed-size and relative sizes.
|
||||
*
|
||||
* I think these can be safely removed, as it doesn't appear that
|
||||
* LaTeX2HTML ever generates these, even though these are carried
|
||||
* over from the LaTeX2HTML stylesheet.
|
||||
*/
|
||||
small.xtiny { font-size : xx-small; }
|
||||
small.tiny { font-size : x-small; }
|
||||
small.scriptsize { font-size : smaller; }
|
||||
small.footnotesize { font-size : small; }
|
||||
big.xlarge { font-size : large; }
|
||||
big.xxlarge { font-size : x-large; }
|
||||
big.huge { font-size : larger; }
|
||||
big.xhuge { font-size : xx-large; }
|
||||
|
||||
/*
|
||||
* Document-specific styles come next;
|
||||
* these are added for the Python documentation.
|
||||
*
|
||||
* Note that the size specifications for the H* elements are because
|
||||
* Netscape on Solaris otherwise doesn't get it right; they all end up
|
||||
* the normal text size.
|
||||
*/
|
||||
|
||||
body { color: #000000;
|
||||
background-color: #ffffff; }
|
||||
|
||||
a:link:active { color: #ff0000; }
|
||||
a:link:hover { background-color: #bbeeff; }
|
||||
a:visited:hover { background-color: #bbeeff; }
|
||||
a:visited { color: #551a8b; }
|
||||
a:link { color: #0000bb; }
|
||||
|
||||
h1, h2, h3, h4, h5, h6 { font-family: avantgarde, sans-serif;
|
||||
font-weight: bold; }
|
||||
h1 { font-size: 180%; }
|
||||
h2 { font-size: 150%; }
|
||||
h3, h4 { font-size: 120%; }
|
||||
|
||||
/* These are section titles used in navigation links, so make sure we
|
||||
* match the section header font here, even if not the weight.
|
||||
*/
|
||||
.sectref { font-family: avantgarde, sans-serif; }
|
||||
/* And the label before the titles in navigation: */
|
||||
.navlabel { font-size: 85%; }
|
||||
|
||||
|
||||
/* LaTeX2HTML insists on inserting <br> elements into headers which
|
||||
* are marked with \label. This little bit of CSS magic ensures that
|
||||
* these elements don't cause spurious whitespace to be added.
|
||||
*/
|
||||
h1>br, h2>br, h3>br,
|
||||
h4>br, h5>br, h6>br { display: none; }
|
||||
|
||||
code, tt { font-family: "lucida typewriter", lucidatypewriter,
|
||||
monospace; }
|
||||
var { font-family: times, serif;
|
||||
font-style: italic;
|
||||
font-weight: normal; }
|
||||
|
||||
.Unix { font-variant: small-caps; }
|
||||
|
||||
.typelabel { font-family: lucida, sans-serif; }
|
||||
|
||||
.navigation td { background-color: #99ccff;
|
||||
font-weight: bold;
|
||||
font-family: avantgarde, sans-serif;
|
||||
font-size: 110%; }
|
||||
|
||||
div.warning { background-color: #fffaf0;
|
||||
border: thin solid black;
|
||||
padding: 1em;
|
||||
margin-left: 2em;
|
||||
margin-right: 2em; }
|
||||
|
||||
div.warning .label { font-family: sans-serif;
|
||||
font-size: 110%;
|
||||
margin-right: 0.5em; }
|
||||
|
||||
div.note { background-color: #fffaf0;
|
||||
border: thin solid black;
|
||||
padding: 1em;
|
||||
margin-left: 2em;
|
||||
margin-right: 2em; }
|
||||
|
||||
div.note .label { margin-right: 0.5em;
|
||||
font-family: sans-serif; }
|
||||
|
||||
address { font-size: 80%; }
|
||||
.release-info { font-style: italic;
|
||||
font-size: 80%; }
|
||||
|
||||
.titlegraphic { vertical-align: top; }
|
||||
|
||||
.verbatim pre { color: #00008b;
|
||||
font-family: "lucida typewriter", lucidatypewriter,
|
||||
monospace;
|
||||
font-size: 90%; }
|
||||
.verbatim { margin-left: 2em; }
|
||||
.verbatim .footer { padding: 0.05in;
|
||||
font-size: 85%;
|
||||
background-color: #99ccff;
|
||||
margin-right: 0.5in; }
|
||||
|
||||
.grammar { background-color: #99ccff;
|
||||
margin-right: 0.5in;
|
||||
padding: 0.05in; }
|
||||
.grammar-footer { padding: 0.05in;
|
||||
font-size: 85%; }
|
||||
.grammartoken { font-family: "lucida typewriter", lucidatypewriter,
|
||||
monospace; }
|
||||
|
||||
.productions { background-color: #bbeeff; }
|
||||
.productions a:active { color: #ff0000; }
|
||||
.productions a:link:hover { background-color: #99ccff; }
|
||||
.productions a:visited:hover { background-color: #99ccff; }
|
||||
.productions a:visited { color: #551a8b; }
|
||||
.productions a:link { color: #0000bb; }
|
||||
.productions table { vertical-align: baseline;
|
||||
empty-cells: show; }
|
||||
.productions > table td,
|
||||
.productions > table th { padding: 2px; }
|
||||
.productions > table td:first-child,
|
||||
.productions > table td:last-child {
|
||||
font-family: "lucida typewriter",
|
||||
lucidatypewriter,
|
||||
monospace;
|
||||
}
|
||||
/* same as the second selector above, but expressed differently for Opera */
|
||||
.productions > table td:first-child + td + td {
|
||||
font-family: "lucida typewriter",
|
||||
lucidatypewriter,
|
||||
monospace;
|
||||
vertical-align: baseline;
|
||||
}
|
||||
.productions > table td:first-child + td {
|
||||
padding-left: 1em;
|
||||
padding-right: 1em;
|
||||
}
|
||||
.productions > table tr { vertical-align: baseline; }
|
||||
|
||||
.email { font-family: avantgarde, sans-serif; }
|
||||
.mailheader { font-family: avantgarde, sans-serif; }
|
||||
.mimetype { font-family: avantgarde, sans-serif; }
|
||||
.newsgroup { font-family: avantgarde, sans-serif; }
|
||||
.url { font-family: avantgarde, sans-serif; }
|
||||
.file { font-family: avantgarde, sans-serif; }
|
||||
.guilabel { font-family: avantgarde, sans-serif; }
|
||||
|
||||
.realtable { border-collapse: collapse;
|
||||
border-color: black;
|
||||
border-style: solid;
|
||||
border-width: 0px 0px 2px 0px;
|
||||
empty-cells: show;
|
||||
margin-left: auto;
|
||||
margin-right: auto;
|
||||
padding-left: 0.4em;
|
||||
padding-right: 0.4em;
|
||||
}
|
||||
.realtable tbody { vertical-align: baseline; }
|
||||
.realtable tfoot { display: table-footer-group; }
|
||||
.realtable thead { background-color: #99ccff;
|
||||
border-width: 0px 0px 2px 1px;
|
||||
display: table-header-group;
|
||||
font-family: avantgarde, sans-serif;
|
||||
font-weight: bold;
|
||||
vertical-align: baseline;
|
||||
}
|
||||
.realtable thead :first-child {
|
||||
border-width: 0px 0px 2px 0px;
|
||||
}
|
||||
.realtable thead th { border-width: 0px 0px 2px 1px }
|
||||
.realtable td,
|
||||
.realtable th { border-color: black;
|
||||
border-style: solid;
|
||||
border-width: 0px 0px 1px 1px;
|
||||
padding-left: 0.4em;
|
||||
padding-right: 0.4em;
|
||||
}
|
||||
.realtable td:first-child,
|
||||
.realtable th:first-child {
|
||||
border-left-width: 0px;
|
||||
vertical-align: baseline;
|
||||
}
|
||||
.center { text-align: center; }
|
||||
.left { text-align: left; }
|
||||
.right { text-align: right; }
|
||||
|
||||
.refcount-info { font-style: italic; }
|
||||
.refcount-info .value { font-weight: bold;
|
||||
color: #006600; }
|
||||
|
||||
/*
|
||||
* Some decoration for the "See also:" blocks, in part inspired by some of
|
||||
* the styling on Lars Marius Garshol's XSA pages.
|
||||
* (The blue in the navigation bars is #99CCFF.)
|
||||
*/
|
||||
.seealso { background-color: #fffaf0;
|
||||
border: thin solid black;
|
||||
padding: 0pt 1em 4pt 1em; }
|
||||
|
||||
.seealso > .heading { font-size: 110%;
|
||||
font-weight: bold; }
|
||||
|
||||
/*
|
||||
* Class 'availability' is used for module availability statements at
|
||||
* the top of modules.
|
||||
*/
|
||||
.availability .platform { font-weight: bold; }
|
||||
|
||||
|
||||
/*
|
||||
* Additional styles for the distutils package.
|
||||
*/
|
||||
.du-command { font-family: monospace; }
|
||||
.du-option { font-family: avantgarde, sans-serif; }
|
||||
.du-filevar { font-family: avantgarde, sans-serif;
|
||||
font-style: italic; }
|
||||
.du-xxx:before { content: "** ";
|
||||
font-weight: bold; }
|
||||
.du-xxx:after { content: " **";
|
||||
font-weight: bold; }
|
||||
|
||||
|
||||
/*
|
||||
* Some specialization for printed output.
|
||||
*/
|
||||
@media print {
|
||||
.online-navigation { display: none; }
|
||||
}
|
|
@ -1,82 +0,0 @@
|
|||
# Generate the Python "info" documentation.
|
||||
|
||||
TOPDIR=..
|
||||
TOOLSDIR=$(TOPDIR)/tools
|
||||
HTMLDIR=$(TOPDIR)/html
|
||||
|
||||
# The emacs binary used to build the info docs. GNU Emacs 21 or 22 is required.
|
||||
EMACS=emacs
|
||||
|
||||
MKINFO=$(TOOLSDIR)/mkinfo
|
||||
SCRIPTS=$(TOOLSDIR)/checkargs.pm $(TOOLSDIR)/mkinfo $(TOOLSDIR)/py2texi.el
|
||||
|
||||
# set VERSION to code the VERSION number into the info file name
|
||||
# allowing installation of more than one set of python info docs
|
||||
# into the same directory
|
||||
VERSION=
|
||||
|
||||
all: check-emacs-version \
|
||||
api dist ext mac ref tut whatsnew \
|
||||
lib
|
||||
# doc inst
|
||||
|
||||
api: python$(VERSION)-api.info
|
||||
dist: python$(VERSION)-dist.info
|
||||
doc: python$(VERSION)-doc.info
|
||||
ext: python$(VERSION)-ext.info
|
||||
inst: python$(VERSION)-inst.info
|
||||
lib: python$(VERSION)-lib.info
|
||||
mac: python$(VERSION)-mac.info
|
||||
ref: python$(VERSION)-ref.info
|
||||
tut: python$(VERSION)-tut.info
|
||||
whatsnew: $(WHATSNEW)
|
||||
$(WHATSNEW): python$(VERSION)-$(WHATSNEW).info
|
||||
|
||||
check-emacs-version:
|
||||
@v="`$(EMACS) --version 2>&1 | egrep '^(GNU |X)Emacs [12]*'`"; \
|
||||
if `echo "$$v" | grep '^GNU Emacs 2[12]' >/dev/null 2>&1`; then \
|
||||
echo "Using $(EMACS) to build the info docs"; \
|
||||
else \
|
||||
echo "GNU Emacs 21 or 22 is required to build the info docs"; \
|
||||
echo "Found $$v"; \
|
||||
false; \
|
||||
fi
|
||||
|
||||
python$(VERSION)-api.info: ../api/api.tex $(SCRIPTS)
|
||||
EMACS=$(EMACS) $(MKINFO) $< $*.texi $@
|
||||
|
||||
python$(VERSION)-ext.info: ../ext/ext.tex $(SCRIPTS)
|
||||
EMACS=$(EMACS) $(MKINFO) $< $*.texi $@
|
||||
|
||||
python$(VERSION)-lib.info: ../lib/lib.tex $(SCRIPTS)
|
||||
EMACS=$(EMACS) $(MKINFO) $< $*.texi $@
|
||||
|
||||
python$(VERSION)-mac.info: ../mac/mac.tex $(SCRIPTS)
|
||||
EMACS=$(EMACS) $(MKINFO) $< $*.texi $@
|
||||
|
||||
python$(VERSION)-ref.info: ../ref/ref.tex $(SCRIPTS)
|
||||
EMACS=$(EMACS) $(MKINFO) $< $*.texi $@
|
||||
|
||||
python$(VERSION)-tut.info: ../tut/tut.tex $(SCRIPTS)
|
||||
EMACS=$(EMACS) $(MKINFO) $< $*.texi $@
|
||||
|
||||
# Not built by default; the conversion doesn't handle \p and \op
|
||||
python$(VERSION)-doc.info: ../doc/doc.tex $(SCRIPTS)
|
||||
EMACS=$(EMACS) $(MKINFO) $< $*.texi $@
|
||||
|
||||
python$(VERSION)-dist.info: ../dist/dist.tex $(SCRIPTS)
|
||||
EMACS=$(EMACS) $(MKINFO) $< $*.texi $@
|
||||
|
||||
# Not built by default; the conversion chokes on \installscheme
|
||||
python$(VERSION)-inst.info: ../inst/inst.tex $(SCRIPTS)
|
||||
EMACS=$(EMACS) $(MKINFO) $< $*.texi $@
|
||||
|
||||
# "whatsnew20" doesn't currently work
|
||||
python$(VERSION)-$(WHATSNEW).info: ../whatsnew/$(WHATSNEW).tex $(SCRIPTS)
|
||||
EMACS=$(EMACS) $(MKINFO) $< $*.texi $@
|
||||
|
||||
clean:
|
||||
rm -f *.texi~ *.texi
|
||||
|
||||
clobber: clean
|
||||
rm -f *.texi python*-*.info python*-*.info-[0-9]*
|
|
@ -1,21 +0,0 @@
|
|||
This archive contains the standard Python documentation in GNU info
|
||||
format. Six manuals are included:
|
||||
|
||||
python-ref.info* Python Reference Manual
|
||||
python-mac.info* Python Macintosh Modules
|
||||
python-lib.info* Python Library Reference
|
||||
python-ext.info* Extending and Embedding the Python Interpreter
|
||||
python-api.info* Python/C API Reference
|
||||
python-tut.info* Python Tutorial
|
||||
|
||||
The file python.dir is a fragment of a "dir" file that can be used to
|
||||
incorporate these documents into an existing GNU info installation:
|
||||
insert the contents of this file into the "dir" or "localdir" file at
|
||||
an appropriate point and copy the python-*.info* files to the same
|
||||
directory.
|
||||
|
||||
Thanks go to Milan Zamazal <pdm@zamazal.org> for providing this
|
||||
conversion to the info format.
|
||||
|
||||
Questions and comments on these documents should be directed to
|
||||
docs@python.org.
|
|
@ -1,11 +0,0 @@
|
|||
|
||||
Python Standard Documentation
|
||||
|
||||
* What's New: (python-whatsnew25). What's New in Python 2.5?
|
||||
* Python Library: (python-lib). Python Library Reference
|
||||
* Python Mac Modules: (python-mac). Python Macintosh Modules
|
||||
* Python Reference: (python-ref). Python Reference Manual
|
||||
* Python API: (python-api). Python/C API Reference Manual
|
||||
* Python Extending: (python-ext). Extending & Embedding Python
|
||||
* Python Tutorial: (python-tut). Python Tutorial
|
||||
* Distributing Modules: (python-dist). Distributing Python Modules
|
1112
Doc/inst/inst.tex
|
@ -1,8 +0,0 @@
|
|||
\chapter{Data Compression and Archiving}
|
||||
\label{archiving}
|
||||
|
||||
The modules described in this chapter support data compression
|
||||
with the zlib, gzip, and bzip2 algorithms, and
|
||||
the creation of ZIP- and tar-format archives.
|
||||
|
||||
\localmoduletable
|
|
@ -1,283 +0,0 @@
|
|||
\begin{longtableiii}{lll}{class}{Node type}{Attribute}{Value}
|
||||
|
||||
\lineiii{Add}{\member{left}}{left operand}
|
||||
\lineiii{}{\member{right}}{right operand}
|
||||
\hline
|
||||
|
||||
\lineiii{And}{\member{nodes}}{list of operands}
|
||||
\hline
|
||||
|
||||
\lineiii{AssAttr}{}{\emph{attribute as target of assignment}}
|
||||
\lineiii{}{\member{expr}}{expression on the left-hand side of the dot}
|
||||
\lineiii{}{\member{attrname}}{the attribute name, a string}
|
||||
\lineiii{}{\member{flags}}{XXX}
|
||||
\hline
|
||||
|
||||
\lineiii{AssList}{\member{nodes}}{list of list elements being assigned to}
|
||||
\hline
|
||||
|
||||
\lineiii{AssName}{\member{name}}{name being assigned to}
|
||||
\lineiii{}{\member{flags}}{XXX}
|
||||
\hline
|
||||
|
||||
\lineiii{AssTuple}{\member{nodes}}{list of tuple elements being assigned to}
|
||||
\hline
|
||||
|
||||
\lineiii{Assert}{\member{test}}{the expression to be tested}
|
||||
\lineiii{}{\member{fail}}{the value of the \exception{AssertionError}}
|
||||
\hline
|
||||
|
||||
\lineiii{Assign}{\member{nodes}}{a list of assignment targets, one per equal sign}
|
||||
\lineiii{}{\member{expr}}{the value being assigned}
|
||||
\hline
|
||||
|
||||
\lineiii{AugAssign}{\member{node}}{}
|
||||
\lineiii{}{\member{op}}{}
|
||||
\lineiii{}{\member{expr}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Backquote}{\member{expr}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Bitand}{\member{nodes}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Bitor}{\member{nodes}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Bitxor}{\member{nodes}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Break}{}{}
|
||||
\hline
|
||||
|
||||
\lineiii{CallFunc}{\member{node}}{expression for the callee}
|
||||
\lineiii{}{\member{args}}{a list of arguments}
|
||||
\lineiii{}{\member{star_args}}{the extended *-arg value}
|
||||
\lineiii{}{\member{dstar_args}}{the extended **-arg value}
|
||||
\hline
|
||||
|
||||
\lineiii{Class}{\member{name}}{the name of the class, a string}
|
||||
\lineiii{}{\member{bases}}{a list of base classes}
|
||||
\lineiii{}{\member{doc}}{doc string, a string or \code{None}}
|
||||
\lineiii{}{\member{code}}{the body of the class statement}
|
||||
\hline
|
||||
|
||||
\lineiii{Compare}{\member{expr}}{}
|
||||
\lineiii{}{\member{ops}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Const}{\member{value}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Continue}{}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Decorators}{\member{nodes}}{List of function decorator expressions}
|
||||
\hline
|
||||
|
||||
\lineiii{Dict}{\member{items}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Discard}{\member{expr}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Div}{\member{left}}{}
|
||||
\lineiii{}{\member{right}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Ellipsis}{}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Expression}{\member{node}}{}
|
||||
|
||||
\lineiii{Exec}{\member{expr}}{}
|
||||
\lineiii{}{\member{locals}}{}
|
||||
\lineiii{}{\member{globals}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{FloorDiv}{\member{left}}{}
|
||||
\lineiii{}{\member{right}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{For}{\member{assign}}{}
|
||||
\lineiii{}{\member{list}}{}
|
||||
\lineiii{}{\member{body}}{}
|
||||
\lineiii{}{\member{else_}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{From}{\member{modname}}{}
|
||||
\lineiii{}{\member{names}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Function}{\member{decorators}}{\class{Decorators} or \code{None}}
|
||||
\lineiii{}{\member{name}}{name used in def, a string}
|
||||
\lineiii{}{\member{argnames}}{list of argument names, as strings}
|
||||
\lineiii{}{\member{defaults}}{list of default values}
|
||||
\lineiii{}{\member{flags}}{XXX}
|
||||
\lineiii{}{\member{doc}}{doc string, a string or \code{None}}
|
||||
\lineiii{}{\member{code}}{the body of the function}
|
||||
\hline
|
||||
|
||||
\lineiii{GenExpr}{\member{code}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{GenExprFor}{\member{assign}}{}
|
||||
\lineiii{}{\member{iter}}{}
|
||||
\lineiii{}{\member{ifs}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{GenExprIf}{\member{test}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{GenExprInner}{\member{expr}}{}
|
||||
\lineiii{}{\member{quals}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Getattr}{\member{expr}}{}
|
||||
\lineiii{}{\member{attrname}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Global}{\member{names}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{If}{\member{tests}}{}
|
||||
\lineiii{}{\member{else_}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Import}{\member{names}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Invert}{\member{expr}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Keyword}{\member{name}}{}
|
||||
\lineiii{}{\member{expr}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Lambda}{\member{argnames}}{}
|
||||
\lineiii{}{\member{defaults}}{}
|
||||
\lineiii{}{\member{flags}}{}
|
||||
\lineiii{}{\member{code}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{LeftShift}{\member{left}}{}
|
||||
\lineiii{}{\member{right}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{List}{\member{nodes}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{ListComp}{\member{expr}}{}
|
||||
\lineiii{}{\member{quals}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{ListCompFor}{\member{assign}}{}
|
||||
\lineiii{}{\member{list}}{}
|
||||
\lineiii{}{\member{ifs}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{ListCompIf}{\member{test}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Mod}{\member{left}}{}
|
||||
\lineiii{}{\member{right}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Module}{\member{doc}}{doc string, a string or \code{None}}
|
||||
\lineiii{}{\member{node}}{body of the module, a \class{Stmt}}
|
||||
\hline
|
||||
|
||||
\lineiii{Mul}{\member{left}}{}
|
||||
\lineiii{}{\member{right}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Name}{\member{name}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Not}{\member{expr}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Or}{\member{nodes}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Pass}{}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Power}{\member{left}}{}
|
||||
\lineiii{}{\member{right}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Print}{\member{nodes}}{}
|
||||
\lineiii{}{\member{dest}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Printnl}{\member{nodes}}{}
|
||||
\lineiii{}{\member{dest}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Raise}{\member{expr1}}{}
|
||||
\lineiii{}{\member{expr2}}{}
|
||||
\lineiii{}{\member{expr3}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Return}{\member{value}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{RightShift}{\member{left}}{}
|
||||
\lineiii{}{\member{right}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Slice}{\member{expr}}{}
|
||||
\lineiii{}{\member{flags}}{}
|
||||
\lineiii{}{\member{lower}}{}
|
||||
\lineiii{}{\member{upper}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Sliceobj}{\member{nodes}}{list of statements}
|
||||
\hline
|
||||
|
||||
\lineiii{Stmt}{\member{nodes}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Sub}{\member{left}}{}
|
||||
\lineiii{}{\member{right}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Subscript}{\member{expr}}{}
|
||||
\lineiii{}{\member{flags}}{}
|
||||
\lineiii{}{\member{subs}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{TryExcept}{\member{body}}{}
|
||||
\lineiii{}{\member{handlers}}{}
|
||||
\lineiii{}{\member{else_}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{TryFinally}{\member{body}}{}
|
||||
\lineiii{}{\member{final}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Tuple}{\member{nodes}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{UnaryAdd}{\member{expr}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{UnarySub}{\member{expr}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{While}{\member{test}}{}
|
||||
\lineiii{}{\member{body}}{}
|
||||
\lineiii{}{\member{else_}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{With}{\member{expr}}{}
|
||||
\lineiii{}{\member{vars}}{}
|
||||
\lineiii{}{\member{body}}{}
|
||||
\hline
|
||||
|
||||
\lineiii{Yield}{\member{value}}{}
|
||||
\hline
|
||||
|
||||
\end{longtableiii}
|
|
@ -1,60 +0,0 @@
|
|||
from optparse import Option, OptionParser, _match_abbrev
|
||||
|
||||
# This case-insensitive option parser relies on having a
|
||||
# case-insensitive dictionary type available. Here's one
|
||||
# for Python 2.2. Note that a *real* case-insensitive
|
||||
# dictionary type would also have to implement __new__(),
|
||||
# update(), and setdefault() -- but that's not the point
|
||||
# of this exercise.
|
||||
|
||||
class caseless_dict (dict):
    def __setitem__ (self, key, value):
        dict.__setitem__(self, key.lower(), value)

    def __getitem__ (self, key):
        return dict.__getitem__(self, key.lower())

    def get (self, key, default=None):
        return dict.get(self, key.lower(), default)

    def has_key (self, key):
        return dict.has_key(self, key.lower())


class CaselessOptionParser (OptionParser):

    def _create_option_list (self):
        self.option_list = []
        self._short_opt = caseless_dict()
        self._long_opt = caseless_dict()
        self._long_opts = []
        self.defaults = {}

    def _match_long_opt (self, opt):
        return _match_abbrev(opt.lower(), self._long_opt.keys())


if __name__ == "__main__":
    from optparse import OptionConflictError

    # test 1: no options to start with
    parser = CaselessOptionParser()
    try:
        parser.add_option("-H", dest="blah")
    except OptionConflictError:
        print "ok: got OptionConflictError for -H"
    else:
        print "not ok: no conflict between -h and -H"

    parser.add_option("-f", "--file", dest="file")
    #print repr(parser.get_option("-f"))
    #print repr(parser.get_option("-F"))
    #print repr(parser.get_option("--file"))
    #print repr(parser.get_option("--fIlE"))
    (options, args) = parser.parse_args(["--FiLe", "foo"])
    assert options.file == "foo", options.file
    print "ok: case insensitive long options work"

    (options, args) = parser.parse_args(["-F", "bar"])
    assert options.file == "bar", options.file
    print "ok: case insensitive short options work"
|
|
@ -1,353 +0,0 @@
|
|||
\chapter{Python compiler package \label{compiler}}
|
||||
|
||||
\sectionauthor{Jeremy Hylton}{jeremy@zope.com}
|
||||
|
||||
|
||||
The Python compiler package is a tool for analyzing Python source code
|
||||
and generating Python bytecode. The compiler contains libraries to
|
||||
generate an abstract syntax tree from Python source code and to
|
||||
generate Python bytecode from the tree.
|
||||
|
||||
The \refmodule{compiler} package is a Python source to bytecode
|
||||
translator written in Python. It uses the built-in parser and
|
||||
standard \refmodule{parser} module to generate a concrete syntax
|
||||
tree. This tree is used to generate an abstract syntax tree (AST) and
|
||||
then Python bytecode.
|
||||
|
||||
The full functionality of the package duplicates the builtin compiler
|
||||
provided with the Python interpreter. It is intended to match its
|
||||
behavior almost exactly. Why implement another compiler that does the
|
||||
same thing? The package is useful for a variety of purposes. It can
|
||||
be modified more easily than the builtin compiler. The AST it
|
||||
generates is useful for analyzing Python source code.
|
||||
|
||||
This chapter explains how the various components of the
|
||||
\refmodule{compiler} package work. It blends reference material with
|
||||
a tutorial.
|
||||
|
||||
The following modules are part of the \refmodule{compiler} package:
|
||||
|
||||
\localmoduletable
|
||||
|
||||
|
||||
\section{The basic interface}
|
||||
|
||||
\declaremodule{}{compiler}
|
||||
|
||||
The top level of the package defines five functions. If you import
|
||||
\module{compiler}, you will get these functions and a collection of
|
||||
modules contained in the package.
|
||||
|
||||
\begin{funcdesc}{parse}{buf}
|
||||
Returns an abstract syntax tree for the Python source code in \var{buf}.
|
||||
The function raises \exception{SyntaxError} if there is an error in the
|
||||
source code. The return value is a \class{compiler.ast.Module} instance
|
||||
that contains the tree.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{parseFile}{path}
|
||||
Return an abstract syntax tree for the Python source code in the file
|
||||
specified by \var{path}. It is equivalent to
|
||||
\code{parse(open(\var{path}).read())}.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{walk}{ast, visitor\optional{, verbose}}
|
||||
Do a pre-order walk over the abstract syntax tree \var{ast}. Call the
|
||||
appropriate method on the \var{visitor} instance for each node
|
||||
encountered.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{compile}{source, filename, mode, flags=None,
|
||||
dont_inherit=None}
|
||||
Compile the string \var{source}, a Python module, statement or
|
||||
expression, into a code object that can be executed by the exec
|
||||
statement or \function{eval()}. This function is a replacement for the
|
||||
built-in \function{compile()} function.
|
||||
|
||||
The \var{filename} will be used for run-time error messages.
|
||||
|
||||
The \var{mode} must be 'exec' to compile a module, 'single' to compile a
|
||||
single (interactive) statement, or 'eval' to compile an expression.
|
||||
|
||||
The \var{flags} and \var{dont_inherit} arguments affect future-related
|
||||
statements, but are not supported yet.
|
||||
\end{funcdesc}
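The three \var{mode} values can be illustrated with the built-in
\function{compile()}, which the function above is described as replacing.
This is only a minimal sketch: it assumes both functions accept the same
positional arguments, and the source strings and the \code{'<example>'}
file name are made up for illustration.

```python
# Sketch using the built-in compile(); the source strings and the
# '<example>' filename are illustrative assumptions.
namespace = {}

# 'exec' compiles a whole module, i.e. a sequence of statements.
module_code = compile("x = 6\ny = x * 7\n", "<example>", "exec")
exec(module_code, namespace)

# 'eval' compiles a single expression; eval() returns its value.
expr_code = compile("x + y", "<example>", "eval")
result = eval(expr_code, namespace)
```

The \code{'single'} mode is intended for one interactive statement and is
left out of the sketch.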
|
||||
|
||||
\begin{funcdesc}{compileFile}{source}
|
||||
Compiles the file \var{source} and generates a .pyc file.
|
||||
\end{funcdesc}
|
||||
|
||||
The \module{compiler} package contains the following modules:
|
||||
\refmodule[compiler.ast]{ast}, \module{consts}, \module{future},
|
||||
\module{misc}, \module{pyassem}, \module{pycodegen}, \module{symbols},
|
||||
\module{transformer}, and \refmodule[compiler.visitor]{visitor}.
|
||||
|
||||
\section{Limitations}
|
||||
|
||||
There are some problems with the error checking of the compiler
|
||||
package. The interpreter detects syntax errors in two distinct
|
||||
phases. One set of errors is detected by the interpreter's parser,
|
||||
the other set by the compiler. The compiler package relies on the
|
||||
interpreter's parser, so it gets the first phase of error checking for
|
||||
free. It implements the second phase itself, and that implementation is
|
||||
incomplete. For example, the compiler package does not raise an error
|
||||
if a name appears more than once in an argument list:
|
||||
\code{def f(x, x): ...}
|
||||
|
||||
A future version of the compiler should fix these problems.
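For comparison, the check that the section describes as missing can be
observed with the built-in \function{compile()}. A small sketch, in which
the \code{'<example>'} file name is an arbitrary placeholder:

```python
# Sketch: the built-in compiler rejects a duplicate argument name.
# The '<example>' filename is an illustrative assumption.
source = "def f(x, x):\n    pass\n"

try:
    compile(source, "<example>", "exec")
except SyntaxError:
    rejected = True
else:
    rejected = False
```

According to the text above, the same source passed to the
\module{compiler} package would not raise an error.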
|
||||
|
||||
\section{Python Abstract Syntax}
|
||||
|
||||
The \module{compiler.ast} module defines an abstract syntax for
|
||||
Python. In the abstract syntax tree, each node represents a syntactic
|
||||
construct. The root of the tree is a \class{Module} object.
|
||||
|
||||
The abstract syntax offers a higher level interface to parsed Python
|
||||
source code. The \refmodule{parser}
|
||||
module and the compiler written in C for the Python interpreter use a
|
||||
concrete syntax tree. The concrete syntax is tied closely to the
|
||||
grammar description used for the Python parser. Instead of a single
|
||||
node for a construct, there are often several levels of nested nodes
|
||||
that are introduced by Python's precedence rules.
|
||||
|
||||
The abstract syntax tree is created by the
|
||||
\module{compiler.transformer} module. The transformer relies on the
|
||||
builtin Python parser to generate a concrete syntax tree. It
|
||||
generates an abstract syntax tree from the concrete tree.
|
||||
|
||||
The \module{transformer} module was created by Greg
|
||||
Stein\index{Stein, Greg} and Bill Tutt\index{Tutt, Bill} for an
|
||||
experimental Python-to-C compiler. The current version contains a
|
||||
number of modifications and improvements, but the basic form of the
|
||||
abstract syntax and of the transformer are due to Stein and Tutt.
|
||||
|
||||
\subsection{AST Nodes}
|
||||
|
||||
\declaremodule{}{compiler.ast}
|
||||
|
||||
The \module{compiler.ast} module is generated from a text file that
|
||||
describes each node type and its elements. Each node type is
|
||||
represented as a class that inherits from the abstract base class
|
||||
\class{compiler.ast.Node} and defines a set of named attributes for
|
||||
child nodes.
|
||||
|
||||
\begin{classdesc}{Node}{}
|
||||
|
||||
The \class{Node} instances are created automatically by the parser
|
||||
generator. The recommended interface for specific \class{Node}
|
||||
instances is to use the public attributes to access child nodes. A
|
||||
public attribute may be bound to a single node or to a sequence of
|
||||
nodes, depending on the \class{Node} type. For example, the
|
||||
\member{bases} attribute of the \class{Class} node is bound to a
|
||||
list of base class nodes, and the \member{doc} attribute is bound to
|
||||
a single node.
|
||||
|
||||
Each \class{Node} instance has a \member{lineno} attribute which may
|
||||
be \code{None}. XXX Not sure what the rules are for which nodes
|
||||
will have a useful lineno.
|
||||
\end{classdesc}
|
||||
|
||||
All \class{Node} objects offer the following methods:
|
||||
|
||||
\begin{methoddesc}{getChildren}{}
|
||||
Returns a flattened list of the child nodes and objects in the
|
||||
order they occur. Specifically, the order of the nodes is the
|
||||
order in which they appear in the Python grammar. Not all of the
|
||||
children are \class{Node} instances. The names of functions and
|
||||
classes, for example, are plain strings.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{getChildNodes}{}
|
||||
Returns a flattened list of the child nodes in the order they
|
||||
occur. This method is like \method{getChildren()}, except that it
|
||||
only returns those children that are \class{Node} instances.
|
||||
\end{methoddesc}
|
||||
|
||||
Two examples illustrate the general structure of \class{Node}
|
||||
classes. The \keyword{while} statement is defined by the following
|
||||
grammar production:
|
||||
|
||||
\begin{verbatim}
while_stmt:     "while" expression ":" suite
               ["else" ":" suite]
\end{verbatim}
|
||||
|
||||
The \class{While} node has three attributes: \member{test},
|
||||
\member{body}, and \member{else_}. (If the natural name for an
|
||||
attribute is also a Python reserved word, it can't be used as an
|
||||
attribute name. An underscore is appended to the word to make it a
|
||||
legal identifier, hence \member{else_} instead of \keyword{else}.)
|
||||
|
||||
The \keyword{if} statement is more complicated because it can include
|
||||
several tests.
|
||||
|
||||
\begin{verbatim}
|
||||
if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
|
||||
\end{verbatim}
|
||||
|
||||
The \class{If} node only defines two attributes: \member{tests} and
|
||||
\member{else_}. The \member{tests} attribute is a sequence of test
|
||||
expression, consequent body pairs. There is one pair for each
|
||||
\keyword{if}/\keyword{elif} clause. The first element of the pair is
|
||||
the test expression. The second element is a \class{Stmt} node that
|
||||
contains the code to execute if the test is true.
|
||||
|
||||
The \method{getChildren()} method of \class{If} returns a flat list of
|
||||
child nodes. If there are three \keyword{if}/\keyword{elif} clauses
|
||||
and no \keyword{else} clause, then \method{getChildren()} will return
|
||||
a list of six elements: the first test expression, the first
|
||||
\class{Stmt}, the second test expression, etc.
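The flattening described above can be sketched with a stand-in class.
\class{IfSketch} is hypothetical and is not the real
\class{compiler.ast.If}; plain strings stand in for child nodes, and
skipping a missing \member{else_} is an assumption made so that the result
matches the six-element description.

```python
# Hypothetical stand-in for the If node described above; it is not the
# real compiler.ast.If class, and strings stand in for child nodes.
class IfSketch:
    def __init__(self, tests, else_=None):
        self.tests = tests    # list of (test expression, body) pairs
        self.else_ = else_    # body of the else clause, or None

    def getChildren(self):
        children = []
        # Flatten each (test, body) pair in source order.
        for test, body in self.tests:
            children.append(test)
            children.append(body)
        # Assumption: a missing else clause contributes nothing.
        if self.else_ is not None:
            children.append(self.else_)
        return children

node = IfSketch([("test1", "stmt1"), ("test2", "stmt2"), ("test3", "stmt3")])
flat = node.getChildren()
```

With three clauses and no \keyword{else}, \code{flat} holds six entries that
alternate between test expressions and bodies.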
|
||||
|
||||
The following table lists each of the \class{Node} subclasses defined
|
||||
in \module{compiler.ast} and each of the public attributes available
|
||||
on their instances. The values of most of the attributes are
|
||||
themselves \class{Node} instances or sequences of instances. When the
|
||||
value is something other than an instance, the type is noted in the
|
||||
comment. The attributes are listed in the order in which they are
|
||||
returned by \method{getChildren()} and \method{getChildNodes()}.
|
||||
|
||||
\input{asttable}
|
||||
|
||||
|
||||
\subsection{Assignment nodes}
|
||||
|
||||
There is a collection of nodes used to represent assignments. Each
|
||||
assignment statement in the source code becomes a single
|
||||
\class{Assign} node in the AST. The \member{nodes} attribute is a
|
||||
list that contains a node for each assignment target. This is
|
||||
necessary because assignment can be chained, e.g. \code{a = b = 2}.
|
||||
Each \class{Node} in the list will be one of the following classes:
|
||||
\class{AssAttr}, \class{AssList}, \class{AssName}, or
|
||||
\class{AssTuple}.
|
||||
|
||||
Each target assignment node will describe the kind of object being
|
||||
assigned to: \class{AssName} for a simple name, e.g. \code{a = 1}.
|
||||
\class{AssAttr} for an attribute assignment, e.g. \code{a.x = 1}.
|
||||
\class{AssList} and \class{AssTuple} for list and tuple expansion
|
||||
respectively, e.g. \code{a, b, c = a_tuple}.
|
||||
|
||||
The target assignment nodes also have a \member{flags} attribute that
|
||||
indicates whether the node is being used for assignment or in a delete
|
||||
statement. The \class{AssName} is also used to represent a delete
|
||||
statement, e.g. \code{del x}.
|
||||
|
||||
When an expression contains several attribute references, an
|
||||
assignment or delete statement will contain only one \class{AssAttr}
|
||||
node -- for the final attribute reference. The other attribute
|
||||
references will be represented as \class{Getattr} nodes in the
|
||||
\member{expr} attribute of the \class{AssAttr} instance.
|
||||
|
||||
\subsection{Examples}
|
||||
|
||||
This section shows several simple examples of ASTs for Python source
|
||||
code. The examples demonstrate how to use the \function{parse()}
|
||||
function, what the repr of an AST looks like, and how to access
|
||||
attributes of an AST node.
|
||||
|
||||
The first module defines a single function. Assume it is stored in
|
||||
\file{/tmp/doublelib.py}.
|
||||
|
||||
\begin{verbatim}
"""This is an example module.

This is the docstring.
"""

def double(x):
    "Return twice the argument"
    return x * 2
\end{verbatim}
|
||||
|
||||
In the interactive interpreter session below, I have reformatted the
|
||||
long AST reprs for readability. The AST reprs use unqualified class
|
||||
names. If you want to create an instance from a repr, you must import
|
||||
the class names from the \module{compiler.ast} module.
|
||||
|
||||
\begin{verbatim}
>>> import compiler
>>> mod = compiler.parseFile("/tmp/doublelib.py")
>>> mod
Module('This is an example module.\n\nThis is the docstring.\n',
       Stmt([Function(None, 'double', ['x'], [], 0,
                      'Return twice the argument',
                      Stmt([Return(Mul((Name('x'), Const(2))))]))]))
>>> from compiler.ast import *
>>> Module('This is an example module.\n\nThis is the docstring.\n',
...    Stmt([Function(None, 'double', ['x'], [], 0,
...                   'Return twice the argument',
...       Stmt([Return(Mul((Name('x'), Const(2))))]))]))
Module('This is an example module.\n\nThis is the docstring.\n',
       Stmt([Function(None, 'double', ['x'], [], 0,
                      'Return twice the argument',
                      Stmt([Return(Mul((Name('x'), Const(2))))]))]))
>>> mod.doc
'This is an example module.\n\nThis is the docstring.\n'
>>> for node in mod.node.nodes:
...     print node
...
Function(None, 'double', ['x'], [], 0, 'Return twice the argument',
         Stmt([Return(Mul((Name('x'), Const(2))))]))
>>> func = mod.node.nodes[0]
>>> func.code
Stmt([Return(Mul((Name('x'), Const(2))))])
\end{verbatim}
|
||||
|
||||
\section{Using Visitors to Walk ASTs}
|
||||
|
||||
\declaremodule{}{compiler.visitor}
|
||||
|
||||
The visitor pattern is ... The \refmodule{compiler} package uses a
|
||||
variant on the visitor pattern that takes advantage of Python's
|
||||
introspection features to eliminate the need for much of the visitor's
|
||||
infrastructure.
|
||||
|
||||
The classes being visited do not need to be programmed to accept
|
||||
visitors. The visitor need only define visit methods for classes it
|
||||
is specifically interested in; a default visit method can handle the
|
||||
rest.
|
||||
|
||||
XXX The magic \method{visit()} method for visitors.
|
||||
|
||||
\begin{funcdesc}{walk}{tree, visitor\optional{, verbose}}
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{classdesc}{ASTVisitor}{}
|
||||
|
||||
The \class{ASTVisitor} is responsible for walking over the tree in the
|
||||
correct order. A walk begins with a call to \method{preorder()}. For
|
||||
each node, it checks the \var{visitor} argument to \method{preorder()}
|
||||
for a method named `visitNodeType,' where NodeType is the name of the
|
||||
node's class, e.g. for a \class{While} node a \method{visitWhile()}
|
||||
would be called. If the method exists, it is called with the node as
|
||||
its first argument.
|
||||
|
||||
The visitor method for a particular node type can control how child
|
||||
nodes are visited during the walk. The \class{ASTVisitor} modifies
|
||||
the visitor argument by adding a visit method to the visitor; this
|
||||
method can be used to visit a particular child node. If no visitor is
|
||||
found for a particular node type, the \method{default()} method is
|
||||
called.
|
||||
\end{classdesc}
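The name-based dispatch described above can be sketched without the
package itself. In this illustration \class{While}, \class{Pass},
\class{LoopCounter} and \function{walk_sketch()} are all hypothetical
stand-ins; the real \class{ASTVisitor} may differ in detail.

```python
# Minimal sketch of name-based dispatch; While, Pass, LoopCounter and
# walk_sketch are hypothetical stand-ins, not the compiler package's API.
class While:
    def __init__(self, body):
        self.body = body

    def getChildNodes(self):
        return self.body


class Pass:
    def getChildNodes(self):
        return []


class LoopCounter:
    """Visitor that only defines a method for While nodes."""

    def __init__(self):
        self.loops = 0

    def visitWhile(self, node):
        self.loops += 1
        # The visit method decides whether to descend into children.
        for child in node.getChildNodes():
            walk_sketch(child, self)


def walk_sketch(node, visitor):
    # Look up 'visit' + class name on the visitor via introspection.
    method = getattr(visitor, "visit" + node.__class__.__name__, None)
    if method is not None:
        method(node)
    else:
        # Default behaviour: simply visit the children.
        for child in node.getChildNodes():
            walk_sketch(child, visitor)


tree = While([Pass(), While([Pass()])])
counter = LoopCounter()
walk_sketch(tree, counter)
```

Because the lookup uses the node's class name, the node classes need no
\method{accept()}-style method of their own, which is the point made in the
text about not programming them to accept visitors.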
|
||||
|
||||
\class{ASTVisitor} objects have the following methods:
|
||||
|
||||
XXX describe extra arguments
|
||||
|
||||
\begin{methoddesc}{default}{node\optional{, \moreargs}}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{dispatch}{node\optional{, \moreargs}}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{preorder}{tree, visitor}
|
||||
\end{methoddesc}
|
||||
|
||||
|
||||
\section{Bytecode Generation}
|
||||
|
||||
The code generator is a visitor that emits bytecodes. Each visit method
|
||||
can call the \method{emit()} method to emit a new bytecode. The basic
|
||||
code generator is specialized for modules, classes, and functions. An
|
||||
assembler converts the emitted instructions to the low-level bytecode
|
||||
format. It handles things like generation of constant lists of code
|
||||
objects and calculation of jump offsets.
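To get a feel for what the emitted instructions look like, the standard
\module{dis} module can list the bytecode of a code object. The sketch
below uses the built-in \function{compile()} rather than this package's
code generator, and assumes an interpreter that provides
\function{dis.get_instructions()}; the exact opcode names vary between
Python versions.

```python
import dis

# Sketch: inspect the bytecode of a small expression compiled with the
# built-in compiler ('<example>' is an arbitrary placeholder filename).
code = compile("x * 2", "<example>", "eval")

# Each instruction has an opcode name and an optional argument.
opnames = [instr.opname for instr in dis.get_instructions(code)]

value = eval(code, {"x": 21})
```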
|
|
@ -1,13 +0,0 @@
|
|||
\chapter{Custom Python Interpreters}
|
||||
\label{custominterp}
|
||||
|
||||
The modules described in this chapter allow writing interfaces similar
|
||||
to Python's interactive interpreter. If you want a Python interpreter
|
||||
that supports some special feature in addition to the Python language,
|
||||
you should look at the \module{code} module. (The \module{codeop}
|
||||
module is lower-level, used to support compiling a possibly-incomplete
|
||||
chunk of Python code.)
|
||||
|
||||
The full list of modules described in this chapter is:
|
||||
|
||||
\localmoduletable
|
|
@ -1,10 +0,0 @@
|
|||
\chapter{Data Types}
|
||||
\label{datatypes}
|
||||
|
||||
The modules described in this chapter provide a variety of specialized
|
||||
data types such as dates and times, fixed-type arrays, heap queues,
|
||||
synchronized queues, and sets.
|
||||
|
||||
The following modules are documented in this chapter:
|
||||
|
||||
\localmoduletable
|
|
@ -1,13 +0,0 @@
|
|||
\chapter{Development Tools}
|
||||
\label{development}
|
||||
|
||||
The modules described in this chapter help you write software. For
|
||||
example, the \module{pydoc} module takes a module and generates
|
||||
documentation based on the module's contents. The \module{doctest}
|
||||
and \module{unittest} modules contain frameworks for writing unit tests
|
||||
that automatically exercise code and verify that the expected output
|
||||
is produced.
|
||||
|
||||
The list of modules described in this chapter is:
|
||||
|
||||
\localmoduletable
|
|
@ -1,38 +0,0 @@
|
|||
\section{\module{distutils} ---
|
||||
Building and installing Python modules}
|
||||
|
||||
\declaremodule{standard}{distutils}
|
||||
\modulesynopsis{Support for building and installing Python modules
|
||||
into an existing Python installation.}
|
||||
\sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org}
|
||||
|
||||
|
||||
The \module{distutils} package provides support for building and
|
||||
installing additional modules into a Python installation. The new
|
||||
modules may be either 100\%{}-pure Python, or may be extension modules
|
||||
written in C, or may be collections of Python packages which include
|
||||
modules coded in both Python and C.
|
||||
|
||||
This package is discussed in two separate documents which are included
|
||||
in the Python documentation package. To learn about distributing new
|
||||
modules using the \module{distutils} facilities, read
|
||||
\citetitle[../dist/dist.html]{Distributing Python Modules}; this
|
||||
includes documentation needed to extend distutils. To learn
|
||||
about installing Python modules, whether or not the author made use of
|
||||
the \module{distutils} package, read
|
||||
\citetitle[../inst/inst.html]{Installing Python Modules}.
|
||||
|
||||
|
||||
\begin{seealso}
|
||||
\seetitle[../dist/dist.html]{Distributing Python Modules}{The manual
|
||||
for developers and packagers of Python modules. This
|
||||
describes how to prepare \module{distutils}-based packages
|
||||
so that they may be easily installed into an existing
|
||||
Python installation.}
|
||||
|
||||
\seetitle[../inst/inst.html]{Installing Python Modules}{An
|
||||
``administrators'' manual which includes information on
|
||||
installing modules into an existing Python installation.
|
||||
You do not need to be a Python programmer to read this
|
||||
manual.}
|
||||
\end{seealso}
|
|
@ -1,115 +0,0 @@
|
|||
#!/usr/bin/env python
|
||||
|
||||
"""Send the contents of a directory as a MIME message."""
|
||||
|
||||
import os
|
||||
import sys
|
||||
import smtplib
|
||||
# For guessing MIME type based on file name extension
|
||||
import mimetypes
|
||||
|
||||
from optparse import OptionParser
|
||||
|
||||
from email import encoders
|
||||
from email.message import Message
|
||||
from email.mime.audio import MIMEAudio
|
||||
from email.mime.base import MIMEBase
|
||||
from email.mime.image import MIMEImage
|
||||
from email.mime.multipart import MIMEMultipart
|
||||
from email.mime.text import MIMEText
|
||||
|
||||
COMMASPACE = ', '
|
||||
|
||||
|
||||
def main():
    parser = OptionParser(usage="""\
Send the contents of a directory as a MIME message.

Usage: %prog [options]

Unless the -o option is given, the email is sent by forwarding to your local
SMTP server, which then does the normal delivery process. Your local machine
must be running an SMTP server.
""")
    parser.add_option('-d', '--directory',
                      type='string', action='store',
                      help="""Mail the contents of the specified directory,
                      otherwise use the current directory. Only the regular
                      files in the directory are sent, and we don't recurse to
                      subdirectories.""")
    parser.add_option('-o', '--output',
                      type='string', action='store', metavar='FILE',
                      help="""Print the composed message to FILE instead of
                      sending the message to the SMTP server.""")
    parser.add_option('-s', '--sender',
                      type='string', action='store', metavar='SENDER',
                      help='The value of the From: header (required)')
    parser.add_option('-r', '--recipient',
                      type='string', action='append', metavar='RECIPIENT',
                      default=[], dest='recipients',
                      help='A To: header value (at least one required)')
    opts, args = parser.parse_args()
    if not opts.sender or not opts.recipients:
        parser.print_help()
        sys.exit(1)
    directory = opts.directory
    if not directory:
        directory = '.'
    # Create the enclosing (outer) message
    outer = MIMEMultipart()
    outer['Subject'] = 'Contents of directory %s' % os.path.abspath(directory)
    outer['To'] = COMMASPACE.join(opts.recipients)
    outer['From'] = opts.sender
    outer.preamble = 'You will not see this in a MIME-aware mail reader.\n'

    for filename in os.listdir(directory):
        path = os.path.join(directory, filename)
        if not os.path.isfile(path):
            continue
        # Guess the content type based on the file's extension. Encoding
        # will be ignored, although we should check for simple things like
        # gzip'd or compressed files.
        ctype, encoding = mimetypes.guess_type(path)
        if ctype is None or encoding is not None:
            # No guess could be made, or the file is encoded (compressed), so
            # use a generic bag-of-bits type.
            ctype = 'application/octet-stream'
        maintype, subtype = ctype.split('/', 1)
        if maintype == 'text':
            fp = open(path)
            # Note: we should handle calculating the charset
            msg = MIMEText(fp.read(), _subtype=subtype)
            fp.close()
        elif maintype == 'image':
            fp = open(path, 'rb')
            msg = MIMEImage(fp.read(), _subtype=subtype)
            fp.close()
        elif maintype == 'audio':
            fp = open(path, 'rb')
            msg = MIMEAudio(fp.read(), _subtype=subtype)
            fp.close()
        else:
            fp = open(path, 'rb')
            msg = MIMEBase(maintype, subtype)
            msg.set_payload(fp.read())
            fp.close()
            # Encode the payload using Base64
            encoders.encode_base64(msg)
        # Set the filename parameter
        msg.add_header('Content-Disposition', 'attachment', filename=filename)
        outer.attach(msg)
    # Now send or store the message
    composed = outer.as_string()
    if opts.output:
        fp = open(opts.output, 'w')
        fp.write(composed)
        fp.close()
    else:
        s = smtplib.SMTP()
        s.connect()
        s.sendmail(opts.sender, opts.recipients, composed)
        s.close()


if __name__ == '__main__':
    main()
|
|
@ -1,32 +0,0 @@
|
|||
# Import smtplib for the actual sending function
|
||||
import smtplib
|
||||
|
||||
# Here are the email package modules we'll need
|
||||
from email.mime.image import MIMEImage
|
||||
from email.mime.multipart import MIMEMultipart
|
||||
|
||||
COMMASPACE = ', '
|
||||
|
||||
# Create the container (outer) email message.
|
||||
msg = MIMEMultipart()
|
||||
msg['Subject'] = 'Our family reunion'
|
||||
# me == the sender's email address
|
||||
# family = the list of all recipients' email addresses
|
||||
msg['From'] = me
|
||||
msg['To'] = COMMASPACE.join(family)
|
||||
msg.preamble = 'Our family reunion'
|
||||
|
||||
# Assume we know that the image files are all in PNG format
|
||||
for file in pngfiles:
    # Open the files in binary mode. Let the MIMEImage class automatically
    # guess the specific image type.
    fp = open(file, 'rb')
    img = MIMEImage(fp.read())
    fp.close()
    msg.attach(img)
|
||||
|
||||
# Send the email via our own SMTP server.
|
||||
s = smtplib.SMTP()
|
||||
s.connect()
|
||||
s.sendmail(me, family, msg.as_string())
|
||||
s.close()
|
|
@ -1,25 +0,0 @@
|
|||
# Import smtplib for the actual sending function
|
||||
import smtplib
|
||||
|
||||
# Import the email modules we'll need
|
||||
from email.mime.text import MIMEText
|
||||
|
||||
# Open a plain text file for reading. For this example, assume that
|
||||
# the text file contains only ASCII characters.
|
||||
fp = open(textfile, 'rb')
|
||||
# Create a text/plain message
|
||||
msg = MIMEText(fp.read())
|
||||
fp.close()
|
||||
|
||||
# me == the sender's email address
|
||||
# you == the recipient's email address
|
||||
msg['Subject'] = 'The contents of %s' % textfile
|
||||
msg['From'] = me
|
||||
msg['To'] = you
|
||||
|
||||
# Send the message via our own SMTP server, but don't include the
|
||||
# envelope header.
|
||||
s = smtplib.SMTP()
|
||||
s.connect()
|
||||
s.sendmail(me, [you], msg.as_string())
|
||||
s.close()
|
|
@ -1,68 +0,0 @@
|
|||
#!/usr/bin/env python
|
||||
|
||||
"""Unpack a MIME message into a directory of files."""
|
||||
|
||||
import os
|
||||
import sys
|
||||
import email
|
||||
import errno
|
||||
import mimetypes
|
||||
|
||||
from optparse import OptionParser
|
||||
|
||||
|
||||
def main():
|
||||
parser = OptionParser(usage="""\
|
||||
Unpack a MIME message into a directory of files.
|
||||
|
||||
Usage: %prog [options] msgfile
|
||||
""")
|
||||
parser.add_option('-d', '--directory',
|
||||
type='string', action='store',
|
||||
help="""Unpack the MIME message into the named
|
||||
directory, which will be created if it doesn't already
|
||||
exist.""")
|
||||
opts, args = parser.parse_args()
|
||||
if not opts.directory:
|
||||
parser.print_help()
|
||||
sys.exit(1)
|
||||
|
||||
try:
|
||||
msgfile = args[0]
|
||||
except IndexError:
|
||||
parser.print_help()
|
||||
sys.exit(1)
|
||||
|
||||
try:
|
||||
os.mkdir(opts.directory)
|
||||
except OSError, e:
|
||||
# Ignore directory exists error
|
||||
if e.errno <> errno.EEXIST:
|
||||
raise
|
||||
|
||||
fp = open(msgfile)
|
||||
msg = email.message_from_file(fp)
|
||||
fp.close()
|
||||
|
||||
counter = 1
|
||||
for part in msg.walk():
|
||||
# multipart/* are just containers
|
||||
if part.get_content_maintype() == 'multipart':
|
||||
continue
|
||||
# Applications should really sanitize the given filename so that an
|
||||
# email message can't be used to overwrite important files
|
||||
filename = part.get_filename()
|
||||
if not filename:
|
||||
ext = mimetypes.guess_extension(part.get_content_type())
|
||||
if not ext:
|
||||
# Use a generic bag-of-bits extension
|
||||
ext = '.bin'
|
||||
filename = 'part-%03d%s' % (counter, ext)
|
||||
counter += 1
|
||||
fp = open(os.path.join(opts.directory, filename), 'wb')
|
||||
fp.write(part.get_payload(decode=True))
|
||||
fp.close()
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
|
@ -1,402 +0,0 @@
|
|||
% Copyright (C) 2001-2007 Python Software Foundation
|
||||
% Author: barry@python.org (Barry Warsaw)
|
||||
|
||||
\section{\module{email} ---
|
||||
An email and MIME handling package}
|
||||
|
||||
\declaremodule{standard}{email}
|
||||
\modulesynopsis{Package supporting the parsing, manipulating, and
|
||||
generating email messages, including MIME documents.}
|
||||
\moduleauthor{Barry A. Warsaw}{barry@python.org}
|
||||
\sectionauthor{Barry A. Warsaw}{barry@python.org}
|
||||
|
||||
\versionadded{2.2}
|
||||
|
||||
The \module{email} package is a library for managing email messages,
|
||||
including MIME and other \rfc{2822}-based message documents. It
|
||||
subsumes most of the functionality in several older standard modules
|
||||
such as \refmodule{rfc822}, \refmodule{mimetools},
|
||||
\refmodule{multifile}, and other non-standard packages such as
|
||||
\module{mimecntl}. It is specifically \emph{not} designed to do any
|
||||
sending of email messages to SMTP (\rfc{2821}), NNTP, or other servers; those
|
||||
are functions of modules such as \refmodule{smtplib} and \refmodule{nntplib}.
|
||||
The \module{email} package attempts to be as RFC-compliant as possible,
|
||||
supporting in addition to \rfc{2822}, such MIME-related RFCs as
|
||||
\rfc{2045}, \rfc{2046}, \rfc{2047}, and \rfc{2231}.
|
||||
|
||||
The primary distinguishing feature of the \module{email} package is
|
||||
that it splits the parsing and generating of email messages from the
|
||||
internal \emph{object model} representation of email. Applications
|
||||
using the \module{email} package deal primarily with objects; you can
|
||||
add sub-objects to messages, remove sub-objects from messages,
|
||||
completely re-arrange the contents, etc. There is a separate parser
|
||||
and a separate generator which handle the transformation from flat
|
||||
text to the object model, and then back to flat text again. There
|
||||
are also handy subclasses for some common MIME object types, and a few
|
||||
miscellaneous utilities that help with such common tasks as extracting
|
||||
and parsing message field values, creating RFC-compliant dates, etc.
|
||||
|
||||
The following sections describe the functionality of the
|
||||
\module{email} package. The ordering follows a progression that
|
||||
should be common in applications: an email message is read as flat
|
||||
text from a file or other source, the text is parsed to produce the
|
||||
object structure of the email message, this structure is manipulated,
|
||||
and finally, the object tree is rendered back into flat text.
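
The progression described above can be sketched roughly as follows. This is only an illustration, assuming a modern Python 3 interpreter in which \function{email.message_from_string()} and \method{Message.as_string()} are still available under these names; the addresses and the \code{raw} text are invented for the example.

```python
import email

# Hypothetical flat-text message; the addresses are placeholders.
raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Hello\n"
    "\n"
    "Body text.\n"
)

# 1. Parse the flat text into the object model.
msg = email.message_from_string(raw)

# 2. Manipulate the object structure. Deleting first avoids ending up
#    with two Subject headers, since assignment appends a header.
del msg['Subject']
msg['Subject'] = 'Hello again'

# 3. Render the object tree back into flat text.
flat = msg.as_string()
```

Note that assigning to a header key appends rather than replaces, which is why the old header is deleted before the new value is set.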
|
||||
|
||||
It is perfectly feasible to create the object structure out of whole
|
||||
cloth --- i.e. completely from scratch. From there, a similar
|
||||
progression can be taken as above.
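
As a hedged sketch of the from-scratch route, the following builds a small multipart message out of the MIME subclasses and then flattens it. It assumes Python 3 and the \module{email.mime} subpackage names; the subject line and body strings are placeholders.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Build the container and set a header on it.
outer = MIMEMultipart()
outer['Subject'] = 'Report'  # placeholder subject

# Attach two sub-objects: a plain-text part and an HTML part.
outer.attach(MIMEText('See the notes below.'))
outer.attach(MIMEText('<p>notes</p>', 'html'))

# From here the same progression applies: the tree can be
# manipulated further and finally rendered to flat text.
flat = outer.as_string()
parts = outer.get_payload()
```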
|
||||
|
||||
Also included are detailed specifications of all the classes and
|
||||
modules that the \module{email} package provides, the exception
|
||||
classes you might encounter while using the \module{email} package,
|
||||
some auxiliary utilities, and a few examples. For users of the older
|
||||
\module{mimelib} package, or previous versions of the \module{email}
|
||||
package, a section on differences and porting is provided.
|
||||
|
||||
\begin{seealso}
|
||||
\seemodule{smtplib}{SMTP protocol client}
|
||||
\seemodule{nntplib}{NNTP protocol client}
|
||||
\end{seealso}
|
||||
|
||||
\subsection{Representing an email message}
|
||||
\input{emailmessage}
|
||||
|
||||
\subsection{Parsing email messages}
|
||||
\input{emailparser}
|
||||
|
||||
\subsection{Generating MIME documents}
|
||||
\input{emailgenerator}
|
||||
|
||||
\subsection{Creating email and MIME objects from scratch}
|
||||
\input{emailmimebase}
|
||||
|
||||
\subsection{Internationalized headers}
|
||||
\input{emailheaders}
|
||||
|
||||
\subsection{Representing character sets}
|
||||
\input{emailcharsets}
|
||||
|
||||
\subsection{Encoders}
|
||||
\input{emailencoders}
|
||||
|
||||
\subsection{Exception and Defect classes}
|
||||
\input{emailexc}
|
||||
|
||||
\subsection{Miscellaneous utilities}
|
||||
\input{emailutil}
|
||||
|
||||
\subsection{Iterators}
|
||||
\input{emailiter}
|
||||
|
||||
\subsection{Package History\label{email-pkg-history}}
|
||||
|
||||
This table describes the release history of the email package, corresponding
|
||||
to the version of Python that the package was released with. For purposes of
|
||||
this document, when you see a note about change or added versions, these refer
|
||||
to the Python version the change was made in, \emph{not} the email package
|
||||
version. This table also describes the Python compatibility of each version
|
||||
of the package.
|
||||
|
||||
\begin{tableiii}{l|l|l}{constant}{email version}{distributed with}{compatible with}
|
||||
\lineiii{1.x}{Python 2.2.0 to Python 2.2.1}{\emph{no longer supported}}
|
||||
\lineiii{2.5}{Python 2.2.2+ and Python 2.3}{Python 2.1 to 2.5}
|
||||
\lineiii{3.0}{Python 2.4}{Python 2.3 to 2.5}
|
||||
\lineiii{4.0}{Python 2.5}{Python 2.3 to 2.5}
|
||||
\end{tableiii}
|
||||
|
||||
Here are the major differences between \module{email} version 4 and version 3:
|
||||
|
||||
\begin{itemize}
|
||||
\item All modules have been renamed according to \pep{8} standards. For
|
||||
example, the version 3 module \module{email.Message} was renamed to
|
||||
\module{email.message} in version 4.
|
||||
|
||||
\item A new subpackage \module{email.mime} was added and all the version 3
|
||||
\module{email.MIME*} modules were renamed and situated into the
|
||||
\module{email.mime} subpackage. For example, the version 3 module
|
||||
\module{email.MIMEText} was renamed to \module{email.mime.text}.
|
||||
|
||||
\emph{Note that the version 3 names will continue to work until Python
|
||||
2.6}.
|
||||
|
||||
\item The \module{email.mime.application} module was added, which contains the
|
||||
\class{MIMEApplication} class.
|
||||
|
||||
\item Methods that were deprecated in version 3 have been removed. These
|
||||
include \method{Generator.__call__()}, \method{Message.get_type()},
|
||||
\method{Message.get_main_type()}, and \method{Message.get_subtype()}.
|
||||
|
||||
\item Fixes have been added for \rfc{2231} support which can change some of
|
||||
the return types for \function{Message.get_param()} and friends. Under
|
||||
some circumstances, values which used to return a 3-tuple now return
|
||||
simple strings (specifically, if all extended parameter segments were
|
||||
unencoded, there is no language and charset designation expected, so the
|
||||
return type is now a simple string). Also, \%-decoding used to be done
|
||||
for both encoded and unencoded segments; this decoding is now done only
|
||||
for encoded segments.
|
||||
\end{itemize}
|
||||
|
||||
Here are the major differences between \module{email} version 3 and version 2:
|
||||
|
||||
\begin{itemize}
|
||||
\item The \class{FeedParser} class was introduced, and the \class{Parser}
|
||||
class was implemented in terms of the \class{FeedParser}. All parsing
|
||||
therefore is non-strict, and parsing will make a best effort never to
|
||||
raise an exception. Problems found while parsing messages are stored in
|
||||
the message's \var{defects} attribute.
|
||||
|
||||
\item All aspects of the API which raised \exception{DeprecationWarning}s in
|
||||
version 2 have been removed. These include the \var{_encoder} argument
|
||||
to the \class{MIMEText} constructor, the \method{Message.add_payload()}
|
||||
method, the \function{Utils.dump_address_pair()} function, and the
|
||||
functions \function{Utils.decode()} and \function{Utils.encode()}.
|
||||
|
||||
\item New \exception{DeprecationWarning}s have been added to:
|
||||
\method{Generator.__call__()}, \method{Message.get_type()},
|
||||
\method{Message.get_main_type()}, \method{Message.get_subtype()}, and
|
||||
the \var{strict} argument to the \class{Parser} class. These are
|
||||
expected to be removed in future versions.
|
||||
|
||||
\item Support for Pythons earlier than 2.3 has been removed.
|
||||
\end{itemize}
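
The best-effort parsing described in the list above can be illustrated with a deliberately malformed message. This is a minimal sketch, assuming Python 3 and the \module{email.feedparser} module name; the boundary string and headers are invented, and the exact defect classes reported may vary between versions, so the example only collects their names.

```python
from email.feedparser import FeedParser

parser = FeedParser()

# Feed the message in arbitrary chunks. The Content-Type header
# promises a multipart body, but the boundary never appears.
parser.feed('Content-Type: multipart/mixed; boundary="XYZ"\n')
parser.feed('Subject: broken\n\n')
parser.feed('This body never mentions the boundary.\n')

# close() returns the root message; no exception is raised even
# though the input is malformed.
msg = parser.close()

# Problems found while parsing are recorded on the message object.
defect_names = [type(d).__name__ for d in msg.defects]
```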
|
||||
|
||||
Here are the differences between \module{email} version 2 and version 1:
|
||||
|
||||
\begin{itemize}
|
||||
\item The \module{email.Header} and \module{email.Charset} modules
|
||||
have been added.
|
||||
|
||||
\item The pickle format for \class{Message} instances has changed.
|
||||
Since this was never (and still isn't) formally defined, this
|
||||
isn't considered a backward incompatibility. However if your
|
||||
application pickles and unpickles \class{Message} instances, be
|
||||
aware that in \module{email} version 2, \class{Message}
|
||||
instances now have private variables \var{_charset} and
|
||||
\var{_default_type}.
|
||||
|
||||
\item Several methods in the \class{Message} class have been
|
||||
deprecated, or their signatures changed. Also, many new methods
|
||||
have been added. See the documentation for the \class{Message}
|
||||
class for details. The changes should be completely backward
|
||||
compatible.
|
||||
|
||||
\item The object structure has changed in the face of
|
||||
\mimetype{message/rfc822} content types. In \module{email}
|
||||
version 1, such a type would be represented by a scalar payload,
|
||||
i.e. the container message's \method{is_multipart()} returned
|
||||
false, and \method{get_payload()} returned not a list object, but a single
|
||||
\class{Message} instance.
|
||||
|
||||
This structure was inconsistent with the rest of the package, so
|
||||
the object representation for \mimetype{message/rfc822} content
|
||||
types was changed. In \module{email} version 2, the container
|
||||
\emph{does} return \code{True} from \method{is_multipart()}, and
|
||||
\method{get_payload()} returns a list containing a single
|
||||
\class{Message} item.
|
||||
|
||||
Note that this is one place that backward compatibility could
|
||||
not be completely maintained. However, if you're already
|
||||
testing the return type of \method{get_payload()}, you should be
|
||||
fine. You just need to make sure your code doesn't do a
|
||||
\method{set_payload()} with a \class{Message} instance on a
|
||||
container with a content type of \mimetype{message/rfc822}.
|
||||
|
||||
\item The \class{Parser} constructor's \var{strict} argument was
|
||||
added, and its \method{parse()} and \method{parsestr()} methods
|
||||
grew a \var{headersonly} argument. The \var{strict} flag was
|
||||
also added to functions \function{email.message_from_file()}
|
||||
and \function{email.message_from_string()}.
|
||||
|
||||
\item \method{Generator.__call__()} is deprecated; use
|
||||
\method{Generator.flatten()} instead. The \class{Generator}
|
||||
class has also grown the \method{clone()} method.
|
||||
|
||||
\item The \class{DecodedGenerator} class in the
|
||||
\module{email.Generator} module was added.
|
||||
|
||||
\item The intermediate base classes \class{MIMENonMultipart} and
|
||||
\class{MIMEMultipart} have been added, and interposed in the
|
||||
class hierarchy for most of the other MIME-related derived
|
||||
classes.
|
||||
|
||||
\item The \var{_encoder} argument to the \class{MIMEText} constructor
|
||||
has been deprecated. Encoding now happens implicitly based
|
||||
on the \var{_charset} argument.
|
||||
|
||||
\item The following functions in the \module{email.Utils} module have
|
||||
been deprecated: \function{dump_address_pair()},
|
||||
\function{decode()}, and \function{encode()}. The following
|
||||
functions have been added to the module:
|
||||
\function{make_msgid()}, \function{decode_rfc2231()},
|
||||
\function{encode_rfc2231()}, and \function{decode_params()}.
|
||||
|
||||
\item The non-public function \function{email.Iterators._structure()}
|
||||
was added.
|
||||
\end{itemize}
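
The \mimetype{message/rfc822} structure described in the list above (the version 2 behaviour) can be checked with a short sketch. It assumes Python 3, where this object structure is still in effect; the nested message text is invented for the example.

```python
import email

# An outer message whose body is itself a complete message.
raw = (
    "Content-Type: message/rfc822\n"
    "\n"
    "Subject: inner\n"
    "\n"
    "inner body\n"
)

outer = email.message_from_string(raw)

# The container reports itself as multipart, and its payload is a
# list holding exactly one Message object.
payload = outer.get_payload()
inner = payload[0]
```

Code written against the version 1 structure, which expected a single \class{Message} instance from \method{get_payload()}, would need to index into the list as shown.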
|
||||
|
||||
\subsection{Differences from \module{mimelib}}
|
||||
|
||||
The \module{email} package was originally prototyped as a separate
|
||||
library called
|
||||
\ulink{\texttt{mimelib}}{http://mimelib.sf.net/}.
|
||||
Changes have been made so that
|
||||
method names are more consistent, and some methods or modules have
|
||||
either been added or removed. The semantics of some of the methods
|
||||
have also changed. For the most part, any functionality available in
|
||||
\module{mimelib} is still available in the \refmodule{email} package,
|
||||
albeit often in a different way. Backward compatibility between
|
||||
the \module{mimelib} package and the \module{email} package was not a
|
||||
priority.
|
||||
|
||||
Here is a brief description of the differences between the
|
||||
\module{mimelib} and the \refmodule{email} packages, along with hints on
|
||||
how to port your applications.
|
||||
|
||||
Of course, the most visible difference between the two packages is
|
||||
that the package name has been changed to \refmodule{email}. In
|
||||
addition, the top-level package has the following differences:
|
||||
|
||||
\begin{itemize}
|
||||
\item \function{messageFromString()} has been renamed to
|
||||
\function{message_from_string()}.
|
||||
|
||||
\item \function{messageFromFile()} has been renamed to
|
||||
\function{message_from_file()}.
|
||||
|
||||
\end{itemize}
|
||||
|
||||
The \class{Message} class has the following differences:
|
||||
|
||||
\begin{itemize}
|
||||
\item The method \method{asString()} was renamed to \method{as_string()}.
|
||||
|
||||
\item The method \method{ismultipart()} was renamed to
|
||||
\method{is_multipart()}.
|
||||
|
||||
\item The \method{get_payload()} method has grown a \var{decode}
|
||||
optional argument.
|
||||
|
||||
\item The method \method{getall()} was renamed to \method{get_all()}.
|
||||
|
||||
\item The method \method{addheader()} was renamed to \method{add_header()}.
|
||||
|
||||
\item The method \method{gettype()} was renamed to \method{get_type()}.
|
||||
|
||||
\item The method \method{getmaintype()} was renamed to
|
||||
\method{get_main_type()}.
|
||||
|
||||
\item The method \method{getsubtype()} was renamed to
|
||||
\method{get_subtype()}.
|
||||
|
||||
\item The method \method{getparams()} was renamed to
|
||||
\method{get_params()}.
|
||||
Also, whereas \method{getparams()} returned a list of strings,
|
||||
\method{get_params()} returns a list of 2-tuples, effectively
|
||||
the key/value pairs of the parameters, split on the \character{=}
|
||||
sign.
|
||||
|
||||
\item The method \method{getparam()} was renamed to \method{get_param()}.
|
||||
|
||||
\item The method \method{getcharsets()} was renamed to
|
||||
\method{get_charsets()}.
|
||||
|
||||
\item The method \method{getfilename()} was renamed to
|
||||
\method{get_filename()}.
|
||||
|
||||
\item The method \method{getboundary()} was renamed to
|
||||
\method{get_boundary()}.
|
||||
|
||||
\item The method \method{setboundary()} was renamed to
|
||||
\method{set_boundary()}.
|
||||
|
||||
\item The method \method{getdecodedpayload()} was removed. To get
|
||||
similar functionality, pass the value 1 to the \var{decode} flag
|
||||
of the \method{get_payload()} method.
|
||||
|
||||
\item The method \method{getpayloadastext()} was removed. Similar
|
||||
functionality
|
||||
is supported by the \class{DecodedGenerator} class in the
|
||||
\refmodule{email.generator} module.
|
||||
|
||||
\item The method \method{getbodyastext()} was removed. You can get
|
||||
similar functionality by creating an iterator with
|
||||
\function{typed_subpart_iterator()} in the
|
||||
\refmodule{email.iterators} module.
|
||||
\end{itemize}
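
Two of the changes in the list above, the 2-tuples returned by \method{get_params()} and the \var{decode} argument to \method{get_payload()}, are sketched below. The example assumes Python 3, where the decoded payload comes back as a bytes object; the message text is invented.

```python
import email

raw = (
    'Content-Type: text/plain; charset="us-ascii"; format=flowed\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "aGVsbG8=\n"
)

msg = email.message_from_string(raw)

# Each Content-Type parameter is a (key, value) pair split on '='.
params = msg.get_params()

# Without decode, the payload is returned as stored (still base64).
encoded = msg.get_payload()

# With decode=True, the Content-Transfer-Encoding is undone.
decoded = msg.get_payload(decode=True)
```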
|
||||
|
||||
The \class{Parser} class has no differences in its public interface.
|
||||
It does have some additional smarts to recognize
|
||||
\mimetype{message/delivery-status} type messages, which it represents as
|
||||
a \class{Message} instance containing separate \class{Message}
|
||||
subparts for each header block in the delivery status
|
||||
notification\footnote{Delivery Status Notifications (DSN) are defined
|
||||
in \rfc{1894}.}.
|
||||
|
||||
The \class{Generator} class has no differences in its public
|
||||
interface. There is a new class in the \refmodule{email.generator}
|
||||
module though, called \class{DecodedGenerator} which provides most of
|
||||
the functionality previously available in the
|
||||
\method{Message.getpayloadastext()} method.
|
||||
|
||||
The following modules and classes have been changed:
|
||||
|
||||
\begin{itemize}
|
||||
\item The \class{MIMEBase} class constructor arguments \var{_major}
|
||||
and \var{_minor} have changed to \var{_maintype} and
|
||||
\var{_subtype} respectively.
|
||||
|
||||
\item The \code{Image} class/module has been renamed to
|
||||
\code{MIMEImage}. The \var{_minor} argument has been renamed to
|
||||
\var{_subtype}.
|
||||
|
||||
\item The \code{Text} class/module has been renamed to
|
||||
\code{MIMEText}. The \var{_minor} argument has been renamed to
|
||||
\var{_subtype}.
|
||||
|
||||
\item The \code{MessageRFC822} class/module has been renamed to
|
||||
\code{MIMEMessage}. Note that an earlier version of
|
||||
\module{mimelib} called this class/module \code{RFC822}, but
|
||||
that clashed with the Python standard library module
|
||||
\refmodule{rfc822} on some case-insensitive file systems.
|
||||
|
||||
Also, the \class{MIMEMessage} class now represents any kind of
|
||||
MIME message with main type \mimetype{message}. It takes an
|
||||
optional argument \var{_subtype} which is used to set the MIME
|
||||
subtype. \var{_subtype} defaults to \mimetype{rfc822}.
|
||||
\end{itemize}
|
||||
|
||||
\module{mimelib} provided some utility functions in its
|
||||
\module{address} and \module{date} modules. All of these functions
|
||||
have been moved to the \refmodule{email.utils} module.
|
||||
|
||||
The \code{MsgReader} class/module has been removed. Its functionality
|
||||
is most closely supported in the \function{body_line_iterator()}
|
||||
function in the \refmodule{email.iterators} module.
|
||||
|
||||
\subsection{Examples}
|
||||
|
||||
Here are a few examples of how to use the \module{email} package to
|
||||
read, write, and send simple email messages, as well as more complex
|
||||
MIME messages.
|
||||
|
||||
First, let's see how to create and send a simple text message:
|
||||
|
||||
\verbatiminput{email-simple.py}
|
||||
|
||||
Here's an example of how to send a MIME message containing a bunch of
|
||||
family pictures that may be residing in a directory:
|
||||
|
||||
\verbatiminput{email-mime.py}
|
||||
|
||||
Here's an example of how to send the entire contents of a directory as
|
||||
an email message:
|
||||
\footnote{Thanks to Matthew Dixon Cowles for the original inspiration
|
||||
and examples.}
|
||||
|
||||
\verbatiminput{email-dir.py}
|
||||
|
||||
And finally, here's an example of how to unpack a MIME message like
|
||||
the one above, into a directory of files:
|
||||
|
||||
\verbatiminput{email-unpack.py}
|
|
@ -1,244 +0,0 @@
|
|||
\declaremodule{standard}{email.charset}
|
||||
\modulesynopsis{Character Sets}
|
||||
|
||||
This module provides a class \class{Charset} for representing
|
||||
character sets and character set conversions in email messages, as
|
||||
well as a character set registry and several convenience methods for
|
||||
manipulating this registry. Instances of \class{Charset} are used in
|
||||
several other modules within the \module{email} package.
|
||||
|
||||
Import this class from the \module{email.charset} module.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
|
||||
\begin{classdesc}{Charset}{\optional{input_charset}}
|
||||
Map character sets to their email properties.
|
||||
|
||||
This class provides information about the requirements imposed on
|
||||
email for a specific character set. It also provides convenience
|
||||
routines for converting between character sets, given the availability
|
||||
of the applicable codecs. Given a character set, it will do its best
|
||||
to provide information on how to use that character set in an email
|
||||
message in an RFC-compliant way.
|
||||
|
||||
Certain character sets must be encoded with quoted-printable or base64
|
||||
when used in email headers or bodies. Certain character sets must be
|
||||
converted outright, and are not allowed in email.
|
||||
|
||||
Optional \var{input_charset} is as described below; it is always
|
||||
coerced to lower case. After being alias normalized it is also used
|
||||
as a lookup into the registry of character sets to find out the header
|
||||
encoding, body encoding, and output conversion codec to be used for
|
||||
the character set. For example, if
|
||||
\var{input_charset} is \code{iso-8859-1}, then headers and bodies will
|
||||
be encoded using quoted-printable and no output conversion codec is
|
||||
necessary. If \var{input_charset} is \code{euc-jp}, then headers will
|
||||
be encoded with base64, bodies will not be encoded, but output text
|
||||
will be converted from the \code{euc-jp} character set to the
|
||||
\code{iso-2022-jp} character set.
|
||||
\end{classdesc}
|
||||
|
||||
\class{Charset} instances have the following data attributes:
|
||||
|
||||
\begin{datadesc}{input_charset}
|
||||
The initial character set specified. Common aliases are converted to
|
||||
their \emph{official} email names (e.g. \code{latin_1} is converted to
|
||||
\code{iso-8859-1}). Defaults to 7-bit \code{us-ascii}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{header_encoding}
|
||||
If the character set must be encoded before it can be used in an
|
||||
email header, this attribute will be set to \code{Charset.QP} (for
|
||||
quoted-printable), \code{Charset.BASE64} (for base64 encoding), or
|
||||
\code{Charset.SHORTEST} for the shortest of QP or BASE64 encoding.
|
||||
Otherwise, it will be \code{None}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{body_encoding}
|
||||
Same as \var{header_encoding}, but describes the encoding for the
|
||||
mail message's body, which indeed may be different than the header
|
||||
encoding. \code{Charset.SHORTEST} is not allowed for
|
||||
\var{body_encoding}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{output_charset}
|
||||
Some character sets must be converted before they can be used in
|
||||
email headers or bodies. If the \var{input_charset} is one of
|
||||
them, this attribute will contain the name of the character set
|
||||
output will be converted to. Otherwise, it will be \code{None}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{input_codec}
|
||||
The name of the Python codec used to convert the \var{input_charset} to
|
||||
Unicode. If no conversion codec is necessary, this attribute will be
|
||||
\code{None}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{output_codec}
|
||||
The name of the Python codec used to convert Unicode to the
|
||||
\var{output_charset}. If no conversion codec is necessary, this
|
||||
attribute will have the same value as the \var{input_codec}.
|
||||
\end{datadesc}
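
The attributes described above can be inspected directly. The following is a small sketch, assuming Python 3 and the \module{email.charset} module name, with the \code{QP} and \code{BASE64} constants imported from that module; it uses the two character sets from the class description.

```python
from email.charset import Charset, QP, BASE64

# 'latin_1' is a common alias; it is normalized to the official name.
latin = Charset('latin_1')

# euc-jp is one of the character sets that is converted on output.
japanese = Charset('euc-jp')

summary = {
    'latin': (latin.input_charset, latin.header_encoding,
              latin.body_encoding),
    'japanese': (japanese.input_charset, japanese.header_encoding,
                 japanese.body_encoding, japanese.output_charset),
}
```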
|
||||
|
||||
\class{Charset} instances also have the following methods:
|
||||
|
||||
\begin{methoddesc}[Charset]{get_body_encoding}{}
|
||||
Return the content transfer encoding used for body encoding.
|
||||
|
||||
This is either the string \samp{quoted-printable} or \samp{base64}
|
||||
depending on the encoding used, or it is a function, in which case you
|
||||
should call the function with a single argument, the Message object
|
||||
being encoded. The function should then set the
|
||||
\mailheader{Content-Transfer-Encoding} header itself to whatever is
|
||||
appropriate.
|
||||
|
||||
Returns the string \samp{quoted-printable} if
|
||||
\var{body_encoding} is \code{QP}, returns the string
|
||||
\samp{base64} if \var{body_encoding} is \code{BASE64}, and returns the
|
||||
string \samp{7bit} otherwise.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{convert}{s}
|
||||
Convert the string \var{s} from the \var{input_codec} to the
|
||||
\var{output_codec}.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{to_splittable}{s}
|
||||
Convert a possibly multibyte string to a safely splittable format.
|
||||
\var{s} is the string to split.
|
||||
|
||||
Uses the \var{input_codec} to try and convert the string to Unicode,
|
||||
so it can be safely split on character boundaries (even for multibyte
|
||||
characters).
|
||||
|
||||
Returns the string as-is if it isn't known how to convert \var{s} to
|
||||
Unicode with the \var{input_charset}.
|
||||
|
||||
Characters that could not be converted to Unicode will be replaced
|
||||
with the Unicode replacement character \character{U+FFFD}.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{from_splittable}{ustr\optional{, to_output}}
|
||||
Convert a splittable string back into an encoded string. \var{ustr}
|
||||
is a Unicode string to ``unsplit''.
|
||||
|
||||
This method uses the proper codec to try and convert the string from
|
||||
Unicode back into an encoded format. Return the string as-is if it is
|
||||
not Unicode, or if it could not be converted from Unicode.
|
||||
|
||||
Characters that could not be converted from Unicode will be replaced
|
||||
with an appropriate character (usually \character{?}).
|
||||
|
||||
If \var{to_output} is \code{True} (the default), uses
|
||||
\var{output_codec} to convert to an
|
||||
encoded format. If \var{to_output} is \code{False}, it uses
|
||||
\var{input_codec}.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{get_output_charset}{}
|
||||
Return the output character set.
|
||||
|
||||
This is the \var{output_charset} attribute if that is not \code{None},
|
||||
otherwise it is \var{input_charset}.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{encoded_header_len}{}
|
||||
Return the length of the encoded header string, properly calculating
|
||||
for quoted-printable or base64 encoding.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}{header_encode}{s\optional{, convert}}
|
||||
Header-encode the string \var{s}.
|
||||
|
||||
If \var{convert} is \code{True}, the string will be converted from the
|
||||
input charset to the output charset automatically. This is not useful
|
||||
for multibyte character sets, which have line length issues (multibyte
|
||||
characters must be split on a character, not a byte boundary); use the
|
||||
higher-level \class{Header} class to deal with these issues (see
|
||||
\refmodule{email.header}). \var{convert} defaults to \code{False}.
|
||||
|
||||
The type of encoding (base64 or quoted-printable) will be based on
|
||||
the \var{header_encoding} attribute.
|
||||
\end{methoddesc}
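
A hedged sketch of \method{header_encode()} follows. It assumes Python 3, where the method takes a text string and is called here with a single argument; the sample string is invented. For anything beyond short single-byte strings, the higher-level \class{Header} class remains the safer choice.

```python
from email.charset import Charset

latin = Charset('iso-8859-1')

# The header encoding for iso-8859-1 is quoted-printable, so the
# result is an RFC 2047 encoded word using the "q" encoding.
encoded = latin.header_encode('Caf\u00e9')
```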
|
||||
|
||||
\begin{methoddesc}{body_encode}{s\optional{, convert}}
|
||||
Body-encode the string \var{s}.
|
||||
|
||||
If \var{convert} is \code{True} (the default), the string will be
|
||||
converted from the input charset to output charset automatically.
|
||||
Unlike \method{header_encode()}, there are no issues with byte
|
||||
boundaries and multibyte charsets in email bodies, so this is usually
|
||||
pretty safe.
|
||||
|
||||
The type of encoding (base64 or quoted-printable) will be based on
|
||||
the \var{body_encoding} attribute.
|
||||
\end{methoddesc}
|
||||
|
||||
The \class{Charset} class also provides a number of methods to support
|
||||
standard operations and built-in functions.
|
||||
|
||||
\begin{methoddesc}[Charset]{__str__}{}
|
||||
Returns \var{input_charset} as a string coerced to lower case.
|
||||
\method{__repr__()} is an alias for \method{__str__()}.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Charset]{__eq__}{other}
|
||||
This method allows you to compare two \class{Charset} instances for equality.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Charset]{__ne__}{other}
|
||||
This method allows you to compare two \class{Charset} instances for inequality.
|
||||
\end{methoddesc}
|
||||
|
||||
The \module{email.charset} module also provides the following
|
||||
functions for adding new entries to the global character set, alias,
|
||||
and codec registries:
|
||||
|
||||
\begin{funcdesc}{add_charset}{charset\optional{, header_enc\optional{,
|
||||
body_enc\optional{, output_charset}}}}
|
||||
Add character properties to the global registry.
|
||||
|
||||
\var{charset} is the input character set, and must be the canonical
|
||||
name of a character set.
|
||||
|
||||
Optional \var{header_enc} and \var{body_enc} are each one of
|
||||
\code{Charset.QP} for quoted-printable, \code{Charset.BASE64} for
|
||||
base64 encoding, \code{Charset.SHORTEST} for the shortest of
|
||||
quoted-printable or base64 encoding, or \code{None} for no encoding.
|
||||
\code{SHORTEST} is only valid for \var{header_enc}. The default is
|
||||
\code{None} for no encoding.
|
||||
|
||||
Optional \var{output_charset} is the character set that the output
|
||||
should be in. Conversions will proceed from input charset, to
|
||||
Unicode, to the output charset when the method
|
||||
\method{Charset.convert()} is called. The default is to output in the
|
||||
same character set as the input.
|
||||
|
||||
Both \var{charset} and \var{output_charset} must have Unicode
|
||||
codec entries in the module's character set-to-codec mapping; use
|
||||
\function{add_codec()} to add codecs the module does
|
||||
not know about. See the \refmodule{codecs} module's documentation for
|
||||
more information.
|
||||
|
||||
The global character set registry is kept in the module global
|
||||
dictionary \code{CHARSETS}.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{add_alias}{alias, canonical}
|
||||
Add a character set alias. \var{alias} is the alias name,
|
||||
e.g. \code{latin-1}. \var{canonical} is the character set's canonical
|
||||
name, e.g. \code{iso-8859-1}.
|
||||
|
||||
The global charset alias registry is kept in the module global
|
||||
dictionary \code{ALIASES}.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{add_codec}{charset, codecname}
|
||||
Add a codec that maps characters in the given character set to and from
|
||||
Unicode.
|
||||
|
||||
\var{charset} is the canonical name of a character set.
|
||||
\var{codecname} is the name of a Python codec, as appropriate for the
|
||||
second argument to the \function{unicode()} built-in, or to the
|
||||
\method{encode()} method of a Unicode string.
|
||||
\end{funcdesc}
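
The three registry functions above might be combined as in the following sketch. It assumes Python 3 and the \module{email.charset} module name. The character set name \code{x-demo-latin} and the alias \code{demo-latin} are hypothetical names made up for the example, and the calls modify the module-global registries for the lifetime of the process.

```python
from email import charset as charset_registry

# Register a hypothetical character set that uses quoted-printable
# for both headers and bodies.
charset_registry.add_charset('x-demo-latin',
                             charset_registry.QP,
                             charset_registry.QP)

# Tell the module which Python codec converts it to and from Unicode.
charset_registry.add_codec('x-demo-latin', 'latin-1')

# Register a hypothetical alias for the new canonical name.
charset_registry.add_alias('demo-latin', 'x-demo-latin')

# The alias now resolves to the registered properties.
demo = charset_registry.Charset('demo-latin')
```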
|
|
@ -1,47 +0,0 @@
|
|||
\declaremodule{standard}{email.encoders}
|
||||
\modulesynopsis{Encoders for email message payloads.}
|
||||
|
||||
When creating \class{Message} objects from scratch, you often need to
|
||||
encode the payloads for transport through compliant mail servers.
|
||||
This is especially true for \mimetype{image/*} and \mimetype{text/*}
|
||||
type messages containing binary data.
|
||||
|
||||
The \module{email} package provides some convenient encodings in its
|
||||
\module{encoders} module. These encoders are actually used by the
|
||||
\class{MIMEAudio} and \class{MIMEImage} class constructors to provide default
|
||||
encodings. All encoder functions take exactly one argument, the message
|
||||
object to encode. They usually extract the payload, encode it, and reset the
|
||||
payload to this newly encoded value. They should also set the
|
||||
\mailheader{Content-Transfer-Encoding} header as appropriate.
|
||||
|
||||
Here are the encoding functions provided:
|
||||
|
||||
\begin{funcdesc}{encode_quopri}{msg}
|
||||
Encodes the payload into quoted-printable form and sets the
|
||||
\mailheader{Content-Transfer-Encoding} header to
|
||||
\code{quoted-printable}\footnote{Note that encoding with
|
||||
\method{encode_quopri()} also encodes all tabs and space characters in
|
||||
the data.}.
|
||||
This is a good encoding to use when most of your payload is normal
|
||||
printable data, but contains a few unprintable characters.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{encode_base64}{msg}
|
||||
Encodes the payload into base64 form and sets the
|
||||
\mailheader{Content-Transfer-Encoding} header to
|
||||
\code{base64}. This is a good encoding to use when most of your payload
|
||||
is unprintable data since it is a more compact form than
|
||||
quoted-printable. The drawback of base64 encoding is that it
|
||||
renders the text non-human readable.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{encode_7or8bit}{msg}
|
||||
This doesn't actually modify the message's payload, but it does set
|
||||
the \mailheader{Content-Transfer-Encoding} header to either \code{7bit} or
|
||||
\code{8bit} as appropriate, based on the payload data.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{encode_noop}{msg}
|
||||
This does nothing; it doesn't even set the
|
||||
\mailheader{Content-Transfer-Encoding} header.
|
||||
\end{funcdesc}
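
A minimal sketch of how an encoder is applied to a message, assuming a
Python 3 interpreter (where binary payloads are \code{bytes} objects); the
payload bytes are arbitrary sample data.

```python
from email import encoders
from email.message import Message

raw = b'\x00\x01 binary data \xff'

msg = Message()
msg.set_payload(raw)

# Replaces the payload with its base64 form and sets the
# Content-Transfer-Encoding header.
encoders.encode_base64(msg)

cte = msg['Content-Transfer-Encoding']
round_trip = msg.get_payload(decode=True)
print(cte)
```

Decoding the payload again with \code{get_payload(decode=True)} should yield
the original bytes.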
|
|
@ -1,87 +0,0 @@
|
|||
\declaremodule{standard}{email.errors}
|
||||
\modulesynopsis{The exception classes used by the email package.}
|
||||
|
||||
The following exception classes are defined in the
|
||||
\module{email.errors} module:
|
||||
|
||||
\begin{excclassdesc}{MessageError}{}
|
||||
This is the base class for all exceptions that the \module{email}
|
||||
package can raise. It is derived from the standard
|
||||
\exception{Exception} class and defines no additional methods.
|
||||
\end{excclassdesc}
|
||||
|
||||
\begin{excclassdesc}{MessageParseError}{}
|
||||
This is the base class for exceptions raised by the \class{Parser}
|
||||
class. It is derived from \exception{MessageError}.
|
||||
\end{excclassdesc}
|
||||
|
||||
\begin{excclassdesc}{HeaderParseError}{}
|
||||
Raised under some error conditions when parsing the \rfc{2822} headers of
|
||||
a message, this class is derived from \exception{MessageParseError}.
|
||||
It can be raised from the \method{Parser.parse()} or
|
||||
\method{Parser.parsestr()} methods.
|
||||
|
||||
Situations where it can be raised include finding an envelope
|
||||
header after the first \rfc{2822} header of the message, finding a
|
||||
continuation line before the first \rfc{2822} header is found, or finding
|
||||
a line in the headers which is neither a header nor a continuation
|
||||
line.
|
||||
\end{excclassdesc}
|
||||
|
||||
\begin{excclassdesc}{BoundaryError}{}
|
||||
Raised under some error conditions when parsing the \rfc{2822} headers of
|
||||
a message, this class is derived from \exception{MessageParseError}.
|
||||
It can be raised from the \method{Parser.parse()} or
|
||||
\method{Parser.parsestr()} methods.
|
||||
|
||||
Situations where it can be raised include not being able to find the
|
||||
starting or terminating boundary in a \mimetype{multipart/*} message
|
||||
when strict parsing is used.
|
||||
\end{excclassdesc}
|
||||
|
||||
\begin{excclassdesc}{MultipartConversionError}{}
|
||||
Raised when a payload is added to a \class{Message} object using
|
||||
\method{add_payload()}, but the payload is already a scalar and the
|
||||
message's \mailheader{Content-Type} main type is neither
|
||||
\mimetype{multipart} nor missing. \exception{MultipartConversionError}
|
||||
multiply inherits from \exception{MessageError} and the built-in
|
||||
\exception{TypeError}.
|
||||
|
||||
Since \method{Message.add_payload()} is deprecated, this exception is
|
||||
rarely raised in practice. However, the exception may also be raised
|
||||
if the \method{attach()} method is called on an instance of a class
|
||||
derived from \class{MIMENonMultipart} (e.g. \class{MIMEImage}).
|
||||
\end{excclassdesc}
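
The \method{attach()} case described above can be sketched as follows. This
assumes a Python 3 interpreter and uses \class{MIMEText}, another class derived
from \class{MIMENonMultipart}, purely as a convenient example.

```python
from email import errors
from email.mime.text import MIMEText

part = MIMEText('plain body')

try:
    # Non-multipart MIME objects refuse additional subparts.
    part.attach(MIMEText('extra'))
    raised = None
except errors.MultipartConversionError as exc:
    raised = exc

print(type(raised).__name__)
```

Because the exception also inherits from \exception{TypeError}, code that
catches \exception{TypeError} will see it as well.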
|
||||
|
||||
Here's the list of the defects that the \class{FeedParser} can find while
|
||||
parsing messages. Note that the defects are added to the message where the
|
||||
problem was found, so for example, if a message nested inside a
|
||||
\mimetype{multipart/alternative} had a malformed header, that nested message
|
||||
object would have a defect, but the containing messages would not.
|
||||
|
||||
All defect classes are subclassed from \class{email.errors.MessageDefect}, but
|
||||
this class is \emph{not} an exception!
|
||||
|
||||
\versionadded[All the defect classes were added]{2.4}
|
||||
|
||||
\begin{itemize}
|
||||
\item \class{NoBoundaryInMultipartDefect} -- A message claimed to be a
|
||||
multipart, but had no \mimetype{boundary} parameter.
|
||||
|
||||
\item \class{StartBoundaryNotFoundDefect} -- The start boundary claimed in the
|
||||
\mailheader{Content-Type} header was never found.
|
||||
|
||||
\item \class{FirstHeaderLineIsContinuationDefect} -- The message had a
|
||||
continuation line as its first header line.
|
||||
|
||||
\item \class{MisplacedEnvelopeHeaderDefect} -- A ``Unix From'' header was found
|
||||
in the middle of a header block.
|
||||
|
||||
\item \class{MalformedHeaderDefect} -- A header was found that was missing a
|
||||
colon, or was otherwise malformed.
|
||||
|
||||
\item \class{MultipartInvariantViolationDefect} -- A message claimed to be a
|
||||
\mimetype{multipart}, but no subparts were found. Note that when a
|
||||
message has this defect, its \method{is_multipart()} method may return
|
||||
false even though its content type claims to be \mimetype{multipart}.
|
||||
\end{itemize}
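
As a hedged sketch of how defects surface in practice, the following parses a
message that claims to be multipart but has no \mimetype{boundary} parameter,
then inspects the \code{defects} list on the resulting object. It assumes a
Python 3 interpreter with the default parsing behaviour; the message text is
made up.

```python
import email
from email import errors

raw = (
    'Content-Type: multipart/mixed\n'
    '\n'
    'This message claims to be multipart but names no boundary.\n'
)

# Parsing does not raise; problems are recorded on the message object.
msg = email.message_from_string(raw)

defect_names = [type(d).__name__ for d in msg.defects]
has_no_boundary = any(
    isinstance(d, errors.NoBoundaryInMultipartDefect) for d in msg.defects
)
print(defect_names, msg.is_multipart())
```

Checking the defect list after parsing is a way to detect malformed input
without switching to a stricter parsing mode.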
|
|
@ -1,133 +0,0 @@
|
|||
\declaremodule{standard}{email.generator}
|
||||
\modulesynopsis{Generate flat text email messages from a message structure.}
|
||||
|
||||
One of the most common tasks is to generate the flat text of the email
|
||||
message represented by a message object structure. You will need to do
|
||||
this if you want to send your message via the \refmodule{smtplib}
|
||||
module or the \refmodule{nntplib} module, or print the message on the
|
||||
console. Taking a message object structure and producing a flat text
|
||||
document is the job of the \class{Generator} class.
|
||||
|
||||
Again, as with the \refmodule{email.parser} module, you aren't limited
|
||||
to the functionality of the bundled generator; you could write one
|
||||
from scratch yourself. However, the bundled generator knows how to
|
||||
generate most email in a standards-compliant way, should handle MIME
|
||||
and non-MIME email messages just fine, and is designed so that the
|
||||
transformation from flat text, to a message structure via the
|
||||
\class{Parser} class, and back to flat text, is idempotent (the input
|
||||
is identical to the output).
|
||||
|
||||
Here are the public methods of the \class{Generator} class, imported from the
|
||||
\module{email.generator} module:
|
||||
|
||||
\begin{classdesc}{Generator}{outfp\optional{, mangle_from_\optional{,
|
||||
maxheaderlen}}}
|
||||
The constructor for the \class{Generator} class takes a file-like
|
||||
object called \var{outfp} for an argument. \var{outfp} must support
|
||||
the \method{write()} method and be usable as the output file in a
|
||||
Python extended print statement.
|
||||
|
||||
Optional \var{mangle_from_} is a flag that, when \code{True}, puts a
|
||||
\samp{>} character in front of any line in the body that starts exactly as
|
||||
\samp{From }, i.e. \code{From} followed by a space at the beginning of the
|
||||
line. This is the only guaranteed portable way to avoid having such
|
||||
lines be mistaken for a \UNIX{} mailbox format envelope header separator (see
|
||||
\ulink{WHY THE CONTENT-LENGTH FORMAT IS BAD}
|
||||
{http://www.jwz.org/doc/content-length.html}
|
||||
for details). \var{mangle_from_} defaults to \code{True}, but you
|
||||
might want to set this to \code{False} if you are not writing \UNIX{}
|
||||
mailbox format files.
|
||||
|
||||
Optional \var{maxheaderlen} specifies the longest length for a
|
||||
non-continued header. When a header line is longer than
|
||||
\var{maxheaderlen} (in characters, with tabs expanded to 8 spaces),
|
||||
the header will be split as defined in the \class{email.header.Header}
|
||||
class. Set to zero to disable header wrapping. The default is 78, as
|
||||
recommended (but not required) by \rfc{2822}.
|
||||
\end{classdesc}
|
||||
|
||||
The other public \class{Generator} methods are:
|
||||
|
||||
\begin{methoddesc}[Generator]{flatten}{msg\optional{, unixfrom}}
|
||||
Print the textual representation of the message object structure rooted at
|
||||
\var{msg} to the output file specified when the \class{Generator}
|
||||
instance was created. Subparts are visited depth-first and the
|
||||
resulting text will be properly MIME encoded.
|
||||
|
||||
Optional \var{unixfrom} is a flag that forces the printing of the
|
||||
envelope header delimiter before the first \rfc{2822} header of the
|
||||
root message object. If the root object has no envelope header, a
|
||||
standard one is crafted. By default, this is set to \code{False} to
|
||||
inhibit the printing of the envelope delimiter.
|
||||
|
||||
Note that for subparts, no envelope header is ever printed.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Generator]{clone}{fp}
|
||||
Return an independent clone of this \class{Generator} instance with
|
||||
the exact same options.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Generator]{write}{s}
|
||||
Write the string \var{s} to the underlying file object,
|
||||
i.e. \var{outfp} passed to \class{Generator}'s constructor. This
|
||||
provides just enough file-like API for \class{Generator} instances to
|
||||
be used in extended print statements.
|
||||
\end{methoddesc}
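
The effect of \var{mangle_from_} can be sketched as follows, assuming a
Python 3 interpreter; the helper \code{flatten_with()} is a hypothetical
convenience defined only for this example.

```python
from io import StringIO
from email.generator import Generator
from email.message import Message

msg = Message()
msg['Subject'] = 'demo'
msg.set_payload('From the top\nsecond line\n')


def flatten_with(message, mangle):
    """Flatten message to text using the given mangle_from_ setting."""
    fp = StringIO()
    Generator(fp, mangle_from_=mangle).flatten(message)
    return fp.getvalue()


mangled = flatten_with(msg, True)    # body line becomes ">From the top"
plain = flatten_with(msg, False)     # body line is left untouched
print(mangled)
```

Only body lines that begin with \samp{From } are affected; headers are written
the same way in both cases.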
|
||||
|
||||
As a convenience, see the methods \method{Message.as_string()} and
|
||||
\code{str(aMessage)}, a.k.a. \method{Message.__str__()}, which
|
||||
simplify the generation of a formatted string representation of a
|
||||
message object. For more detail, see \refmodule{email.message}.
|
||||
|
||||
The \module{email.generator} module also provides a derived class,
|
||||
called \class{DecodedGenerator}, which is like the \class{Generator}
|
||||
base class, except that non-\mimetype{text} parts are substituted with
|
||||
a format string representing the part.
|
||||
|
||||
\begin{classdesc}{DecodedGenerator}{outfp\optional{, mangle_from_\optional{,
|
||||
maxheaderlen\optional{, fmt}}}}
|
||||
|
||||
This class, derived from \class{Generator}, walks through all the
|
||||
subparts of a message. If the subpart is of main type
|
||||
\mimetype{text}, then it prints the decoded payload of the subpart.
|
||||
Optional \var{mangle_from_} and \var{maxheaderlen} are as with the
|
||||
\class{Generator} base class.
|
||||
|
||||
If the subpart is not of main type \mimetype{text}, optional \var{fmt}
|
||||
is a format string that is used instead of the message payload.
|
||||
\var{fmt} is expanded with the following keywords, \samp{\%(keyword)s}
|
||||
format:
|
||||
|
||||
\begin{itemize}
|
||||
\item \code{type} -- Full MIME type of the non-\mimetype{text} part
|
||||
|
||||
\item \code{maintype} -- Main MIME type of the non-\mimetype{text} part
|
||||
|
||||
\item \code{subtype} -- Sub-MIME type of the non-\mimetype{text} part
|
||||
|
||||
\item \code{filename} -- Filename of the non-\mimetype{text} part
|
||||
|
||||
\item \code{description} -- Description associated with the
|
||||
non-\mimetype{text} part
|
||||
|
||||
\item \code{encoding} -- Content transfer encoding of the
|
||||
non-\mimetype{text} part
|
||||
|
||||
\end{itemize}
|
||||
|
||||
The default value for \var{fmt} is \code{None}, meaning
|
||||
|
||||
\begin{verbatim}
|
||||
[Non-text (%(type)s) part of message omitted, filename %(filename)s]
|
||||
\end{verbatim}
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{classdesc}
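
A small sketch of \class{DecodedGenerator} with a custom \var{fmt}, assuming a
Python 3 interpreter. The attachment bytes, the filename \code{data.bin} and
the format string are invented for the example.

```python
from io import StringIO
from email.generator import DecodedGenerator
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

msg = MIMEMultipart()
msg.attach(MIMEText('hello world\n'))

blob = MIMEApplication(b'\x00\x01\x02')
blob.add_header('Content-Disposition', 'attachment', filename='data.bin')
msg.attach(blob)

fp = StringIO()
# Non-text parts are replaced by the expanded format string.
DecodedGenerator(fp, fmt='[%(type)s omitted: %(filename)s]').flatten(msg)
output = fp.getvalue()
print(output)
```

The text part appears in the output as-is, while the binary part is reduced to
a single placeholder line.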
|
||||
|
||||
\versionchanged[The previously deprecated method \method{__call__()} was
|
||||
removed]{2.5}
|
|
@ -1,178 +0,0 @@
|
|||
\declaremodule{standard}{email.header}
|
||||
\modulesynopsis{Representing non-ASCII headers}
|
||||
|
||||
\rfc{2822} is the base standard that describes the format of email
|
||||
messages. It derives from the older \rfc{822} standard which came
|
||||
into widespread use at a time when most email was composed of \ASCII{}
|
||||
characters only. \rfc{2822} is a specification written assuming email
|
||||
contains only 7-bit \ASCII{} characters.
|
||||
|
||||
Of course, as email has been deployed worldwide, it has become
|
||||
internationalized, such that language-specific character sets can now
|
||||
be used in email messages. The base standard still requires email
|
||||
messages to be transferred using only 7-bit \ASCII{} characters, so a
|
||||
slew of RFCs have been written describing how to encode email
|
||||
containing non-\ASCII{} characters into \rfc{2822}-compliant format.
|
||||
These RFCs include \rfc{2045}, \rfc{2046}, \rfc{2047}, and \rfc{2231}.
|
||||
The \module{email} package supports these standards in its
|
||||
\module{email.header} and \module{email.charset} modules.
|
||||
|
||||
If you want to include non-\ASCII{} characters in your email headers,
|
||||
say in the \mailheader{Subject} or \mailheader{To} fields, you should
|
||||
use the \class{Header} class and assign the field in the
|
||||
\class{Message} object to an instance of \class{Header} instead of
|
||||
using a string for the header value. Import the \class{Header} class from the
|
||||
\module{email.header} module. For example:
|
||||
|
||||
\begin{verbatim}
|
||||
>>> from email.message import Message
|
||||
>>> from email.header import Header
|
||||
>>> msg = Message()
|
||||
>>> h = Header('p\xf6stal', 'iso-8859-1')
|
||||
>>> msg['Subject'] = h
|
||||
>>> print msg.as_string()
|
||||
Subject: =?iso-8859-1?q?p=F6stal?=
|
||||
|
||||
|
||||
\end{verbatim}
|
||||
|
||||
Notice here how we wanted the \mailheader{Subject} field to contain a
|
||||
non-\ASCII{} character? We did this by creating a \class{Header}
|
||||
instance and passing in the character set that the byte string was
|
||||
encoded in. When the subsequent \class{Message} instance was
|
||||
flattened, the \mailheader{Subject} field was properly \rfc{2047}
|
||||
encoded. MIME-aware mail readers would show this header using the
|
||||
embedded ISO-8859-1 character.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
|
||||
Here is the \class{Header} class description:
|
||||
|
||||
\begin{classdesc}{Header}{\optional{s\optional{, charset\optional{,
|
||||
maxlinelen\optional{, header_name\optional{, continuation_ws\optional{,
|
||||
errors}}}}}}}
|
||||
Create a MIME-compliant header that can contain strings in different
|
||||
character sets.
|
||||
|
||||
Optional \var{s} is the initial header value. If \code{None} (the
|
||||
default), the initial header value is not set. You can later append
|
||||
to the header with \method{append()} method calls. \var{s} may be a
|
||||
byte string or a Unicode string, but see the \method{append()}
|
||||
documentation for semantics.
|
||||
|
||||
Optional \var{charset} serves two purposes: it has the same meaning as
|
||||
the \var{charset} argument to the \method{append()} method. It also
|
||||
sets the default character set for all subsequent \method{append()}
|
||||
calls that omit the \var{charset} argument. If \var{charset} is not
|
||||
provided in the constructor (the default), the \code{us-ascii}
|
||||
character set is used both as \var{s}'s initial charset and as the
|
||||
default for subsequent \method{append()} calls.
|
||||
|
||||
The maximum line length can be specified explicitly via
|
||||
\var{maxlinelen}. For splitting the first line to a shorter value (to
|
||||
account for the field header which isn't included in \var{s},
|
||||
e.g. \mailheader{Subject}) pass in the name of the field in
|
||||
\var{header_name}. The default \var{maxlinelen} is 76, and the
|
||||
default value for \var{header_name} is \code{None}, meaning it is not
|
||||
taken into account for the first line of a long, split header.
|
||||
|
||||
Optional \var{continuation_ws} must be \rfc{2822}-compliant folding
|
||||
whitespace, and is usually either a space or a hard tab character.
|
||||
This character will be prepended to continuation lines.
|
||||
\end{classdesc}
|
||||
|
||||
Optional \var{errors} is passed straight through to the
|
||||
\method{append()} method.
|
||||
|
||||
\begin{methoddesc}[Header]{append}{s\optional{, charset\optional{, errors}}}
|
||||
Append the string \var{s} to the MIME header.
|
||||
|
||||
Optional \var{charset}, if given, should be a \class{Charset} instance
|
||||
(see \refmodule{email.charset}) or the name of a character set, which
|
||||
will be converted to a \class{Charset} instance. A value of
|
||||
\code{None} (the default) means that the \var{charset} given in the
|
||||
constructor is used.
|
||||
|
||||
\var{s} may be a byte string or a Unicode string. If it is a byte
|
||||
string (i.e. \code{isinstance(s, str)} is true), then
|
||||
\var{charset} is the encoding of that byte string, and a
|
||||
\exception{UnicodeError} will be raised if the string cannot be
|
||||
decoded with that character set.
|
||||
|
||||
If \var{s} is a Unicode string, then \var{charset} is a hint
|
||||
specifying the character set of the characters in the string. In this
|
||||
case, when producing an \rfc{2822}-compliant header using \rfc{2047}
|
||||
rules, the Unicode string will be encoded using the following charsets
|
||||
in order: \code{us-ascii}, the \var{charset} hint, \code{utf-8}. The
|
||||
first character set to not provoke a \exception{UnicodeError} is used.
|
||||
|
||||
Optional \var{errors} is passed through to any \function{unicode()} or
|
||||
\function{ustr.encode()} call, and defaults to ``strict''.
|
||||
\end{methoddesc}
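
As an illustration of mixing character sets in one header, the following
sketch appends a second chunk with its own \var{charset}. It assumes a
Python 3 interpreter, where header text is passed as Unicode strings.

```python
from email.header import Header

# The first chunk is plain ASCII.
h = Header('Hello', 'us-ascii')

# The second chunk carries a non-ASCII character and its own charset.
h.append('p\xf6stal', 'iso-8859-1')

encoded = h.encode()
print(encoded)
```

Only the chunk that needs it should end up as an \rfc{2047} encoded word; the
ASCII chunk is expected to stay readable.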
|
||||
|
||||
\begin{methoddesc}[Header]{encode}{\optional{splitchars}}
|
||||
Encode a message header into an RFC-compliant format, possibly
|
||||
wrapping long lines and encapsulating non-\ASCII{} parts in base64 or
|
||||
quoted-printable encodings. Optional \var{splitchars} is a string
|
||||
containing characters to split long ASCII lines on, in rough support
|
||||
of \rfc{2822}'s \emph{highest level syntactic breaks}. This doesn't
|
||||
affect \rfc{2047} encoded lines.
|
||||
\end{methoddesc}
|
||||
|
||||
The \class{Header} class also provides a number of methods to support
|
||||
standard operators and built-in functions.
|
||||
|
||||
\begin{methoddesc}[Header]{__str__}{}
|
||||
A synonym for \method{Header.encode()}. Useful for
|
||||
\code{str(aHeader)}.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Header]{__unicode__}{}
|
||||
A helper for the built-in \function{unicode()} function. Returns the
|
||||
header as a Unicode string.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Header]{__eq__}{other}
|
||||
This method allows you to compare two \class{Header} instances for equality.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Header]{__ne__}{other}
|
||||
This method allows you to compare two \class{Header} instances for inequality.
|
||||
\end{methoddesc}
|
||||
|
||||
The \module{email.header} module also provides the following
|
||||
convenient functions.
|
||||
|
||||
\begin{funcdesc}{decode_header}{header}
|
||||
Decode a message header value without converting the character set.
|
||||
The header value is in \var{header}.
|
||||
|
||||
This function returns a list of \code{(decoded_string, charset)} pairs
|
||||
containing each of the decoded parts of the header. \var{charset} is
|
||||
\code{None} for non-encoded parts of the header, otherwise a lower
|
||||
case string containing the name of the character set specified in the
|
||||
encoded string.
|
||||
|
||||
Here's an example:
|
||||
|
||||
\begin{verbatim}
|
||||
>>> from email.header import decode_header
|
||||
>>> decode_header('=?iso-8859-1?q?p=F6stal?=')
|
||||
[('p\xf6stal', 'iso-8859-1')]
|
||||
\end{verbatim}
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{make_header}{decoded_seq\optional{, maxlinelen\optional{,
|
||||
header_name\optional{, continuation_ws}}}}
|
||||
Create a \class{Header} instance from a sequence of pairs as returned
|
||||
by \function{decode_header()}.
|
||||
|
||||
\function{decode_header()} takes a header value string and returns a
|
||||
sequence of pairs of the format \code{(decoded_string, charset)} where
|
||||
\var{charset} is the name of the character set.
|
||||
|
||||
This function takes one of those sequences of pairs and returns a
|
||||
\class{Header} instance. Optional \var{maxlinelen},
|
||||
\var{header_name}, and \var{continuation_ws} are as in the
|
||||
\class{Header} constructor.
|
||||
\end{funcdesc}
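
The two functions are designed to be used together. A minimal round-trip
sketch, assuming a Python 3 interpreter (where the decoded parts are
\code{bytes} objects rather than byte strings of type \code{str}):

```python
from email.header import decode_header, make_header

# Split an RFC 2047 encoded header value into (bytes, charset) pairs.
parts = decode_header('=?iso-8859-1?q?p=F6stal?=')

# Rebuild a Header instance from those pairs.
header = make_header(parts)

text = str(header)      # the header as a Unicode string
print(parts, text)
```

Converting the rebuilt \class{Header} to a string gives the decoded text, which
is usually what an application wants to display.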
|
|
@ -1,65 +0,0 @@
|
|||
\declaremodule{standard}{email.iterators}
|
||||
\modulesynopsis{Iterate over a message object tree.}
|
||||
|
||||
Iterating over a message object tree is fairly easy with the
|
||||
\method{Message.walk()} method. The \module{email.iterators} module
|
||||
provides some useful higher level iterations over message object
|
||||
trees.
|
||||
|
||||
\begin{funcdesc}{body_line_iterator}{msg\optional{, decode}}
|
||||
This iterates over all the payloads in all the subparts of \var{msg},
|
||||
returning the string payloads line-by-line. It skips over all the
|
||||
subpart headers, and it skips over any subpart with a payload that
|
||||
isn't a Python string. This is somewhat equivalent to reading the
|
||||
flat text representation of the message from a file using
|
||||
\method{readline()}, skipping over all the intervening headers.
|
||||
|
||||
Optional \var{decode} is passed through to \method{Message.get_payload()}.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{typed_subpart_iterator}{msg\optional{,
|
||||
maintype\optional{, subtype}}}
|
||||
This iterates over all the subparts of \var{msg}, returning only those
|
||||
subparts that match the MIME type specified by \var{maintype} and
|
||||
\var{subtype}.
|
||||
|
||||
Note that \var{subtype} is optional; if omitted, then subpart MIME
|
||||
type matching is done only with the main type. \var{maintype} is
|
||||
optional too; it defaults to \mimetype{text}.
|
||||
|
||||
Thus, by default \function{typed_subpart_iterator()} returns each
|
||||
subpart that has a MIME type of \mimetype{text/*}.
|
||||
\end{funcdesc}
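
Both iterators can be sketched on a small two-part message. This assumes a
Python 3 interpreter; the message contents are invented.

```python
from email.iterators import body_line_iterator, typed_subpart_iterator
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

msg = MIMEMultipart()
msg.attach(MIMEText('one\ntwo\n'))
msg.attach(MIMEText('<b>three</b>\n', 'html'))

# Body lines of every string payload, with subpart headers skipped.
lines = list(body_line_iterator(msg))

# With no arguments beyond msg, every text/* subpart matches.
text_parts = list(typed_subpart_iterator(msg))

# Narrow the match to text/html only.
html_parts = list(typed_subpart_iterator(msg, 'text', 'html'))

print(lines, len(text_parts), len(html_parts))
```

The multipart container itself is skipped by \function{body_line_iterator()}
because its payload is a list rather than a string.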
|
||||
|
||||
The following function has been added as a useful debugging tool. It
|
||||
should \emph{not} be considered part of the supported public interface
|
||||
for the package.
|
||||
|
||||
\begin{funcdesc}{_structure}{msg\optional{, fp\optional{, level}}}
|
||||
Prints an indented representation of the content types of the
|
||||
message object structure. For example:
|
||||
|
||||
\begin{verbatim}
|
||||
>>> msg = email.message_from_file(somefile)
|
||||
>>> _structure(msg)
|
||||
multipart/mixed
|
||||
text/plain
|
||||
text/plain
|
||||
multipart/digest
|
||||
message/rfc822
|
||||
text/plain
|
||||
message/rfc822
|
||||
text/plain
|
||||
message/rfc822
|
||||
text/plain
|
||||
message/rfc822
|
||||
text/plain
|
||||
message/rfc822
|
||||
text/plain
|
||||
text/plain
|
||||
\end{verbatim}
|
||||
|
||||
Optional \var{fp} is a file-like object to print the output to. It
|
||||
must be suitable for Python's extended print statement. \var{level}
|
||||
is used internally.
|
||||
\end{funcdesc}
|
|
@ -1,561 +0,0 @@
|
|||
\declaremodule{standard}{email.message}
|
||||
\modulesynopsis{The base class representing email messages.}
|
||||
|
||||
The central class in the \module{email} package is the
|
||||
\class{Message} class, imported from the \module{email.message} module. It is
|
||||
the base class for the \module{email} object model. \class{Message} provides
|
||||
the core functionality for setting and querying header fields, and for
|
||||
accessing message bodies.
|
||||
|
||||
Conceptually, a \class{Message} object consists of \emph{headers} and
|
||||
\emph{payloads}. Headers are \rfc{2822} style field names and
|
||||
values where the field name and value are separated by a colon. The
|
||||
colon is not part of either the field name or the field value.
|
||||
|
||||
Headers are stored and returned in case-preserving form but are
|
||||
matched case-insensitively. There may also be a single envelope
|
||||
header, also known as the \emph{Unix-From} header or the
|
||||
\code{From_} header. The payload is either a string in the case of
|
||||
simple message objects or a list of \class{Message} objects for
|
||||
MIME container documents (e.g. \mimetype{multipart/*} and
|
||||
\mimetype{message/rfc822}).
|
||||
|
||||
\class{Message} objects provide a mapping style interface for
|
||||
accessing the message headers, and an explicit interface for accessing
|
||||
both the headers and the payload. It provides convenience methods for
|
||||
generating a flat text representation of the message object tree, for
|
||||
accessing commonly used header parameters, and for recursively walking
|
||||
over the object tree.
|
||||
|
||||
Here are the methods of the \class{Message} class:
|
||||
|
||||
\begin{classdesc}{Message}{}
|
||||
The constructor takes no arguments.
|
||||
\end{classdesc}
|
||||
|
||||
\begin{methoddesc}[Message]{as_string}{\optional{unixfrom}}
|
||||
Return the entire message flattened as a string. When optional
|
||||
\var{unixfrom} is \code{True}, the envelope header is included in the
|
||||
returned string. \var{unixfrom} defaults to \code{False}.
|
||||
|
||||
Note that this method is provided as a convenience and may not always format
|
||||
the message the way you want. For example, by default it mangles lines that
|
||||
begin with \code{From }. For more flexibility, instantiate a
|
||||
\class{Generator} instance and use its
|
||||
\method{flatten()} method directly. For example:
|
||||
|
||||
\begin{verbatim}
|
||||
from cStringIO import StringIO
|
||||
from email.generator import Generator
|
||||
fp = StringIO()
|
||||
g = Generator(fp, mangle_from_=False, maxheaderlen=60)
|
||||
g.flatten(msg)
|
||||
text = fp.getvalue()
|
||||
\end{verbatim}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{__str__}{}
|
||||
Equivalent to \method{as_string(unixfrom=True)}.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{is_multipart}{}
|
||||
Return \code{True} if the message's payload is a list of
|
||||
sub-\class{Message} objects, otherwise return \code{False}. When
|
||||
\method{is_multipart()} returns \code{False}, the payload should be a string
|
||||
object.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{set_unixfrom}{unixfrom}
|
||||
Set the message's envelope header to \var{unixfrom}, which should be a string.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{get_unixfrom}{}
|
||||
Return the message's envelope header. Defaults to \code{None} if the
|
||||
envelope header was never set.
|
||||
\end{methoddesc}
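
A short sketch of the envelope header accessors together with
\method{as_string()}, assuming a Python 3 interpreter. The envelope line is a
made-up example in the usual \UNIX{} mailbox style.

```python
from email.message import Message

msg = Message()
msg['Subject'] = 'demo'
msg.set_payload('body\n')

before = msg.get_unixfrom()      # no envelope header has been set yet

msg.set_unixfrom('From sender@example.com Sat Jan  1 00:00:00 2000')

with_envelope = msg.as_string(unixfrom=True)
without_envelope = msg.as_string()
print(with_envelope)
```

The envelope line is emitted only when \var{unixfrom} is true; it never appears
among the headers reachable through the mapping interface.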
|
||||
|
||||
\begin{methoddesc}[Message]{attach}{payload}
|
||||
Add the given \var{payload} to the current payload, which must be
|
||||
\code{None} or a list of \class{Message} objects before the call.
|
||||
After the call, the payload will always be a list of \class{Message}
|
||||
objects. If you want to set the payload to a scalar object (e.g. a
|
||||
string), use \method{set_payload()} instead.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{get_payload}{\optional{i\optional{, decode}}}
|
||||
Return a reference to the current payload, which will be a list of
|
||||
\class{Message} objects when \method{is_multipart()} is \code{True}, or a
|
||||
string when \method{is_multipart()} is \code{False}. If the
|
||||
payload is a list and you mutate the list object, you modify the
|
||||
message's payload in place.
|
||||
|
||||
With optional argument \var{i}, \method{get_payload()} will return the
|
||||
\var{i}-th element of the payload, counting from zero, if
|
||||
\method{is_multipart()} is \code{True}. An \exception{IndexError}
|
||||
will be raised if \var{i} is less than 0 or greater than or equal to
|
||||
the number of items in the payload. If the payload is a string
|
||||
(i.e. \method{is_multipart()} is \code{False}) and \var{i} is given, a
|
||||
\exception{TypeError} is raised.
|
||||
|
||||
Optional \var{decode} is a flag indicating whether the payload should be
|
||||
decoded or not, according to the \mailheader{Content-Transfer-Encoding} header.
|
||||
When \code{True} and the message is not a multipart, the payload will be
|
||||
decoded if this header's value is \samp{quoted-printable} or
|
||||
\samp{base64}. If some other encoding is used, or
|
||||
the \mailheader{Content-Transfer-Encoding} header is
|
||||
missing, or if the payload has bogus base64 data, the payload is
|
||||
returned as-is (undecoded). If the message is a multipart and the
|
||||
\var{decode} flag is \code{True}, then \code{None} is returned. The
|
||||
default for \var{decode} is \code{False}.
|
||||
\end{methoddesc}
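
The interaction of \var{decode} and \var{i} with a non-multipart message can be
sketched as follows, assuming a Python 3 interpreter (where a decoded payload
is a \code{bytes} object):

```python
from email.message import Message

msg = Message()
msg['Content-Transfer-Encoding'] = 'base64'
msg.set_payload('aGVsbG8=')          # base64 for b'hello'

as_stored = msg.get_payload()             # returned undecoded
decoded = msg.get_payload(decode=True)    # decoded according to the header

try:
    # Indexing only makes sense for multipart payloads.
    msg.get_payload(0)
    index_error = None
except TypeError as exc:
    index_error = exc

print(as_stored, decoded)
```

Passing \var{i} for a string payload fails with \exception{TypeError}, as
described above, rather than silently returning the whole payload.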
|
||||
|
||||
\begin{methoddesc}[Message]{set_payload}{payload\optional{, charset}}
|
||||
Set the entire message object's payload to \var{payload}. It is the
|
||||
client's responsibility to ensure the payload invariants. Optional
|
||||
\var{charset} sets the message's default character set; see
|
||||
\method{set_charset()} for details.
|
||||
|
||||
\versionchanged[\var{charset} argument added]{2.2.2}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{set_charset}{charset}
|
||||
Set the character set of the payload to \var{charset}, which can
|
||||
either be a \class{Charset} instance (see \refmodule{email.charset}), a
|
||||
string naming a character set,
|
||||
or \code{None}. If it is a string, it will be converted to a
|
||||
\class{Charset} instance. If \var{charset} is \code{None}, the
|
||||
\code{charset} parameter will be removed from the
|
||||
\mailheader{Content-Type} header. Anything else will generate a
|
||||
\exception{TypeError}.
|
||||
|
||||
The message will be assumed to be of type \mimetype{text/*} encoded with
|
||||
\var{charset.input_charset}. It will be converted to
|
||||
\var{charset.output_charset}
|
||||
and encoded properly, if needed, when generating the plain text
|
||||
representation of the message. MIME headers
|
||||
(\mailheader{MIME-Version}, \mailheader{Content-Type},
|
||||
\mailheader{Content-Transfer-Encoding}) will be added as needed.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
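
A hedged sketch of what \method{set_charset()} does to a bare message,
assuming a Python 3 interpreter. The exact header values and transfer encoding
chosen may differ between versions, so the example only inspects them.

```python
from email.charset import Charset
from email.message import Message

msg = Message()
msg.set_payload('hello')

# A string is converted to a Charset instance internally.
msg.set_charset('utf-8')

charset_obj = msg.get_charset()
declared = msg.get_content_charset()      # charset parameter of Content-Type
has_mime_version = 'MIME-Version' in msg
decoded = msg.get_payload(decode=True)
print(msg['Content-Type'], msg['Content-Transfer-Encoding'])
```

The MIME headers are added as a side effect, so calling this method is often
enough to turn a plain message into a valid single-part MIME message.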
|
||||
|
||||
\begin{methoddesc}[Message]{get_charset}{}
|
||||
Return the \class{Charset} instance associated with the message's payload.
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
|
||||
|
||||
The following methods implement a mapping-like interface for accessing
|
||||
the message's \rfc{2822} headers. Note that there are some
|
||||
semantic differences between these methods and a normal mapping
|
||||
(i.e. dictionary) interface. For example, in a dictionary there are
|
||||
no duplicate keys, but here there may be duplicate message headers. Also,
|
||||
in dictionaries there is no guaranteed order to the keys returned by
|
||||
\method{keys()}, but in a \class{Message} object, headers are always
|
||||
returned in the order they appeared in the original message, or were
|
||||
added to the message later. Any header deleted and then re-added is
|
||||
always appended to the end of the header list.
|
||||
|
||||
These semantic differences are intentional and are biased toward
|
||||
maximal convenience.
|
||||
|
||||
Note that in all cases, any envelope header present in the message is
|
||||
not included in the mapping interface.
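
The differences from a plain dictionary described above can be seen in a
short session. This is only a sketch, assuming a current Python 3
interpreter; the header values are invented.

\begin{verbatim}
from email.message import Message

msg = Message()
msg['Received'] = 'from a.example'
msg['Subject'] = 'demo'
msg['Received'] = 'from b.example'   # a duplicate header is kept

count = len(msg)                     # duplicates are counted
before = msg.keys()                  # headers in insertion order

# A header that is deleted and re-added moves to the end of the list.
del msg['Subject']
msg['Subject'] = 'demo again'
after = msg.keys()
\end{verbatim}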
|
||||
|
||||
\begin{methoddesc}[Message]{__len__}{}
|
||||
Return the total number of headers, including duplicates.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{__contains__}{name}
|
||||
Return true if the message object has a field named \var{name}.
|
||||
Matching is done case-insensitively and \var{name} should not include the
|
||||
trailing colon. Used for the \code{in} operator,
|
||||
e.g.:
|
||||
|
||||
\begin{verbatim}
|
||||
if 'message-id' in myMessage:
|
||||
print 'Message-ID:', myMessage['message-id']
|
||||
\end{verbatim}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{__getitem__}{name}
|
||||
Return the value of the named header field. \var{name} should not
|
||||
include the colon field separator. If the header is missing,
|
||||
\code{None} is returned; a \exception{KeyError} is never raised.
|
||||
|
||||
Note that if the named field appears more than once in the message's
|
||||
headers, exactly which of those field values will be returned is
|
||||
undefined. Use the \method{get_all()} method to get the values of all
|
||||
the extant named headers.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{__setitem__}{name, val}
|
||||
Add a header to the message with field name \var{name} and value
|
||||
\var{val}. The field is appended to the end of the message's existing
|
||||
fields.
|
||||
|
||||
Note that this does \emph{not} overwrite or delete any existing header
|
||||
with the same name. If you want to ensure that the new header is the
|
||||
only one present in the message with field name
|
||||
\var{name}, delete the field first, e.g.:
|
||||
|
||||
\begin{verbatim}
|
||||
del msg['subject']
|
||||
msg['subject'] = 'Python roolz!'
|
||||
\end{verbatim}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{__delitem__}{name}
|
||||
Delete all occurrences of the field with name \var{name} from the
|
||||
message's headers. No exception is raised if the named field isn't
|
||||
present in the headers.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{has_key}{name}
|
||||
Return true if the message contains a header field named \var{name},
|
||||
otherwise return false.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{keys}{}
|
||||
Return a list of all the message's header field names.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{values}{}
|
||||
Return a list of all the message's field values.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{items}{}
|
||||
Return a list of 2-tuples containing all the message's field headers
|
||||
and values.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{get}{name\optional{, failobj}}
|
||||
Return the value of the named header field. This is identical to
|
||||
\method{__getitem__()} except that optional \var{failobj} is returned
|
||||
if the named header is missing (defaults to \code{None}).
|
||||
\end{methoddesc}
|
||||
|
||||
Here are some additional useful methods:
|
||||
|
||||
\begin{methoddesc}[Message]{get_all}{name\optional{, failobj}}
|
||||
Return a list of all the values for the field named \var{name}.
|
||||
If there are no such named headers in the message, \var{failobj} is
|
||||
returned (defaults to \code{None}).
|
||||
\end{methoddesc}
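
The lookup methods can be compared side by side. The following is a
minimal sketch, assuming a current Python 3 interpreter; the addresses
and the \code{X-Missing} header name are hypothetical.

\begin{verbatim}
from email.message import Message

msg = Message()
msg['To'] = 'a@example.com'
msg['To'] = 'b@example.com'

missing = msg['X-Missing']              # None, never a KeyError
fallback = msg.get('X-Missing', 'n/a')  # failobj replaces None
all_to = msg.get_all('to')              # every value, case-insensitive
no_cc = msg.get_all('Cc', [])           # failobj when nothing matches
\end{verbatim}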
|
||||
|
||||
\begin{methoddesc}[Message]{add_header}{_name, _value, **_params}
|
||||
Extended header setting. This method is similar to
|
||||
\method{__setitem__()} except that additional header parameters can be
|
||||
provided as keyword arguments. \var{_name} is the header field to add
|
||||
and \var{_value} is the \emph{primary} value for the header.
|
||||
|
||||
For each item in the keyword argument dictionary \var{_params}, the
|
||||
key is taken as the parameter name, with underscores converted to
|
||||
dashes (since dashes are illegal in Python identifiers). Normally,
|
||||
the parameter will be added as \code{key="value"} unless the value is
|
||||
\code{None}, in which case only the key will be added.
|
||||
|
||||
Here's an example:
|
||||
|
||||
\begin{verbatim}
|
||||
msg.add_header('Content-Disposition', 'attachment', filename='bud.gif')
|
||||
\end{verbatim}
|
||||
|
||||
This will add a header that looks like
|
||||
|
||||
\begin{verbatim}
|
||||
Content-Disposition: attachment; filename="bud.gif"
|
||||
\end{verbatim}
|
||||
\end{methoddesc}
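
The underscore conversion and the treatment of \code{None} values in
\method{add_header()} can be sketched as follows. The \code{X-Demo}
header and both parameter names are hypothetical, and a current Python 3
interpreter is assumed.

\begin{verbatim}
from email.message import Message

msg = Message()
# sort_key becomes the parameter "sort-key"; a value of None adds
# only the key, without an equals sign or value.
msg.add_header('X-Demo', 'primary', sort_key='abc', no_value=None)
value = msg['X-Demo']
print(value)
\end{verbatim}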
|
||||
|
||||
\begin{methoddesc}[Message]{replace_header}{_name, _value}
|
||||
Replace a header. Replace the first header found in the message that
|
||||
matches \var{_name}, retaining header order and field name case. If
|
||||
no matching header was found, a \exception{KeyError} is raised.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
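
A small sketch of \method{replace_header()}, assuming a current Python 3
interpreter and made-up header values, shows that the header keeps its
position and original field name case:

\begin{verbatim}
from email.message import Message

msg = Message()
msg['Subject'] = 'first'
msg['From'] = 'a@example.com'

msg.replace_header('subject', 'second')   # matched case-insensitively
order = msg.keys()
subject = msg['Subject']

# Replacing a header that does not exist raises KeyError.
try:
    msg.replace_header('X-Missing', 'value')
    raised = False
except KeyError:
    raised = True
\end{verbatim}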
|
||||
|
||||
\begin{methoddesc}[Message]{get_content_type}{}
|
||||
Return the message's content type. The returned string is coerced to
|
||||
lower case of the form \mimetype{maintype/subtype}. If there was no
|
||||
\mailheader{Content-Type} header in the message, the default type as
|
||||
given by \method{get_default_type()} will be returned. Since
|
||||
according to \rfc{2045}, messages always have a default type,
|
||||
\method{get_content_type()} will always return a value.
|
||||
|
||||
\rfc{2045} defines a message's default type to be
|
||||
\mimetype{text/plain} unless it appears inside a
|
||||
\mimetype{multipart/digest} container, in which case it would be
|
||||
\mimetype{message/rfc822}. If the \mailheader{Content-Type} header
|
||||
has an invalid type specification, \rfc{2045} mandates that the
|
||||
default type be \mimetype{text/plain}.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{get_content_maintype}{}
|
||||
Return the message's main content type. This is the
|
||||
\mimetype{maintype} part of the string returned by
|
||||
\method{get_content_type()}.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{get_content_subtype}{}
|
||||
Return the message's sub-content type. This is the \mimetype{subtype}
|
||||
part of the string returned by \method{get_content_type()}.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{get_default_type}{}
|
||||
Return the default content type. Most messages have a default content
|
||||
type of \mimetype{text/plain}, except for messages that are subparts
|
||||
of \mimetype{multipart/digest} containers. Such subparts have a
|
||||
default content type of \mimetype{message/rfc822}.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{set_default_type}{ctype}
|
||||
Set the default content type. \var{ctype} should either be
|
||||
\mimetype{text/plain} or \mimetype{message/rfc822}, although this is
|
||||
not enforced. The default content type is not stored in the
|
||||
\mailheader{Content-Type} header.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
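
The interplay between the content type methods and the default type can
be sketched as follows. This assumes a current Python 3 interpreter; the
header values are invented for illustration.

\begin{verbatim}
from email.message import Message

msg = Message()
no_header = msg.get_content_type()      # falls back to the default type

# A subpart of a multipart/digest container would use this default.
msg.set_default_type('message/rfc822')
digest_default = msg.get_content_type()

# Once a Content-Type header exists it wins, coerced to lower case.
msg['Content-Type'] = 'Text/HTML; charset="utf-8"'
full = msg.get_content_type()
main = msg.get_content_maintype()
sub = msg.get_content_subtype()

# An invalid type specification yields text/plain.
broken = Message()
broken['Content-Type'] = 'not-a-valid-type'
invalid = broken.get_content_type()
\end{verbatim}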
|
||||
|
||||
\begin{methoddesc}[Message]{get_params}{\optional{failobj\optional{,
|
||||
header\optional{, unquote}}}}
|
||||
Return the message's \mailheader{Content-Type} parameters, as a list. The
|
||||
elements of the returned list are 2-tuples of key/value pairs, as
|
||||
split on the \character{=} sign. The left hand side of the
|
||||
\character{=} is the key, while the right hand side is the value. If
|
||||
there is no \character{=} sign in the parameter the value is the empty
|
||||
string, otherwise the value is as described in \method{get_param()} and is
|
||||
unquoted if optional \var{unquote} is \code{True} (the default).
|
||||
|
||||
Optional \var{failobj} is the object to return if there is no
|
||||
\mailheader{Content-Type} header. Optional \var{header} is the header to
|
||||
search instead of \mailheader{Content-Type}.
|
||||
|
||||
\versionchanged[\var{unquote} argument added]{2.2.2}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{get_param}{param\optional{,
|
||||
failobj\optional{, header\optional{, unquote}}}}
|
||||
Return the value of the \mailheader{Content-Type} header's parameter
|
||||
\var{param} as a string. If the message has no \mailheader{Content-Type}
|
||||
header or if there is no such parameter, then \var{failobj} is
|
||||
returned (defaults to \code{None}).
|
||||
|
||||
Optional \var{header}, if given, specifies the message header to use
|
||||
instead of \mailheader{Content-Type}.
|
||||
|
||||
Parameter keys are always compared case insensitively. The return
|
||||
value can either be a string, or a 3-tuple if the parameter was
|
||||
\rfc{2231} encoded. When it's a 3-tuple, the elements of the value are of
|
||||
the form \code{(CHARSET, LANGUAGE, VALUE)}. Note that both \code{CHARSET} and
|
||||
\code{LANGUAGE} can be \code{None}, in which case you should consider
|
||||
\code{VALUE} to be encoded in the \code{us-ascii} charset. You can
|
||||
usually ignore \code{LANGUAGE}.
|
||||
|
||||
If your application doesn't care whether the parameter was encoded as in
|
||||
\rfc{2231}, you can collapse the parameter value by calling
|
||||
\function{email.Utils.collapse_rfc2231_value()}, passing in the return value
|
||||
from \method{get_param()}. This will return a suitably decoded Unicode string
|
||||
when the value is a tuple, or the original string unquoted if it isn't.  For
|
||||
example:
|
||||
|
||||
\begin{verbatim}
|
||||
rawparam = msg.get_param('foo')
|
||||
param = email.Utils.collapse_rfc2231_value(rawparam)
|
||||
\end{verbatim}
|
||||
|
||||
In any case, the parameter value (either the returned string, or the
|
||||
\code{VALUE} item in the 3-tuple) is always unquoted, unless
|
||||
\var{unquote} is set to \code{False}.
|
||||
|
||||
\versionchanged[\var{unquote} argument added, and 3-tuple return value
|
||||
possible]{2.2.2}
|
||||
\end{methoddesc}
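
The following sketch exercises \method{get_params()} and
\method{get_param()} on an ordinary parameter and on an \rfc{2231}
encoded one. It assumes a current Python 3 interpreter, where the
utility module is spelled \module{email.utils}; the header values are
made up.

\begin{verbatim}
from email.message import Message
from email.utils import collapse_rfc2231_value

msg = Message()
msg['Content-Type'] = 'text/plain; charset="iso-8859-1"; format=flowed'

params = msg.get_params()                 # list of (key, value) 2-tuples
charset = msg.get_param('CHARSET')        # keys compare case-insensitively
raw_charset = msg.get_param('charset', unquote=False)
missing = msg.get_param('missing', 'n/a')

# An RFC 2231 encoded parameter on a different header comes back as a
# (CHARSET, LANGUAGE, VALUE) 3-tuple until it is collapsed.
msg['Content-Disposition'] = "attachment; filename*=utf-8''caf%C3%A9.txt"
rawparam = msg.get_param('filename', header='Content-Disposition')
filename = collapse_rfc2231_value(rawparam)
\end{verbatim}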
|
||||
|
||||
\begin{methoddesc}[Message]{set_param}{param, value\optional{,
|
||||
header\optional{, requote\optional{, charset\optional{, language}}}}}
|
||||
|
||||
Set a parameter in the \mailheader{Content-Type} header. If the
|
||||
parameter already exists in the header, its value will be replaced
|
||||
with \var{value}.  If the \mailheader{Content-Type} header has not yet
|
||||
been defined for this message, it will be set to \mimetype{text/plain}
|
||||
and the new parameter value will be appended as per \rfc{2045}.
|
||||
|
||||
Optional \var{header} specifies an alternative header to
|
||||
\mailheader{Content-Type}, and all parameters will be quoted as
|
||||
necessary unless optional \var{requote} is \code{False} (the default
|
||||
is \code{True}).
|
||||
|
||||
If optional \var{charset} is specified, the parameter will be encoded
|
||||
according to \rfc{2231}. Optional \var{language} specifies the RFC
|
||||
2231 language, defaulting to the empty string. Both \var{charset} and
|
||||
\var{language} should be strings.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{del_param}{param\optional{, header\optional{,
|
||||
requote}}}
|
||||
Remove the given parameter completely from the
|
||||
\mailheader{Content-Type} header. The header will be re-written in
|
||||
place without the parameter or its value. All values will be quoted
|
||||
as necessary unless \var{requote} is \code{False} (the default is
|
||||
\code{True}). Optional \var{header} specifies an alternative to
|
||||
\mailheader{Content-Type}.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
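
A minimal sketch of \method{set_param()} and \method{del_param()},
assuming a current Python 3 interpreter and invented parameter values:

\begin{verbatim}
from email.message import Message

msg = Message()

# No Content-Type header yet, so text/plain is created first.
msg.set_param('charset', 'utf-8')
created = msg['Content-Type']

# Setting an existing parameter replaces its value.
msg.set_param('charset', 'iso-8859-1')
replaced = msg.get_param('charset')

# Removing the parameter rewrites the header without it.
msg.del_param('charset')
remaining = msg['Content-Type']
\end{verbatim}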
|
||||
|
||||
\begin{methoddesc}[Message]{set_type}{type\optional{, header}\optional{,
|
||||
requote}}
|
||||
Set the main type and subtype for the \mailheader{Content-Type}
|
||||
header. \var{type} must be a string in the form
|
||||
\mimetype{maintype/subtype}, otherwise a \exception{ValueError} is
|
||||
raised.
|
||||
|
||||
This method replaces the \mailheader{Content-Type} header, keeping all
|
||||
the parameters in place. If \var{requote} is \code{False}, this
|
||||
leaves the existing header's quoting as is, otherwise the parameters
|
||||
will be quoted (the default).
|
||||
|
||||
An alternative header can be specified in the \var{header} argument.
|
||||
When the \mailheader{Content-Type} header is set, a
|
||||
\mailheader{MIME-Version} header is also added.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
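
The way \method{set_type()} keeps existing parameters can be sketched
like this, assuming a current Python 3 interpreter and a made-up header
value:

\begin{verbatim}
from email.message import Message

msg = Message()
msg['Content-Type'] = 'text/plain; charset="utf-8"'

msg.set_type('text/html')            # parameters stay in place
new_type = msg.get_content_type()
charset = msg.get_param('charset')
mime_version = msg['MIME-Version']   # added along with Content-Type

# Anything that is not of the form maintype/subtype is rejected.
try:
    msg.set_type('html')
    raised = False
except ValueError:
    raised = True
\end{verbatim}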
|
||||
|
||||
\begin{methoddesc}[Message]{get_filename}{\optional{failobj}}
|
||||
Return the value of the \code{filename} parameter of the
|
||||
\mailheader{Content-Disposition} header of the message. If the header does
|
||||
not have a \code{filename} parameter, this method falls back to looking for
|
||||
the \code{name} parameter. If neither is found, or the header is missing,
|
||||
then \var{failobj} is returned. The returned string will always be unquoted
|
||||
as per \method{Utils.unquote()}.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{get_boundary}{\optional{failobj}}
|
||||
Return the value of the \code{boundary} parameter of the
|
||||
\mailheader{Content-Type} header of the message, or \var{failobj} if either
|
||||
the header is missing, or has no \code{boundary} parameter. The
|
||||
returned string will always be unquoted as per
|
||||
\method{Utils.unquote()}.
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{set_boundary}{boundary}
|
||||
Set the \code{boundary} parameter of the \mailheader{Content-Type}
|
||||
header to \var{boundary}. \method{set_boundary()} will always quote
|
||||
\var{boundary} if necessary. A \exception{HeaderParseError} is raised
|
||||
if the message object has no \mailheader{Content-Type} header.
|
||||
|
||||
Note that using this method is subtly different from deleting the old
|
||||
\mailheader{Content-Type} header and adding a new one with the new boundary
|
||||
via \method{add_header()}, because \method{set_boundary()} preserves the
|
||||
order of the \mailheader{Content-Type} header in the list of headers.
|
||||
However, it does \emph{not} preserve any continuation lines which may
|
||||
have been present in the original \mailheader{Content-Type} header.
|
||||
\end{methoddesc}
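
As an illustration of \method{get_filename()}, \method{get_boundary()}
and \method{set_boundary()}, consider the sketch below. It assumes a
current Python 3 interpreter; the boundary strings and file name are
hypothetical.

\begin{verbatim}
from email import errors
from email.message import Message

msg = Message()
msg['Content-Type'] = 'multipart/mixed; boundary="old-boundary"'
msg.add_header('Content-Disposition', 'attachment', filename='report.pdf')

old = msg.get_boundary()
msg.set_boundary('new-boundary')   # rewritten in place
new = msg.get_boundary()
order = msg.keys()                 # header order is preserved
filename = msg.get_filename()

# Without a Content-Type header there is nothing to rewrite.
bare = Message()
try:
    bare.set_boundary('x')
    raised = False
except errors.HeaderParseError:
    raised = True
no_boundary = bare.get_boundary('n/a')
\end{verbatim}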
|
||||
|
||||
\begin{methoddesc}[Message]{get_content_charset}{\optional{failobj}}
|
||||
Return the \code{charset} parameter of the \mailheader{Content-Type}
|
||||
header, coerced to lower case. If there is no
|
||||
\mailheader{Content-Type} header, or if that header has no
|
||||
\code{charset} parameter, \var{failobj} is returned.
|
||||
|
||||
Note that this method differs from \method{get_charset()} which
|
||||
returns the \class{Charset} instance for the default encoding of the
|
||||
message body.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Message]{get_charsets}{\optional{failobj}}
|
||||
Return a list containing the character set names in the message. If
|
||||
the message is a \mimetype{multipart}, then the list will contain one
|
||||
element for each subpart in the payload; otherwise, it will be a list
|
||||
of length 1.
|
||||
|
||||
Each item in the list will be a string which is the value of the
|
||||
\code{charset} parameter in the \mailheader{Content-Type} header for the
|
||||
represented subpart. However, if the subpart has no
|
||||
\mailheader{Content-Type} header, no \code{charset} parameter, or is not of
|
||||
the \mimetype{text} main MIME type, then that item in the returned list
|
||||
will be \var{failobj}.
|
||||
\end{methoddesc}
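
The difference between \method{get_content_charset()},
\method{get_charset()} and \method{get_charsets()} can be sketched as
follows. The example assumes a current Python 3 interpreter and builds a
throwaway message with the \module{email.mime} classes. Note that, as
far as this sketch relies on it, \method{get_charsets()} walks the whole
tree, so the container itself also contributes an entry.

\begin{verbatim}
from email.charset import Charset
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

outer = MIMEMultipart()
outer.attach(MIMEText('first part', 'plain', 'us-ascii'))
outer.attach(MIMEText('second part', 'plain', 'iso-8859-1'))
outer.attach(MIMEApplication(b'\x00\x01'))   # not a text part

first = outer.get_payload(0)
name = first.get_content_charset()    # string from the header parameter
instance = first.get_charset()        # Charset object for the payload

# One entry per object visited by walk(); failobj marks parts
# without a charset parameter.
charsets = outer.get_charsets('n/a')
\end{verbatim}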
|
||||
|
||||
\begin{methoddesc}[Message]{walk}{}
|
||||
The \method{walk()} method is an all-purpose generator which can be
|
||||
used to iterate over all the parts and subparts of a message object
|
||||
tree, in depth-first traversal order. You will typically use
|
||||
\method{walk()} as the iterator in a \code{for} loop; each
|
||||
iteration returns the next subpart.
|
||||
|
||||
Here's an example that prints the MIME type of every part of a
|
||||
multipart message structure:
|
||||
|
||||
\begin{verbatim}
|
||||
>>> for part in msg.walk():
|
||||
... print part.get_content_type()
|
||||
multipart/report
|
||||
text/plain
|
||||
message/delivery-status
|
||||
text/plain
|
||||
text/plain
|
||||
message/rfc822
|
||||
\end{verbatim}
|
||||
\end{methoddesc}
|
||||
|
||||
\versionchanged[The previously deprecated methods \method{get_type()},
|
||||
\method{get_main_type()}, and \method{get_subtype()} were removed]{2.5}
|
||||
|
||||
\class{Message} objects can also optionally contain two instance
|
||||
attributes, which can be used when generating the plain text of a MIME
|
||||
message.
|
||||
|
||||
\begin{datadesc}{preamble}
|
||||
The format of a MIME document allows for some text between the blank
|
||||
line following the headers, and the first multipart boundary string.
|
||||
Normally, this text is never visible in a MIME-aware mail reader
|
||||
because it falls outside the standard MIME armor. However, when
|
||||
viewing the raw text of the message, or when viewing the message in a
|
||||
non-MIME aware reader, this text can become visible.
|
||||
|
||||
The \var{preamble} attribute contains this leading extra-armor text
|
||||
for MIME documents. When the \class{Parser} discovers some text after
|
||||
the headers but before the first boundary string, it assigns this text
|
||||
to the message's \var{preamble} attribute. When the \class{Generator}
|
||||
is writing out the plain text representation of a MIME message, and it
|
||||
finds the message has a \var{preamble} attribute, it will write this
|
||||
text in the area between the headers and the first boundary. See
|
||||
\refmodule{email.parser} and \refmodule{email.generator} for details.
|
||||
|
||||
Note that if the message object has no preamble, the
|
||||
\var{preamble} attribute will be \code{None}.
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{epilogue}
|
||||
The \var{epilogue} attribute acts the same way as the \var{preamble}
|
||||
attribute, except that it contains text that appears between the last
|
||||
boundary and the end of the message.
|
||||
|
||||
\versionchanged[You do not need to set the epilogue to the empty string in
|
||||
order for the \class{Generator} to print a newline at the end of the
|
||||
file]{2.5}
|
||||
\end{datadesc}
|
||||
|
||||
\begin{datadesc}{defects}
|
||||
The \var{defects} attribute contains a list of all the problems found when
|
||||
parsing this message. See \refmodule{email.errors} for a detailed description
|
||||
of the possible parsing defects.
|
||||
|
||||
\versionadded{2.4}
|
||||
\end{datadesc}
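
The three attributes can be inspected after parsing a small multipart
message. This is a sketch only, assuming a current Python 3 interpreter;
the message text and the boundary string \code{XXXX} are invented.

\begin{verbatim}
import email

text = (
    'MIME-Version: 1.0\n'
    'Content-Type: multipart/mixed; boundary="XXXX"\n'
    '\n'
    'This text is the preamble.\n'
    '--XXXX\n'
    'Content-Type: text/plain\n'
    '\n'
    'body\n'
    '--XXXX--\n'
    'This text is the epilogue.\n'
)

msg = email.message_from_string(text)
print(msg.preamble)    # text before the first boundary
print(msg.epilogue)    # text after the closing boundary
print(msg.defects)     # problems found while parsing, if any
\end{verbatim}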
|
|
@ -1,186 +0,0 @@
|
|||
\declaremodule{standard}{email.mime}
|
||||
\declaremodule{standard}{email.mime.base}
|
||||
\declaremodule{standard}{email.mime.nonmultipart}
|
||||
\declaremodule{standard}{email.mime.multipart}
|
||||
\declaremodule{standard}{email.mime.audio}
|
||||
\declaremodule{standard}{email.mime.image}
|
||||
\declaremodule{standard}{email.mime.message}
|
||||
\declaremodule{standard}{email.mime.text}
|
||||
Ordinarily, you get a message object structure by passing a file or
|
||||
some text to a parser, which parses the text and returns the root
|
||||
message object. However you can also build a complete message
|
||||
structure from scratch, or even individual \class{Message} objects by
|
||||
hand. In fact, you can also take an existing structure and add new
|
||||
\class{Message} objects, move them around, etc. This makes a very
|
||||
convenient interface for slicing-and-dicing MIME messages.
|
||||
|
||||
You can create a new object structure by creating \class{Message} instances,
|
||||
adding attachments and all the appropriate headers manually. For MIME
|
||||
messages though, the \module{email} package provides some convenient
|
||||
subclasses to make things easier.
|
||||
|
||||
Here are the classes:
|
||||
|
||||
\begin{classdesc}{MIMEBase}{_maintype, _subtype, **_params}
|
||||
Module: \module{email.mime.base}
|
||||
|
||||
This is the base class for all the MIME-specific subclasses of
|
||||
\class{Message}. Ordinarily you won't create instances specifically
|
||||
of \class{MIMEBase}, although you could. \class{MIMEBase} is provided
|
||||
primarily as a convenient base class for more specific MIME-aware
|
||||
subclasses.
|
||||
|
||||
\var{_maintype} is the \mailheader{Content-Type} major type
|
||||
(e.g. \mimetype{text} or \mimetype{image}), and \var{_subtype} is the
|
||||
\mailheader{Content-Type} minor type
|
||||
(e.g. \mimetype{plain} or \mimetype{gif}). \var{_params} is a parameter
|
||||
key/value dictionary and is passed directly to
|
||||
\method{Message.add_header()}.
|
||||
|
||||
The \class{MIMEBase} class always adds a \mailheader{Content-Type} header
|
||||
(based on \var{_maintype}, \var{_subtype}, and \var{_params}), and a
|
||||
\mailheader{MIME-Version} header (always set to \code{1.0}).
|
||||
\end{classdesc}
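
Although \class{MIMEBase} is rarely instantiated directly, doing so
shows which headers it adds. The sketch below assumes a current Python 3
interpreter; the \mimetype{application/x-demo} type and the \code{name}
parameter are hypothetical.

\begin{verbatim}
from email.mime.base import MIMEBase

# Keyword arguments become Content-Type parameters.
part = MIMEBase('application', 'x-demo', name='data.bin')
content_type = part['Content-Type']
mime_version = part['MIME-Version']
\end{verbatim}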
|
||||
|
||||
\begin{classdesc}{MIMENonMultipart}{}
|
||||
Module: \module{email.mime.nonmultipart}
|
||||
|
||||
A subclass of \class{MIMEBase}, this is an intermediate base class for
|
||||
MIME messages that are not \mimetype{multipart}. The primary purpose
|
||||
of this class is to prevent the use of the \method{attach()} method,
|
||||
which only makes sense for \mimetype{multipart} messages. If
|
||||
\method{attach()} is called, a \exception{MultipartConversionError}
|
||||
exception is raised.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{classdesc}
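
The guard against \method{attach()} can be demonstrated with any
subclass of \class{MIMENonMultipart}. The sketch below uses
\class{MIMEText} and assumes a current Python 3 interpreter.

\begin{verbatim}
from email import errors
from email.mime.text import MIMEText

leaf = MIMEText('hello')

# Attaching a subpart to a non-multipart message is refused.
try:
    leaf.attach(MIMEText('not allowed'))
    raised = False
except errors.MultipartConversionError:
    raised = True
\end{verbatim}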
|
||||
|
||||
\begin{classdesc}{MIMEMultipart}{\optional{_subtype\optional{,
|
||||
boundary\optional{, _subparts\optional{, _params}}}}}
|
||||
Module: \module{email.mime.multipart}
|
||||
|
||||
A subclass of \class{MIMEBase}, this is an intermediate base class for
|
||||
MIME messages that are \mimetype{multipart}. Optional \var{_subtype}
|
||||
defaults to \mimetype{mixed}, but can be used to specify the subtype
|
||||
of the message. A \mailheader{Content-Type} header of
|
||||
\mimetype{multipart/}\var{_subtype} will be added to the message
|
||||
object. A \mailheader{MIME-Version} header will also be added.
|
||||
|
||||
Optional \var{boundary} is the multipart boundary string. When
|
||||
\code{None} (the default), the boundary is calculated when needed.
|
||||
|
||||
\var{_subparts} is a sequence of initial subparts for the payload. It
|
||||
must be possible to convert this sequence to a list. You can always
|
||||
attach new subparts to the message by using the
|
||||
\method{Message.attach()} method.
|
||||
|
||||
Additional parameters for the \mailheader{Content-Type} header are
|
||||
taken from the keyword arguments, or passed into the \var{_params}
|
||||
argument, which is a keyword dictionary.
|
||||
|
||||
\versionadded{2.2.2}
|
||||
\end{classdesc}
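
A \mimetype{multipart/alternative} message could be assembled roughly as
follows. This is a sketch assuming a current Python 3 interpreter; the
boundary string and body texts are made up.

\begin{verbatim}
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

text_part = MIMEText('plain body')
html_part = MIMEText('<p>HTML body</p>', 'html')

# Subtype, explicit boundary and an initial list of subparts.
outer = MIMEMultipart('alternative', 'demo-boundary', [text_part])
outer.attach(html_part)            # further subparts can be attached

subtypes = [part.get_content_type() for part in outer.get_payload()]
\end{verbatim}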
|
||||
|
||||
\begin{classdesc}{MIMEApplication}{_data\optional{, _subtype\optional{,
|
||||
_encoder\optional{, **_params}}}}
|
||||
Module: \module{email.mime.application}
|
||||
|
||||
A subclass of \class{MIMENonMultipart}, the \class{MIMEApplication} class is
|
||||
used to represent MIME message objects of major type \mimetype{application}.
|
||||
\var{_data} is a string containing the raw byte data. Optional \var{_subtype}
|
||||
specifies the MIME subtype and defaults to \mimetype{octet-stream}.
|
||||
|
||||
Optional \var{_encoder} is a callable (i.e. function) which will
|
||||
perform the actual encoding of the data for transport. This
|
||||
callable takes one argument, which is the \class{MIMEApplication} instance.
|
||||
It should use \method{get_payload()} and \method{set_payload()} to
|
||||
change the payload to encoded form. It should also add any
|
||||
\mailheader{Content-Transfer-Encoding} or other headers to the message
|
||||
object as necessary. The default encoding is base64. See the
|
||||
\refmodule{email.encoders} module for a list of the built-in encoders.
|
||||
|
||||
\var{_params} are passed straight through to the base class constructor.
|
||||
\versionadded{2.5}
|
||||
\end{classdesc}
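
The role of the encoder callable can be sketched with the built-in
base64 encoder passed explicitly. The example assumes a current Python 3
interpreter, where the raw data is given as a bytes object; the data
itself is arbitrary.

\begin{verbatim}
from email import encoders
from email.mime.application import MIMEApplication

data = b'\x00\x01\x02 raw bytes'

# The encoder receives the new instance, rewrites its payload and
# adds the Content-Transfer-Encoding header.
part = MIMEApplication(data, 'octet-stream', encoders.encode_base64)

cte = part['Content-Transfer-Encoding']
decoded = part.get_payload(decode=True)    # original bytes again
\end{verbatim}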
|
||||
|
||||
\begin{classdesc}{MIMEAudio}{_audiodata\optional{, _subtype\optional{,
|
||||
_encoder\optional{, **_params}}}}
|
||||
Module: \module{email.mime.audio}
|
||||
|
||||
A subclass of \class{MIMENonMultipart}, the \class{MIMEAudio} class
|
||||
is used to create MIME message objects of major type \mimetype{audio}.
|
||||
\var{_audiodata} is a string containing the raw audio data. If this
|
||||
data can be decoded by the standard Python module \refmodule{sndhdr},
|
||||
then the subtype will be automatically included in the
|
||||
\mailheader{Content-Type} header. Otherwise you can explicitly specify the
|
||||
audio subtype via the \var{_subtype} parameter. If the minor type could
|
||||
not be guessed and \var{_subtype} was not given, then \exception{TypeError}
|
||||
is raised.
|
||||
|
||||
Optional \var{_encoder} is a callable (i.e. function) which will
|
||||
perform the actual encoding of the audio data for transport. This
|
||||
callable takes one argument, which is the \class{MIMEAudio} instance.
|
||||
It should use \method{get_payload()} and \method{set_payload()} to
|
||||
change the payload to encoded form. It should also add any
|
||||
\mailheader{Content-Transfer-Encoding} or other headers to the message
|
||||
object as necessary. The default encoding is base64. See the
|
||||
\refmodule{email.encoders} module for a list of the built-in encoders.
|
||||
|
||||
\var{_params} are passed straight through to the base class constructor.
|
||||
\end{classdesc}
|
||||
|
||||
\begin{classdesc}{MIMEImage}{_imagedata\optional{, _subtype\optional{,
|
||||
_encoder\optional{, **_params}}}}
|
||||
Module: \module{email.mime.image}
|
||||
|
||||
A subclass of \class{MIMENonMultipart}, the \class{MIMEImage} class is
|
||||
used to create MIME message objects of major type \mimetype{image}.
|
||||
\var{_imagedata} is a string containing the raw image data. If this
|
||||
data can be decoded by the standard Python module \refmodule{imghdr},
|
||||
then the subtype will be automatically included in the
|
||||
\mailheader{Content-Type} header. Otherwise you can explicitly specify the
|
||||
image subtype via the \var{_subtype} parameter. If the minor type could
|
||||
not be guessed and \var{_subtype} was not given, then \exception{TypeError}
|
||||
is raised.
|
||||
|
||||
Optional \var{_encoder} is a callable (i.e. function) which will
|
||||
perform the actual encoding of the image data for transport. This
|
||||
callable takes one argument, which is the \class{MIMEImage} instance.
|
||||
It should use \method{get_payload()} and \method{set_payload()} to
|
||||
change the payload to encoded form. It should also add any
|
||||
\mailheader{Content-Transfer-Encoding} or other headers to the message
|
||||
object as necessary. The default encoding is base64. See the
|
||||
\refmodule{email.encoders} module for a list of the built-in encoders.
|
||||
|
||||
\var{_params} are passed straight through to the \class{MIMEBase}
|
||||
constructor.
|
||||
\end{classdesc}
|
||||
|
||||
\begin{classdesc}{MIMEMessage}{_msg\optional{, _subtype}}
|
||||
Module: \module{email.mime.message}
|
||||
|
||||
A subclass of \class{MIMENonMultipart}, the \class{MIMEMessage} class
|
||||
is used to create MIME objects of main type \mimetype{message}.
|
||||
\var{_msg} is used as the payload, and must be an instance of class
|
||||
\class{Message} (or a subclass thereof), otherwise a
|
||||
\exception{TypeError} is raised.
|
||||
|
||||
Optional \var{_subtype} sets the subtype of the message; it defaults
|
||||
to \mimetype{rfc822}.
|
||||
\end{classdesc}
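
Wrapping one message inside another might look like the sketch below,
which assumes a current Python 3 interpreter and an invented inner
message.

\begin{verbatim}
from email.mime.message import MIMEMessage
from email.mime.text import MIMEText

inner = MIMEText('original text')
inner['Subject'] = 'original subject'

wrapper = MIMEMessage(inner)               # subtype defaults to rfc822
wrapped_type = wrapper.get_content_type()
same_object = wrapper.get_payload(0) is inner

# Anything that is not a Message instance is rejected.
try:
    MIMEMessage('not a message object')
    raised = False
except TypeError:
    raised = True
\end{verbatim}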
|
||||
|
||||
\begin{classdesc}{MIMEText}{_text\optional{, _subtype\optional{, _charset}}}
|
||||
Module: \module{email.mime.text}
|
||||
|
||||
A subclass of \class{MIMENonMultipart}, the \class{MIMEText} class is
|
||||
used to create MIME objects of major type \mimetype{text}.
|
||||
\var{_text} is the string for the payload. \var{_subtype} is the
|
||||
minor type and defaults to \mimetype{plain}. \var{_charset} is the
|
||||
character set of the text and is passed as a parameter to the
|
||||
\class{MIMENonMultipart} constructor; it defaults to \code{us-ascii}. No
|
||||
guessing or encoding is performed on the text data.
|
||||
|
||||
\versionchanged[The previously deprecated \var{_encoding} argument has
|
||||
been removed. Encoding happens implicitly based on the \var{_charset}
|
||||
argument]{2.4}
|
||||
\end{classdesc}
|
|
@ -1,208 +0,0 @@
|
|||
\declaremodule{standard}{email.parser}
|
||||
\modulesynopsis{Parse flat text email messages to produce a message
|
||||
object structure.}
|
||||
|
||||
Message object structures can be created in one of two ways: they can be
|
||||
created from whole cloth by instantiating \class{Message} objects and
|
||||
stringing them together via \method{attach()} and
|
||||
\method{set_payload()} calls, or they can be created by parsing a flat text
|
||||
representation of the email message.
|
||||
|
||||
The \module{email} package provides a standard parser that understands
|
||||
most email document structures, including MIME documents. You can
|
||||
pass the parser a string or a file object, and the parser will return
|
||||
to you the root \class{Message} instance of the object structure. For
|
||||
simple, non-MIME messages the payload of this root object will likely
|
||||
be a string containing the text of the message. For MIME
|
||||
messages, the root object will return \code{True} from its
|
||||
\method{is_multipart()} method, and the subparts can be accessed via
|
||||
the \method{get_payload()} and \method{walk()} methods.
|
||||
|
||||
There are actually two parser interfaces available for use, the classic
|
||||
\class{Parser} API and the incremental \class{FeedParser} API. The classic
|
||||
\class{Parser} API is fine if you have the entire text of the message in
|
||||
memory as a string, or if the entire message lives in a file on the file
|
||||
system. \class{FeedParser} is more appropriate for when you're reading the
|
||||
message from a stream which might block waiting for more input (e.g. reading
|
||||
an email message from a socket). The \class{FeedParser} can consume and parse
|
||||
the message incrementally, and only returns the root object when you close the
|
||||
parser\footnote{As of email package version 3.0, introduced in
|
||||
Python 2.4, the classic \class{Parser} was re-implemented in terms of the
|
||||
\class{FeedParser}, so the semantics and results are identical between the two
|
||||
parsers.}.
|
||||
|
||||
Note that the parser can be extended in limited ways, and of course
|
||||
you can implement your own parser completely from scratch. There is
|
||||
no magical connection between the \module{email} package's bundled
|
||||
parser and the \class{Message} class, so your custom parser can create
|
||||
message object trees any way it finds necessary.
|
||||
|
||||
\subsubsection{FeedParser API}
|
||||
|
||||
\versionadded{2.4}
|
||||
|
||||
The \class{FeedParser}, imported from the \module{email.feedparser} module,
|
||||
provides an API that is conducive to incremental parsing of email messages,
|
||||
such as would be necessary when reading the text of an email message from a
|
||||
source that can block (e.g. a socket). The
|
||||
\class{FeedParser} can of course be used to parse an email message fully
|
||||
contained in a string or a file, but the classic \class{Parser} API may be
|
||||
more convenient for such use cases. The semantics and results of the two
|
||||
parser APIs are identical.
|
||||
|
||||
The \class{FeedParser}'s API is simple; you create an instance, feed it a
|
||||
bunch of text until there's no more to feed it, then close the parser to
|
||||
retrieve the root message object. The \class{FeedParser} is extremely
|
||||
accurate when parsing standards-compliant messages, and it does a very good
|
||||
job of parsing non-compliant messages, providing information about how a
|
||||
message was deemed broken. It will populate a message object's \var{defects}
|
||||
attribute with a list of any problems it found in a message. See the
|
||||
\refmodule{email.errors} module for the list of defects that it can find.
|
||||
|
||||
Here is the API for the \class{FeedParser}:
|
||||
|
||||
\begin{classdesc}{FeedParser}{\optional{_factory}}
|
||||
Create a \class{FeedParser} instance. Optional \var{_factory} is a
|
||||
no-argument callable that will be called whenever a new message object is
|
||||
needed. It defaults to the \class{email.message.Message} class.
|
||||
\end{classdesc}
|
||||
|
||||
\begin{methoddesc}[FeedParser]{feed}{data}
|
||||
Feed the \class{FeedParser} some more data. \var{data} should be a
|
||||
string containing one or more lines. The lines can be partial and the
|
||||
\class{FeedParser} will stitch such partial lines together properly. The
|
||||
lines in the string can have any of the common three line endings, carriage
|
||||
return, newline, or carriage return and newline (they can even be mixed).
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[FeedParser]{close}{}
|
||||
Closing a \class{FeedParser} completes the parsing of all previously fed data,
|
||||
and returns the root message object. It is undefined what happens if you feed
|
||||
more data to a closed \class{FeedParser}.
|
||||
\end{methoddesc}
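
As a minimal sketch of the incremental workflow described above, the following feeds a message to a \class{FeedParser} in arbitrary chunks, with lines split mid-way and line endings mixed. The chunk boundaries and addresses are invented for illustration, and the snippet assumes a Python 3 interpreter rather than the 2.4-era release this text documents.

```python
from email.feedparser import FeedParser

# Hypothetical chunks, as they might arrive from a socket: lines are split
# at arbitrary points and the line endings are mixed (CRLF and LF).
chunks = [
    "From: alice@exa",
    "mple.com\r\nSubject: Hel",
    "lo\n\nBody line one\n",
    "Body line two\n",
]

parser = FeedParser()
for chunk in chunks:
    parser.feed(chunk)   # partial lines are stitched together internally
msg = parser.close()     # the root message object is only returned here

print(msg["From"])
print(msg["Subject"])
print(msg.defects)       # problems found while parsing, if any
```

Nothing useful can be read from the parser until \method{close()} is called, so a real reader loop would typically call \method{feed()} until the stream is exhausted and only then inspect the message.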
|
||||
|
||||
\subsubsection{Parser class API}
|
||||
|
||||
The \class{Parser} class, imported from the \module{email.parser} module,
|
||||
provides an API that can be used to parse a message when the complete contents
|
||||
of the message are available in a string or file. The
|
||||
\module{email.parser} module also provides a second class, called
|
||||
\class{HeaderParser} which can be used if you're only interested in
|
||||
the headers of the message. \class{HeaderParser} can be much faster in
|
||||
these situations, since it does not attempt to parse the message body,
|
||||
instead setting the payload to the raw body as a string.
|
||||
\class{HeaderParser} has the same API as the \class{Parser} class.
|
||||
|
||||
\begin{classdesc}{Parser}{\optional{_class}}
|
||||
The constructor for the \class{Parser} class takes an optional
|
||||
argument \var{_class}. This must be a callable factory (such as a
|
||||
function or a class), and it is used whenever a sub-message object
|
||||
needs to be created. It defaults to \class{Message} (see
|
||||
\refmodule{email.message}). The factory will be called without
|
||||
arguments.
|
||||
|
||||
The optional \var{strict} flag is ignored. \deprecated{2.4}{Because the
|
||||
\class{Parser} class is a backward compatible API wrapper around the
|
||||
new-in-Python 2.4 \class{FeedParser}, \emph{all} parsing is effectively
|
||||
non-strict. You should simply stop passing a \var{strict} flag to the
|
||||
\class{Parser} constructor.}
|
||||
|
||||
\versionchanged[The \var{strict} flag was added]{2.2.2}
|
||||
\versionchanged[The \var{strict} flag was deprecated]{2.4}
|
||||
\end{classdesc}
|
||||
|
||||
The other public \class{Parser} methods are:
|
||||
|
||||
\begin{methoddesc}[Parser]{parse}{fp\optional{, headersonly}}
|
||||
Read all the data from the file-like object \var{fp}, parse the
|
||||
resulting text, and return the root message object. \var{fp} must
|
||||
support both the \method{readline()} and the \method{read()} methods
|
||||
on file-like objects.
|
||||
|
||||
The text contained in \var{fp} must be formatted as a block of \rfc{2822}
|
||||
style headers and header continuation lines, optionally preceded by an
|
||||
envelope header. The header block is terminated either by the
|
||||
end of the data or by a blank line. Following the header block is the
|
||||
body of the message (which may contain MIME-encoded subparts).
|
||||
|
||||
Optional \var{headersonly} is as with the \method{parsestr()} method.
|
||||
|
||||
\versionchanged[The \var{headersonly} flag was added]{2.2.2}
|
||||
\end{methoddesc}
|
||||
|
||||
\begin{methoddesc}[Parser]{parsestr}{text\optional{, headersonly}}
|
||||
Similar to the \method{parse()} method, except it takes a string
|
||||
object instead of a file-like object. Calling this method on a string
|
||||
is exactly equivalent to wrapping \var{text} in a \class{StringIO}
|
||||
instance first and calling \method{parse()}.
|
||||
|
||||
Optional \var{headersonly} is a flag specifying whether to stop
|
||||
parsing after reading the headers or not. The default is \code{False},
|
||||
meaning it parses the entire contents of the message.
|
||||
|
||||
\versionchanged[The \var{headersonly} flag was added]{2.2.2}
|
||||
\end{methoddesc}
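
The difference between a full parse and a headers-only parse can be sketched as follows. The message text is a hand-written example, not taken from the package, and the snippet assumes Python 3.

```python
from email.parser import HeaderParser, Parser

# A small hand-written multipart message (illustrative only).
text = (
    'Content-Type: multipart/mixed; boundary="XX"\n'
    "Subject: demo\n"
    "\n"
    "--XX\n"
    "Content-Type: text/plain\n"
    "\n"
    "part one\n"
    "--XX--\n"
)

full = Parser().parsestr(text)                      # body is parsed into subparts
quick = Parser().parsestr(text, headersonly=True)   # body is left as a raw string
headers = HeaderParser().parsestr(text)             # same effect as headersonly=True

print(full.is_multipart())
print(quick.is_multipart())
print(headers["Subject"])
```

In the headers-only cases the payload is the unparsed body text, so \method{is_multipart()} reports false even though the \mailheader{Content-Type} header says otherwise.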
|
||||
|
||||
Since creating a message object structure from a string or a file
|
||||
object is such a common task, two functions are provided as a
|
||||
convenience. They are available in the top-level \module{email}
|
||||
package namespace.
|
||||
|
||||
\begin{funcdesc}{message_from_string}{s\optional{, _class\optional{, strict}}}
|
||||
Return a message object structure from a string. This is exactly
|
||||
equivalent to \code{Parser().parsestr(s)}. Optional \var{_class} and
|
||||
\var{strict} are interpreted as with the \class{Parser} class constructor.
|
||||
|
||||
\versionchanged[The \var{strict} flag was added]{2.2.2}
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{message_from_file}{fp\optional{, _class\optional{, strict}}}
|
||||
Return a message object structure tree from an open file object. This
|
||||
is exactly equivalent to \code{Parser().parse(fp)}. Optional
|
||||
\var{_class} and \var{strict} are interpreted as with the
|
||||
\class{Parser} class constructor.
|
||||
|
||||
\versionchanged[The \var{strict} flag was added]{2.2.2}
|
||||
\end{funcdesc}
|
||||
|
||||
Here's an example of how you might use this at an interactive Python
|
||||
prompt:
|
||||
|
||||
\begin{verbatim}
|
||||
>>> import email
|
||||
>>> msg = email.message_from_string(myString)
|
||||
\end{verbatim}
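
A slightly fuller sketch, assuming Python 3 and an invented message, shows both convenience functions producing the same result; \code{io.StringIO} stands in for an open file here.

```python
import email
import io

# Illustrative message text; any RFC 2822 style message would do.
raw = "From: a@example.com\nTo: b@example.com\nSubject: Test\n\nHello.\n"

from_string = email.message_from_string(raw)
from_file = email.message_from_file(io.StringIO(raw))  # any file-like object

print(from_string["Subject"])
print(from_file.get_payload())
```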
|
||||
|
||||
\subsubsection{Additional notes}
|
||||
|
||||
Here are some notes on the parsing semantics:
|
||||
|
||||
\begin{itemize}
|
||||
\item Most non-\mimetype{multipart} type messages are parsed as a single
|
||||
message object with a string payload. These objects will return
|
||||
\code{False} for \method{is_multipart()}. Their
|
||||
\method{get_payload()} method will return a string object.
|
||||
|
||||
\item All \mimetype{multipart} type messages will be parsed as a
|
||||
container message object with a list of sub-message objects for
|
||||
their payload. The outer container message will return
|
||||
\code{True} for \method{is_multipart()} and its
|
||||
\method{get_payload()} method will return the list of
|
||||
\class{Message} subparts.
|
||||
|
||||
\item Most messages with a content type of \mimetype{message/*}
|
||||
(e.g. \mimetype{message/delivery-status} and
|
||||
\mimetype{message/rfc822}) will also be parsed as a container
|
||||
object containing a list payload of length 1. Their
|
||||
\method{is_multipart()} method will return \code{True}. The
|
||||
single element in the list payload will be a sub-message object.
|
||||
|
||||
\item Some non-standards compliant messages may not be internally consistent
|
||||
about their \mimetype{multipart}-edness. Such messages may have a
|
||||
\mailheader{Content-Type} header of type \mimetype{multipart}, but their
|
||||
\method{is_multipart()} method may return \code{False}. If such
|
||||
messages were parsed with the \class{FeedParser}, they will have an
|
||||
instance of the \class{MultipartInvariantViolationDefect} class in their
|
||||
\var{defects} attribute list. See \refmodule{email.errors} for
|
||||
details.
|
||||
\end{itemize}
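
The notes above can be checked with a short sketch. The three messages are hand-written for illustration, and the snippet assumes Python 3; the "broken" one claims to be \mimetype{multipart} but declares no boundary and contains no parts.

```python
import email
from email.errors import MultipartInvariantViolationDefect

plain = email.message_from_string("Content-Type: text/plain\n\nhello\n")
wrapped = email.message_from_string(
    "Content-Type: message/rfc822\n\nSubject: inner\n\ninner body\n"
)
# Claims to be multipart, but has no boundary parameter and no subparts.
broken = email.message_from_string(
    "Content-Type: multipart/mixed\n\njust text\n"
)

print(plain.is_multipart())                # string payload
inner = wrapped.get_payload()[0]           # list payload of length 1
print(wrapped.is_multipart(), inner["Subject"])

has_defect = any(
    isinstance(d, MultipartInvariantViolationDefect) for d in broken.defects
)
print(broken.is_multipart(), has_defect)
```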
|
|
@ -1,157 +0,0 @@
|
|||
\declaremodule{standard}{email.utils}
|
||||
\modulesynopsis{Miscellaneous email package utilities.}
|
||||
|
||||
There are several useful utilities provided in the \module{email.utils}
|
||||
module:
|
||||
|
||||
\begin{funcdesc}{quote}{str}
|
||||
Return a new string with backslashes in \var{str} replaced by two
|
||||
backslashes, and double quotes replaced by backslash-double quote.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{unquote}{str}
|
||||
Return a new string which is an \emph{unquoted} version of \var{str}.
|
||||
If \var{str} ends and begins with double quotes, they are stripped
|
||||
off. Likewise if \var{str} ends and begins with angle brackets, they
|
||||
are stripped off.
|
||||
\end{funcdesc}
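
A brief sketch of the two functions, assuming Python 3 and made-up input strings:

```python
from email.utils import quote, unquote

# quote() escapes backslashes and double quotes; it does not add
# surrounding quotes itself.
quoted = quote('Jo "JJ" Smith')
print(quoted)

# unquote() strips one layer of surrounding double quotes or angle brackets.
name = unquote('"Jo Smith"')
addr = unquote("<jo@example.com>")
print(name)
print(addr)
```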
|
||||
|
||||
\begin{funcdesc}{parseaddr}{address}
|
||||
Parse address -- which should be the value of some address-containing
|
||||
field such as \mailheader{To} or \mailheader{Cc} -- into its constituent
|
||||
\emph{realname} and \emph{email address} parts. Returns a tuple of that
|
||||
information, unless the parse fails, in which case a 2-tuple of
|
||||
\code{('', '')} is returned.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{formataddr}{pair}
|
||||
The inverse of \function{parseaddr()}, this takes a 2-tuple of the form
|
||||
\code{(realname, email_address)} and returns the string value suitable
|
||||
for a \mailheader{To} or \mailheader{Cc} header. If the first element of
|
||||
\var{pair} is false, then the second element is returned unmodified.
|
||||
\end{funcdesc}
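
The round trip between \function{parseaddr()} and \function{formataddr()} can be sketched as follows, assuming Python 3; the names and addresses are invented.

```python
from email.utils import formataddr, parseaddr

pair = parseaddr("Alice Example <alice@example.com>")
print(pair)

# formataddr() quotes the real name when it contains special characters
# such as a comma, so the result can be parsed back unchanged.
tricky = ("Example, Alice", "alice@example.com")
header_value = formataddr(tricky)
print(header_value)
print(parseaddr(header_value) == tricky)

# A false first element returns the address unmodified, and a failed
# parse yields a pair of empty strings.
bare = formataddr(("", "alice@example.com"))
failed = parseaddr("")
print(bare, failed)
```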
|
||||
|
||||
\begin{funcdesc}{getaddresses}{fieldvalues}
|
||||
This function returns a list of 2-tuples of the form returned by
|
||||
\code{parseaddr()}. \var{fieldvalues} is a sequence of header field
|
||||
values as might be returned by \method{Message.get_all()}. Here's a
|
||||
simple example that gets all the recipients of a message:
|
||||
|
||||
\begin{verbatim}
|
||||
from email.utils import getaddresses
|
||||
|
||||
tos = msg.get_all('to', [])
|
||||
ccs = msg.get_all('cc', [])
|
||||
resent_tos = msg.get_all('resent-to', [])
|
||||
resent_ccs = msg.get_all('resent-cc', [])
|
||||
all_recipients = getaddresses(tos + ccs + resent_tos + resent_ccs)
|
||||
\end{verbatim}
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{parsedate}{date}
|
||||
Attempts to parse a date according to the rules in \rfc{2822}.
|
||||
However, some mailers don't follow that format as specified, so
|
||||
\function{parsedate()} tries to guess correctly in such cases.
|
||||
\var{date} is a string containing an \rfc{2822} date, such as
|
||||
\code{"Mon, 20 Nov 1995 19:12:08 -0500"}. If it succeeds in parsing
|
||||
the date, \function{parsedate()} returns a 9-tuple that can be passed
|
||||
directly to \function{time.mktime()}; otherwise \code{None} will be
|
||||
returned. Note that indexes 6, 7, and 8 of the result tuple are not
|
||||
usable.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{parsedate_tz}{date}
|
||||
Performs the same function as \function{parsedate()}, but returns
|
||||
either \code{None} or a 10-tuple; the first 9 elements make up a tuple
|
||||
that can be passed directly to \function{time.mktime()}, and the tenth
|
||||
is the offset of the date's timezone from UTC (which is the official
|
||||
term for Greenwich Mean Time)\footnote{Note that the sign of the timezone
|
||||
offset is the opposite of the sign of the \code{time.timezone}
|
||||
variable for the same timezone; the latter variable follows the
|
||||
\POSIX{} standard while this module follows \rfc{2822}.}. If the input
|
||||
string has no timezone, the last element of the tuple returned is
|
||||
\code{None}. Note that indexes 6, 7, and 8 of the result tuple are not
|
||||
usable.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{mktime_tz}{tuple}
|
||||
Turn a 10-tuple as returned by \function{parsedate_tz()} into a UTC
|
||||
timestamp. If the timezone item in the tuple is \code{None}, assume
|
||||
local time. Minor deficiency: \function{mktime_tz()} interprets the
|
||||
first 8 elements of \var{tuple} as a local time and then compensates
|
||||
for the timezone difference. This may yield a slight error around
|
||||
changes in daylight savings time, though not worth worrying about for
|
||||
common use.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{formatdate}{\optional{timeval\optional{, localtime}\optional{, usegmt}}}
|
||||
Returns a date string as per \rfc{2822}, e.g.:
|
||||
|
||||
\begin{verbatim}
|
||||
Fri, 09 Nov 2001 01:08:47 -0000
|
||||
\end{verbatim}
|
||||
|
||||
Optional \var{timeval} if given is a floating point time value as
|
||||
accepted by \function{time.gmtime()} and \function{time.localtime()},
|
||||
otherwise the current time is used.
|
||||
|
||||
Optional \var{localtime} is a flag that when \code{True}, interprets
|
||||
\var{timeval}, and returns a date relative to the local timezone
|
||||
instead of UTC, properly taking daylight savings time into account.
|
||||
The default is \code{False} meaning UTC is used.
|
||||
|
||||
Optional \var{usegmt} is a flag that when \code{True}, outputs a
|
||||
date string with the timezone as an ASCII string \code{GMT}, rather
|
||||
than a numeric \code{-0000}. This is needed for some protocols (such
|
||||
as HTTP). This only applies when \var{localtime} is \code{False}.
|
||||
\versionadded{2.4}
|
||||
\end{funcdesc}
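
The effect of the optional flags can be sketched with a fixed \var{timeval} of zero (the epoch), assuming Python 3. The \var{localtime} result depends on the machine's timezone, so it is printed without any stated expectation.

```python
from email.utils import formatdate

utc_form = formatdate(0)                # numeric -0000 zone
gmt_form = formatdate(0, usegmt=True)   # literal GMT, e.g. for HTTP headers
local_form = formatdate(0, localtime=True)

print(utc_form)
print(gmt_form)
print(local_form)     # varies with the local timezone
print(formatdate())   # current time, in UTC
```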
|
||||
|
||||
\begin{funcdesc}{make_msgid}{\optional{idstring}}
|
||||
Returns a string suitable for an \rfc{2822}-compliant
|
||||
\mailheader{Message-ID} header. Optional \var{idstring} if given, is
|
||||
a string used to strengthen the uniqueness of the message id.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{decode_rfc2231}{s}
|
||||
Decode the string \var{s} according to \rfc{2231}.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{encode_rfc2231}{s\optional{, charset\optional{, language}}}
|
||||
Encode the string \var{s} according to \rfc{2231}. Optional
|
||||
\var{charset} and \var{language}, if given, are the character set name
|
||||
and language name to use. If neither is given, \var{s} is returned
|
||||
as-is. If \var{charset} is given but \var{language} is not, the
|
||||
string is encoded using the empty string for \var{language}.
|
||||
\end{funcdesc}
|
||||
|
||||
\begin{funcdesc}{collapse_rfc2231_value}{value\optional{, errors\optional{,
|
||||
fallback_charset}}}
|
||||
When a header parameter is encoded in \rfc{2231} format,
|
||||
\method{Message.get_param()} may return a 3-tuple containing the character
|
||||
set, language, and value. \function{collapse_rfc2231_value()} turns this into
|
||||
a unicode string. Optional \var{errors} is passed to the \var{errors}
|
||||
argument of the built-in \function{unicode()} function; it defaults to
|
||||
\code{replace}. Optional \var{fallback_charset} specifies the character set
|
||||
to use if the one in the \rfc{2231} header is not known by Python; it defaults
|
||||
to \code{us-ascii}.
|
||||
|
||||
For convenience, if the \var{value} passed to
|
||||
\function{collapse_rfc2231_value()} is not a tuple, it should be a string and
|
||||
it is returned unquoted.
|
||||
\end{funcdesc}
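
A minimal sketch of the \rfc{2231} helpers, assuming Python 3: the header below is a hand-written example with an encoded \code{filename} parameter, and the file names are invented.

```python
import email
from email.utils import collapse_rfc2231_value, encode_rfc2231

# Encoding: charset and language are joined to the percent-encoded value.
encoded = encode_rfc2231("a b.txt", "us-ascii", "en")
print(encoded)

# Decoding: get_param() hands back a 3-tuple for an RFC 2231 encoded
# parameter, which collapse_rfc2231_value() turns into a single string.
msg = email.message_from_string(
    "Content-Disposition: attachment; filename*=utf-8'en'na%C3%AFve.txt\n\n"
)
raw_value = msg.get_param("filename", header="content-disposition")
filename = collapse_rfc2231_value(raw_value)
print(raw_value)
print(filename)

# A plain string is accepted too and is returned unquoted.
plain = collapse_rfc2231_value('"report.txt"')
print(plain)
```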
|
||||
|
||||
\begin{funcdesc}{decode_params}{params}
|
||||
Decode parameters list according to \rfc{2231}. \var{params} is a
|
||||
sequence of 2-tuples containing elements of the form
|
||||
\code{(content-type, string-value)}.
|
||||
\end{funcdesc}
|
||||
|
||||
\versionchanged[The \function{dump_address_pair()} function has been removed;
|
||||
use \function{formataddr()} instead]{2.4}
|
||||
|
||||
\versionchanged[The \function{decode()} function has been removed; use the
|
||||
\function{email.header.decode_header()} function instead]{2.4}
|
||||
|
||||
\versionchanged[The \function{encode()} function has been removed; use the
|
||||
\method{Header.encode()} method instead]{2.4}
|
|
@ -1,7 +0,0 @@
|
|||
\chapter{File Formats}
|
||||
\label{fileformats}
|
||||
|
||||
The modules described in this chapter parse various miscellaneous file
|
||||
formats that are neither markup languages nor related to e-mail.
|
||||
|
||||
\localmoduletable
|
|
@ -1,18 +0,0 @@
|
|||
\chapter{File and Directory Access}
|
||||
\label{filesys}
|
||||
|
||||
The modules described in this chapter deal with disk files and
|
||||
directories. For example, there are modules for reading the
|
||||
properties of files, manipulating paths in a portable way, and
|
||||
creating temporary files. The full list of modules in this chapter is:
|
||||
|
||||
\localmoduletable
|
||||
|
||||
% XXX can this be included in the seealso environment? --amk
|
||||
Also see section \ref{bltin-file-objects} for a description
|
||||
of Python's built-in file objects.
|
||||
|
||||
\begin{seealso}
|
||||
\seemodule{os}{Operating system interfaces, including functions to
|
||||
work with files at a lower level than the built-in file object.}
|
||||
\end{seealso}
|
|
@ -1,10 +0,0 @@
|
|||
\chapter{Program Frameworks}
|
||||
\label{frameworks}
|
||||
|
||||
The modules described in this chapter are frameworks that will largely
|
||||
dictate the structure of your program. Currently the modules described
|
||||
here are all oriented toward writing command-line interfaces.
|
||||
|
||||
The full list of modules described in this chapter is:
|
||||
|
||||
\localmoduletable
|