Merged revisions 75374 via svnmerge from

svn+ssh://svn.python.org/python/branches/py3k

................
  r75374 | georg.brandl | 2009-10-11 23:25:26 +0200 (So, 11 Okt 2009) | 9 lines

  Merged revisions 75363 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r75363 | georg.brandl | 2009-10-11 20:31:23 +0200 (So, 11 Okt 2009) | 1 line

    Add the Python FAQ lists to the documentation.  Copied from sandbox/faq.  Many thanks to AMK for the preparation work.
  ........
................
This commit is contained in:
Georg Brandl 2009-10-27 20:20:38 +00:00
parent ef9c9af915
commit cb7cb247b3
12 changed files with 5388 additions and 0 deletions


@ -15,6 +15,7 @@
install/index.rst
documenting/index.rst
howto/index.rst
faq/index.rst
glossary.rst
about.rst

Doc/faq/design.rst

@ -0,0 +1,924 @@
======================
Design and History FAQ
======================
Why does Python use indentation for grouping of statements?
-----------------------------------------------------------
Guido van Rossum believes that using indentation for grouping is extremely
elegant and contributes a lot to the clarity of the average Python program.
Most people learn to love this feature after a while.
Since there are no begin/end brackets there cannot be a disagreement between
grouping perceived by the parser and the human reader. Occasionally C
programmers will encounter a fragment of code like this::

   if (x <= y)
           x++;
           y--;
   z++;

Only the ``x++`` statement is executed if the condition is true, but the
indentation leads you to believe otherwise. Even experienced C programmers will
sometimes stare at it a long time wondering why ``y`` is being decremented even
for ``x > y``.
Because there are no begin/end brackets, Python is much less prone to
coding-style conflicts. In C there are many different ways to place the braces.
If you're used to reading and writing code that uses one style, you will feel at
least slightly uneasy when reading (or being required to write) another style.
Many coding styles place begin/end brackets on a line by themselves.  This makes
programs considerably longer and wastes valuable screen space, making it harder
to get a good overview of a program. Ideally, a function should fit on one
screen (say, 20-30 lines). 20 lines of Python can do a lot more work than 20
lines of C. This is not solely due to the lack of begin/end brackets -- the
lack of declarations and the high-level data types are also responsible -- but
the indentation-based syntax certainly helps.
Why am I getting strange results with simple arithmetic operations?
-------------------------------------------------------------------
See the next question.
Why are floating point calculations so inaccurate?
--------------------------------------------------
People are often very surprised by results like this::

   >>> 1.2-1.0
   0.19999999999999996

and think it is a bug in Python. It's not. This has nothing to do with Python,
but with how the underlying C platform handles floating point numbers, and
ultimately with the inaccuracies introduced when writing down numbers as a
string of a fixed number of digits.
The internal representation of floating point numbers uses a fixed number of
binary digits to represent a decimal number. Some decimal numbers can't be
represented exactly in binary, resulting in small roundoff errors.
In decimal math, there are many numbers that can't be represented with a fixed
number of decimal digits, e.g. 1/3 = 0.3333333333.......
In base 2, 1/2 = 0.1, 1/4 = 0.01, 1/8 = 0.001, etc. .2 equals 2/10 equals 1/5,
resulting in the binary fractional number 0.001100110011001...
Floating point numbers only have 32 or 64 bits of precision, so the digits are
cut off at some point, and the stored value is only an approximation of 0.2.
This is why ``1.2-1.0`` above displays as 0.19999999999999996 rather than 0.2.
A floating point number's ``repr()`` function prints as many digits as are
necessary to make ``eval(repr(f)) == f`` true for any float ``f``.  The ``str()``
function prints fewer digits and this often results in the more sensible number
that was probably intended::

   >>> 0.2
   0.20000000000000001
   >>> print 0.2
   0.2

One of the consequences of this is that it is error-prone to compare the result
of some computation to a float with ``==``. Tiny inaccuracies may mean that
``==`` fails. Instead, you have to check that the difference between the two
numbers is less than a certain threshold::

   epsilon = 0.0000000000001  # Tiny allowed error
   expected_result = 0.4

   if expected_result-epsilon <= computation() <= expected_result+epsilon:
       ...

Please see the chapter on :ref:`floating point arithmetic <tut-fp-issues>` in
the Python tutorial for more information.
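
The threshold comparison above can be wrapped in a small helper. The following
is a minimal sketch written for Python 3; ``close_enough`` and its default
``epsilon`` are illustrative names, not part of any library. In Python 3.5 and
later, ``math.isclose()`` performs a similar test with a relative tolerance.

```python
import math


def close_enough(a, b, epsilon=1e-13):
    """Return True if a and b differ by at most epsilon (hypothetical helper)."""
    return abs(a - b) <= epsilon


total = 0.1 + 0.2                    # not exactly 0.3 in binary floating point
print(total == 0.3)                  # False
print(close_enough(total, 0.3))      # True
print(math.isclose(total, 0.3))      # True
```

An absolute threshold such as ``epsilon`` only makes sense when you know the
rough magnitude of the values being compared; for very large or very small
numbers a relative tolerance is usually the safer choice.
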
Why are Python strings immutable?
---------------------------------
There are several advantages.
One is performance: knowing that a string is immutable means we can allocate
space for it at creation time, and the storage requirements are fixed and
unchanging. This is also one of the reasons for the distinction between tuples
and lists.
Another advantage is that strings in Python are considered as "elemental" as
numbers. No amount of activity will change the value 8 to anything else, and in
Python, no amount of activity will change the string "eight" to anything else.
.. _why-self:
Why must 'self' be used explicitly in method definitions and calls?
-------------------------------------------------------------------
The idea was borrowed from Modula-3. It turns out to be very useful, for a
variety of reasons.
First, it's more obvious that you are using a method or instance attribute
instead of a local variable. Reading ``self.x`` or ``self.meth()`` makes it
absolutely clear that an instance variable or method is used even if you don't
know the class definition by heart. In C++, you can sort of tell by the lack of
a local variable declaration (assuming globals are rare or easily recognizable)
-- but in Python, there are no local variable declarations, so you'd have to
look up the class definition to be sure. Some C++ and Java coding standards
call for instance attributes to have an ``m_`` prefix, so this explicitness is
still useful in those languages, too.
Second, it means that no special syntax is necessary if you want to explicitly
reference or call the method from a particular class. In C++, if you want to
use a method from a base class which is overridden in a derived class, you have
to use the ``::`` operator -- in Python you can write
``baseclass.methodname(self, <argument list>)``.  This is particularly useful for
:meth:`__init__` methods, and
in general in cases where a derived class method wants to extend the base class
method of the same name and thus has to call the base class method somehow.
Finally, for instance variables it solves a syntactic problem with assignment:
since local variables in Python are (by definition!) those variables to which a
value is assigned in a function body (and that aren't explicitly declared global),
there has to be some way to tell the interpreter that an assignment was meant to
assign to an instance variable instead of to a local variable, and it should
preferably be syntactic (for efficiency reasons). C++ does this through
declarations, but Python doesn't have declarations and it would be a pity having
to introduce them just for this purpose. Using the explicit "self.var" solves
this nicely. Similarly, for using instance variables, having to write
"self.var" means that references to unqualified names inside a method don't have
to search the instance's dictionaries.  To put it another way, local variables
and instance variables live in two different namespaces, and you need to tell
Python which namespace to use.
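
The points above can be seen together in a short sketch. The class names
``Base`` and ``Derived`` are made up for illustration; the example shows an
explicit base class call, an assignment to an instance attribute, and a local
variable that lives in a separate namespace.

```python
class Base(object):
    def __init__(self, name):
        self.name = name               # assignment to an instance attribute

    def describe(self):
        return "Base(%s)" % self.name


class Derived(Base):
    def __init__(self, name, extra):
        Base.__init__(self, name)      # explicit call to the base class method
        self.extra = extra

    def describe(self):
        name = "local"                 # a plain local variable; self.name is untouched
        return Base.describe(self) + " + " + self.extra


d = Derived("spam", "eggs")
print(d.describe())                    # Base(spam) + eggs
```
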
Why can't I use an assignment in an expression?
-----------------------------------------------
Many people used to C or Perl complain that they want to use this C idiom:
.. code-block:: c

   while (line = readline(f)) {
       // do something with line
   }

where in Python you're forced to write this::

   while True:
       line = f.readline()
       if not line:
           break
       ...  # do something with line

The reason for not allowing assignment in Python expressions is a common,
hard-to-find bug in those other languages, caused by this construct:
.. code-block:: c

   if (x = 0) {
       // error handling
   }
   else {
       // code that only works for nonzero x
   }

The error is a simple typo: ``x = 0``, which assigns 0 to the variable ``x``,
was written while the comparison ``x == 0`` is certainly what was intended.
Many alternatives have been proposed. Most are hacks that save some typing but
use arbitrary or cryptic syntax or keywords, and fail the simple criterion for
language change proposals: it should intuitively suggest the proper meaning to a
human reader who has not yet been introduced to the construct.
An interesting phenomenon is that most experienced Python programmers recognize
the ``while True`` idiom and don't seem to be missing the assignment in
expression construct much; it's only newcomers who express a strong desire to
add this to the language.
There's an alternative way of spelling this that seems attractive but is
generally less robust than the "while True" solution::

   line = f.readline()
   while line:
       ...  # do something with line...
       line = f.readline()

The problem with this is that if you change your mind about exactly how you get
the next line (e.g. you want to change it into ``sys.stdin.readline()``) you
have to remember to change two places in your program -- the second occurrence
is hidden at the bottom of the loop.
The best approach is to use iterators, making it possible to loop through
objects using the ``for`` statement. For example, in the current version of
Python file objects support the iterator protocol, so you can now write simply::

   for line in f:
       ...  # do something with line...

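
As a self-contained illustration of the iterator approach, the sketch below
uses ``io.StringIO`` as a stand-in for a real file (an assumption made only so
the example runs without touching the file system).

```python
import io

# io.StringIO stands in for an open text file here.
f = io.StringIO(u"first\nsecond\nthird\n")

lengths = []
for line in f:                         # the file object yields one line per iteration
    lengths.append(len(line.rstrip("\n")))

print(lengths)                         # [5, 6, 5]
```
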
Why does Python use methods for some functionality (e.g. list.index()) but functions for other (e.g. len(list))?
----------------------------------------------------------------------------------------------------------------
The major reason is history. Functions were used for those operations that were
generic for a group of types and which were intended to work even for objects
that didn't have methods at all (e.g. tuples). It is also convenient to have a
function that can readily be applied to an amorphous collection of objects when
you use the functional features of Python (``map()``, ``apply()`` et al).
In fact, implementing ``len()``, ``max()``, ``min()`` as built-in functions is
actually less code than implementing them as methods for each type.  One can
quibble about individual cases but it's a part of Python, and it's too late to
make such fundamental changes now. The functions have to remain to avoid massive
code breakage.
.. XXX talk about protocols?
Note that for string operations Python has moved from external functions (the
``string`` module) to methods. However, ``len()`` is still a function.
Why is join() a string method instead of a list or tuple method?
----------------------------------------------------------------
Strings became much more like other standard types starting in Python 1.6, when
methods were added which give the same functionality that has always been
available using the functions of the string module. Most of these new methods
have been widely accepted, but the one which appears to make some programmers
feel uncomfortable is::

   ", ".join(['1', '2', '4', '8', '16'])

which gives the result::

   "1, 2, 4, 8, 16"

There are two common arguments against this usage.
The first runs along the lines of: "It looks really ugly using a method of a
string literal (string constant)", to which the answer is that it might, but a
string literal is just a fixed value. If the methods are to be allowed on names
bound to strings there is no logical reason to make them unavailable on
literals.
The second objection is typically cast as: "I am really telling a sequence to
join its members together with a string constant". Sadly, you aren't. For some
reason there seems to be much less difficulty with having :meth:`~str.split` as
a string method, since in that case it is easy to see that ::

   "1, 2, 4, 8, 16".split(", ")

is an instruction to a string literal to return the substrings delimited by the
given separator (or, by default, arbitrary runs of white space). In this case a
Unicode string returns a list of Unicode strings, an ASCII string returns a list
of ASCII strings, and everyone is happy.
:meth:`~str.join` is a string method because in using it you are telling the
separator string to iterate over a sequence of strings and insert itself between
adjacent elements. This method can be used with any argument which obeys the
rules for sequence objects, including any new classes you might define yourself.
Because this is a string method it can work for Unicode strings as well as plain
ASCII strings. If ``join()`` were a method of the sequence types then the
sequence types would have to decide which type of string to return depending on
the type of the separator.
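
A brief sketch of the point about accepting any sequence: the same separator
string joins a list, a tuple, and a generator. The variable names are
illustrative only.

```python
separator = ", "

from_list = separator.join(["1", "2", "4"])
from_tuple = separator.join(("a", "b"))
from_generator = separator.join(str(n) for n in range(3))

print(from_list)                       # 1, 2, 4
print(from_tuple)                      # a, b
print(from_generator)                  # 0, 1, 2
```
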
.. XXX remove next paragraph eventually
If none of these arguments persuade you, then for the moment you can continue to
use the ``join()`` function from the string module, which allows you to write ::

   string.join(['1', '2', '4', '8', '16'], ", ")

How fast are exceptions?
------------------------
A try/except block is extremely efficient if no exception is raised.  Actually
catching an exception is expensive.  In versions of Python prior to 2.0 it was
common to use this idiom::

   try:
       value = dict[key]
   except KeyError:
       dict[key] = getvalue(key)
       value = dict[key]

This only made sense when you expected the dict to have the key almost all the
time. If that wasn't the case, you coded it like this::

   if dict.has_key(key):
       value = dict[key]
   else:
       dict[key] = getvalue(key)
       value = dict[key]

(In Python 2.0 and higher, you can code this as ``value = dict.setdefault(key,
getvalue(key))``.)
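
The three idioms can be compared side by side. This is a sketch written for
Python 3, so it uses the ``in`` operator rather than ``has_key()``; ``getvalue``
and ``cache`` are hypothetical names. One detail worth noticing is that the
argument to ``setdefault()`` is evaluated even when the key is already present.

```python
calls = []


def getvalue(key):
    """Hypothetical expensive computation; records each call."""
    calls.append(key)
    return key.upper()


cache = {}

# Idiom 1: cheap when the key is usually present.
try:
    value = cache["a"]
except KeyError:
    cache["a"] = getvalue("a")
    value = cache["a"]

# Idiom 2: test first when misses are common.
if "b" in cache:
    value_b = cache["b"]
else:
    value_b = cache["b"] = getvalue("b")

# Idiom 3: compact, but getvalue("a") runs although "a" is already cached.
value_a = cache.setdefault("a", getvalue("a"))

print(calls)                           # ['a', 'b', 'a']
```
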
Why isn't there a switch or case statement in Python?
-----------------------------------------------------
You can do this easily enough with a sequence of ``if... elif... elif... else``.
There have been some proposals for switch statement syntax, but there is no
consensus (yet) on whether and how to do range tests. See :pep:`275` for
complete details and the current status.
For cases where you need to choose from a very large number of possibilities,
you can create a dictionary mapping case values to functions to call. For
example::

   def function_1(...):
       ...

   functions = {'a': function_1,
                'b': function_2,
                'c': self.method_1, ...}

   func = functions[value]
   func()

For calling methods on objects, you can simplify yet further by using the
:func:`getattr` built-in to retrieve methods with a particular name::

   def visit_a(self, ...):
       ...
   ...

   def dispatch(self, value):
       method_name = 'visit_' + str(value)
       method = getattr(self, method_name)
       method()

It's suggested that you use a prefix for the method names, such as ``visit_`` in
this example. Without such a prefix, if values are coming from an untrusted
source, an attacker would be able to call any method on your object.
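
A runnable variant of the prefix-based dispatch might look like the sketch
below. The ``Visitor`` class and the ``visit_default`` fallback are assumptions
added for illustration; the original fragment has no fallback.

```python
class Visitor(object):
    def visit_a(self):
        return "handled a"

    def visit_b(self):
        return "handled b"

    def visit_default(self):
        return "no handler"

    def dispatch(self, value):
        # Only attributes whose names start with 'visit_' can be reached.
        method = getattr(self, "visit_" + str(value), self.visit_default)
        return method()


v = Visitor()
print(v.dispatch("a"))                 # handled a
print(v.dispatch("dispatch"))          # no handler
```

Because the prefix is always prepended, a value such as ``"dispatch"`` looks up
``visit_dispatch``, which does not exist, instead of calling ``dispatch``
itself.
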
Can't you emulate threads in the interpreter instead of relying on an OS-specific thread implementation?
--------------------------------------------------------------------------------------------------------
Answer 1: Unfortunately, the interpreter pushes at least one C stack frame for
each Python stack frame. Also, extensions can call back into Python at almost
random moments. Therefore, a complete threads implementation requires thread
support for C.
Answer 2: Fortunately, there is `Stackless Python <http://www.stackless.com>`_,
which has a completely redesigned interpreter loop that avoids the C stack.
It's still experimental but looks very promising. Although it is binary
compatible with standard Python, it's still unclear whether Stackless will make
it into the core -- maybe it's just too revolutionary.
Why can't lambda forms contain statements?
------------------------------------------
Python lambda forms cannot contain statements because Python's syntactic
framework can't handle statements nested inside expressions. However, in
Python, this is not a serious problem. Unlike lambda forms in other languages,
where they add functionality, Python lambdas are only a shorthand notation if
you're too lazy to define a function.
Functions are already first class objects in Python, and can be declared in a
local scope. Therefore the only advantage of using a lambda form instead of a
locally-defined function is that you don't need to invent a name for the
function -- but that's just a local variable to which the function object (which
is exactly the same type of object that a lambda form yields) is assigned!
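
To make the equivalence concrete, here is a small sketch; the names
``square_lambda`` and ``square_def`` are illustrative.

```python
square_lambda = lambda x: x * x


def square_def(x):
    return x * x


print(square_lambda(4))                            # 16
print(square_def(4))                               # 16
print(type(square_lambda) is type(square_def))     # True
```
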
Can Python be compiled to machine code, C or some other language?
-----------------------------------------------------------------
Not easily. Python's high level data types, dynamic typing of objects and
run-time invocation of the interpreter (using :func:`eval` or :keyword:`exec`)
together mean that a "compiled" Python program would probably consist mostly of
calls into the Python run-time system, even for seemingly simple operations like
``x+1``.
Several projects described in the Python newsgroup or at past `Python
conferences <http://python.org/community/workshops/>`_ have shown that this approach is feasible,
although the speedups reached so far are only modest (e.g. 2x). Jython uses the
same strategy for compiling to Java bytecode. (Jim Hugunin has demonstrated
that in combination with whole-program analysis, speedups of 1000x are feasible
for small demo programs. See the proceedings from the `1997 Python conference
<http://python.org/community/workshops/1997-10/proceedings/>`_ for more information.)
Internally, Python source code is always translated into a bytecode
representation, and this bytecode is then executed by the Python virtual
machine. In order to avoid the overhead of repeatedly parsing and translating
modules that rarely change, this byte code is written into a file whose name
ends in ".pyc" whenever a module is parsed. When the corresponding .py file is
changed, it is parsed and translated again and the .pyc file is rewritten.
There is no performance difference once the .pyc file has been loaded, as the
bytecode read from the .pyc file is exactly the same as the bytecode created by
direct translation. The only difference is that loading code from a .pyc file
is faster than parsing and translating a .py file, so the presence of
precompiled .pyc files improves the start-up time of Python scripts. If
desired, the Lib/compileall.py module can be used to create valid .pyc files for
a given set of modules.
Note that the main script executed by Python, even if its filename ends in .py,
is not compiled to a .pyc file. It is compiled to bytecode, but the bytecode is
not saved to a file. Usually main scripts are quite short, so this doesn't cost
much speed.
.. XXX check which of these projects are still alive
There are also several programs which make it easier to intermingle Python and C
code in various ways to increase performance. See, for example, `Psyco
<http://psyco.sourceforge.net/>`_, `Pyrex
<http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/>`_, `PyInline
<http://pyinline.sourceforge.net/>`_, `Py2Cmod
<http://sourceforge.net/projects/py2cmod/>`_, and `Weave
<http://www.scipy.org/site_content/weave>`_.
How does Python manage memory?
------------------------------
The details of Python memory management depend on the implementation. The
standard C implementation of Python uses reference counting to detect
inaccessible objects, and another mechanism to collect reference cycles,
periodically executing a cycle detection algorithm which looks for inaccessible
cycles and deletes the objects involved. The :mod:`gc` module provides functions
to perform a garbage collection, obtain debugging statistics, and tune the
collector's parameters.
Jython relies on the Java runtime so the JVM's garbage collector is used. This
difference can cause some subtle porting problems if your Python code depends on
the behavior of the reference counting implementation.
Sometimes objects get stuck in tracebacks temporarily and hence are not
deallocated when you might expect. Clear the tracebacks with::

   import sys
   sys.exc_clear()
   sys.exc_traceback = sys.last_traceback = None

Tracebacks are used for reporting errors, implementing debuggers and related
things. They contain a portion of the program state extracted during the
handling of an exception (usually the most recent exception).
In the absence of circularities and tracebacks, Python programs need not
explicitly manage memory.
Why doesn't Python use a more traditional garbage collection scheme? For one
thing, this is not a C standard feature and hence it's not portable. (Yes, we
know about the Boehm GC library. It has bits of assembler code for *most*
common platforms, not for all of them, and although it is mostly transparent, it
isn't completely transparent; patches are required to get Python to work with
it.)
Traditional GC also becomes a problem when Python is embedded into other
applications. While in a standalone Python it's fine to replace the standard
malloc() and free() with versions provided by the GC library, an application
embedding Python may want to have its *own* substitute for malloc() and free(),
and may not want Python's. Right now, Python works with anything that
implements malloc() and free() properly.
In Jython, the following code (which is fine in CPython) will probably run out
of file descriptors long before it runs out of memory::

   for file in <very long list of files>:
       f = open(file)
       c = f.read(1)

Using the current reference counting and destructor scheme, each new assignment
to f closes the previous file. Using GC, this is not guaranteed. If you want
to write code that will work with any Python implementation, you should
explicitly close the file; this will work regardless of GC::

   for file in <very long list of files>:
       f = open(file)
       c = f.read(1)
       f.close()

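
Where the ``with`` statement is available (Python 2.6 and later, or 2.5 with a
``__future__`` import), it is another way to close each file explicitly. The
sketch below creates a few temporary files of its own so that it is
self-contained; the file names and contents are made up.

```python
import os
import shutil
import tempfile

# Create a few small temporary files to read back.
directory = tempfile.mkdtemp()
paths = []
for i in range(3):
    path = os.path.join(directory, "file%d.txt" % i)
    with open(path, "w") as out:
        out.write("data %d" % i)
    paths.append(path)

first_chars = []
all_closed = True
for path in paths:
    with open(path) as f:              # f is closed when the block exits
        first_chars.append(f.read(1))
    all_closed = all_closed and f.closed

shutil.rmtree(directory)               # remove the temporary files again
print(first_chars)                     # ['d', 'd', 'd']
```
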
Why isn't all memory freed when Python exits?
---------------------------------------------
Objects referenced from the global namespaces of Python modules are not always
deallocated when Python exits. This may happen if there are circular
references. There are also certain bits of memory that are allocated by the C
library that are impossible to free (e.g. a tool like Purify will complain about
these). Python is, however, aggressive about cleaning up memory on exit and
does try to destroy every single object.
If you want to force Python to delete certain things on deallocation use the
:mod:`atexit` module to run a function that will force those deletions.
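
A minimal sketch of that approach follows. The ``resources`` dictionary and
``release_resources`` function are hypothetical; only ``atexit.register()`` is
the real library call.

```python
import atexit

resources = {"connection": "open"}     # hypothetical module-level state


def release_resources():
    """Drop references that should not survive until interpreter teardown."""
    resources.clear()


atexit.register(release_resources)     # called at normal interpreter exit
```
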
Why are there separate tuple and list data types?
-------------------------------------------------
Lists and tuples, while similar in many respects, are generally used in
fundamentally different ways. Tuples can be thought of as being similar to
Pascal records or C structs; they're small collections of related data which may
be of different types which are operated on as a group. For example, a
Cartesian coordinate is appropriately represented as a tuple of two or three
numbers.
Lists, on the other hand, are more like arrays in other languages. They tend to
hold a varying number of objects all of which have the same type and which are
operated on one-by-one. For example, ``os.listdir('.')`` returns a list of
strings representing the files in the current directory. Functions which
operate on this output would generally not break if you added another file or
two to the directory.
Tuples are immutable, meaning that once a tuple has been created, you can't
replace any of its elements with a new value. Lists are mutable, meaning that
you can always change a list's elements. Only immutable elements can be used as
dictionary keys, and hence only tuples and not lists can be used as keys.
How are lists implemented?
--------------------------
Python's lists are really variable-length arrays, not Lisp-style linked lists.
The implementation uses a contiguous array of references to other objects, and
keeps a pointer to this array and the array's length in a list head structure.
This makes indexing a list ``a[i]`` an operation whose cost is independent of
the size of the list or the value of the index.
When items are appended or inserted, the array of references is resized. Some
cleverness is applied to improve the performance of appending items repeatedly;
when the array must be grown, some extra space is allocated so the next few
times don't require an actual resize.
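
The over-allocation can be observed indirectly. The sketch below, written for
Python 3 and assuming CPython, records the size reported by
``sys.getsizeof()`` after every append; the exact numbers vary between versions
and platforms, so none are shown.

```python
import sys

items = []
sizes = set()
for i in range(1000):
    items.append(i)
    sizes.add(sys.getsizeof(items))    # size of the list object in bytes

# Far fewer distinct sizes than appends means most appends did not resize.
print(len(sizes), "distinct sizes for", len(items), "appends")
```
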
How are dictionaries implemented?
---------------------------------
Python's dictionaries are implemented as resizable hash tables. Compared to
B-trees, this gives better performance for lookup (the most common operation by
far) under most circumstances, and the implementation is simpler.
Dictionaries work by computing a hash code for each key stored in the dictionary
using the :func:`hash` built-in function. The hash code varies widely depending
on the key; for example, "Python" hashes to -539294296 while "python", a string
that differs by a single bit, hashes to 1142331976. The hash code is then used
to calculate a location in an internal array where the value will be stored.
Assuming that you're storing keys that all have different hash values, this
means that dictionaries take constant time -- O(1), in computer science notation
-- to retrieve a key. It also means that no sorted order of the keys is
maintained, and traversing the array as the ``.keys()`` and ``.items()`` methods do will
output the dictionary's content in some arbitrary jumbled order.
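
The idea of turning a hash code into an array location can be sketched with a
toy table, written here for Python 3. This is an illustration only:
``ToyHashTable`` is a made-up class, it stores colliding keys in small
per-slot lists and never resizes, and CPython's real dictionary resolves
collisions differently.

```python
class ToyHashTable(object):
    """Illustrative only; not how CPython's dict is actually written."""

    def __init__(self, size=8):
        self.slots = [[] for _ in range(size)]   # each slot holds (key, value) pairs

    def _slot(self, key):
        # The hash code selects the slot; % always yields a valid index.
        return self.slots[hash(key) % len(self.slots)]

    def put(self, key, value):
        slot = self._slot(key)
        for i, (k, _) in enumerate(slot):
            if k == key:
                slot[i] = (key, value)           # replace an existing entry
                return
        slot.append((key, value))

    def get(self, key):
        for k, v in self._slot(key):
            if k == key:
                return v
        raise KeyError(key)


table = ToyHashTable()
table.put("Python", 1)
table.put("python", 2)
print(table.get("Python"), table.get("python"))  # 1 2
```
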
Why must dictionary keys be immutable?
--------------------------------------
The hash table implementation of dictionaries uses a hash value calculated from
the key value to find the key. If the key were a mutable object, its value
could change, and thus its hash could also change. But since whoever changes
the key object can't tell that it was being used as a dictionary key, it can't
move the entry around in the dictionary. Then, when you try to look up the same
object in the dictionary it won't be found because its hash value is different.
If you tried to look up the old value it wouldn't be found either, because the
value of the object found in that hash bin would be different.
If you want a dictionary indexed with a list, simply convert the list to a tuple
first; the function ``tuple(L)`` creates a tuple with the same entries as the
list ``L``. Tuples are immutable and can therefore be used as dictionary keys.
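
For instance, in the following sketch (the names ``point`` and ``distances``
are illustrative) the list is rejected as a key while its tuple copy is
accepted.

```python
point = [3, 4]
distances = {}

try:
    distances[point] = 5.0             # lists are unhashable
except TypeError:
    distances[tuple(point)] = 5.0      # the tuple copy is a valid key

print(distances[(3, 4)])               # 5.0
```
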
Some unacceptable solutions that have been proposed:

- Hash lists by their address (object ID).  This doesn't work because if you
  construct a new list with the same value it won't be found; e.g.::

     d = {[1,2]: '12'}
     print d[[1,2]]

  would raise a KeyError exception because the id of the ``[1,2]`` used in the
  second line differs from that in the first line.  In other words, dictionary
  keys should be compared using ``==``, not using :keyword:`is`.

- Make a copy when using a list as a key.  This doesn't work because the list,
  being a mutable object, could contain a reference to itself, and then the
  copying code would run into an infinite loop.

- Allow lists as keys but tell the user not to modify them.  This would allow a
  class of hard-to-track bugs in programs when you forgot or modified a list by
  accident.  It also invalidates an important invariant of dictionaries: every
  value in ``d.keys()`` is usable as a key of the dictionary.

- Mark lists as read-only once they are used as a dictionary key.  The problem
  is that it's not just the top-level object that could change its value; you
  could use a tuple containing a list as a key.  Entering anything as a key into
  a dictionary would require marking all objects reachable from there as
  read-only -- and again, self-referential objects could cause an infinite loop.

There is a trick to get around this if you need to, but use it at your own risk:
You can wrap a mutable structure inside a class instance which has both a
:meth:`__cmp__` and a :meth:`__hash__` method.  You must then make sure that the
hash value for all such wrapper objects that reside in a dictionary (or other
hash based structure) remains fixed while the object is in the dictionary (or
other structure). ::

   class ListWrapper:
       def __init__(self, the_list):
           self.the_list = the_list

       def __cmp__(self, other):
           # __cmp__ must return 0 when the two objects compare equal
           return cmp(self.the_list, other.the_list)

       def __hash__(self):
           l = self.the_list
           result = 98767 - len(l)*555
           for i in range(len(l)):
               try:
                   result = result + (hash(l[i]) % 9999999) * 1001 + i
               except Exception:
                   # the element is unhashable; mix in its position instead
                   result = (result % 7777777) + i * 333
           return result

Note that the hash computation is complicated by the possibility that some
members of the list may be unhashable and also by the possibility of arithmetic
overflow.
Furthermore it must always be the case that if ``o1 == o2`` (i.e. ``o1.__cmp__(o2)
== 0``) then ``hash(o1) == hash(o2)`` (i.e. ``o1.__hash__() == o2.__hash__()``),
regardless of whether the object is in a dictionary or not. If you fail to meet
these restrictions dictionaries and other hash based structures will misbehave.
In the case of ListWrapper, whenever the wrapper object is in a dictionary the
wrapped list must not change to avoid anomalies. Don't do this unless you are
prepared to think hard about the requirements and the consequences of not
meeting them correctly. Consider yourself warned.
Why doesn't list.sort() return the sorted list?
-----------------------------------------------
In situations where performance matters, making a copy of the list just to sort
it would be wasteful. Therefore, :meth:`list.sort` sorts the list in place. In
order to remind you of that fact, it does not return the sorted list. This way,
you won't be fooled into accidentally overwriting a list when you need a sorted
copy but also need to keep the unsorted version around.
In Python 2.4 a new builtin -- :func:`sorted` -- has been added. This function
creates a new list from a provided iterable, sorts it and returns it. For
example, here's how to iterate over the keys of a dictionary in sorted order::

   for key in sorted(dict.iterkeys()):
       ...  # do whatever with dict[key]...

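
The difference between the two is easy to see in a short session. The variable
names below are illustrative.

```python
numbers = [3, 1, 2]
result = numbers.sort()                # sorts in place and returns None
print(result)                          # None
print(numbers)                         # [1, 2, 3]

original = [3, 1, 2]
ordered = sorted(original)             # returns a new list
print(original)                        # [3, 1, 2]
print(ordered)                         # [1, 2, 3]
```
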
How do you specify and enforce an interface spec in Python?
-----------------------------------------------------------
An interface specification for a module as provided by languages such as C++ and
Java describes the prototypes for the methods and functions of the module. Many
feel that compile-time enforcement of interface specifications helps in the
construction of large programs.
Python 2.6 adds an :mod:`abc` module that lets you define Abstract Base Classes
(ABCs). You can then use :func:`isinstance` and :func:`issubclass` to check
whether an instance or a class implements a particular ABC. The
:mod:`collections` module defines a set of useful ABCs such as
:class:`Iterable`, :class:`Container`, and :class:`MutableMapping`.
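
As a rough sketch of how such checks look in practice, the example below is
written for Python 3, where ``abc.ABC`` is available and the collection ABCs
are imported from ``collections.abc``. ``Shape`` and ``Square`` are
hypothetical classes invented for the illustration.

```python
import abc
from collections.abc import Iterable


class Shape(abc.ABC):
    """Hypothetical interface: subclasses must provide area()."""

    @abc.abstractmethod
    def area(self):
        raise NotImplementedError


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


print(isinstance(Square(2), Shape))    # True
print(issubclass(Square, Shape))       # True
print(isinstance([1, 2], Iterable))    # True

try:
    Shape()                            # abstract classes cannot be instantiated
except TypeError as exc:
    print("TypeError:", exc)
```
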
For Python, many of the advantages of interface specifications can be obtained
by an appropriate test discipline for components. There is also a tool,
PyChecker, which can be used to find problems due to subclassing.
A good test suite for a module can both provide a regression test and serve as a
module interface specification and a set of examples. Many Python modules can
be run as a script to provide a simple "self test." Even modules which use
complex external interfaces can often be tested in isolation using trivial
"stub" emulations of the external interface. The :mod:`doctest` and
:mod:`unittest` modules or third-party test frameworks can be used to construct
exhaustive test suites that exercise every line of code in a module.
An appropriate testing discipline can help build large complex applications in
Python as well as having interface specifications would. In fact, it can be
better because an interface specification cannot test certain properties of a
program. For example, the :meth:`append` method is expected to add new elements
to the end of some internal list; an interface specification cannot test that
your :meth:`append` implementation will actually do this correctly, but it's
trivial to check this property in a test suite.
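The :meth:`append` property mentioned above could be checked along these lines. This is only a sketch: ``Stack`` is a hypothetical class invented for the example, and the test is run through :mod:`unittest` programmatically so the snippet is self-contained.

```python
import unittest


class Stack(object):
    """Hypothetical container with an append() method."""

    def __init__(self):
        self._items = []

    def append(self, item):
        self._items.append(item)

    def items(self):
        return list(self._items)


class AppendTest(unittest.TestCase):
    def test_append_adds_to_end(self):
        s = Stack()
        s.append("a")
        s.append("b")
        # The behavioral property an interface declaration cannot express:
        # new elements end up last, in insertion order.
        self.assertEqual(s.items(), ["a", "b"])


# Run the single test case and collect the outcome in a TestResult.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AppendTest)
result = unittest.TestResult()
suite.run(result)
print(result.wasSuccessful())
```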
Writing test suites is very helpful, and you might want to design your code with
an eye to making it easily tested. One increasingly popular technique,
test-driven development, calls for writing parts of the test suite first,
before you write any of the actual code. Of course Python allows you to be
sloppy and not write test cases at all.
Why are default values shared between objects?
----------------------------------------------
This type of bug commonly bites neophyte programmers. Consider this function::
def foo(D={}): # Danger: shared reference to one dict for all calls
    ... compute something ...
    D[key] = value
    return D
The first time you call this function, ``D`` contains a single item. The second
time, ``D`` contains two items because when ``foo()`` begins executing, ``D``
starts out with an item already in it.
It is often expected that a function call creates new objects for default
values. This is not what happens. Default values are created exactly once, when
the function is defined. If that object is changed, like the dictionary in this
example, subsequent calls to the function will refer to this changed object.
By definition, immutable objects such as numbers, strings, tuples, and ``None``
are safe from change. Changes to mutable objects such as dictionaries, lists,
and class instances can lead to confusion.
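The sharing can be observed directly. The following is a runnable variant of ``foo()``; the ``key`` and ``value`` parameters are added here purely for illustration and are not part of the function discussed above.

```python
def foo(key, value, D={}):   # Danger: one dict shared by every call
    D[key] = value
    return D


first = foo("a", 1)
second = foo("b", 2)

print(first is second)               # both calls returned the same dict object
print(sorted(second))                # keys from both calls are present
print(foo.__defaults__[0] is first)  # the default is stored on the function
```

The default dictionary is stored on the function object (in ``foo.__defaults__``), which is why it survives from one call to the next.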
Because of this feature, it is good programming practice to not use mutable
objects as default values. Instead, use ``None`` as the default value and
inside the function, check if the parameter is ``None`` and create a new
list/dictionary/whatever if it is. For example, don't write::
def foo(mydict={}):  # Danger: shared reference to one dict for all calls
    ...
but::
def foo(mydict=None):
    if mydict is None:
        mydict = {}  # create a new dict for local namespace
This feature can be useful. When you have a function that's time-consuming to
compute, a common technique is to cache the parameters and the resulting value
of each call to the function, and return the cached value if the same value is
requested again. This is called "memoizing", and can be implemented like this::
# Callers will never provide a third parameter for this function.
def expensive(arg1, arg2, _cache={}):
    if (arg1, arg2) in _cache:
        return _cache[(arg1, arg2)]
    # Calculate the value
    result = ... expensive computation ...
    _cache[(arg1, arg2)] = result  # Store result in the cache
    return result
You could use a global variable containing a dictionary instead of the default
value; it's a matter of taste.
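A runnable sketch of the same memoizing idea follows. The exponentiation is only a stand-in for a genuinely expensive computation, and ``call_count`` is a hypothetical counter added to make the caching visible.

```python
call_count = 0   # hypothetical counter: how often the real computation runs


def expensive(arg1, arg2, _cache={}):
    """Return arg1 ** arg2, computing each distinct argument pair once."""
    global call_count
    key = (arg1, arg2)
    if key in _cache:
        return _cache[key]
    call_count += 1            # stand-in for the expensive computation
    result = arg1 ** arg2
    _cache[key] = result       # store result in the shared default dict
    return result


a = expensive(2, 10)
b = expensive(2, 10)   # same arguments: served from the cache
c = expensive(3, 3)    # new arguments: computed
print(a, b, c, call_count)
```

Note that the arguments must be hashable for this to work, since they are used as a dictionary key.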
Why is there no goto?
---------------------
You can use exceptions to provide a "structured goto" that even works across
function calls. Many feel that exceptions can conveniently emulate all
reasonable uses of the "go" or "goto" constructs of C, Fortran, and other
languages. For example::
class label(Exception): pass  # declare a label
try:
    ...
    if condition: raise label()  # goto label
    ...
except label:  # where to goto
    pass
...
This doesn't allow you to jump into the middle of a loop, but that's usually
considered an abuse of goto anyway. Use sparingly.
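One place where this pattern is sometimes used is leaving several nested loops at once. A small sketch, in which the names ``Found`` and ``find_pair`` are hypothetical:

```python
class Found(Exception):
    """Hypothetical 'label': raised to jump out of both loops at once."""


def find_pair(rows, target):
    """Return (row, col) of the first cell equal to target, or None."""
    position = None
    try:
        for r, row in enumerate(rows):
            for c, cell in enumerate(row):
                if cell == target:
                    position = (r, c)
                    raise Found()      # "goto" the except clause
    except Found:
        pass                           # the jump lands here
    return position


grid = [[1, 2, 3], [4, 5, 6]]
print(find_pair(grid, 5))
print(find_pair(grid, 9))
```

In a function as small as this one, a plain ``return`` inside the inner loop would do the same job; the exception form is mainly of interest when the jump has to cross function boundaries.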
Why can't raw strings (r-strings) end with a backslash?
-------------------------------------------------------
More precisely, they can't end with an odd number of backslashes: the unpaired
backslash at the end escapes the closing quote character, leaving an
unterminated string.
Raw strings were designed to ease creating input for processors (chiefly regular
expression engines) that want to do their own backslash escape processing. Such
processors consider an unmatched trailing backslash to be an error anyway, so
raw strings disallow that. In return, they allow you to pass on the string
quote character by escaping it with a backslash. These rules work well when
r-strings are used for their intended purpose.
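These rules can be checked with a short sketch. The helper ``compiles()`` is hypothetical; it simply asks the built-in :func:`compile` whether a piece of source text is accepted.

```python
quote = r"\""     # two characters: a backslash and a double quote
paired = r"a\\"   # an even number of trailing backslashes is fine


def compiles(source):
    """Return True if source is a syntactically valid Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# Source text r"abc\" (one trailing backslash): the quote is escaped,
# so the literal is never terminated.
odd_ok = compiles('r"abc' + "\\" + '"')
# Source text r"abc\\" (two trailing backslashes): accepted.
even_ok = compiles('r"abc' + "\\\\" + '"')
print(len(quote), len(paired), odd_ok, even_ok)
```

Note that in ``r"\""`` the backslash stays in the resulting string; it only prevents the quote from ending the literal.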
If you're trying to build Windows pathnames, note that all Windows system calls
accept forward slashes too::
f = open("/mydir/file.txt") # works fine!
If you're trying to build a pathname for a DOS command, try e.g. one of ::
dir = r"\this\is\my\dos\dir" "\\"
dir = r"\this\is\my\dos\dir\ "[:-1]
dir = "\\this\\is\\my\\dos\\dir\\"
Why doesn't Python have a "with" statement for attribute assignments?
---------------------------------------------------------------------
Python has a 'with' statement that wraps the execution of a block, calling code
on the entrance and exit from the block. Some languages have a construct that
looks like this::
with obj:
    a = 1               # equivalent to obj.a = 1
    total = total + 1   # obj.total = obj.total + 1
In Python, such a construct would be ambiguous.
Other languages, such as Object Pascal, Delphi, and C++, use static types, so
it's possible to know, in an unambiguous way, what member is being assigned
to. This is the main point of static typing -- the compiler *always* knows the
scope of every variable at compile time.
Python uses dynamic types. It is impossible to know in advance which attribute
will be referenced at runtime. Member attributes may be added or removed from
objects on the fly. This makes it impossible to know, from a simple reading,
what attribute is being referenced: a local one, a global one, or a member
attribute?
For instance, take the following incomplete snippet::
def foo(a):
    with a:
        print x
The snippet assumes that "a" must have a member attribute called "x". However,
there is nothing in Python that tells the interpreter this. What should happen
if "a" is, let us say, an integer? If there is a global variable named "x",
will it be used inside the with block? As you see, the dynamic nature of Python
makes such choices much harder.
The primary benefit of "with" and similar language features (reduction of code
volume) can, however, easily be achieved in Python by assignment. Instead of::
function(args).dict[index][index].a = 21
function(args).dict[index][index].b = 42
function(args).dict[index][index].c = 63
write this::
ref = function(args).dict[index][index]
ref.a = 21
ref.b = 42
ref.c = 63
This also has the side-effect of increasing execution speed because name
bindings are resolved at run-time in Python, and the second version only needs
to perform the resolution once. If the referenced object does not have a, b and
c attributes, of course, the end result is still a run-time exception.
Why are colons required for the if/while/def/class statements?
--------------------------------------------------------------
The colon is required primarily to enhance readability (one of the results of
the experimental ABC language). Consider this::
if a == b
    print a
versus ::
if a == b:
    print a
Notice how the second one is slightly easier to read. Notice further how a
colon sets off the example in this FAQ answer; it's a standard usage in English.
Another minor reason is that the colon makes it easier for editors with syntax
highlighting; they can look for colons to decide when indentation needs to be
increased instead of having to do a more elaborate parsing of the program text.
Why does Python allow commas at the end of lists and tuples?
------------------------------------------------------------
Python lets you add a trailing comma at the end of lists, tuples, and
dictionaries::
[1, 2, 3,]
('a', 'b', 'c',)
d = {
    "A": [1, 5],
    "B": [6, 7],  # last trailing comma is optional but good style
}
There are several reasons to allow this.
When you have a literal value for a list, tuple, or dictionary spread across
multiple lines, it's easier to add more elements because you don't have to
remember to add a comma to the previous line. The lines can also be sorted in
your editor without creating a syntax error.
Accidentally omitting the comma can lead to errors that are hard to diagnose.
For example::
x = [
    "fee",
    "fie"
    "foo",
    "fum"
]
This list looks like it has four elements, but it actually contains three:
"fee", "fiefoo" and "fum". Always adding the comma avoids this source of error.
Allowing the trailing comma may also make programmatic code generation easier.
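The missing-comma pitfall described above is easy to reproduce. In this sketch the list ``y`` is an added counterpart, not part of the original example, that shows the same data written with a comma after every element.

```python
x = [
    "fee",
    "fie"      # no comma: the next literal is glued onto this one
    "foo",
    "fum"
]
y = [
    "fee",
    "fie",
    "foo",
    "fum",     # trailing comma: adding or reordering lines stays safe
]
print(len(x), len(y))
```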

481
Doc/faq/extending.rst Normal file

@ -0,0 +1,481 @@
=======================
Extending/Embedding FAQ
=======================
.. contents::
.. highlight:: c
Can I create my own functions in C?
-----------------------------------
Yes, you can create built-in modules containing functions, variables, exceptions
and even new types in C. This is explained in the document
:ref:`extending-index`.
Most intermediate or advanced Python books will also cover this topic.
Can I create my own functions in C++?
-------------------------------------
Yes, using the C compatibility features found in C++. Place ``extern "C" {
... }`` around the Python include files and put ``extern "C"`` before each
function that is going to be called by the Python interpreter. Global or static
C++ objects with constructors are probably not a good idea.
Writing C is hard; are there any alternatives?
----------------------------------------------
There are a number of alternatives to writing your own C extensions, depending
on what you're trying to do.
.. XXX make sure these all work; mention Cython
If you need more speed, `Psyco <http://psyco.sourceforge.net/>`_ generates x86
assembly code from Python bytecode. You can use Psyco to compile the most
time-critical functions in your code, and gain a significant improvement with
very little effort, as long as you're running on a machine with an
x86-compatible processor.
`Pyrex <http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/>`_ is a compiler
that accepts a slightly modified form of Python and generates the corresponding
C code. Pyrex makes it possible to write an extension without having to learn
Python's C API.
If you need to interface to some C or C++ library for which no Python extension
currently exists, you can try wrapping the library's data types and functions
with a tool such as `SWIG <http://www.swig.org>`_. `SIP
<http://www.riverbankcomputing.co.uk/sip/>`_, `CXX
<http://cxx.sourceforge.net/>`_, `Boost
<http://www.boost.org/libs/python/doc/index.html>`_, or `Weave
<http://www.scipy.org/site_content/weave>`_ are also alternatives for wrapping
C++ libraries.
How can I execute arbitrary Python statements from C?
-----------------------------------------------------
The highest-level function to do this is :cfunc:`PyRun_SimpleString` which takes
a single string argument to be executed in the context of the module
``__main__`` and returns 0 for success and -1 when an exception occurred
(including ``SyntaxError``). If you want more control, use
:cfunc:`PyRun_String`; see the source for :cfunc:`PyRun_SimpleString` in
``Python/pythonrun.c``.
How can I evaluate an arbitrary Python expression from C?
---------------------------------------------------------
Call the function :cfunc:`PyRun_String` from the previous question with the
start symbol :cdata:`Py_eval_input`; it parses an expression, evaluates it and
returns its value.
How do I extract C values from a Python object?
-----------------------------------------------
That depends on the object's type. If it's a tuple, :cfunc:`PyTuple_Size`
returns its length and :cfunc:`PyTuple_GetItem` returns the item at a specified
index. Lists have similar functions, :cfunc:`PyList_Size` and
:cfunc:`PyList_GetItem`.
For strings, :cfunc:`PyString_Size` returns its length and
:cfunc:`PyString_AsString` a pointer to its value. Note that Python strings may
contain null bytes so C's :cfunc:`strlen` should not be used.
To test the type of an object, first make sure it isn't *NULL*, and then use
:cfunc:`PyString_Check`, :cfunc:`PyTuple_Check`, :cfunc:`PyList_Check`, etc.
There is also a high-level API to Python objects which is provided by the
so-called 'abstract' interface -- read ``Include/abstract.h`` for further
details. It allows interfacing with any kind of Python sequence using calls
like :cfunc:`PySequence_Length`, :cfunc:`PySequence_GetItem`, etc.) as well as
many other useful protocols.
How do I use Py_BuildValue() to create a tuple of arbitrary length?
-------------------------------------------------------------------
You can't. Use ``t = PyTuple_New(n)`` instead, and fill it with objects using
``PyTuple_SetItem(t, i, o)`` -- note that this "eats" (steals) a reference to
``o``, so you have to :cfunc:`Py_INCREF` it first if you still need it
afterwards. Lists have similar functions
``PyList_New(n)`` and ``PyList_SetItem(l, i, o)``. Note that you *must* set all
the tuple items to some value before you pass the tuple to Python code --
``PyTuple_New(n)`` initializes them to NULL, which isn't a valid Python value.
How do I call an object's method from C?
----------------------------------------
The :cfunc:`PyObject_CallMethod` function can be used to call an arbitrary
method of an object. The parameters are the object, the name of the method to
call, a format string like that used with :cfunc:`Py_BuildValue`, and the
argument values::
PyObject *
PyObject_CallMethod(PyObject *object, char *method_name,
                    char *arg_format, ...);
This works for any object that has methods -- whether built-in or user-defined.
You are responsible for eventually :cfunc:`Py_DECREF`\ 'ing the return value.
To call, e.g., a file object's "seek" method with arguments 10, 0 (assuming the
file object pointer is "f")::
res = PyObject_CallMethod(f, "seek", "(ii)", 10, 0);
if (res == NULL) {
        ... an exception occurred ...
}
else {
        Py_DECREF(res);
}
Note that since :cfunc:`PyObject_CallObject` *always* wants a tuple for the
argument list, to call a function without arguments, pass "()" for the format,
and to call a function with one argument, surround the argument in parentheses,
e.g. "(i)".
How do I catch the output from PyErr_Print() (or anything that prints to stdout/stderr)?
----------------------------------------------------------------------------------------
In Python code, define an object that supports the ``write()`` method. Assign
this object to :data:`sys.stdout` and :data:`sys.stderr`. Call :cfunc:`PyErr_Print`, or
just allow the standard traceback mechanism to work. Then, the output will go
wherever your ``write()`` method sends it.
The easiest way to do this is to use the StringIO class in the standard library.
Sample code and use for catching stdout:
>>> class StdoutCatcher:
...     def __init__(self):
...         self.data = ''
...     def write(self, stuff):
...         self.data = self.data + stuff
...
>>> import sys
>>> sys.stdout = StdoutCatcher()
>>> print 'foo'
>>> print 'hello world!'
>>> sys.stderr.write(sys.stdout.data)
foo
hello world!
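The StringIO approach mentioned above might look like the following sketch. It assumes Python 3, where the class is ``io.StringIO``; in Python 2 the class lives in the ``StringIO`` module instead.

```python
import io
import sys

buffer = io.StringIO()
saved_stdout = sys.stdout
sys.stdout = buffer              # everything printed now goes to the buffer
try:
    print("foo")
    print("hello world!")
finally:
    sys.stdout = saved_stdout    # always restore the real stream

captured = buffer.getvalue()     # the text that would have been printed
```

Restoring ``sys.stdout`` in a ``finally`` clause keeps later output visible even if the captured code raises an exception.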
How do I access a module written in Python from C?
--------------------------------------------------
You can get a pointer to the module object as follows::
module = PyImport_ImportModule("<modulename>");
If the module hasn't been imported yet (i.e. it is not yet present in
:data:`sys.modules`), this initializes the module; otherwise it simply returns
the value of ``sys.modules["<modulename>"]``. Note that it doesn't enter the
module into any namespace -- it only ensures it has been initialized and is
stored in :data:`sys.modules`.
You can then access the module's attributes (i.e. any name defined in the
module) as follows::
attr = PyObject_GetAttrString(module, "<attrname>");
Calling :cfunc:`PyObject_SetAttrString` to assign to variables in the module
also works.
How do I interface to C++ objects from Python?
----------------------------------------------
Depending on your requirements, there are many approaches. To do this manually,
begin by reading :ref:`the "Extending and Embedding" document
<extending-index>`. Realize that for the Python run-time system, there isn't a
whole lot of difference between C and C++ -- so the strategy of building a new
Python type around a C structure (pointer) type will also work for C++ objects.
For C++ libraries, you can look at `SIP
<http://www.riverbankcomputing.co.uk/sip/>`_, `CXX
<http://cxx.sourceforge.net/>`_, `Boost
<http://www.boost.org/libs/python/doc/index.html>`_, `Weave
<http://www.scipy.org/site_content/weave>`_ or `SWIG <http://www.swig.org>`_
I added a module using the Setup file and the make fails; why?
--------------------------------------------------------------
Setup must end in a newline; if there is no newline there, the build process
fails. (Fixing this requires some ugly shell script hackery, and this bug is so
minor that it doesn't seem worth the effort.)
How do I debug an extension?
----------------------------
When using GDB with dynamically loaded extensions, you can't set a breakpoint in
your extension until your extension is loaded.
In your ``.gdbinit`` file (or interactively), add the command::
br _PyImport_LoadDynamicModule
Then, when you run GDB::
$ gdb /local/bin/python
(gdb) run myscript.py
(gdb) continue # repeat until your extension is loaded
(gdb) finish   # so that your extension is loaded
(gdb) br myfunction.c:50
(gdb) continue
I want to compile a Python module on my Linux system, but some files are missing. Why?
--------------------------------------------------------------------------------------
Most packaged versions of Python don't include the
:file:`/usr/lib/python2.{x}/config/` directory, which contains various files
required for compiling Python extensions.
For Red Hat, install the python-devel RPM to get the necessary files.
For Debian, run ``apt-get install python-dev``.
What does "SystemError: _PyImport_FixupExtension: module yourmodule not loaded" mean?
-------------------------------------------------------------------------------------
This means that you have created an extension module named "yourmodule", but
your module init function does not initialize with that name.
Every module init function will have a line similar to::
module = Py_InitModule("yourmodule", yourmodule_functions);
If the string passed to this function is not the same name as your extension
module, the :exc:`SystemError` exception will be raised.
How do I tell "incomplete input" from "invalid input"?
------------------------------------------------------
Sometimes you want to emulate the Python interactive interpreter's behavior,
where it gives you a continuation prompt when the input is incomplete (e.g. you
typed the start of an "if" statement or you didn't close your parentheses or
triple string quotes), but it gives you a syntax error message immediately when
the input is invalid.
In Python you can use the :mod:`codeop` module, which approximates the parser's
behavior sufficiently. IDLE uses this, for example.
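A minimal sketch of the :mod:`codeop` approach: ``codeop.compile_command`` returns a code object for complete input, returns ``None`` when more input is needed, and raises an exception for invalid input. The wrapper name ``classify`` is hypothetical.

```python
import codeop


def classify(source):
    """Return 'complete', 'incomplete', or 'invalid' for a chunk of input."""
    try:
        code = codeop.compile_command(source, "<input>", "single")
    except (SyntaxError, ValueError, OverflowError):
        return "invalid"
    # compile_command() returns None when more input is needed.
    return "incomplete" if code is None else "complete"


print(classify("x = 1"))
print(classify("if x:"))
print(classify("x = = 1"))
```

An interactive front end would show the continuation prompt for ``"incomplete"``, report the error for ``"invalid"``, and execute the code object otherwise.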
The easiest way to do it in C is to call :cfunc:`PyRun_InteractiveLoop` (perhaps
in a separate thread) and let the Python interpreter handle the input for
you. You can also set the :cfunc:`PyOS_ReadlineFunctionPointer` to point at your
custom input function. See ``Modules/readline.c`` and ``Parser/myreadline.c``
for more hints.
However, sometimes you have to run the embedded Python interpreter in the same
thread as the rest of your application and you can't allow the
:cfunc:`PyRun_InteractiveLoop` to stop while waiting for user input. One
solution then is to call :cfunc:`PyParser_ParseString` and test for ``e.error``
equal to ``E_EOF``, which means the input is incomplete. Here's a sample code
fragment, untested, inspired by code from Alex Farber::
#include <Python.h>
#include <node.h>
#include <errcode.h>
#include <grammar.h>
#include <parsetok.h>
#include <compile.h>
int testcomplete(char *code)
  /* code should end in \n */
  /* return -1 for error, 0 for incomplete, 1 for complete */
{
  node *n;
  perrdetail e;
  n = PyParser_ParseString(code, &_PyParser_Grammar,
                           Py_file_input, &e);
  if (n == NULL) {
    if (e.error == E_EOF)
      return 0;
    return -1;
  }
  PyNode_Free(n);
  return 1;
}
Another solution is trying to compile the received string with
:cfunc:`Py_CompileString`. If it compiles without errors, try to execute the
returned code object by calling :cfunc:`PyEval_EvalCode`. Otherwise save the
input for later. If the compilation fails, find out whether it is a real error
or whether more input is required by extracting the message string from the
exception tuple and comparing it to the string "unexpected EOF while parsing". Here is a
complete example using the GNU readline library (you may want to ignore
**SIGINT** while calling readline())::
#include <stdio.h>
#include <readline.h>
#include <Python.h>
#include <object.h>
#include <compile.h>
#include <eval.h>
int main (int argc, char* argv[])
{
  int i, j, done = 0;                          /* lengths of line, code */
  char ps1[] = ">>> ";
  char ps2[] = "... ";
  char *prompt = ps1;
  char *msg, *line, *code = NULL;
  PyObject *src, *glb, *loc;
  PyObject *exc, *val, *trb, *obj, *dum;
  Py_Initialize ();
  loc = PyDict_New ();
  glb = PyDict_New ();
  PyDict_SetItemString (glb, "__builtins__", PyEval_GetBuiltins ());
  while (!done)
  {
    line = readline (prompt);
    if (NULL == line)                          /* CTRL-D pressed */
    {
      done = 1;
    }
    else
    {
      i = strlen (line);
      if (i > 0)
        add_history (line);                    /* save non-empty lines */
      if (NULL == code)                        /* nothing in code yet */
        j = 0;
      else
        j = strlen (code);
      code = realloc (code, i + j + 2);
      if (NULL == code)                        /* out of memory */
        exit (1);
      if (0 == j)                              /* code was empty, so */
        code[0] = '\0';                        /* keep strncat happy */
      strncat (code, line, i);                 /* append line to code */
      code[i + j] = '\n';                      /* append '\n' to code */
      code[i + j + 1] = '\0';
      src = Py_CompileString (code, "<stdin>", Py_single_input);
      if (NULL != src)                         /* compiled just fine - */
      {
        if (ps1 == prompt ||                   /* ">>> " or */
            '\n' == code[i + j - 1])           /* "... " and double '\n' */
        {                                      /* so execute it */
          dum = PyEval_EvalCode ((PyCodeObject *)src, glb, loc);
          Py_XDECREF (dum);
          Py_XDECREF (src);
          free (code);
          code = NULL;
          if (PyErr_Occurred ())
            PyErr_Print ();
          prompt = ps1;
        }
      }                                        /* syntax error or E_EOF? */
      else if (PyErr_ExceptionMatches (PyExc_SyntaxError))
      {
        PyErr_Fetch (&exc, &val, &trb);        /* clears exception! */
        if (PyArg_ParseTuple (val, "sO", &msg, &obj) &&
            !strcmp (msg, "unexpected EOF while parsing")) /* E_EOF */
        {
          Py_XDECREF (exc);
          Py_XDECREF (val);
          Py_XDECREF (trb);
          prompt = ps2;
        }
        else                                   /* some other syntax error */
        {
          PyErr_Restore (exc, val, trb);
          PyErr_Print ();
          free (code);
          code = NULL;
          prompt = ps1;
        }
      }
      else                                     /* some non-syntax error */
      {
        PyErr_Print ();
        free (code);
        code = NULL;
        prompt = ps1;
      }
      free (line);
    }
  }
  Py_XDECREF(glb);
  Py_XDECREF(loc);
  Py_Finalize();
  exit(0);
}
How do I find undefined g++ symbols __builtin_new or __pure_virtual?
--------------------------------------------------------------------
To dynamically load g++ extension modules, you must recompile Python, relink it
using g++ (change LINKCC in the python Modules Makefile), and link your
extension module using g++ (e.g., ``g++ -shared -o mymodule.so mymodule.o``).
Can I create an object class with some methods implemented in C and others in Python (e.g. through inheritance)?
----------------------------------------------------------------------------------------------------------------
Yes. Since Python 2.2, you can inherit from built-in classes such as :class:`int`,
:class:`list`, :class:`dict`, etc.
The Boost Python Library (BPL, http://www.boost.org/libs/python/doc/index.html)
provides a way of doing this from C++ (i.e. you can inherit from an extension
class written in C++ using the BPL).
When importing module X, why do I get "undefined symbol: PyUnicodeUCS2*"?
-------------------------------------------------------------------------
You are using a version of Python that uses a 4-byte representation for Unicode
characters, but some C extension module you are importing was compiled using a
Python that uses a 2-byte representation for Unicode characters (the default).
If instead the name of the undefined symbol starts with ``PyUnicodeUCS4``, the
problem is the reverse: Python was built using 2-byte Unicode characters, and
the extension module was compiled using a Python with 4-byte Unicode characters.
This can easily occur when using pre-built extension packages. Red Hat Linux
7.x, in particular, provided a "python2" binary that is compiled with 4-byte
Unicode. This only causes the link failure if the extension uses any of the
``PyUnicode_*()`` functions. It is also a problem if an extension uses any of
the Unicode-related format specifiers for :cfunc:`Py_BuildValue` (or similar) or
parameter specifications for :cfunc:`PyArg_ParseTuple`.
You can check the size of the Unicode character a Python interpreter is using by
checking the value of ``sys.maxunicode``:
>>> import sys
>>> if sys.maxunicode > 65535:
...     print 'UCS4 build'
... else:
...     print 'UCS2 build'
The only way to solve this problem is to use extension modules compiled with a
Python binary built using the same size for Unicode characters.

510
Doc/faq/general.rst Normal file

@ -0,0 +1,510 @@
:tocdepth: 2
==================
General Python FAQ
==================
.. contents::
General Information
===================
What is Python?
---------------
Python is an interpreted, interactive, object-oriented programming language. It
incorporates modules, exceptions, dynamic typing, very high level dynamic data
types, and classes. Python combines remarkable power with very clear syntax.
It has interfaces to many system calls and libraries, as well as to various
window systems, and is extensible in C or C++. It is also usable as an
extension language for applications that need a programmable interface.
Finally, Python is portable: it runs on many Unix variants, on the Mac, and on
PCs under MS-DOS, Windows, Windows NT, and OS/2.
To find out more, start with :ref:`tutorial-index`. The `Beginner's Guide to
Python <http://wiki.python.org/moin/BeginnersGuide>`_ links to other
introductory tutorials and resources for learning Python.
What is the Python Software Foundation?
---------------------------------------
The Python Software Foundation is an independent non-profit organization that
holds the copyright on Python versions 2.1 and newer. The PSF's mission is to
advance open source technology related to the Python programming language and to
publicize the use of Python. The PSF's home page is at
http://www.python.org/psf/.
Donations to the PSF are tax-exempt in the US. If you use Python and find it
helpful, please contribute via `the PSF donation page
<http://www.python.org/psf/donations/>`_.
Are there copyright restrictions on the use of Python?
------------------------------------------------------
You can do anything you want with the source, as long as you leave the
copyrights in and display those copyrights in any documentation about Python
that you produce. If you honor the copyright rules, it's OK to use Python for
commercial use, to sell copies of Python in source or binary form (modified or
unmodified), or to sell products that incorporate Python in some form. We would
still like to know about all commercial use of Python, of course.
See `the PSF license page <http://python.org/psf/license/>`_ to find further
explanations and a link to the full text of the license.
The Python logo is trademarked, and in certain cases permission is required to
use it. Consult `the Trademark Usage Policy
<http://www.python.org/psf/trademarks/>`__ for more information.
Why was Python created in the first place?
------------------------------------------
Here's a *very* brief summary of what started it all, written by Guido van
Rossum:
I had extensive experience with implementing an interpreted language in the
ABC group at CWI, and from working with this group I had learned a lot about
language design. This is the origin of many Python features, including the
use of indentation for statement grouping and the inclusion of
very-high-level data types (although the details are all different in
Python).
I had a number of gripes about the ABC language, but also liked many of its
features. It was impossible to extend the ABC language (or its
implementation) to remedy my complaints -- in fact its lack of extensibility
was one of its biggest problems. I had some experience with using Modula-2+
and talked with the designers of Modula-3 and read the Modula-3 report.
Modula-3 is the origin of the syntax and semantics used for exceptions, and
some other Python features.
I was working in the Amoeba distributed operating system group at CWI. We
needed a better way to do system administration than by writing either C
programs or Bourne shell scripts, since Amoeba had its own system call
interface which wasn't easily accessible from the Bourne shell. My
experience with error handling in Amoeba made me acutely aware of the
importance of exceptions as a programming language feature.
It occurred to me that a scripting language with a syntax like ABC but with
access to the Amoeba system calls would fill the need. I realized that it
would be foolish to write an Amoeba-specific language, so I decided that I
needed a language that was generally extensible.
During the 1989 Christmas holidays, I had a lot of time on my hand, so I
decided to give it a try. During the next year, while still mostly working
on it in my own time, Python was used in the Amoeba project with increasing
success, and the feedback from colleagues made me add many early
improvements.
In February 1991, after just over a year of development, I decided to post to
USENET. The rest is in the ``Misc/HISTORY`` file.
What is Python good for?
------------------------
Python is a high-level general-purpose programming language that can be applied
to many different classes of problems.
The language comes with a large standard library that covers areas such as
string processing (regular expressions, Unicode, calculating differences between
files), Internet protocols (HTTP, FTP, SMTP, XML-RPC, POP, IMAP, CGI
programming), software engineering (unit testing, logging, profiling, parsing
Python code), and operating system interfaces (system calls, filesystems, TCP/IP
sockets). Look at the table of contents for :ref:`library-index` to get an idea
of what's available. A wide variety of third-party extensions are also
available. Consult `the Python Package Index <http://pypi.python.org/pypi>`_ to
find packages of interest to you.
How does the Python version numbering scheme work?
--------------------------------------------------
Python versions are numbered A.B.C or A.B. A is the major version number -- it
is only incremented for really major changes in the language. B is the minor
version number, incremented for less earth-shattering changes. C is the
micro-level -- it is incremented for each bugfix release. See :pep:`6` for more
information about bugfix releases.
Not all releases are bugfix releases. In the run-up to a new major release, a
series of development releases are made, denoted as alpha, beta, or release
candidate. Alphas are early releases in which interfaces aren't yet finalized;
it's not unexpected to see an interface change between two alpha releases.
Betas are more stable, preserving existing interfaces but possibly adding new
modules, and release candidates are frozen, making no changes except as needed
to fix critical bugs.
Alpha, beta and release candidate versions have an additional suffix. The
suffix for an alpha version is "aN" for some small number N, the suffix for a
beta version is "bN" for some small number N, and the suffix for a release
candidate version is "cN" for some small number N. In other words, all versions
labeled 2.0aN precede the versions labeled 2.0bN, which precede versions labeled
2.0cN, and *those* precede 2.0.
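The ordering described above can be expressed as a sort key. This is only a sketch: ``version_key`` and ``_STAGE_RANK`` are hypothetical names, and the pattern covers just the plain ``A.B``, ``A.B.C`` and ``A.BaN``-style forms, not the "+" suffix.

```python
import re

# Rank of each pre-release tag; a final release sorts after all of them.
_STAGE_RANK = {"a": 0, "b": 1, "c": 2, "": 3}


def version_key(version):
    """Turn 'A.B', 'A.B.C', or 'A.BaN'-style strings into a sortable tuple."""
    match = re.match(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:([abc])(\d+))?$", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    major, minor, micro, stage, serial = match.groups()
    return (int(major), int(minor), int(micro or 0),
            _STAGE_RANK[stage or ""], int(serial or 0))


releases = ["2.0", "2.0c1", "2.0a2", "2.0.1", "2.0b1", "2.0a1"]
ordered = sorted(releases, key=version_key)
print(ordered)
```

Sorting by this key places alphas before betas, betas before release candidates, and all of them before the final release and its bugfix releases.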
You may also find version numbers with a "+" suffix, e.g. "2.2+". These are
unreleased versions, built directly from the Subversion trunk. In practice,
after a final minor release is made, the Subversion trunk is incremented to the
next minor version, which becomes the "a0" version,
e.g. "2.4a0".
See also the documentation for ``sys.version``, ``sys.hexversion``, and
``sys.version_info``.
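As a brief sketch, the running interpreter's version can be inspected like this. ``sys.hexversion`` packs the same fields as ``sys.version_info`` into a single integer, with the major, minor and micro numbers in the three highest bytes.

```python
import sys

major, minor, micro = sys.version_info[:3]
release_level = sys.version_info[3]   # 'alpha', 'beta', 'candidate' or 'final'

# Unpack the same three numbers from the single-integer form.
packed_major = (sys.hexversion >> 24) & 0xFF
packed_minor = (sys.hexversion >> 16) & 0xFF
packed_micro = (sys.hexversion >> 8) & 0xFF

print("%d.%d.%d (%s)" % (major, minor, micro, release_level))
```

Because of this packing, two ``sys.hexversion`` values can be compared as plain integers to decide which interpreter is newer.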
How do I obtain a copy of the Python source?
--------------------------------------------
The latest Python source distribution is always available from python.org, at
http://www.python.org/download/. The latest development sources can be obtained
via anonymous Subversion at http://svn.python.org/projects/python/trunk.
The source distribution is a gzipped tar file containing the complete C source,
Sphinx-formatted documentation, Python library modules, example programs, and
several useful pieces of freely distributable software. The source will compile
and run out of the box on most UNIX platforms.
Consult the `Developer FAQ
<http://www.python.org/dev/devfaq.html#subversion-svn>`__ for more information
on getting the source code and compiling it.
How do I get documentation on Python?
-------------------------------------
.. XXX mention py3k
The standard documentation for the current stable version of Python is available
at http://docs.python.org/. PDF, plain text, and downloadable HTML versions are
also available at http://docs.python.org/download/.
The documentation is written in reStructuredText and processed by `the Sphinx
documentation tool <http://sphinx.pocoo.org/>`__. The reStructuredText source
for the documentation is part of the Python source distribution.
I've never programmed before. Is there a Python tutorial?
---------------------------------------------------------
There are numerous tutorials and books available. The standard documentation
includes :ref:`tutorial-index`.
Consult `the Beginner's Guide <http://wiki.python.org/moin/BeginnersGuide>`_ to
find information for beginning Python programmers, including lists of tutorials.
Is there a newsgroup or mailing list devoted to Python?
-------------------------------------------------------
There is a newsgroup, :newsgroup:`comp.lang.python`, and a mailing list,
`python-list <http://mail.python.org/mailman/listinfo/python-list>`_. The
newsgroup and mailing list are gatewayed into each other -- if you can read news
it's unnecessary to subscribe to the mailing list.
:newsgroup:`comp.lang.python` is high-traffic, receiving hundreds of postings
every day, and Usenet readers are often more able to cope with this volume.
Announcements of new software releases and events can be found in
comp.lang.python.announce, a low-traffic moderated list that receives about five
postings per day. It's available as `the python-announce mailing list
<http://mail.python.org/mailman/listinfo/python-announce-list>`_.
More info about other mailing lists and newsgroups
can be found at http://www.python.org/community/lists/.
How do I get a beta test version of Python?
-------------------------------------------
Alpha and beta releases are available from http://www.python.org/download/. All
releases are announced on the comp.lang.python and comp.lang.python.announce
newsgroups and on the Python home page at http://www.python.org/; an RSS feed of
news is available.
You can also access the development version of Python through Subversion. See
http://www.python.org/dev/devfaq.html#subversion-svn for details.
How do I submit bug reports and patches for Python?
---------------------------------------------------
To report a bug or submit a patch, please use the Roundup installation at
http://bugs.python.org/.
You must have a Roundup account to report bugs; this makes it possible for us to
contact you if we have follow-up questions. It will also enable Roundup to send
you updates as we act on your bug. If you had previously used SourceForge to
report bugs to Python, you can obtain your Roundup password through Roundup's
`password reset procedure <http://bugs.python.org/user?@template=forgotten>`_.
.. XXX adapt link to dev guide
For more information on how Python is developed, consult `the Python Developer's
Guide <http://python.org/dev/>`_.
Are there any published articles about Python that I can reference?
-------------------------------------------------------------------
It's probably best to cite your favorite book about Python.
The very first article about Python was written in 1991 and is now quite
outdated.
Guido van Rossum and Jelke de Boer, "Interactively Testing Remote Servers
Using the Python Programming Language", CWI Quarterly, Volume 4, Issue 4
(December 1991), Amsterdam, pp 283-303.
Are there any books on Python?
------------------------------
Yes, there are many, and more are being published. See the python.org wiki at
http://wiki.python.org/moin/PythonBooks for a list.
You can also search online bookstores for "Python" and filter out the Monty
Python references; or perhaps search for "Python" and "language".
Where in the world is www.python.org located?
---------------------------------------------
It's currently in Amsterdam, graciously hosted by `XS4ALL
<http://www.xs4all.nl>`_. Thanks to Thomas Wouters for his work in arranging
python.org's hosting.
Why is it called Python?
------------------------
When he began implementing Python, Guido van Rossum was also reading the
published scripts from `"Monty Python's Flying Circus"
<http://pythonline.com/>`__, a BBC comedy series from the 1970s. Van Rossum
thought he needed a name that was short, unique, and slightly mysterious, so he
decided to call the language Python.
Do I have to like "Monty Python's Flying Circus"?
-------------------------------------------------
No, but it helps. :)
Python in the real world
========================
How stable is Python?
---------------------
Very stable. New, stable releases have been coming out roughly every 6 to 18
months since 1991, and this seems likely to continue. Currently there are
usually around 18 months between major releases.
The developers issue "bugfix" releases of older versions, so the stability of
existing releases gradually improves. Bugfix releases, indicated by a third
component of the version number (e.g. 2.5.3, 2.6.2), are managed for stability;
only fixes for known problems are included in a bugfix release, and it's
guaranteed that interfaces will remain the same throughout a series of bugfix
releases.
.. XXX this gets out of date pretty often
The `2.6.4 release <http://python.org/download/>`_ is the recommended
production-ready version at this point in time. Python 3.1 is also considered
production-ready, but may be less useful, since currently there is more third
party software available for Python 2 than for Python 3. Python 2 code will
generally not run unchanged in Python 3.
How many people are using Python?
---------------------------------
There are probably tens of thousands of users, though it's difficult to obtain
an exact count.
Python is available for free download, so there are no sales figures, and it's
available from many different sites and packaged with many Linux distributions,
so download statistics don't tell the whole story either.
The comp.lang.python newsgroup is very active, but not all Python users post to
the group or even read it.
Have any significant projects been done in Python?
--------------------------------------------------
See http://python.org/about/success for a list of projects that use Python.
Consulting the proceedings for `past Python conferences
<http://python.org/community/workshops/>`_ will reveal contributions from many
different companies and organizations.
High-profile Python projects include `the Mailman mailing list manager
<http://www.list.org>`_ and `the Zope application server
<http://www.zope.org>`_. Several Linux distributions, most notably `Red Hat
<http://www.redhat.com>`_, have written part or all of their installer and
system administration software in Python. Companies that use Python internally
include Google, Yahoo, and Lucasfilm Ltd.
What new developments are expected for Python in the future?
------------------------------------------------------------
See http://www.python.org/dev/peps/ for the Python Enhancement Proposals
(PEPs). PEPs are design documents describing a suggested new feature for Python,
providing a concise technical specification and a rationale. Look for a PEP
titled "Python X.Y Release Schedule", where X.Y is a version that hasn't been
publicly released yet.
New development is discussed on `the python-dev mailing list
<http://mail.python.org/mailman/listinfo/python-dev/>`_.
Is it reasonable to propose incompatible changes to Python?
-----------------------------------------------------------
In general, no. There are already millions of lines of Python code around the
world, so any change in the language that invalidates more than a very small
fraction of existing programs has to be frowned upon. Even if you can provide a
conversion program, there's still the problem of updating all documentation;
many books have been written about Python, and we don't want to invalidate them
all at a single stroke.
Providing a gradual upgrade path is necessary if a feature has to be changed.
:pep:`5` describes the procedure followed for introducing backward-incompatible
changes while minimizing disruption for users.
Is Python Y2K (Year 2000) Compliant?
------------------------------------
.. remove this question?
As of August, 2003 no major problems have been reported and Y2K compliance seems
to be a non-issue.
Python does very few date calculations and for those it does perform relies on
the C library functions. Python generally represents times either as seconds
since 1970 or as a ``(year, month, day, ...)`` tuple where the year is expressed
with four digits, which makes Y2K bugs unlikely. So as long as your C library
is okay, Python should be okay. Of course, it's possible that a particular
application written in Python makes assumptions about 2-digit years.
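As a small illustration of the two time representations mentioned above, the sketch below converts the start of the year 2000 between seconds since 1970 and a tuple with a four-digit year. It uses only the standard :mod:`time` and :mod:`calendar` modules; the variable names are arbitrary.

```python
import calendar
import time

# Seconds-since-1970 representation of midnight UTC, 1 January 2000.
y2k_seconds = calendar.timegm((2000, 1, 1, 0, 0, 0))

# The tuple representation carries a full four-digit year.
y2k_tuple = time.gmtime(y2k_seconds)

print(y2k_seconds)
print(tuple(y2k_tuple)[:6])
```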
Because Python is available free of charge, there are no absolute guarantees.
If there *are* unforeseen problems, liability is the user's problem rather than
the developers', and there is nobody you can sue for damages. The Python
copyright notice contains the following disclaimer:
4. PSF is making Python 2.3 available to Licensee on an "AS IS"
basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY
WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND DISCLAIMS ANY
REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR
PURPOSE OR THAT THE USE OF PYTHON 2.3 WILL NOT INFRINGE ANY THIRD PARTY
RIGHTS.
5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON
2.3 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS
A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 2.3,
OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
The good news is that *if* you encounter a problem, you have full source
available to track it down and fix it. This is one advantage of an open source
programming environment.
Is Python a good language for beginning programmers?
----------------------------------------------------
Yes.
It is still common to start students with a procedural and statically typed
language such as Pascal, C, or a subset of C++ or Java. Students may be better
served by learning Python as their first language. Python has a very simple and
consistent syntax and a large standard library and, most importantly, using
Python in a beginning programming course lets students concentrate on important
programming skills such as problem decomposition and data type design. With
Python, students can be quickly introduced to basic concepts such as loops and
procedures. They can probably even work with user-defined objects in their very
first course.
For a student who has never programmed before, using a statically typed language
seems unnatural. It presents additional complexity that the student must master
and slows the pace of the course. The students are trying to learn to think
like a computer, decompose problems, design consistent interfaces, and
encapsulate data. While learning to use a statically typed language is
important in the long term, it is not necessarily the best topic to address in
the students' first programming course.
Many other aspects of Python make it a good first language. Like Java, Python
has a large standard library so that students can be assigned programming
projects very early in the course that *do* something. Assignments aren't
restricted to the standard four-function calculator and check balancing
programs. By using the standard library, students can gain the satisfaction of
working on realistic applications as they learn the fundamentals of programming.
Using the standard library also teaches students about code reuse. Third-party
modules such as PyGame are also helpful in extending the students' reach.
Python's interactive interpreter enables students to test language features
while they're programming. They can keep a window with the interpreter running
while they enter their program's source in another window. If they can't
remember the methods for a list, they can do something like this::
>>> L = []
>>> dir(L)
['append', 'count', 'extend', 'index', 'insert', 'pop', 'remove',
'reverse', 'sort']
>>> help(L.append)
Help on built-in function append:
append(...)
L.append(object) -- append object to end
>>> L.append(1)
>>> L
[1]
With the interpreter, documentation is never far from the student as he's
programming.
There are also good IDEs for Python. IDLE is a cross-platform IDE for Python
that is written in Python using Tkinter. PythonWin is a Windows-specific IDE.
Emacs users will be happy to know that there is a very good Python mode for
Emacs. All of these programming environments provide syntax highlighting,
auto-indenting, and access to the interactive interpreter while coding. Consult
http://www.python.org/editors/ for a full list of Python editing environments.
If you want to discuss Python's use in education, you may be interested in
joining `the edu-sig mailing list
<http://python.org/community/sigs/current/edu-sig>`_.
Upgrading Python
================
What is this bsddb185 module my application keeps complaining about?
--------------------------------------------------------------------
.. XXX remove this question?
Starting with Python 2.3, the distribution includes the `PyBSDDB package
<http://pybsddb.sf.net/>`_ as a replacement for the old bsddb module. It
includes functions which provide backward compatibility at the API level, but
requires a newer version of the underlying `Berkeley DB
<http://www.sleepycat.com>`_ library. Files created with the older bsddb module
can't be opened directly using the new module.
Using your old version of Python and a pair of scripts which are part of Python
2.3 (db2pickle.py and pickle2db.py, in the Tools/scripts directory) you can
convert your old database files to the new format. Using your old Python
version, run the db2pickle.py script to convert it to a pickle, e.g.::
python2.2 <pathto>/db2pickle.py database.db database.pck
Rename your database file::
mv database.db olddatabase.db
Now convert the pickle file to a new format database::
python <pathto>/pickle2db.py database.db database.pck
The precise commands you use will vary depending on the particulars of your
installation. For full details about operation of these two scripts check the
doc string at the start of each one.

160
Doc/faq/gui.rst Normal file

@ -0,0 +1,160 @@
:tocdepth: 2
==========================
Graphic User Interface FAQ
==========================
.. contents::
General GUI Questions
=====================
What platform-independent GUI toolkits exist for Python?
--------------------------------------------------------
Depending on what platform(s) you are aiming at, there are several.
.. XXX check links
Tkinter
'''''''
Standard builds of Python include an object-oriented interface to the Tcl/Tk
widget set, called Tkinter. This is probably the easiest to install and use.
For more info about Tk, including pointers to the source, see the Tcl/Tk home
page at http://www.tcl.tk. Tcl/Tk is fully portable to the MacOS, Windows, and
Unix platforms.
wxWidgets
'''''''''
wxWidgets (formerly known as wxWindows) is a portable GUI class library written
in C++ that provides a common interface to various platform-specific libraries;
wxPython is a Python interface to wxWidgets. wxWidgets supports Windows and
MacOS; on Unix variants, it supports both GTK+ and Motif toolkits. wxWidgets
preserves the look and feel of the underlying graphics toolkit, and there is
quite a rich widget set and collection of GDI classes. See `the wxWidgets page
<http://www.wxwindows.org>`_ for more details.
`wxPython <http://www.wxpython.org>`_ is an extension module that wraps many of
the wxWidgets C++ classes, and is quickly gaining popularity amongst Python
developers. You can get wxPython as part of the source or CVS distribution of
wxWidgets, or directly from its home page.
Qt
'''
There are bindings available for the Qt toolkit (`PyQt
<http://www.riverbankcomputing.co.uk/pyqt/>`_) and for KDE (PyKDE). If you're
writing open source software, you don't need to pay for PyQt, but if you want to
write proprietary applications, you must buy a PyQt license from `Riverbank
Computing <http://www.riverbankcomputing.co.uk>`_ and a Qt license from
`Trolltech <http://www.trolltech.com>`_.
Gtk+
''''
PyGtk bindings for the `Gtk+ toolkit <http://www.gtk.org>`_ have been
implemented by James Henstridge; see ftp://ftp.gtk.org/pub/gtk/python/.
FLTK
''''
Python bindings for `the FLTK toolkit <http://www.fltk.org>`_, a simple yet
powerful and mature cross-platform windowing system, are available from `the
PyFLTK project <http://pyfltk.sourceforge.net>`_.
FOX
'''
A wrapper for `the FOX toolkit <http://www.fox-toolkit.org/>`_ called `FXpy
<http://fxpy.sourceforge.net/>`_ is available. FOX supports both Unix variants
and Windows.
OpenGL
''''''
For OpenGL bindings, see `PyOpenGL <http://pyopengl.sourceforge.net>`_.
What platform-specific GUI toolkits exist for Python?
-----------------------------------------------------
`The Mac port <http://python.org/download/mac>`_ by Jack Jansen has a rich and
ever-growing set of modules that support the native Mac toolbox calls. The port
includes support for MacOS9 and MacOS X's Carbon libraries. By installing the
`PyObjc Objective-C bridge <http://pyobjc.sourceforge.net>`_, Python programs
can use MacOS X's Cocoa libraries. See the documentation that comes with the Mac
port.
:ref:`Pythonwin <windows-faq>` by Mark Hammond includes an interface to the
Microsoft Foundation Classes and a Python programming environment using it
that's written mostly in Python.
Tkinter questions
=================
How do I freeze Tkinter applications?
-------------------------------------
Freeze is a tool to create stand-alone applications. When freezing Tkinter
applications, the applications will not be truly stand-alone, as the application
will still need the Tcl and Tk libraries.
One solution is to ship the application with the tcl and tk libraries, and point
to them at run-time using the :envvar:`TCL_LIBRARY` and :envvar:`TK_LIBRARY`
environment variables.
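One way to set these variables from inside the program is sketched here. The directory layout (an application directory with ``tcl`` and ``tk`` subdirectories) and the helper name ``tcl_tk_environment`` are assumptions made for illustration; adjust them to wherever the libraries are actually shipped.

```python
import os

def tcl_tk_environment(app_dir, tcl_subdir="tcl", tk_subdir="tk"):
    """Return environment settings pointing Tcl/Tk at bundled libraries."""
    return {
        "TCL_LIBRARY": os.path.join(app_dir, tcl_subdir),
        "TK_LIBRARY": os.path.join(app_dir, tk_subdir),
    }

# Hypothetical install location of the frozen application.
settings = tcl_tk_environment(os.path.join(os.sep, "opt", "myapp"))
print(settings)

# To take effect, the variables must be set before Tkinter is imported:
# os.environ.update(settings)
```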
To get truly stand-alone applications, the Tcl scripts that form the library
have to be integrated into the application as well. One tool supporting that is
SAM (stand-alone modules), which is part of the Tix distribution
(http://tix.mne.com). Build Tix with SAM enabled, perform the appropriate call
to Tclsam_init etc inside Python's Modules/tkappinit.c, and link with libtclsam
and libtksam (you might include the Tix libraries as well).
Can I have Tk events handled while waiting for I/O?
---------------------------------------------------
Yes, and you don't even need threads! But you'll have to restructure your I/O
code a bit. Tk has the equivalent of Xt's XtAddInput() call, which allows you
to register a callback function which will be called from the Tk mainloop when
I/O is possible on a file descriptor. Here's what you need::
from Tkinter import tkinter
tkinter.createfilehandler(file, mask, callback)
The file may be a Python file or socket object (actually, anything with a
fileno() method), or an integer file descriptor. The mask is one of the
constants tkinter.READABLE or tkinter.WRITABLE. The callback is called as
follows::
callback(file, mask)
You must unregister the callback when you're done, using ::
tkinter.deletefilehandler(file)
Note: since you don't know *how many bytes* are available for reading, you can't
use the Python file object's read or readline methods, since these will insist
on reading a predefined number of bytes. For sockets, the :meth:`recv` or
:meth:`recvfrom` methods will work fine; for other files, use
``os.read(file.fileno(), maxbytecount)``.
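The ``os.read`` approach can be tried without a GUI. In this sketch a pipe stands in for the file being watched, and ``read_available`` is a hypothetical helper of the kind a file-handler callback might call; it returns whatever is currently available, up to ``maxbytecount`` bytes, rather than waiting for a fixed amount.

```python
import os

def read_available(file, maxbytecount=4096):
    """Read whatever is ready on a file object or integer descriptor."""
    fd = file if isinstance(file, int) else file.fileno()
    return os.read(fd, maxbytecount)

# A pipe plays the role of the descriptor registered with the file handler.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"hello")

data = read_available(read_fd)   # returns the 5 bytes without waiting for 4096
print(data)

os.close(read_fd)
os.close(write_fd)
```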
I can't get key bindings to work in Tkinter: why?
-------------------------------------------------
An often-heard complaint is that event handlers bound to events with the
:meth:`bind` method don't get handled even when the appropriate key is pressed.
The most common cause is that the widget to which the binding applies doesn't
have "keyboard focus". Check out the Tk documentation for the focus command.
Usually a widget is given the keyboard focus by clicking in it (but not for
labels; see the takefocus option).

18
Doc/faq/index.rst Normal file

@ -0,0 +1,18 @@
###################################
Python Frequently Asked Questions
###################################
:Release: |version|
:Date: |today|
.. toctree::
:maxdepth: 1
general.rst
programming.rst
design.rst
library.rst
extending.rst
windows.rst
gui.rst
installed.rst

53
Doc/faq/installed.rst Normal file

@ -0,0 +1,53 @@
=============================================
"Why is Python Installed on my Computer?" FAQ
=============================================
What is Python?
---------------
Python is a programming language. It's used for many different applications.
It's used in some high schools and colleges as an introductory programming
language because Python is easy to learn, but it's also used by professional
software developers at places such as Google, NASA, and Lucasfilm Ltd.
If you wish to learn more about Python, start with the `Beginner's Guide to
Python <http://wiki.python.org/moin/BeginnersGuide>`_.
Why is Python installed on my machine?
--------------------------------------
If you find Python installed on your system but don't remember installing it,
there are several possible ways it could have gotten there.
* Perhaps another user on the computer wanted to learn programming and installed
it; you'll have to figure out who's been using the machine and might have
installed it.
* A third-party application installed on the machine might have been written in
Python and included a Python installation. For a home computer, the most
common such application is `PySol <http://pysolfc.sourceforge.net/>`_, a
solitaire game that includes over 1000 different games and variations.
* Some Windows machines also have Python installed. At this writing we're aware
of computers from Hewlett-Packard and Compaq that include Python. Apparently
some of HP/Compaq's administrative tools are written in Python.
* All Apple computers running Mac OS X have Python installed; it's included in
the base installation.
Can I delete Python?
--------------------
That depends on where Python came from.
If someone installed it deliberately, you can remove it without hurting
anything. On Windows, use the Add/Remove Programs icon in the Control Panel.
If Python was installed by a third-party application, you can also remove it,
but that application will no longer work. You should use that application's
uninstaller rather than removing Python directly.
If Python came with your operating system, removing it is not recommended. If
you remove it, whatever tools were written in Python will no longer run, and
some of them might be important to you. Reinstalling the whole system would
then be required to fix things again.

880
Doc/faq/library.rst Normal file

@ -0,0 +1,880 @@
:tocdepth: 2
=========================
Library and Extension FAQ
=========================
.. contents::
General Library Questions
=========================
How do I find a module or application to perform task X?
--------------------------------------------------------
Check :ref:`the Library Reference <library-index>` to see if there's a relevant
standard library module. (Eventually you'll learn what's in the standard
library and will be able to skip this step.)
Search the `Python Package Index <http://pypi.python.org/pypi>`_.
Next, check the `Vaults of Parnassus <http://www.vex.net/parnassus/>`_, an older
index of packages.
Finally, try `Google <http://www.google.com>`_ or another Web search engine.
Searching for "Python" plus a keyword or two for your topic of interest will
usually find something helpful.
Where is the math.py (socket.py, regex.py, etc.) source file?
-------------------------------------------------------------
If you can't find a source file for a module it may be a builtin or dynamically
loaded module implemented in C, C++ or other compiled language. In this case
you may not have the source file or it may be something like mathmodule.c,
somewhere in a C source directory (not on the Python Path).
There are (at least) three kinds of modules in Python:
1) modules written in Python (.py);
2) modules written in C and dynamically loaded (.dll, .pyd, .so, .sl, etc);
3) modules written in C and linked with the interpreter; to get a list of these,
type::
import sys
print sys.builtin_module_names
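To find out which of the three kinds a particular module is, a rough check along the following lines may help. ``module_kind`` is a hypothetical helper, and the file-extension test is a simplification; the example is written for current Python 3.

```python
import sys
import textwrap

def module_kind(module):
    """Roughly classify a module into the three kinds listed above."""
    if module.__name__ in sys.builtin_module_names:
        return "linked with the interpreter"
    path = getattr(module, "__file__", None) or ""
    if path.endswith((".py", ".pyc", ".pyo")):
        return "written in Python"
    # Anything else is assumed to be an extension module (.so, .pyd, ...).
    return "dynamically loaded or other"

print(module_kind(sys))
print(module_kind(textwrap))
```

For modules written in Python, the ``__file__`` attribute also tells you where the source file lives.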
How do I make a Python script executable on Unix?
-------------------------------------------------
You need to do two things: the script file's mode must be executable and the
first line must begin with ``#!`` followed by the path of the Python
interpreter.
The first is done by executing ``chmod +x scriptfile`` or perhaps ``chmod 755
scriptfile``.
The second can be done in a number of ways. The most straightforward way is to
write ::
#!/usr/local/bin/python
as the very first line of your file, using the pathname for where the Python
interpreter is installed on your platform.
If you would like the script to be independent of where the Python interpreter
lives, you can use the "env" program. Almost all Unix variants support the
following, assuming the python interpreter is in a directory on the user's
$PATH::
#!/usr/bin/env python
*Don't* do this for CGI scripts. The $PATH variable for CGI scripts is often
very minimal, so you need to use the actual absolute pathname of the
interpreter.
Occasionally, a user's environment is so full that the /usr/bin/env program
fails; or there's no env program at all. In that case, you can try the
following hack (due to Alex Rezinsky)::
#! /bin/sh
""":"
exec python $0 ${1+"$@"}
"""
The minor disadvantage is that this defines the script's __doc__ string.
However, you can fix that by adding ::
__doc__ = """...Whatever..."""
Is there a curses/termcap package for Python?
---------------------------------------------
.. XXX curses *is* built by default, isn't it?
For Unix variants: The standard Python source distribution comes with a curses
module in the ``Modules/`` subdirectory, though it's not compiled by default
(note that this is not available in the Windows distribution -- there is no
curses module for Windows).
The curses module supports basic curses features as well as many additional
functions from ncurses and SYSV curses such as colour, alternative character set
support, pads, and mouse support. This means the module isn't compatible with
operating systems that only have BSD curses, but there don't seem to be any
currently maintained OSes that fall into this category.
For Windows: use `the consolelib module
<http://effbot.org/zone/console-index.htm>`_.
Is there an equivalent to C's onexit() in Python?
-------------------------------------------------
The :mod:`atexit` module provides a register function that is similar to C's
onexit.
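A minimal sketch of registering an exit handler follows; the function name ``goodbye`` is arbitrary.

```python
import atexit

def goodbye():
    # Runs when the interpreter exits normally.
    print("Cleaning up before exit")

# register() returns the function it was given, so it can also be
# used as a decorator.
registered = atexit.register(goodbye)
```

Handlers registered this way are not run if the process is killed by a signal that Python does not handle, or if ``os._exit()`` is called.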
Why don't my signal handlers work?
----------------------------------
The most common problem is that the signal handler is declared with the wrong
argument list. It is called as ::
handler(signum, frame)
so it should be declared with two arguments::

   def handler(signum, frame):
       ...

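For a complete round trip, the sketch below installs a two-argument handler, sends the process a signal, and then restores the previous handler. It assumes a Unix-like system (``SIGUSR1`` does not exist on Windows) and must run in the main thread; the ``received`` list is only there to make the effect visible.

```python
import os
import signal

received = []

def handler(signum, frame):
    # Called with the signal number and the interrupted stack frame.
    received.append(signum)

# SIGUSR1 is only available on Unix-like systems.
previous = signal.signal(signal.SIGUSR1, handler)
os.kill(os.getpid(), signal.SIGUSR1)      # send the signal to ourselves
signal.signal(signal.SIGUSR1, previous)   # restore the old handler
print(received)
```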
Common tasks
============
How do I test a Python program or component?
--------------------------------------------
Python comes with two testing frameworks. The :mod:`doctest` module finds
examples in the docstrings for a module and runs them, comparing the output with
the expected output given in the docstring.
The :mod:`unittest` module is a fancier testing framework modelled on Java and
Smalltalk testing frameworks.
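Here is a small :mod:`doctest` sketch. The function ``average`` is made up for the example, and the finder and runner are driven explicitly so the outcome can be inspected; in a real module, calling ``doctest.testmod()`` is the more common entry point.

```python
import doctest

def average(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> average([1, 2, 3])
    2.0
    >>> average([10])
    10.0
    """
    return sum(values) / float(len(values))

# Collect the examples from the docstring and run them explicitly.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(average, "average", globs={"average": average}):
    runner.run(test)
results = runner.summarize(verbose=False)
print(results)
```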
For testing, it helps to write the program so that it may be easily tested by
using good modular design. Your program should have almost all functionality
encapsulated in either functions or class methods -- and this sometimes has the
surprising and delightful effect of making the program run faster (because local
variable accesses are faster than global accesses). Furthermore the program
should avoid depending on mutating global variables, since this makes testing
much more difficult to do.
The "global main logic" of your program may be as simple as ::

   if __name__ == "__main__":
       main_logic()

at the bottom of the main module of your program.
Once your program is organized as a tractable collection of functions and class
behaviours you should write test functions that exercise the behaviours. A test
suite can be associated with each module which automates a sequence of tests.
This sounds like a lot of work, but since Python is so terse and flexible it's
surprisingly easy. You can make coding much more pleasant and fun by writing
your test functions in parallel with the "production code", since this makes it
easy to find bugs and even design flaws earlier.
"Support modules" that are not intended to be the main module of a program may
include a self-test of the module. ::

   if __name__ == "__main__":
       self_test()

Even programs that interact with complex external interfaces may be tested when
the external interfaces are unavailable by using "fake" interfaces implemented
in Python.
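The "fake interface" idea can be sketched with :mod:`unittest` as follows. All names here (``FakeMailer``, ``notify_all``, ``NotifyAllTest``) are invented for illustration; the point is only that the code under test depends on a ``send`` method, so a recording stand-in can replace the real external service.

```python
import unittest

class FakeMailer(object):
    """Stand-in for a real mail gateway; records messages instead of sending."""
    def __init__(self):
        self.sent = []

    def send(self, recipient, body):
        self.sent.append((recipient, body))

def notify_all(mailer, recipients, body):
    """Production-style function that only depends on a ``send`` method."""
    for recipient in recipients:
        mailer.send(recipient, body)
    return len(recipients)

class NotifyAllTest(unittest.TestCase):
    def test_sends_one_message_per_recipient(self):
        fake = FakeMailer()
        count = notify_all(fake, ["ann", "bob"], "hello")
        self.assertEqual(count, 2)
        self.assertEqual(fake.sent, [("ann", "hello"), ("bob", "hello")])

# Build and run the suite explicitly so the result object can be inspected.
suite = unittest.TestLoader().loadTestsFromTestCase(NotifyAllTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```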
How do I create documentation from doc strings?
-----------------------------------------------
.. XXX mention Sphinx/epydoc
The :mod:`pydoc` module can create HTML from the doc strings in your Python
source code. An alternative is `pythondoc
<http://starship.python.net/crew/danilo/pythondoc/>`_.
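As a quick way to see what :mod:`pydoc` extracts from a doc string, the sketch below renders the documentation of a made-up function as plain text. It assumes a current Python 3, where ``pydoc.render_doc`` accepts a ``renderer`` argument; for HTML output, running ``python -m pydoc -w <module>`` writes an HTML file for the named module.

```python
import pydoc

def area(width, height):
    """Return the area of a rectangle with the given sides."""
    return width * height

# render_doc() builds the same text that help() would display;
# pydoc.plaintext avoids terminal bold/overstrike formatting.
text = pydoc.render_doc(area, renderer=pydoc.plaintext)
print(text)
```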
How do I get a single keypress at a time?
-----------------------------------------
For Unix variants: There are several solutions. It's straightforward to do this
using curses, but curses is a fairly large module to learn. Here's a solution
without curses::

   import termios, fcntl, sys, os

   fd = sys.stdin.fileno()

   # Save the terminal settings so they can be restored on exit.
   oldterm = termios.tcgetattr(fd)
   newattr = termios.tcgetattr(fd)
   # Index 3 holds the local modes: disable canonical mode and echo.
   newattr[3] = newattr[3] & ~termios.ICANON & ~termios.ECHO
   termios.tcsetattr(fd, termios.TCSANOW, newattr)

   # Switch stdin to non-blocking mode.
   oldflags = fcntl.fcntl(fd, fcntl.F_GETFL)
   fcntl.fcntl(fd, fcntl.F_SETFL, oldflags | os.O_NONBLOCK)

   try:
       while 1:
           try:
               c = sys.stdin.read(1)
               print "Got character", repr(c)
           except IOError:
               pass
   finally:
       # Restore the original terminal settings and file flags.
       termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm)
       fcntl.fcntl(fd, fcntl.F_SETFL, oldflags)

You need the :mod:`termios` and the :mod:`fcntl` module for any of this to work,
and I've only tried it on Linux, though it should work elsewhere. In this code,
characters are read and printed one at a time.
:func:`termios.tcsetattr` turns off stdin's echoing and disables canonical mode.
:func:`fcntl.fcntl` is used to obtain stdin's file descriptor flags and modify
them for non-blocking mode. Since reading stdin when it is empty results in an
:exc:`IOError`, this error is caught and ignored.
Threads
=======
How do I program using threads?
-------------------------------
.. XXX it's _thread in py3k
Be sure to use the :mod:`threading` module and not the :mod:`thread` module.
The :mod:`threading` module builds convenient abstractions on top of the
low-level primitives provided by the :mod:`thread` module.
Aahz has a set of slides from his threading tutorial that are helpful; see
http://starship.python.net/crew/aahz/OSCON2001/.
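As a starting point, here is a minimal sketch of the :mod:`threading` abstractions in use: several ``Thread`` objects, a ``Lock`` protecting shared data, and ``join()`` to wait for completion. The function and variable names are arbitrary.

```python
import threading

results = []
lock = threading.Lock()

def work(n):
    # The lock protects the shared list while each thread appends to it.
    with lock:
        results.append(n * n)

threads = [threading.Thread(target=work, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()   # wait for each thread to finish

print(sorted(results))
```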
None of my threads seem to run: why?
------------------------------------
As soon as the main thread exits, all threads are killed. Your main thread is
running too quickly, giving the threads no time to do any work.
A simple fix is to add a sleep to the end of the program that's long enough for
all the threads to finish::

   import threading, time

   def thread_task(name, n):
       for i in range(n): print name, i

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10) # <----------------------------!

But now (on many platforms) the threads don't run in parallel, but appear to run
sequentially, one at a time! The reason is that the OS thread scheduler doesn't
start a new thread until the previous thread is blocked.
A simple fix is to add a tiny sleep to the start of the run function::

   def thread_task(name, n):
       time.sleep(0.001) # <---------------------!
       for i in range(n): print name, i

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)

Instead of trying to guess a good delay value for :func:`time.sleep`,
it's better to use some kind of semaphore mechanism. One idea is to use the
:mod:`Queue` module to create a queue object, let each thread append a token to
the queue when it finishes, and let the main thread read as many tokens from the
queue as there are threads.
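
The token idea can be sketched as follows. This is an illustration under stated
assumptions rather than a definitive implementation: it targets Python 3, where
the module is named ``queue`` (``Queue`` in Python 2), and the helper
``run_and_wait`` is a hypothetical name:

```python
import queue
import threading

def thread_task(name, n, done):
    for i in range(n):
        pass  # real work would go here
    done.put(name)  # token: this thread has finished

def run_and_wait(count):
    done = queue.Queue()
    for i in range(count):
        threading.Thread(target=thread_task, args=(str(i), i, done)).start()
    # Block until exactly one token per thread has arrived.
    return sorted(done.get() for _ in range(count))

print(run_and_wait(10))
```

Because ``done.get()`` blocks, the main thread waits exactly as long as needed
and no sleep interval has to be guessed.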
How do I parcel out work among a bunch of worker threads?
---------------------------------------------------------
Use the :mod:`Queue` module to create a queue containing a list of jobs. The
:class:`~Queue.Queue` class maintains a list of objects with ``.put(obj)`` to
add an item to the queue and ``.get()`` to return an item. The class will take
care of the locking necessary to ensure that each job is handed out exactly
once.
Here's a trivial example::

   import threading, Queue, time

   # The worker thread gets jobs off the queue.  When the queue is empty, it
   # assumes there will be no more work and exits.
   # (Realistically workers will run until terminated.)
   def worker():
       print 'Running worker'
       time.sleep(0.1)
       while True:
           try:
               arg = q.get(block=False)
           except Queue.Empty:
               print 'Worker', threading.currentThread(),
               print 'queue empty'
               break
           else:
               print 'Worker', threading.currentThread(),
               print 'running with argument', arg
               time.sleep(0.5)

   # Create queue
   q = Queue.Queue()

   # Start a pool of 5 workers
   for i in range(5):
       t = threading.Thread(target=worker, name='worker %i' % (i+1))
       t.start()

   # Begin adding work to the queue
   for i in range(50):
       q.put(i)

   # Give threads time to run
   print 'Main thread sleeping'
   time.sleep(5)

When run, this will produce the following output:
Running worker
Running worker
Running worker
Running worker
Running worker
Main thread sleeping
Worker <Thread(worker 1, started)> running with argument 0
Worker <Thread(worker 2, started)> running with argument 1
Worker <Thread(worker 3, started)> running with argument 2
Worker <Thread(worker 4, started)> running with argument 3
Worker <Thread(worker 5, started)> running with argument 4
Worker <Thread(worker 1, started)> running with argument 5
...
Consult the module's documentation for more details; the ``Queue`` class
provides a featureful interface.
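
The example above stops a worker as soon as it sees an empty queue and relies on
a final sleep. One possible variation, sketched here for Python 3 (module
``queue``), uses a sentinel object to tell each worker to exit and ``join()`` to
wait for them. The names ``STOP`` and ``process`` and the doubling "work" are
hypothetical placeholders:

```python
import queue
import threading

STOP = object()  # hypothetical sentinel telling a worker to exit

def worker(jobs, results):
    while True:
        job = jobs.get()        # blocks until a job is available
        if job is STOP:
            break
        results.put(job * 2)    # stand-in for real work

def process(items, num_workers=5):
    jobs, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in items:
        jobs.put(item)
    for _ in threads:
        jobs.put(STOP)          # one sentinel per worker
    for t in threads:
        t.join()                # wait for every worker to exit
    return sorted(results.get() for _ in range(len(items)))

print(process(range(10)))
```

Each job is still handed out exactly once, because the queue does the locking.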
What kinds of global value mutation are thread-safe?
----------------------------------------------------
A global interpreter lock (GIL) is used internally to ensure that only one
thread runs in the Python VM at a time. In general, Python offers to switch
among threads only between bytecode instructions; how frequently it switches can
be set via :func:`sys.setcheckinterval`. Each bytecode instruction, and
therefore all the C implementation code reached from each instruction, is
atomic from the point of view of a Python program.
In theory, this means an exact accounting requires an exact understanding of the
PVM bytecode implementation. In practice, it means that operations on shared
variables of builtin data types (ints, lists, dicts, etc) that "look atomic"
really are.
For example, the following operations are all atomic (L, L1, L2 are lists, D,
D1, D2 are dicts, x, y are objects, i, j are ints)::
L.append(x)
L1.extend(L2)
x = L[i]
x = L.pop()
L1[i:j] = L2
L.sort()
x = y
x.field = y
D[x] = y
D1.update(D2)
D.keys()
These aren't::
i = i+1
L.append(L[-1])
L[i] = L[j]
D[x] = D[x] + 1
Operations that replace other objects may invoke those other objects'
:meth:`__del__` method when their reference count reaches zero, and that can
affect things. This is especially true for the mass updates to dictionaries and
lists. When in doubt, use a mutex!
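
As a sketch of the "use a mutex" advice, the non-atomic ``i = i+1`` pattern can
be protected with a :class:`threading.Lock`. The names ``counter``,
``add_many`` and ``run`` are illustrative only, and the code is written for
Python 3:

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        with counter_lock:   # makes the read-modify-write step atomic
            counter += 1

def run(num_threads=4, n=10000):
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(n,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print(run())
```

Without the lock, two threads could read the same old value and one increment
would be lost.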
Can't we get rid of the Global Interpreter Lock?
------------------------------------------------
.. XXX mention multiprocessing
The Global Interpreter Lock (GIL) is often seen as a hindrance to Python's
deployment on high-end multiprocessor server machines, because a multi-threaded
Python program effectively only uses one CPU, due to the insistence that
(almost) all Python code can only run while the GIL is held.
Back in the days of Python 1.5, Greg Stein actually implemented a comprehensive
patch set (the "free threading" patches) that removed the GIL and replaced it
with fine-grained locking. Unfortunately, even on Windows (where locks are very
efficient) this ran ordinary Python code about twice as slow as the interpreter
using the GIL. On Linux the performance loss was even worse because pthread
locks aren't as efficient.
Since then, the idea of getting rid of the GIL has occasionally come up but
nobody has found a way to deal with the expected slowdown, and users who don't
use threads would not be happy if their code ran at half the speed. Greg's
free threading patch set has not been kept up-to-date for later Python versions.
This doesn't mean that you can't make good use of Python on multi-CPU machines!
You just have to be creative with dividing the work up between multiple
*processes* rather than multiple *threads*. Judicious use of C extensions will
also help; if you use a C extension to perform a time-consuming task, the
extension can release the GIL while the thread of execution is in the C code and
allow other threads to get some work done.
It has been suggested that the GIL should be a per-interpreter-state lock rather
than truly global; interpreters then wouldn't be able to share objects.
Unfortunately, this isn't likely to happen either. It would be a tremendous
amount of work, because many object implementations currently have global state.
For example, small integers and short strings are cached; these caches would
have to be moved to the interpreter state. Other object types have their own
free list; these free lists would have to be moved to the interpreter state.
And so on.
And I doubt that it can even be done in finite time, because the same problem
exists for 3rd party extensions. It is likely that 3rd party extensions are
being written at a faster rate than you can convert them to store all their
global state in the interpreter state.
And finally, once you have multiple interpreters not sharing any state, what
have you gained over running each interpreter in a separate process?
Input and Output
================
How do I delete a file? (And other file questions...)
-----------------------------------------------------
Use ``os.remove(filename)`` or ``os.unlink(filename)``; for documentation, see
the :mod:`os` module. The two functions are identical; :func:`unlink` is simply
the name of the Unix system call for this function.
To remove a directory, use :func:`os.rmdir`; use :func:`os.mkdir` to create one.
``os.makedirs(path)`` will create any intermediate directories in ``path`` that
don't exist. ``os.removedirs(path)`` will remove intermediate directories as
long as they're empty; if you want to delete an entire directory tree and its
contents, use :func:`shutil.rmtree`.
To rename a file, use ``os.rename(old_path, new_path)``.
To truncate a file, open it using ``f = open(filename, "r+")``, and use
``f.truncate(offset)``; offset defaults to the current seek position. There's
also ``os.ftruncate(fd, offset)`` for files opened with :func:`os.open`, where
``fd`` is the file descriptor (a small integer).
The :mod:`shutil` module also contains a number of functions to work on files
including :func:`~shutil.copyfile`, :func:`~shutil.copytree`, and
:func:`~shutil.rmtree`.
How do I copy a file?
---------------------
The :mod:`shutil` module contains a :func:`~shutil.copyfile` function. Note
that on MacOS 9 it doesn't copy the resource fork and Finder info.
How do I read (or write) binary data?
-------------------------------------
To read or write complex binary data formats, it's best to use the :mod:`struct`
module. It allows you to take a string containing binary data (usually numbers)
and convert it to Python objects; and vice versa.
For example, the following code reads two 2-byte integers and one 4-byte integer
in big-endian format from a file::
import struct
f = open(filename, "rb") # Open in binary mode for portability
s = f.read(8)
x, y, z = struct.unpack(">hhl", s)
The '>' in the format string forces big-endian data; the letter 'h' reads one
"short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from the
string.
For data that is more regular (e.g. a homogeneous list of ints or floats),
you can also use the :mod:`array` module.
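
A small round trip may make both modules more concrete. This sketch is written
for Python 3, where packed data is a ``bytes`` object, and the values are
arbitrary:

```python
import array
import struct

# Pack two 2-byte integers and one 4-byte integer in big-endian order.
packed = struct.pack(">hhl", 1, -2, 100000)
x, y, z = struct.unpack(">hhl", packed)

# A homogeneous array of C doubles, converted to raw bytes and back.
values = array.array("d", [1.0, 2.5, 4.0])
raw = values.tobytes()
restored = array.array("d")
restored.frombytes(raw)

print(len(packed), (x, y, z), restored == values)
```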
I can't seem to use os.read() on a pipe created with os.popen(); why?
---------------------------------------------------------------------
:func:`os.read` is a low-level function which takes a file descriptor, a small
integer representing the opened file. :func:`os.popen` creates a high-level
file object, the same type returned by the builtin :func:`open` function. Thus,
to read n bytes from a pipe p created with :func:`os.popen`, you need to use
``p.read(n)``.
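
The difference between the two levels can be sketched with :func:`os.pipe`,
used here only as a stand-in for a real child process. The function name
``read_both_ways`` is hypothetical and the code targets Python 3:

```python
import os

def read_both_ways():
    read_fd, write_fd = os.pipe()          # two raw file descriptors (small ints)
    os.write(write_fd, b"hello world")
    os.close(write_fd)
    first = os.read(read_fd, 5)            # low-level read on the descriptor
    with os.fdopen(read_fd, "rb") as f:    # wrap the descriptor in a file object
        rest = f.read()                    # high-level read on the file object
    return first, rest

print(read_both_ways())
```

Going the other way, a file object's ``fileno()`` method returns the underlying
descriptor if a low-level call is really needed.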
How do I run a subprocess with pipes connected to both input and output?
------------------------------------------------------------------------
.. XXX update to use subprocess
Use the :mod:`popen2` module. For example::
import popen2
fromchild, tochild = popen2.popen2("command")
tochild.write("input\n")
tochild.flush()
output = fromchild.readline()
Warning: in general it is unwise to do this because you can easily cause a
deadlock where your process is blocked waiting for output from the child while
the child is blocked waiting for input from you. This can be caused because the
parent expects the child to output more text than it does, or it can be caused
by data being stuck in stdio buffers due to lack of flushing. The Python parent
can of course explicitly flush the data it sends to the child before it reads
any output, but if the child is a naive C program it may have been written to
never explicitly flush its output, even if it is interactive, since flushing is
normally automatic.
Note that a deadlock is also possible if you use :func:`popen3` to read stdout
and stderr. If one of the two is too large for the internal buffer (increasing
the buffer size does not help) and you ``read()`` the other one first, there is
a deadlock, too.
Note on a bug in popen2: unless your program calls ``wait()`` or ``waitpid()``,
finished child processes are never removed, and eventually calls to popen2 will
fail because of a limit on the number of child processes. Calling
:func:`os.waitpid` with the :data:`os.WNOHANG` option can prevent this; a good
place to insert such a call would be before calling ``popen2`` again.
In many cases, all you really need is to run some data through a command and get
the result back. Unless the amount of data is very large, the easiest way to do
this is to write it to a temporary file and run the command with that temporary
file as input. The standard module :mod:`tempfile` exports a ``mktemp()``
function to generate unique temporary file names. ::

   import tempfile
   import os

   class Popen3:
       """
       This is a deadlock-safe version of popen that returns
       an object with errorlevel, out (a string) and err (a string).
       (capturestderr may not work under windows.)
       Example: print Popen3('grep spam','\n\nhere spam\n\n').out
       """
       def __init__(self,command,input=None,capturestderr=None):
           outfile=tempfile.mktemp()
           command="( %s ) > %s" % (command,outfile)
           if input:
               infile=tempfile.mktemp()
               open(infile,"w").write(input)
               command=command+" <"+infile
           if capturestderr:
               errfile=tempfile.mktemp()
               command=command+" 2>"+errfile
           self.errorlevel=os.system(command) >> 8
           self.out=open(outfile,"r").read()
           os.remove(outfile)
           if input:
               os.remove(infile)
           if capturestderr:
               self.err=open(errfile,"r").read()
               os.remove(errfile)

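
The same "run some data through a command and get the result back" pattern can
also be sketched with the :mod:`subprocess` module, whose ``communicate()``
method feeds the child's input and drains its output pipes together, which is
intended to avoid the deadlock described above. The helper name ``run_filter``
and the upper-casing child script are assumptions made for this example
(Python 3):

```python
import subprocess
import sys

def run_filter(argv, data):
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate(data)   # feeds stdin and drains both pipes
    return proc.returncode, out, err

# Hypothetical child: a one-line Python script that upper-cases its input.
child = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]
print(run_filter(child, b"here spam\n"))
```

Since ``communicate()`` buffers everything in memory, the temporary-file
approach may still be preferable for very large amounts of data.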
Note that many interactive programs (e.g. vi) don't work well with pipes
substituted for standard input and output. You will have to use pseudo ttys
("ptys") instead of pipes. Or you can use a Python interface to Don Libes'
"expect" library. A Python extension that interfaces to expect is called "expy"
and available from http://expectpy.sourceforge.net. A pure Python solution that
works like expect is `pexpect <http://pexpect.sourceforge.net>`_.
How do I access the serial (RS232) port?
----------------------------------------
For Win32, POSIX (Linux, BSD, etc.), Jython:
http://pyserial.sourceforge.net
For Unix, see a Usenet post by Mitch Chapman:
http://groups.google.com/groups?selm=34A04430.CF9@ohioee.com
Why doesn't closing sys.stdout (stdin, stderr) really close it?
---------------------------------------------------------------
Python file objects are a high-level layer of abstraction on top of C streams,
which in turn are a medium-level layer of abstraction on top of (among other
things) low-level C file descriptors.
For most file objects you create in Python via the builtin ``file`` constructor,
``f.close()`` marks the Python file object as being closed from Python's point
of view, and also arranges to close the underlying C stream. This also happens
automatically in f's destructor, when f becomes garbage.
But stdin, stdout and stderr are treated specially by Python, because of the
special status also given to them by C. Running ``sys.stdout.close()`` marks
the Python-level file object as being closed, but does *not* close the
associated C stream.
To close the underlying C stream for one of these three, you should first be
sure that's what you really want to do (e.g., you may confuse extension modules
trying to do I/O). If it is, use os.close::
os.close(0) # close C's stdin stream
os.close(1) # close C's stdout stream
os.close(2) # close C's stderr stream
Network/Internet Programming
============================
What WWW tools are there for Python?
------------------------------------
See the chapters titled :ref:`internet` and :ref:`netdata` in the Library
Reference Manual. Python has many modules that will help you build server-side
and client-side web systems.
.. XXX check if wiki page is still up to date
A summary of available frameworks is maintained by Paul Boddie at
http://wiki.python.org/moin/WebProgramming .
Cameron Laird maintains a useful set of pages about Python web technologies at
http://phaseit.net/claird/comp.lang.python/web_python.
How can I mimic CGI form submission (METHOD=POST)?
--------------------------------------------------
I would like to retrieve web pages that are the result of POSTing a form. Is
there existing code that would let me do this easily?
Yes. Here's a simple example that uses httplib::

   #!/usr/local/bin/python

   import httplib, sys, time

   ### build the query string
   qs = "First=Josephine&MI=Q&Last=Public"

   ### connect and send the server a path
   httpobj = httplib.HTTP('www.some-server.out-there', 80)
   httpobj.putrequest('POST', '/cgi-bin/some-cgi-script')
   ### now generate the rest of the HTTP headers...
   httpobj.putheader('Accept', '*/*')
   httpobj.putheader('Connection', 'Keep-Alive')
   httpobj.putheader('Content-type', 'application/x-www-form-urlencoded')
   httpobj.putheader('Content-length', '%d' % len(qs))
   httpobj.endheaders()
   httpobj.send(qs)
   ### find out what the server said in response...
   reply, msg, hdrs = httpobj.getreply()
   if reply != 200:
       sys.stdout.write(httpobj.getfile().read())

Note that in general for URL-encoded POST operations, query strings must be
quoted by using :func:`urllib.quote`. For example to send name="Guy Steele,
Jr."::
>>> from urllib import quote
>>> x = quote("Guy Steele, Jr.")
>>> x
'Guy%20Steele%2C%20Jr.'
>>> query_string = "name="+x
>>> query_string
'name=Guy%20Steele%2C%20Jr.'
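
Rather than quoting each value by hand, the whole query string can be built from
a mapping. This sketch assumes Python 3, where the function lives in
``urllib.parse`` (``urllib.urlencode`` in Python 2). Note that it encodes
spaces as ``+``, which is the form used for URL-encoded form data:

```python
from urllib.parse import urlencode

# Field names and values are taken from the examples above.
fields = {"First": "Josephine", "MI": "Q", "Last": "Public"}
qs = urlencode(fields)

name_query = urlencode({"name": "Guy Steele, Jr."})

print(qs)
print(name_query)
```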
What module should I use to help with generating HTML?
------------------------------------------------------
.. XXX add modern template languages
There are many different modules available:
* HTMLgen is a class library of objects corresponding to all the HTML 3.2 markup
tags. It's used when you are writing in Python and wish to synthesize HTML
pages for generating a web or for CGI forms, etc.
* DocumentTemplate and Zope Page Templates are two different systems that are
part of Zope.
* Quixote's PTL uses Python syntax to assemble strings of text.
Consult the `Web Programming wiki pages
<http://wiki.python.org/moin/WebProgramming>`_ for more links.
How do I send mail from a Python script?
----------------------------------------
Use the standard library module :mod:`smtplib`.
Here's a very simple interactive mail sender that uses it. This method will
work on any host that supports an SMTP listener. ::

   import sys, smtplib

   fromaddr = raw_input("From: ")
   toaddrs  = raw_input("To: ").split(',')
   print "Enter message, end with ^D:"
   msg = ''
   while True:
       line = sys.stdin.readline()
       if not line:
           break
       msg += line

   # The actual mail send
   server = smtplib.SMTP('localhost')
   server.sendmail(fromaddr, toaddrs, msg)
   server.quit()

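
The message passed to ``sendmail()`` is sent as typed, so it should normally
include its own headers. One way to build such a message is the :mod:`email`
package; the sketch below assumes Python 3.6 or later for
``email.message.EmailMessage``, and the helper name ``build_message`` and the
addresses are placeholders:

```python
from email.message import EmailMessage

def build_message(fromaddr, toaddrs, subject, body):
    msg = EmailMessage()
    msg["From"] = fromaddr
    msg["To"] = ", ".join(toaddrs)
    msg["Subject"] = subject
    msg.set_content(body)       # plain-text body
    return msg

msg = build_message("sender@example.com", ["receiver@example.com"],
                    "test", "Some text\n")
# With a reachable SMTP server, smtplib.SMTP(host).send_message(msg)
# would send it; that step is left out so the sketch runs offline.
print(msg["Subject"], msg["To"])
```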
A Unix-only alternative uses sendmail. The location of the sendmail program
varies between systems; sometimes it is ``/usr/lib/sendmail``, sometimes
``/usr/sbin/sendmail``. The sendmail manual page will help you out. Here's
some sample code::

   SENDMAIL = "/usr/sbin/sendmail" # sendmail location
   import os
   p = os.popen("%s -t -i" % SENDMAIL, "w")
   p.write("To: receiver@example.com\n")
   p.write("Subject: test\n")
   p.write("\n") # blank line separating headers from body
   p.write("Some text\n")
   p.write("some more text\n")
   sts = p.close()
   if sts:  # close() returns None when the exit status is zero
       print "Sendmail exit status", sts

How do I avoid blocking in the connect() method of a socket?
------------------------------------------------------------
The select module is commonly used to help with asynchronous I/O on sockets.
To prevent the TCP connect from blocking, you can set the socket to non-blocking
mode. Then when you do the ``connect()``, you will either connect immediately
(unlikely) or get an exception that contains the error number as ``.errno``.
``errno.EINPROGRESS`` indicates that the connection is in progress, but hasn't
finished yet. Different OSes will return different values, so you're going to
have to check what's returned on your system.
You can use the ``connect_ex()`` method to avoid creating an exception. It will
just return the errno value. To poll, you can call ``connect_ex()`` again later
-- 0 or ``errno.EISCONN`` indicate that you're connected -- or you can pass this
socket to select to check if it's writable.
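
The approach can be sketched as follows. This is an illustration under
assumptions, not a complete implementation: it is written for Python 3, the
function name ``nonblocking_connect`` is made up, and the set of "in progress"
error codes may need adjusting for a particular operating system, as noted
above:

```python
import errno
import os
import select
import socket

# errno values that mean "connection attempt started, not finished yet".
IN_PROGRESS = (errno.EINPROGRESS, errno.EWOULDBLOCK, errno.EALREADY)

def nonblocking_connect(address, timeout=5.0):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    err = sock.connect_ex(address)     # returns an errno value, no exception
    if err != 0 and err not in IN_PROGRESS:
        sock.close()
        raise OSError(err, os.strerror(err))
    # The socket becomes writable once the connect attempt has completed.
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        sock.close()
        raise TimeoutError("connect did not finish within the timeout")
    # SO_ERROR reports whether the completed attempt succeeded.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock
```

In a real program the ``select()`` call would usually be part of a larger event
loop, so other work can proceed while the connection is being established.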
Databases
=========
Are there any interfaces to database packages in Python?
--------------------------------------------------------
Yes.
.. XXX remove bsddb in py3k, fix other module names
Python 2.3 includes the :mod:`bsddb` package which provides an interface to the
BerkeleyDB library. Interfaces to disk-based hashes such as :mod:`DBM <dbm>`
and :mod:`GDBM <gdbm>` are also included with standard Python.
Support for most relational databases is available. See the
`DatabaseProgramming wiki page
<http://wiki.python.org/moin/DatabaseProgramming>`_ for details.
How do you implement persistent objects in Python?
--------------------------------------------------
The :mod:`pickle` library module solves this in a very general way (though you
still can't store things like open files, sockets or windows), and the
:mod:`shelve` library module uses pickle and (g)dbm to create persistent
mappings containing arbitrary Python objects. For better performance, you can
use the :mod:`cPickle` module.
A more awkward way of doing things is to use pickle's little sister, marshal.
The :mod:`marshal` module provides very fast ways to store noncircular basic
Python types to files and strings, and back again. Although marshal does not do
fancy things like store instances or handle shared references properly, it does
run extremely fast. For example loading a half megabyte of data may take less
than a third of a second. This often beats doing something more complex and
general such as using gdbm with pickle/shelve.
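
A brief sketch of the three modules side by side, written for Python 3 (where
the separate :mod:`cPickle` module no longer exists). The function name
``round_trips`` and the sample record are illustrative:

```python
import marshal
import os
import pickle
import shelve
import tempfile

def round_trips(record):
    # pickle and marshal: serialize to bytes and back again.
    via_pickle = pickle.loads(pickle.dumps(record))
    via_marshal = marshal.loads(marshal.dumps(record))
    # shelve: a persistent, dict-like mapping stored on disk.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "store")
        with shelve.open(path) as db:
            db["record"] = record
        with shelve.open(path) as db:   # reopen to read what was stored
            via_shelve = db["record"]
    return via_pickle, via_marshal, via_shelve

record = {"name": "spam", "sizes": [1, 2, 3]}
print(round_trips(record))
```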
Why is cPickle so slow?
-----------------------
.. XXX update this, default protocol is 2/3
The default format used by the pickle module is a slow one that results in
readable pickles. A faster binary format is available by passing a higher
protocol number. It would be nice to make it the default, but that would break
backward compatibility::

   import cPickle

   largeString = 'z' * (100 * 1024)
   myPickle = cPickle.dumps(largeString, protocol=1)

If my program crashes with a bsddb (or anydbm) database open, it gets corrupted. How come?
------------------------------------------------------------------------------------------
Databases opened for write access with the bsddb module (and often by the anydbm
module, since it will preferentially use bsddb) must explicitly be closed using
the ``.close()`` method of the database. The underlying library caches database
contents which need to be converted to on-disk form and written.
If you have initialized a new bsddb database but not written anything to it
before the program crashes, you will often wind up with a zero-length file and
encounter an exception the next time the file is opened.
I tried to open Berkeley DB file, but bsddb produces bsddb.error: (22, 'Invalid argument'). Help! How can I restore my data?
----------------------------------------------------------------------------------------------------------------------------
Don't panic! Your data is probably intact. The most frequent cause for the error
is that you tried to open an earlier Berkeley DB file with a later version of
the Berkeley DB library.
Many Linux systems now have all three versions of Berkeley DB available. If you
are migrating from version 1 to a newer version use db_dump185 to dump a plain
text version of the database. If you are migrating from version 2 to version 3
use db2_dump to create a plain text version of the database. In either case,
use db_load to create a new native database for the latest version installed on
your computer. If you have version 3 of Berkeley DB installed, you should be
able to use db2_load to create a native version 2 database.
You should move away from Berkeley DB version 1 files because the hash file code
contains known bugs that can corrupt your data.
Mathematics and Numerics
========================
How do I generate random numbers in Python?
-------------------------------------------
The standard module :mod:`random` implements a random number generator. Usage
is simple::
import random
random.random()
This returns a random floating point number in the range [0, 1).
There are also many other specialized generators in this module, such as:
* ``randrange(a, b)`` chooses an integer in the range [a, b).
* ``uniform(a, b)`` chooses a floating point number in the range [a, b).
* ``normalvariate(mean, sdev)`` samples the normal (Gaussian) distribution.
Some higher-level functions operate on sequences directly, such as:
* ``choice(S)`` chooses a random element from a given sequence
* ``shuffle(L)`` shuffles a list in-place, i.e. permutes it randomly
There's also a ``Random`` class you can instantiate to create multiple
independent random number generators.
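
A short sketch of these functions, using separately seeded ``Random`` instances
so that runs are repeatable. The seed values and variable names are arbitrary,
and the code is written for Python 3:

```python
import random

# Two independent generators with the same seed produce the same sequence.
gen_a = random.Random(42)
gen_b = random.Random(42)
same_sequence = ([gen_a.random() for _ in range(3)] ==
                 [gen_b.random() for _ in range(3)])

rng = random.Random(0)
die = rng.randrange(1, 7)           # integer in [1, 7)
x = rng.uniform(2.0, 3.0)           # float between 2.0 and 3.0
pick = rng.choice(["a", "b", "c"])  # one element of the sequence
deck = list(range(10))
rng.shuffle(deck)                   # in-place random permutation

print(same_sequence, die, x, pick, deck)
```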

1752
Doc/faq/programming.rst Normal file

File diff suppressed because it is too large


607
Doc/faq/windows.rst Normal file

@ -0,0 +1,607 @@
:tocdepth: 2
.. _windows-faq:
=====================
Python on Windows FAQ
=====================
.. contents::
How do I run a Python program under Windows?
--------------------------------------------
This is not necessarily a straightforward question. If you are already familiar
with running programs from the Windows command line then everything will seem
obvious; otherwise, you might need a little more guidance. There are also
differences between Windows 95, 98, NT, ME, 2000 and XP which can add to the
confusion.
.. sidebar:: |Python Development on XP|_
:subtitle: `Python Development on XP`_
This series of screencasts aims to get you up and running with Python on
Windows XP. The knowledge is distilled into 1.5 hours and will get you up
and running with the right Python distribution, coding in your choice of IDE,
and debugging and writing solid code with unit-tests.
.. |Python Development on XP| image:: python-video-icon.png
.. _`Python Development on XP`:
http://www.showmedo.com/videos/series?name=pythonOzsvaldPyNewbieSeries
Unless you use some sort of integrated development environment, you will end up
*typing* Windows commands into what is variously referred to as a "DOS window"
or "Command prompt window". Usually you can create such a window from your
Start menu; under Windows 2000 the menu selection is :menuselection:`Start -->
Programs --> Accessories --> Command Prompt`. You should be able to recognize
when you have started such a window because you will see a Windows "command
prompt", which usually looks like this::
C:\>
The letter may be different, and there might be other things after it, so you
might just as easily see something like::
D:\Steve\Projects\Python>
depending on how your computer has been set up and what else you have recently
done with it. Once you have started such a window, you are well on the way to
running Python programs.
You need to realize that your Python scripts have to be processed by another
program called the Python interpreter. The interpreter reads your script,
compiles it into bytecodes, and then executes the bytecodes to run your
program. So, how do you arrange for the interpreter to handle your Python?
First, you need to make sure that your command window recognises the word
"python" as an instruction to start the interpreter. If you have opened a
command window, you should try entering the command ``python`` and hitting
return. You should then see something like::
Python 2.2 (#28, Dec 21 2001, 12:21:22) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
You have started the interpreter in "interactive mode". That means you can enter
Python statements or expressions interactively and have them executed or
evaluated while you wait. This is one of Python's strongest features. Check it
by entering a few expressions of your choice and seeing the results::
>>> print "Hello"
Hello
>>> "Hello" * 3
'HelloHelloHello'
Many people use the interactive mode as a convenient yet highly programmable
calculator. When you want to end your interactive Python session, hold the Ctrl
key down while you enter a Z, then hit the "Enter" key to get back to your
Windows command prompt.
You may also find that you have a Start-menu entry such as :menuselection:`Start
--> Programs --> Python 2.2 --> Python (command line)` that results in you
seeing the ``>>>`` prompt in a new window. If so, the window will disappear
after you enter the Ctrl-Z character; Windows is running a single "python"
command in the window, and closes it when you terminate the interpreter.
If the ``python`` command, instead of displaying the interpreter prompt ``>>>``,
gives you a message like::
'python' is not recognized as an internal or external command,
operable program or batch file.
.. sidebar:: |Adding Python to DOS Path|_
:subtitle: `Adding Python to DOS Path`_
Python is not added to the DOS path by default. This screencast will walk
you through the steps to add the correct entry to the `System Path`, allowing
Python to be executed from the command-line by all users.
.. |Adding Python to DOS Path| image:: python-video-icon.png
.. _`Adding Python to DOS Path`:
http://showmedo.com/videos/video?name=960000&fromSeriesID=96
or::
Bad command or filename
then you need to make sure that your computer knows where to find the Python
interpreter. To do this you will have to modify a setting called PATH, which is
a list of directories where Windows will look for programs.
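
Once an interpreter can be started by some means, Python itself can show what
PATH contains and whether a program would be found on it. This sketch assumes
Python 3.3 or later for :func:`shutil.which`, which is newer than the versions
this FAQ describes; the helper names ``path_entries`` and ``find_program`` are
hypothetical:

```python
import os
import shutil

def path_entries(path_value=None):
    # PATH is a single string; os.pathsep is ";" on Windows and ":" on Unix.
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    return [entry for entry in path_value.split(os.pathsep) if entry]

def find_program(name, path_value=None):
    # shutil.which returns the full path of the executable, or None.
    return shutil.which(name, path=path_value)

for entry in path_entries():
    print(entry)
print(find_program("python"))
```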
You should arrange for Python's installation directory to be added to the PATH
of every command window as it starts. If you installed Python fairly recently
then the command ::
dir C:\py*
will probably tell you where it is installed; the usual location is something
like ``C:\Python23``. Otherwise you will be reduced to a search of your whole
disk ... use :menuselection:`Tools --> Find` or hit the :guilabel:`Search`
button and look for "python.exe". Supposing you discover that Python is
installed in the ``C:\Python23`` directory (the default at the time of writing),
you should make sure that entering the command ::
c:\Python23\python
starts up the interpreter as above (and don't forget you'll need a "CTRL-Z" and
an "Enter" to get out of it). Once you have verified the directory, you need to
add it to the start-up routines your computer goes through. For older versions
of Windows the easiest way to do this is to edit the ``C:\AUTOEXEC.BAT``
file. You would want to add a line like the following to ``AUTOEXEC.BAT``::
PATH C:\Python23;%PATH%
For Windows NT, 2000 and (I assume) XP, you will need to add a string such as ::
;C:\Python23
to the current setting for the PATH environment variable, which you will find in
the properties window of "My Computer" under the "Advanced" tab. Note that if
you have sufficient privilege you might get a choice of installing the settings
either for the Current User or for System. The latter is preferred if you want
everybody to be able to run Python on the machine.
If you aren't confident doing any of these manipulations yourself, ask for help!
At this stage you may want to reboot your system to make absolutely sure the new
setting has taken effect. You probably won't need to reboot for Windows NT, XP
or 2000. You can also avoid it in earlier versions by editing the file
``C:\WINDOWS\COMMAND\CMDINIT.BAT`` instead of ``AUTOEXEC.BAT``.
You should now be able to start a new command window, enter ``python`` at the
``C:\>`` (or whatever) prompt, and see the ``>>>`` prompt that indicates the
Python interpreter is reading interactive commands.
Let's suppose you have a program called ``pytest.py`` in directory
``C:\Steve\Projects\Python``. A session to run that program might look like
this::
C:\> cd \Steve\Projects\Python
C:\Steve\Projects\Python> python pytest.py
Because you added a file name to the command to start the interpreter, when it
starts up it reads the Python script in the named file, compiles it, executes
it, and terminates, so you see another ``C:\>`` prompt. You might also have
entered ::
C:\> python \Steve\Projects\Python\pytest.py
if you hadn't wanted to change your current directory.
Under NT, 2000 and XP you may well find that the installation process has also
arranged that the command ``pytest.py`` (or, if the file isn't in the current
directory, ``C:\Steve\Projects\Python\pytest.py``) will automatically recognize
the ".py" extension and run the Python interpreter on the named file. Using this
feature is fine, but *some* versions of Windows have bugs which mean that this
form isn't exactly equivalent to using the interpreter explicitly, so be
careful.
The important things to remember are:
1. Start Python from the Start Menu, or make sure the PATH is set correctly so
Windows can find the Python interpreter. ::
python
should give you a '>>>' prompt from the Python interpreter. Don't forget the
CTRL-Z and ENTER to terminate the interpreter (and, if you started the window
from the Start Menu, make the window disappear).
2. Once this works, you run programs with commands::
python {program-file}
3. When you know the commands to use you can build Windows shortcuts to run the
Python interpreter on any of your scripts, naming particular working
directories, and adding them to your menus. Take a look at ::
python --help
if your needs are complex.
4. Interactive mode (where you see the ``>>>`` prompt) is best used for checking
that individual statements and expressions do what you think they will, and
for developing code by experiment.
How do I make Python scripts executable?
----------------------------------------
On Windows 2000, the standard Python installer already associates the .py
extension with a file type (Python.File) and gives that file type an open
command that runs the interpreter (``D:\Program Files\Python\python.exe "%1"
%*``). This is enough to make scripts executable from the command prompt as
'foo.py'. If you'd rather be able to execute the script by simply typing 'foo'
with no extension, you need to add .py to the PATHEXT environment variable.
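As a rough illustration, the following Python sketch checks whether ``.py`` is
listed in ``PATHEXT``. The function name ``pathext_has_py`` is made up for this
example; it only inspects the variable and does not change it.

```python
import os


def pathext_has_py(pathext=None):
    """Return True if ".py" appears in a PATHEXT-style string."""
    if pathext is None:
        # PATHEXT is normally only set on Windows; fall back to "" elsewhere.
        pathext = os.environ.get("PATHEXT", "")
    # Entries are separated by semicolons and compared case-insensitively.
    extensions = [ext.strip().lower() for ext in pathext.split(";")]
    return ".py" in extensions


if __name__ == "__main__":
    print(pathext_has_py())
```

If this reports ``False``, typing 'foo' without the extension will probably not
find 'foo.py'.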
On Windows NT, the steps taken by the installer as described above allow you to
run a script with 'foo.py', but a longtime bug in the NT command processor
prevents you from redirecting the input or output of any script executed in this
way. This is often important.
The incantation for making a Python script executable under WinNT is to give the
file an extension of .cmd and add the following as the first line::
@setlocal enableextensions & python -x %~f0 %* & goto :EOF
Why does Python sometimes take so long to start?
------------------------------------------------
Usually Python starts very quickly on Windows, but occasionally there are bug
reports that Python suddenly begins to take a long time to start up. This is
made even more puzzling because Python will work fine on other Windows systems
which appear to be configured identically.
The problem may be caused by a misconfiguration of virus checking software on
the problem machine. Some virus scanners have been known to introduce startup
overhead of two orders of magnitude when the scanner is configured to monitor
all reads from the filesystem. Try checking the configuration of virus scanning
software on your systems to ensure that they are indeed configured identically.
McAfee, when configured to scan all file system read activity, is a particular
offender.
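To compare machines, it can help to measure startup time directly. The sketch
below is one possible way to do that, not an official tool: it starts a fresh
interpreter that does nothing and reports the fastest of a few runs. The
function name ``startup_seconds`` is an assumption made for this example.

```python
import subprocess
import sys
import time


def startup_seconds(runs=3):
    """Return the fastest observed time to start and exit the interpreter."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # "-c pass" starts the interpreter, runs an empty statement and exits.
        subprocess.call([sys.executable, "-c", "pass"])
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    print("fastest startup: %.3f seconds" % startup_seconds())
```

Running it with the virus scanner enabled and then disabled should show whether
the scanner accounts for the delay.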
Where is Freeze for Windows?
----------------------------
"Freeze" is a program that allows you to ship a Python program as a single
stand-alone executable file. It is *not* a compiler; your programs don't run
any faster, but they are more easily distributable, at least to platforms with
the same OS and CPU. Read the README file of the freeze program for more
disclaimers.
You can use freeze on Windows, but you must download the source tree (see
http://www.python.org/download/source). The freeze program is in the
``Tools\freeze`` subdirectory of the source tree.
You need the Microsoft VC++ compiler, and you probably need to build Python.
The required project files are in the PCbuild directory.
Is a ``*.pyd`` file the same as a DLL?
--------------------------------------
.. XXX update for py3k (PyInit_foo)
Yes, .pyd files are DLLs, but there are a few differences. If you have a DLL
named ``foo.pyd``, then it must have a function ``initfoo()``. You can then
write Python "import foo", and Python will search for foo.pyd (as well as
foo.py, foo.pyc) and if it finds it, will attempt to call ``initfoo()`` to
initialize it. You do not link your .exe with foo.lib, as that would cause
Windows to require the DLL to be present.
Note that the search path for foo.pyd is PYTHONPATH, not the same as the path
that Windows uses to search for foo.dll. Also, foo.pyd need not be present to
run your program, whereas if you linked your program with a dll, the dll is
required. Of course, foo.pyd is required if you want to say ``import foo``. In
a DLL, linkage is declared in the source code with ``__declspec(dllexport)``.
In a .pyd, linkage is defined in a list of available functions.
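The search described above can be approximated in Python. This is a simplified
sketch, not the real import machinery: it walks ``sys.path`` (which is built
from PYTHONPATH among other sources) and lists files that could satisfy
``import foo``. The helper name ``find_module_files`` is hypothetical.

```python
import os
import sys


def find_module_files(name, search_path=None, suffixes=(".pyd", ".py", ".pyc")):
    """List files on the search path that could provide module `name`."""
    if search_path is None:
        search_path = sys.path
    hits = []
    for directory in search_path:
        for suffix in suffixes:
            # An empty path entry stands for the current directory.
            candidate = os.path.join(directory or os.curdir, name + suffix)
            if os.path.isfile(candidate):
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    print(find_module_files("foo"))
```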
How can I embed Python into a Windows application?
--------------------------------------------------
Embedding the Python interpreter in a Windows app can be summarized as follows:
1. Do *not* build Python into your .exe file directly. On Windows, Python must
be a DLL to handle importing modules that are themselves DLLs. (This is the
first key undocumented fact.) Instead, link to :file:`python{NN}.dll`; it is
typically installed in ``C:\Windows\System``. NN is the Python version, a
number such as "23" for Python 2.3.
You can link to Python statically or dynamically. Linking statically means
linking against :file:`python{NN}.lib`, while dynamically linking means
linking against :file:`python{NN}.dll`. The drawback to dynamic linking is
that your app won't run if :file:`python{NN}.dll` does not exist on your
system. (General note: :file:`python{NN}.lib` is the so-called "import lib"
corresponding to :file:`python{NN}.dll`. It merely defines symbols for the
linker.)
Linking dynamically greatly simplifies link options; everything happens at
run time. Your code must load :file:`python{NN}.dll` using the Windows
``LoadLibraryEx()`` routine. The code must also access routines and data
in :file:`python{NN}.dll` (that is, Python's C API) using pointers obtained
by the Windows ``GetProcAddress()`` routine. Macros can make using these
pointers transparent to any C code that calls routines in Python's C API.
Borland note: convert :file:`python{NN}.lib` to OMF format using Coff2Omf.exe
first.
2. If you use SWIG, it is easy to create a Python "extension module" that will
make the app's data and methods available to Python. SWIG will handle just
about all the grungy details for you. The result is C code that you link
*into* your .exe file (!). You do *not* have to create a DLL file, and this
also simplifies linking.
3. SWIG will create an init function (a C function) whose name depends on the
name of the extension module. For example, if the name of the module is leo,
the init function will be called initleo(). If you use SWIG shadow classes,
as you should, the init function will be called initleoc(). This initializes
a mostly hidden helper class used by the shadow class.
The reason you can link the C code in step 2 into your .exe file is that
calling the initialization function is equivalent to importing the module
into Python! (This is the second key undocumented fact.)
4. In short, you can use the following code to initialize the Python interpreter
with your extension module.
.. code-block:: c
#include "Python.h"
...
Py_Initialize(); // Initialize Python.
initmyAppc(); // Initialize (import) the helper class.
PyRun_SimpleString("import myApp"); // Import the shadow class.
5. There are two problems with Python's C API which will become apparent if you
use a compiler other than MSVC, the compiler used to build pythonNN.dll.
Problem 1: The so-called "Very High Level" functions that take FILE *
arguments will not work in a multi-compiler environment because each
compiler's notion of a struct FILE will be different. From an implementation
standpoint these are very *low* level functions.
Problem 2: SWIG generates the following code when generating wrappers to void
functions:
.. code-block:: c
Py_INCREF(Py_None);
_resultobj = Py_None;
return _resultobj;
Alas, Py_None is a macro that expands to a reference to a complex data
structure called _Py_NoneStruct inside pythonNN.dll. Again, this code will
fail in a multi-compiler environment. Replace such code by:
.. code-block:: c
return Py_BuildValue("");
It may be possible to use SWIG's ``%typemap`` command to make the change
automatically, though I have not been able to get this to work (I'm a
complete SWIG newbie).
6. Using a Python shell script to put up a Python interpreter window from inside
your Windows app is not a good idea; the resulting window will be independent
of your app's windowing system. Rather, you (or the wxPythonWindow class)
should create a "native" interpreter window. It is easy to connect that
window to the Python interpreter. You can redirect Python's I/O to *any*
object that supports read and write, so all you need is a Python object
(defined in your extension module) that contains read() and write() methods.
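The redirection mentioned in the last step can be sketched in pure Python. In a
real application the object would forward text to the native window; here a
stand-in class named ``WindowWriter`` (an invented name) simply collects the
text so the idea can be tried without any GUI.

```python
import sys


class WindowWriter:
    """Stand-in for a window object: collects everything written to it."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        # A real implementation would append the text to the window here.
        self.chunks.append(text)
        return len(text)

    def flush(self):
        # Nothing is buffered, so there is nothing to flush.
        pass


def run_captured(source):
    """Run Python source with sys.stdout redirected to a WindowWriter."""
    writer = WindowWriter()
    saved = sys.stdout
    sys.stdout = writer
    try:
        exec(source, {})
    finally:
        # Always restore the original stream, even if the code raises.
        sys.stdout = saved
    return "".join(writer.chunks)
```

Input can be handled the same way by assigning an object with a suitable
``read()`` or ``readline()`` method to ``sys.stdin``.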
How do I use Python for CGI?
----------------------------
On the Microsoft IIS server or on the Win95 MS Personal Web Server you set up
Python in the same way that you would set up any other scripting engine.
Run regedt32 and go to::
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W3SVC\Parameters\ScriptMap
and enter the following line (making any specific changes that your system may
need)::
.py :REG_SZ: c:\<path to python>\python.exe -u %s %s
This line will allow you to call your script with a simple reference like:
http://yourserver/scripts/yourscript.py provided "scripts" is an "executable"
directory for your server (which it usually is by default). The "-u" flag
specifies unbuffered and binary mode for stdin - needed when working with binary
data.
In addition, using ".py" as the file extension may not be a good idea in this
context (you might want to reserve ``*.py`` for support modules and use
``*.cgi`` or ``*.cgp`` for "main program" scripts).
In order to set up Internet Information Services 5 to use Python for CGI
processing, please see the following links:
http://www.e-coli.net/pyiis_server.html (for Win2k Server)
http://www.e-coli.net/pyiis.html (for Win2k pro)
Configuring Apache is much simpler. In the Apache configuration file
``httpd.conf``, add the following line at the end of the file::
ScriptInterpreterSource Registry
Then, give your Python CGI-scripts the extension .py and put them in the cgi-bin
directory.
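Once the server is configured, a script placed in the executable directory only
needs to write a header block, a blank line, and a body to standard output. The
following is a minimal sketch for testing a setup; the function name
``cgi_response`` and the message text are made up for this example.

```python
import sys


def cgi_response(body):
    """Build a plain-text CGI response: headers, a blank line, then the body."""
    return "Content-Type: text/plain\n\n" + body


if __name__ == "__main__":
    # The web server sends whatever the script writes to stdout to the client.
    sys.stdout.write(cgi_response("Hello from Python CGI\n"))
```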
How do I keep editors from inserting tabs into my Python source?
----------------------------------------------------------------
The FAQ does not recommend using tabs, and the Python style guide, :pep:`8`,
recommends 4 spaces for distributed Python code; this is also the Emacs
python-mode default.
Under any editor, mixing tabs and spaces is a bad idea. MSVC is no different in
this respect, and is easily configured to use spaces: Take :menuselection:`Tools
--> Options --> Tabs`, and for file type "Default" set "Tab size" and "Indent
size" to 4, and select the "Insert spaces" radio button.
If you suspect mixed tabs and spaces are causing problems in leading whitespace,
run Python with the :option:`-t` switch or run ``Tools/Scripts/tabnanny.py`` to
check a directory tree in batch mode.
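For a quick check of a single piece of source text, something like the
following sketch may be enough. It is far cruder than ``tabnanny``: it merely
reports the line numbers whose leading whitespace contains a tab. The function
name ``lines_with_tab_indent`` is invented for this example.

```python
def lines_with_tab_indent(source):
    """Return 1-based numbers of lines whose indentation contains a tab."""
    flagged = []
    for number, line in enumerate(source.splitlines(), 1):
        # The indentation is whatever precedes the first non-blank character.
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            flagged.append(number)
    return flagged


if __name__ == "__main__":
    print(lines_with_tab_indent("if x:\n\ty = 1\n    z = 2\n"))
```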
How do I check for a keypress without blocking?
-----------------------------------------------
Use the msvcrt module. This is a standard Windows-specific extension module.
It defines a function ``kbhit()`` which checks whether a keyboard hit is
present, and ``getch()`` which gets one character without echoing it.
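A minimal polling helper might look like the sketch below. The ``console``
parameter is an addition made for this example so the function can be tried
with a stand-in object; when it is omitted, the real ``msvcrt`` module is used,
which is only available on Windows.

```python
def poll_key(console=None):
    """Return one pending key press, or None if no key is waiting."""
    if console is None:
        import msvcrt as console  # Windows only
    if console.kbhit():
        # getch() returns immediately here because a key is already waiting.
        return console.getch()
    return None
```

Calling ``poll_key()`` inside a loop lets a program keep working between key
presses instead of waiting for input.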
How do I emulate os.kill() in Windows?
--------------------------------------
Use win32api::
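   def kill(pid):
       """kill function for Win32"""
       import win32api
       # Access right 1 is PROCESS_TERMINATE; 0 means the handle is not inherited.
       handle = win32api.OpenProcess(1, 0, pid)
       # Ask Windows to end the process with exit code 0.
       return (0 != win32api.TerminateProcess(handle, 0))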
Why does os.path.isdir() fail on NT shared directories?
-------------------------------------------------------
The solution appears to be to always append the "\\" on the end of shared
drives.
>>> import os
>>> os.path.isdir( '\\\\rorschach\\public')
0
>>> os.path.isdir( '\\\\rorschach\\public\\')
1
It helps to think of share points as being like drive letters. Example::
k: is not a directory
k:\ is a directory
k:\media is a directory
k:\media\ is not a directory
The same rules apply if you substitute "k:" with "\\conky\foo"::
\\conky\foo is not a directory
\\conky\foo\ is a directory
\\conky\foo\media is a directory
\\conky\foo\media\ is not a directory
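Following these rules, a small helper can add the trailing backslash only when
a path names a bare drive or share point. This is a sketch under the
assumptions above; the name ``normalize_share_root`` is hypothetical, and
``ntpath`` is used so the string handling behaves the same on any platform.

```python
import ntpath


def normalize_share_root(path):
    """Append a backslash when `path` is only a drive letter or share point."""
    # splitdrive() treats both "k:" and "\\\\server\\share" as the drive part.
    drive, rest = ntpath.splitdrive(path)
    if drive and not rest:
        return drive + "\\"
    return path


if __name__ == "__main__":
    print(normalize_share_root("\\\\conky\\foo"))
```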
cgi.py (or other CGI programming) doesn't work sometimes on NT or win95!
------------------------------------------------------------------------
Be sure you have the latest python.exe, that you are using python.exe rather
than a GUI version of Python and that you have configured the server to execute
::
"...\python.exe -u ..."
for the CGI execution. The :option:`-u` (unbuffered) option on NT and Win95
prevents the interpreter from altering newlines in the standard input and
output. Without it post/multipart requests will seem to have the wrong length
and binary (e.g. GIF) responses may get garbled (resulting in broken images, PDF
files, and other binary downloads failing).
Why doesn't os.popen() work in PythonWin on NT?
-----------------------------------------------
The reason that os.popen() doesn't work from within PythonWin is due to a bug in
Microsoft's C Runtime Library (CRT). The CRT assumes you have a Win32 console
attached to the process.
You should use the win32pipe module's popen() instead, which doesn't depend on
having an attached Win32 console.
Example::
import win32pipe
f = win32pipe.popen('dir /c c:\\')
print(f.readlines())
f.close()
Why doesn't os.popen()/win32pipe.popen() work on Win9x?
-------------------------------------------------------
There is a bug in Win9x that prevents os.popen/win32pipe.popen* from
working. The good news is there is a way to work around this problem. The
Microsoft Knowledge Base article that you need to look up is: Q150956. You will
find links to the knowledge base at: http://www.microsoft.com/kb.
PyRun_SimpleFile() crashes on Windows but not on Unix; why?
-----------------------------------------------------------
This is very sensitive to the compiler vendor, version and (perhaps) even
options. If the FILE* structure in your embedding program isn't the same as is
assumed by the Python interpreter it won't work.
The Python 1.5.* DLLs (``python15.dll``) are all compiled with MS VC++ 5.0 and
with multithreading-DLL options (``/MD``).
If you can't change compilers or flags, try using :cfunc:`PyRun_SimpleString`.
A trick to get it to run an arbitrary file is to construct a call to
:func:`execfile` with the name of your file as argument.
Also note that you cannot mix and match Debug and Release versions. If you
wish to use the Debug Multithreaded DLL, then your module *must* have an "_d"
appended to the base name.
Importing _tkinter fails on Windows 95/98: why?
------------------------------------------------
Sometimes, the import of _tkinter fails on Windows 95 or 98, complaining with a
message like the following::
ImportError: DLL load failed: One of the library files needed
to run this application cannot be found.
It could be that you haven't installed Tcl/Tk, but if you did install Tcl/Tk,
and the Wish application works correctly, the problem may be that its installer
didn't manage to edit the autoexec.bat file correctly. It tries to add a
statement that changes the PATH environment variable to include the Tcl/Tk 'bin'
subdirectory, but sometimes this edit doesn't quite work. Opening autoexec.bat
with Notepad usually reveals what the problem is.
(One additional hint, noted by David Szafranski: you can't use long filenames
here; e.g. use ``C:\PROGRA~1\Tcl\bin`` instead of ``C:\Program Files\Tcl\bin``.)
How do I extract the downloaded documentation on Windows?
---------------------------------------------------------
Sometimes, when you download the documentation package to a Windows machine
using a web browser, the file extension of the saved file ends up being .EXE.
This is a mistake; the extension should be .TGZ.
Simply rename the downloaded file to have the .TGZ extension, and WinZip will be
able to handle it. (If your copy of WinZip doesn't, get a newer one from
http://www.winzip.com.)
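If Python itself is already installed, the standard ``tarfile`` module offers
another way to unpack the archive, and it does not care what extension the file
has. The following is a sketch only; the function name ``extract_tgz`` and its
arguments are assumptions for this example, and it should be used only on
archives you trust.

```python
import tarfile


def extract_tgz(archive_path, destination):
    """Unpack a gzip-compressed tar archive, whatever its file extension."""
    # Mode "r:gz" forces gzip-compressed tar handling regardless of the name.
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(destination)
```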
Missing cw3215mt.dll (or missing cw3215.dll)
--------------------------------------------
Sometimes, when using Tkinter on Windows, you get an error that cw3215mt.dll or
cw3215.dll is missing.
Cause: you have an old Tcl/Tk DLL built with cygwin in your path (probably
``C:\Windows``). You must use the Tcl/Tk DLLs from the standard Tcl/Tk
installation (Python 1.5.2 comes with one).
Warning about CTL3D32 version from installer
--------------------------------------------
The Python installer issues a warning like this::
This version uses ``CTL3D32.DLL`` which is not the correct version.
This version is used for windows NT applications only.
Tim Peters:
This is a Microsoft DLL, and a notorious source of problems. The message
means what it says: you have the wrong version of this DLL for your operating
system. The Python installation did not cause this -- something else you
installed previous to this overwrote the DLL that came with your OS (probably
older shareware of some sort, but there's no way to tell now). If you search
for "CTL3D32" using any search engine (AltaVista, for example), you'll find
hundreds and hundreds of web pages complaining about the same problem with
all sorts of installation programs. They'll point you to ways to get the
correct version reinstalled on your system (since Python doesn't cause this,
we can't fix it).
David A Burton has written a little program to fix this. Go to
http://www.burtonsys.com/download.html and click on "ctl3dfix.zip".

View File

@ -26,6 +26,8 @@
<span class="linkdescr">sharing modules with others</span></p>
<p class="biglink"><a class="biglink" href="{{ pathto("documenting/index") }}">Documenting Python</a><br/>
<span class="linkdescr">guide for documentation authors</span></p>
<p class="biglink"><a class="biglink" href="{{ pathto("faq/index") }}">FAQs</a><br/>
<span class="linkdescr">frequently asked questions (with answers!)</span></p>
</td></tr>
</table>