\documentclass{howto}
% $Id$
\title{What's New in Python 2.3}
\release{0.01}
\author{A.M. Kuchling}
\authoraddress{\email{akuchlin@mems-exchange.org}}
\begin{document}
\maketitle
\tableofcontents
%\section{Introduction \label{intro}}
{\large This article is a draft, and is currently up to date for some
random version of the CVS tree around March 26 2002. Please send any
additions, comments or errata to the author.}

This article explains the new features in Python 2.3. The tentative
release date of Python 2.3 is currently scheduled for August 30 2002.
This article doesn't attempt to provide a complete specification of
the new features, but instead provides a convenient overview. For
full details, you should refer to the documentation for Python 2.3,
such as the
\citetitle[http://www.python.org/doc/2.3/lib/lib.html]{Python Library
Reference} and the
\citetitle[http://www.python.org/doc/2.3/ref/ref.html]{Python
Reference Manual}. If you want to understand the complete
implementation and design rationale for a change, refer to the PEP for
a particular new feature.
%======================================================================
\section{PEP 255: Simple Generators}
In Python 2.2, generators were added as an optional feature, to be
enabled by a \code{from __future__ import generators} directive. In
2.3 generators no longer need to be specially enabled, and are now
always present; this means that \keyword{yield} is now always a
keyword. The rest of this section is a copy of the description of
generators from the ``What's New in Python 2.2'' document; if you read
it when 2.2 came out, you can skip the rest of this section.

Generators are a new feature that interacts with the iterators
introduced in Python 2.2.

You're doubtless familiar with how function calls work in Python or
C. When you call a function, it gets a private namespace where its local
variables are created. When the function reaches a \keyword{return}
statement, the local variables are destroyed and the resulting value
is returned to the caller. A later call to the same function will get
a fresh new set of local variables. But what if the local variables
weren't thrown away on exiting a function? What if you could later
resume the function where it left off? This is what generators
provide; they can be thought of as resumable functions.

Here's the simplest example of a generator function:
\begin{verbatim}
def generate_ints(N):
    for i in range(N):
        yield i
\end{verbatim}
A new keyword, \keyword{yield}, was introduced for generators. Any
function containing a \keyword{yield} statement is a generator
function; this is detected by Python's bytecode compiler which
compiles the function specially as a result.

When you call a generator function, it doesn't return a single value;
instead it returns a generator object that supports the iterator
protocol. On executing the \keyword{yield} statement, the generator
outputs the value of \code{i}, similar to a \keyword{return}
statement. The big difference between \keyword{yield} and a
\keyword{return} statement is that on reaching a \keyword{yield} the
generator's state of execution is suspended and local variables are
preserved. On the next call to the generator's \code{.next()} method,
the function will resume executing immediately after the
\keyword{yield} statement. (For complicated reasons, the
\keyword{yield} statement isn't allowed inside the \keyword{try} block
of a \code{try...finally} statement; read \pep{255} for a full
explanation of the interaction between \keyword{yield} and
exceptions.)

Here's a sample usage of the \function{generate_ints} generator:
\begin{verbatim}
>>> gen = generate_ints(3)
>>> gen
<generator object at 0x8117f90>
>>> gen.next()
0
>>> gen.next()
1
>>> gen.next()
2
>>> gen.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in generate_ints
StopIteration
\end{verbatim}
You could equally write \code{for i in generate_ints(5)}, or
\code{a,b,c = generate_ints(3)}.

Inside a generator function, the \keyword{return} statement can only
be used without a value, and signals the end of the procession of
values; afterwards the generator cannot return any further values.
\keyword{return} with a value, such as \code{return 5}, is a syntax
error inside a generator function. The end of the generator's results
can also be indicated by raising \exception{StopIteration} manually,
or by just letting the flow of execution fall off the bottom of the
function.
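
As a small illustration of the bare \keyword{return} described above,
the following sketch uses a hypothetical helper,
\function{take_until_negative()} (invented for this example, not part
of Python), that stops producing values as soon as it sees a negative
number:

\begin{verbatim}
def take_until_negative(values):
    # Hypothetical helper, for illustration only.
    for v in values:
        if v < 0:
            return      # bare return: the generator is finished
        yield v

result = list(take_until_negative([3, 1, 4, -1, 5]))
# result is [3, 1, 4]; the 5 after the -1 is never reached
\end{verbatim}
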

You could achieve the effect of generators manually by writing your
own class and storing all the local variables of the generator as
instance variables. For example, returning a list of integers could
be done by setting \code{self.count} to 0, and having the
\method{next()} method increment \code{self.count} and return it.
However, for a moderately complicated generator, writing a
corresponding class would be much messier.
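
A minimal sketch of such a hand-written class might look like the
following. The class name \class{IntCounter} is made up for this
example, and it returns the count before incrementing it so that it
produces the same values as \function{generate_ints()}. The
\code{__next__} alias is not part of the iterator protocol described
in this article; it is an assumption added only so that the sketch
also runs on later versions of Python that look for that name.

\begin{verbatim}
class IntCounter:
    # Hand-written counterpart of generate_ints(N); illustrative only.
    def __init__(self, N):
        self.N = N
        self.count = 0      # the generator's local state, kept by hand

    def __iter__(self):
        return self

    def next(self):
        if self.count >= self.N:
            raise StopIteration
        value = self.count
        self.count += 1
        return value

    # Assumption: alias for later Python versions; not needed in 2.3.
    __next__ = next

values = list(IntCounter(3))
# values is [0, 1, 2]
\end{verbatim}

Even for this trivial case, the bookkeeping that \keyword{yield}
handles automatically has to be spelled out explicitly.
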

\file{Lib/test/test_generators.py} contains a number of more
interesting examples. The simplest one implements an in-order
traversal of a tree using generators recursively.
\begin{verbatim}
# A recursive generator that generates Tree leaves in in-order.
def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x
\end{verbatim}
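
To try the traversal, some tree type is needed. The \class{Tree}
class below is a hypothetical stand-in written for this example, not
the one used in the test suite; it only supplies the \code{left},
\code{right} and \code{label} attributes that \function{inorder()}
relies on.

\begin{verbatim}
class Tree:
    # Hypothetical minimal node type, for illustration only.
    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right

def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x

tree = Tree('b', Tree('a'), Tree('c'))
labels = list(inorder(tree))
# labels is ['a', 'b', 'c']
\end{verbatim}
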
Two other examples in \file{Lib/test/test_generators.py} produce
solutions for the N-Queens problem (placing $N$ queens on an $N\times N$
chess board so that no queen threatens another) and the Knight's Tour
(a route that takes a knight to every square of an $N\times N$ chessboard
without visiting any square twice).

The idea of generators comes from other programming languages,
especially Icon (\url{http://www.cs.arizona.edu/icon/}), where the
idea of generators is central. In Icon, every
expression and function call behaves like a generator. One example
from ``An Overview of the Icon Programming Language'' at
\url{http://www.cs.arizona.edu/icon/docs/ipd266.htm} gives an idea of
what this looks like:
\begin{verbatim}
sentence := "Store it in the neighboring harbor"
if (i := find("or", sentence)) > 5 then write(i)
\end{verbatim}
In Icon the \function{find()} function returns the indexes at which the
substring ``or'' is found: 3, 23, 33. In the \keyword{if} statement,
\code{i} is first assigned a value of 3, but 3 is less than 5, so the
comparison fails, and Icon retries it with the second value of 23. 23
is greater than 5, so the comparison now succeeds, and the code prints
the value 23 to the screen.

Python doesn't go nearly as far as Icon in adopting generators as a
central concept. Generators are considered a new part of the core
Python language, but learning or using them isn't compulsory; if they
don't solve any problems that you have, feel free to ignore them.

One novel feature of Python's interface as compared to
Icon's is that a generator's state is represented as a concrete object
(the iterator) that can be passed around to other functions or stored
in a data structure.
\begin{seealso}
\seepep{255}{Simple Generators}{Written by Neil Schemenauer, Tim
Peters, Magnus Lie Hetland. Implemented mostly by Neil Schemenauer
and Tim Peters, with other fixes from the Python Labs crew.}
\end{seealso}
%======================================================================
\section{PEP 278: Universal Newline Support}
XXX write this section
%Highlights: import and friends will understand any of \r, \n and \r\n
%as end of line. Python file input will do the same if you use mode 'U'.
%Everything can be disabled by configuring with --without-universal-newlines.
\begin{seealso}
\seepep{278}{Universal Newline Support}{Written
and implemented by Jack Jansen.}
\end{seealso}
%======================================================================
\section{PEP 285: The \class{bool} Type}
XXX write this section
\begin{seealso}
\seepep{285}{Adding a bool type}{Written and implemented by GvR.}
\end{seealso}
%======================================================================
\section{New and Improved Modules}
arraymodule.c: - add Py_UNICODE arrays
- support +=, *=

distutils: command/bdist_packager, support for Solaris pkgtool
and HP-UX swinstall

Return enhanced tuples in grpmodule

posixmodule: killpg, mknod, fchdir,

Expat is now included with the Python source

Readline: Add get_history_item, get_current_history_length, and
redisplay functions.

Add optional arg to string methods strip(), lstrip(), rstrip().
The optional arg specifies characters to delete.

New method: string.zfill()

Add dict method pop().

New enumerate() built-in.
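
A brief sketch of how a few of the additions listed above might be
used; the sample strings and dictionary are made up for illustration.

\begin{verbatim}
# Optional argument to strip(), lstrip(), rstrip():
# the characters to delete.
s = 'xxhello worldxx'
stripped = s.strip('x')        # 'hello world'
left = s.lstrip('x')           # 'hello worldxx'
right = s.rstrip('x')          # 'xxhello world'

# zfill() pads a string on the left with zeros.
padded = '42'.zfill(5)         # '00042'

# dict.pop() removes a key and returns its value.
d = {'a': 1, 'b': 2}
removed = d.pop('a')           # 1; d is now {'b': 2}

# enumerate() produces (index, item) pairs.
pairs = list(enumerate(['x', 'y', 'z']))
# pairs is [(0, 'x'), (1, 'y'), (2, 'z')]
\end{verbatim}
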
%======================================================================
\section{Interpreter Changes and Fixes}
file object can now be subtyped (did this not work before?)

yield is now always available

This adds the module name and a dot in front of the type name in every
type object initializer, except for built-in types (and those that
already had this). Note that it touches lots of Mac modules -- I have
no way to test these but the changes look right. Apologies if they're
not. This also touches the weakref docs, which contains a sample type
object initializer. It also touches the mmap test output, because the
mmap type's repr is included in that output. It touches object.h to
put the correct description in a comment.

File objects: Grow the string buffer at a mildly exponential rate for
the getc version of get_line. This makes test_bufio finish in 1.7
seconds instead of 57 seconds on my machine (with Py_DEBUG defined).
%======================================================================
\section{Other Changes and Fixes}
The tools used to build the documentation now work under Cygwin as
well as \UNIX.
% ======================================================================
\section{C Interface Changes}
Patch \#527027: Allow building python as shared library with
--enable-shared

pymalloc is now enabled by default (also mention debug-mode pymalloc)

Memory API reworking -- which functions are deprecated?

PyObject_DelItemString() added

PyArg_NoArgs macro is now deprecated
===
Introduce two new flag bits that can be set in a PyMethodDef method
descriptor, as used for the tp_methods slot of a type. These new flag
bits are both optional, and mutually exclusive. Most methods will not
use either. These flags are used to create special method types which
exist in the same namespace as normal methods without having to use
tedious construction code to insert the new special method objects in
the type's tp_dict after PyType_Ready() has been called.

If METH_CLASS is specified, the method will represent a class method
like that returned by the classmethod() built-in.

If METH_STATIC is specified, the method will represent a static method
like that returned by the staticmethod() built-in.

These flags may not be used in the PyMethodDef table for modules since
these special method types are not meaningful in that case; a
ValueError will be raised if these flags are found in that context.
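
For comparison, the Python-level built-ins that these flags are said
to mirror behave as sketched below. This only illustrates
\function{classmethod()} and \function{staticmethod()} as used from
Python code; it does not show the C-level method table, and the class
and method names are invented for the example.

\begin{verbatim}
class Example(object):
    def make(cls):
        # Class method: receives the class itself as first argument.
        return cls()
    make = classmethod(make)

    def add(a, b):
        # Static method: receives no implicit first argument.
        return a + b
    add = staticmethod(add)

instance = Example.make()      # an Example instance
total = Example.add(2, 3)      # 5
\end{verbatim}
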
===
Ports:

OS/2 EMX port

MacOS: Weaklink most toolbox modules, improving backward
compatibility. Modules will no longer fail to load if a single routine
is missing on the current OS version; instead, calling the missing
routine will raise an exception. Should finally fix 531398. 2.2.1
candidate. Also blacklisted some constants with definitions that
were not Python-compatible.

Checked in Sean Reifschneider's RPM spec file and patches.
%======================================================================
\section{Acknowledgements \label{acks}}
The author would like to thank the following people for offering
suggestions, corrections and assistance with various drafts of this
article: Fred~L. Drake, Jr.
\end{document}