\documentclass{howto}
\usepackage{distutils}
% $Id$

\title{What's New in Python 2.4}
\release{0.0}
\author{A.M.\ Kuchling}
\authoraddress{\email{amk@amk.ca}}

\begin{document}
\maketitle
\tableofcontents

This article explains the new features in Python 2.4. No release date
for Python 2.4 has been set; expect that this will happen mid-2004.

While Python 2.3 was primarily a library development release, Python
2.4 may extend the core language and interpreter in
as-yet-undetermined ways.

This article doesn't attempt to provide a complete specification of
the new features, but instead provides a convenient overview. For
full details, you should refer to the documentation for Python 2.4.
% add hyperlink when the documentation becomes available online.
If you want to understand the complete implementation and design
rationale, refer to the PEP for a particular new feature.

%======================================================================
\section{PEP 218: Built-In Set Objects}

Two new built-in types, \function{set(iterable)} and
\function{frozenset(iterable)}, provide high-speed data types for
membership testing, for eliminating duplicates from sequences, and
for mathematical operations like unions, intersections, differences,
and symmetric differences.

\begin{verbatim}
>>> a = set('abracadabra')              # form a set from a string
>>> 'z' in a                            # fast membership testing
False
>>> a                                   # unique letters in a
set(['a', 'r', 'b', 'c', 'd'])
>>> ''.join(a)                          # convert back into a string
'arbcd'

>>> b = set('alacazam')                 # form a second set
>>> a - b                               # letters in a but not in b
set(['r', 'd', 'b'])
>>> a | b                               # letters in either a or b
set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
>>> a & b                               # letters in both a and b
set(['a', 'c'])
>>> a ^ b                               # letters in a or b but not both
set(['r', 'd', 'b', 'm', 'z', 'l'])

>>> a.add('z')                          # add a new element
>>> a.update('wxy')                     # add multiple new elements
>>> a
set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'x', 'z'])
>>> a.remove('x')                       # take one element out
>>> a
set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'z'])
\end{verbatim}
The type \function{frozenset()} is an immutable version of \function{set()}.
Since it is immutable and hashable, it may be used as a dictionary key or
as a member of another set. Accordingly, it does not have methods
like \method{add()} and \method{remove()} that could alter its contents.
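
As a rough illustration of the paragraph above, the following sketch uses
frozensets as dictionary keys and as members of another set. The player
names and counts are invented for the example, and the code is written so
that it should also run on later Python releases.

```python
# Sketch only: the names and numbers are made-up illustration data.

# A frozenset is hashable, so it can serve as a dictionary key.
# Here each key is an unordered pair of players.
games_played = {
    frozenset(['ann', 'bob']): 3,
    frozenset(['ann', 'cy']): 1,
}

# Element order is irrelevant: this builds an equal key and finds the entry.
ann_bob_games = games_played[frozenset(['bob', 'ann'])]

# frozensets can also be members of an ordinary set; equal ones collapse.
pairs = set([frozenset('ab'), frozenset('ba'), frozenset('cd')])
pair_count = len(pairs)

# The mutating methods of set() are absent on the immutable type.
can_add = hasattr(frozenset(), 'add')
can_remove = hasattr(frozenset(), 'remove')
```

Using a plain \function{set()} as the key instead fails with a
\exception{TypeError}, because mutable sets are not hashable.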
\begin{seealso}
\seepep{218}{Adding a Built-In Set Object Type}{Originally proposed by
Greg Wilson and ultimately implemented by Raymond Hettinger.}
\end{seealso}
%======================================================================
\section{PEP 322: Reverse Iteration}

A new built-in function, \function{reversed(seq)}, takes a sequence
and returns an iterator over the elements of the sequence
in reverse order.

\begin{verbatim}
>>> for i in reversed(xrange(1,4)):
...     print i
...
3
2
1
\end{verbatim}
Compared to extended slicing, \code{range(1,4)[::-1]}, \function{reversed()}
is easier to read, runs faster, and uses substantially less memory.

Note that \function{reversed()} only accepts sequences, not arbitrary
iterators. If you want to reverse an iterator, first convert it to
a list with \function{list()}.

\begin{verbatim}
>>> input = open('/etc/passwd', 'r')
>>> for line in reversed(list(input)):
...     print line
...
root:*:0:0:System Administrator:/var/root:/bin/tcsh
...
\end{verbatim}
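
The sequence-only restriction can be seen without touching the file system.
The sketch below is an illustration under stated assumptions: the list of
strings merely stands in for an open file, and the variable names are
invented for the example.

```python
# Sketch only: 'lines' stands in for any iterator, such as an open file.

# A real sequence can be handed to reversed() directly.
squares = [n * n for n in range(1, 5)]      # [1, 4, 9, 16]
backwards = list(reversed(squares))

# An arbitrary iterator is rejected with a TypeError...
lines = iter(['first', 'second', 'third'])
try:
    reversed(lines)
    rejected = False
except TypeError:
    rejected = True

# ...so it is converted to a list first, as the text suggests.
last_to_first = list(reversed(list(lines)))
```

Converting with \function{list()} reads the whole iterator into memory,
so this approach suits inputs of moderate size.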
\begin{seealso}
\seepep{322}{Reverse Iteration}{Written and implemented by Raymond Hettinger.}
\end{seealso}


%======================================================================
\section{Other Language Changes}

Here are all of the changes that Python 2.4 makes to the core Python
language.

\begin{itemize}
\item The string methods \method{ljust()}, \method{rjust()}, and
\method{center()} now take an optional argument for specifying a
fill character other than a space.

\item The \method{sort()} method of lists gained three keyword
arguments, \var{cmp}, \var{key}, and \var{reverse}. These arguments
make some common usages of \method{sort()} simpler. All are optional.

\var{cmp} is the same as the previous single argument to
\method{sort()}; if provided, the value should be a comparison
function that takes two arguments and returns -1, 0, or +1 depending
on how the arguments compare.

\var{key} should be a single-argument function that takes a list
element and returns a comparison key for the element. The list is
then sorted using the comparison keys. The following example sorts a
list case-insensitively:

\begin{verbatim}
>>> L = ['A', 'b', 'c', 'D']
>>> L.sort()                 # Case-sensitive sort
>>> L
['A', 'D', 'b', 'c']
>>> L.sort(key=lambda x: x.lower())
>>> L
['A', 'b', 'c', 'D']
>>> L.sort(cmp=lambda x,y: cmp(x.lower(), y.lower()))
>>> L
['A', 'b', 'c', 'D']
\end{verbatim}
The last example, which uses the \var{cmp} parameter, is the old way
to perform a case-insensitive sort. It works, but is slower than
using a \var{key} parameter. Using \var{key} calls the
\method{lower()} method once for each element in the list, while using
\var{cmp} calls the method twice for each comparison.

For simple key functions and comparison functions, it is often
possible to avoid a \keyword{lambda} expression by using an unbound
method instead. For example, the above case-insensitive sort is best
coded as:

\begin{verbatim}
>>> L.sort(key=str.lower)
>>> L
['A', 'b', 'c', 'D']
\end{verbatim}
The \var{reverse} parameter should have a Boolean value. If the value is
\constant{True}, the list will be sorted into reverse order. Instead
of \code{L.sort(lambda x,y: cmp(y.score, x.score))}, you can now write
\code{L.sort(key = lambda x: x.score, reverse=True)}.

The results of sorting are now guaranteed to be stable. This means
that two entries with equal keys will be returned in the same order as
they were input. For example, you can sort a list of people by name,
and then sort the list by age, resulting in a list sorted by age where
people with the same age are in name-sorted order.

\item The list type gained a \method{sorted(iterable)} method that works
like the in-place \method{sort()} method but has been made suitable for
use in expressions. The differences are:
  \begin{itemize}
  \item the input may be any iterable;
  \item a newly formed copy is sorted, leaving the original intact; and
  \item the expression returns the new sorted copy.
  \end{itemize}

\begin{verbatim}
>>> L = [9,7,8,3,2,4,1,6,5]
>>> [10+i for i in list.sorted(L)]  # usable in a list comprehension
[11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> L                               # original is left unchanged
[9, 7, 8, 3, 2, 4, 1, 6, 5]

>>> list.sorted('Monty Python')     # any iterable may be an input
[' ', 'M', 'P', 'h', 'n', 'n', 'o', 'o', 't', 't', 'y', 'y']

>>> # List the contents of a dict sorted by key values
>>> colormap = dict(red=1, blue=2, green=3, black=4, yellow=5)
>>> for k, v in list.sorted(colormap.iteritems()):
...     print k, v
...
black 4
blue 2
green 3
red 1
yellow 5
\end{verbatim}
\item The \function{zip()} built-in function and \function{itertools.izip()}
now return an empty list (for \function{izip()}, an empty iterator) instead of
raising a \exception{TypeError} exception if called with no arguments. This
makes the functions more suitable for use with variable-length argument lists:

\begin{verbatim}
>>> def transpose(array):
...    return zip(*array)
...
>>> transpose([(1,2,3), (4,5,6)])
[(1, 4), (2, 5), (3, 6)]
>>> transpose([])
[]
\end{verbatim}
\end{itemize}
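
The stability guarantee described in the list above is what makes
multi-pass sorting work. The following sketch is one way to picture it;
the \code{(name, age)} records and variable names are invented for the
example, and only the \var{key} and \var{reverse} arguments described
above are used.

```python
# Sketch only: the records are invented (name, age) pairs.
people = [('Carol', 30), ('alice', 25), ('Bob', 30), ('dave', 25)]

# First pass: order by name, ignoring case.
people.sort(key=lambda person: person[0].lower())

# Second pass: order by age.  Because the sort is stable, people of the
# same age stay in the name order produced by the first pass.
people.sort(key=lambda person: person[1])

# reverse=True puts the oldest first; entries with equal ages still keep
# their existing relative order.
oldest_first = list(people)
oldest_first.sort(key=lambda person: person[1], reverse=True)
```

Sorting on the least significant key first and the most significant key
last is the usual order for this technique.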
%======================================================================
\subsection{Optimizations}

\begin{itemize}

\item Optimizations should be described here.

\end{itemize}

The net result of the 2.4 optimizations is that Python 2.4 runs the
pystone benchmark around XX\% faster than Python 2.3 and YY\% faster
than Python 2.2.


%======================================================================
\section{New, Improved, and Deprecated Modules}

As usual, Python's standard library received a number of enhancements and
bug fixes. Here's a partial list of the most notable changes, sorted
alphabetically by module name. Consult the
\file{Misc/NEWS} file in the source tree for a more
complete list of changes, or look through the CVS logs for all the
details.

\begin{itemize}
\item The \module{curses} module now supports the ncurses extension
\function{use_default_colors()}. On platforms where the terminal
supports transparency, this makes it possible to use a transparent background.
(Contributed by J\"org Lehmann.)

\item The \module{heapq} module has been converted to C. The resulting
ten-fold improvement in speed makes the module suitable for handling
high volumes of data.

\item The \module{imaplib} module now supports IMAP's THREAD command.
(Contributed by Yves Dionne.)

\item The \module{itertools} module gained a
\function{groupby(\var{iterable}\optional{, \var{func}})} function,
inspired by the GROUP BY clause from SQL.
\var{iterable} returns a succession of elements, and the optional
\var{func} is a function that takes an element and returns a key
value; if omitted, the key is simply the element itself.
\function{groupby()} then groups the elements into subsequences
which have matching values of the key, and returns a series of 2-tuples
containing the key value and an iterator over the subsequence.

Here's an example. The \var{func} function simply returns whether a
number is even or odd, so the result of \function{groupby()} is to
return consecutive runs of odd or even numbers.

\begin{verbatim}
>>> import itertools
>>> L = [2,4,6, 7,8,9,11, 12, 14]
>>> for key_val, it in itertools.groupby(L, lambda x: x % 2):
...    print key_val, list(it)
...
0 [2, 4, 6]
1 [7]
0 [8]
1 [9, 11]
0 [12, 14]
>>>
\end{verbatim}
Like its SQL counterpart, \function{groupby()} is typically used with
sorted input. The logic for \function{groupby()} is similar to the
\UNIX{} \code{uniq} filter, which makes it handy for eliminating,
counting, or identifying duplicate elements:

\begin{verbatim}
>>> from itertools import groupby
>>> word = 'abracadabra'
>>> [k for k, g in groupby(list.sorted(word))]
['a', 'b', 'c', 'd', 'r']
>>> [(k, len(list(g))) for k, g in groupby(list.sorted(word))]
[('a', 5), ('b', 2), ('c', 1), ('d', 1), ('r', 2)]
>>> [k for k, g in groupby(list.sorted(word)) if len(list(g)) > 1]
['a', 'b', 'r']
\end{verbatim}
\item A new \function{getsid()} function was added to the
\module{posix} module that underlies the \module{os} module.
(Contributed by J. Raynor.)

\item The \module{random} module has a new method called \method{getrandbits(N)},
which returns an N-bit long integer. This method supports the existing
\method{randrange()} method, making it possible to efficiently generate
arbitrarily large random numbers (suitable for prime number generation in
RSA applications).

\item The regular expression language accepted by the \module{re} module
was extended with simple conditional expressions, written as
\code{(?(\var{group})\var{A}|\var{B})}. \var{group} is either a
numeric group ID or a group name defined with \code{(?P<group>...)}
earlier in the expression. If the specified group matched, the
regular expression pattern \var{A} will be tested against the string; if
the group didn't match, the pattern \var{B} will be used instead.

\end{itemize}
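
The conditional syntax for \module{re} described in the list above is
easier to follow with a concrete pattern. The sketch below is only an
illustration: the pattern, the group names \code{quote} and \code{body},
and the helper \function{unquote()} are all invented for the example. It
accepts a word that is either fully quoted or not quoted at all.

```python
import re

# Sketch only: pattern and helper are hypothetical illustrations.
# (?P<quote>")?         optionally match an opening double quote
# (?P<body>\w+)         the word itself
# (?(quote)"|(?!"))     if the opening quote matched, require a closing
#                       quote (pattern A); otherwise require that no
#                       quote follows (pattern B)
pattern = re.compile(r'^(?P<quote>")?(?P<body>\w+)(?(quote)"|(?!"))$')

def unquote(text):
    """Return the bare word, or None if the quoting is unbalanced."""
    match = pattern.match(text)
    if match is None:
        return None
    return match.group('body')

quoted = unquote('"abc"')       # balanced quotes
bare = unquote('abc')           # no quotes at all
missing_close = unquote('"abc') # opening quote only
missing_open = unquote('abc"')  # closing quote only
```

Without the conditional, the same check would typically need two
alternatives that repeat the \code{\textbackslash w+} part of the pattern.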
%======================================================================
% whole new modules get described in \subsections here


% ======================================================================
\section{Build and C API Changes}

Changes to Python's build process and to the C API include:

\begin{itemize}

\item Three new convenience macros were added for common return
values from extension functions: \csimplemacro{Py_RETURN_NONE},
\csimplemacro{Py_RETURN_TRUE}, and \csimplemacro{Py_RETURN_FALSE}.

\item A new function, \cfunction{PyTuple_Pack(N, obj1, obj2, ...,
objN)}, constructs tuples from a variable length argument list of
Python objects.

\item A new function, \cfunction{PyDict_Contains(d, k)}, implements
fast dictionary lookups without masking exceptions raised during the
look-up process.

\end{itemize}
%======================================================================
\subsection{Port-Specific Changes}

Platform-specific changes go here.


%======================================================================
\section{Other Changes and Fixes \label{section-other}}

As usual, there were a bunch of other improvements and bugfixes
scattered throughout the source tree. A search through the CVS change
logs finds there were XXX patches applied and YYY bugs fixed between
Python 2.3 and 2.4. Both figures are likely to be underestimates.

Some of the more notable changes are:

\begin{itemize}

\item Details go here.

\end{itemize}


%======================================================================
\section{Porting to Python 2.4}

This section lists previously described changes that may require
changes to your code:

\begin{itemize}
\item The \function{zip()} built-in function and \function{itertools.izip()}
now return an empty list (for \function{izip()}, an empty iterator) instead of
raising a \exception{TypeError} exception if called with no arguments.

\item \function{dircache.listdir()} now passes exceptions to the caller
instead of returning empty lists.

\end{itemize}
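
Code that relied on the old \exception{TypeError} from \function{zip()}
to notice empty input will no longer see it. One possible way to keep
that behaviour is an explicit check, sketched below; the helper names
\function{transpose()} and \function{transpose_strict()} are hypothetical,
and \function{list()} is wrapped around \function{zip()} so the sketch
also returns a list on releases where \function{zip()} returns an iterator.

```python
# Sketch only: both helpers are hypothetical names for illustration.

def transpose(rows):
    """Return the columns of rows; empty input gives an empty list."""
    return list(zip(*rows))

def transpose_strict(rows):
    """Like transpose(), but reject empty input explicitly."""
    if not rows:
        # Previously zip() with no arguments raised TypeError by itself.
        raise ValueError('transpose_strict() needs at least one row')
    return list(zip(*rows))

columns = transpose([(1, 2, 3), (4, 5, 6)])
nothing = transpose([])
```

Which of the two behaviours is appropriate depends on whether an empty
input is a legitimate value or a sign of a bug in the calling code.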
%======================================================================
\section{Acknowledgements \label{acks}}

The author would like to thank the following people for offering
suggestions, corrections and assistance with various drafts of this
article: Raymond Hettinger.

\end{document}