mirror of https://github.com/python/cpython
Rewrite two sections
This commit is contained in:
parent
49a5fe107f
commit
c8f8a814e2
|
@ -2,6 +2,10 @@
|
||||||
\usepackage{distutils}
|
\usepackage{distutils}
|
||||||
% $Id$
|
% $Id$
|
||||||
|
|
||||||
|
% Don't write extensive text for new sections; I'll do that.
|
||||||
|
% Feel free to add commented-out reminders of things that need
|
||||||
|
% to be covered. --amk
|
||||||
|
|
||||||
\title{What's New in Python 2.4}
|
\title{What's New in Python 2.4}
|
||||||
\release{0.0}
|
\release{0.0}
|
||||||
\author{A.M.\ Kuchling}
|
\author{A.M.\ Kuchling}
|
||||||
|
@ -89,73 +93,61 @@ Greg Wilson and ultimately implemented by Raymond Hettinger.}
|
||||||
XXX write this.
|
XXX write this.
|
||||||
|
|
||||||
%======================================================================
|
%======================================================================
|
||||||
\section{PEP 229: Generator Expressions}
|
\section{PEP 289: Generator Expressions}
|
||||||
|
|
||||||
Now, simple generators can be coded succinctly as expressions using a syntax
|
The iterator feature introduced in Python 2.2 makes it easier to write
|
||||||
like list comprehensions but with parentheses instead of brackets. These
|
programs that loop through large data sets without having the entire
|
||||||
expressions are designed for situations where the generator is used right
|
data set in memory at one time. Programmers can use iterators and the
|
||||||
away by an enclosing function. Generator expressions are more compact but
|
\module{itertools} module to write code in a fairly functional style.
|
||||||
less versatile than full generator definitions and they tend to be more memory
|
|
||||||
friendly than equivalent list comprehensions.
|
The fly in the ointment has been list comprehensions, because they
|
||||||
|
produce a Python list object containing all of the items, unavoidably
|
||||||
|
pulling them all into memory. When trying to write a program using the functional approach, it would be natural to write something like:
|
||||||
|
|
||||||
\begin{verbatim}
|
\begin{verbatim}
|
||||||
g = (tgtexp for var1 in exp1 for var2 in exp2 if exp3)
|
links = [link for link in get_all_links() if not link.followed]
|
||||||
|
for link in links:
|
||||||
|
...
|
||||||
\end{verbatim}
|
\end{verbatim}
|
||||||
|
|
||||||
is equivalent to:
|
instead of
|
||||||
|
|
||||||
\begin{verbatim}
|
\begin{verbatim}
|
||||||
def __gen(exp):
|
for link in get_all_links():
|
||||||
for var1 in exp:
|
if link.followed:
|
||||||
for var2 in exp2:
|
continue
|
||||||
if exp3:
|
...
|
||||||
yield tgtexp
|
|
||||||
g = __gen(iter(exp1))
|
|
||||||
del __gen
|
|
||||||
\end{verbatim}
|
\end{verbatim}
|
||||||
|
|
||||||
The advantage over full generator definitions is in economy of
|
The first form is more concise and perhaps more readable, but if
|
||||||
expression. Their advantage over list comprehensions is in saving
|
you're dealing with a large number of link objects the second form
|
||||||
memory by creating data only when it is needed rather than forming
|
would have to be used.
|
||||||
a whole list is memory all at once. Applications using memory
|
|
||||||
friendly generator expressions may scale-up to high volumes of data
|
|
||||||
more readily than with list comprehensions.
|
|
||||||
|
|
||||||
Generator expressions are best used in functions that consume their
|
Generator expressions work similarly to list comprehensions but don't
|
||||||
data all at once and would not benefit from having a full list instead
|
materialize the entire list; instead they create a generator that will
|
||||||
of a generator as an input:
|
return elements one by one. The above example could be written as:
|
||||||
|
|
||||||
\begin{verbatim}
|
\begin{verbatim}
|
||||||
>>> sum(i*i for i in range(10))
|
links = (link for link in get_all_links() if not link.followed)
|
||||||
285
|
for link in links:
|
||||||
|
...
|
||||||
>>> sorted(set(i*i for i in xrange(-20, 20) if i%2==1)) # odd squares
|
|
||||||
[1, 9, 25, 49, 81, 121, 169, 225, 289, 361]
|
|
||||||
|
|
||||||
>>> from itertools import izip
|
|
||||||
>>> xvec = [10, 20, 30]
|
|
||||||
>>> yvec = [7, 5, 3]
|
|
||||||
>>> sum(x*y for x,y in izip(xvec, yvec)) # dot product
|
|
||||||
260
|
|
||||||
|
|
||||||
>>> from math import pi, sin
|
|
||||||
>>> sine_table = dict((x, sin(x*pi/180)) for x in xrange(0, 91))
|
|
||||||
|
|
||||||
>>> unique_words = set(word for line in page for word in line.split())
|
|
||||||
|
|
||||||
>>> valedictorian = max((student.gpa, student.name) for student in graduates)
|
|
||||||
|
|
||||||
\end{verbatim}
|
\end{verbatim}
|
||||||
|
|
||||||
For more complex uses of generators, it is strongly recommended that
|
Generator expressions always have to be written inside parentheses, as
|
||||||
the traditional full generator definitions be used instead. In a
|
in the above example. The parentheses signalling a function call also
|
||||||
generator expression, the first for-loop expression is evaluated
|
count, so if you want to create an iterator that will be immediately
|
||||||
as soon as the expression is defined while the other expressions do
|
passed to a function you could write:
|
||||||
not get evaluated until the generator is run. This nuance is never
|
|
||||||
an issue when the generator is used immediately; however, if it is not
|
\begin{verbatim}
|
||||||
used right away, a full generator definition would be much more clear
|
print sum(obj.count for obj in list_all_objects())
|
||||||
about when the sub-expressions are evaluated and would be more obvious
|
\end{verbatim}
|
||||||
about the visibility and lifetime of the variables.
|
|
||||||
|
There are some small differences from list comprehensions. Most
|
||||||
|
notably, the loop variable (\var{obj} in the above example) is not
|
||||||
|
accessible outside of the generator expression. List comprehensions
|
||||||
|
leave the variable assigned to its last value; future versions of
|
||||||
|
Python will change this, making list comprehensions match generator
|
||||||
|
expressions in this respect.
|
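The lazy behaviour described in the new text can be checked with a short sketch. It is written for a current Python 3 interpreter rather than 2.4, and get_all_links() here is a hypothetical stand-in that records which items it has produced; the real function from the example is not shown in the source.

```python
produced = []  # records which items the stand-in source has generated so far


def get_all_links():
    """Hypothetical stand-in for the get_all_links() used in the text."""
    for i in range(6):
        produced.append(i)
        yield {"url": "page%d" % i, "followed": i % 2 == 0}


# Creating the generator expression pulls no items from the source.
links = (link for link in get_all_links() if not link["followed"])
nothing_yet = list(produced)

# Asking for one element runs the source only until an item passes the filter.
first = next(links)
after_first = list(produced)

# Inside a function call, the call's own parentheses are sufficient.
remaining = sum(1 for link in links)


def loop_variable_leaks():
    """Report whether the generator expression's loop variable is visible here."""
    total = sum(n for n in range(3))
    return "n" in locals()
```

After the first element is requested, only the first two source items have been generated; a list comprehension would have consumed all six before the loop body ran.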
||||||
|
|
||||||
\begin{seealso}
|
\begin{seealso}
|
||||||
\seepep{289}{Generator Expressions}{Proposed by Raymond Hettinger and
|
\seepep{289}{Generator Expressions}{Proposed by Raymond Hettinger and
|
||||||
|
@ -203,62 +195,222 @@ root:*:0:0:System Administrator:/var/root:/bin/tcsh
|
||||||
%======================================================================
|
%======================================================================
|
||||||
\section{PEP 327: Decimal Data Type}
|
\section{PEP 327: Decimal Data Type}
|
||||||
|
|
||||||
A new module, \module{decimal}, offers a \class{Decimal} data type for
|
Python has always supported floating-point (FP) numbers as a data
|
||||||
decimal floating point arithmetic. Compared to the built-in \class{float}
|
type, based on the underlying C \ctype{double} type. However, while
|
||||||
type implemented with binary floating point, the new class is especially
|
most programming languages provide a floating-point type, most people
|
||||||
useful for financial applications and other uses which require exact
|
(even programmers) are unaware that computing with floating-point
|
||||||
decimal representation, control over precision, control over rounding
|
numbers entails certain unavoidable inaccuracies. The new decimal
|
||||||
to meet legal or regulatory requirements, tracking of significant
|
type provides a way to avoid these inaccuracies.
|
||||||
decimal places, or for applications where the user expects the results
|
|
||||||
to match hand calculations done the way they were taught in school.
|
|
||||||
|
|
||||||
For example, calculating a 5% tax on a 70 cent phone charge gives
|
\subsection{Why is Decimal needed?}
|
||||||
different results in decimal floating point and binary floating point
|
|
||||||
with the difference being significant when rounding to the nearest
|
|
||||||
cent:
|
|
||||||
|
|
||||||
|
The limitations arise from the representation used for floating-point numbers.
|
||||||
|
FP numbers are made up of three components:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item The sign, which is -1 or +1.
|
||||||
|
\item The mantissa, which is a single-digit binary number
|
||||||
|
followed by a fractional part. For example, \code{1.01} in base-2 notation
|
||||||
|
is \code{1 + 0/2 + 1/4}, or 1.25 in decimal notation.
|
||||||
|
\item The exponent, which tells where the decimal point is located in the number represented.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
For example, the number 1.25 has sign +1, mantissa 1.01 (in binary),
|
||||||
|
and exponent of 0 (the decimal point doesn't need to be shifted). The
|
||||||
|
number 5 has the same sign and mantissa, but the exponent is 2
|
||||||
|
because the mantissa is multiplied by 4 (2 to the power of the exponent 2).
|
||||||
|
|
||||||
|
Modern systems usually provide floating-point support that conforms to
|
||||||
|
a relevant standard called IEEE 754. C's \ctype{double} type is
|
||||||
|
usually implemented as a 64-bit IEEE 754 number, which uses 52 bits of
|
||||||
|
space for the mantissa. This means that numbers can only be specified
|
||||||
|
to 52 bits of precision. If you're trying to represent numbers whose
|
||||||
|
expansion repeats endlessly, the expansion is cut off after 52 bits.
|
||||||
|
Unfortunately, most software needs to produce output in base 10, and
|
||||||
|
base 10 often gives rise to such repeating decimals. For example, 1.1
|
||||||
|
decimal is binary \code{1.0001100110011 ...}; .1 = 1/16 + 1/32 + 1/256
|
||||||
|
plus an infinite number of additional terms. IEEE 754 has to chop off
|
||||||
|
that infinitely repeated decimal after 52 digits, so the
|
||||||
|
representation is slightly inaccurate.
|
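The sign, mantissa and exponent decomposition above can be inspected directly. This is a sketch for a current Python 3 interpreter on a platform whose float is an IEEE 754 double (an assumption, though a nearly universal one). Note that math.frexp() normalises the mantissa to the range [0.5, 1) rather than the [1, 2) convention used in the text, so the exponents it reports are one larger, but the relationship between 1.25 and 5 is the same.

```python
import math

# 1.25 and 5.0 share a mantissa and differ only in the exponent.
m_small, e_small = math.frexp(1.25)
m_big, e_big = math.frexp(5.0)
same_mantissa = (m_small == m_big)
exponent_gap = e_big - e_small  # 5.0 is 1.25 multiplied by 2**2

# The 52 stored fraction bits of 1.1 appear as 13 hexadecimal digits;
# the repeating pattern is cut off and rounded in the last digit.
hex_form = (1.1).hex()

# Printing 17 significant digits exposes the stored approximation.
digits17 = "%.17g" % 1.1

# The small error in 0.1 accumulates over repeated additions.
tenth_sum = sum([0.1] * 10)
sums_to_one = (tenth_sum == 1.0)
```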
||||||
|
|
||||||
|
Sometimes you can see this inaccuracy when the number is printed:
|
||||||
\begin{verbatim}
|
\begin{verbatim}
|
||||||
>>> from decimal import *
|
>>> 1.1
|
||||||
>>> Decimal('0.70') * Decimal('1.05')
|
1.1000000000000001
|
||||||
Decimal("0.7350")
|
|
||||||
>>> .70 * 1.05
|
|
||||||
0.73499999999999999
|
|
||||||
\end{verbatim}
|
\end{verbatim}
|
||||||
|
|
||||||
Note that the \class{Decimal} result keeps a trailing zero, automatically
|
The inaccuracy isn't always visible when you print the number because
|
||||||
inferring four place significance from two digit mulitiplicands. A key
|
the FP-to-decimal-string conversion is provided by the C library, and
|
||||||
goal is to reproduce the mathematics we do by hand and avoid the tricky
|
most C libraries try to produce sensible output, but the inaccuracy is
|
||||||
issues that arise when decimal numbers cannot be represented exactly in
|
still there and subsequent operations can magnify the error.
|
||||||
binary floating point.
|
|
||||||
|
|
||||||
Exact representation enables the \class{Decimal} class to perform
|
For many applications this doesn't matter. If I'm plotting points and
|
||||||
modulo calculations and equality tests that would fail in binary
|
displaying them on my monitor, the difference between 1.1 and
|
||||||
floating point:
|
1.1000000000000001 is too small to be visible. Reports often limit
|
||||||
|
output to a certain number of decimal places, and if you round the
|
||||||
|
number to two or three or even eight decimal places, the error is
|
||||||
|
never apparent. However, for applications where it does matter,
|
||||||
|
it's a lot of work to implement your own custom arithmetic routines.
|
||||||
|
|
||||||
|
\subsection{The \class{Decimal} type}
|
||||||
|
|
||||||
|
A new module, \module{decimal}, was added to Python's standard library.
|
||||||
|
It contains two classes, \class{Decimal} and \class{Context}.
|
||||||
|
\class{Decimal} instances represent numbers, and
|
||||||
|
\class{Context} instances are used to wrap up various settings such as the precision and default rounding mode.
|
||||||
|
|
||||||
|
\class{Decimal} instances, like regular Python integers and FP numbers, are immutable; once one has been created, you can't change the value it represents.
|
||||||
|
\class{Decimal} instances can be created from integers or strings:
|
||||||
|
|
||||||
\begin{verbatim}
|
\begin{verbatim}
|
||||||
>>> Decimal('1.00') % Decimal('.10')
|
>>> import decimal
|
||||||
Decimal("0.00")
|
>>> decimal.Decimal(1972)
|
||||||
>>> 1.00 % 0.10
|
Decimal("1972")
|
||||||
0.09999999999999995
|
>>> decimal.Decimal("1.1")
|
||||||
|
Decimal("1.1")
|
||||||
>>> sum([Decimal('0.1')]*10) == Decimal('1.0')
|
|
||||||
True
|
|
||||||
>>> sum([0.1]*10) == 1.0
|
|
||||||
False
|
|
||||||
\end{verbatim}
|
\end{verbatim}
|
||||||
|
|
||||||
The \module{decimal} module also allows arbitrarily large precisions to be
|
You can also provide tuples containing the sign, mantissa represented
|
||||||
set for calculation:
|
as a tuple of decimal digits, and exponent:
|
||||||
|
|
||||||
\begin{verbatim}
|
\begin{verbatim}
|
||||||
>>> getcontext().prec = 24
|
>>> decimal.Decimal((1, (1, 4, 7, 5), -2))
|
||||||
>>> Decimal(1) / Decimal(7)
|
Decimal("-14.75")
|
||||||
Decimal("0.142857142857142857142857")
|
|
||||||
\end{verbatim}
|
\end{verbatim}
|
||||||
|
|
||||||
|
Cautionary note: the sign bit is a Boolean value, so 0 is positive and 1 is negative.
|
||||||
|
|
||||||
|
Floating-point numbers posed a bit of a problem: should the FP number
|
||||||
|
representing 1.1 turn into the decimal number for exactly 1.1, or for
|
||||||
|
1.1 plus whatever inaccuracies are introduced? The decision was to
|
||||||
|
leave such a conversion out of the API. Instead, you should convert
|
||||||
|
the floating-point number into a string using the desired precision and
|
||||||
|
pass the string to the \class{Decimal} constructor:
|
||||||
|
|
||||||
|
\begin{verbatim}
|
||||||
|
>>> f = 1.1
|
||||||
|
>>> decimal.Decimal(str(f))
|
||||||
|
Decimal("1.1")
|
||||||
|
>>> decimal.Decimal(repr(f))
|
||||||
|
Decimal("1.1000000000000001")
|
||||||
|
\end{verbatim}
|
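The three construction routes described above (integer, string, and sign/digits/exponent tuple) can be exercised together. This sketch targets a current Python 3 interpreter, where str() and repr() of a float both give the shortest round-tripping string, so explicit format strings are used here to choose the precision; the 12- and 17-digit choices are illustrative assumptions, not something the source prescribes.

```python
from decimal import Decimal

from_int = Decimal(1972)
from_str = Decimal("1.1")

# Sign 1 means negative; the digits 1475 with exponent -2 give -14.75.
from_tuple = Decimal((1, (1, 4, 7, 5), -2))
parts = from_tuple.as_tuple()  # recovers the sign, digits and exponent

# Converting a float: pick the precision when formatting the string.
f = 1.1
rounded = Decimal("%.12g" % f)    # the value most people mean by 1.1
as_stored = Decimal("%.17g" % f)  # includes the representation error
```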
||||||
|
|
||||||
|
Once you have \class{Decimal} instances, you can perform the usual
|
||||||
|
mathematical operations on them. One limitation: exponentiation
|
||||||
|
requires an integer exponent:
|
||||||
|
|
||||||
|
\begin{verbatim}
|
||||||
|
>>> a = decimal.Decimal('35.72')
|
||||||
|
>>> b = decimal.Decimal('1.73')
|
||||||
|
>>> a+b
|
||||||
|
Decimal("37.45")
|
||||||
|
>>> a-b
|
||||||
|
Decimal("33.99")
|
||||||
|
>>> a*b
|
||||||
|
Decimal("61.7956")
|
||||||
|
>>> a/b
|
||||||
|
Decimal("20.6473988")
|
||||||
|
>>> a ** 2
|
||||||
|
Decimal("1275.9184")
|
||||||
|
>>> a ** b
|
||||||
|
Decimal("NaN")
|
||||||
|
\end{verbatim}
|
||||||
|
|
||||||
|
You can combine \class{Decimal} instances with integers, but not with
|
||||||
|
floating-point numbers:
|
||||||
|
|
||||||
|
\begin{verbatim}
|
||||||
|
>>> a + 4
|
||||||
|
Decimal("39.72")
|
||||||
|
>>> a + 4.5
|
||||||
|
Traceback (most recent call last):
|
||||||
|
...
|
||||||
|
TypeError: You can interact Decimal only with int, long or Decimal data types.
|
||||||
|
>>>
|
||||||
|
\end{verbatim}
|
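The arithmetic shown in the two sessions above can be reproduced as a script. This is a sketch for a current Python 3 interpreter; it assumes decimal.localcontext(), which did not exist when this text was written, to pin the precision to 9 digits so that the quotient matches the one printed in the session. The exact wording of the TypeError differs between versions, so only the exception type is checked.

```python
from decimal import Decimal, localcontext

a = Decimal("35.72")
b = Decimal("1.73")

total = a + b
difference = a - b
product = a * b
square = a ** 2

# Division is rounded to the context precision; 9 digits reproduces
# the quotient shown in the interactive session.
with localcontext() as ctx:
    ctx.prec = 9
    quotient = a / b

# Integers mix freely with Decimal instances ...
with_int = a + 4

# ... but floats are rejected rather than silently converted.
try:
    a + 4.5
    mixed_ok = True
except TypeError:
    mixed_ok = False
```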
||||||
|
|
||||||
|
\class{Decimal} numbers can be used with the \module{math} and
|
||||||
|
\module{cmath} modules, though you'll get back a regular
|
||||||
|
floating-point number and not a \class{Decimal}. Instances also have a \method{sqrt()} method:
|
||||||
|
|
||||||
|
\begin{verbatim}
|
||||||
|
>>> import math, cmath
|
||||||
|
>>> d = decimal.Decimal('123456789012.345')
|
||||||
|
>>> math.sqrt(d)
|
||||||
|
351364.18288201344
|
||||||
|
>>> cmath.sqrt(-d)
|
||||||
|
351364.18288201344j
|
||||||
|
>>> d.sqrt()
|
||||||
|
Decimal("351364.1828820134592177245001")
|
||||||
|
\end{verbatim}
|
||||||
|
|
||||||
|
|
||||||
|
\subsection{The \class{Context} type}
|
||||||
|
|
||||||
|
Instances of the \class{Context} class encapsulate several settings for
|
||||||
|
decimal operations:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item \member{prec} is the precision, the number of significant decimal digits kept in results.
|
||||||
|
\item \member{rounding} specifies the rounding mode. The \module{decimal}
|
||||||
|
module has constants for the various possibilities:
|
||||||
|
\constant{ROUND_DOWN}, \constant{ROUND_CEILING}, \constant{ROUND_HALF_EVEN}, and various others.
|
||||||
|
\item \member{trap_enablers} is a dictionary specifying what happens on
|
||||||
|
encountering certain error conditions: either an exception is raised or
|
||||||
|
a value is returned. Some examples of error conditions are
|
||||||
|
division by zero, loss of precision, and overflow.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
There's a thread-local default context available by calling
|
||||||
|
\function{getcontext()}; you can change the properties of this context
|
||||||
|
to alter the default precision, rounding, or trap handling.
|
||||||
|
|
||||||
|
\begin{verbatim}
|
||||||
|
>>> decimal.getcontext().prec
|
||||||
|
28
|
||||||
|
>>> decimal.Decimal(1) / decimal.Decimal(7)
|
||||||
|
Decimal("0.1428571428571428571428571429")
|
||||||
|
>>> decimal.getcontext().prec = 9
|
||||||
|
>>> decimal.Decimal(1) / decimal.Decimal(7)
|
||||||
|
Decimal("0.142857143")
|
||||||
|
\end{verbatim}
|
||||||
|
|
||||||
|
The default action for error conditions is to return a special value
|
||||||
|
such as infinity or not-a-number, but you can request that exceptions
|
||||||
|
be raised:
|
||||||
|
|
||||||
|
\begin{verbatim}
|
||||||
|
>>> decimal.Decimal(1) / decimal.Decimal(0)
|
||||||
|
Decimal("Infinity")
|
||||||
|
>>> decimal.getcontext().trap_enablers[decimal.DivisionByZero] = True
|
||||||
|
>>> decimal.Decimal(1) / decimal.Decimal(0)
|
||||||
|
Traceback (most recent call last):
|
||||||
|
...
|
||||||
|
decimal.DivisionByZero: x / 0
|
||||||
|
>>>
|
||||||
|
\end{verbatim}
|
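The context settings above can be tried without disturbing the thread's default context. This sketch is for a current Python 3 interpreter and rests on two assumptions that differ from the draft text: the trap dictionary is exposed as the traps attribute in released versions of the module (the draft calls it trap_enablers), and the default context there already traps division by zero, so the trap is switched off explicitly to obtain the special value.

```python
from decimal import Decimal, DivisionByZero, localcontext

# Lowering the precision changes how many significant digits are kept.
with localcontext() as ctx:
    ctx.prec = 9
    ninth = Decimal(1) / Decimal(7)

# With the trap disabled, division by zero returns a special value.
with localcontext() as ctx:
    ctx.traps[DivisionByZero] = False
    quiet = Decimal(1) / Decimal(0)

# With the trap enabled, the same operation raises an exception.
with localcontext() as ctx:
    ctx.traps[DivisionByZero] = True
    try:
        Decimal(1) / Decimal(0)
        raised = False
    except DivisionByZero:
        raised = True
```

Using a local context keeps each setting scoped to its with block, which avoids the leftover state that assigning to getcontext() leaves behind in an interactive session.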
||||||
|
|
||||||
|
The \class{Context} instance also has various methods for formatting
|
||||||
|
numbers such as \method{to_eng_string()} and \method{to_sci_string()}.
|
||||||
|
|
||||||
|
|
||||||
\begin{seealso}
|
\begin{seealso}
|
||||||
\seepep{327}{Decimal Data Type}{Written by Facundo Batista and implemented
|
\seepep{327}{Decimal Data Type}{Written by Facundo Batista and implemented
|
||||||
by Eric Price, Facundo Bastista, Raymond Hettinger, Aahz, and Tim Peters.}
|
by Facundo Batista, Eric Price, Raymond Hettinger, Aahz, and Tim Peters.}
|
||||||
|
|
||||||
|
\seeurl{http://research.microsoft.com/~hollasch/cgindex/coding/ieeefloat.html}
|
||||||
|
{A more detailed overview of the IEEE-754 representation.}
|
||||||
|
|
||||||
|
\seeurl{http://www.lahey.com/float.htm}
|
||||||
|
{The article uses Fortran code to illustrate many of the problems
|
||||||
|
that floating-point inaccuracy can cause.}
|
||||||
|
|
||||||
|
\seeurl{http://www2.hursley.ibm.com/decimal/}
|
||||||
|
{A description of a decimal-based representation. This representation
|
||||||
|
is being proposed as a standard, and underlies the new Python decimal
|
||||||
|
type. Much of this material was written by Mike Cowlishaw, designer of the
|
||||||
|
REXX language.}
|
||||||
|
|
||||||
\end{seealso}
|
\end{seealso}
|
||||||
|
|
||||||
|
|
||||||
|
|