549 lines
21 KiB
ReStructuredText
549 lines
21 KiB
ReStructuredText
****************************
|
|
What's New In Python 3.1
|
|
****************************
|
|
|
|
:Author: Raymond Hettinger
|
|
:Release: |release|
|
|
:Date: |today|
|
|
|
|
.. $Id$
|
|
Rules for maintenance:
|
|
|
|
* Anyone can add text to this document. Do not spend very much time
|
|
on the wording of your changes, because your text will probably
|
|
get rewritten to some degree.
|
|
|
|
* The maintainer will go through Misc/NEWS periodically and add
|
|
changes; it's therefore more important to add your changes to
|
|
Misc/NEWS than to this file.
|
|
|
|
* This is not a complete list of every single change; completeness
|
|
is the purpose of Misc/NEWS. Some changes I consider too small
|
|
or esoteric to include. If such a change is added to the text,
|
|
I'll just remove it. (This is another reason you shouldn't spend
|
|
too much time on writing your addition.)
|
|
|
|
* If you want to draw your new text to the attention of the
|
|
maintainer, add 'XXX' to the beginning of the paragraph or
|
|
section.
|
|
|
|
* It's OK to just add a fragmentary note about a change. For
|
|
example: "XXX Describe the transmogrify() function added to the
|
|
socket module." The maintainer will research the change and
|
|
write the necessary text.
|
|
|
|
* You can comment out your additions if you like, but it's not
|
|
necessary (especially when a final release is some months away).
|
|
|
|
* Credit the author of a patch or bugfix. Just the name is
|
|
sufficient; the e-mail address isn't necessary.
|
|
|
|
* It's helpful to add the bug/patch number as a comment:
|
|
|
|
% Patch 12345
|
|
XXX Describe the transmogrify() function added to the socket
|
|
module.
|
|
(Contributed by P.Y. Developer.)
|
|
|
|
This saves the maintainer the effort of going through the SVN log
|
|
when researching a change.
|
|
|
|
This article explains the new features in Python 3.1, compared to 3.0.
|
|
|
|
|
|
PEP 372: Ordered Dictionaries
|
|
=============================
|
|
|
|
Regular Python dictionaries iterate over key/value pairs in arbitrary order.
|
|
Over the years, a number of authors have written alternative implementations
|
|
that remember the order that the keys were originally inserted. Based on
|
|
the experiences from those implementations, a new
|
|
:class:`collections.OrderedDict` class has been introduced.
|
|
|
|
The OrderedDict API is substantially the same as regular dictionaries
|
|
but will iterate over keys and values in a guaranteed order depending on
|
|
when a key was first inserted. If a new entry overwrites an existing entry,
|
|
the original insertion position is left unchanged. Deleting an entry and
|
|
reinserting it will move it to the end.
|
|
|
|
The standard library now supports use of ordered dictionaries in several
|
|
modules. The :mod:`configparser` module uses them by default. This lets
|
|
configuration files be read, modified, and then written back in their original
|
|
order. The *_asdict()* method for :func:`collections.namedtuple` now
|
|
returns an ordered dictionary with the values appearing in the same order as
|
|
the underlying tuple indicies. The :mod:`json` module is being built-out with
|
|
an *object_pairs_hook* to allow OrderedDicts to be built by the decoder.
|
|
Support was also added for third-party tools like `PyYAML <http://pyyaml.org/>`_.
|
|
|
|
.. seealso::
|
|
|
|
:pep:`372` - Ordered Dictionaries
|
|
PEP written by Armin Ronacher and Raymond Hettinger. Implementation
|
|
written by Raymond Hettinger.
|
|
|
|
|
|
PEP 378: Format Specifier for Thousands Separator
|
|
=================================================
|
|
|
|
The built-in :func:`format` function and the :meth:`str.format` method use
|
|
a mini-language that now includes a simple, non-locale aware way to format
|
|
a number with a thousands separator. That provides a way to humanize a
|
|
program's output, improving its professional appearance and readability::
|
|
|
|
>>> format(1234567, ',d')
|
|
'1,234,567'
|
|
>>> format(1234567.89, ',.2f')
|
|
'1,234,567.89'
|
|
>>> format(12345.6 + 8901234.12j, ',f')
|
|
'12,345.600000+8,901,234.120000j'
|
|
>>> format(Decimal('1234567.89'), ',f')
|
|
'1,234,567.89'
|
|
|
|
The supported types are :class:`int`, :class:`float`, :class:`complex`
|
|
and :class:`decimal.Decimal`.
|
|
|
|
Discussions are underway about how to specify alternative separators
|
|
like dots, spaces, apostrophes, or underscores. Locale-aware applications
|
|
should use the existing *n* format specifier which already has some support
|
|
for thousands separators.
|
|
|
|
.. seealso::
|
|
|
|
:pep:`378` - Format Specifier for Thousands Separator
|
|
PEP written by Raymond Hettinger and implemented by Eric Smith and
|
|
Mark Dickinson.
|
|
|
|
|
|
Other Language Changes
|
|
======================
|
|
|
|
Some smaller changes made to the core Python language are:
|
|
|
|
* The :func:`int` type gained a ``bit_length`` method that returns the
|
|
number of bits necessary to represent its argument in binary::
|
|
|
|
>>> n = 37
|
|
>>> bin(37)
|
|
'0b100101'
|
|
>>> n.bit_length()
|
|
6
|
|
>>> n = 2**123-1
|
|
>>> n.bit_length()
|
|
123
|
|
>>> (n+1).bit_length()
|
|
124
|
|
|
|
(Contributed by Fredrik Johansson, Victor Stinner, Raymond Hettinger,
|
|
and Mark Dickinson; :issue:`3439`.)
|
|
|
|
* The fields in :func:`format` strings can now be automatically
|
|
numbered::
|
|
|
|
>>> 'Sir {} of {}'.format('Gallahad', 'Camelot')
|
|
'Sir Gallahad of Camelot'
|
|
|
|
Formerly, the string would have required numbered fields such as:
|
|
``'Sir {0} of {1}'``.
|
|
|
|
(Contributed by Eric Smith; :issue:`5237`.)
|
|
|
|
* The :func:`string.maketrans` function is deprecated and is replaced by new
|
|
static methods, :meth:`bytes.maketrans` and :meth:`bytearray.maketrans`.
|
|
This change solves the confusion around which types were supported by the
|
|
:mod:`string` module. Now, :class:`str`, :class:`bytes`, and
|
|
:class:`bytearray` each have their own **maketrans** and **translate**
|
|
methods with intermediate translation tables of the appropriate type.
|
|
|
|
(Contributed by Georg Brandl; :issue:`5675`.)
|
|
|
|
* The syntax of the :keyword:`with` statement now allows multiple context
|
|
managers in a single statement::
|
|
|
|
>>> with open('mylog.txt') as infile, open('a.out', 'w') as outfile:
|
|
... for line in infile:
|
|
... if '<critical>' in line:
|
|
... outfile.write(line)
|
|
|
|
With the new syntax, the :func:`contextlib.nested` function is no longer
|
|
needed and is now deprecated.
|
|
|
|
(Contributed by Georg Brandl and Mattias Brändström;
|
|
`appspot issue 53094 <http://codereview.appspot.com/53094>`_.)
|
|
|
|
* ``round(x, n)`` now returns an integer if *x* is an integer.
|
|
Previously it returned a float::
|
|
|
|
>>> round(1123, -2)
|
|
1100
|
|
|
|
(Contributed by Mark Dickinson; :issue:`4707`.)
|
|
|
|
* Python now uses David Gay's algorithm for finding the shortest floating
|
|
point representation that doesn't change its value. This should help
|
|
mitigate some of the confusion surrounding binary floating point
|
|
numbers.
|
|
|
|
The significance is easily seen with a number like ``1.1`` which does not
|
|
have an exact equivalent in binary floating point. Since there is no exact
|
|
equivalent, an expression like ``float('1.1')`` evaluates to the nearest
|
|
representable value which is ``0x1.199999999999ap+0`` in hex or
|
|
``1.100000000000000088817841970012523233890533447265625`` in decimal. That
|
|
nearest value was and still is used in subsequent floating point
|
|
calculations.
|
|
|
|
What is new is how the number gets displayed. Formerly, Python used a
|
|
simple approach. The value of ``repr(1.1)`` was computed as ``format(1.1,
|
|
'.17g')`` which evaluated to ``'1.1000000000000001'``. The advantage of
|
|
using 17 digits was that it relied on IEEE-754 guarantees to assure that
|
|
``eval(repr(1.1))`` would round-trip exactly to its original value. The
|
|
disadvantage is that many people found the output to be confusing (mistaking
|
|
intrinsic limitations of binary floating point representation as being a
|
|
problem with Python itself).
|
|
|
|
The new algorithm for ``repr(1.1)`` is smarter and returns ``'1.1'``.
|
|
Effectively, it searches all equivalent string representations (ones that
|
|
get stored with the same underlying float value) and returns the shortest
|
|
representation.
|
|
|
|
The new algorithm tends to emit cleaner representations when possible, but
|
|
it does not change the underlying values. So, it is still the case that
|
|
``1.1 + 2.2 != 3.3`` even though the representations may suggest otherwise.
|
|
|
|
The new algorithm depends on certain features in the underlying floating
|
|
point implementation. If the required features are not found, the old
|
|
algorithm will continue to be used. Also, the text pickle protocols
|
|
assure cross-platform portability by using the old algorithm.
|
|
|
|
(Contributed by Eric Smith and Mark Dickinson; :issue:`1580`)
|
|
|
|
New, Improved, and Deprecated Modules
|
|
=====================================
|
|
|
|
* Added a :class:`collections.Counter` class to support convenient
|
|
counting of unique items in a sequence or iterable::
|
|
|
|
>>> Counter(['red', 'blue', 'red', 'green', 'blue', 'blue'])
|
|
Counter({'blue': 3, 'red': 2, 'green': 1})
|
|
|
|
(Contributed by Raymond Hettinger; :issue:`1696199`.)
|
|
|
|
* Added a new module, :mod:`tkinter.ttk` for access to the Tk themed widget set.
|
|
The basic idea of ttk is to separate, to the extent possible, the code
|
|
implementing a widget's behavior from the code implementing its appearance.
|
|
|
|
(Contributed by Guilherme Polo; :issue:`2983`.)
|
|
|
|
* The :class:`gzip.GzipFile` and :class:`bz2.BZ2File` classes now support
|
|
the context manager protocol::
|
|
|
|
>>> # Automatically close file after writing
|
|
>>> with gzip.GzipFile(filename, "wb") as f:
|
|
... f.write(b"xxx")
|
|
|
|
(Contributed by Antoine Pitrou.)
|
|
|
|
* The :mod:`decimal` module now supports methods for creating a
|
|
decimal object from a binary :class:`float`. The conversion is
|
|
exact but can sometimes be surprising::
|
|
|
|
>>> Decimal.from_float(1.1)
|
|
Decimal('1.100000000000000088817841970012523233890533447265625')
|
|
|
|
The long decimal result shows the actual binary fraction being
|
|
stored for *1.1*. The fraction has many digits because *1.1* cannot
|
|
be exactly represented in binary.
|
|
|
|
(Contributed by Raymond Hettinger and Mark Dickinson.)
|
|
|
|
* The :mod:`itertools` module grew two new functions. The
|
|
:func:`itertools.combinations_with_replacement` function is one of
|
|
four for generating combinatorics including permutations and Cartesian
|
|
products. The :func:`itertools.compress` function mimics its namesake
|
|
from APL. Also, the existing :func:`itertools.count` function now has
|
|
an optional *step* argument and can accept any type of counting
|
|
sequence including :class:`fractions.Fraction` and
|
|
:class:`decimal.Decimal`::
|
|
|
|
>>> [p+q for p,q in combinations_with_replacement('LOVE', 2)]
|
|
['LL', 'LO', 'LV', 'LE', 'OO', 'OV', 'OE', 'VV', 'VE', 'EE']
|
|
|
|
>>> list(compress(data=range(10), selectors=[0,0,1,1,0,1,0,1,0,0]))
|
|
[2, 3, 5, 7]
|
|
|
|
>>> c = count(start=Fraction(1,2), step=Fraction(1,6))
|
|
>>> [next(c), next(c), next(c), next(c)]
|
|
[Fraction(1, 2), Fraction(2, 3), Fraction(5, 6), Fraction(1, 1)]
|
|
|
|
(Contributed by Raymond Hettinger.)
|
|
|
|
* :func:`collections.namedtuple` now supports a keyword argument
|
|
*rename* which lets invalid fieldnames be automatically converted to
|
|
positional names in the form _0, _1, etc. This is useful when
|
|
the field names are being created by an external source such as a
|
|
CSV header, SQL field list, or user input::
|
|
|
|
>>> query = input()
|
|
SELECT region, dept, count(*) FROM main GROUPBY region, dept
|
|
|
|
>>> cursor.execute(query)
|
|
>>> query_fields = [desc[0] for desc in cursor.description]
|
|
>>> UserQuery = namedtuple('UserQuery', query_fields, rename=True)
|
|
>>> pprint.pprint([UserQuery(*row) for row in cursor])
|
|
[UserQuery(region='South', dept='Shipping', _2=185),
|
|
UserQuery(region='North', dept='Accounting', _2=37),
|
|
UserQuery(region='West', dept='Sales', _2=419)]
|
|
|
|
(Contributed by Raymond Hettinger; :issue:`1818`.)
|
|
|
|
* The :func:`re.sub`, :func:`re.subn` and :func:`re.split` functions now
|
|
accept a flags parameter.
|
|
|
|
(Contributed by Gregory Smith.)
|
|
|
|
* The :mod:`logging` module now implements a simple :class:`logging.NullHandler`
|
|
class for applications that are not using logging but are calling
|
|
library code that does. Setting-up a null handler will suppress
|
|
spurious warnings such as "No handlers could be found for logger foo"::
|
|
|
|
>>> h = logging.NullHandler()
|
|
>>> logging.getLogger("foo").addHandler(h)
|
|
|
|
(Contributed by Vinay Sajip; :issue:`4384`).
|
|
|
|
* The :mod:`runpy` module which supports the ``-m`` command line switch
|
|
now supports the execution of packages by looking for and executing
|
|
a ``__main__`` submodule when a package name is supplied.
|
|
|
|
(Contributed by Andi Vajda; :issue:`4195`.)
|
|
|
|
* The :mod:`pdb` module can now access and display source code loaded via
|
|
:mod:`zipimport` (or any other conformant :pep:`302` loader).
|
|
|
|
(Contributed by Alexander Belopolsky; :issue:`4201`.)
|
|
|
|
* :class:`functools.partial` objects can now be pickled.
|
|
|
|
(Suggested by Antoine Pitrou and Jesse Noller. Implemented by
|
|
Jack Diedrich; :issue:`5228`.)
|
|
|
|
* Add :mod:`pydoc` help topics for symbols so that ``help('@')``
|
|
works as expected in the interactive environment.
|
|
|
|
(Contributed by David Laban; :issue:`4739`.)
|
|
|
|
* The :mod:`unittest` module now supports skipping individual tests or classes
|
|
of tests. And it supports marking a test as a expected failure, a test that
|
|
is known to be broken, but shouldn't be counted as a failure on a
|
|
TestResult::
|
|
|
|
class TestGizmo(unittest.TestCase):
|
|
|
|
@unittest.skipUnless(sys.platform.startswith("win"), "requires Windows")
|
|
def test_gizmo_on_windows(self):
|
|
...
|
|
|
|
@unittest.expectedFailure
|
|
def test_gimzo_without_required_library(self):
|
|
...
|
|
|
|
Also, tests for exceptions have been builtout to work with context managers
|
|
using the :keyword:`with` statement::
|
|
|
|
def test_division_by_zero(self):
|
|
with self.assertRaises(ZeroDivisionError):
|
|
x / 0
|
|
|
|
In addition, several new assertion methods were added including
|
|
:func:`assertSetEqual`, :func:`assertDictEqual`,
|
|
:func:`assertDictContainsSubset`, :func:`assertListEqual`,
|
|
:func:`assertTupleEqual`, :func:`assertSequenceEqual`,
|
|
:func:`assertRaisesRegexp`, :func:`assertIsNone`,
|
|
and :func:`assertIsNotNone`.
|
|
|
|
(Contributed by Benjamin Peterson and Antoine Pitrou.)
|
|
|
|
* The :mod:`io` module has three new constants for the :meth:`seek`
|
|
method :data:`SEEK_SET`, :data:`SEEK_CUR`, and :data:`SEEK_END`.
|
|
|
|
* The :attr:`sys.version_info` tuple is now a named tuple::
|
|
|
|
>>> sys.version_info
|
|
sys.version_info(major=3, minor=1, micro=0, releaselevel='alpha', serial=2)
|
|
|
|
(Contributed by Ross Light; :issue:`4285`.)
|
|
|
|
* The :mod:`nntplib` and :mod:`imaplib` modules now support IPv6.
|
|
|
|
(Contributed by Derek Morr; :issue:`1655` and :issue:`1664`.)
|
|
|
|
* The :mod:`pickle` module has been adapted for better interoperability with
|
|
Python 2.x when used with protocol 2 or lower. The reorganization of the
|
|
standard library changed the formal reference for many objects. For
|
|
example, ``__builtin__.set`` in Python 2 is called ``builtins.set`` in Python
|
|
3. This change confounded efforts to share data between different versions of
|
|
Python. But now when protocol 2 or lower is selected, the pickler will
|
|
automatically use the old Python 2 names for both loading and dumping. This
|
|
remapping is turned-on by default but can be disabled with the *fix_imports*
|
|
option::
|
|
|
|
>>> s = {1, 2, 3}
|
|
>>> pickle.dumps(s, protocol=0)
|
|
b'c__builtin__\nset\np0\n((lp1\nL1L\naL2L\naL3L\natp2\nRp3\n.'
|
|
>>> pickle.dumps(s, protocol=0, fix_imports=False)
|
|
b'cbuiltins\nset\np0\n((lp1\nL1L\naL2L\naL3L\natp2\nRp3\n.'
|
|
|
|
An unfortunate but unavoidable side-effect of this change is that protocol 2
|
|
pickles produced by Python 3.1 won't be readable with Python 3.0. The latest
|
|
pickle protocol, protocol 3, should be used when migrating data between
|
|
Python 3.x implementations, as it doesn't attempt to remain compatible with
|
|
Python 2.x.
|
|
|
|
(Contributed by Alexandre Vassalotti and Antoine Pitrou, :issue:`6137`.)
|
|
|
|
* A new module, :mod:`importlib` was added. It provides a complete, portable,
|
|
pure Python reference implementation of the :keyword:`import` statement and its
|
|
counterpart, the :func:`__import__` function. It represents a substantial
|
|
step forward in documenting and defining the actions that take place during
|
|
imports.
|
|
|
|
(Contributed by Brett Cannon.)
|
|
|
|
Optimizations
|
|
=============
|
|
|
|
Major performance enhancements have been added:
|
|
|
|
* The new I/O library (as defined in :pep:`3116`) was mostly written in
|
|
Python and quickly proved to be a problematic bottleneck in Python 3.0.
|
|
In Python 3.1, the I/O library has been entirely rewritten in C and is
|
|
2 to 20 times faster depending on the task at hand. The pure Python
|
|
version is still available for experimentation purposes through
|
|
the ``_pyio`` module.
|
|
|
|
(Contributed by Amaury Forgeot d'Arc and Antoine Pitrou.)
|
|
|
|
* Added a heuristic so that tuples and dicts containing only untrackable objects
|
|
are not tracked by the garbage collector. This can reduce the size of
|
|
collections and therefore the garbage collection overhead on long-running
|
|
programs, depending on their particular use of datatypes.
|
|
|
|
(Contributed by Antoine Pitrou, :issue:`4688`.)
|
|
|
|
* Enabling a configure option named ``--with-computed-gotos``
|
|
on compilers that support it (notably: gcc, SunPro, icc), the bytecode
|
|
evaluation loop is compiled with a new dispatch mechanism which gives
|
|
speedups of up to 20%, depending on the system, the compiler, and
|
|
the benchmark.
|
|
|
|
(Contributed by Antoine Pitrou along with a number of other participants,
|
|
:issue:`4753`).
|
|
|
|
* The decoding of UTF-8, UTF-16 and LATIN-1 is now two to four times
|
|
faster.
|
|
|
|
(Contributed by Antoine Pitrou and Amaury Forgeot d'Arc, :issue:`4868`.)
|
|
|
|
* The :mod:`json` module now has a C extension to substantially improve
|
|
its performance. In addition, the API was modified so that json works
|
|
only with :class:`str`, not with :class:`bytes`. That change makes the
|
|
module closely match the `JSON specification <http://json.org/>`_
|
|
which is defined in terms of Unicode.
|
|
|
|
(Contributed by Bob Ippolito and converted to Py3.1 by Antoine Pitrou
|
|
and Benjamin Peterson; :issue:`4136`.)
|
|
|
|
* Unpickling now interns the attribute names of pickled objects. This saves
|
|
memory and allows pickles to be smaller.
|
|
|
|
(Contributed by Jake McGuire and Antoine Pitrou; :issue:`5084`.)
|
|
|
|
IDLE
|
|
====
|
|
|
|
* IDLE's format menu now provides an option to strip trailing whitespace
|
|
from a source file.
|
|
|
|
(Contributed by Roger D. Serwy; :issue:`5150`.)
|
|
|
|
Build and C API Changes
|
|
=======================
|
|
|
|
Changes to Python's build process and to the C API include:
|
|
|
|
* Integers are now stored internally either in base 2**15 or in base
|
|
2**30, the base being determined at build time. Previously, they
|
|
were always stored in base 2**15. Using base 2**30 gives
|
|
significant performance improvements on 64-bit machines, but
|
|
benchmark results on 32-bit machines have been mixed. Therefore,
|
|
the default is to use base 2**30 on 64-bit machines and base 2**15
|
|
on 32-bit machines; on Unix, there's a new configure option
|
|
``--enable-big-digits`` that can be used to override this default.
|
|
|
|
Apart from the performance improvements this change should be invisible to
|
|
end users, with one exception: for testing and debugging purposes there's a
|
|
new :attr:`sys.int_info` that provides information about the
|
|
internal format, giving the number of bits per digit and the size in bytes
|
|
of the C type used to store each digit::
|
|
|
|
>>> import sys
|
|
>>> sys.int_info
|
|
sys.int_info(bits_per_digit=30, sizeof_digit=4)
|
|
|
|
(Contributed by Mark Dickinson; :issue:`4258`.)
|
|
|
|
* The :cfunc:`PyLong_AsUnsignedLongLong()` function now handles a negative
|
|
*pylong* by raising :exc:`OverflowError` instead of :exc:`TypeError`.
|
|
|
|
(Contributed by Mark Dickinson and Lisandro Dalcrin; :issue:`5175`.)
|
|
|
|
* Deprecated :cfunc:`PyNumber_Int`. Use :cfunc:`PyNumber_Long` instead.
|
|
|
|
(Contributed by Mark Dickinson; :issue:`4910`.)
|
|
|
|
* Added a new :cfunc:`PyOS_string_to_double` function to replace the
|
|
deprecated functions :cfunc:`PyOS_ascii_strtod` and :cfunc:`PyOS_ascii_atof`.
|
|
|
|
(Contributed by Mark Dickinson; :issue:`5914`.)
|
|
|
|
* Added :ctype:`PyCapsule` as a replacement for the :ctype:`PyCObject` API.
|
|
The principal difference is that the new type has a well defined interface
|
|
for passing typing safety information and a less complicated signature
|
|
for calling a destructor. The old type had a problematic API and is now
|
|
deprecated.
|
|
|
|
(Contributed by Larry Hastings; :issue:`5630`.)
|
|
|
|
Porting to Python 3.1
|
|
=====================
|
|
|
|
This section lists previously described changes and other bugfixes
|
|
that may require changes to your code:
|
|
|
|
* The new floating point string representations can break existing doctests.
|
|
For example::
|
|
|
|
def e():
|
|
'''Compute the base of natural logarithms.
|
|
|
|
>>> e()
|
|
2.7182818284590451
|
|
|
|
'''
|
|
return sum(1/math.factorial(x) for x in reversed(range(30)))
|
|
|
|
doctest.testmod()
|
|
|
|
**********************************************************************
|
|
Failed example:
|
|
e()
|
|
Expected:
|
|
2.7182818284590451
|
|
Got:
|
|
2.718281828459045
|
|
**********************************************************************
|
|
|
|
* The automatic name remapping in the pickle module for protocol 2 or lower can
|
|
make Python 3.1 pickles unreadable in Python 3.0. One solution is to use
|
|
protocol 3. Another solution is to set the *fix_imports* option to **False**.
|
|
See the discussion above for more details.
|