****************************
What's New In Python 3.1
****************************
:Author: Raymond Hettinger
:Release: |release|
:Date: |today|
.. $Id$
   Rules for maintenance:

   * Anyone can add text to this document.  Do not spend very much time
     on the wording of your changes, because your text will probably
     get rewritten to some degree.

   * The maintainer will go through Misc/NEWS periodically and add
     changes; it's therefore more important to add your changes to
     Misc/NEWS than to this file.

   * This is not a complete list of every single change; completeness
     is the purpose of Misc/NEWS.  Some changes I consider too small
     or esoteric to include.  If such a change is added to the text,
     I'll just remove it.  (This is another reason you shouldn't spend
     too much time on writing your addition.)

   * If you want to draw your new text to the attention of the
     maintainer, add 'XXX' to the beginning of the paragraph or
     section.

   * It's OK to just add a fragmentary note about a change.  For
     example: "XXX Describe the transmogrify() function added to the
     socket module."  The maintainer will research the change and
     write the necessary text.

   * You can comment out your additions if you like, but it's not
     necessary (especially when a final release is some months away).

   * Credit the author of a patch or bugfix.  Just the name is
     sufficient; the e-mail address isn't necessary.

   * It's helpful to add the bug/patch number as a comment:

     % Patch 12345
     XXX Describe the transmogrify() function added to the socket
     module.
     (Contributed by P.Y. Developer.)

     This saves the maintainer the effort of going through the SVN log
     when researching a change.

This article explains the new features in Python 3.1, compared to 3.0.
PEP 372: Ordered Dictionaries
=============================
Regular Python dictionaries iterate over key/value pairs in arbitrary order.
Over the years, a number of authors have written alternative implementations
that remember the order in which the keys were originally inserted. Based on
the experiences from those implementations, the :mod:`collections` module
now has an :class:`OrderedDict` class.
The OrderedDict API is substantially the same as regular dictionaries
but will iterate over keys and values in a guaranteed order depending on
when a key was first inserted. If a new entry overwrites an existing entry,
the original insertion position is left unchanged. Deleting an entry and
reinserting it will move it to the end.
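
The ordering rules described above can be checked with a short sketch. The
key names are made up for illustration; only ``collections.OrderedDict``
itself comes from the text.

```python
from collections import OrderedDict

d = OrderedDict()
d["first"] = 1
d["second"] = 2
d["third"] = 3

# Overwriting an existing key keeps its original insertion position.
d["first"] = 100
order_after_overwrite = list(d.keys())

# Deleting a key and inserting it again moves it to the end.
del d["first"]
d["first"] = 1
order_after_reinsert = list(d.keys())

print(order_after_overwrite)  # ['first', 'second', 'third']
print(order_after_reinsert)   # ['second', 'third', 'first']
```
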
The standard library now supports use of ordered dictionaries in several
modules. The :mod:`configparser` module uses them by default. This lets
configuration files be read, modified, and then written back in their original
order. The :mod:`collections` module's :meth:`namedtuple._asdict` method now
returns an ordered dictionary with the values appearing in the same order as
the underlying tuple indices. The :mod:`json` module is being extended with
an *object_pairs_hook* to allow OrderedDicts to be built by the decoder.
Support was also added for third-party tools like `PyYAML <http://pyyaml.org/>`_.
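
As a rough illustration of the *object_pairs_hook* mentioned above, the
following sketch decodes a JSON object into an ``OrderedDict`` so that the
key order of the document survives a round trip. The sample JSON text is
invented for the example.

```python
import json
from collections import OrderedDict

text = '{"zebra": 1, "apple": 2, "mango": 3}'

# The decoder passes each object's (key, value) pairs, in document order,
# to the hook; OrderedDict accepts such a list of pairs directly.
config = json.loads(text, object_pairs_hook=OrderedDict)
keys_in_document_order = list(config)

# Writing the mapping back out keeps the same key order.
round_tripped = json.dumps(config)

print(keys_in_document_order)  # ['zebra', 'apple', 'mango']
print(round_tripped)           # {"zebra": 1, "apple": 2, "mango": 3}
```
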
.. seealso::
:pep:`372` - Ordered Dictionaries
PEP written by Armin Ronacher and Raymond Hettinger. Implementation
written by Raymond Hettinger.
PEP 378: Format Specifier for Thousands Separator
=================================================
The builtin :func:`format` function and the :meth:`str.format` method use
a mini-language that now includes a simple, non-locale aware way to format
a number with a thousands separator. That provides a way to humanize a
program's output, improving its professional appearance and readability::
>>> format(Decimal('1234567.89'), ',f')
'1,234,567.89'
The currently supported types are :class:`int` and :class:`decimal.Decimal`.
Support for :class:`float` is expected before the beta release.
Discussions are underway about how to specify alternative separators
like dots, spaces, apostrophes, or underscores. Locale-aware applications
should use the existing *n* format specifier which already has some support
for thousands separators.
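
A few more uses of the comma specifier, as a sketch limited to the two
types named above (:class:`int` and :class:`decimal.Decimal`). The numbers
and the message template are made up for illustration.

```python
from decimal import Decimal

# The comma goes after any fill, alignment, and width, and before the type.
as_int = format(1234567, ",d")
as_decimal = format(Decimal("1234567.89"), ",f")
padded = format(1234567, ">12,d")

# The same specifier works inside str.format() replacement fields.
message = "{:,} bytes in {:,} files".format(1048576, 2500)

print(as_int)       # 1,234,567
print(as_decimal)   # 1,234,567.89
print(repr(padded)) # '   1,234,567'
print(message)      # 1,048,576 bytes in 2,500 files
```
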
.. seealso::
:pep:`378` - Format Specifier for Thousands Separator
PEP written by Raymond Hettinger; implemented by Eric Smith and
Mark Dickinson.
Other Language Changes
======================
Some smaller changes made to the core Python language are:
* The :class:`int` type gained a ``bit_length`` method that returns the
number of bits necessary to represent its argument in binary::
>>> n = 37
>>> bin(37)
'0b100101'
>>> n.bit_length()
6
>>> n = 2**123-1
>>> n.bit_length()
123
>>> (n+1).bit_length()
124
(Contributed by Fredrik Johansson, Victor Stinner, Raymond Hettinger,
and Mark Dickinson; :issue:`3439`.)
* The fields in :meth:`str.format` strings can now be automatically
numbered::
>>> 'Sir {} of {}'.format('Gallahad', 'Camelot')
'Sir Gallahad of Camelot'
Formerly, the string would have required numbered fields such as:
``'Sir {0} of {1}'``.
(Contributed by Eric Smith; :issue:`5237`.)
* ``round(x, n)`` now returns an integer if *x* is an integer.
Previously it returned a float::
>>> round(1123, -2)
1100
(Contributed by Mark Dickinson; :issue:`4707`.)
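
The three changes listed above can be exercised together in a short
sketch. The helper ``bit_length_by_hand`` is hypothetical and exists only
to show what ``bit_length`` counts; it is not part of Python.

```python
def bit_length_by_hand(x):
    """Count the binary digits of abs(x), ignoring the sign and '0b' prefix."""
    return len(bin(x).lstrip("-0b"))

# bit_length() agrees with counting binary digits, including for
# negative numbers and zero.
samples = [0, 1, 37, -37, 2**123 - 1, 2**123]
lengths = [x.bit_length() for x in samples]
by_hand = [bit_length_by_hand(x) for x in samples]

# Automatically numbered fields give the same result as explicit ones.
auto = "Sir {} of {}".format("Gallahad", "Camelot")
explicit = "Sir {0} of {1}".format("Gallahad", "Camelot")

# Rounding an integer yields an integer rather than a float.
rounded = round(1123, -2)

print(lengths)                 # [0, 1, 6, 6, 123, 124]
print(auto == explicit)        # True
print(rounded, type(rounded))  # 1100 <class 'int'>
```
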
New, Improved, and Deprecated Modules
=====================================
* Added a :class:`collections.Counter` class to support convenient
counting of unique items in a sequence or iterable::
>>> Counter(['red', 'blue', 'red', 'green', 'blue', 'blue'])
Counter({'blue': 3, 'red': 2, 'green': 1})
(Contributed by Raymond Hettinger; :issue:`1696199`.)
* Added a new module, :mod:`tkinter.ttk`, for access to the Tk themed widget set.
The basic idea of ttk is to separate, to the extent possible, the code
implementing a widget's behavior from the code implementing its appearance.
(Contributed by Guilherme Polo; :issue:`2983`.)
* The :class:`gzip.GzipFile` and :class:`bz2.BZ2File` classes now support
the context manager protocol::
>>> # Automatically close file after writing
>>> with gzip.GzipFile(filename, "wb") as f:
...     f.write(b"xxx")
(Contributed by Antoine Pitrou.)
* The :mod:`decimal` module now supports methods for creating a
decimal object from a binary :class:`float`. The conversion is
exact but can sometimes be surprising::
>>> Decimal.from_float(1.1)
Decimal('1.100000000000000088817841970012523233890533447265625')
The long decimal result shows the actual binary fraction being
stored for *1.1*. The fraction has many digits because *1.1* cannot
be exactly represented in binary.
(Contributed by Raymond Hettinger and Mark Dickinson.)
* The :mod:`itertools` module grew two new functions. The
:func:`itertools.combinations_with_replacement` function is one of
four for generating combinatorics including permutations and Cartesian
products. The :func:`itertools.compress` function mimics its namesake
from APL. Also, the existing :func:`itertools.count` function now has
an optional *step* argument and can accept any type of counting
sequence including :class:`fractions.Fraction` and
:class:`decimal.Decimal`::
>>> [p+q for p,q in combinations_with_replacement('LOVE', 2)]
['LL', 'LO', 'LV', 'LE', 'OO', 'OV', 'OE', 'VV', 'VE', 'EE']
>>> list(compress(data=range(10), selectors=[0,0,1,1,0,1,0,1,0,0]))
[2, 3, 5, 7]
>>> c = count(start=Fraction(1,2), step=Fraction(1,6))
>>> next(c), next(c), next(c), next(c)
(Fraction(1, 2), Fraction(2, 3), Fraction(5, 6), Fraction(1, 1))
(Contributed by Raymond Hettinger.)
* :func:`collections.namedtuple` now supports a keyword argument
*rename* which lets invalid fieldnames be automatically converted to
positional names in the form _0, _1, etc. This is useful when
the field names are being created by an external source such as a
CSV header, SQL field list, or user input::
>>> query = input()
SELECT region, dept, count(*) FROM main GROUP BY region, dept
>>> cursor.execute(query)
>>> query_fields = [desc[0] for desc in cursor.description]
>>> UserQuery = namedtuple('UserQuery', query_fields, rename=True)
>>> pprint.pprint([UserQuery(*row) for row in cursor])
[UserQuery(region='South', dept='Shipping', _2=185),
UserQuery(region='North', dept='Accounting', _2=37),
UserQuery(region='West', dept='Sales', _2=419)]
(Contributed by Raymond Hettinger; :issue:`1818`.)
* The :func:`re.sub`, :func:`re.subn` and :func:`re.split` functions now
accept a flags parameter.
(Contributed by Gregory Smith.)
* The :mod:`logging` module now implements a simple :class:`NullHandler`
class for applications that are not using logging but are calling
library code that does. Setting up a null handler will suppress
spurious warnings like "No handlers could be found for logger X.Y.Z"::
>>> h = logging.NullHandler()
>>> logging.getLogger("foo").addHandler(h)
(Contributed by Vinay Sajip; :issue:`4384`.)
* The :mod:`runpy` module which supports the ``-m`` command line switch
now supports the execution of packages by looking for and executing
a ``__main__`` submodule when a package name is supplied.
(Contributed by Andi Vajda; :issue:`4195`.)
* The :mod:`pdb` module can now access and display source code loaded via
:mod:`zipimport` (or any other conformant :pep:`302` loader).
(Contributed by Alexander Belopolsky; :issue:`4201`.)
* :class:`functools.partial` objects can now be pickled.
(Suggested by Antoine Pitrou and Jesse Noller. Implemented by
Jack Diederich; :issue:`5228`.)
* Add :mod:`pydoc` help topics for symbols so that ``help('@')``
works as expected in the interactive environment.
(Contributed by David Laban; :issue:`4739`.)
* The :mod:`unittest` module now supports skipping individual tests or classes
of tests. It also supports marking a test as an expected failure: a test that
is known to be broken but shouldn't be counted as a failure on a
TestResult::

    class TestGizmo(unittest.TestCase):

        @unittest.skipUnless(sys.platform.startswith("win"), "requires Windows")
        def test_gizmo_on_windows(self):
            ...

        @unittest.expectedFailure
        def test_gizmo_without_required_library(self):
            ...

Also, tests for exceptions have been built out to work with context managers::

    def test_division_by_zero(self):
        with self.assertRaises(ZeroDivisionError):
            1 / 0

In addition, several new assertion methods were added including
:func:`assertSetEqual`, :func:`assertDictEqual`,
:func:`assertDictContainsSubset`, :func:`assertListEqual`,
:func:`assertTupleEqual`, :func:`assertSequenceEqual`,
:func:`assertRaisesRegexp`, :func:`assertIsNone`,
and :func:`assertIsNotNone`.
(Contributed by Benjamin Peterson and Antoine Pitrou.)
* The :mod:`io` module has three new constants for the :meth:`seek`
method: :data:`SEEK_SET`, :data:`SEEK_CUR`, and :data:`SEEK_END`.
* The :attr:`sys.version_info` tuple is now a named tuple::
>>> sys.version_info
sys.version_info(major=3, minor=1, micro=0, releaselevel='alpha', serial=2)
(Contributed by Ross Light; :issue:`4285`.)
* A new module, :mod:`importlib`, was added. It provides a complete, portable,
pure Python reference implementation of the *import* statement and its
counterpart, the :func:`__import__` function. It represents a substantial
step forward in documenting and defining the actions that take place during
imports.
(Contributed by Brett Cannon.)
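
Several of the additions listed above can be tried in one short session.
This is only a sampler under stated assumptions: the field names, the
logger name ``example_lib``, and the sample strings are invented for
illustration, and every call used is one named in the list.

```python
import logging
import pickle
import re
from collections import Counter, namedtuple
from functools import partial
from itertools import compress

# Counter: tally hashable items, then ask for the most frequent ones.
colors = Counter(["red", "blue", "red", "green", "blue", "blue"])
top_two = colors.most_common(2)

# namedtuple(rename=True): "count(*)" is not a valid identifier, so it
# is replaced by its positional name, _2.
Row = namedtuple("Row", ["region", "dept", "count(*)"], rename=True)
row = Row("South", "Shipping", 185)

# itertools.compress: keep the items whose selector is true.
primes = list(compress(range(10), [0, 0, 1, 1, 0, 1, 0, 1, 0, 0]))

# re.sub with flags: case-insensitive replacement without compiling first.
normalized = re.sub(r"python", "Python", "PYTHON and python",
                    flags=re.IGNORECASE)

# functools.partial objects survive a pickle round trip.
parse_binary = partial(int, base=2)
restored = pickle.loads(pickle.dumps(parse_binary))

# logging.NullHandler: a library can attach one so that applications
# which never configure logging see no "No handlers" warning.
logging.getLogger("example_lib").addHandler(logging.NullHandler())

print(top_two)          # [('blue', 3), ('red', 2)]
print(row)              # Row(region='South', dept='Shipping', _2=185)
print(primes)           # [2, 3, 5, 7]
print(normalized)       # Python and Python
print(restored("101"))  # 5
```
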
Optimizations
=============
Major performance enhancements have been added:
* The new I/O library (as defined in :pep:`3116`) was mostly written in
Python and quickly proved to be a problematic bottleneck in Python 3.0.
In Python 3.1, the I/O library has been entirely rewritten in C and is
2 to 20 times faster depending on the task at hand. The pure Python
version is still available for experimentation purposes through
the ``_pyio`` module.
(Contributed by Amaury Forgeot d'Arc and Antoine Pitrou.)
* Added a heuristic so that tuples and dicts containing only untrackable objects
are not tracked by the garbage collector. This can reduce the size of
collections and therefore the garbage collection overhead on long-running
programs, depending on their particular use of datatypes.
(Contributed by Antoine Pitrou, :issue:`4688`.)
* When the ``--with-computed-gotos`` configure option is enabled
on compilers that support it (notably: gcc, SunPro, icc), the bytecode
evaluation loop is compiled with a new dispatch mechanism which gives
speedups of up to 20%, depending on the system, the compiler, and
the benchmark.
(Contributed by Antoine Pitrou along with a number of other participants,
:issue:`4753`.)
* The decoding of UTF-8, UTF-16 and LATIN-1 is now two to four times
faster.
(Contributed by Antoine Pitrou and Amaury Forgeot d'Arc, :issue:`4868`.)
* The :mod:`json` module is getting a C extension to substantially improve
its performance. The code is expected to be added in time for the beta
release.
(Contributed by Bob Ippolito and converted to Py3.1 by Antoine Pitrou;
:issue:`4136`.)
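
For readers who want to experiment with the pure Python I/O implementation
mentioned above, here is a minimal sketch that runs the same text through
both ``io`` and ``_pyio`` and checks that they agree. Note that ``_pyio``
is a private module, so this is an experiment rather than a supported
usage pattern; the sample text is invented.

```python
import io
import _pyio  # pure Python implementation, kept for experimentation

payload = "first line\nsecond line\n"

# Both modules expose the same class names and methods.
c_stream = io.StringIO(payload)
py_stream = _pyio.StringIO(payload)

c_lines = c_stream.readlines()
py_lines = py_stream.readlines()

print(c_lines == py_lines)  # True
```
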
Build and C API Changes
=======================
Changes to Python's build process and to the C API include:
* Integers are now stored internally either in base 2**15 or in base
2**30, the base being determined at build time. Previously, they
were always stored in base 2**15. Using base 2**30 gives
significant performance improvements on 64-bit machines, but
benchmark results on 32-bit machines have been mixed. Therefore,
the default is to use base 2**30 on 64-bit machines and base 2**15
on 32-bit machines; on Unix, there's a new configure option
``--enable-big-digits`` that can be used to override this default.
Apart from the performance improvements this change should be invisible to
end users, with one exception: for testing and debugging purposes there's a
new :attr:`sys.int_info` that provides information about the
internal format, giving the number of bits per digit and the size in bytes
of the C type used to store each digit::
>>> import sys
>>> sys.int_info
sys.int_info(bits_per_digit=30, sizeof_digit=4)
(Contributed by Mark Dickinson; :issue:`4258`.)
* The :cfunc:`PyLong_AsUnsignedLongLong()` function now handles a negative
*pylong* by raising :exc:`OverflowError` instead of :exc:`TypeError`.
(Contributed by Mark Dickinson and Lisandro Dalcin; :issue:`5175`.)
* Deprecated :cfunc:`PyNumber_Int`. Use :cfunc:`PyNumber_Long` instead.
(Contributed by Mark Dickinson; :issue:`4910`.)
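
As a small sketch of how ``sys.int_info`` relates to the internal format
described above, the following estimates how many internal digits a given
integer needs. The ceiling-division estimate is an illustration based on
the description in this section, not a statement about the exact memory
layout of any particular build.

```python
import sys

info = sys.int_info

# Each internal digit holds bits_per_digit bits, so the base is a power of 2.
digit_base = 2 ** info.bits_per_digit

# Estimate the digit count for 2**100 by ceiling division of its bit length.
big = 2 ** 100
digits_needed = -(-big.bit_length() // info.bits_per_digit)

print(info.bits_per_digit, info.sizeof_digit)
print(digits_needed)
```
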