2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
:mod:`collections` --- High-performance container datatypes
|
|
|
|
===========================================================
|
|
|
|
|
|
|
|
.. module:: collections
|
|
|
|
:synopsis: High-performance datatypes
|
|
|
|
.. moduleauthor:: Raymond Hettinger <python@rcn.com>
|
|
|
|
.. sectionauthor:: Raymond Hettinger <python@rcn.com>
|
|
|
|
|
|
|
|
.. versionadded:: 2.4
|
|
|
|
|
2008-03-22 18:06:20 -03:00
|
|
|
.. testsetup:: *
|
|
|
|
|
|
|
|
from collections import *
|
|
|
|
import itertools
|
|
|
|
__name__ = '<doctest>'
|
|
|
|
|
2007-08-15 11:28:01 -03:00
|
|
|
This module implements high-performance container datatypes. Currently,
|
|
|
|
there are two datatypes, :class:`deque` and :class:`defaultdict`, and
|
2008-03-22 18:06:20 -03:00
|
|
|
one datatype factory function, :func:`namedtuple`.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
.. versionchanged:: 2.5
|
|
|
|
Added :class:`defaultdict`.
|
|
|
|
|
|
|
|
.. versionchanged:: 2.6
|
2007-11-14 22:44:53 -04:00
|
|
|
Added :func:`namedtuple`.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-02-11 19:38:00 -04:00
|
|
|
The specialized containers in this module provide alternatives
|
2008-03-22 18:06:20 -03:00
|
|
|
to Python's general-purpose built-in containers, :class:`dict`,
|
2008-02-11 19:38:00 -04:00
|
|
|
:class:`list`, :class:`set`, and :class:`tuple`.
|
|
|
|
|
|
|
|
Besides the containers provided here, the optional :mod:`bsddb`
|
2008-03-22 18:06:20 -03:00
|
|
|
module offers the ability to create in-memory or file based ordered
|
2008-02-11 19:38:00 -04:00
|
|
|
dictionaries with string keys using the :func:`bsddb.btopen` function.
|
|
|
|
|
|
|
|
In addition to containers, the collections module provides some ABCs
|
2008-03-22 18:06:20 -03:00
|
|
|
(abstract base classes) that can be used to test whether a class
|
2008-02-11 19:38:00 -04:00
|
|
|
provides a particular interface, for example, whether it is hashable or
|
2008-03-22 18:06:20 -03:00
|
|
|
a mapping.
|
2008-02-11 19:38:00 -04:00
|
|
|
|
|
|
|
.. versionchanged:: 2.6
|
|
|
|
Added abstract base classes.
|
|
|
|
|
|
|
|
ABCs - abstract base classes
|
|
|
|
----------------------------
|
|
|
|
|
|
|
|
The collections module offers the following ABCs:
|
|
|
|
|
|
|
|
=========================  ====================  ======================  ====================================================
ABC                        Inherits              Abstract Methods        Mixin Methods
=========================  ====================  ======================  ====================================================
:class:`Container`                               ``__contains__``
:class:`Hashable`                                ``__hash__``
:class:`Iterable`                                ``__iter__``
:class:`Iterator`          :class:`Iterable`     ``next``                ``__iter__``
:class:`Sized`                                   ``__len__``

:class:`Mapping`           :class:`Sized`,       ``__getitem__``,        ``__contains__``, ``keys``, ``items``, ``values``,
                           :class:`Iterable`,    ``__len__``, and        ``get``, ``__eq__``, and ``__ne__``
                           :class:`Container`    ``__iter__``

:class:`MutableMapping`    :class:`Mapping`      ``__getitem__``,        Inherited Mapping methods and
                                                 ``__setitem__``,        ``pop``, ``popitem``, ``clear``, ``update``,
                                                 ``__delitem__``,        and ``setdefault``
                                                 ``__iter__``, and
                                                 ``__len__``

:class:`Sequence`          :class:`Sized`,       ``__getitem__``         ``__contains__``, ``__iter__``, ``__reversed__``,
                           :class:`Iterable`,    and ``__len__``         ``index``, and ``count``
                           :class:`Container`

:class:`MutableSequence`   :class:`Sequence`     ``__getitem__``,        Inherited Sequence methods and
                                                 ``__setitem__``,        ``append``, ``reverse``, ``extend``, ``pop``,
                                                 ``__delitem__``,        ``remove``, and ``__iadd__``
                                                 ``insert``, and
                                                 ``__len__``

:class:`Set`               :class:`Sized`,       ``__len__``,            ``__le__``, ``__lt__``, ``__eq__``, ``__ne__``,
                           :class:`Iterable`,    ``__iter__``, and       ``__gt__``, ``__ge__``, ``__and__``, ``__or__``,
                           :class:`Container`    ``__contains__``        ``__sub__``, ``__xor__``, and ``isdisjoint``

:class:`MutableSet`        :class:`Set`          ``add`` and             Inherited Set methods and
                                                 ``discard``             ``clear``, ``pop``, ``remove``, ``__ior__``,
                                                                         ``__iand__``, ``__ixor__``, and ``__isub__``
=========================  ====================  ======================  ====================================================
|
|
|
|
|
|
|
|
These ABCs allow us to ask classes or instances if they provide
|
|
|
|
particular functionality, for example::
|
|
|
|
|
|
|
|
size = None
|
|
|
|
if isinstance(myvar, collections.Sized):
|
|
|
|
size = len(myvar)
|
|
|
|
|
|
|
|
Several of the ABCs are also useful as mixins that make it easier to develop
|
|
|
|
classes supporting container APIs. For example, to write a class supporting
|
|
|
|
the full :class:`Set` API, it is only necessary to supply the three underlying
|
|
|
|
abstract methods: :meth:`__contains__`, :meth:`__iter__`, and :meth:`__len__`.
|
|
|
|
The ABC supplies the remaining methods such as :meth:`__and__` and
|
|
|
|
:meth:`isdisjoint` ::
|
|
|
|
|
|
|
|
class ListBasedSet(collections.Set):
|
|
|
|
''' Alternate set implementation favoring space over speed
|
|
|
|
and not requiring the set elements to be hashable. '''
|
|
|
|
def __init__(self, iterable):
|
|
|
|
self.elements = lst = []
|
|
|
|
for value in iterable:
|
|
|
|
if value not in lst:
|
|
|
|
lst.append(value)
|
|
|
|
def __iter__(self):
|
|
|
|
return iter(self.elements)
|
|
|
|
def __contains__(self, value):
|
|
|
|
return value in self.elements
|
|
|
|
def __len__(self):
|
|
|
|
return len(self.elements)
|
|
|
|
|
|
|
|
s1 = ListBasedSet('abcdef')
|
|
|
|
s2 = ListBasedSet('defghi')
|
|
|
|
overlap = s1 & s2 # The __and__() method is supported automatically
|
|
|
|
|
|
|
|
Notes on using :class:`Set` and :class:`MutableSet` as a mixin:
|
|
|
|
|
2008-03-22 18:06:20 -03:00
|
|
|
(1)
|
2008-02-11 19:38:00 -04:00
|
|
|
Since some set operations create new sets, the default mixin methods need
|
2008-03-22 18:06:20 -03:00
|
|
|
a way to create new instances from an iterable. The class constructor is
|
|
|
|
assumed to have a signature in the form ``ClassName(iterable)``.
|
2008-05-23 14:34:34 -03:00
|
|
|
That assumption is factored out to an internal classmethod called
|
2008-02-11 19:38:00 -04:00
|
|
|
:meth:`_from_iterable` which calls ``cls(iterable)`` to produce a new set.
|
|
|
|
If the :class:`Set` mixin is being used in a class with a different
|
2008-03-22 18:06:20 -03:00
|
|
|
constructor signature, you will need to override :meth:`_from_iterable`
|
|
|
|
with a classmethod that can construct new instances from
|
2008-02-11 19:38:00 -04:00
|
|
|
an iterable argument.
|
|
|
|
|
|
|
|
(2)
|
|
|
|
To override the comparisons (presumably for speed, as the
|
|
|
|
semantics are fixed), redefine :meth:`__le__` and
|
|
|
|
then the other operations will automatically follow suit.
|
|
|
|
|
|
|
|
(3)
|
|
|
|
The :class:`Set` mixin provides a :meth:`_hash` method to compute a hash value
|
|
|
|
for the set; however, :meth:`__hash__` is not defined because not all sets
|
|
|
|
are hashable or immutable. To add set hashability using mixins,
|
|
|
|
inherit from both :class:`Set` and :class:`Hashable`, then define
|
|
|
|
``__hash__ = Set._hash``.
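Notes (1) and (3) can be sketched together. The class below is hypothetical: ``TaggedSet`` and its ``tag`` argument are invented for illustration and are not part of the module. It assumes a constructor that takes an extra argument, so it overrides ``_from_iterable``, and it opts in to hashing through ``Set._hash``. The import fallback is also an assumption, added so the sketch runs on interpreters where the ABCs live in ``collections.abc``.

```python
try:
    from collections.abc import Set, Hashable   # newer interpreters
except ImportError:
    from collections import Set, Hashable       # Python 2.6 / 2.7


class TaggedSet(Set, Hashable):
    """Hypothetical immutable set whose constructor takes an extra 'tag'."""

    def __init__(self, tag, iterable=()):
        self.tag = tag
        self._items = []
        for value in iterable:
            if value not in self._items:
                self._items.append(value)

    @classmethod
    def _from_iterable(cls, iterable):
        # Note (1): the mixin methods call this to build their results,
        # so it has to adapt to the two-argument constructor.
        return cls('derived', iterable)

    def __contains__(self, value):
        return value in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    # Note (3): reuse the hash computation supplied by the mixin.
    __hash__ = Set._hash


a = TaggedSet('a', [1, 2, 3])
b = TaggedSet('b', [2, 3, 4])
common = a & b          # __and__ comes from the mixin and uses _from_iterable()
```

Here ``common`` is a new ``TaggedSet`` holding 2 and 3, tagged ``'derived'``, and ``hash(a)`` depends only on the elements, not on their order or on the tag.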
|
|
|
|
|
|
|
|
(For more about ABCs, see the :mod:`abc` module and :pep:`3119`.)
|
|
|
|
|
|
|
|
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
.. _deque-objects:
|
|
|
|
|
|
|
|
:class:`deque` objects
|
|
|
|
----------------------
|
|
|
|
|
|
|
|
|
2007-10-04 23:47:07 -03:00
|
|
|
.. class:: deque([iterable[, maxlen]])
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
Returns a new deque object initialized left-to-right (using :meth:`append`) with
|
|
|
|
data from *iterable*. If *iterable* is not specified, the new deque is empty.
|
|
|
|
|
|
|
|
Deques are a generalization of stacks and queues (the name is pronounced "deck"
|
|
|
|
and is short for "double-ended queue"). Deques support thread-safe, memory
|
|
|
|
efficient appends and pops from either side of the deque with approximately the
|
|
|
|
same O(1) performance in either direction.
|
|
|
|
|
|
|
|
Though :class:`list` objects support similar operations, they are optimized for
|
|
|
|
fast fixed-length operations and incur O(n) memory movement costs for
|
|
|
|
``pop(0)`` and ``insert(0, v)`` operations which change both the size and
|
|
|
|
position of the underlying data representation.
|
|
|
|
|
|
|
|
.. versionadded:: 2.4
|
|
|
|
|
2007-10-09 21:26:46 -03:00
|
|
|
If *maxlen* is not specified or is ``None``, deques may grow to an
|
2007-10-04 23:47:07 -03:00
|
|
|
arbitrary length. Otherwise, the deque is bounded to the specified maximum
|
|
|
|
length. Once a bounded length deque is full, when new items are added, a
|
|
|
|
corresponding number of items are discarded from the opposite end. Bounded
|
|
|
|
length deques provide functionality similar to the ``tail`` filter in
|
|
|
|
Unix. They are also useful for tracking transactions and other pools of data
|
|
|
|
where only the most recent activity is of interest.
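A minimal sketch of the bounded behaviour described above; the event names are made up for illustration.

```python
from collections import deque

recent = deque(maxlen=3)          # keep only the three newest items
for event in ['login', 'view', 'edit', 'save', 'logout']:
    recent.append(event)          # once full, the leftmost item is dropped

print(list(recent))               # ['edit', 'save', 'logout']

recent.appendleft('start')        # adding on the left discards from the right
print(list(recent))               # ['start', 'edit', 'save']
```

Items are always discarded from the end opposite to the one being added to, so the deque never exceeds *maxlen*.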
|
|
|
|
|
|
|
|
.. versionchanged:: 2.6
|
2007-12-29 06:57:00 -04:00
|
|
|
Added *maxlen* parameter.
|
2007-10-04 23:47:07 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
Deque objects support the following methods:
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
.. method:: append(x)
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
Add *x* to the right side of the deque.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
.. method:: appendleft(x)
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
Add *x* to the left side of the deque.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
.. method:: clear()
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
Remove all elements from the deque leaving it with length 0.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
.. method:: extend(iterable)
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
Extend the right side of the deque by appending elements from the iterable
|
|
|
|
argument.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
.. method:: extendleft(iterable)
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
Extend the left side of the deque by appending elements from *iterable*.
|
|
|
|
Note, the series of left appends results in reversing the order of
|
|
|
|
elements in the iterable argument.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
.. method:: pop()
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
Remove and return an element from the right side of the deque. If no
|
|
|
|
elements are present, raises an :exc:`IndexError`.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
.. method:: popleft()
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
Remove and return an element from the left side of the deque. If no
|
|
|
|
elements are present, raises an :exc:`IndexError`.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
.. method:: remove(value)
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
Remove the first occurrence of *value*. If not found, raises a
|
|
|
|
:exc:`ValueError`.
|
|
|
|
|
|
|
|
.. versionadded:: 2.5
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
.. method:: rotate(n)
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
Rotate the deque *n* steps to the right. If *n* is negative, rotate to
|
|
|
|
the left. Rotating one step to the right is equivalent to:
|
|
|
|
``d.appendleft(d.pop())``.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
|
|
|
|
In addition to the above, deques support iteration, pickling, ``len(d)``,
|
|
|
|
``reversed(d)``, ``copy.copy(d)``, ``copy.deepcopy(d)``, membership testing with
|
|
|
|
the :keyword:`in` operator, and subscript references such as ``d[-1]``.
|
|
|
|
|
2008-03-22 18:06:20 -03:00
|
|
|
Example:
|
|
|
|
|
|
|
|
.. doctest::
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
>>> from collections import deque
|
|
|
|
>>> d = deque('ghi') # make a new deque with three items
|
|
|
|
>>> for elem in d: # iterate over the deque's elements
|
2008-03-22 18:06:20 -03:00
|
|
|
... print elem.upper()
|
2007-08-15 11:28:01 -03:00
|
|
|
G
|
|
|
|
H
|
|
|
|
I
|
|
|
|
|
|
|
|
>>> d.append('j') # add a new entry to the right side
|
|
|
|
>>> d.appendleft('f') # add a new entry to the left side
|
|
|
|
>>> d # show the representation of the deque
|
|
|
|
deque(['f', 'g', 'h', 'i', 'j'])
|
|
|
|
|
|
|
|
>>> d.pop() # return and remove the rightmost item
|
|
|
|
'j'
|
|
|
|
>>> d.popleft() # return and remove the leftmost item
|
|
|
|
'f'
|
|
|
|
>>> list(d) # list the contents of the deque
|
|
|
|
['g', 'h', 'i']
|
|
|
|
>>> d[0] # peek at leftmost item
|
|
|
|
'g'
|
|
|
|
>>> d[-1] # peek at rightmost item
|
|
|
|
'i'
|
|
|
|
|
|
|
|
>>> list(reversed(d)) # list the contents of a deque in reverse
|
|
|
|
['i', 'h', 'g']
|
|
|
|
>>> 'h' in d # search the deque
|
|
|
|
True
|
|
|
|
>>> d.extend('jkl') # add multiple elements at once
|
|
|
|
>>> d
|
|
|
|
deque(['g', 'h', 'i', 'j', 'k', 'l'])
|
|
|
|
>>> d.rotate(1) # right rotation
|
|
|
|
>>> d
|
|
|
|
deque(['l', 'g', 'h', 'i', 'j', 'k'])
|
|
|
|
>>> d.rotate(-1) # left rotation
|
|
|
|
>>> d
|
|
|
|
deque(['g', 'h', 'i', 'j', 'k', 'l'])
|
|
|
|
|
|
|
|
>>> deque(reversed(d)) # make a new deque in reverse order
|
|
|
|
deque(['l', 'k', 'j', 'i', 'h', 'g'])
|
|
|
|
>>> d.clear() # empty the deque
|
|
|
|
>>> d.pop() # cannot pop from an empty deque
|
|
|
|
Traceback (most recent call last):
|
|
|
|
File "<pyshell#6>", line 1, in -toplevel-
|
|
|
|
d.pop()
|
|
|
|
IndexError: pop from an empty deque
|
|
|
|
|
|
|
|
>>> d.extendleft('abc') # extendleft() reverses the input order
|
|
|
|
>>> d
|
|
|
|
deque(['c', 'b', 'a'])
|
|
|
|
|
|
|
|
|
|
|
|
.. _deque-recipes:
|
|
|
|
|
2007-10-04 23:47:07 -03:00
|
|
|
:class:`deque` Recipes
|
|
|
|
^^^^^^^^^^^^^^^^^^^^^^
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
This section shows various approaches to working with deques.
|
|
|
|
|
|
|
|
The :meth:`rotate` method provides a way to implement :class:`deque` slicing and
|
|
|
|
deletion. For example, a pure Python implementation of ``del d[n]`` relies on
|
|
|
|
the :meth:`rotate` method to position elements to be popped::
|
|
|
|
|
|
|
|
def delete_nth(d, n):
|
|
|
|
d.rotate(-n)
|
|
|
|
d.popleft()
|
|
|
|
d.rotate(n)
|
|
|
|
|
|
|
|
To implement :class:`deque` slicing, use a similar approach applying
|
|
|
|
:meth:`rotate` to bring a target element to the left side of the deque. Remove
|
|
|
|
old entries with :meth:`popleft`, add new entries with :meth:`extend`, and then
|
|
|
|
reverse the rotation.
|
|
|
|
With minor variations on that approach, it is easy to implement Forth style
|
|
|
|
stack manipulations such as ``dup``, ``drop``, ``swap``, ``over``, ``pick``,
|
|
|
|
``rot``, and ``roll``.
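The slicing approach and a few of the stack words can be sketched as follows. The helper names (``delete_slice``, ``dup``, ``swap``, ``over``) are illustrative only, and the slice helper assumes ``0 <= start <= stop <= len(d)``; the top of the stack is taken to be the right end of the deque.

```python
from collections import deque


def delete_slice(d, start, stop):
    """Delete d[start:stop] in place, assuming 0 <= start <= stop <= len(d)."""
    d.rotate(-start)                  # bring d[start] to the left end
    for _ in range(stop - start):
        d.popleft()                   # remove the old entries
    d.rotate(start)                   # reverse the rotation


def dup(stack):
    """Forth 'dup': duplicate the top (rightmost) item."""
    stack.append(stack[-1])


def swap(stack):
    """Forth 'swap': exchange the top two items."""
    top = stack.pop()
    below = stack.pop()
    stack.append(top)
    stack.append(below)


def over(stack):
    """Forth 'over': copy the second item onto the top."""
    stack.append(stack[-2])


d = deque('abcdefg')
delete_slice(d, 2, 5)                 # drops 'c', 'd' and 'e'
print(''.join(d))                     # abfg
```

Inserting new entries would follow the same pattern, with an :meth:`extendleft` of the reversed replacement items before the rotation is undone.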
|
|
|
|
|
|
|
|
Multi-pass data reduction algorithms can be succinctly expressed and efficiently
|
|
|
|
coded by extracting elements with multiple calls to :meth:`popleft`, applying
|
2007-10-04 23:47:07 -03:00
|
|
|
a reduction function, and calling :meth:`append` to add the result back to the
|
|
|
|
deque.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
For example, building a balanced binary tree of nested lists entails reducing
|
2008-03-22 18:06:20 -03:00
|
|
|
two adjacent nodes into one by grouping them in a list:
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
>>> def maketree(iterable):
|
|
|
|
... d = deque(iterable)
|
|
|
|
... while len(d) > 1:
|
|
|
|
... pair = [d.popleft(), d.popleft()]
|
|
|
|
... d.append(pair)
|
|
|
|
... return list(d)
|
|
|
|
...
|
|
|
|
>>> print maketree('abcdefgh')
|
|
|
|
[[[['a', 'b'], ['c', 'd']], [['e', 'f'], ['g', 'h']]]]
|
|
|
|
|
2007-10-04 23:47:07 -03:00
|
|
|
Bounded length deques provide functionality similar to the ``tail`` filter
|
|
|
|
in Unix::
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2007-10-04 23:47:07 -03:00
|
|
|
def tail(filename, n=10):
|
|
|
|
'Return the last n lines of a file'
|
|
|
|
return deque(open(filename), n)
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
.. _defaultdict-objects:
|
|
|
|
|
|
|
|
:class:`defaultdict` objects
|
|
|
|
----------------------------
|
|
|
|
|
|
|
|
|
|
|
|
.. class:: defaultdict([default_factory[, ...]])
|
|
|
|
|
|
|
|
Returns a new dictionary-like object. :class:`defaultdict` is a subclass of the
|
|
|
|
builtin :class:`dict` class. It overrides one method and adds one writable
|
|
|
|
instance variable. The remaining functionality is the same as for the
|
|
|
|
:class:`dict` class and is not documented here.
|
|
|
|
|
|
|
|
The first argument provides the initial value for the :attr:`default_factory`
|
|
|
|
attribute; it defaults to ``None``. All remaining arguments are treated the same
|
|
|
|
as if they were passed to the :class:`dict` constructor, including keyword
|
|
|
|
arguments.
|
|
|
|
|
|
|
|
.. versionadded:: 2.5
|
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
:class:`defaultdict` objects support the following method in addition to the
|
|
|
|
standard :class:`dict` operations:
|
|
|
|
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
.. method:: defaultdict.__missing__(key)
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
If the :attr:`default_factory` attribute is ``None``, this raises an
|
|
|
|
:exc:`KeyError` exception with the *key* as argument.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
If :attr:`default_factory` is not ``None``, it is called without arguments
|
|
|
|
to provide a default value for the given *key*, this value is inserted in
|
|
|
|
the dictionary for the *key*, and returned.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
If calling :attr:`default_factory` raises an exception, this exception is
|
|
|
|
propagated unchanged.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
This method is called by the :meth:`__getitem__` method of the
|
|
|
|
:class:`dict` class when the requested key is not found; whatever it
|
|
|
|
returns or raises is then returned or raised by :meth:`__getitem__`.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
:class:`defaultdict` objects support the following instance variable:
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
.. attribute:: defaultdict.default_factory
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2008-04-24 22:29:10 -03:00
|
|
|
This attribute is used by the :meth:`__missing__` method; it is
|
|
|
|
initialized from the first argument to the constructor, if present, or to
|
|
|
|
``None``, if absent.
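A short sketch of how :meth:`__missing__` and :attr:`default_factory` interact; the keys are arbitrary. It also shows that :meth:`dict.get` does not go through :meth:`__getitem__`, so it never triggers the factory.

```python
from collections import defaultdict

d = defaultdict()                 # default_factory defaults to None
try:
    d['absent']
except KeyError:
    missing_raised = True         # with no factory, it behaves like a plain dict
else:
    missing_raised = False

d.default_factory = list          # the attribute is writable
d['colors'].append('red')         # __getitem__ calls __missing__, which inserts []

fallback = d.get('shapes')        # get() bypasses __missing__: no entry is created
```

After this runs, ``d`` holds only the ``'colors'`` entry and ``fallback`` is ``None``.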
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
|
|
|
|
.. _defaultdict-examples:
|
|
|
|
|
|
|
|
:class:`defaultdict` Examples
|
|
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
|
|
|
|
Using :class:`list` as the :attr:`default_factory`, it is easy to group a
|
2008-03-22 18:06:20 -03:00
|
|
|
sequence of key-value pairs into a dictionary of lists:
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
|
|
|
|
>>> d = defaultdict(list)
|
|
|
|
>>> for k, v in s:
|
|
|
|
... d[k].append(v)
|
|
|
|
...
|
|
|
|
>>> d.items()
|
|
|
|
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
|
|
|
|
|
|
|
|
When each key is encountered for the first time, it is not already in the
|
|
|
|
mapping; so an entry is automatically created using the :attr:`default_factory`
|
|
|
|
function which returns an empty :class:`list`. The :meth:`list.append`
|
|
|
|
operation then attaches the value to the new list. When keys are encountered
|
|
|
|
again, the look-up proceeds normally (returning the list for that key) and the
|
|
|
|
:meth:`list.append` operation adds another value to the list. This technique is
|
2008-03-22 18:06:20 -03:00
|
|
|
simpler and faster than an equivalent technique using :meth:`dict.setdefault`:
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
>>> d = {}
|
|
|
|
>>> for k, v in s:
|
|
|
|
... d.setdefault(k, []).append(v)
|
|
|
|
...
|
|
|
|
>>> d.items()
|
|
|
|
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
|
|
|
|
|
|
|
|
Setting the :attr:`default_factory` to :class:`int` makes the
|
|
|
|
:class:`defaultdict` useful for counting (like a bag or multiset in other
|
2008-03-22 18:06:20 -03:00
|
|
|
languages):
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
>>> s = 'mississippi'
|
|
|
|
>>> d = defaultdict(int)
|
|
|
|
>>> for k in s:
|
|
|
|
... d[k] += 1
|
|
|
|
...
|
|
|
|
>>> d.items()
|
|
|
|
[('i', 4), ('p', 2), ('s', 4), ('m', 1)]
|
|
|
|
|
|
|
|
When a letter is first encountered, it is missing from the mapping, so the
|
|
|
|
:attr:`default_factory` function calls :func:`int` to supply a default count of
|
|
|
|
zero. The increment operation then builds up the count for each letter.
|
|
|
|
|
|
|
|
The function :func:`int` which always returns zero is just a special case of
|
|
|
|
constant functions. A faster and more flexible way to create constant functions
|
|
|
|
is to use :func:`itertools.repeat` which can supply any constant value (not just
|
2008-03-22 18:06:20 -03:00
|
|
|
zero):
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
>>> def constant_factory(value):
|
|
|
|
... return itertools.repeat(value).next
|
|
|
|
>>> d = defaultdict(constant_factory('<missing>'))
|
|
|
|
>>> d.update(name='John', action='ran')
|
|
|
|
>>> '%(name)s %(action)s to %(object)s' % d
|
|
|
|
'John ran to <missing>'
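The recipe above relies on the iterator's ``next`` method. A variant that uses the built-in :func:`next` together with :func:`functools.partial` is sketched below; this is an alternative spelling, not the form used in the example above, and it is meant for interpreters where iterators do not expose a ``next`` attribute.

```python
import itertools
from collections import defaultdict
from functools import partial


def constant_factory(value):
    # next(iterator) is the built-in spelling of iterator.next()
    return partial(next, itertools.repeat(value))


d = defaultdict(constant_factory('<missing>'))
d.update(name='John', action='ran')
sentence = '%(name)s %(action)s to %(object)s' % d
print(sentence)                   # John ran to <missing>
```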
|
|
|
|
|
|
|
|
Setting the :attr:`default_factory` to :class:`set` makes the
|
2008-03-22 18:06:20 -03:00
|
|
|
:class:`defaultdict` useful for building a dictionary of sets:
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
>>> s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)]
|
|
|
|
>>> d = defaultdict(set)
|
|
|
|
>>> for k, v in s:
|
|
|
|
... d[k].add(v)
|
|
|
|
...
|
|
|
|
>>> d.items()
|
|
|
|
[('blue', set([2, 4])), ('red', set([1, 3]))]
|
|
|
|
|
|
|
|
|
|
|
|
.. _named-tuple-factory:
|
|
|
|
|
2007-11-14 22:44:53 -04:00
|
|
|
:func:`namedtuple` Factory Function for Tuples with Named Fields
|
2008-01-07 12:43:47 -04:00
|
|
|
----------------------------------------------------------------
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2007-09-18 19:18:02 -03:00
|
|
|
Named tuples assign meaning to each position in a tuple and allow for more readable,
|
|
|
|
self-documenting code. They can be used wherever regular tuples are used, and
|
|
|
|
they add the ability to access fields by name instead of position index.
|
2007-08-15 11:28:01 -03:00
|
|
|
|
2007-11-14 22:44:53 -04:00
|
|
|
.. function:: namedtuple(typename, fieldnames[, verbose])
|
2007-08-15 11:28:01 -03:00
|
|
|
|
|
|
|
Returns a new tuple subclass named *typename*. The new subclass is used to
|
2008-02-22 08:31:45 -04:00
|
|
|
create tuple-like objects that have fields accessible by attribute lookup as
|
2007-08-15 11:28:01 -03:00
|
|
|
well as being indexable and iterable. Instances of the subclass also have a
|
|
|
|
helpful docstring (with typename and fieldnames) and a helpful :meth:`__repr__`
|
|
|
|
method which lists the tuple contents in a ``name=value`` format.
|
|
|
|
|
2007-10-08 18:26:58 -03:00
|
|
|
The *fieldnames* are a single string with each fieldname separated by whitespace
|
2008-01-10 19:00:01 -04:00
|
|
|
and/or commas, for example ``'x y'`` or ``'x, y'``. Alternatively, *fieldnames*
|
|
|
|
can be a sequence of strings such as ``['x', 'y']``.
|
2007-10-16 18:28:32 -03:00
|
|
|
|
|
|
|
Any valid Python identifier may be used for a fieldname except for names
|
2007-12-13 22:49:47 -04:00
|
|
|
starting with an underscore. Valid identifiers consist of letters, digits,
|
|
|
|
and underscores but do not start with a digit and cannot be
|
2007-10-16 18:28:32 -03:00
|
|
|
a :mod:`keyword` such as *class*, *for*, *return*, *global*, *pass*, *print*,
|
|
|
|
or *raise*.
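A brief sketch of the field name rules. The type names ``A``, ``B``, ``C`` and ``Bad`` are placeholders, and the rejected names are chosen to hit one rule each (a keyword, a leading underscore, a leading digit).

```python
from collections import namedtuple

# The string and sequence spellings describe the same fields.
A = namedtuple('A', 'x y')
B = namedtuple('B', 'x, y')
C = namedtuple('C', ['x', 'y'])
same_fields = A._fields == B._fields == C._fields == ('x', 'y')

rejected = []
for bad in ('x class', 'x _hidden', 'x 2nd'):
    try:
        namedtuple('Bad', bad)
    except ValueError:
        rejected.append(bad)      # invalid field names raise ValueError
```

All three invalid specifications end up in ``rejected``.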
|
2007-09-18 19:18:02 -03:00
|
|
|
|
2008-01-10 19:00:01 -04:00
|
|
|
If *verbose* is true, the class definition is printed just before being built.
|
2007-09-18 19:18:02 -03:00
|
|
|
|
2007-10-08 18:26:58 -03:00
|
|
|
Named tuple instances do not have per-instance dictionaries, so they are
|
2007-09-20 00:03:43 -03:00
|
|
|
lightweight and require no more memory than regular tuples.
|
2007-09-18 19:18:02 -03:00
|
|
|
|
2007-08-15 11:28:01 -03:00
|
|
|
.. versionadded:: 2.6
|
|
|
|
|
2008-03-22 18:06:20 -03:00
|
|
|
Example:
|
|
|
|
|
|
|
|
.. doctest::
|
|
|
|
:options: +NORMALIZE_WHITESPACE
|
2007-09-18 19:18:02 -03:00
|
|
|
|
2007-11-14 22:44:53 -04:00
|
|
|
>>> Point = namedtuple('Point', 'x y', verbose=True)
|
2007-09-18 19:18:02 -03:00
|
|
|
class Point(tuple):
|
|
|
|
'Point(x, y)'
|
2008-03-22 18:06:20 -03:00
|
|
|
<BLANKLINE>
|
2007-09-18 19:18:02 -03:00
|
|
|
__slots__ = ()
|
2008-03-22 18:06:20 -03:00
|
|
|
<BLANKLINE>
|
2008-01-03 23:22:53 -04:00
|
|
|
_fields = ('x', 'y')
|
2008-03-22 18:06:20 -03:00
|
|
|
<BLANKLINE>
|
2007-09-18 19:18:02 -03:00
|
|
|
def __new__(cls, x, y):
|
|
|
|
return tuple.__new__(cls, (x, y))
|
2008-03-22 18:06:20 -03:00
|
|
|
<BLANKLINE>
|
2008-01-04 21:35:43 -04:00
|
|
|
@classmethod
|
2008-03-22 18:06:20 -03:00
|
|
|
def _make(cls, iterable, new=tuple.__new__, len=len):
|
2008-01-04 21:35:43 -04:00
|
|
|
'Make a new Point object from a sequence or iterable'
|
2008-03-22 18:06:20 -03:00
|
|
|
result = new(cls, iterable)
|
2008-01-04 21:35:43 -04:00
|
|
|
if len(result) != 2:
|
|
|
|
raise TypeError('Expected 2 arguments, got %d' % len(result))
|
|
|
|
return result
|
2008-03-22 18:06:20 -03:00
|
|
|
<BLANKLINE>
|
2007-09-18 19:18:02 -03:00
|
|
|
def __repr__(self):
|
|
|
|
return 'Point(x=%r, y=%r)' % self
|
2008-03-22 18:06:20 -03:00
|
|
|
<BLANKLINE>
|
2007-12-18 18:21:27 -04:00
|
|
|
def _asdict(t):
|
2007-12-14 14:08:20 -04:00
|
|
|
'Return a new dict which maps field names to their values'
|
2007-12-18 18:21:27 -04:00
|
|
|
return {'x': t[0], 'y': t[1]}
|
2008-03-22 18:06:20 -03:00
|
|
|
<BLANKLINE>
|
2007-12-13 22:49:47 -04:00
|
|
|
def _replace(self, **kwds):
|
2007-11-14 22:44:53 -04:00
|
|
|
'Return a new Point object replacing specified fields with new values'
|
2008-01-06 05:02:24 -04:00
|
|
|
result = self._make(map(kwds.pop, ('x', 'y'), self))
|
2008-01-04 22:17:24 -04:00
|
|
|
if kwds:
|
|
|
|
raise ValueError('Got unexpected field names: %r' % kwds.keys())
|
|
|
|
return result
|
2008-03-22 18:06:20 -03:00
|
|
|
<BLANKLINE>
|
2007-09-18 19:18:02 -03:00
|
|
|
x = property(itemgetter(0))
|
|
|
|
y = property(itemgetter(1))
|
|
|
|
|
|
|
|
>>> p = Point(11, y=22) # instantiate with positional or keyword arguments
|
2007-12-17 20:13:45 -04:00
|
|
|
>>> p[0] + p[1] # indexable like the plain tuple (11, 22)
|
2007-09-18 19:18:02 -03:00
|
|
|
33
|
|
|
|
>>> x, y = p # unpack like a regular tuple
|
|
|
|
>>> x, y
|
|
|
|
(11, 22)
|
2008-02-22 08:31:45 -04:00
|
|
|
>>> p.x + p.y # fields also accessible by name
|
2007-09-18 19:18:02 -03:00
|
|
|
33
|
|
|
|
>>> p # readable __repr__ with a name=value style
|
|
|
|
Point(x=11, y=22)
|
|
|
|
|
|
|
|
Named tuples are especially useful for assigning field names to result tuples returned
|
|
|
|
by the :mod:`csv` or :mod:`sqlite3` modules::
|
|
|
|
|
2007-11-14 22:44:53 -04:00
|
|
|
EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')
|
2007-10-08 18:26:58 -03:00
|
|
|
|
2007-09-18 19:18:02 -03:00
|
|
|
import csv
|
2008-01-04 21:35:43 -04:00
|
|
|
for emp in map(EmployeeRecord._make, csv.reader(open("employees.csv", "rb"))):
|
2007-09-18 19:18:02 -03:00
|
|
|
print emp.name, emp.title
|
|
|
|
|
2007-10-08 18:26:58 -03:00
|
|
|
import sqlite3
|
|
|
|
conn = sqlite3.connect('/companydata')
|
|
|
|
cursor = conn.cursor()
|
|
|
|
cursor.execute('SELECT name, age, title, department, paygrade FROM employees')
|
2008-01-04 21:35:43 -04:00
|
|
|
for emp in map(EmployeeRecord._make, cursor.fetchall()):
|
2007-10-08 18:26:58 -03:00
|
|
|
print emp.name, emp.title
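For experimenting without an existing data file, the same pattern can be tried against a throwaway in-memory database. This is only a sketch: the table layout and the two sample rows are invented for illustration.

```python
import sqlite3
from collections import namedtuple

EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')

conn = sqlite3.connect(':memory:')        # throwaway in-memory database
conn.execute('CREATE TABLE employees '
             '(name TEXT, age INTEGER, title TEXT, department TEXT, paygrade INTEGER)')
conn.executemany('INSERT INTO employees VALUES (?, ?, ?, ?, ?)',
                 [('Ada', 36, 'Engineer', 'R&D', 5),       # invented sample rows
                  ('Grace', 45, 'Director', 'Ops', 7)])

cursor = conn.execute('SELECT name, age, title, department, paygrade '
                      'FROM employees ORDER BY name')
employees = [EmployeeRecord._make(row) for row in cursor.fetchall()]
conn.close()

for emp in employees:
    print('%s: %s' % (emp.name, emp.title))
```

Each row tuple returned by the cursor is passed to ``_make``, so the column order in the ``SELECT`` must match the field order of the named tuple.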
|
|
|
|
|
2007-12-18 19:51:15 -04:00
|
|
|
In addition to the methods inherited from tuples, named tuples support
|
2008-01-07 22:24:15 -04:00
|
|
|
three additional methods and one attribute. To prevent conflicts with
|
|
|
|
field names, the method and attribute names start with an underscore.
|
2007-09-18 19:18:02 -03:00
|
|
|
|
2008-01-07 12:43:47 -04:00
|
|
|
.. method:: somenamedtuple._make(iterable)
|
2007-09-18 00:33:19 -03:00
|
|
|
|
2008-01-04 21:35:43 -04:00
|
|
|
Class method that makes a new instance from an existing sequence or iterable.
|
2007-10-04 23:47:07 -03:00
|
|
|
|
2008-03-22 18:06:20 -03:00
|
|
|
.. doctest::
|
2007-10-04 23:47:07 -03:00
|
|
|
|
2008-01-04 21:35:43 -04:00
|
|
|
>>> t = [11, 22]
|
|
|
|
>>> Point._make(t)
|
|
|
|
Point(x=11, y=22)
|
2007-10-04 23:47:07 -03:00
|
|
|
|
2008-01-07 12:43:47 -04:00
|
|
|
.. method:: somenamedtuple._asdict()
|
2007-10-04 23:47:07 -03:00
|
|
|
|
2008-03-22 18:06:20 -03:00
|
|
|
Return a new dict which maps field names to their corresponding values::
|
2007-09-16 21:55:00 -03:00
|
|
|
|
2007-12-13 22:49:47 -04:00
|
|
|
>>> p._asdict()
|
2007-10-04 23:47:07 -03:00
|
|
|
{'x': 11, 'y': 22}
|
2008-03-22 18:06:20 -03:00
|
|
|
|
2008-01-07 12:43:47 -04:00
|
|
|
.. method:: somenamedtuple._replace(kwargs)
|
2007-09-16 21:55:00 -03:00
|
|
|
|
2008-03-22 18:06:20 -03:00
|
|
|
Return a new instance of the named tuple replacing specified fields with new
|
|
|
|
values:
|
2007-09-20 00:03:43 -03:00
|
|
|
|
|
|
|
::
|
2007-09-16 21:55:00 -03:00
|
|
|
|
2007-09-18 19:18:02 -03:00
|
|
|
>>> p = Point(x=11, y=22)
|
2007-12-13 22:49:47 -04:00
|
|
|
>>> p._replace(x=33)
|
2007-09-16 21:55:00 -03:00
|
|
|
Point(x=33, y=22)
|
|
|
|
|
2007-11-14 23:16:09 -04:00
|
|
|
>>> for partnum, record in inventory.items():
|
2008-01-08 23:02:23 -04:00
|
|
|
...     inventory[partnum] = record._replace(price=newprices[partnum], timestamp=time.time())
|
2007-09-16 21:55:00 -03:00
|
|
|
|
2008-01-07 12:43:47 -04:00
|
|
|
.. attribute:: somenamedtuple._fields
|
2007-09-16 21:55:00 -03:00
|
|
|
|
2008-01-07 17:33:51 -04:00
|
|
|
Tuple of strings listing the field names. Useful for introspection
|
2007-10-04 23:47:07 -03:00
|
|
|
and for creating new named tuple types from existing named tuples.
|
2007-09-20 00:03:43 -03:00
|
|
|
|
2008-03-22 18:06:20 -03:00
|
|
|
.. doctest::
|
2007-09-18 19:18:02 -03:00
|
|
|
|
2007-12-13 22:49:47 -04:00
|
|
|
>>> p._fields # view the field names
|
2007-09-18 19:18:02 -03:00
|
|
|
('x', 'y')
|
|
|
|
|
2007-11-14 22:44:53 -04:00
|
|
|
>>> Color = namedtuple('Color', 'red green blue')
|
2007-12-13 22:49:47 -04:00
|
|
|
>>> Pixel = namedtuple('Pixel', Point._fields + Color._fields)
|
2007-09-18 19:18:02 -03:00
|
|
|
>>> Pixel(11, 22, 128, 255, 0)
|
2008-01-08 23:13:20 -04:00
|
|
|
Pixel(x=11, y=22, red=128, green=255, blue=0)
|
2007-09-16 21:55:00 -03:00
|
|
|
|
2007-12-14 17:51:50 -04:00
|
|
|
To retrieve a field whose name is stored in a string, use the :func:`getattr`
|
2008-03-22 18:06:20 -03:00
|
|
|
function:
|
2007-12-14 17:51:50 -04:00
|
|
|
|
|
|
|
>>> getattr(p, 'x')
|
|
|
|
11
|
|
|
|
|
2008-03-22 18:06:20 -03:00
|
|
|
To convert a dictionary to a named tuple, use the double-star-operator [#]_:
|
2007-12-18 19:51:15 -04:00
|
|
|
|
|
|
|
>>> d = {'x': 11, 'y': 22}
|
|
|
|
>>> Point(**d)
|
|
|
|
Point(x=11, y=22)
|
|
|
|
|
2007-11-14 22:44:53 -04:00
|
|
|
Since a named tuple is a regular Python class, it is easy to add or change
|
2008-01-07 00:24:49 -04:00
|
|
|
functionality with a subclass. Here is how to add a calculated field and
|
2008-03-22 18:06:20 -03:00
|
|
|
a fixed-width print format:
|
2007-11-14 22:44:53 -04:00
|
|
|
|
2008-01-07 00:24:49 -04:00
|
|
|
>>> class Point(namedtuple('Point', 'x y')):
|
2008-01-10 15:15:10 -04:00
|
|
|
... __slots__ = ()
|
2008-01-08 23:02:23 -04:00
|
|
|
... @property
|
|
|
|
... def hypot(self):
|
|
|
|
... return (self.x ** 2 + self.y ** 2) ** 0.5
|
|
|
|
... def __str__(self):
|
2008-01-10 19:00:01 -04:00
|
|
|
... return 'Point: x=%6.3f y=%6.3f hypot=%6.3f' % (self.x, self.y, self.hypot)
|
2008-01-07 00:24:49 -04:00
|
|
|
|
2008-01-10 15:15:10 -04:00
|
|
|
>>> for p in Point(3, 4), Point(14, 5/7.):
|
2008-01-08 23:02:23 -04:00
|
|
|
... print p
|
2008-01-10 19:00:01 -04:00
|
|
|
Point: x= 3.000 y= 4.000 hypot= 5.000
|
|
|
|
Point: x=14.000 y= 0.714 hypot=14.018
|
2007-11-14 22:44:53 -04:00
|
|
|
|
2008-01-27 06:47:55 -04:00
|
|
|
The subclass shown above sets ``__slots__`` to an empty tuple. This helps
|
2008-01-16 19:38:16 -04:00
|
|
|
keep memory requirements low by preventing the creation of instance dictionaries.
|
2008-01-15 16:52:42 -04:00
|
|
|
|
2008-01-07 22:24:15 -04:00
|
|
|
Subclassing is not useful for adding new, stored fields. Instead, simply
|
2008-03-22 18:06:20 -03:00
|
|
|
create a new named tuple type from the :attr:`_fields` attribute:
|
2008-01-07 22:24:15 -04:00
|
|
|
|
2008-01-10 16:37:12 -04:00
|
|
|
>>> Point3D = namedtuple('Point3D', Point._fields + ('z',))
|
2008-01-07 22:24:15 -04:00
|
|
|
|
2008-01-07 16:17:35 -04:00
|
|
|
Default values can be implemented by using :meth:`_replace` to
|
2008-03-22 18:06:20 -03:00
|
|
|
customize a prototype instance:
|
2007-11-15 18:39:34 -04:00
|
|
|
|
|
|
|
>>> Account = namedtuple('Account', 'owner balance transaction_count')
|
2008-01-18 17:14:58 -04:00
|
|
|
>>> default_account = Account('<owner name>', 0.0, 0)
|
|
|
|
>>> johns_account = default_account._replace(owner='John')
|
2007-11-15 18:39:34 -04:00
|
|
|
|
2008-05-08 04:23:30 -03:00
|
|
|
Enumerated constants can be implemented with named tuples, but it is simpler
|
|
|
|
and more efficient to use a simple class declaration:
|
|
|
|
|
|
|
|
>>> Status = namedtuple('Status', 'open pending closed')._make(range(3))
|
|
|
|
>>> Status.open, Status.pending, Status.closed
|
|
|
|
(0, 1, 2)
|
|
|
|
>>> class Status:
|
|
|
|
... open, pending, closed = range(3)
|
|
|
|
|
2007-08-30 12:03:03 -03:00
|
|
|
.. rubric:: Footnotes
|
|
|
|
|
2007-12-18 19:51:15 -04:00
|
|
|
.. [#] For information on the double-star-operator see
|
2007-08-30 12:03:03 -03:00
|
|
|
:ref:`tut-unpacking-arguments` and :ref:`calls`.
|