Make "hashable" a glossary entry and clarify docs on __cmp__, __eq__ and __hash__.

I hope the concept of hashability is easier to understand now.
Thanks to Tim Hatch for pointing out the flaws here.
Georg Brandl 2007-11-02 20:06:17 +00:00
parent 03fd077482
commit 7c3e79f67f
9 changed files with 83 additions and 60 deletions

View File

@ -2231,8 +2231,8 @@ Dictionary Objects
.. cfunction:: int PyDict_SetItem(PyObject *p, PyObject *key, PyObject *val)
Insert *val* into the dictionary *p* with a key of *key*. *key* must be
hashable; if it isn't, :exc:`TypeError` will be raised. Return ``0`` on success
or ``-1`` on failure.
:term:`hashable`; if it isn't, :exc:`TypeError` will be raised. Return ``0``
on success or ``-1`` on failure.
.. cfunction:: int PyDict_SetItemString(PyObject *p, const char *key, PyObject *val)

View File

@ -153,6 +153,20 @@ Glossary
in the past to create a "free-threaded" interpreter (one which locks
shared data at a much finer granularity), but performance suffered in the
common single-processor case.
hashable
An object is *hashable* if it has a hash value that never changes during
its lifetime (it needs a :meth:`__hash__` method), and can be compared to
other objects (it needs an :meth:`__eq__` or :meth:`__cmp__` method).
Hashable objects that compare equal must have the same hash value.
Hashability makes an object usable as a dictionary key and a set member,
because these data structures use the hash value internally.
All of Python's immutable built-in objects are hashable, while all mutable
containers (such as lists or dictionaries) are not. Objects that are
instances of user-defined classes are hashable by default; they all
compare unequal (except with themselves), and their hash value is their
:func:`id`.
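The glossary entry above can be checked interactively. The following is a minimal sketch, assuming a current Python 3 interpreter; ``is_hashable`` and ``Plain`` are hypothetical names introduced only for this illustration.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds (hypothetical helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


class Plain(object):
    """A user-defined class that defines no comparison or hash methods."""


a, b = Plain(), Plain()

results = {
    "tuple": is_hashable((1, 2)),    # immutable built-in
    "str": is_hashable("spam"),      # immutable built-in
    "list": is_hashable([1, 2]),     # mutable container
    "dict": is_hashable({"k": 1}),   # mutable container
    "instance": is_hashable(a),      # hashable by default
}

# Two distinct instances compare unequal; an instance equals itself.
distinct_instances_equal = (a == b)
same_instance_equal = (a == a)
```

Note that the sketch does not check the exact hash value of an instance, since how it is derived from :func:`id` is an implementation detail.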
IDLE
An Integrated Development Environment for Python. IDLE is a basic editor

View File

@ -262,7 +262,7 @@ compared to an object of a different type, :exc:`TypeError` is raised unless the
comparison is ``==`` or ``!=``. The latter cases return :const:`False` or
:const:`True`, respectively.
:class:`timedelta` objects are hashable (usable as dictionary keys), support
:class:`timedelta` objects are :term:`hashable` (usable as dictionary keys), support
efficient pickling, and in Boolean contexts, a :class:`timedelta` object is
considered to be true if and only if it isn't equal to ``timedelta(0)``.
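As a brief illustration of both properties, the sketch below uses :class:`timedelta` objects as dictionary keys and tests them in a Boolean context. The dictionary contents are made up for the example.

```python
from datetime import timedelta

# timedelta objects are hashable, so they can serve as dictionary keys.
durations = {timedelta(hours=1): "short", timedelta(days=1): "long"}

# 60 minutes compares equal to 1 hour, so it finds the same entry.
label = durations[timedelta(minutes=60)]

# Only a zero-length timedelta is false in a Boolean context.
zero_is_false = not timedelta(0)
nonzero_is_true = bool(timedelta(seconds=1))
```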

View File

@ -20,7 +20,7 @@ diffs. For comparing directories and files, see also, the :mod:`filecmp` module.
.. class:: SequenceMatcher
This is a flexible class for comparing pairs of sequences of any type, so long
as the sequence elements are hashable. The basic algorithm predates, and is a
as the sequence elements are :term:`hashable`. The basic algorithm predates, and is a
little fancier than, an algorithm published in the late 1980's by Ratcliff and
Obershelp under the hyperbolic name "gestalt pattern matching." The idea is to
find the longest contiguous matching subsequence that contains no "junk"
@ -313,7 +313,7 @@ The :class:`SequenceMatcher` class has this constructor:
on blanks or hard tabs.
The optional arguments *a* and *b* are sequences to be compared; both default to
empty strings. The elements of both sequences must be hashable.
empty strings. The elements of both sequences must be :term:`hashable`.
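The hashability requirement can be seen in a small sketch. It assumes the CPython implementation, which indexes the elements of the second sequence in a dictionary; the sequences themselves are arbitrary sample data.

```python
from difflib import SequenceMatcher

# Tuples are hashable, so sequences of tuples can be compared.
a = [(1, 2), (3, 4)]
b = [(1, 2), (5, 6)]
ratio = SequenceMatcher(None, a, b).ratio()  # one of two elements matches

# Lists are not hashable, so sequences of lists are rejected.
try:
    SequenceMatcher(None, [[1, 2]], [[1, 2]])
    unhashable_rejected = False
except TypeError:
    unhashable_rejected = True
```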
:class:`SequenceMatcher` objects have the following methods:

View File

@ -60,7 +60,7 @@ Bookkeeping functions:
.. function:: seed([x])
Initialize the basic random number generator. Optional argument *x* can be any
hashable object. If *x* is omitted or ``None``, current system time is used;
:term:`hashable` object. If *x* is omitted or ``None``, current system time is used;
current system time is also used to initialize the generator when the module is
first imported. If randomness sources are provided by the operating system,
they are used instead of the system time (see the :func:`os.urandom` function
@ -165,7 +165,7 @@ Functions for sequences:
(the sample) to be partitioned into grand prize and second place winners (the
subslices).
Members of the population need not be hashable or unique. If the population
Members of the population need not be :term:`hashable` or unique. If the population
contains repeats, then each occurrence is a possible selection in the sample.
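A short sketch of sampling from a population whose members are neither hashable nor unique. The population is invented for the example, and a string is used as the seed because strings are accepted as seeds by current Python 3 releases as well.

```python
import random

random.seed("demo")  # a string is a hashable seed value

# Members are lists (unhashable), and [1] occurs twice.
population = [[1], [1], [2]]

# Each occurrence is a separate candidate, so [1] may be picked twice.
picked = random.sample(population, 2)
```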
To choose a sample from a range of integers, use an :func:`xrange` object as an

View File

@ -1419,7 +1419,7 @@ Set Types --- :class:`set`, :class:`frozenset`
.. index:: object: set
A :dfn:`set` object is an unordered collection of distinct hashable objects.
A :dfn:`set` object is an unordered collection of distinct :term:`hashable` objects.
Common uses include membership testing, removing duplicates from a sequence, and
computing mathematical operations such as intersection, union, difference, and
symmetric difference.
@ -1438,7 +1438,7 @@ There are currently two builtin set types, :class:`set` and :class:`frozenset`.
The :class:`set` type is mutable --- the contents can be changed using methods
like :meth:`add` and :meth:`remove`. Since it is mutable, it has no hash value
and cannot be used as either a dictionary key or as an element of another set.
The :class:`frozenset` type is immutable and hashable --- its contents cannot be
The :class:`frozenset` type is immutable and :term:`hashable` --- its contents cannot be
altered after it is created; it can therefore be used as a dictionary key or as
an element of another set.
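The difference between the two set types can be sketched as follows; the variable names are illustrative only.

```python
inner = frozenset([1, 2])

# A frozenset is hashable: it can be a set element or a dictionary key.
nested = set([inner])
lookup = {inner: "frozen"}

# A mutable set has no hash value and is rejected as an element.
try:
    set([set([1, 2])])
    mutable_set_rejected = False
except TypeError:
    mutable_set_rejected = True
```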
@ -1538,8 +1538,7 @@ or ``a>b``. Accordingly, sets do not implement the :meth:`__cmp__` method.
Since sets only define partial ordering (subset relationships), the output of
the :meth:`list.sort` method is undefined for lists of sets.
Set elements are like dictionary keys; they need to define both :meth:`__hash__`
and :meth:`__eq__` methods.
Set elements, like dictionary keys, must be :term:`hashable`.
Binary operations that mix :class:`set` instances with :class:`frozenset` return
the type of the first operand. For example: ``frozenset('ab') | set('bc')``
@ -1619,21 +1618,20 @@ Mapping Types --- :class:`dict`
statement: del
builtin: len
A :dfn:`mapping` object maps immutable values to arbitrary objects. Mappings
are mutable objects. There is currently only one standard mapping type, the
:dfn:`dictionary`.
(For other containers see the built-in :class:`list`,
:class:`set`, and :class:`tuple` classes, and the :mod:`collections`
module.)
A :dfn:`mapping` object maps :term:`hashable` values to arbitrary objects.
Mappings are mutable objects. There is currently only one standard mapping
type, the :dfn:`dictionary`. (For other containers see the built-in
:class:`list`, :class:`set`, and :class:`tuple` classes, and the
:mod:`collections` module.)
A dictionary's keys are *almost* arbitrary values. Only
values containing lists, dictionaries or other mutable types (that are compared
by value rather than by object identity) may not be used as keys. Numeric types
used for keys obey the normal rules for numeric comparison: if two numbers
compare equal (such as ``1`` and ``1.0``) then they can be used interchangeably
to index the same dictionary entry. (Note however, that since computers
store floating-point numbers as approximations it is usually unwise to
use them as dictionary keys.)
A dictionary's keys are *almost* arbitrary values. Values that are not
:term:`hashable`, that is, values containing lists, dictionaries or other
mutable types (that are compared by value rather than by object identity) may
not be used as keys. Numeric types used for keys obey the normal rules for
numeric comparison: if two numbers compare equal (such as ``1`` and ``1.0``)
then they can be used interchangeably to index the same dictionary entry. (Note,
however, that since computers store floating-point numbers as approximations, it
is usually unwise to use them as dictionary keys.)
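The key rules in this paragraph can be tried out directly. This is a minimal sketch with made-up values:

```python
d = {1: "int"}

# 1 == 1.0 and both hash alike, so this overwrites the existing entry.
d[1.0] = "float"

# A list is not hashable and cannot be a key.
try:
    d[[1, 2]] = "list"
    list_key_rejected = False
except TypeError:
    list_key_rejected = True

# A tuple is only hashable if everything it contains is hashable.
try:
    d[(1, [2])] = "tuple holding a list"
    nested_list_rejected = False
except TypeError:
    nested_list_rejected = True
```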
Dictionaries can be created by placing a comma-separated list of ``key: value``
pairs within braces, for example: ``{'jack': 4098, 'sjoerd': 4127}`` or ``{4098:

View File

@ -87,7 +87,7 @@ Extension types can easily be made to support weak references; see
but cannot be propagated; they are handled in exactly the same way as exceptions
raised from an object's :meth:`__del__` method.
Weak references are hashable if the *object* is hashable. They will maintain
Weak references are :term:`hashable` if the *object* is hashable. They will maintain
their hash value even after the *object* was deleted. If :func:`hash` is called
the first time only after the *object* was deleted, the call will raise
:exc:`TypeError`.
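The behaviour described above can be sketched as follows. The example assumes CPython, where dropping the last strong reference reclaims the object at once (``gc.collect()`` is called as a precaution); ``Node`` is a hypothetical class.

```python
import gc
import weakref


class Node(object):
    """Hypothetical referent; instances are hashable by default."""


# Hash taken while the referent is alive: it is kept afterwards.
first = Node()
ref_first = weakref.ref(first)
hash_before = hash(ref_first)
del first
gc.collect()
hash_after = hash(ref_first)

# Hash requested for the first time after the referent is gone.
second = Node()
ref_second = weakref.ref(second)
del second
gc.collect()
try:
    hash(ref_second)
    late_hash_failed = False
except TypeError:
    late_hash_failed = True
```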
@ -108,7 +108,7 @@ Extension types can easily be made to support weak references; see
the proxy in most contexts instead of requiring the explicit dereferencing used
with weak reference objects. The returned object will have a type of either
``ProxyType`` or ``CallableProxyType``, depending on whether *object* is
callable. Proxy objects are not hashable regardless of the referent; this
callable. Proxy objects are not :term:`hashable` regardless of the referent; this
avoids a number of problems related to their fundamentally mutable nature, and
prevents their use as dictionary keys. *callback* is the same as the parameter
of the same name to the :func:`ref` function.

View File

@ -409,9 +409,10 @@ Set types
Frozen sets
.. index:: object: frozenset
These represent an immutable set. They are created by the built-in
:func:`frozenset` constructor. As a frozenset is immutable and hashable, it can
be used again as an element of another set, or as a dictionary key.
These represent an immutable set. They are created by the built-in
:func:`frozenset` constructor. As a frozenset is immutable and
:term:`hashable`, it can be used again as an element of another set, or as
a dictionary key.
.. % Set types
@ -1315,6 +1316,9 @@ Basic customization
.. versionadded:: 2.1
.. index::
single: comparisons
These are the so-called "rich comparison" methods, and are called for comparison
operators in preference to :meth:`__cmp__` below. The correspondence between
operator symbols and method names is as follows: ``x<y`` calls ``x.__lt__(y)``,
@ -1329,14 +1333,16 @@ Basic customization
context (e.g., in the condition of an ``if`` statement), Python will call
:func:`bool` on the value to determine if the result is true or false.
There are no implied relationships among the comparison operators. The truth of
``x==y`` does not imply that ``x!=y`` is false. Accordingly, when defining
:meth:`__eq__`, one should also define :meth:`__ne__` so that the operators will
behave as expected.
There are no implied relationships among the comparison operators. The truth
of ``x==y`` does not imply that ``x!=y`` is false. Accordingly, when
defining :meth:`__eq__`, one should also define :meth:`__ne__` so that the
operators will behave as expected. See the paragraph on :meth:`__hash__` for
some important notes on creating :term:`hashable` objects which support
custom comparison operations and are usable as dictionary keys.
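A minimal sketch of a class that defines :meth:`__eq__` together with :meth:`__ne__` and a matching :meth:`__hash__`. ``Version`` and its attributes are hypothetical. Current Python 3 releases derive ``!=`` from :meth:`__eq__` when :meth:`__ne__` is missing, but defining both is harmless there and is what the text above asks for.

```python
class Version(object):
    """Hypothetical value type compared by (major, minor)."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __ne__(self, other):
        # Defined explicitly so that == and != always agree.
        result = self.__eq__(other)
        if result is NotImplemented:
            return result
        return not result

    def __hash__(self):
        # Equal objects must have equal hash values.
        return hash(self._key())


same = Version(1, 2) == Version(1, 2)
differs = Version(1, 2) != Version(1, 3)
```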
There are no reflected (swapped-argument) versions of these methods (to be used
when the left argument does not support the operation but the right argument
does); rather, :meth:`__lt__` and :meth:`__gt__` are each other's reflection,
There are no swapped-argument versions of these methods (to be used when the
left argument does not support the operation but the right argument does);
rather, :meth:`__lt__` and :meth:`__gt__` are each other's reflection,
:meth:`__le__` and :meth:`__ge__` are each other's reflection, and
:meth:`__eq__` and :meth:`__ne__` are their own reflection.
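The reflection rule can be observed with two small classes. This is only a sketch; ``Left``, ``Right`` and the ``calls`` list are hypothetical names used to record which method runs.

```python
calls = []


class Left(object):
    """Defines no ordering methods of its own."""


class Right(object):
    def __gt__(self, other):
        calls.append("Right.__gt__")
        return True


# Left does not support <, so Python tries the reflected operation,
# Right.__gt__, with the arguments swapped.
result = Left() < Right()
```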
@ -1349,14 +1355,15 @@ Basic customization
builtin: cmp
single: comparisons
Called by comparison operations if rich comparison (see above) is not defined.
Should return a negative integer if ``self < other``, zero if ``self == other``,
a positive integer if ``self > other``. If no :meth:`__cmp__`, :meth:`__eq__`
or :meth:`__ne__` operation is defined, class instances are compared by object
identity ("address"). See also the description of :meth:`__hash__` for some
important notes on creating objects which support custom comparison operations
and are usable as dictionary keys. (Note: the restriction that exceptions are
not propagated by :meth:`__cmp__` has been removed since Python 1.5.)
Called by comparison operations if rich comparison (see above) is not
defined. Should return a negative integer if ``self < other``, zero if
``self == other``, a positive integer if ``self > other``. If no
:meth:`__cmp__`, :meth:`__eq__` or :meth:`__ne__` operation is defined, class
instances are compared by object identity ("address"). See also the
description of :meth:`__hash__` for some important notes on creating
:term:`hashable` objects which support custom comparison operations and are
usable as dictionary keys. (Note: the restriction that exceptions are not
propagated by :meth:`__cmp__` has been removed since Python 1.5.)
.. method:: object.__rcmp__(self, other)
@ -1371,25 +1378,29 @@ Basic customization
object: dictionary
builtin: hash
Called for the key object for dictionary operations, and by the built-in
function :func:`hash`. Should return a 32-bit integer usable as a hash value
Called for the key object for dictionary operations, and by the built-in
function :func:`hash`. Should return an integer usable as a hash value
for dictionary operations. The only required property is that objects which
compare equal have the same hash value; it is advised to somehow mix together
(e.g., using exclusive or) the hash values for the components of the object that
also play a part in comparison of objects. If a class does not define a
:meth:`__cmp__` method it should not define a :meth:`__hash__` operation either;
if it defines :meth:`__cmp__` or :meth:`__eq__` but not :meth:`__hash__`, its
instances will not be usable as dictionary keys. If a class defines mutable
objects and implements a :meth:`__cmp__` or :meth:`__eq__` method, it should not
implement :meth:`__hash__`, since the dictionary implementation requires that a
key's hash value is immutable (if the object's hash value changes, it will be in
the wrong hash bucket).
also play a part in comparison of objects.
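The advice about mixing component hashes might look like the sketch below. ``Point`` is a hypothetical class. Exclusive or is used because the text suggests it, although it makes ``Point(1, 2)`` and ``Point(2, 1)`` share a hash value; hashing a tuple of the components is a common alternative.

```python
class Point(object):
    """Hypothetical immutable-by-convention 2D point."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __ne__(self, other):
        result = self.__eq__(other)
        if result is NotImplemented:
            return result
        return not result

    def __hash__(self):
        # Mix the hashes of exactly the components used by __eq__.
        return hash(self.x) ^ hash(self.y)


p, q = Point(1, 2), Point(1, 2)
table = {p: "found"}
value = table[q]  # q equals p and has the same hash, so it finds p's entry
```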
If a class does not define a :meth:`__cmp__` or :meth:`__eq__` method it
should not define a :meth:`__hash__` operation either; if it defines
:meth:`__cmp__` or :meth:`__eq__` but not :meth:`__hash__`, its instances
will not be usable as dictionary keys. If a class defines mutable objects
and implements a :meth:`__cmp__` or :meth:`__eq__` method, it should not
implement :meth:`__hash__`, since the dictionary implementation requires that
a key's hash value is immutable (if the object's hash value changes, it will
be in the wrong hash bucket).
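The "wrong hash bucket" problem can be reproduced with a deliberately flawed class. This is a sketch of what not to do; ``Box`` is hypothetical, and the lookup result relies on CPython's dictionary using the hash value to locate entries.

```python
class Box(object):
    """Flawed on purpose: mutable, yet hashed by its contents."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Box) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


box = Box(1)
table = {box: "stored"}

box.value = 2  # the hash value changes while the key is in the dictionary

# The entry is still there, but lookups now search the wrong bucket.
found_after_mutation = box in table
still_listed = any(key is box for key in table)
```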
User-defined classes have :meth:`__cmp__` and :meth:`__hash__` methods
by default; with them, all objects compare unequal (except with themselves)
and ``x.__hash__()`` returns ``id(x)``.
.. versionchanged:: 2.5
:meth:`__hash__` may now also return a long integer object; the 32-bit integer
is then derived from the hash of that object.
.. index:: single: __cmp__() (object method)
:meth:`__hash__` may now also return a long integer object; the 32-bit
integer is then derived from the hash of that object.
.. method:: object.__nonzero__(self)

View File

@ -276,7 +276,7 @@ the corresponding datum.
.. index:: pair: immutable; object
Restrictions on the types of the key values are listed earlier in section
:ref:`types`. (To summarize, the key type should be hashable, which excludes
:ref:`types`. (To summarize, the key type should be :term:`hashable`, which excludes
all mutable objects.) Clashes between duplicate keys are not detected; the last
datum (textually rightmost in the display) stored for a given key value
prevails.
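For instance, a dictionary display that repeats a key is accepted silently, and the rightmost datum wins. The keys and values here are arbitrary:

```python
# "spam" appears twice; no error is raised and the last value is kept.
display = {"spam": 1, "eggs": 2, "spam": 3}
```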