he didn't know this), so merged in some changes I made during
review. Nothing material apart from changing a new `mask` local
from int to Py_ssize_t. Mostly this is repairing comments that
were made incorrect, and adding new comments. Also a few
minor code rewrites for clarity or helpful succinctness.
using a custom, nearly-identical macro. This probably changes how some of
these functions are compiled, which may result in fractionally slower (or
faster) execution. Considering the nature of traversal, visiting much of the
address space in unpredictable patterns, I'd argue the code readability and
maintainability is well worth it ;P
- The copy module now "copies" function objects (as atomic objects).
- dict.__getitem__ now looks for a __missing__ hook before raising
KeyError.
- Added a new type, defaultdict, to the collections module.
This uses the new __missing__ hook behavior added to dict (see above).
This gives another 30% speedup for operations such as
map(func, d.iteritems()) or list(d.iteritems()) which can both take
advantage of length information when provided.
* Split into three separate types that share everything except the
code for iternext. Saves run time decision making and allows
each iternext function to be specialized.
* Inlined PyDict_Next(). In addition to saving a function call, this
allows a redundant test to be eliminated and further specialization
of the code for the unique needs of each iterator type.
* Created a reusable result tuple for iteritems(). Saves the malloc
time for tuples when the previous result was not kept by client code
(this is the typical use case for iteritems). If the client code
does keep the reference, then a new tuple is created.
Results in a 20% to 30% speedup depending on the size and sparsity
of the dictionary.
* Factored constant structure references out of the inner loops for
PyDict_Next(), dict_keys(), dict_values(), and dict_items().
Gave measurable speedups to each (the improvement varies depending
on the sparseness of the dictionary being measured).
* Added a freelist scheme styled after that for tuples. Saves around
80% of the calls to malloc and free. About 10% of the time, the
previous dictionary was completely empty; in those cases, the
dictionary initialization with memset() can be skipped.
(Championed by Bob Ippolito.)
The update() method for mappings now accepts all the same argument forms
as the dict() constructor. This includes item lists and/or keyword
arguments.
* Increase dictionary growth rate resulting in more sparse dictionaries,
fewer lookup collisions, increased memory use, and better cache
performance. For dicts with over 50k entries, keep the current
growth rate in case an application is suffering from tight memory
constraints.
* Set the most common case (no resize) to fall-through the test.
the optional proto 2 slot state.
pickle.py, load_build(): CAUTION: Noted that cPickle's
load_build and pickle's load_build really don't do the same
things with the state, and didn't before this patch either.
cPickle never tries to do .update(), and has no backoff if
instance.__dict__ can't be retrieved. There are no tests
that can tell the difference, and part of what cPickle's
load_build() did looked accidental to me, so I don't know
what the true intent is here.
pickletester.py, test_pickle.py: Got rid of the hack for
exempting cPickle from running some of the proto 2 tests.
dictobject.c, PyDict_Next(): documented intended use.
Obtain cleaner coding and a system wide
performance boost by using the fast, pre-parsed
PyArg_Unpack function instead of PyArg_ParseTuple
function which is driven by a format string.
Most of these patches are from Thomas Heller, with long lines folded
by Tim. The change to test_descr.py is from Guido. See the bug report.
Not a bugfix candidate -- METH_CLASS is new in 2.3.
Just van Rossum showed a weird, but clever way for pure python code to
trigger the BadInternalCall. The C code had assumed that calling a class
constructor would return an instance of that class; however, classes that
abuse __new__ can invalidate that assumption.
interning. I modified Oren's patch significantly, but the basic idea
and most of the implementation is unchanged. Interned strings created
with PyString_InternInPlace() are now mortal, and you must keep a
reference to the resulting string around; use the new function
PyString_InternImmortal() to create immortal interned strings.
The staticforward define was needed to support certain broken C
compilers (notably SCO ODT 3.0, perhaps early AIX as well) botched the
static keyword when it was used with a forward declaration of a static
initialized structure. Standard C allows the forward declaration with
static, and we've decided to stop catering to broken C compilers. (In
fact, we expect that the compilers are all fixed eight years later.)
I'm leaving staticforward and statichere defined in object.h as
static. This is only for backwards compatibility with C extensions
that might still use it.
XXX I haven't updated the documentation.
di_dict field when the end of the list is reached. Also make the
error ("dictionary changed size during iteration") a sticky state.
Also remove the next() method -- one is supplied automatically by
PyType_Ready() because the tp_iternext slot is set. That's a good
thing, because the implementation given here was buggy (it never
raised StopIteration).
PEP 285. Everything described in the PEP is here, and there is even
some documentation. I had to fix 12 unit tests; all but one of these
were printing Boolean outcomes that changed from 0/1 to False/True.
(The exception is test_unicode.py, which did a type(x) == type(y)
style comparison. I could've fixed that with a single line using
issubtype(x, type(y)), but instead chose to be explicit about those
places where a bool is expected.
Still to do: perhaps more documentation; change standard library
modules to return False/True from predicates.
The fix makes it possible to call PyObject_GC_UnTrack() more than once
on the same object, and then move the PyObject_GC_UnTrack() call to
*before* the trashcan code is invoked.
BUGFIX CANDIDATE!
PyDict_UpdateFromSeq2(): removed it.
PyDict_MergeFromSeq2(): made it public and documented it.
PyDict_Merge() docs: updated to reveal <wink> that the second
argument can be any mapping object.
Rather than tweaking the inheritance of type object slots (which turns
out to be too messy to try), this fix adds a __hash__ to the list and
dict types (the only mutable types I'm aware of) that explicitly
raises an error. This has the advantage that list.__hash__([]) also
raises an error (previously, this would invoke object.__hash__([]),
returning the argument's address); ditto for dict.__hash__.
The disadvantage for this fix is that 3rd party mutable types aren't
automatically fixed. This should be added to the rules for creating
subclassable extension types: if you don't want your object to be
hashable, add a tp_hash function that raises an exception.
Also, it's possible that I've forgotten about other mutable types for
which this should be done.
outer level, the iterator protocol is used for memory-efficiency (the
outer sequence may be very large if fully materialized); at the inner
level, PySequence_Fast() is used for time-efficiency (these should
always be sequences of length 2).
dictobject.c, new functions PyDict_{Merge,Update}FromSeq2. These are
wholly analogous to PyDict_{Merge,Update}, but process a sequence-of-2-
sequences argument instead of a mapping object. For now, I left these
functions file static, so no corresponding doc changes. It's tempting
to change dict.update() to allow a sequence-of-2-seqs argument too.
Also changed the name of dictionary's keyword argument from "mapping"
to "x". Got a better name? "mapping_or_sequence_of_pairs" isn't
attractive, although more so than "mosop" <wink>.
abstract.h, abstract.tex: Added new PySequence_Fast_GET_SIZE function,
much faster than going thru the all-purpose PySequence_Size.
libfuncs.tex:
- Document dictionary().
- Fiddle tuple() and list() to admit that their argument is optional.
- The long-winded repetitions of "a sequence, a container that supports
iteration, or an iterator object" is getting to be a PITA. Many
months ago I suggested factoring this out into "iterable object",
where the definition of that could include being explicit about
generators too (as is, I'm not sure a reader outside of PythonLabs
could guess that "an iterator object" includes a generator call).
- Please check my curly braces -- I'm going blind <0.9 wink>.
abstract.c, PySequence_Tuple(): When PyObject_GetIter() fails, leave
its error msg alone now (the msg it produces has improved since
PySequence_Tuple was generalized to accept iterable objects, and
PySequence_Tuple was also stomping on the msg in cases it shouldn't
have even before PyObject_GetIter grew a better msg).
many types were subclassable but had a xxx_dealloc function that
called PyObject_DEL(self) directly instead of deferring to
self->ob_type->tp_free(self). It is permissible to set tp_free in the
type object directly to _PyObject_Del, for non-GC types, or to
_PyObject_GC_Del, for GC types. Still, PyObject_DEL was a tad faster,
so I'm fearing that our pystone rating is going down again. I'm not
sure if doing something like
void xxx_dealloc(PyObject *self)
{
if (PyXxxCheckExact(self))
PyObject_DEL(self);
else
self->ob_type->tp_free(self);
}
is any faster than always calling the else branch, so I haven't
attempted that -- however those types whose own dealloc is fancier
(int, float, unicode) do use this pattern.
keys are true strings -- no subclasses need apply. This may be debatable.
The problem is that a str subclass may very well want to override __eq__
and/or __hash__ (see the new example of case-insensitive strings in
test_descr), but go-fast shortcuts for strings are ubiquitous in our dicts
(and subclass overrides aren't even looked for then). Another go-fast
reason for the change is that PyCheck_StringExact() is a quicker test
than PyCheck_String(), and we make such a test on virtually every access
to every dict.
OTOH, a str subclass may also be perfectly happy using the base str eq
and hash, and this change slows them a lot. But those cases are still
hypothetical, while Python's own reliance on true-string dicts is not.
mapping object", in the same sense dict.update(x) requires of x (that x
has a keys() method and a getitem).
Questionable: The other type constructors accept a keyword argument, so I
did that here too (e.g., dictionary(mapping={1:2}) works). But type_call
doesn't pass the keyword args to the tp_new slot (it passes NULL), it only
passes them to the tp_init slot, so getting at them required adding a
tp_init slot to dicts. Looks like that makes the normal case (i.e., no
args at all) a little slower (the time it takes to call dict.tp_init and
have it figure out there's nothing to do).
"mapping" object, specifically one that supports PyMapping_Keys() and
PyObject_GetItem(). This allows you to say e.g. {}.update(UserDict())
We keep the special case for concrete dict objects, although that
seems moderately questionable. OTOH, the code exists and works, so
why change that?
.update()'s docstring already claims that D.update(E) implies calling
E.keys() so it's appropriate not to transform AttributeErrors in
PyMapping_Keys() to TypeErrors.
Patch eyeballed by Tim.
Gave Python linear-time repr() implementations for dicts, lists, strings.
This means, e.g., that repr(range(50000)) is no longer 50x slower than
pprint.pprint() in 2.2 <wink>.
I don't consider this a bugfix candidate, as it's a performance boost.
Added _PyString_Join() to the internal string API. If we want that in the
public API, fine, but then it requires runtime error checks instead of
asserts.
frequently used, and in particular this allows to drop the last
remaining obvious time-waster in the crucial lookdict() and
lookdict_string() functions. Other changes consist mostly of changing
"i < ma_size" to "i <= ma_mask" everywhere.
be possible to provoke unbounded recursion now, but leaving that to someone
else to provoke and repair.
Bugfix candidate -- although this is getting harder to backstitch, and the
cases it's protecting against are mondo contrived.
code, less memory. Tests have uncovered no drawbacks. Christian and
Vladimir are the other two people who have burned many brain cells on the
dict code in recent years, and they like the approach too, so I'm checking
it in without further ado.
instead of multiplication to generate the probe sequence. The idea is
recorded in Python-Dev for Dec 2000, but that version is prone to rare
infinite loops.
The value is in getting *all* the bits of the hash code to participate;
and, e.g., this speeds up querying every key in a dict with keys
[i << 16 for i in range(20000)] by a factor of 500. Should be equally
valuable in any bad case where the high-order hash bits were getting
ignored.
Also wrote up some of the motivations behind Python's ever-more-subtle
hash table strategy.