many types were subclassable but had a xxx_dealloc function that
called PyObject_DEL(self) directly instead of deferring to
self->ob_type->tp_free(self). It is permissible to set tp_free in the
type object directly to _PyObject_Del, for non-GC types, or to
_PyObject_GC_Del, for GC types. Still, PyObject_DEL was a tad faster,
so I'm fearing that our pystone rating is going down again. I'm not
sure if doing something like
void xxx_dealloc(PyObject *self)
{
if (PyXxxCheckExact(self))
PyObject_DEL(self);
else
self->ob_type->tp_free(self);
}
is any faster than always calling the else branch, so I haven't
attempted that -- however those types whose own dealloc is fancier
(int, float, unicode) do use this pattern.
tupledealloc(): only feed the free list when the type is really a
tuple, not a subtype. Otherwise, use PyObject_GC_Del().
_PyTuple_Resize(): disallow using this for tuple subtypes.
Gave Python linear-time repr() implementations for dicts, lists, strings.
This means, e.g., that repr(range(50000)) is no longer 50x slower than
pprint.pprint() in 2.2 <wink>.
I don't consider this a bugfix candidate, as it's a performance boost.
Added _PyString_Join() to the internal string API. If we want that in the
public API, fine, but then it requires runtime error checks instead of
asserts.
1. Omit the early-out EQ/NE "lengths different?" test. Was unable to find
any real code where it triggered, but it always costs. The same is not
true of list richcmps, where different-size lists appeared to get
compared about half the time.
2. Because tuples are immutable, there's no need to refetch the lengths of
both tuples from memory again on each loop trip.
BUG ALERT: The tuple (and list) richcmp algorithm is arguably wrong,
because it won't believe there's any difference unless Py_EQ returns false
for some corresponding elements:
>>> class C:
... def __lt__(x, y): return 1
... __eq__ = __lt__
...
>>> C() < C()
1
>>> (C(),) < (C(),)
0
>>>
That doesn't make sense -- provided you believe the defn. of C makes sense.
- tuplecontains(): call RichCompare(Py_EQ).
- Get rid of tuplecompare(), in favor of new tuplerichcompare() (a
clone of list_compare()).
- Aligned the comments for large struct initializers.
This patch modifies the type structures of objects that
participate in GC. The object's tp_basicsize is increased when
GC is enabled. GC information is prefixed to the object to
maintain binary compatibility. GC objects also define the
tp_flag Py_TPFLAGS_GC.
The following patch adds "sq_contains" support to rangeobject, and enables
the already-written support for sq_contains in listobject and tupleobject.
The rangeobject "contains" code should be a bit more efficient than the
current default "in" implementation ;-) It might not get used much, but it's
not that much to add.
listobject.c and tupleobject.c already had code for sq_contains, and the
proper struct member was set, but the PyType structure was not extended to
include tp_flags, so the object-specific code was not getting called (Go
ahead, test it ;-). I also did this for the immutable_list_type in
listobject.c, eventhough it is probably never used. Symmetry and all that.
For more comments, read the patches@python.org archives.
For documentation read the comments in mymalloc.h and objimpl.h.
(This is not exactly what Vladimir posted to the patches list; I've
made a few changes, and Vladimir sent me a fix in private email for a
problem that only occurs in debug mode. I'm also holding back on his
change to main.c, which seems unnecessary to me.)
_PyTuple_Resize(). In addition, a change suggested by Jeremy Hylton
to limit the size of the free lists is also merged into this patch.
Charles wrote initially:
"""
Test Case: run the following code:
class Nothing:
def __len__(self):
return 5
def __getitem__(self, i):
if i < 3:
return i
else:
raise IndexError, i
def g(a,*b,**c):
return
for x in xrange(1000000):
g(*Nothing())
and watch Python's memory use go up and up.
Diagnosis:
The analysis begins with the call to PySequence_Tuple at line 1641 in
ceval.c - the argument to g is seen to be a sequence but not a tuple,
so it needs to be converted from an abstract sequence to a concrete
tuple. PySequence_Tuple starts off by creating a new tuple of length
5 (line 1122 in abstract.c). Then at line 1149, since only 3 elements
were assigned, _PyTuple_Resize is called to make the 5-tuple into a
3-tuple. When we're all done the 3-tuple is decrefed, but rather than
being freed it is placed on the free_tuples cache.
The basic problem is that the 3-tuples are being added to the cache
but never picked up again, since _PyTuple_Resize doesn't make use of
the free_tuples cache. If you are resizing a 5-tuple to a 3-tuple and
there is already a 3-tuple in free_tuples[3], instead of using this
tuple, _PyTuple_Resize will realloc the 5-tuple to a 3-tuple. It
would more efficient to use the existing 3-tuple and cache the
5-tuple.
By making _PyTuple_Resize aware of the free_tuples (just as
PyTuple_New), we not only save a few calls to realloc, but also
prevent this misbehavior whereby tuples are being added to the
free_tuples list but never properly "recycled".
"""
And later:
"""
This patch replaces my submission of Sun, 16 Apr and addresses Jeremy
Hylton's suggestions that we also limit the size of the free tuple
list. I chose 2000 as the maximum number of tuples of any particular
size to save.
There was also a problem with the previous version of this patch
causing a core dump if Python was built with Py_TRACE_REFS. This is
fixed in the below version of the patch, which uses tupledealloc
instead of _Py_Dealloc.
"""
Added wrapping macros to dictobject.c, listobject.c, tupleobject.c,
frameobject.c, traceback.c that safely prevends core dumps
on stack overflow. Macros and functions in object.c, object.h.
The method is an "elevator destructor" that turns cascading
deletes into tail recursive behavior when some limit is hit.
* {tuple,list,mapping,array}object.c: call printobject with 0 for flags
* compile.c (parsestr): use quote instead of '\'' at one crucial point
* arraymodule.c (array_getattr): Added __members__ attribute
* PROTO.h, mymalloc.h: added #ifdefs for TURBOC and GNUC.
* allobjects.h: added #include "rangeobject.h"
* Grammar: added lambda_input; relaxed syntax for exec.
* bltinmodule.c: added bagof, map, reduce, lambda, xrange.
* tupleobject.[ch]: added resizetuple().
* rangeobject.[ch]: new object type to speed up range operations (not
convinced this is needed!!!)
shared. The default is to save references to the integers in
the range -1..99. The lower limit can be set by defining
NSMALLNEGINTS (absolute value of smallest integer to be saved)
and NSMALLPOSINTS (1 more than the largest integer to be
saved).
tupleobject.c: Save a reference to the empty tuple to be returned
whenever a tuple of size 0 is requested. Tuples of size 1
upto, but not including, MAXSAVESIZE (default 20) are put in
free lists when deallocated. When MAXSAVESIZE equals 1, only
share references to the empty tuple, when MAXSAVESIZE equals
0, don't include the code at all and revert to the old
behavior.
object.c: Print some more statistics when COUNT_ALLOCS is defined.
image objects, and lots of new methods.
* Added counting of allocations and deallocations of builtin types if
COUNT_ALLOCS is defined. Had to move calls to NEWREF down in some
files.
* Bug fix in sorting lists.
Added $(SYSDEF) to its build rule in Makefile.
* cgensupport.[ch], modsupport.[ch]: removed some old stuff. Also
changed files that still used it... And made several things static
that weren't but should have been... And other minor cleanups...
* listobject.[ch]: add external interfaces {set,get}listslice
* socketmodule.c: fix bugs in new send() argument parsing.
* sunaudiodevmodule.c: added flush() and close().
* Stubs for faster implementation of local variables (not yet finished)
* Added function name to code object. Print it for code and function
objects. THIS MAKES THE .PYC FILE FORMAT INCOMPATIBLE (the version
number has changed accordingly)
* Print address of self for built-in methods
* New internal functions getattro and setattro (getattr/setattr with
string object arg)
* Replaced "dictobject" with more powerful "mappingobject"
* New per-type functio tp_hash to implement arbitrary object hashing,
and hashobject() to interface to it
* Added built-in functions hash(v) and hasattr(v, 'name')
* classobject: made some functions static that accidentally weren't;
added __hash__ special instance method to implement hash()
* Added proper comparison for built-in methods and functions
must (pretend to) support all operations except assignments;
if you don't want to support an operation you have to provide
a dummy function that raises an exception...