the purpose. Increased my claim to two bits, hoping that nobody
will complain about it. I'm taking the highest two bits, whatever
the integer word size may be.
new line.
New pvt API function _Py_PrintReferenceAddresses(): Prints only the
addresses and refcnts of the live objects. This is always safe to call,
because it has no dependence on Python's C API.
Py_Finalize(): If envar PYTHONDUMPREFS is set, call (the new)
_Py_PrintReferenceAddresses() right before dumping final pymalloc stats.
We can't print the reprs of the objects here because too much of the
interpreter has been shut down. You need to correlate the addresses
displayed here with the object reprs printed by the earlier
PYTHONDUMPREFS call to _Py_PrintReferences().
Arranged that all the objects exposed by __builtin__ appear in the list
of all objects. I basically peed away two days tracking down a mystery
leak in sys.gettotalrefcount() in a ZODB app (== tons of code), because
the object leaking the references didn't appear in the sys.getobjects(0)
list. The object happened to be False. Now False is in the list, along
with other popular & previously missing leak candidates (like None).
Alas, we still don't have a choke point covering *all* Python objects,
so the list of all objects may still be incomplete.
_Py_AddToAllObjects() that simply inserts an object at the front of
the doubly-linked list of all objects. Changed PyType_Ready() (the
closest thing we've got to a choke point for type objects) to call
that.
Py_Init crash". refchain cannot be cleared because objects can live across
Py_Finalize() and Py_Initialize() if they are kept alive by circular
references.
a lot of work: it had to save and restore the current exception around
a call to lookup_maybe(), because that could fail in rare cases, and
most objects don't have a __del__ method, so the whole exercise was
usually a waste of time. Changed this to cache the __del__ method in
the type object just like all other special methods, in a new slot
tp_del. So now subtype_dealloc() can test whether tp_del is NULL and
skip the whole exercise if it is. The new slot doesn't need a new
flag bit: subtype_dealloc() is only called if the type was dynamically
allocated by type_new(), so it's guaranteed to have all current slots.
Types defined in C cannot fill in tp_del with a function of their own,
so there's no corresponding "wrapper". (That functionality is already
available through tp_dealloc.)
The staticforward define was needed to support certain broken C
compilers (notably SCO ODT 3.0, perhaps early AIX as well) botched the
static keyword when it was used with a forward declaration of a static
initialized structure. Standard C allows the forward declaration with
static, and we've decided to stop catering to broken C compilers. (In
fact, we expect that the compilers are all fixed eight years later.)
I'm leaving staticforward and statichere defined in object.h as
static. This is only for backwards compatibility with C extensions
that might still use it.
XXX I haven't updated the documentation.
helper macros to something saner, and used them appropriately in other
files too, to reduce #ifdef blocks.
classobject.c, instance_dealloc(): One of my worst Python Memories is
trying to fix this routine a few years ago when COUNT_ALLOCS was defined
but Py_TRACE_REFS wasn't. The special-build code here is way too
complicated. Now it's much simpler. Difference: in a Py_TRACE_REFS
build, the instance is no longer in the doubly-linked list of live
objects while its __del__ method is executing, and that may be visible
via sys.getobjects() called from a __del__ method. Tough -- the object
is presumed dead while its __del__ is executing anyway, and not calling
_Py_NewReference() at the start allows enormous code simplification.
typeobject.c, call_finalizer(): The special-build instance_dealloc()
pain apparently spread to here too via cut-'n-paste, and this is much
simpler now too. In addition, I didn't understand why this routine
was calling _PyObject_GC_TRACK() after a resurrection, since there's no
plausible way _PyObject_GC_UNTRACK() could have been called on the
object by this point. I suspect it was left over from pasting the
instance_delloc() code. Instead asserted that the object is still
tracked. Caution: I suspect we don't have a test that actually
exercises the subtype_dealloc() __del__-resurrected-me code.
that their uses can be prettier. I've come to despise the names I picked
for these things, though, and expect to change all of them -- I changed
a bunch of other files to use them (replacing #ifdef blocks), but the
names were so obscure out of context that I backed that all out again.
more trivial lexical helper macros so that uses of these guys expand
to nothing at all when they're not enabled. This should help sub-
standard compilers that can't do a good job of optimizing away the
previous "(void)0" expressions.
Py_DECREF: There's only one definition of this now. Yay! That
was that last one in the family defined multiple times in an #ifdef
maze.
Py_FatalError(): Changed the char* signature to const char*.
_Py_NegativeRefcount(): New helper function for the Py_REF_DEBUG
expansion of Py_DECREF. Calling an external function cuts down on
the volume of generated code. The previous inline expansion of abort()
didn't work as intended on Windows (the program often kept going, and
the error msg scrolled off the screen unseen). _Py_NegativeRefcount
calls Py_FatalError instead, which captures our best knowledge of
how to abort effectively across platforms.
that have taken me "too long" to reverse-engineer over the years.
Vastly reduced the nesting level and redundancy of #ifdef-ery.
Took a light stab at repairing comments that are no longer true.
sys_gettotalrefcount(): Changed to enable under Py_REF_DEBUG.
It was enabled under Py_TRACE_REFS, which was much heavier than
necessary. sys.gettotalrefcount() is now available in a
Py_REF_DEBUG-only build.
mechanism is no longer evil: it no longer plays dangerous games with
the type pointer or refcounts, and objects in extension modules can play
along too without needing to edit the core first.
Rewrote all the comments to explain this, and (I hope) give clear
guidance to extension authors who do want to play along. Documented
all the functions. Added more asserts (it may no longer be evil, but
it's still dangerous <0.9 wink>). Rearranged the generated code to
make it clearer, and to tolerate either the presence or absence of a
semicolon after the macros. Rewrote _PyTrash_destroy_chain() to call
tp_dealloc directly; it was doing a Py_DECREF again, and that has all
sorts of obscure distorting effects in non-release builds (Py_DECREF
was already called on the object!). Removed Christian's little "embedded
change log" comments -- that's what checkin messages are for, and since
it was impossible to correlate the comments with the code that changed,
I found them merely distracting.
PEP 285. Everything described in the PEP is here, and there is even
some documentation. I had to fix 12 unit tests; all but one of these
were printing Boolean outcomes that changed from 0/1 to False/True.
(The exception is test_unicode.py, which did a type(x) == type(y)
style comparison. I could've fixed that with a single line using
issubtype(x, type(y)), but instead chose to be explicit about those
places where a bool is expected.
Still to do: perhaps more documentation; change standard library
modules to return False/True from predicates.
type.__module__ behavior.
This adds the module name and a dot in front of the type name in every
type object initializer, except for built-in types (and those that
already had this). Note that it touches lots of Mac modules -- I have
no way to test these but the changes look right. Apologies if they're
not. This also touches the weakref docs, which contains a sample type
object initializer. It also touches the mmap test output, because the
mmap type's repr is included in that output. It touches object.h to
put the correct description in a comment.
object.h: Added PyType_CheckExact macro.
typeobject.c, type_new():
+ Use the new macro.
+ Assert that the arguments have the right types rather than do incomplete
runtime checks "sometimes".
+ If this isn't the 1-argument flavor() of type, and there aren't 3 args
total, produce a "types() takes 1 or 3 args" msg before
PyArg_ParseTupleAndKeywords produces a "takes exactly 3" msg.
is a list of weak references to types (new-style classes). Make this
accessible to Python as the function __subclasses__ which returns a
list of types -- we don't want Python programmers to be able to
manipulate the raw list.
In order to make this possible, I also had to add weak reference
support to type objects.
This will eventually be used together with a trap on attribute
assignment for dynamic classes for a major speed-up without losing the
dynamic properties of types: when a __foo__ method is added to a
class, the class and all its subclasses will get an appropriate tp_foo
slot function.
instances).
Also added GC support to various auxiliary types: super, property,
descriptors, wrappers, dictproxy. (Only type objects have a tp_clear
field; the other types are.)
One change was necessary to the GC infrastructure. We have statically
allocated type objects that don't have a GC header (and can't easily
be given one) and heap-allocated type objects that do have a GC
header. Giving these different metatypes would be really ugly: I
tried, and I had to modify pickle.py, cPickle.c, copy.py, add a new
invent a new name for the new metatype and make it a built-in, change
affected tests... In short, a mess. So instead, we add a new type
slot tp_is_gc, which is a simple Boolean function that determines
whether a particular instance has GC headers or not. This slot is
only relevant for types that have the (new) GC flag bit set. If the
tp_is_gc slot is NULL (by far the most common case), all instances of
the type are deemed to have GC headers. This slot is called by the
PyObject_IS_GC() macro (which is only used twice, both times in
gcmodule.c).
I also changed the extern declarations for a bunch of GC-related
functions (_PyObject_GC_Del etc.): these always exist but objimpl.h
only declared them when WITH_CYCLE_GC was defined, but I needed to be
able to reference them without #ifdefs. (When WITH_CYCLE_GC is not
defined, they do the same as their non-GC counterparts anyway.)
no backwards compatibility to worry about, so I just pushed the
'closure' struct member to the back -- it's never used in the current
code base (I may eliminate it, but that's more work because the getter
and setter signatures would have to change.)
As examples, I added actual docstrings to the getset attributes of a
few types: file.closed, xxsubtype.spamdict.state.
compatibility, this required all places where an array of "struct
memberlist" structures was declared that is referenced from a type's
tp_members slot to change the type of the structure to PyMemberDef;
"struct memberlist" is now only used by old code that still calls
PyMember_Get/Set. The code in PyObject_GenericGetAttr/SetAttr now
calls the new APIs PyMember_GetOne/SetOne, which take a PyMemberDef
argument.
As examples, I added actual docstrings to the attributes of a few
types: file, complex, instance method, super, and xxsubtype.spamlist.
Also converted the symtable to new style getattr.
hack, and it's even more disgusting than a PyInstance_Check() call.
If the tp_compare slot is the slot used for overrides in Python,
it's always called.
Add some tests that show what should work too.
- Do not compile unicodeobject, unicodectype, and unicodedata if Unicode is disabled
- check for Py_USING_UNICODE in all places that use Unicode functions
- disables unicode literals, and the builtin functions
- add the types.StringTypes list
- remove Unicode literals from most tests.
This introduces:
- A new operator // that means floor division (the kind of division
where 1/2 is 0).
- The "future division" statement ("from __future__ import division)
which changes the meaning of the / operator to implement "true
division" (where 1/2 is 0.5).
- New overloadable operators __truediv__ and __floordiv__.
- New slots in the PyNumberMethods struct for true and floor division,
new abstract APIs for them, new opcodes, and so on.
I emphasize that without the future division statement, the semantics
of / will remain unchanged until Python 3.0.
Not yet implemented are warnings (default off) when / is used with int
or long arguments.
This has been on display since 7/31 as SF patch #443474.
Flames to /dev/null.
- Add an explicit call to PyType_Ready(&PyList_Type) to pythonrun.c
(just for the heck of it, really -- we should either explicitly
ready all types, or none).
sees it (test_iter.py is unchanged).
- Added a tp_iternext slot, which calls the iterator's next() method;
this is much faster for built-in iterators over built-in types
such as lists and dicts, speeding up pybench's ForLoop with about
25% compared to Python 2.1. (Now there's a good argument for
iterators. ;-)
- Renamed the built-in sequence iterator SeqIter, affecting the C API
functions for it. (This frees up the PyIter prefix for generic
iterator operations.)
- Added PyIter_Check(obj), which checks that obj's type has a
tp_iternext slot and that the proper feature flag is set.
- Added PyIter_Next(obj) which calls the tp_iternext slot. It has a
somewhat complex return condition due to the need for speed: when it
returns NULL, it may not have set an exception condition, meaning
the iterator is exhausted; when the exception StopIteration is set
(or a derived exception class), it means the same thing; any other
exception means some other error occurred.
new slot tp_iter in type object, plus new flag Py_TPFLAGS_HAVE_ITER
new C API PyObject_GetIter(), calls tp_iter
new builtin iter(), with two forms: iter(obj), and iter(function, sentinel)
new internal object types iterobject and calliterobject
new exception StopIteration
new opcodes for "for" loops, GET_ITER and FOR_ITER (also supported by dis.py)
new magic number for .pyc files
new special method for instances: __iter__() returns an iterator
iteration over dictionaries: "for x in dict" iterates over the keys
iteration over files: "for x in file" iterates over lines
TODO:
documentation
test suite
decide whether to use a different way to spell iter(function, sentinal)
decide whether "for key in dict" is a good idea
use iterators in map/filter/reduce, min/max, and elsewhere (in/not in?)
speed tuning (make next() a slot tp_next???)