interning. I modified Oren's patch significantly, but the basic idea
and most of the implementation is unchanged. Interned strings created
with PyString_InternInPlace() are now mortal, and you must keep a
reference to the resulting string around; use the new function
PyString_InternImmortal() to create immortal interned strings.
helper macros to something saner, and used them appropriately in other
files too, to reduce #ifdef blocks.
classobject.c, instance_dealloc(): One of my worst Python Memories is
trying to fix this routine a few years ago when COUNT_ALLOCS was defined
but Py_TRACE_REFS wasn't. The special-build code here is way too
complicated. Now it's much simpler. Difference: in a Py_TRACE_REFS
build, the instance is no longer in the doubly-linked list of live
objects while its __del__ method is executing, and that may be visible
via sys.getobjects() called from a __del__ method. Tough -- the object
is presumed dead while its __del__ is executing anyway, and not calling
_Py_NewReference() at the start allows enormous code simplification.
typeobject.c, call_finalizer(): The special-build instance_dealloc()
pain apparently spread to here too via cut-'n-paste, and this is much
simpler now too. In addition, I didn't understand why this routine
was calling _PyObject_GC_TRACK() after a resurrection, since there's no
plausible way _PyObject_GC_UNTRACK() could have been called on the
object by this point. I suspect it was left over from pasting the
instance_delloc() code. Instead asserted that the object is still
tracked. Caution: I suspect we don't have a test that actually
exercises the subtype_dealloc() __del__-resurrected-me code.
These built-in functions are replaced by their (now callable) type:
slice()
buffer()
and these types can also be called (but have no built-in named
function named after them)
classobj (type name used to be "class")
code
function
instance
instancemethod (type name used to be "instance method")
The module "new" has been replaced with a small backward compatibility
placeholder in Python.
A large portion of the patch simply removes the new module from
various platform-specific build recipes. The following binary Mac
project files still have references to it:
Mac/Build/PythonCore.mcp
Mac/Build/PythonStandSmall.mcp
Mac/Build/PythonStandalone.mcp
[I've tweaked the code layout and the doc strings here and there, and
added a comment to types.py about StringTypes vs. basestring. --Guido]
optional attribute, only clear the exception when the internal getattr
operation raised AttributeError. Many places in this file already had
that policy; but just as many didn't, and there didn't seem to be any
rhyme or reason to it. Be consistently cautious.
Question: should I backport this? On the one hand it's a bugfix. On
the other hand it's a change in behavior. Certain forms of buggy or
just weird code would work in the past but raise an exception under
the new rules; e.g. if you define a __getattr__ method that raises a
non-AttributeError exception.
and functions: we only need to call PyObject_ClearWeakRefs() if the weakref
list is non-NULL. Since these objects are common but weakrefs are still
unusual, saving the call at deallocation time makes a lot of sense.
There really isn't a good reason for instance method objects to have
their own __dict__, __doc__ and __name__ properties that just delegate
the request to the function (callable); the default attribute behavior
already does this.
The test suite had to be fixed because the error changes from
TypeError to AttributeError.
no backwards compatibility to worry about, so I just pushed the
'closure' struct member to the back -- it's never used in the current
code base (I may eliminate it, but that's more work because the getter
and setter signatures would have to change.)
As examples, I added actual docstrings to the getset attributes of a
few types: file.closed, xxsubtype.spamdict.state.
compatibility, this required all places where an array of "struct
memberlist" structures was declared that is referenced from a type's
tp_members slot to change the type of the structure to PyMemberDef;
"struct memberlist" is now only used by old code that still calls
PyMember_Get/Set. The code in PyObject_GenericGetAttr/SetAttr now
calls the new APIs PyMember_GetOne/SetOne, which take a PyMemberDef
argument.
As examples, I added actual docstrings to the attributes of a few
types: file, complex, instance method, super, and xxsubtype.spamlist.
Also converted the symtable to new style getattr.
descriptors for each attribute. The getattr() implementation is
similar to PyObject_GenericGetAttr(), but delegates to im_self instead
of looking in __dict__; I couldn't do this as a wrapper around
PyObject_GenericGetAttr().
XXX A problem here is that this is a case of *delegation*. dir()
doesn't see exactly the same attributes that are actually defined;
e.g. if the delegate is a Python function object, it supports
attributes like func_code etc., but these are not visible to dir(); on
the other hand, dynamic function attributes (stored in the function's
__dict__) *are* visible to dir(). Maybe we need a mechanism to tell
dir() about the delegation mechanism? I vaguely recall seeing a
request in the newsgroup for a more formal definition of attribute
delegation too. Sigh, time for a new PEP.
iterable object. I'm not sure how that got overlooked before!
Got rid of the internal _PySequence_IterContains, introduced a new
internal _PySequence_IterSearch, and rewrote all the iteration-based
"count of", "index of", and "is the object in it or not?" routines to
just call the new function. I suppose it's slower this way, but the
code duplication was getting depressing.
the base classes is not a classic class, and its class (the metaclass)
is callable, call the metaclass to do the deed.
One effect of this is that, when mixing classic and new-style classes
amongst the bases of a class, it doesn't matter whether the first base
class is a classic class or not: you will always get the error
"TypeError: metatype conflict among bases". (Formerly, with a classic
class first, you'd get "TypeError: PyClass_New: base must be a class".)
Another effect is that multiple inheritance from ExtensionClass.Base,
with a classic class as the first class, transfers control to the
ExtensionClass.Base class. This is what we need for SF #443239 (and
also for running Zope under 2.2a4, before ExtensionClass is replaced).
While not even documented, they were clearly part of the C API,
there's no great difficulty to support them, and it has the cool
effect of not requiring any changes to ExtensionClass.c.
an inappropriate first argument. Now that there are more ways for
this to fail, make sure to report the name of the class of the
expected instance and of the actual instance.
This introduces:
- A new operator // that means floor division (the kind of division
where 1/2 is 0).
- The "future division" statement ("from __future__ import division)
which changes the meaning of the / operator to implement "true
division" (where 1/2 is 0.5).
- New overloadable operators __truediv__ and __floordiv__.
- New slots in the PyNumberMethods struct for true and floor division,
new abstract APIs for them, new opcodes, and so on.
I emphasize that without the future division statement, the semantics
of / will remain unchanged until Python 3.0.
Not yet implemented are warnings (default off) when / is used with int
or long arguments.
This has been on display since 7/31 as SF patch #443474.
Flames to /dev/null.
For rich comparisons, use instance_getattr2() when possible to avoid
the expense of setting an AttributeError. Also intern the name_op[]
table and use the interned strings rather than creating a new string
and interning it each time through.
safely together and don't duplicate logic (the common logic was factored
out into new private API function _PySequence_IterContains()).
Visible change:
some_complex_number in some_instance
no longer blows up if some_instance has __getitem__ but neither
__contains__ nor __iter__. test_iter changed to ensure that remains true.
need to be specified in the type structures independently. The flag
exists only for binary compatibility.
This is a "source cleanliness" issue and introduces no behavioral changes.
sees it (test_iter.py is unchanged).
- Added a tp_iternext slot, which calls the iterator's next() method;
this is much faster for built-in iterators over built-in types
such as lists and dicts, speeding up pybench's ForLoop with about
25% compared to Python 2.1. (Now there's a good argument for
iterators. ;-)
- Renamed the built-in sequence iterator SeqIter, affecting the C API
functions for it. (This frees up the PyIter prefix for generic
iterator operations.)
- Added PyIter_Check(obj), which checks that obj's type has a
tp_iternext slot and that the proper feature flag is set.
- Added PyIter_Next(obj) which calls the tp_iternext slot. It has a
somewhat complex return condition due to the need for speed: when it
returns NULL, it may not have set an exception condition, meaning
the iterator is exhausted; when the exception StopIteration is set
(or a derived exception class), it means the same thing; any other
exception means some other error occurred.
new slot tp_iter in type object, plus new flag Py_TPFLAGS_HAVE_ITER
new C API PyObject_GetIter(), calls tp_iter
new builtin iter(), with two forms: iter(obj), and iter(function, sentinel)
new internal object types iterobject and calliterobject
new exception StopIteration
new opcodes for "for" loops, GET_ITER and FOR_ITER (also supported by dis.py)
new magic number for .pyc files
new special method for instances: __iter__() returns an iterator
iteration over dictionaries: "for x in dict" iterates over the keys
iteration over files: "for x in file" iterates over lines
TODO:
documentation
test suite
decide whether to use a different way to spell iter(function, sentinal)
decide whether "for key in dict" is a good idea
use iterators in map/filter/reduce, min/max, and elsewhere (in/not in?)
speed tuning (make next() a slot tp_next???)
must now initialize the extra field used by the weak-ref machinery to
NULL themselves, to avoid having to require PyObject_INIT() to check
if the type supports weak references and do it there. This causes less
work to be done for all objects (the type object does not need to be
consulted to check for the Py_TPFLAGS_HAVE_WEAKREFS bit).
set a function attribute on a method (either bound or unbound). This
reverts to Python 2.0 behavior that no attributes of the method are
writable, but provides a more informative error message.
earlier coercion changes, not by rich comparisons. When a coercion
function returns 1 (meaning it cannot do it), it should not INCREF the
arguments. When no __coerce__() method was found, instance_coerce()
originally returned 0, pretending it did it. Neil changed the return
value to 1, more accurately reflecting that it didn't do anything, but
forgot to take out the two INCREF calls.
- Got rid of instance_cmp(); refactored instance_compare().
- Added instance_richcompare() which calls __lt__() etc.
Some unrelated stuff mixed in:
- Aligned comments in various large struct initializers.
- Better test to avoid recursion if __coerce__ returns self as the
first argument (this is an unrelated fix by Neil Schemenauer!).
- Style nit: don't use Py_DECREF(Py_NotImplemented); use
Py_DECREF(result) -- it just looks better. :-)
Closes SF patch #103123.
funcobject.h:
PyFunctionObject: add the func_dict slot.
funcobject.c:
PyFunction_New(): Initialize the func_dict slot to NULL.
func_getattr(): Rename to func_getattro() and change the
signature. It's more efficient to use attro methods and dig the C
string out than it is to re-convert a C string to a PyString.
Also, add support for getting the __dict__ (a.k.a. func_dict)
attribute, and for getting an arbitrary function attribute.
func_setattr(): Rename to func_setattro() and change the signature
for the same reason. Also add support for setting __dict__
(a.k.a. func_dict) and any arbitrary function attribute.
func_dealloc(): Be sure to DECREF the func_dict slot.
func_traverse(): Be sure to traverse func_dict too.
PyFunction_Type: make the necessary func_?etattro() changes.
classobject.c:
instancemethod_memberlist: Add __dict__
instancemethod_setattro(): New method to set arbitrary attributes
on methods (really the underlying im_func). Raise TypeError when
the instance is bound or when you're trying to set one of the
reserved im_* attributes.
instancemethod_getattr(): Renamed to instancemethod_getattro()
since that's what it really is. Also, added support fo getting
arbitrary attributes through the im_func.
PyMethod_Type: Do the ?etattr{,o} dance.
that Py_INCREF boosts global _Py_RefTotal when Py_REF_DEBUG is defined
but Py_TRACE_REFS isn't.
There are, IMO, way too many preprocessor gimmicks in use for refcount
debugging (at least 3 distinct true/false symbols, but not all 8 combos
are supported by the code, etc etc), and no coherent documentation of
this stuff -- 'twas too painful to track this one down.
to integer types (i.e. Py_uintptr_t, our spelling of C9X's uintptr_t).
ANSI specifies that pointer compares other than == and != to
non-related structures are undefined. This quiets an Insure
portability warning.
is no __getslice__ available. Also does the same for C extension types.
Includes rudimentary documentation (it could use a cross reference to the
section on slice objects, I couldn't figure out how to do that) and a test
suite for all Python __hooks__ I could think of, including the new
behaviour.
comments, docstrings or error messages. I fixed two minor things in
test_winreg.py ("didn't" -> "Didn't" and "Didnt" -> "Didn't").
There is a minor style issue involved: Guido seems to have preferred English
grammar (behaviour, honour) in a couple places. This patch changes that to
American, which is the more prominent style in the source. I prefer English
myself, so if English is preferred, I'd be happy to supply a patch myself ;)
The common technique for printing out a pointer has been to cast to a long
and use the "%lx" printf modifier. This is incorrect on Win64 where casting
to a long truncates the pointer. The "%p" formatter should be used instead.
The problem as stated by Tim:
> Unfortunately, the C committee refused to define what %p conversion "looks
> like" -- they explicitly allowed it to be implementation-defined. Older
> versions of Microsoft C even stuck a colon in the middle of the address (in
> the days of segment+offset addressing)!
The result is that the hex value of a pointer will maybe/maybe not have a 0x
prepended to it.
Notes on the patch:
There are two main classes of changes:
- in the various repr() functions that print out pointers
- debugging printf's in the various thread_*.h files (these are why the
patch is large)
Closes SourceForge patch #100505.
errors in some of the hash algorithms. For exmaple, in float_hash and
complex_hash a certain part of the value is not included in the hash
calculation. See Tim's, Guido's, and my discussion of this on
python-dev in May under the title "fix float_hash and complex_hash for
64-bit *nix"
(2) The hash algorithms that use pointers (e.g. func_hash, code_hash)
are universally not correct on Win64 (they assume that sizeof(long) ==
sizeof(void*))
As well, this patch significantly cleans up the hash code. It adds the
two function _Py_HashDouble and _PyHash_VoidPtr that the various
hashing routine are changed to use.
These help maintain the hash function invariant: (a==b) =>
(hash(a)==hash(b))) I have added Lib/test/test_hash.py and
Lib/test/output/test_hash to test this for some cases.
Avoid calling the dealloc function, previously triggered with
DECREF(inst). This caused a segfault in PyDict_GetItem, called with a
NULL dict, whenever inst->in_dict fails under low-memory conditions.
This patch modifies the type structures of objects that
participate in GC. The object's tp_basicsize is increased when
GC is enabled. GC information is prefixed to the object to
maintain binary compatibility. GC objects also define the
tp_flag Py_TPFLAGS_GC.
For more comments, read the patches@python.org archives.
For documentation read the comments in mymalloc.h and objimpl.h.
(This is not exactly what Vladimir posted to the patches list; I've
made a few changes, and Vladimir sent me a fix in private email for a
problem that only occurs in debug mode. I'm also holding back on his
change to main.c, which seems unnecessary to me.)
The previous checkin (2.84) added a PyErr_Format call that made the
cost of raising an AttributeError much more expensive. In general
this doesn't matter, except that checks for __init__ and
__del__ methods, where exceptions are caught and cleared in C, also
got much more expensive.
The fix is to split instance_getattr1 into two calls:
instance_getattr2 checks the instance and the class for the attribute
and returns it or returns NULL on error. It does not raise an
exception.
instance_getattr1 does rexec checks, then calls instance_getattr2. It
raises an exception if instance_getattr2 returns NULL.
PyInstance_New and instance_dealloc now call instance_getattr2
directly.
In line with a similar checkin to object.c a while ago, this patch
gives a more descriptive error message for an attribute error on a
class instance. The message now looks like:
AttributeError: 'Descriptor' instance has no attribute 'GetReturnType'
not in restricted mode.
__dict__ can be set to any dictionary; the cl_getattr, cl_setattr and
cl_delattr slots are refreshed.
__name__ can be set to any string.
__bases__ can be set to to a tuple of classes, provided they are not
subclasses of the class whose attribute is being assigned.
__getattr__, __setattr__ and __delattr__ can be set to anything, or
deleted; the appropriate slot (cl_getattr, cl_setattr, cl_delattr) is
refreshed.
(Note: __name__ really doesn't need to be a special attribute, but
that would be more work.)
former lets you give an instance a set of new instance vars. The
latter lets you give it a new class. Both are typechecked and
disallowed in restricted mode.
For classes, the check for read-only special attributes is tightened
so that only assignments to __dict__, __bases__, __name__,
__getattr__, __setattr__, and __delattr__ (these could be made to work
as well, but I don't know if that's useful -- let's see first whether
mucking with instances will help).