The common technique for printing out a pointer has been to cast to a long
and use the "%lx" printf modifier. This is incorrect on Win64 where casting
to a long truncates the pointer. The "%p" formatter should be used instead.
The problem as stated by Tim:
> Unfortunately, the C committee refused to define what %p conversion "looks
> like" -- they explicitly allowed it to be implementation-defined. Older
> versions of Microsoft C even stuck a colon in the middle of the address (in
> the days of segment+offset addressing)!
The result is that the hex value of a pointer will maybe/maybe not have a 0x
prepended to it.
Notes on the patch:
There are two main classes of changes:
- in the various repr() functions that print out pointers
- debugging printf's in the various thread_*.h files (these are why the
patch is large)
Closes SourceForge patch #100505.
errors in some of the hash algorithms. For exmaple, in float_hash and
complex_hash a certain part of the value is not included in the hash
calculation. See Tim's, Guido's, and my discussion of this on
python-dev in May under the title "fix float_hash and complex_hash for
64-bit *nix"
(2) The hash algorithms that use pointers (e.g. func_hash, code_hash)
are universally not correct on Win64 (they assume that sizeof(long) ==
sizeof(void*))
As well, this patch significantly cleans up the hash code. It adds the
two function _Py_HashDouble and _PyHash_VoidPtr that the various
hashing routine are changed to use.
These help maintain the hash function invariant: (a==b) =>
(hash(a)==hash(b))) I have added Lib/test/test_hash.py and
Lib/test/output/test_hash to test this for some cases.
For more comments, read the patches@python.org archives.
For documentation read the comments in mymalloc.h and objimpl.h.
(This is not exactly what Vladimir posted to the patches list; I've
made a few changes, and Vladimir sent me a fix in private email for a
problem that only occurs in debug mode. I'm also holding back on his
change to main.c, which seems unnecessary to me.)
Improvements:
- does no longer need any extra memory
- has no relationship to tstate
- works in debug mode
- can easily be modified for free threading (hi Greg:)
Side effects:
Trashcan does change the order of object destruction.
Prevending that would be quite an immense effort, as
my attempts have shown. This version works always
the same, with debug mode or not. The slightly
changed destruction order should therefore be no problem.
Algorithm:
While the old idea of delaying the destruction of some
obejcts at a certain recursion level was kept, we now
no longer aloocate an object to hold these objects.
The delayed objects are instead chained together
via their ob_type field. The type is encoded via
ob_refcnt. When it comes to the destruction of the
chain of waiting objects, the topmost object is popped
off the chain and revived with type and refcount 1,
then it gets a normal Py_DECREF.
I am confident that this solution is near optimum
for minimizing side effects and code bloat.
Added wrapping macros to dictobject.c, listobject.c, tupleobject.c,
frameobject.c, traceback.c that safely prevends core dumps
on stack overflow. Macros and functions in object.c, object.h.
The method is an "elevator destructor" that turns cascading
deletes into tail recursive behavior when some limit is hit.
before calling it. This check was there when the objects were of the
same type *before* coercion, but not if they initially differed but
became the same *after* coercion.
PyNumber_Coerce() except that when the coercion can't be done and no
other exceptions happen, it returns 1 instead of raising an
exception.
Use this function in PyObject_Compare() to avoid raising an exception
simply because two objects with numeric behavior can't be coerced to a
common type; instead, proceed with the non-numeric default comparison.
Note that this is a somewhat questionable practice -- comparisons for
numeric objects shouldn't default to random behavior like this, but it
is required for backward compatibility. (Case in point, it broke
comparison of kjDict objects to integers in Aaron Watters' kjbuckets
extension.) A correct fix (for python 2.0) should involve a different
definiton of comparison altogether.
In _Py_PrintReferences(), no longer suppress once-referenced string.
Add Py_Malloc and friends and PyMem_Malloc and friends (malloc
wrappers for third parties).
getcounts() returns a list of counts of allocations and
deallocations for all different object types.
getobjects(n [, type ]) returns a list of recently allocated
and not-yet-freed objects of the given type (all
objects if no type given). Only the n most recent
(all if n==0) objects are returned.
getcounts is only available if compiled with -DCOUNT_ALLOCS,
getobjects is only available if compiled with -DTRACE_REFS. Note that
everything must be compiled with these options!
* object.[ch], bltinmodule.c, fileobject.c: changed str() to call
strobject() which calls an object's __str__ method if it has one.
strobject() is also called by writeobject() when PRINT_RAW is passed.
* ceval.c: rationalize code for PRINT_ITEM (no change in function!)
* funcobject.c, codeobject.c: added compare and hash functionality.
Functions with identical code objects and the same global dictionary are
equal. Code objects are equal when their code, constants list and names
list are identical (i.e. the filename and code name don't count).
(hash doesn't work yet since the constants are in a list and lists can't
be hashed -- suppose this should really be done with a tuple now we have
resizetuple!)
shared. The default is to save references to the integers in
the range -1..99. The lower limit can be set by defining
NSMALLNEGINTS (absolute value of smallest integer to be saved)
and NSMALLPOSINTS (1 more than the largest integer to be
saved).
tupleobject.c: Save a reference to the empty tuple to be returned
whenever a tuple of size 0 is requested. Tuples of size 1
upto, but not including, MAXSAVESIZE (default 20) are put in
free lists when deallocated. When MAXSAVESIZE equals 1, only
share references to the empty tuple, when MAXSAVESIZE equals
0, don't include the code at all and revert to the old
behavior.
object.c: Print some more statistics when COUNT_ALLOCS is defined.
image objects, and lots of new methods.
* Added counting of allocations and deallocations of builtin types if
COUNT_ALLOCS is defined. Had to move calls to NEWREF down in some
files.
* Bug fix in sorting lists.
objects of its derived classes; allow anything that has an attribute
named "__privileged__" access to anything.
* object.[ch]: added hasattr() -- test whether getattr() will succeed.
before it.
* ceval.c, object.c: moved testbool() to object.c (now extern visible)
* stringobject.c: fix bugs in and rationalize string resize in formatstring()
* tokenizer.[ch]: fix non-working code for lines longer than BUFSIZ
* Stubs for faster implementation of local variables (not yet finished)
* Added function name to code object. Print it for code and function
objects. THIS MAKES THE .PYC FILE FORMAT INCOMPATIBLE (the version
number has changed accordingly)
* Print address of self for built-in methods
* New internal functions getattro and setattro (getattr/setattr with
string object arg)
* Replaced "dictobject" with more powerful "mappingobject"
* New per-type functio tp_hash to implement arbitrary object hashing,
and hashobject() to interface to it
* Added built-in functions hash(v) and hasattr(v, 'name')
* classobject: made some functions static that accidentally weren't;
added __hash__ special instance method to implement hash()
* Added proper comparison for built-in methods and functions
* flmodule.c: added some missing functions; changed readonly flags of
some data members based upon FORMS documentation.
* listobject.c: fixed int/long arg lint bug (bites PC compilers).
* several: removed redundant print methods (repr is good enough).
* posixmodule.c: added (still experimental) process group functions.
calls the repr function. When the refcount is bad, don't print
the object at all (chances of crashes).
Changes to checking and printing of references: the consistency
check is somewhat faster; don't print strings referenced once
(most occur in function's name lists).