intern the string "__new__" so we can call PyObject_GetAttr() rather
than PyObject_GetAttrString(). (Though it's a mystery why slot_tp_new
is being called when a class doesn't define __new__. I'll look into
that tomorrow.)
2.2 backport candidate (but I won't do it).
a lot of work: it had to save and restore the current exception around
a call to lookup_maybe(), because that could fail in rare cases, and
most objects don't have a __del__ method, so the whole exercise was
usually a waste of time. Changed this to cache the __del__ method in
the type object just like all other special methods, in a new slot
tp_del. So now subtype_dealloc() can test whether tp_del is NULL and
skip the whole exercise if it is. The new slot doesn't need a new
flag bit: subtype_dealloc() is only called if the type was dynamically
allocated by type_new(), so it's guaranteed to have all current slots.
Types defined in C cannot fill in tp_del with a function of their own,
so there's no corresponding "wrapper". (That functionality is already
available through tp_dealloc.)
subtype_dealloc().
When call_finalizer() failed, it would return without going through
the trashcan end macro, thereby unbalancing the trashcan nesting level
counter, and thereby defeating the test case (slottrash() in
test_descr.py). This in turn meant that the assert in the GC_UNTRACK
macro wasn't triggered by the slottrash() test despite a bug in the
code: _PyTrash_destroy_chain() calls the dealloc routine with an
object that's untracked, and the assert in the GC_UNTRACK macro would
fail on this; but because of an earlier test that resurrects an
object, causing call_finalizer() to fail and the trashcan nesting
level to be unbalanced, so _PyTrash_destroy_chain() was never called.
Calling the slottrash() test in isolation *did* trigger the assert,
however.
So the fix is twofold: (1) call the GC_UnTrack() function instead of
the GC_UNTRACK macro, because the function is safe when the object is
already untracked; (2) when call_finalizer() fails, jump to a label
that exits through the trashcan end macro, keeping the trashcan
nesting balanced.
This is inspired by SF patch 581742 (by Jonathan Hogg, who also
submitted the bug report, and two other suggested patches), but
separates the non-GC case from the GC case to avoid testing for GC
several times.
Had to fix an assert() from call_finalizer() that asserted that the
object wasn't untracked, because it's possible that the object isn't
GC'ed!
For a file f, iter(f) now returns f (unless f is closed), and f.next()
is similar to f.readline() when EOF is not reached; however, f.next()
uses a readahead buffer that messes up the file position, so mixing
f.next() and f.readline() (or other methods) doesn't work right.
Calling f.seek() drops the readahead buffer, but other operations
don't.
The real purpose of this change is to reduce the confusion between
objects and their iterators. By making a file its own iterator, it's
made clearer that using the iterator modifies the file object's state
(in particular the current position).
A nice side effect is that this speeds up "for line in f:" by not
having to use the xreadlines module. The f.xreadlines() method is
still supported for backwards compatibility, though it is the same as
iter(f) now.
(I made some cosmetic changes to Oren's code, and added a test for
"file closed" to file_iternext() and file_iter().)
directly when no comparison function is specified. This saves a layer
of function call on every compare then. Measured speedups:
i 2**i *sort \sort /sort 3sort +sort %sort ~sort =sort !sort
15 32768 12.5% 0.0% 0.0% 100.0% 0.0% 50.0% 100.0% 100.0% -50.0%
16 65536 8.7% 0.0% 0.0% 0.0% 0.0% 0.0% 12.5% 0.0% 0.0%
17 131072 8.0% 25.0% 0.0% 25.0% 0.0% 14.3% 5.9% 0.0% 0.0%
18 262144 6.3% -10.0% 12.5% 11.1% 0.0% 6.3% 5.6% 12.5% 0.0%
19 524288 5.3% 5.9% 0.0% 5.6% 0.0% 5.9% 5.4% 0.0% 2.9%
20 1048576 5.3% 2.9% 2.9% 5.1% 2.8% 1.3% 5.9% 2.9% 4.2%
The best indicators are those that take significant time (larger i), and
where sort doesn't do very few compares (so *sort and ~sort benefit most
reliably). The large numbers are due to roundoff noise combined with
platform variability; e.g., the 14.3% speedup for %sort at i=17 reflects
a printed elapsed time of 0.18 seconds falling to 0.17, but a change in
the last digit isn't really meaningful (indeed, if it really took 0.175
seconds, one electron having a lazy nanosecond could shift it to either
value <wink>). Similarly the 25% at 3sort i=17 was a meaningless change
from 0.05 to 0.04. However, almost all the "meaningless changes" were
in the same direction, which is good. The before-and-after times for
*sort are clearest:
before after
0.18 0.16
0.25 0.23
0.54 0.50
1.18 1.11
2.57 2.44
5.58 5.30
longer to run than normal. A profiler run showed that this was due to
PyFrame_New() taking up an unreasonable amount of time. A little
thinking showed that this was due to the while loop clearing the space
available for the stack. The solution is to only clear the local
variables (and cells and free variables), not the space available for
the stack, since anything beyond the stack top is considered to be
garbage anyway. Also, use memset() instead of a while loop counting
backwards. This should be a time savings for normal code too! (By a
probably unmeasurable amount. :-)
version of PySlice_GetIndicesEx"):
> OK. Michael, if you want to check in indices(), go ahead.
Then I did what was needed, but didn't check it in. Here it is.
listsort. If the former calls itself recursively, they're a waste of
time, since it's called on a random permutation of a random subset of
elements. OTOH, for exactly the same reason, they're an immeasurably
small waste of time (the odds of finding exploitable order in a random
permutation are ~= 0, so the special-case loops looking for order give
up quickly). The point is more for conceptual clarity.
Also changed some "assert comments" into real asserts; when this code
was first written, Python.h didn't supply assert.h.
introduced, list.sort() was rewritten to use only the "< or not <?"
distinction. After rich comparisons were introduced, docompare() was
fiddled to translate a Py_LT Boolean result into the old "-1 for <,
0 for ==, 1 for >" flavor of outcome, and the sorting code was left
alone. This left things more obscure than they should be, and turns
out it also cost measurable cycles.
So: The old CMPERROR novelty is gone. docompare() is renamed to islt(),
and now has the same return conditinos as PyObject_RichCompareBool. The
SETK macro is renamed to ISLT, and is even weirder than before (don't
complain unless you want to maintain the sort code <wink>).
Overall, this yields a 1-2% speedup in the usual (no explicit function
passed to list.sort()) case when sorting arrays of floats (as sortperf.py
does). The boost is higher for arrays of ints.
The staticforward define was needed to support certain broken C
compilers (notably SCO ODT 3.0, perhaps early AIX as well) botched the
static keyword when it was used with a forward declaration of a static
initialized structure. Standard C allows the forward declaration with
static, and we've decided to stop catering to broken C compilers. (In
fact, we expect that the compilers are all fixed eight years later.)
I'm leaving staticforward and statichere defined in object.h as
static. This is only for backwards compatibility with C extensions
that might still use it.
XXX I haven't updated the documentation.
PyType_Ready() because the tp_iternext slot is set (fortunately,
because using the tp_iternext implementation for the the next()
implementation is buggy). Also changed the allocation order in
enum_next() so that the underlying iterator is only moved ahead when
we have successfully allocated the result tuple and index.
di_dict field when the end of the list is reached. Also make the
error ("dictionary changed size during iteration") a sticky state.
Also remove the next() method -- one is supplied automatically by
PyType_Ready() because the tp_iternext slot is set. That's a good
thing, because the implementation given here was buggy (it never
raised StopIteration).
object references (it_seq for seqiterobject, it_callable and
it_sentinel for calliterobject) when the end of the list is reached.
Also remove the next() methods -- one is supplied automatically by
PyType_Ready() because the tp_iternext slot is set. That's a good
thing, because the implementation given here was buggy (it never
raised StopIteration).
it_seq field when the end of the list is reached.
Also remove the next() method -- one is supplied automatically by
PyType_Ready() because the tp_iternext slot is set. That's a good
thing, because the implementation given here was buggy (it never
raised StopIteration).
If the object is an ExtensionClass, for example, the slot is not even
defined. So we must check that the type has the slot (implied by
HAVE_CLASS) before calling tp_init().
explicit comparison function case: use PyObject_Call instead of
PyEval_CallObject. Same thing in context, but gives a 2.4% overall
speedup when sorting a list of ints via list.sort(__builtin__.cmp).
MSDN sample programs use it, apparently in error. The correct name
is WIN32_LEAN_AND_MEAN. After switching to the correct name, in two
cases more was needed because the code actually relied on things that
disappear when WIN32_LEAN_AND_MEAN is defined.
arg tuple. This was suggested on c.l.py but afraid I can't find the msg
again for proper attribution. For
list.sort(cmp)
where list is a list of random ints, and cmp is __builtin__.cmp, this
yields an overall 50-60% speedup on my Win2K box. Of course this is a
best case, because the overhead of calling cmp relative to the cost of
actually comparing two ints is at an extreme. Nevertheless it's huge
bang for the buck. An additionak 20-30% can be bought by making the arg
tuple an immortal static (avoiding all but "the first" PyTuple_New), but
that's tricky to make correct since docompare needs to be reentrant. So
this picks the cherry and leaves the pits for Fred <wink>.
Note that this makes no difference to the
list.sort()
case; an arg tuple gets built only if the user specifies an explicit
sort function.
helper macros to something saner, and used them appropriately in other
files too, to reduce #ifdef blocks.
classobject.c, instance_dealloc(): One of my worst Python Memories is
trying to fix this routine a few years ago when COUNT_ALLOCS was defined
but Py_TRACE_REFS wasn't. The special-build code here is way too
complicated. Now it's much simpler. Difference: in a Py_TRACE_REFS
build, the instance is no longer in the doubly-linked list of live
objects while its __del__ method is executing, and that may be visible
via sys.getobjects() called from a __del__ method. Tough -- the object
is presumed dead while its __del__ is executing anyway, and not calling
_Py_NewReference() at the start allows enormous code simplification.
typeobject.c, call_finalizer(): The special-build instance_dealloc()
pain apparently spread to here too via cut-'n-paste, and this is much
simpler now too. In addition, I didn't understand why this routine
was calling _PyObject_GC_TRACK() after a resurrection, since there's no
plausible way _PyObject_GC_UNTRACK() could have been called on the
object by this point. I suspect it was left over from pasting the
instance_delloc() code. Instead asserted that the object is still
tracked. Caution: I suspect we don't have a test that actually
exercises the subtype_dealloc() __del__-resurrected-me code.
more trivial lexical helper macros so that uses of these guys expand
to nothing at all when they're not enabled. This should help sub-
standard compilers that can't do a good job of optimizing away the
previous "(void)0" expressions.
Py_DECREF: There's only one definition of this now. Yay! That
was that last one in the family defined multiple times in an #ifdef
maze.
Py_FatalError(): Changed the char* signature to const char*.
_Py_NegativeRefcount(): New helper function for the Py_REF_DEBUG
expansion of Py_DECREF. Calling an external function cuts down on
the volume of generated code. The previous inline expansion of abort()
didn't work as intended on Windows (the program often kept going, and
the error msg scrolled off the screen unseen). _Py_NegativeRefcount
calls Py_FatalError instead, which captures our best knowledge of
how to abort effectively across platforms.
Repair segfaults and infinite loops in COUNT_ALLOCS builds in the
presence of new-style (heap-allocated) classes/types.
Bugfix candidate. I'll backport this to 2.2. It's irrelevant in 2.1.
that have taken me "too long" to reverse-engineer over the years.
Vastly reduced the nesting level and redundancy of #ifdef-ery.
Took a light stab at repairing comments that are no longer true.
sys_gettotalrefcount(): Changed to enable under Py_REF_DEBUG.
It was enabled under Py_TRACE_REFS, which was much heavier than
necessary. sys.gettotalrefcount() is now available in a
Py_REF_DEBUG-only build.
mechanism is no longer evil: it no longer plays dangerous games with
the type pointer or refcounts, and objects in extension modules can play
along too without needing to edit the core first.
Rewrote all the comments to explain this, and (I hope) give clear
guidance to extension authors who do want to play along. Documented
all the functions. Added more asserts (it may no longer be evil, but
it's still dangerous <0.9 wink>). Rearranged the generated code to
make it clearer, and to tolerate either the presence or absence of a
semicolon after the macros. Rewrote _PyTrash_destroy_chain() to call
tp_dealloc directly; it was doing a Py_DECREF again, and that has all
sorts of obscure distorting effects in non-release builds (Py_DECREF
was already called on the object!). Removed Christian's little "embedded
change log" comments -- that's what checkin messages are for, and since
it was impossible to correlate the comments with the code that changed,
I found them merely distracting.
In a fresh interpreter, type.mro(tuple) would segfault, because
PyType_Ready() isn't called for tuple yet. To fix, call
PyType_Ready(type) if type->tp_dict is NULL.
These built-in functions are replaced by their (now callable) type:
slice()
buffer()
and these types can also be called (but have no built-in named
function named after them)
classobj (type name used to be "class")
code
function
instance
instancemethod (type name used to be "instance method")
The module "new" has been replaced with a small backward compatibility
placeholder in Python.
A large portion of the patch simply removes the new module from
various platform-specific build recipes. The following binary Mac
project files still have references to it:
Mac/Build/PythonCore.mcp
Mac/Build/PythonStandSmall.mcp
Mac/Build/PythonStandalone.mcp
[I've tweaked the code layout and the doc strings here and there, and
added a comment to types.py about StringTypes vs. basestring. --Guido]
gotten from a weak reference to NULL instead of to None. This caused
the following assert() to fail (but only in 2.2 in the debug build --
I have to find a better test case). Will backport.
optional attribute, only clear the exception when the internal getattr
operation raised AttributeError. Many places in this file already had
that policy; but just as many didn't, and there didn't seem to be any
rhyme or reason to it. Be consistently cautious.
Question: should I backport this? On the one hand it's a bugfix. On
the other hand it's a change in behavior. Certain forms of buggy or
just weird code would work in the past but raise an exception under
the new rules; e.g. if you define a __getattr__ method that raises a
non-AttributeError exception.
473985. Through a subtle rearrangement of some members in the etype
struct (!), mapping methods are now preferred over sequence methods,
which is necessary to support str.__getitem__("hello", slice(4)) etc.
[ 400998 ] experimental support for extended slicing on lists
somewhat spruced up and better tested than it was when I wrote it.
Includes docs & tests. The whatsnew section needs expanding, and arrays
should support extended slices -- later.
discovered that subtype_traverse must traverse the type if it is a
heap type, because otherwise some cycles involving a type and its
instance would not be collected. Simplest example:
while 1:
class C(object): pass
C.ref = C()
This program grows without bounds before this fix. (It grows ever
slower since it spends ever more time in the collector.)
Simply adding the right visit() call to subtype_traverse() revealed
other problems. With MvL's help we re-learned that type_clear()
doesn't have to clear *all* references, only the ones that may not be
cleared by other means. Careful analysis (see comments in the code)
revealed that only tp_mro needs to be cleared. (The previous checkin
to this file adds a test for tp_mro==NULL to _PyType_Lookup() that's
essential to prevent crashes due to tp_mro being NULL when
subtype_dealloc() tries to look for a __del__ method.) The same kind
of analysis also revealed that subtype_clear() doesn't need to clear
the instance dict.
With this fix, a useful property of the collector is once again
guaranteed: a single gc.collect() call will clear out all garbage.
(It didn't always before, which put us on the track of this bug.)
Will backport to 2.2.
about the test case, slot_nb_power gets called on behalf of its second
argument, but with a non-None modulus it wouldn't check this, and
believes it is called on behalf of its first argument. Fix this
properly, and get rid of the code in _PyType_Lookup() that tries to
call _PyType_Ready(). But do leave a check for a NULL tp_mro there,
because this can still legitimately occur.
I'll fix this in 2.2.x too.
While I was at it, I added a tp_clear handler and changed the
tp_dealloc handler to use the clear_slots helper for the tp_clear
handler.
Also tightened the rules for slot names: they must now be proper
identifiers (ignoring the dirty little fact that <ctype.h> is locale
sensitive).
Also set mp->flags = READONLY for the __weakref__ pseudo-slot.
Most of this is a 2.2 bugfix candidate; I'll apply it there myself.
Change the module constructor (module_init) to have the signature
__init__(name:str, doc=None); this prevents the call from type_new()
to succeed. While we're at it, prevent repeated calling of
module_init for the same module from leaking the dict, changing the
semantics so that __dict__ is only initialized if NULL.
Also adding a unittest, test_module.py.
This is an incompatibility with 2.2, if anybody was instantiating the
module class before, their argument list was probably empty; so this
can't be backported to 2.2.x.
In the past, an object's tp_compare could return any value. In 2.2
the docs were tightened to require it to return -1, 0 or 1; and -1 for
an error.
We now issue a warning if the value is not in this range. When an
exception is raised, we allow -1 or -2 as return value, since -2 will
the recommended return value for errors in the future. (Eventually
tp_compare will also be allowed to return +2, to indicate
NotImplemented; but that can only be implemented once we know all
extensions return a value in [-2...1]. Or perhaps it will require the
type to set a flag bit.)
I haven't decided yet whether to backport this to 2.2.x. The patch
applies fine. But is it fair to start warning in 2.2.2 about code
that worked flawlessly in 2.2.1?
for 'str' and 'unicode', and can be used instead of
types.StringTypes, e.g. to test whether something is "a string":
isinstance(x, string) is True for Unicode and 8-bit strings. This
is an abstract base class and cannot be instantiated directly.
A MemoryError is now raised when the list cannot be created.
There is a test, but as the comment says, it really only
works for 32 bit systems. I don't know how to improve
the test for other systems (ie, 64 bit or systems
where the data size != addressable size,
e.g. 64 bit data, but 48 bit addressable memory)
returned a proxy for __class__ whose __bases__ was also a proxy. The
merge_class_dict() helper for dir() assumed incorrectly that __bases__
would always be a tuple and used the in-line tuple API on the proxy.
I will backport this to 2.2 as well.
handlers were both set, but were not compatible. This change uses only the
tp_getattro handler with a more "modern" approach.
This fixes SF bug #551285.
don't understand how this function works, also beefed up the docs. The
most common usage error is of this form (often spread out across gotos):
if (_PyString_Resize(&s, n) < 0) {
Py_DECREF(s);
s = NULL;
goto outtahere;
}
The error is that if _PyString_Resize runs out of memory, it automatically
decrefs the input string object s (which also deallocates it, since its
refcount must be 1 upon entry), and sets s to NULL. So if the "if"
branch ever triggers, it's an error to call Py_DECREF(s): s is already
NULL! A correct way to write the above is the simpler (and intended)
if (_PyString_Resize(&s, n) < 0)
goto outtahere;
Bugfix candidate.
This implements ideas from Marc-Andre, Martin, Guido and me on Python-Dev.
"Short" Unicode strings are encoded into a "big enough" stack buffer,
then exactly as much string space as they turn out to need is allocated
at the end. This should have speed benefits akin to Martin's "measure
once, allocate once" strategy, but without needing a distinct measuring
pass.
"Long" Unicode strings allocate as much heap space as they could possibly
need (4 x # Unicode chars), and do a realloc at the end to return the
untouched excess. Since the overallocation is likely to be substantial,
this shouldn't burden the platform realloc with unusably small excess
blocks.
Also simplified uses of the PyString_xyz functions. Also added a release-
build check that 4*size doesn't overflow a C int. Sooner or later, that's
going to happen.
left and right type were of the same type and not classic instances.
This shortcut is dangerous for proxy types, because it means that
coerce(Proxy(1), Proxy(2.1)) leaves Proxy(1) unchanged rather than
turning it into Proxy(1.0).
In an ever-so-slight change of semantics, I now only take the shortcut
when the left and right types are of the same type and don't have the
CHECKTYPES feature. It so happens that classic instances have this
flag, so the shortcut is still skipped in this case (i.e. nothing
changes for classic instances). Proxies also have this flag set
(otherwise implementing numeric operations on proxies would become
nightmarish) and this means that the shortcut is also skipped there,
as desired. It so happens that int, long and float also have this
flag set; that means that e.g. coerce(1, 1) will now invoke
int_coerce(). This is fine: int_coerce() can deal with this, and I'm
not worried about the performance; int_coerce() is only invoked when
the user explicitly calls coerce(), which should be rarer than rare.
This fixes the problem that Barry reported on python-dev:
>>> 23000 .__class__ = bool
crashes in the deallocator. This was because int inherited tp_free
from object, which uses the default allocator.
2.2. Bugfix candidate.
states can be for this function, and ensure that only AttributeErrors
are masked. Any other exception raised via the equivalent of
getattr(cls, '__bases__') should be propagated up.
abstract_issubclass(): If abstract_get_bases() returns NULL, we must
call PyErr_Occurred() to see if an exception is being propagated, and
return -1 or 0 as appropriate. This is the specific fix for a problem
whereby if getattr(derived, '__bases__') raised an exception, an
"undetected error" would occur (under a debug build). This nasty
situation was uncovered when writing a security proxy extension type
for the Zope3 project, where the security proxy raised a Forbidden
exception on getattr of __bases__.
PyObject_IsInstance(), PyObject_IsSubclass(): After both calls to
abstract_get_bases(), where we're setting the TypeError if the return
value is NULL, we must first check to see if an exception occurred,
and /not/ mask an existing exception.
Neil Schemenauer should double check that these changes don't break
his ExtensionClass examples (there aren't any test cases for those
examples and abstract_get_bases() was added by him in response to
problems with ExtensionClass). Neil, please add test cases if
possible!
I belive this is a bug fix candidate for Python 2.2.2.
http://www.python.org/sf/444708
This adds the optional argument for str.strip
to unicode.strip too and makes it possible
to call str.strip with a unicode argument
and unicode.strip with a str argument.
+ Continued looping until n bytes in the buffer have been filled, not
just when n bytes have been read from the file. This repairs the
bug that f.readlines() only sucked up the first 8192 bytes of the file
on Windows when universal newlines was enabled and f was opened in
U mode (see Python-Dev -- this was the ultimate cause of the
test_inspect.py failure).
+ Changed prototye to take a char* buffer (void* doesn't make much sense).
+ Squashed size_t vs int mismatches (in particular, besides the unsigned
vs signed distinction, size_t may be larger than int).
+ Gets out under all error conditions now (it's possible for fread() to
suffer an error even if it returns a number larger than 0 -- any
"short read" is an error or EOF condition).
+ Rearranged and simplified declarations.
pointers is a signed type. Changing "allocated" to a signed int makes
undetected overflow more likely, but there was no overflow detection
before either.
PyFrame_FastToLocals() and PyFrame_LocalsToFast() had a return if
f_nlocals was 0. I think this was a holdover from the pre 2.1 days
when regular locals were the only kind of local variables.
The change makes it possible to use a free variable in eval or exec if
it the variable is also used elsewhere in the same block, which is
what the documentation says.
consistency checks, enabled only in a debug (Py_DEBUG) build. Note that
this never gets called automatically unless PYMALLOC_DEBUG is #define'd
too, and the envar PYTHONMALLOCSTATS exists.
Change type_get_doc (the get function for __doc__) to look in tp_dict
more often, and if it finds a descriptor in tp_dict, to call it (with
a NULL instance). This means you can add a __doc__ descriptor to a
new-style class that returns instance docs when called on an instance,
and class docs when called on a class -- or the same docs in either
case, but lazily computed.
I'll also check this into the 2.2 maintenance branch.
PyNumber_InPlaceMultiply insisted on calling sq_inplace_repeat if it
existed, even if nb_inplace_multiply also existed and the arguments
weren't right for sq_inplace_repeat. Change this to only use
sq_inplace_repeat if nb_inplace_multiply isn't defined.
Bugfix candidate.
Add a method zfill to str, unicode and UserString and change
Lib/string.py accordingly.
This activates the zfill version in unicodeobject.c that was
commented out and implements the same in stringobject.c. It also
adds the test for unicode support in Lib/string.py back in and
uses repr() instead() of str() (as it was before Lib/string.py 1.62)
Complex numbers implement divmod() and //, neither of which makes one
lick of sense. Unfortunately this is documented, so I'm adding a
deprecation warning now, so we can delete this silliness, oh, around
2005 or so.
Bugfix candidate (At least for 2.2.2, I think.)
Highlights: import and friends will understand any of \r, \n and \r\n
as end of line. Python file input will do the same if you use mode 'U'.
Everything can be disabled by configuring with --without-universal-newlines.
See PEP278 for details.
Added code to call this when PYMALLOC_DEBUG is enabled, and envar
PYTHONMALLOCSTATS is set, whenever a new arena is obtained and once
late in the Python shutdown process.
Put a bound on the number of frameobjects that can live in the
frameobject free_list.
Am also backporting to 2.2. I don't intend to backport to 2.1 (too
much work -- lots of cyclic structures leak there, and the GC API).
Add optional arg to string methods strip(), lstrip(), rstrip().
The optional arg specifies characters to delete.
Also for UserString.
Still to do:
- Misc/NEWS
- LaTeX docs (I did the docstrings though)
- Unicode methods, and Unicode support in the string methods.
_PyObject_DebugMalloc: explicitly cast PyObject_Malloc's result to the
target pointer type.
_PyObject_DebugDumpStats: change decl of arena_alignment from unsigned
int to unsigned long.
This is for the 2.3 release only (it's new code).
most of the work. In particular, if the underlying realloc is able to
grow the memory block in place, great (this routine used to do a fresh
malloc + memcpy every time a block grew). BTW, I'm not so keen here on
avoiding possible quadratic-time realloc patterns as I am on making
the debug pymalloc more invisible (the more it uses memory "just like"
the underlying allocator, the better the chance that a suspected memory
corruption bug won't vanish when the debug malloc is turned on).
PyMem_{Del, DEL} doesn't work yet (compilation problems).
pyport.h: _PyMem_EXTRA is gone.
pmem.h: Repaired comments. PyMem_{Malloc, MALLOC} and
PyMem_{Realloc, REALLOC} now make the same x-platform guarantees when
asking for 0 bytes, and when passing a NULL pointer to the latter.
object.c: PyMem_{Malloc, Realloc} just call their macro versions
now, since the latter take care of the x-platform 0 and NULL stuff
by themselves now.
pypcre.c, grow_stack(): So sue me. On two lines, this called
PyMem_RESIZE to grow a "const" area. It's not legit to realloc a
const area, so the compiler warned given the new expansion of
PyMem_RESIZE. It would have gotten the same warning before if it
had used PyMem_Resize() instead; the older macro version, but not the
function version, silently cast away the constness. IMO that was a wrong
thing to do, and the docs say the macro versions of PyMem_xyz are
deprecated anyway. If somebody else is resizing const areas with the
macro spelling, they'll get a warning when they recompile now too.
The bug report pointed out a bogosity in the comment block explaining
thread safety for arena management. Repaired that comment, repaired a
couple others while I was at it, and added an assert.
_PyMalloc_DebugRealloc: If this needed to get more memory, but couldn't,
it erroneously freed the original memory. Repaired that.
This is for 2.3 only (unless we decide to backport the new pymalloc).
open_the_file: Some (not all) flavors of Windows set errno to EINVAL
when passed a syntactically invalid filename. Python turned that into an
incomprehensible complaint about the mode string. Fixed by special-casing
MSVC.
when PyType_Ready() was called, if ob_type was found to be NULL, it
was always set to &PyType_Type; now it is set to base->ob_type,
where base is tp_base, defaulting to &PyObject_Type.
- PyType_Ready() accidentally did not inherit tp_is_gc; now it does.
Bugfix candidate.
runtime multiplications and divisions, via the scheme developed with
Vladimir Marangozov on Python-Dev. The pool_header struct loses its
capacity member, but gains nextoffset and maxnextoffset members; this
still leaves it at 32 bytes on a 32-bit box (it has to be padded to a
multiple of 8 bytes).
speeds up __getitem__ and __setitem__ in subclasses of built-in
sequences.
It's much revised because I took the opportunity to refactor the code
somewhat (moving a large section of duplicated code to a helper
function) and added comments to a series of functions.
what these do given a 0 size argument. This is so that when pymalloc
is enabled, we don't need to wrap pymalloc calls in goofy little
routines special-casing 0. Note that it's virtually impossible to meet
the doc's promise that malloc(0) will never return NULL; this makes a
best effort, but not an insane effort. The code does promise that
realloc(not-NULL, 0) will never return NULL (malloc(0) is much harder).
_PyMalloc_Realloc: Changed to take over all requests for 0 bytes, and
rearranged to be a little quicker in expected cases.
All over the place: when resorting to the platform allocator, call
free/malloc/realloc directly, without indirecting thru macros. This
should avoid needing a nightmarish pile of #ifdef-ery if PYMALLOC_DEBUG
is changed so that pymalloc takes over all Py(Mem, Object} memory
operations (which would add useful debugging info to PyMem_xyz
allocations too).
PEP 285. Everything described in the PEP is here, and there is even
some documentation. I had to fix 12 unit tests; all but one of these
were printing Boolean outcomes that changed from 0/1 to False/True.
(The exception is test_unicode.py, which did a type(x) == type(y)
style comparison. I could've fixed that with a single line using
issubtype(x, type(y)), but instead chose to be explicit about those
places where a bool is expected.
Still to do: perhaps more documentation; change standard library
modules to return False/True from predicates.
possible pool states. I think it's much clearer now.
Added a new long overdue block-management overview comment block.
I believe the comments are in good shape now.
Added two comments about possible small optimizations (one getting rid
of runtime multiplications at the cost of a new pool_header member; the
other getting rid of runtime divisions and the pool_header capacity
member, at the cost of a static const vector of 32 uints).
This displays stats about the # of arenas, pools, blocks and bytes, to
stderr, both used and reserved but unused.
CAUTION: Because PYMALLOC_DEBUG is on, the debug malloc routine adds
16 bytes to each request. This makes each block appear two size classes
higher than it would be if PYMALLOC_DEBUG weren't on.
So far, playing with this confirms the obvious: there's a lot of activity
in the "small dict" size class, but nothing in the core makes any use of
the 8-byte or 16-byte classes.
the code so that the most frequent cases come first. Added comments.
Found a hidden assumption that a pool contains room for at least two
blocks, and added an assert to catch a violation if it ever happens in
a place where that matters. Gave the normal "I allocated this block"
case a longer basic block to work with before it has to do its first
branch (via breaking apart an embedded assignment in an "if", and
hoisting common code out of both branches).
address obtained from system malloc/realloc without holding the GIL.
When the vector of arena base addresses has to grow, the old vector is
deliberately leaked. This makes "stale" x-thread references safe.
arenas and narenas are also declared volatile, and changed in an order
that prevents a thread from picking up a value of narenas too large
for the value of arenas it sees.
Added more asserts.
Fixed an old inaccurate comment.
Added a comment explaining why it's safe to call pymalloc free/realloc
with an address obtained from system malloc/realloc even when arenas is
still NULL (this is obscure, since the ADDRESS_IN_RANGE macro
appears <wink> to index into arenas).
this. But added an overflow check just in case there is.
Got rid of the ushort macro. It wasn't used anymore (it was only used
in the no-longer-exists off_t macro), and there's no plausible use for it.
waste the first pool if malloc happens to return a pool-aligned address.
This means the number of pools per arena can now vary by 1. Unfortunately,
the code counted up from 0 to a presumed constant number of pools. So
changed the increasing "watermark" counter to a decreasing "nfreepools"
counter instead, and fiddled various stuff accordingly. This also allowed
getting rid of two more macros.
Also changed the code to align the first address to a pool boundary
instead of a page boundary. These are two parallel sets of macro #defines
that happen to be identical now, but the page macros are in theory more
restrictive (bigger), and there's simply no reason I can see that it
wasn't aligning to the less restrictive pool size all along (the code
only relies on pool alignment).
Hmm. The "page size" macros aren't used for anything *except* defining
the pool size macros, and the comments claim the latter isn't necessary.
So this has the feel of a layer of indirection that doesn't serve a
purpose; should probably get rid of the page macros now.
are called without the GIL. It's incredibly unlikely to fail, but I can't
make this bulletproof without either adding a lock for exclusion, or
giving up on growing the arena base-address vector (it would be safe if
this were a static array).
+ A new scheme for determining whether an address belongs to a pymalloc
arena. This should be 100% reliable. The poolp->pooladdr and
poolp->magic members are gone. A new poolp->arenaindex member takes
their place. Note that the pool header overhead doesn't actually
shrink, though, since the header is padded to a multiple of 8 bytes.
+ _PyMalloc_Free and _PyMalloc_Realloc should now be safe to call for
any legit address, whether obtained from a _PyMalloc function or from
the system malloc/realloc. It should even be safe to call
_PyMalloc_Free when *not* holding the GIL, provided that the passed-in
address was obtained from system malloc/realloc. Since this is
accomplished without any locks, you better believe the code is subtle.
I hope it's sufficiently commented.
+ The above implies we don't need the new PyMalloc_{New, NewVar, Del}
API anymore, and could switch back to PyObject_XXX without breaking
existing code mixing PyObject_XXX with PyMem_{Del, DEL, Free, FREE}.
Nothing is done here about that yet, and I'd like to see this new
code exercised more first.
+ The small object threshhold is boosted to 256 (the max). We should
play with that some more, but the old 64 was way too small for 2.3.
+ Getting a new arena is now done via new function new_arena().
+ Removed some unused macros, and squashed out some macros that were
used only once to define other macros.
+ Arenas are no longer linked together. A new vector of arena base
addresses had to be created anyway to make address classification
bulletproof.
+ A lot of the patch size is an illusion: given the way address
classification works now, it was more convenient to switch the
sense of the prime "if" tests in the realloc and free functions,
so the "if" and "else" blocks got swapped.
+ Assorted minor code, comment and whitespace cleanup.
Back to the Windows installer <wink>.
The fix makes it possible to call PyObject_GC_UnTrack() more than once
on the same object, and then move the PyObject_GC_UnTrack() call to
*before* the trashcan code is invoked.
BUGFIX CANDIDATE!
descriptor, as used for the tp_methods slot of a type. These new flag
bits are both optional, and mutually exclusive. Most methods will not
use either. These flags are used to create special method types which
exist in the same namespace as normal methods without having to use
tedious construction code to insert the new special method objects in
the type's tp_dict after PyType_Ready() has been called.
If METH_CLASS is specified, the method will represent a class method
like that returned by the classmethod() built-in.
If METH_STATIC is specified, the method will represent a static method
like that returned by the staticmethod() built-in.
These flags may not be used in the PyMethodDef table for modules since
these special method types are not meaningful in that case; a
ValueError will be raised if these flags are found in that context.
Assorted: bump the serial number via a trivial new bumpserialno()
function. The point is to give a single place to set a breakpoint when
waiting for a specific serial number.
of get_line. This makes test_bufio finish in 1.7 seconds instead of 57
seconds on my machine (with Py_DEBUG defined).
Also, rename the local variables n1 and n2 to used_v_size and
total_v_size.
When WITH_PYMALLOC is defined, define PYMALLOC_DEBUG to enable the debug
allocator. This can be done independent of build type (release or debug).
A debug build automatically defines PYMALLOC_DEBUG when pymalloc is
enabled. It's a detected error to define PYMALLOC_DEBUG when pymalloc
isn't enabled.
Two debugging entry points defined only under PYMALLOC_DEBUG:
+ _PyMalloc_DebugCheckAddress(const void *p) can be used (e.g., from gdb)
to sanity-check a memory block obtained from pymalloc. It sprays
info to stderr (see next) and dies via Py_FatalError if the block is
detectably damaged.
+ _PyMalloc_DebugDumpAddress(const void *p) can be used to spray info
about a debug memory block to stderr.
A tiny start at implementing "API family" checks isn't good for
anything yet.
_PyMalloc_DebugRealloc() has been optimized to do little when the new
size is <= old size. However, if the new size is larger, it really
can't call the underlying realloc() routine without either violating its
contract, or knowing something non-trivial about how the underlying
realloc() works. A memcpy is always done in this case.
This was a disaster for (and only) one of the std tests: test_bufio
creates single text file lines up to a million characters long. On
Windows, fileobject.c's get_line() uses the horridly funky
getline_via_fgets(), which keeps growing and growing a string object
hoping to find a newline. It grew the string object 1000 bytes each
time, so for a million-character string it took approximately forever
(I gave up after a few minutes).
So, also:
fileobject.c, getline_via_fgets(): When a single line is outrageously
long, grow the string object at a mildly exponential rate, instead of
just 1000 bytes at a time.
That's enough so that a debug-build test_bufio finishes in about 5 seconds
on my Win98SE box. I'm curious to try this on Win2K, because it has very
different memory behavior than Win9X, and test_bufio always took a factor
of 10 longer to complete on Win2K. It *could* be that the endless
reallocs were simply killing it on Win2K even in the release build.
Also move all _PyMalloc_XXX entry points into obmalloc.c.
The Windows build works fine.
The Unix build is changed here (Makefile.pre.in), but not tested.
No other platform's build process has been fiddled.
Konrad was too kind. Not only did it raise an exception, the specific
exception it raised made no sense. These are old bugs in complex_pow()
and friends:
1. Raising 0 to a negative power isn't a range error, it's a domain
error, so changed c_pow() to set errno to EDOM in that case instead
of ERANGE.
2. Changed complex_pow() to:
A. Used the Py_ADJUST_ERANGE2 macro to try to clear errno of a spurious
ERANGE error due to underflow in the libm pow() called by c_pow().
B. Produced different exceptions depending on the errno value:
i) For errno==EDOM, raise ZeroDivisionError instead of ValueError.
This is for consistency with the non-complex cases 0.0**-2 and
0**-2 and 0L**-2.
ii) For errno==ERANGE, raise OverflowError.
Bugfix candidate.
The proper fix is not quite what was submitted; it's really better to
take the class of the object passed rather than calling PyMethod_New
with NULL pointer args, because that can then cause other core dumps
later.
I also added a testcase for the fix to classmethods() in test_descr.py.
I've already applied this to the 2.2 branch.
As promised in my response to the bug report, I'm not really fixing
it; in fact, one could argule over what the proper fix should do.
Instead, I'm adding a little magic that raises TypeError if you try to
pickle an instance of a class that has __slots__ but doesn't define or
override __getstate__. This is done by adding a bozo __getstate__
that always raises TypeError.
There were several places that assumed the md_dict field was always
set, but it needn't be. Fixed these to be more careful.
I changed PyModule_GetDict() to initialize md_dict to a new dictionary
if it's NULL.
Bugfix candidate.
and (b) stop trying to prevent file growth.
Beef up the file.truncate() docs.
Change test_largefile.py to stop assuming that f.truncate() moves the
file pointer to the truncation point, and to verify instead that it leaves
the file position alone. Remove the test for what happens when a
specified size exceeds the original file size (it's ill-defined, according
to the Single Unix Spec).
dropping MS's inadequate _chsize() function. This was inspired by
SF patch 498109 ("fileobject truncate support for win32"), which I
rejected.
libstdtypes.tex: Someone who knows should update the availability
blurb. For example, if it's available on Linux, it would be good to
say so.
test_largefile: Uncommented the file.truncate() tests, and reworked to
do more. The old comment about "permission errors" in the truncation
tests under Windows was almost certainly due to that the file wasn't open
for *write* access at this point, so of course MS wouldn't let you
truncate it. I'd be appalled if a Unixish system did.
CAUTION: Someone should run this test on Linux (etc) too. The
truncation part was commented out before. Note that test_largefile isn't
run by default.
Adapter from SF patch 528038; fixes SF bug 527816.
The wrapper for __nonzero__ should be wrap_inquiry rather than
wrap_unaryfunc, since the slot returns an int, not a PyObject *.
Another year in the quest to out-guess random C behavior.
Added macros Py_ADJUST_ERANGE1(X) and Py_ADJUST_ERANGE2(X, Y). The latter
is useful for functions with complex results. Two corrections to errno-
after-libm-call are attempted:
1. If the platform set errno to ERANGE due to underflow, clear errno.
Some unknown subset of libm versions and link options do this. It's
allowed by C89, but I never figured anyone would do it.
2. If the platform did not set errno but overflow occurred, force
errno to ERANGE. C89 required setting errno to ERANGE, but C99
doesn't. Some unknown subset of libm versions and link options do
it the C99 way now.
Bugfix candidate, but hold off until some Linux people actually try it,
with and without -lieee. I'll send a help plea to Python-Dev.
PyNumber_Add() tries the nb_add slot first, then falls back to
sq_concat. However, tt didn't check the return value of sq_concat.
If sq_concat returns NotImplemented, raise the standard TypeError.
[ 526072 ] pickling os.stat results round II
structseq's constructors can now take "invisible" fields in a dict.
Gave the constructors better error messages.
their __reduce__ method puts these fields in a dict.
(this is all in aid of getting os.stat_result's to pickle portably)
Also fixes
[ 526039 ] devious code can crash structseqs
Thought needed about how much of this counts as a bugfix. Certainly
#526039 needs to be fixed.
platform realloc(p, 0) returns NULL, so MALLOC_ZERO_RETURNS_NULL can
be correctly undefined yet realloc(p, 0) can return NULL anyway.
Prevent realloc(p, 0) doing free(p) and returning NULL via a different
hack. Would probably be better to get rid of MALLOC_ZERO_RETURNS_NULL
entirely.
Bugfix candidate.
Due to the bizarre definition of _PyLong_Copy(), creating an instance
of a subclass of long with a negative value could cause core dumps
later on. Unfortunately it looks like the behavior of _PyLong_Copy()
is quite intentional, so the fix is more work than feels comfortable.
This fix is almost, but not quite, the code that Naofumi Honda added;
in addition, I added a test case.
Objects/
fileobject.c
stringobject.c
unicodeobject.c
This commit doesn't include the cleanup patches for stringobject.c and
unicodeobject.c which are shown separately in the patch manager. Those
patches will be regenerated and applied in a subsequent commit, so as
to preserve a fallback position (this commit to those files).
Fix for the UTF-8 decoder: it will now accept isolated surrogates
(previously it raised an exception which causes round-trips to
fail).
Added new tests for UTF-8 round-trip safety (we rely on UTF-8 for
marshalling Unicode objects, so we better make sure it works for
all Unicode code points, including isolated surrogates).
Bumped the PYC magic in a non-standard way -- please review. This
was needed because the old PYC format used illegal UTF-8 sequences
for isolated high surrogates which now raise an exception.
NULL, so that you can call PyType_Ready() to initialize a type that
is to be separately compiled with C on Windows.
inherit_special(): Add a long comment explaining that you have to set
tp_new if your base class is PyBaseObject_Type.
Fix for SF bug #492345. (I could've sworn I checked this in, but
apparently I didn't!)
This code:
class Classic:
pass
class New(Classic):
__metaclass__ = type
attempts to create a new-style class with only classic bases -- but it
doesn't work right. Attempts to fix it so it works caused problems
elsewhere, so I'm now raising a TypeError in this case.
delivered bizarre results. Check float_divmod for a Py_NotImplemented
return and pass it along (instead of treating Py_NotImplemented as a
2-tuple).
CONVERT_TO_DOUBLE: Added comments; this macro is obscure.
PyDict_UpdateFromSeq2(): removed it.
PyDict_MergeFromSeq2(): made it public and documented it.
PyDict_Merge() docs: updated to reveal <wink> that the second
argument can be any mapping object.
no get function was defined, the property's doc string was
inaccessible. This was because the test for prop_get was made
*before* the test for a NULL/None object argument.
Also changed the property class defined in Python in a comment to test
for NULL to decide between get and delete; this makes it less Python
but then, assigning None to a property doesn't delete it!
PyString_FromString():
Since the length of the string is already being stored in size,
changed the strcpy() to a memcpy() for a small speed improvement.
out the for loop at the end intended to zero out new items wasn't
doing anything, because sv->ob_size was already equal to newsize. The
fix slightly refactors the function, introducing a variable oldsize
and doing away with sizediff (which was used only once), and using
oldsize and newsize consistently. I also added comments explaining
what the two for loops do. (Looking at the CVS annotation of this
function, it's no miracle a bug crept in -- this has been patched by
many different folks! :-)
This is best reproduced by
while 1:
class U(unicode):
pass
U(u"xxxxxx")
The unicode_dealloc() code wasn't properly freeing the str and defenc
fields of the Unicode object when freeing a subtype instance. Fixed
this by a subtle refactoring that actually reduces the amount of code
slightly.
PyCell_Set() incremenets the reference count, so the earlier XINCREF
causes a leak.
Also make a number of small performance improvements to the code on
the assumption that most of the time variables are not rebound across
a FastToLocals() / LocalsToFast() pair.
Replace uses of PyCell_Set() and PyCell_Get() with PyCell_SET() and
PyCell_GET(), since the frame is guaranteed to contain cells.
Add a missing DECREF in an obscure corner. If the str() or repr() of
an object passed to a string interpolation -- e.g. "%s" % obj --
returns a non-string, the returned object was leaked.
Repair an indentation glitch.
Replace a bunch of PyString_AsString() calls (and their ilk) with
macros.
It was easier than I thought, assuming that no other things contribute
to the instance size besides slots -- a pretty good bet. With a test
suite, no less!
happy if one could delete the __dict__ attribute of an instance. I
love to make Jim happy, so here goes...
- New-style objects now support deleting their __dict__. This is for
all intents and purposes equivalent to assigning a brand new empty
dictionary, but saves space if the object is not used further.
int_mul(): new and vastly simpler overflow checking. Whether it's
faster or slower will likely vary across platforms, favoring boxes
with fast floating point. OTOH, we no longer have to worry about
people shipping broken LONG_BIT definitions <0.9 wink>.
There's now a new structmember code, T_OBJECT_EX, which is used for
all __slot__ variables (except __weakref__, which has special behavior
anyway). This new code raises AttributeError when the variable is
NULL rather than converting NULL to None.
string object (or a Unicode that's trivially converted to ASCII).
PyObject_GetAttr(): add an 'else' to the Unicode test like
PyObject_SetAttr() already has.
Rather than tweaking the inheritance of type object slots (which turns
out to be too messy to try), this fix adds a __hash__ to the list and
dict types (the only mutable types I'm aware of) that explicitly
raises an error. This has the advantage that list.__hash__([]) also
raises an error (previously, this would invoke object.__hash__([]),
returning the argument's address); ditto for dict.__hash__.
The disadvantage for this fix is that 3rd party mutable types aren't
automatically fixed. This should be added to the rules for creating
subclassable extension types: if you don't want your object to be
hashable, add a tp_hash function that raises an exception.
Also, it's possible that I've forgotten about other mutable types for
which this should be done.
SF patch #480716 by Greg Chapman fixes the problem that super's
__get__ method always returns an instance of super, even when the
instance whose __get__ method is called is an instance of a subclass
of super.
Other issues fixed:
- super(C, C()).__class__ would return the __class__ attribute of C()
rather than the __class__ attribute of the super object. This is
confusing. To fix this, I decided to change the semantics of super
so that it only applies to code attributes, not to data attributes.
After all, overriding data attributes is not supported anyway.
- While super(C, x) carefully checked that x is an instance of C,
super(C).__get__(x) made no such check, allowing for a loophole.
This is now fixed.
slot_tp_descr_set(): When deleting an attribute described by a
descriptor implemented in Python, the descriptor's __del__ method is
called by the slot_tp_descr_set dispatch function. This is bogus --
__del__ already has a different meaning. Renaming this use of __del__
is renamed to __delete__.
Bugfix candidate.
int_repr(): we've never had a buffer big enough to hold the largest
possible result on a 64-bit box. Now that we're using snprintf instead
of sprintf, this can lead to nonsense results instead of random stack
corruption.
pass the buffer length. Stop using it. It should be deprecated, but too
late in the release cycle to do that now.
New static format_float() does the same thing but requires passing the
buffer length too. Use it instead.
const char* instead of char*. The change is conceptually correct, and
indirectly fixes a compiler wng introduced when somebody else innocently
passed a const char* to this function.
sprintf() to PyOS_snprintf() for buffer overrun avoidance.
complex_print(), complex_repr(), complex_str(): Call complex_to_buf()
passing in sizeof(buf).
confusing error messages. If a new-style class has no sequence or
mapping behavior, attempting to use the indexing notation with a
non-integer key would complain that the sequence index must be an
integer, rather than complaining that the operation is not supported.
of multiple inheritance from a mix of new- and classic-style classes.
This is his patch, plus a start at some test cases from me. Will check
in more, plus a NEWS blurb, later tonight.
object, so the "Metroworks only" section should not decref it in case
of error (the caller is responsible for decref'ing in case of error --
and does).
helping for types that defined tp_richcmp but not tp_compare, although
that's when it's most valuable, and strings moved into that category
since the fast path was first introduced. Now it helps for same-type
non-Instance objects that define rich or 3-way compares.
For all the edits here, the rest just amounts to moving the fast path from
do_richcmp into PyObject_RichCompare, saving a layer of function call
(measurable on my box!). This loses when NESTING_LIMIT is exceeded, but I
don't care about that (fast-paths are for normal cases, not pathologies).
Also added a tasteful <wink> label to get out of PyObject_RichCompare, as
the if/else nesting in this routine was getting incomprehensible.
Try to ensure that divmod(-0.0, 1.0) -> (-0.0, +0.0) across platforms.
It always did on Windows, and still does. It didn't on Linux. Alas,
there's no platform-independent way to write a test case for this.
Bugfix candidate.
presence of NaNs. So pass the issue on to the platform libm fabs();
after all, fabs() is a std C function because you can't implement it
correctly in portable C89.
should just avoid calling it in the first place to avoid waiting for a repr
of a large object like a dict or list. The result of PyObject_Repr() was
being leaked as well.
Bugfix candidate!
XXX Remaining problems:
- The GC module doesn't know about these; I think it has its reasons
to disallow calling __del__, but for now, __del__ on new-style
objects is called when the GC module discards an object, for better
or for worse.
- The code to call a __del__ handler is really ridiculously
complicated, due to all the different debug #ifdefs. I've copied
this from the similar code in classobject.c, so I'm pretty sure I
did it right, but it's not pretty. :-(
- No tests yet.
object.h: Added PyType_CheckExact macro.
typeobject.c, type_new():
+ Use the new macro.
+ Assert that the arguments have the right types rather than do incomplete
runtime checks "sometimes".
+ If this isn't the 1-argument flavor() of type, and there aren't 3 args
total, produce a "types() takes 1 or 3 args" msg before
PyArg_ParseTupleAndKeywords produces a "takes exactly 3" msg.
the va_list until we are sure we have a format string and need to use it;
this avoid premature initialization and having to finalize it several
different places because of error returns.
and functions: we only need to call PyObject_ClearWeakRefs() if the weakref
list is non-NULL. Since these objects are common but weakrefs are still
unusual, saving the call at deallocation time makes a lot of sense.
PyObject_CallFunctionObArgs() and PyObject_CallMethodObArgs() have the
advantage that no format strings need to be parsed. The CallMethod
variant also avoids creating a new string object in order to retrieve
a method from an object as well.
outer level, the iterator protocol is used for memory-efficiency (the
outer sequence may be very large if fully materialized); at the inner
level, PySequence_Fast() is used for time-efficiency (these should
always be sequences of length 2).
dictobject.c, new functions PyDict_{Merge,Update}FromSeq2. These are
wholly analogous to PyDict_{Merge,Update}, but process a sequence-of-2-
sequences argument instead of a mapping object. For now, I left these
functions file static, so no corresponding doc changes. It's tempting
to change dict.update() to allow a sequence-of-2-seqs argument too.
Also changed the name of dictionary's keyword argument from "mapping"
to "x". Got a better name? "mapping_or_sequence_of_pairs" isn't
attractive, although more so than "mosop" <wink>.
abstract.h, abstract.tex: Added new PySequence_Fast_GET_SIZE function,
much faster than going thru the all-purpose PySequence_Size.
libfuncs.tex:
- Document dictionary().
- Fiddle tuple() and list() to admit that their argument is optional.
- The long-winded repetitions of "a sequence, a container that supports
iteration, or an iterator object" is getting to be a PITA. Many
months ago I suggested factoring this out into "iterable object",
where the definition of that could include being explicit about
generators too (as is, I'm not sure a reader outside of PythonLabs
could guess that "an iterator object" includes a generator call).
- Please check my curly braces -- I'm going blind <0.9 wink>.
abstract.c, PySequence_Tuple(): When PyObject_GetIter() fails, leave
its error msg alone now (the msg it produces has improved since
PySequence_Tuple was generalized to accept iterable objects, and
PySequence_Tuple was also stomping on the msg in cases it shouldn't
have even before PyObject_GetIter grew a better msg).
of the if block where it was before. The name is only used inside
that if block, but the storage is referenced outside it via the 's'
variable.
(This patch was part of SF patch #474590 -- RISC OS support.)
The C-code in fileobject.readinto(buffer) which parses
the arguments assumes that size_t is interchangeable
with int:
size_t ntodo, ndone, nnow;
if (f->f_fp == NULL)
return err_closed();
if (!PyArg_Parse(args, "w#", &ptr, &ntodo))
return NULL;
This causes a problem on Alpha / Tru64 / OSF1 v5.1
where size_t is a long and sizeof(long) != sizeof(int).
The patch I'm proposing declares ntodo as an int. An
alternative might be to redefine w# to expect size_t.
[We can't change w# because there are probably third party modules
relying on it. GvR]
response to a message by Laura Creighton on c.l.py. E.g.
>>> 0+''
TypeError: unsupported operand types for +: 'int' and 'str'
(previously this did not mention the operand types)
>>> ''+0
TypeError: cannot concatenate 'str' and 'int' objects
There really isn't a good reason for instance method objects to have
their own __dict__, __doc__ and __name__ properties that just delegate
the request to the function (callable); the default attribute behavior
already does this.
The test suite had to be fixed because the error changes from
TypeError to AttributeError.
'slotdef' structure typedef and 'struct wrapperbase'. By adding the
wrapper docstrings to the slotdef structure, the slotdefs array can
serve as the data structure that drives add_operators(); the wrapper
descriptor contains a pointer to slotdef structure. This replaces
lots of custom code from add_operators() by a loop over the slotdefs
array, and does away with all the tab_xxx tables.
This patch implements what we have discussed on python-dev late in
September: str(obj) and unicode(obj) should behave similar, while
the old behaviour is retained for unicode(obj, encoding, errors).
The patch also adds a new feature with which objects can provide
unicode(obj) with input data: the __unicode__ method. Currently no
new tp_unicode slot is implemented; this is left as option for the
future.
Note that PyUnicode_FromEncodedObject() no longer accepts Unicode
objects as input. The API name already suggests that Unicode
objects do not belong in the list of acceptable objects and the
functionality was only needed because
PyUnicode_FromEncodedObject() was being used directly by
unicode(). The latter was changed in the discussed way:
* unicode(obj) calls PyObject_Unicode()
* unicode(obj, encoding, errors) calls PyUnicode_FromEncodedObject()
One thing left open to discussion is whether to leave the
PyUnicode_FromObject() API as a thin API extension on top of
PyUnicode_FromEncodedObject() or to turn it into a (macro) alias
for PyObject_Unicode() and deprecate it. Doing so would have some
surprising consequences though, e.g. u"abc" + 123 would turn out
as u"abc123"...
[Marc-Andre didn't have time to check this in before the deadline. I
hope this is OK, Marc-Andre! You can still make changes and commit
them on the trunk after the branch has been made, but then please mail
Barry a context diff if you want the change to be merged into the
2.2b1 release branch. GvR]
the left-hand operand may not be the proxy in all cases. If it isn't,
we end up doing two things: a) unwrapping something that isn't a
PyWeakReference (later resulting in a core dump) and b) passing a
proxy as the right-hand operand anyway, even though that can't be
handled by the actual handler (maybe eventually causing a core dump).
This is fixed by always unwrapping all the proxies involved before
passing anything to the actual handler.
isinstance() now allows any object as the first argument and a class, a
type or something with a __bases__ tuple attribute for the second
argument. This closes SF patch #464992.
object.c, PyObject_Str: Don't try to optimize anything except exact
string objects here; in particular, let str subclasses go thru tp_str,
same as non-str objects. This allows overrides of tp_str to take
effect.
stringobject.c:
+ string_print (str's tp_print): If the argument isn't an exact string
object, get one from PyObject_Str.
+ string_str (str's tp_str): Make a genuine-string copy of the object if
it's of a proper str subclass type. str() applied to a str subclass
that doesn't override __str__ ends up here.
test_descr.py: New str_of_str_subclass() test.
efficient:
- recurse down subclasses only once rather than for each affected
slot;
- short-circuit recursing down subclasses when a subclass has its own
definition of the name that caused the update_slot() calls in the
first place;
- inline collect_ptrs().
using the same algorithm as the slot updates. The slotdefs array is
now sorted by slot offset and has an interned string object corresponding
to the name added to each item. More can be done but I need to commit
this first as a working intermediate stage.
The problem is that if fread() returns a short count, we attempt
another fread() the next time through the loop, and apparently glibc
clears or ignores the eof condition so the second fread() requires
another ^D to make it see the eof condition.
According to the man page (and the C std, I hope) fread() can only
return a short count on error or eof. I'm using that in the band-aid
solution to avoid calling fread() a second time after a short read.
Note that xreadlines() still has this problem: it calls
readlines(sizehint) until it gets a zero-length return. Since
xreadlines() is mostly used for reading real files, I won't worry
about this until we get a bug report.
inherit_slots(): tp_as_buffer was getting inherited as if it were a
method pointer, rather than a pointer to a vector of method pointers. As
a result, inheriting from a type that implemented buffer methods was
ineffective, leaving all the tp_as_buffer slots NULL in the subclass.
corresponding to a dispatch slot (e.g. __getitem__ or __add__) is set,
calculate the proper dispatch slot and propagate the change to all
subclasses. Because of multiple inheritance, there's no easy way to
avoid always recursing down the tree of subclasses. Who cares?
(There's more to do, but this works. There's also a test for this now.)
lseek(fp, 0L, SEEK_CUR) can make a filedescriptor unusable.
This workaround is expected to last only a few weeks (until GUSI
is fixed), but without it test_email fails.
the problem that slots weren't inherited properly. override_slots()
no longer exists; in its place comes fixup_slot_dispatchers() which
does more and different work and is table-based. (Eventually I want
this table also to replace all the little tab_foo tables.)
Also add a wrapper for __delslice__; this required a change in
test_descrtut.py.
without the Py_TPFLAGS_CHECKTYPES flag) in the wrappers. This
required a few changes in test_descr.py to cope with the fact that the
complex type has __int__, __long__ and __float__ methods that always
raise an exception.
is a list of weak references to types (new-style classes). Make this
accessible to Python as the function __subclasses__ which returns a
list of types -- we don't want Python programmers to be able to
manipulate the raw list.
In order to make this possible, I also had to add weak reference
support to type objects.
This will eventually be used together with a trap on attribute
assignment for dynamic classes for a major speed-up without losing the
dynamic properties of types: when a __foo__ method is added to a
class, the class and all its subclasses will get an appropriate tp_foo
slot function.
This simplifies the rounding in _PyObject_VAR_SIZE, allows to restore the
pre-rounding calling sequence, and allows some nice little simplifications
in its callers. I'm still making it return a size_t, though.
As Guido suggested, this makes the new subclassing code substantially
simpler. But the mechanics of doing it w/ C macro semantics are a mess,
and _PyObject_VAR_SIZE has a new calling sequence now.
Question: The PyObject_NEW_VAR macro appears to be part of the public API.
Regardless of what it expands to, the notion that it has to round up the
memory it allocates is new, and extensions containing the old
PyObject_NEW_VAR macro expansion (which was embedded in the
PyObject_NEW_VAR expansion) won't do this rounding. But the rounding
isn't actually *needed* except for new-style instances with dict pointers
after a variable-length blob of embedded data. So my guess is that we do
not need to bump the API version for this (as the rounding isn't needed
for anything an extension can do unless it's recompiled anyway). What's
your guess?
pad memory to properly align the __dict__ pointer in all cases.
gcmodule.c/objimpl.h, _PyObject_GC_Malloc:
+ Added a "padding" argument so that this flavor of malloc can allocate
enough bytes for alignment padding (it can't know this is needed, but
its callers do).
typeobject.c, PyType_GenericAlloc:
+ Allocated enough bytes to align the __dict__ pointer.
+ Sped and simplified the round-up-to-PTRSIZE logic.
+ Added blank lines so I could parse the if/else blocks <0.7 wink>.
+ Use the _PyObject_VAR_SIZE macro to compute object size.
+ Break the computation into lines convenient for debugger inspection.
+ Speed the round-up-to-pointer-size computation.
many types were subclassable but had a xxx_dealloc function that
called PyObject_DEL(self) directly instead of deferring to
self->ob_type->tp_free(self). It is permissible to set tp_free in the
type object directly to _PyObject_Del, for non-GC types, or to
_PyObject_GC_Del, for GC types. Still, PyObject_DEL was a tad faster,
so I'm fearing that our pystone rating is going down again. I'm not
sure if doing something like
void xxx_dealloc(PyObject *self)
{
if (PyXxxCheckExact(self))
PyObject_DEL(self);
else
self->ob_type->tp_free(self);
}
is any faster than always calling the else branch, so I haven't
attempted that -- however those types whose own dealloc is fancier
(int, float, unicode) do use this pattern.
For a dynamically constructed type object, fill in the tp_doc slot with
a copy of the argument dict's "__doc__" value, provided the latter exists
and is a string.
NOTE: I don't know what to do if it's a Unicode string, so in that case
tp_doc is left NULL (which shows up as Py_None if you do Class.__doc__).
Note that tp_doc holds a char*, not a general PyObject*.
test for getattribute==NULL was bogus because it always found
object.__getattribute__. Pick it apart using the trick we learned
from slot_sq_item, and if it's just a wrapper around
PyObject_GenericGetAttr, zap it. Also added a long XXX comment
explaining the consequences.
test dramatically:
class T(tuple): __dynamic__ = 1
t = T(range(1000))
for i in range(1000): tt = tuple(t)
The speedup was about 5x compared to the previous state of CVS (1.7
vs. 8.8, in arbitrary time units). But it's still more than twice as
slow as as the same test with __dynamic__ = 0 (0.8).
I'm not sure that I really want to go through the trouble of this kind
of speedup for every slot. Even doing it just for the most popular
slots will be a major effort (the new slot_sq_item is 40+ lines, while
the old one was one line with a powerful macro -- unfortunately the
speedup comes from expanding the macro and doing things in a way
specific to the slot signature).
An alternative that I'm currently considering is sketched in PLAN.txt:
trap setattr on type objects. But this will require keeping track of
all derived types using weak references.
pointing to a static variable to hold the object form of the string
was never used, causing endless calls to PyString_InternFromString().
One particular test (with lots of __getitem__ calls) became a third
faster with this!
Unknown whether this fixes it.
- stringobject.c, PyString_FromFormatV: don't assume that va_list is of
a type that can be copied via an initializer.
- errors.c, PyErr_Format: add a va_end() to balance the va_start().
instances).
Also added GC support to various auxiliary types: super, property,
descriptors, wrappers, dictproxy. (Only type objects have a tp_clear
field; the other types are.)
One change was necessary to the GC infrastructure. We have statically
allocated type objects that don't have a GC header (and can't easily
be given one) and heap-allocated type objects that do have a GC
header. Giving these different metatypes would be really ugly: I
tried, and I had to modify pickle.py, cPickle.c, copy.py, add a new
invent a new name for the new metatype and make it a built-in, change
affected tests... In short, a mess. So instead, we add a new type
slot tp_is_gc, which is a simple Boolean function that determines
whether a particular instance has GC headers or not. This slot is
only relevant for types that have the (new) GC flag bit set. If the
tp_is_gc slot is NULL (by far the most common case), all instances of
the type are deemed to have GC headers. This slot is called by the
PyObject_IS_GC() macro (which is only used twice, both times in
gcmodule.c).
I also changed the extern declarations for a bunch of GC-related
functions (_PyObject_GC_Del etc.): these always exist but objimpl.h
only declared them when WITH_CYCLE_GC was defined, but I needed to be
able to reference them without #ifdefs. (When WITH_CYCLE_GC is not
defined, they do the same as their non-GC counterparts anyway.)
- SLOT1BINFULL() macro: changed this to check for __rop__ overriding
__op__, like binary_op1() in abstract.c -- the latter only calls the
slot function once if both types use the same slot function, so the
slot function must make both calls -- which it already did for the
__op__, __rop__ order, but not yet for the __rop__, __op__ order
when B.__class__ is a subclass of A.__class__.
- slot_sq_contains(), slot_nb_nonzero(): use lookup_maybe() rather
than lookup_method() which sets an exception which we then clear.
- slot_nb_coerce(): don't give up when left argument's __coerce__
returns NotImplemented, but give the right argument a chance.
Generalize PyLong_AsLongLong to accept int arguments too. The real point
is so that PyArg_ParseTuple's 'L' code does too. That code was
undocumented (AFAICT), so documented it.
__rop__ now takes precendence over __op__. Those circumstances are:
- Both arguments are new-style classes
- Both arguments are new-style numbers
- Their implementation slots for tp_op differ
- Their types differ
- The right argument's type is a subtype of the left argument's type
Also did this for the ternary operator (pow) -- only the binary case
is dealt with properly though, since __rpow__ is not supported anyway.
their 'i' and 'r' variants) were not being generated if the
corresponding nb_ slots were present in the type object. I bet this
is because floor and true division were introduced after I last
looked at that part of the code.
- Made cls.__module__ writable.
- Ensure that obj.__dict__ is returned as {}, not None, even upon first
reference; it simply springs into life when you ask for it.
(*) The pickling support is provisional for the following reasons:
- It doesn't support classes with __slots__.
- It relies on additional support in copy_reg.py: the C method
__reduce__, defined in the object class, really calls calling
copy_reg._reduce(obj). Eventually the Python code in copy_reg.py
needs to be migrated to C, but I'd like to experiment with the
Python implementation first. The _reduce() code also relies on an
additional helper function, _reconstructor(), defined in
copy_reg.py; this should also be reimplemented in C.
than <type 'ClassName'>. Exception: if it's a built-in type or an
extension type, continue to call it <type 'ClassName>. Call me a
wimp, but I don't want to break more user code than necessary.
same. I hope the test for structural equivalence is stringent enough.
It only allows the assignment if the old and new types:
- have the same basic size
- have the same item size
- have the same dict offset
- have the same weaklist offset
- have the same GC flag bit
- have a common base that is the same except for maybe the dict and
weaklist (which may have been added separately at the same offsets
in both types)
- property() now takes 4 keyword arguments: fget, fset, fdel, doc.
Note that the real purpose of the 'f' prefix is to make fdel fit in
('del' is a keyword, so can't used as a keyword argument name).
- These map to visible readonly attributes 'fget', 'fset', 'fdel',
and '__doc__' in the property object.
- fget/fset/fdel weren't discoverable from Python before.
- __doc__ is new, and allows to associate a docstring with a property.
- if __getattribute__ exists, it is called first;
if it doesn't exists, PyObject_GenericGetAttr is called first.
- if the above raises AttributeError, and __getattr__ exists,
it is called.
classes to __getattribute__, to make it crystal-clear that it doesn't
have the same semantics as overriding __getattr__ on classic classes.
This is a halfway checkin -- I'll proceed to add a __getattr__ hook
that works the way it works in classic classes.
no backwards compatibility to worry about, so I just pushed the
'closure' struct member to the back -- it's never used in the current
code base (I may eliminate it, but that's more work because the getter
and setter signatures would have to change.)
As examples, I added actual docstrings to the getset attributes of a
few types: file.closed, xxsubtype.spamdict.state.
compatibility, this required all places where an array of "struct
memberlist" structures was declared that is referenced from a type's
tp_members slot to change the type of the structure to PyMemberDef;
"struct memberlist" is now only used by old code that still calls
PyMember_Get/Set. The code in PyObject_GenericGetAttr/SetAttr now
calls the new APIs PyMember_GetOne/SetOne, which take a PyMemberDef
argument.
As examples, I added actual docstrings to the attributes of a few
types: file, complex, instance method, super, and xxsubtype.spamlist.
Also converted the symtable to new style getattr.
elements which are not Unicode objects or strings. (This matches
the string.join() behaviour.)
Fix a memory leak in the .join() method which occurs in case
the Unicode resize fails.
Restore the test_unicode output.
complex_coerce() would never be called with a complex argument,
because PyNumber_Coerce[Ex] doesn't bother calling the type's coercion
method if the values already have the same type. But now, of course,
it's possible to pass an instance of a complex *subtype*, and those
must be accepted.
hack, and it's even more disgusting than a PyInstance_Check() call.
If the tp_compare slot is the slot used for overrides in Python,
it's always called.
Add some tests that show what should work too.
only safely call a type's tp_compare slot if the second argument is
also an instance of the same type. I hate to think what
e.g. int_compare() would do with a second argument that's a float!
descriptors for each attribute. The getattr() implementation is
similar to PyObject_GenericGetAttr(), but delegates to im_self instead
of looking in __dict__; I couldn't do this as a wrapper around
PyObject_GenericGetAttr().
XXX A problem here is that this is a case of *delegation*. dir()
doesn't see exactly the same attributes that are actually defined;
e.g. if the delegate is a Python function object, it supports
attributes like func_code etc., but these are not visible to dir(); on
the other hand, dynamic function attributes (stored in the function's
__dict__) *are* visible to dir(). Maybe we need a mechanism to tell
dir() about the delegation mechanism? I vaguely recall seeing a
request in the newsgroup for a more formal definition of attribute
delegation too. Sigh, time for a new PEP.
and are lists, and then just the string elements (if any)).
There are good and bad reasons for this. The good reason is to support
dir() "like before" on objects of extension types that haven't migrated
to the class introspection API yet. The bad reason is that Python's own
method objects are such a type, and this is the quickest way to get their
im_self etc attrs to "show up" via dir(). It looks much messier to move
them to the new scheme, as their current getattr implementation presents
a view of their attrs that's a untion of their own attrs plus their
im_func's attrs. In particular, methodobject.__dict__ actually returns
methodobject.im_func.__dict__, and if that's important to preserve it
doesn't seem to fit the class introspection model at all.
Both int and long multiplication are changed to be more careful in
their assumptions about when one of the arguments is a sequence: the
assumption that at least one of the arguments must be an int (or long,
respectively) is still held, but the assumption that these don't smell
like sequences is no longer true: a subtype of int or long may well
have a sequence-repeat thingie!
NotImplemented when the lookup fails, and use this for binary
operators. Also lookup_maybe() which doesn't raise an exception when
the lookup fails (still returning NULL).
- Don't turn a non-tuple argument into a one-tuple. Rather, the
caller must pass a format that causes Py_VaBuildValue() to return a
tuple.
- Speed things up by calling PyObject_Call (which is fairly low-level
and straightforward) rather than PyObject_CallObject (which calls
PyEval_CallObjectWithKeywords which calls PyObject_Call, and nothing
is really done in the mean time except some tests for NULL args and
valid types, which are already guaranteed).
- Cosmetics.
Other places:
- Make sure that the format argument to call_method() is surrounded by
parentheses, so it will cause a tuple to be created.
- Replace a few calls to PyEval_CallObject() with a surefire tuple for
args to calls to PyObject_Call(). (A few calls to
PyEval_CallObject() remain that have NULL for args.)
directly, as the only thing done here (replace NULL args with an empty
tuple) is also done there.
XXX Maybe we should take one step further and equate the two at the
macro level? That's harder though because PyEval_Call* is declared in
a header that's not included standard. But it is silly that
PyObject_CallObject calls PyEval_CallObject which calls back to
PyObject_Call. Maybe PyEval_CallObject should be moved into this file
instead? All I know is that there are too many call APIs! The
differences between PyObject_Call and PyEval_CallObjectWithKeywords is
that the latter allows args to be NULL, and does explicit type checks
for args and kwds.
A surprising number of changes to split tp_new into tp_new and tp_init.
Turned out the older PyFile_FromFile() didn't initialize the memory it
allocated in all (error) cases, which caused new sanity asserts
elsewhere to fail left & right (and could have, e.g., caused file_dealloc
to try decrefing random addresses).
keys are true strings -- no subclasses need apply. This may be debatable.
The problem is that a str subclass may very well want to override __eq__
and/or __hash__ (see the new example of case-insensitive strings in
test_descr), but go-fast shortcuts for strings are ubiquitous in our dicts
(and subclass overrides aren't even looked for then). Another go-fast
reason for the change is that PyCheck_StringExact() is a quicker test
than PyCheck_String(), and we make such a test on virtually every access
to every dict.
OTOH, a str subclass may also be perfectly happy using the base str eq
and hash, and this change slows them a lot. But those cases are still
hypothetical, while Python's own reliance on true-string dicts is not.
just by doing type(f) where f is any file object. This left a hole in
restricted execution mode that rexec.py can't plug by itself (although it
can plug part of it; the rest is plugged in fileobject.c now).
on to the tp_new slot (if non-NULL), as well as to the tp_init slot (if
any). A sane type implementing both tp_new and tp_init should probably
pay attention to the arguments in only one of them.
with the same value instead. This ensures that a string (or string
subclass) object's ob_sinterned pointer is always a str (or NULL), and
that the dict of interned strings only has strs as keys.
+ These were leaving the hash fields at 0, which all string and unicode
routines believe is a legitimate hash code. As a result, hash() applied
to str and unicode subclass instances always returned 0, which in turn
confused dict operations, etc.
+ Changed local names "new"; no point to antagonizing C++ compilers.
subclasses, all "the usual" ones (slicing etc), plus replace, translate,
ljust, rjust, center and strip. I don't know how to be sure they've all
been caught.
Question: Should we complain if someone tries to intern an instance of
a string subclass? I hate to slow any code on those paths.
tuple(i) repaired to return a true tuple when i is an instance of a
tuple subclass.
Added PyTuple_CheckExact macro.
PySequence_Tuple(): if a tuple-like object isn't exactly a tuple, it's
not safe to return the object as-is -- make a new tuple of it instead.
Given an immutable type M, and an instance I of a subclass of M, the
constructor call M(I) was just returning I as-is; but it should return a
new instance of M. This fixes it for M in {int, long}. Strings, floats
and tuples remain to be done.
Added new macros PyInt_CheckExact and PyLong_CheckExact, to more easily
distinguish between "is" and "is a" (i.e., only an int passes
PyInt_CheckExact, while any sublass of int passes PyInt_Check).
Added private API function _PyLong_Copy.
Subtlety on Windows: if we change test_largefile.py to use a file
> 4GB, it still fails. A debug session suggests this is because
fseek(fp, 0, 2) refuses to seek to the end of the file when the file
is > 4GB, because it uses the SetFilePointer() in 32-bit mode.
But it only fails when we seek relative to the end of the file,
because in the other seek modes only calls to fgetpos() and fsetpos()
are made, which use Get/SetFilePointer() in 64-bit mode. Solution:
#ifdef MS_WInDOWS, replace the call to fseek(fp, ...) with a call to
_lseeki64(fileno(fp), ...). Make sure to call fflush(fp) first.
(XXX Could also replace the entire branch with a call to _lseeki64().
Would that be more efficient? Certainly less generated code.)
(XXX This needs more testing. I can't actually test that it works for
files >4GB on my Win98 machine, because the filesystem here won't let
me create files >=4GB at all. Tim should test this on his Win2K
machine.)
- use PyModule_Check() instead of PyObject_TypeCheck(), now we can.
- don't assert that the __dict__ gotten out of a module is always
a dictionary; check its type, and raise an exception if it's not.
iterable object. I'm not sure how that got overlooked before!
Got rid of the internal _PySequence_IterContains, introduced a new
internal _PySequence_IterSearch, and rewrote all the iteration-based
"count of", "index of", and "is the object in it or not?" routines to
just call the new function. I suppose it's slower this way, but the
code duplication was getting depressing.
the base classes is not a classic class, and its class (the metaclass)
is callable, call the metaclass to do the deed.
One effect of this is that, when mixing classic and new-style classes
amongst the bases of a class, it doesn't matter whether the first base
class is a classic class or not: you will always get the error
"TypeError: metatype conflict among bases". (Formerly, with a classic
class first, you'd get "TypeError: PyClass_New: base must be a class".)
Another effect is that multiple inheritance from ExtensionClass.Base,
with a classic class as the first class, transfers control to the
ExtensionClass.Base class. This is what we need for SF #443239 (and
also for running Zope under 2.2a4, before ExtensionClass is replaced).
corresponding "getitem" operation (sq_item or mp_subscript) is
implemented. I realize that "sequence-ness" and "mapping-ness" are
poorly defined (and the tests may still be wrong for user-defined
instances, which always have both slots filled), but I believe that a
sequence that doesn't support its getitem operation should not be
considered a sequence. All other operations are optional though.
For example, the ZODB BTree tests crashed because PySequence_Check()
returned true for a dictionary! (In 2.2, the dictionary type has a
tp_as_sequence pointer, but the only field filled is sq_contains, so
you can write "if key in dict".) With this fix, all standalone ZODB
tests succeed.
a->tp_mro. If a doesn't have class, it's considered a subclass only
of itself or of 'object'.
This one fix is enough to prevent the ExtensionClass test suite from
dumping core, but that doesn't say much (it's a rather small test
suite). Also note that for ExtensionClass-defined types, a different
subclass test may be needed. But I haven't checked whether
PyType_IsSubtype() is actually used in situations where this matters
-- probably it doesn't, since we also don't check for classic classes.
Curious: the MS docs say stati64 etc are supported even on Win95, but
Win95 doesn't support a filesystem that allows partitions > 2 Gb.
test_largefile: This was opening its test file in text mode. I have no
idea how that worked under Win64, but it sure needs binary mode on Win98.
BTW, on Win98 test_largefile runs quickly (under a second).
While not even documented, they were clearly part of the C API,
there's no great difficulty to support them, and it has the cool
effect of not requiring any changes to ExtensionClass.c.
requires that errno ever get set, and it looks like glibc is already
playing that game. New rules:
+ Never use HUGE_VAL. Use the new Py_HUGE_VAL instead.
+ Never believe errno. If overflow is the only thing you're interested in,
use the new Py_OVERFLOWED(x) macro. If you're interested in any libm
errors, use the new Py_SET_ERANGE_IF_OVERFLOW(x) macro, which attempts
to set errno the way C89 said it worked.
Unfortunately, none of these are reliable, but they work on Windows and I
*expect* under glibc too.
I believe this works on Linux (tested both on a system with large file
support and one without it), and it may work on Solaris 2.7.
The changes are twofold:
(1) The configure script now boldly tries to set the two symbols that
are recommended (for Solaris and Linux), and then tries a test
script that does some simple seeking without writing.
(2) The _portable_{fseek,ftell} functions are a little more systematic
in how they try the different large file support options: first
try fseeko/ftello, but only if off_t is large; then try
fseek64/ftell64; then try hacking with fgetpos/fsetpos.
I'm keeping my fingers crossed. The meaning of the
HAVE_LARGEFILE_SUPPORT macro is not at all clear.
I'll see if I can get it to work on Windows as well.
the fiddling is simply due to that no caller of PyLong_AsDouble ever
checked for failure (so that's fixing old bugs). PyLong_AsDouble is much
faster for big inputs now too, but that's more of a happy consequence
than a design goal.
but will be the foundation for Good Things:
+ Speed PyLong_AsDouble.
+ Give PyLong_AsDouble the ability to detect overflow.
+ Make true division of long/long nearly as accurate as possible (no
spurious infinities or NaNs).
+ Return non-insane results from math.log and math.log10 when passing a
long that can't be approximated by a double better than HUGE_VAL.
mapping object", in the same sense dict.update(x) requires of x (that x
has a keys() method and a getitem).
Questionable: The other type constructors accept a keyword argument, so I
did that here too (e.g., dictionary(mapping={1:2}) works). But type_call
doesn't pass the keyword args to the tp_new slot (it passes NULL), it only
passes them to the tp_init slot, so getting at them required adding a
tp_init slot to dicts. Looks like that makes the normal case (i.e., no
args at all) a little slower (the time it takes to call dict.tp_init and
have it figure out there's nothing to do).
PEP 238. Changes:
- add a new flag variable Py_DivisionWarningFlag, declared in
pydebug.h, defined in object.c, set in main.c, and used in
{int,long,float,complex}object.c. When this flag is set, the
classic division operator issues a DeprecationWarning message.
- add a new API PyRun_SimpleStringFlags() to match
PyRun_SimpleString(). The main() function calls this so that
commands run with -c can also benefit from -Dnew.
- While I was at it, I changed the usage message in main() somewhat:
alphabetized the options, split it in *four* parts to fit in under
512 bytes (not that I still believe this is necessary -- doc strings
elsewhere are much longer), and perhaps most visibly, don't display
the full list of options on each command line error. Instead, the
full list is only displayed when -h is used, and otherwise a brief
reminder of -h is displayed. When -h is used, write to stdout so
that you can do `python -h | more'.
Notes:
- I don't want to use the -W option to control whether the classic
division warning is issued or not, because the machinery to decide
whether to display the warning or not is very expensive (it involves
calling into the warnings.py module). You can use -Werror to turn
the warnings into exceptions though.
- The -Dnew option doesn't select future division for all of the
program -- only for the __main__ module. I don't know if I'll ever
change this -- it would require changes to the .pyc file magic
number to do it right, and a more global notion of compiler flags.
- You can usefully combine -Dwarn and -Dnew: this gives the __main__
module new division, and warns about classic division everywhere
else.
__dict__ slot for string subtypes.
subtype_dealloc(): properly use _PyObject_GetDictPtr() to get the
(potentially negative) dict offset. Don't copy things into local
variables that are used only once.
type_new(): properly calculate a negative dict offset when tp_itemsize
is nonzero. The __dict__ attribute, if present, is now a calculated
attribute rather than a structure member.
tupledealloc(): only feed the free list when the type is really a
tuple, not a subtype. Otherwise, use PyObject_GC_Del().
_PyTuple_Resize(): disallow using this for tuple subtypes.
don't use getattr, but only look in the dict of the type and base types.
This prevents picking up all sorts of weird stuff, including things defined
by the metaclass when the object is a class (type).
For this purpose, a helper function lookup_method() was added. One or two
other places also use this.
PyString_FromFormatV(): In the final resize at the end, we can use
PyString_AS_STRING() since we know the object is a string and can
avoid the typechecking.
PyString_FromFormat(): GS sez: "For safety/propriety, you should call
va_end() on the vargs variable."
at least in the first two characters. %p is ill-defined, and people will
forever commit bad tests otherwise ("bad" in the sense that they fall
over (at least on Windows) for lack of a leading '0x'; 5 of the 7 tests
in test_repr.py failed on Windows for that reason this time around).
an inappropriate first argument. Now that there are more ways for
this to fail, make sure to report the name of the class of the
expected instance and of the actual instance.
PyErr_Format() these new C API methods can be used instead of
sprintf()'s into hardcoded char* buffers. This allows us to fix
many situation where long package, module, or class names get
truncated in reprs.
PyString_FromFormat() is the varargs variety.
PyString_FromFormatV() is the va_list variety
Original PyErr_Format() code was modified to allow %p and %ld
expansions.
Many reprs were converted to this, checkins coming soo. Not
changed: complex_repr(), float_repr(), float_print(), float_str(),
int_repr(). There may be other candidates not yet converted.
Closes patch #454743.
super(type) -> unbound super object
super(type, obj) -> bound super object; requires isinstance(obj, type)
Typical use to call a cooperative superclass method:
class C(B):
def meth(self, arg):
super(C, self).meth(arg);
the delete function. (Question: should the attribute name also be
recorded in the getset object? That makes the protocol more work, but
may give us better error messages.)
cases.
powu: Deleted.
This started with a nonsensical error msg:
>>> x = -1.
>>> import sys
>>> x**(-sys.maxint-1L)
Traceback (most recent call last):
File "<stdin>", line 1, in ?
ValueError: negative number cannot be raised to a fractional power
>>>
The special-casing in float_pow was simply wrong in this case (there's
not even anything peculiar about these inputs), and I don't see any point
to it in *any* case: a decent libm pow should have worst-case error under
1 ULP, so in particular should deliver the exact result whenever the exact
result is representable (else its error is at least 1 ULP). Thus our
special fiddling for integral values "shouldn't" buy anything in accuracy,
and, to the contrary, repeated multiplication is less accurate than a
decent pow when the true result isn't exactly representable. So just
letting pow() do its job here (we may not be able to trust libm x-platform
in exceptional cases, but these are normal cases).
interpretation of negative indices, since neither the sq_*item slots
nor the slot_ wrappers do this. (Slices are a different story, there
the size wrapping is done too early.)
Classes that don't use __slots__ have a __weakref__ member added in
the same way as __dict__ is added (i.e. only if the base didn't
already have one). Classes using __slots__ can enable weak
referenceability by adding '__weakref__' to the __slots__ list.
Renamed the __weaklistoffset__ class member to __weakrefoffset__ --
it's not always a list, it seems. (Is tp_weaklistoffset a historical
misnomer, or do I misunderstand this?)
- Do not compile unicodeobject, unicodectype, and unicodedata if Unicode is disabled
- check for Py_USING_UNICODE in all places that use Unicode functions
- disables unicode literals, and the builtin functions
- add the types.StringTypes list
- remove Unicode literals from most tests.
the class dict). Anything but a nonnegative int in either place is
*ignored* (before, a non-Boolean was an error). The default is still
static -- in a comparative test, Jeremy's Tools/compiler package ran
twice as slow (compiling itself) using dynamic as the default. (The
static version, which requires a few tweaks to avoid modifying class
variables, runs at about the same speed as the classic version.)
slot_tp_descr_get(): this also needed fallback behavior.
slot_tp_getattro(): remove a debug fprintf() call.
the metatype passed in as an argument. This prevents infinite
recursion when a metatype written in Python calls type.__new__() as a
"super" call.
Also tweaked some comments.
calculate it on the fly. This way even modules with long package
names get an accurate repr instead of a truncated one. The extra
malloc/free cost shouldn't be a problem in a repr function.
Closes SF bug #437984
- type_module(), type_name(): if tp_name contains one or more period,
the part before the last period is __module__, the part after that
is __name__. Otherwise, for non-heap types, __module__ is
"__builtin__". For heap types, __module__ is looked up in
tp_defined.
- type_new(): heap types have their __module__ set from
globals().__name__; a pre-existing __module__ in their dict is not
overridden. This is not inherited.
- type_repr(): if __module__ exists and is not "__builtin__", it is
included in the string representation (just as it already is for
classes). For example <type '__main__.C'>.
- descrobject.c:descr_check(): only believe None means the same as
NULL if the type given is None's type.
- typeobject.c:wrap_descr_get(): don't "conventiently" default an
absent type to the type of the object argument. Let the called
function figure it out.
types -- currently Type, List, None and NotImplemented. To be called
from Py_Initialize() instead of accumulating calls there.
Also rename type(None) to NoneType and type(NotImplemented) to
NotImplementedType -- naming the type identical to the object was
confusing.
returns that. (This fix is also by MvL; checkin it in because I want
to make more changes here. I'm still not 100% satisfied -- see
comments attached to the patch.)
operators for which a default implementation exist now work, both in
dynamic classes and in static classes, overridden or not. This
affects __repr__, __str__, __hash__, __contains__, __nonzero__,
__cmp__, and the rich comparisons (__lt__ etc.). For dynamic
classes, this meant copying a lot of code from classobject! (XXX
There are still some holes, because the comparison code in object.c
uses PyInstance_Check(), meaning new-style classes don't get the
same dispensation. This needs more thinking.)
- Add object.__hash__, object.__repr__, object.__str__. The __str__
dispatcher now calls the __repr__ dispatcher, as it should.
- For static classes, the tp_compare, tp_richcompare and tp_hash slots
are now inherited together, or not at all. (XXX I fear there are
still some situations where you can inherit __hash__ when you
shouldn't, but mostly it's OK now, and I think there's no way we can
get that 100% right.)
setting and deleting a function's __dict__ attribute. Deleting
it, or setting it to a non-dictionary result in a TypeError. Note
that getting it the first time magically initializes it to an
empty dict so that func.__dict__ will always appear to be a
dictionary (never None).
Closes SF bug #446645.
The descr changes moved the dispatch for calling objects from
call_object() in ceval.c to PyObject_Call() in abstract.c.
call_object() and the many functions it used in ceval.c were no longer
used, but were not removed.
Rename meth_call() as PyCFunction_Call() so that it can be called by
the CALL_FUNCTION opcode in ceval.c.
Also, fix error message that referred to PyEval_EvalCodeEx() by its
old name eval_code2(). (I'll probably refer to it by its old name,
too.)
XXX There are still some loose ends: repr(), str(), hash() and
comparisons don't inherit a default implementation from object. This
must be resolved similarly to the way it's resolved for classic
instances.
XXX This is not sufficient: if a dynamic class has no __repr__ method
(for instance), but later one is added, that doesn't add a tp_repr
slot, so repr() doesn't call the __repr__ method. To make this work,
I'll have to add default implementations of several slots to 'object'.
XXX Also, dynamic types currently only inherit slots from their
dominant base.
problem). inherit_slots() is split in two parts: inherit_special()
which inherits the flags and a few very special members from the
dominant base; inherit_slots() which inherits only regular slots,
and is now called for each base in the MRO in turn. These are now
both void functions since they don't have error returns.
- Added object.__setitem__() back -- for the same reason as
object.__new__(): a subclass of object should be able to call
object.__new__().
- add_wrappers() was moved around to be closer to where it is used (it
was defined together with add_methods() etc., but has nothing to do
with these).
Removed all instances of Py_UCS2 from the codebase, and so also (I hope)
the last remaining reliance on the platform having an integral type
with exactly 16 bits.
PyUnicode_DecodeUTF16() and PyUnicode_EncodeUTF16() now read and write
one byte at a time.
bit. For one, this class:
class C(object):
def __new__(myclass, ...): ...
would have no way to call the __new__ method of its base class, and
the workaround (to create an intermediate base class whose __new__ you
can call) is ugly.
So, I've come up with a better solution that restores object.__new__,
but still solves the original problem, which is that built-in and
extension types shouldn't inherit object.__new__. The solution is
simple: only "heap types" inherit tp_new. Simpler, less code,
perfect!
Previously, f.read() and f.readlines() checked for
errors on their file object and possibly raised an
IOError, but f.readline() didn't. This patch makes
f.readline() behave like the others.
Note that I've added a call to clearerr() since the other calls to
ferror() include that too.
I have no way to test this code. :-)
division. The basic binary operators now all correctly call the
__rxxx__ variant when they should.
In type_new(), I now make the new type a new-style number unless it
inherits from an old-style number that has numeric methods.
By way of cosmetics, I've changed the signatures of the SLOT<i> macros
to take actual function names and operator names as strings, rather
than rely on C preprocessor symbol manipulations. This makes the
calls slightly more verbose, but greatly helps simple searches through
the file: you can now find out where "__radd__" is used or where the
function slot_nb_power() is defined and where it is used.
This introduces:
- A new operator // that means floor division (the kind of division
where 1/2 is 0).
- The "future division" statement ("from __future__ import division)
which changes the meaning of the / operator to implement "true
division" (where 1/2 is 0.5).
- New overloadable operators __truediv__ and __floordiv__.
- New slots in the PyNumberMethods struct for true and floor division,
new abstract APIs for them, new opcodes, and so on.
I emphasize that without the future division statement, the semantics
of / will remain unchanged until Python 3.0.
Not yet implemented are warnings (default off) when / is used with int
or long arguments.
This has been on display since 7/31 as SF patch #443474.
Flames to /dev/null.
- Add an explicit call to PyType_Ready(&PyList_Type) to pythonrun.c
(just for the heck of it, really -- we should either explicitly
ready all types, or none).
- Add comment blocks explaining add_operators() and override_slots().
(This file could use some more explaining, but this is all I had
breath for today. :)
- Renamed the argument 'base' of add_wrappers() to 'wraps' because
it's not a base class (which is what the 'base' identifier is used
for elsewhere).
Small nits:
- Fix add_tp_new_wrapper() to avoid overwriting an existing __new__
descriptor in tp_defined.
- In add_operators(), check the return value of add_tp_new_wrapper().
Functional change:
- Remove the tp_new functionality from PyBaseObject_Type; this means
you can no longer instantiate the 'object' type. It's only useful
as a base class.
- To make up for the above loss, add tp_new to dynamic types. This
has to be done in a hackish way (after override_slots() has been
called, with an explicit call to add_tp_new_wrapper() at the very
end) because otherwise I ran into recursive calls of slot_tp_new().
Sigh.
And remove all the extern decls in the middle of .c files.
Apparently, it was excluded from the header file because it is
intended for internal use by the interpreter. It's still intended for
internal use and documented as such in the header file.
#caused warnings with the VMS C compiler. (SF bug #442998, in part.)
On a narrow system the current code should never be executed since ch
will always be < 0x10000.
Marc-Andre: you may end up fixing this a different way, since I
believe you have plans to generate \U for surrogate pairs. I'll leave
that to you.
particular, str(long) and repr(long) use base 10, and that gets a factor
of 4 speedup). Another factor of 2 can be gotten by refactoring divrem1 to
support in-place division, but that started getting messy so I'm leaving
that out.
raising an error. This was one of the two issues that the VPython
folks were particularly problematic for their students. (The other
one was integer division...) This implements (my) SF patch #440487.
raising an error. This was one of the two issues that the VPython
folks were particularly problematic for their students. (The other
one was integer division...) This implements (my) SF patch #440487.
Implement sys.maxunicode.
Explicitly wrap around upper/lower computations for wide Py_UNICODE.
When decoding large characters with UTF-8, represent expected test
results using the \U notation.
Add configure option --enable-unicode.
Add config.h macros Py_USING_UNICODE, PY_UNICODE_TYPE, Py_UNICODE_SIZE,
SIZEOF_WCHAR_T.
Define Py_UCS2.
Encode and decode large UTF-8 characters into single Py_UNICODE values
for wide Unicode types; likewise for UTF-16.
Remove test whether sizeof Py_UNICODE is two.
"mapping" object, specifically one that supports PyMapping_Keys() and
PyObject_GetItem(). This allows you to say e.g. {}.update(UserDict())
We keep the special case for concrete dict objects, although that
seems moderately questionable. OTOH, the code exists and works, so
why change that?
.update()'s docstring already claims that D.update(E) implies calling
E.keys() so it's appropriate not to transform AttributeErrors in
PyMapping_Keys() to TypeErrors.
Patch eyeballed by Tim.
unicodeobject.h, which forces sizeof(Py_UNICODE) == sizeof(Py_UCS4).
(this may be good enough for platforms that doesn't have a 16-bit
type. the UTF-16 codecs don't work, though)
the next free valuestack slot, not to the base (in America, stacks push
and pop at the top -- they mutate at the bottom in Australia <winK>).
eval_frame(): assert that f_stacktop isn't NULL upon entry.
frame_delloc(): avoid ordered pointer comparisons involving f_stacktop
when f_stacktop is NULL.
i_divmod: New and simpler algorithm. Old one returned gibberish on most
boxes when the numerator was -sys.maxint-1. Oddly enough, it worked in the
release (not debug) build on Windows, because the compiler optimized away
some tricky sign manipulations that were incorrect in this case.
Makes you wonder <wink> ...
Bugfix candidate.
Gave Python linear-time repr() implementations for dicts, lists, strings.
This means, e.g., that repr(range(50000)) is no longer 50x slower than
pprint.pprint() in 2.2 <wink>.
I don't consider this a bugfix candidate, as it's a performance boost.
Added _PyString_Join() to the internal string API. If we want that in the
public API, fine, but then it requires runtime error checks instead of
asserts.
is allocated than needed (used to allocate 80 bytes of digit space no
matter how small the long input). This also runs faster, at least on 32-
bit boxes.
Replaced PyLong_{As,From}{Unsigned,}LongLong guts with calls
to _PyLong_{As,From}ByteArray.
_testcapimodule.c:
Added strong tests of PyLong_{As,From}{Unsigned,}LongLong.
Fixes SF bug #432552 PyLong_AsLongLong() problems.
Possible bugfix candidate, but the fix relies on code added to longobject
to support the new q/Q structmodule format codes.
This completes the q/Q project.
longobject.c _PyLong_AsByteArray: The original code had a gross bug:
the most-significant Python digit doesn't necessarily have SHIFT
significant bits, and you really need to count how many copies of the sign
bit it has else spurious overflow errors result.
test_struct.py: This now does exhaustive std q/Q testing at, and on both
sides of, all relevant power-of-2 boundaries, both positive and negative.
NEWS: Added brief dict news while I was at it.
_PyLong_FromByteArray
_PyLong_AsByteArray
Untested and probably buggy -- they compile OK, but nothing calls them
yet. Will soon be called by the struct module, to implement x-platform
'q' and 'Q'.
If other people have uses for them, we could move them into the public API.
See longobject.h for usage details.
case of objects with equal types which support tp_compare. Give
type objects a tp_compare function.
Also add c<0 tests before a few PyErr_Occurred tests.
frequently used, and in particular this allows to drop the last
remaining obvious time-waster in the crucial lookdict() and
lookdict_string() functions. Other changes consist mostly of changing
"i < ma_size" to "i <= ma_mask" everywhere.
be possible to provoke unbounded recursion now, but leaving that to someone
else to provoke and repair.
Bugfix candidate -- although this is getting harder to backstitch, and the
cases it's protecting against are mondo contrived.
code, less memory. Tests have uncovered no drawbacks. Christian and
Vladimir are the other two people who have burned many brain cells on the
dict code in recent years, and they like the approach too, so I'm checking
it in without further ado.
instead of multiplication to generate the probe sequence. The idea is
recorded in Python-Dev for Dec 2000, but that version is prone to rare
infinite loops.
The value is in getting *all* the bits of the hash code to participate;
and, e.g., this speeds up querying every key in a dict with keys
[i << 16 for i in range(20000)] by a factor of 500. Should be equally
valuable in any bad case where the high-order hash bits were getting
ignored.
Also wrote up some of the motivations behind Python's ever-more-subtle
hash table strategy.
resizing.
Accurate timings are impossible on my Win98SE box, but this is obviously
faster even on this box for reasonable list.append() cases. I give
credit for this not to the resizing strategy but to getting rid of integer
multiplication and divsion (in favor of shifting) when computing the
rounded-up size.
For unreasonable list.append() cases, Win98SE now displays linear behavior
for one-at-time appends up to a list with about 35 million elements. Then
it dies with a MemoryError, due to fatally fragmented *address space*
(there's plenty of VM available, but by this point Win9X has broken user
space into many distinct heaps none of which has enough contiguous space
left to resize the list, and for whatever reason Win9x isn't coalescing
the dead heaps). Before the patch it got a MemoryError for the same
reason, but once the list reached about 2 million elements.
Haven't yet tried on Win2K but have high hopes extreme list.append()
will be much better behaved now (NT & Win2K didn't fragment address space,
but suffered obvious quadratic-time behavior before as lists got large).
For other systems I'm relying on common sense: replacing integer * and /
by << and >> can't plausibly hurt, the number of function calls hasn't
changed, and the total operation count for reasonably small lists is about
the same (while the operations are cheaper now).
dictresize() was too aggressive about never ever resizing small dicts.
If a small dict is entirely full, it needs to rebuild it despite that
it won't actually resize it, in order to purge old dummy entries thus
creating at least one virgin slot (lookdict assumes at least one such
exists).
Also took the opportunity to add some high-level comments to dictresize.
The idea is Marc-Andre Lemburg's, the implementation is Tim's.
Add a new ma_smalltable member to dictobjects, an embedded vector of
MINSIZE (8) dictentry structs. Short course is that this lets us avoid
additional malloc(s) for dicts with no more than 5 entries.
The changes are widespread but mostly small.
Long course: WRT speed, all scalar operations (getitem, setitem, delitem)
on non-empty dicts benefit from no longer needing NULL-pointer checks
(ma_table is never NULL anymore). Bulk operations (copy, update, resize,
clearing slots during dealloc) benefit in some cases from now looping
on the ma_fill count rather than on ma_size, but that was an unexpected
benefit: the original reason to loop on ma_fill was to let bulk
operations on empty dicts end quickly (since the NULL-pointer checks
went away, empty dicts aren't special-cased any more).
Special considerations:
For dicts that remain empty, this change is a lose on two counts:
the dict object contains 8 new dictentry slots now that weren't
needed before, and dict object creation also spends time memset'ing
these doomed-to-be-unsused slots to NULLs.
For dicts with one or two entries that never get larger than 2, it's
a mix: a malloc()/free() pair is no longer needed, and the 2-entry case
gets to use 8 slots (instead of 4) thus decreasing the chance of
collision. Against that, dict object creation spends time memset'ing
4 slots that aren't strictly needed in this case.
For dicts with 3 through 5 entries that never get larger than 5, it's a
pure win: the dict is created with all the space they need, and they
never need to resize. Before they suffered two malloc()/free() calls,
plus 1 dict resize, to get enough space. In addition, the 8-slot
table they ended with consumed more memory overall, because of the
hidden overhead due to the additional malloc.
For dicts with 6 or more entries, the ma_smalltable member is wasted
space, but then these are large(r) dicts so 8 slots more or less doesn't
make much difference. They still benefit all the time from removing
ubiquitous dynamic null-pointer checks, and get a small benefit (but
relatively smaller the larger the dict) from not having to do two
mallocs, two frees, and a resize on the way *to* getting their sixth
entry.
All in all it appears a small but definite general win, with larger
benefits in specific cases. It's especially nice that it allowed to
get rid of several branches, gotos and labels, and overall made the
code smaller.
This should be faster.
This means:
(1) "for line in file:" won't work if the xreadlines module can't be
imported.
(2) The body of "for line in file:" shouldn't use the file directly;
the effects (e.g. of file.readline(), file.seek() or even
file.tell()) would be undefined because of the buffering that goes
on in the xreadlines module.
UTF-16 codec will now interpret and remove a *leading* BOM mark. Sub-
sequent BOM characters are no longer interpreted and removed.
UTF-16-LE and -BE pass through all BOM mark characters.
These changes should get the UTF-16 codec more in line with what
the Unicode FAQ recommends w/r to BOM marks.
Two exceedingly unlikely errors in dictresize():
1. The loop for finding the new size had an off-by-one error at the
end (could over-index the polys[] vector).
2. The polys[] vector ended with a 0, apparently intended as a sentinel
value but never used as such; i.e., it was never checked, so 0 could
have been used *as* a polynomial.
Neither bug could trigger unless a dict grew to 2**30 slots; since that
would consume at least 12GB of memory just to hold the dict pointers,
I'm betting it's not the cause of the bug Fred's tracking down <wink>.
in the comments for using two passes was bogus, as the only object that
can get decref'ed due to the copy is the dummy key, and decref'ing dummy
can't have side effects (for one thing, dummy is immortal! for another,
it's a string object, not a potentially dangerous user-defined object).
1. Omit the early-out EQ/NE "lengths different?" test. Was unable to find
any real code where it triggered, but it always costs. The same is not
true of list richcmps, where different-size lists appeared to get
compared about half the time.
2. Because tuples are immutable, there's no need to refetch the lengths of
both tuples from memory again on each loop trip.
BUG ALERT: The tuple (and list) richcmp algorithm is arguably wrong,
because it won't believe there's any difference unless Py_EQ returns false
for some corresponding elements:
>>> class C:
... def __lt__(x, y): return 1
... __eq__ = __lt__
...
>>> C() < C()
1
>>> (C(),) < (C(),)
0
>>>
That doesn't make sense -- provided you believe the defn. of C makes sense.
and introduces a new method .decode().
The major change is that strg.encode() will no longer try to convert
Unicode returns from the codec into a string, but instead pass along
the Unicode object as-is. The same is now true for all other codec
return types. The underlying C APIs were changed accordingly.
Note that even though this does have the potential of breaking
existing code, the chances are low since conversion from Unicode
previously took place using the default encoding which is normally
set to ASCII rendering this auto-conversion mechanism useless for
most Unicode encodings.
The good news is that you can now use .encode() and .decode() with
much greater ease and that the door was opened for better accessibility
of the builtin codecs.
As demonstration of the new feature, the patch includes a few new
codecs which allow string to string encoding and decoding (rot13,
hex, zip, uu, base64).
Written by Marc-Andre Lemburg. Copyright assigned to the PSF.
to reason that me_key is much more likely to match the key we're looking
for than to match dummy, and if the key is absent me_key is much more
likely to be NULL than dummy: most dicts don't even have a dummy entry.
Running instrumented dict code over the test suite and some apps confirmed
that matching dummy was 200-300x less frequent than matching key in
practice. So this reorders the tests to try the common case first.
It can lose if a large dict with many collisions is mostly deleted, not
resized, and then frequently searched, but that's hardly a case we
should be favoring.
The comment following used to say:
/* We use ~hash instead of hash, as degenerate hash functions, such
as for ints <sigh>, can have lots of leading zeros. It's not
really a performance risk, but better safe than sorry.
12-Dec-00 tim: so ~hash produces lots of leading ones instead --
what's the gain? */
That is, there was never a good reason for doing it. And to the contrary,
as explained on Python-Dev last December, it tended to make the *sum*
(i + incr) & mask (which is the first table index examined in case of
collison) the same "too often" across distinct hashes.
Changing to the simpler "i = hash & mask" reduced the number of string-dict
collisions (== # number of times we go around the lookup for-loop) from about
6 million to 5 million during a full run of the test suite (these are
approximate because the test suite does some random stuff from run to run).
The number of collisions in non-string dicts also decreased, but not as
dramatically.
Note that this may, for a given dict, change the order (wrt previous
releases) of entries exposed by .keys(), .values() and .items(). A number
of std tests suffered bogus failures as a result. For dicts keyed by
small ints, or (less so) by characters, the order is much more likely to be
in increasing order of key now; e.g.,
>>> d = {}
>>> for i in range(10):
... d[i] = i
...
>>> d
{0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
>>>
Unfortunately. people may latch on to that in small examples and draw a
bogus conclusion.
test_support.py
Moved test_extcall's sortdict() into test_support, made it stronger,
and imported sortdict into other std tests that needed it.
test_unicode.py
Excluced cp875 from the "roundtrip over range(128)" test, because
cp875 doesn't have a well-defined inverse for unicode("?", "cp875").
See Python-Dev for excruciating details.
Cookie.py
Chaged various output functions to sort dicts before building
strings from them.
test_extcall
Fiddled the expected-result file. This remains sensitive to native
dict ordering, because, e.g., if there are multiple errors in a
keyword-arg dict (and test_extcall sets up many cases like that), the
specific error Python complains about first depends on native dict
ordering.
Allow module getattr and setattr to exploit string interning, via the
previously null module object tp_getattro and tp_setattro slots. Yields
a very nice speedup for things like random.random and os.path etc.
For rich comparisons, use instance_getattr2() when possible to avoid
the expense of setting an AttributeError. Also intern the name_op[]
table and use the interned strings rather than creating a new string
and interning it each time through.
doesn't know how to do LE, LT, GE, GT. dict_richcompare can't do the
latter any faster than dict_compare can. More importantly, for
cmp(dict1, dict2), Python *first* tries rich compares with EQ, LT, and
GT one at a time, even if the tp_compare slot is defined, and
dict_richcompare called dict_compare for the latter two because
it couldn't do them itself. The result was a lot of wasted calls to
dict_compare. Now dict_richcompare gives up at once the times Python
calls it with LT and GT from try_rich_to_3way_compare(), and dict_compare
is called only once (when Python gets around to trying the tp_compare
slot).
Continued mystery: despite that this cut the number of calls to
dict_compare approximately in half in test_mutants.py, the latter still
runs amazingly slowly. Running under the debugger doesn't show excessive
activity in the dict comparison code anymore, so I'm guessing the culprit
is somewhere else -- but where? Perhaps in the element (key/value)
comparison code? We clearly spend a lot of time figuring out how to
compare things.
Fixed a half dozen ways in which general dict comparison could crash
Python (even cause Win98SE to reboot) in the presence of kay and/or
value comparison routines that mutate the dict during dict comparison.
Bugfix candidate.
interned when created, so the cached versions generally aren't ever
interned. With the patch, the
Py_INCREF(t);
*p = t;
Py_DECREF(s);
return;
indirection block in PyString_InternInPlace() is never executed during a
full run of the test suite, but was executed very many times before. So
I'm trading more work when creating one-character strings for doing less
work later. Note that the "more work" here can happen at most 256 times
per program run, so it's trivial. The same reasoning accounts for the
patch's simplification of string_item (the new version can call
PyString_FromStringAndSize() no more than 256 times per run, so there's
no point to inlining that stuff -- if we were serious about saving time
here, we'd pre-initialize the characters vector so that no runtime testing
at all was needed!).
Store floats and doubles to full precision in marshal.
Test that floats read from .pyc/.pyo closely match those read from .py.
Declare PyFloat_AsString() in floatobject header file.
Add new PyFloat_AsReprString() API function.
Document the functions declared in floatobject.h.
d1 == d2 and d1 != d2 now work even if the keys and values in d1 and d2
don't support comparisons other than ==, and testing dicts for equality
is faster now (especially when inequality obtains).
safely together and don't duplicate logic (the common logic was factored
out into new private API function _PySequence_IterContains()).
Visible change:
some_complex_number in some_instance
no longer blows up if some_instance has __getitem__ but neither
__contains__ nor __iter__. test_iter changed to ensure that remains true.
NEEDS DOC CHANGES
A few more AttributeErrors turned into TypeErrors, but in test_contains
this time.
The full story for instance objects is pretty much unexplainable, because
instance_contains() tries its own flavor of iteration-based containment
testing first, and PySequence_Contains doesn't get a chance at it unless
instance_contains() blows up. A consequence is that
some_complex_number in some_instance
dies with a TypeError unless some_instance.__class__ defines __iter__ but
does not define __getitem__.
to string.join(), so that when the latter figures out in midstream that
it really needs unicode.join() instead, unicode.join() can actually get
all the sequence elements (i.e., there's no guarantee that the sequence
passed to string.join() can be iterated over *again* by unicode.join(),
so string.join() must not pass on the original sequence object anymore).
NEEDS DOC CHANGES.
This one surprised me! While I expected tuple() to be a no-brainer, turns
out it's actually dripping with consequences:
1. It will *allow* the popular PySequence_Fast() to work with any iterable
object (code for that not yet checked in, but should be trivial).
2. It caused two std tests to fail. This because some places used
PyTuple_Sequence() (the C spelling of tuple()) as an indirect way to test
whether something *is* a sequence. But tuple() code only looked for the
existence of sq->item to determine that, and e.g. an instance passed
that test whether or not it supported the other operations tuple()
needed (e.g., __len__). So some things the tests *expected* to fail
with an AttributeError now fail with a TypeError instead. This looks
like an improvement to me; e.g., test_coercion used to produce 559
TypeErrors and 2 AttributeErrors, and now they're all TypeErrors. The
error details are more informative too, because the places calling this
were *looking* for TypeErrors in order to replace the generic tuple()
"not a sequence" msg with their own more specific text, and
AttributeErrors snuck by that.
the code necessary to accomplish this is simpler and faster if confined to
the object implementations, so we only do this there.
This causes no behaviorial changes beyond a (very slight) speedup.
need to be specified in the type structures independently. The flag
exists only for binary compatibility.
This is a "source cleanliness" issue and introduces no behavioral changes.
dictionary size was comparing ma_size, the hash table size, which is
always a power of two, rather than ma_used, wich changes on each
insertion or deletion. Fixed this.
to no longer insist that len(seq) be defined.
NEEDS DOC CHANGES.
This is meant to be a model for how other functions of this ilk (max,
filter, etc) can be generalized similarly. Feel encouraged to grab your
favorite and convert it!
Note some cute consequences:
list(file) == file.readlines() == list(file.xreadlines())
list(dict) == dict.keys()
list(dict.iteritems()) = dict.items()
list(xrange(i, j, k)) == range(i, j, k)
object's type didn't define tp_print, there were still cases where the
full "print uses str() which falls back to repr()" semantics weren't
honored. This resulted in
>>> print None
<None object at 0x80bd674>
>>> print type(u'')
<type object at 0x80c0a80>
Fixed this by always using the appropriate PyObject_Repr() or
PyObject_Str() call, rather than trying to emulate what they would do.
Also simplified PyObject_Str() to always fall back on PyObject_Repr()
when tp_str is not defined (rather than making an extra check for
instances with a __str__ method). And got rid of the special case for
strings.
Patch #419651: Metrowerks on Mac adds 0x itself
C std says %#x and %#X conversion of 0 do not add the 0x/0X base marker.
Metrowerks apparently does. Mark Favas reported the same bug under a
Compaq compiler on Tru64 Unix, but no other libc broken in this respect
is known (known to be OK under MSVC and gcc).
So just try the damn thing at runtime and see what the platform does.
Note that we've always had bugs here, but never knew it before because
a relevant test case didn't exist before 2.1.
Fix a very old flaw in PyObject_Print(). Amazing! When an object
type defines tp_str but not tp_repr, 'print x' to a real file
object would not call the tp_str slot but rather print a default style
representation: <foo object at 0x....>. This even though 'print x' to
a file-like-object would correctly call the tp_str slot.
patch for sharing single character Unicode objects.
Martin's patch had to be reworked in a number of ways to take Unicode
resizing into consideration as well. Here's what the updated patch
implements:
* Single character Unicode strings in the Latin-1 range are shared
(not only ASCII chars as in Martin's original patch).
* The ASCII and Latin-1 codecs make use of this optimization,
providing a noticable speedup for single character strings. Most
Unicode methods can use the optimization as well (by virtue
of using PyUnicode_FromUnicode()).
* Some code cleanup was done (replacing memcpy with Py_UNICODE_COPY)
* The PyUnicode_Resize() can now also handle the case of resizing
unicode_empty which previously resulted in an error.
* Modified the internal API _PyUnicode_Resize() and
the public PyUnicode_Resize() API to handle references to
shared objects correctly. The _PyUnicode_Resize() signature
changed due to this.
* Callers of PyUnicode_FromUnicode() may now only modify the Unicode
object contents of the returned object in case they called the API
with NULL as content template.
Note that even though this patch passes the regression tests, there
may still be subtle bugs in the sharing code.
sees it (test_iter.py is unchanged).
- Added a tp_iternext slot, which calls the iterator's next() method;
this is much faster for built-in iterators over built-in types
such as lists and dicts, speeding up pybench's ForLoop with about
25% compared to Python 2.1. (Now there's a good argument for
iterators. ;-)
- Renamed the built-in sequence iterator SeqIter, affecting the C API
functions for it. (This frees up the PyIter prefix for generic
iterator operations.)
- Added PyIter_Check(obj), which checks that obj's type has a
tp_iternext slot and that the proper feature flag is set.
- Added PyIter_Next(obj) which calls the tp_iternext slot. It has a
somewhat complex return condition due to the need for speed: when it
returns NULL, it may not have set an exception condition, meaning
the iterator is exhausted; when the exception StopIteration is set
(or a derived exception class), it means the same thing; any other
exception means some other error occurred.
new slot tp_iter in type object, plus new flag Py_TPFLAGS_HAVE_ITER
new C API PyObject_GetIter(), calls tp_iter
new builtin iter(), with two forms: iter(obj), and iter(function, sentinel)
new internal object types iterobject and calliterobject
new exception StopIteration
new opcodes for "for" loops, GET_ITER and FOR_ITER (also supported by dis.py)
new magic number for .pyc files
new special method for instances: __iter__() returns an iterator
iteration over dictionaries: "for x in dict" iterates over the keys
iteration over files: "for x in file" iterates over lines
TODO:
documentation
test suite
decide whether to use a different way to spell iter(function, sentinal)
decide whether "for key in dict" is a good idea
use iterators in map/filter/reduce, min/max, and elsewhere (in/not in?)
speed tuning (make next() a slot tp_next???)
I know some people don't like this -- if it's really controversial,
I'll take it out again. (If it's only Alex Martelli who doesn't like
it, that doesn't count as "real controversial" though. :-)
That's why this is a separate checkin from the iterators stuff I'm
about to check in next.
PyTuple_New() could *conceivably* clear the dict, so move the test for
an empty dict after the tuple allocation. It means that we waste time
allocating and deallocating a 2-tuple when the dict is empty, but who
cares. It also means that when the dict is empty *and* there's no
memory to allocate a 2-tuple, we raise MemoryError, not KeyError --
but that may actually a good idea: if there's no room for a lousy
2-tuple, what are the chances that there's room for a KeyError
instance?
and reported to python-dev: because we were calling dict_resize() in
PyDict_Next(), and because GC's dict_traverse() uses PyDict_Next(),
and because PyTuple_New() can cause GC, and because dict_items() calls
PyTuple_New(), it was possible for dict_items() to have the dict
resized right under its nose.
The solution is convoluted, and touches several places: keys(),
values(), items(), popitem(), PyDict_Next(), and PyDict_SetItem().
There are two parts to it. First, we no longer call dict_resize() in
PyDict_Next(), which seems to solve the immediate problem. But then
PyDict_SetItem() must have a different policy about when *it* calls
dict_resize(), because we want to guarantee (e.g. for an algorithm
that Jeremy uses in the compiler) that you can loop over a dict using
PyDict_Next() and make changes to the dict as long as those changes
are only value replacements for existing keys using PyDict_SetItem().
This is done by resizing *after* the insertion instead of before, and
by remembering the size before we insert the item, and if the size is
still the same, we don't bother to even check if we might need to
resize. An additional detail is that if the dict starts out empty, we
must still resize it before the insertion.
That was the first part. :-)
The second part is to make keys(), values(), items(), and popitem()
safe against side effects on the dict caused by allocations, under the
assumption that if the GC can cause arbitrary Python code to run, it
can cause other threads to run, and it's not inconceivable that our
dict could be resized -- it would be insane to write code that relies
on this, but not all code is sane.
Now, I have this nagging feeling that the loops in lookdict probably
are blissfully assuming that doing a simple key comparison does not
change the dict's size. This is not necessarily true (the keys could
be class instances after all). But that's a battle for another day.
"%#x" % 0
blew up, at heart because C sprintf supplies a base marker if and only if
the value is not 0. I then fixed that, by tolerating C's inconsistency
when it does %#x, and taking away that *Python* produced 0x0 when
formatting 0L (the "long" flavor of 0) under %#x itself. But after talking
with Guido, we agreed it would be better to supply 0x for the short int
case too, despite that it's inconsistent with C, because C is inconsistent
with itself and with Python's hex(0) (plus, while "%#x" % 0 didn't work
before, "%#x" % 0L *did*, and returned "0x0"). Similarly for %#X conversion.
http://sourceforge.net/tracker/index.php?func=detail&aid=415514&group_id=5470&atid=105470
For short ints, Python defers to the platform C library to figure out what
%#x should do. The code asserted that the platform C returned a string
beginning with "0x". However, that's not true when-- and only when --the
*value* being formatted is 0. Changed the code to live with C's inconsistency
here. In the meantime, the problem does not arise if you format a long 0 (0L)
instead. However, that's because the code *we* wrote to do %#x conversions on
longs produces a leading "0x" regardless of value. That's probably wrong too:
we should drop leading "0x", for consistency with C, when (& only when) formatting
0L. So I changed the long formatting code to do that too.
must now initialize the extra field used by the weak-ref machinery to
NULL themselves, to avoid having to require PyObject_INIT() to check
if the type supports weak references and do it there. This causes less
work to be done for all objects (the type object does not need to be
consulted to check for the Py_TPFLAGS_HAVE_WEAKREFS bit).
frees. Note there doesn't seem to be any way to test LocalsToFast(),
because the instructions that trigger it are illegal in nested scopes
with free variables.
Fix allocation strategy for cells that are also formal parameters.
Instead of emitting LOAD_FAST / STORE_DEREF pairs for each parameter,
have the argument handling code in eval_code2() do the right thing.
A side-effect of this change is that cell variables that are also
arguments are listed at the front of co_cellvars in the order they
appear in the argument list.
hashable
This patch changes the behavior of slice objects in the following
manner:
- Slice objects are now comparable with other slice objects as though
they were logically tuples of (start,stop,step). The tuple is not
created in the comparison function, but the comparison behavior is
logically equivalent.
- Slice objects are not hashable. With the above change to being
comparable, slice objects now cannot be used as keys in dictionaries.
[I've edited the patch for style. Note that this fixes the problem
that dict[i:j] seemed to work but was meaningless. --GvR]
with free variables. Thanks to Martin v. Loewis for finding two of
the problems. This fixes SF buf 405583.
There is also a C API change: PyFrame_New() is reverting to its
pre-2.1 signature. The change introduced by nested scopes was a
mistake. XXX Is this okay between beta releases?
cell_clear(), the GC helper, must decref its reference to break
cycles.
frame_dealloc() must dealloc all cell vars and free vars in addition
to locals.
eval_code2() setup code must INCREF cells it copies out of the
closure.
The STORE_DEREF opcode implementation must DECREF the object it passes
to PyCell_Set().
May or may not be related to bug 407680 (obmalloc.c - looks like it's
corrupted). This repairs the illegal vrbl names, but leaves a pile of
illegal macro names (_THIS_xxx, _SYSTEM_xxx, _SET_HOOKS, _FETCH_HOOKS).
- In _portable_ftell(), try fgetpos() before ftello() and ftell64().
I ran into a situation on a 64-bit capable Linux where the C
library's ftello() and ftell64() returned negative numbers despite
fpos_t and off_t both being 64-bit types; fgetpos() did the right
thing.
- Define a new typedef, Py_off_t, which is either fpos_t or off_t,
depending on which one is 64 bits. This removes the need for a lot
of #ifdefs later on. (XXX Should this be moved to pyport.h? That
file currently seems oblivious to large fille support, so for now
I'll leave it here where it's needed.)