* unified the way intobject, longobject and mystrtoul handle
values around -sys.maxint-1.
* in general, trying to entierely avoid overflows in any computation
involving signed ints or longs is extremely involved. Fixed a few
simple cases where a compiler might be too clever (but that's all
guesswork).
* more overflow checks against bad data in marshal.c.
* 2.5 specific: fixed a number of places that were still confusing int
and Py_ssize_t. Some of them could potentially have caused
"real-world" breakage.
* list.pop(x): fixing overflow issues on x was messy. I just reverted
to PyArg_ParseTuple("n"), which does the right thing. (An obscure
test was trying to give a Decimal to list.pop()... doesn't make
sense any more IMHO)
* trying to write a few tests...
i_divmod(): As discussed on Python-Dev, changed the overflow
checking to live happily with recent gcc optimizations that
assume signed integer arithmetic never overflows.
This differs from the corresponding change on the 2.5 and 2.4
branches, using a less obscure approach, but one that /may/
tickle platform idiocies in their definitions of LONG_MIN.
The 2.4 + 2.5 change avoided introducing a dependence on
LONG_MIN, at the cost of substantially goofier code.
OverflowError while x*x succeeds and produces infinity; apparently
these inconsistencies cannot be fixed across ``all'' platforms and
there's a widespread feeling that therefore ``every'' platform
should keep suffering forevermore. Ah well.
inf) but didn't; added a test to test_float to verify that, and ignored the
ERANGE value for errno in the pow operation to make the new test pass (with
help from Marilyn Davis at the Google Python Sprint -- thanks!).
Replace UnicodeDecodeErrors raised during == and !=
compares of Unicode and other objects with a new
UnicodeWarning.
All other comparisons continue to raise exceptions.
Exceptions other than UnicodeDecodeErrors are also left
untouched.
were failing due to inappropriate clipping of numbers larger than 2**31
with new-style classes. (typeobject.c) In reviewing the code for classic
classes, there were 2 problems. Any negative value return could be returned.
Always return -1 if there was an error. Also make the checks similar
with the new-style classes. I believe this is correct for 32 and 64 bit
boxes, including Windows64.
Add a test of classic classes too.
I modified this patch some by fixing style, some error checking, and adding
XXX comments. This patch requires review and some changes are to be expected.
I'm checking in now to get the greatest possible review and establish a
baseline for moving forward. I don't want this to hold up release if possible.
This is the first batch of fixes that should be easy to verify based on context.
This fixes problem numbers: 220 (ast), 323-324 (symtable),
321-322 (structseq), 215 (array), 210 (hotshot), 182 (codecs), 209 (etree).
PyMapping_Size and PySequence_Size.
Because len() tries first sequence, then mapping size, it will always
raise a "non-mapping object has no len" error which is confusing.
be wrong.
The real change is to pass (bufsz - 1) to PyOS_ascii_formatd and 1
to strncat. strncat copies n+1 bytes from src (not dest).
Reported by Klocwork #58.
The problem of checking too eagerly for recursive calls is the
following: if a RuntimeError is caused by recursion, and if code needs
to normalize it immediately (as in the 2nd test), then
PyErr_NormalizeException() needs a call to the RuntimeError class to
instantiate it, and this hits the recursion limit again... causing
PyErr_NormalizeException() to never finish.
Moved this particular recursion check to slot_tp_call(), which is not
involved in instantiating built-in exceptions.
Backport candidate.
arguments in reverse, the interpreter would infinitely recourse trying to get a
coercion that worked. So put in a recursion check after a coercion is made and
the next call to attempt to use the coerced values.
Fixes bug #992017 and closes crashers/coerce.py .
the char buffer was requested. Now it actually returns the char buffer if
available or raises a TypeError if it isn't (as is raised for the other buffer
types if they are not present but requested).
Not a backport candidate since it does change semantics of the buffer object
(although it could be argued this is enough of a bug to bother backporting).
Give a consistent behavior for comparison and hashing of method objects
(both user- and built-in methods). Now compares the 'self' recursively.
The hash was already asking for the hash of 'self'.
to each allocated block. This was using 4 bytes for each such
piece of info regardless of platform. This didn't really matter
before (proof: no bug reports, and the debug-build obmalloc would
have assert-failed if it was ever asked for a chunk of memory
>= 2**32 bytes), since container indices were plain ints. But after
the Py_ssize_t changes, it's at least theoretically possible to
allocate a list or string whose guts exceed 2**32 bytes, and the
PYMALLOC_DEBUG routines would fail then (having only 4 bytes
to record the originally requested size).
Now we use sizeof(size_t) bytes for each of a PYMALLOC_DEBUG
build's extra debugging fields. This won't make any difference
on 32-bit boxes, but will add 16 bytes to each allocation in
a debug build on a 64-bit box.
he didn't know this), so merged in some changes I made during
review. Nothing material apart from changing a new `mask` local
from int to Py_ssize_t. Mostly this is repairing comments that
were made incorrect, and adding new comments. Also a few
minor code rewrites for clarity or helpful succinctness.
a new comment) suggests there are almost certainly large input
integers in all non-binary input bases for which one Python digit
too few is initally allocated to hold the final result. Instead
of assert-failing when that happens, allocate more space. Alas,
I estimate it would take a few days to find a specific such case,
so this isn't backed up by a new test (not to mention that such
a case may take hours to run, since conversion time is quadratic
in the number of digits, and preliminary attempts suggested that
the smallest such inputs contain at least a million digits).
Make some functions that should have been static static.
Fix a bunch of refleaks by fixing the definition of
MiddlingExtendsException.
Remove all the __new__ implementations apart from
BaseException_new. Rewrite most code that needs it to cope with
NULL fields (such code could get excercised anyway, the
__new__-removal just makes it more likely). This involved
editing the code for WindowsError, which I can't test.
This fixes all the refleaks in at least the start of a regrtest
-R :: run.
Fix a number of problems with the need for speed code:
One is doing this sort of thing:
Py_DECREF(self->field);
self->field = newval;
Py_INCREF(self->field);
without being very sure that self->field doesn't start with a
value that has a __del__, because that almost certainly can lead
to segfaults.
As self->args is constrained to be an exact tuple we may as well
exploit this fact consistently. This leads to quite a lot of
simplification (and, hey, probably better performance).
Add some error checking in places lacking it.
Fix some rather strange indentation in the Unicode code.
Delete some trailing whitespace.
More to come, I haven't fixed all the reference leaks yet...
(If compiled without FAST search support, changed the pre-memcmp test
to check the last character as well as the first. This gave a 25%
speedup for my test case.)
Rewrote the split algorithms so they stop when maxsplit gets to 0.
Previously they did a string match first then checked if the maxsplit
was reached. The new way prevents a needless string search.
results list.
Originally it allocated 0 items and used the list growth during append. Now
it preallocates 12 items so the first few appends don't need list reallocs.
("Here are some words ."*2).split(None, 1) is 7% faster
("Here are some words ."*2).split() is is 15% faster
(Your milage may vary, see dealership for details.)
File parsing like this
for line in f:
count += len(line.split())
is also about 15% faster. There is a slowdown of about 3% for large
strings because of the additional overhead of checking if the append is
to a preallocated region of the list or not. This will be the rare case.
It could be improved with special case code but we decided it was not
useful enough.
There is a cost of 12*sizeof(PyObject *) bytes per list. For the normal
case of file parsing this is not a problem because of the lists have
a short lifetime. We have not come up with cases where this is a problem
in real life.
I chose 12 because human text averages about 11 words per line in books,
one of my data sets averages 6.2 words with a final peak at 11 words per
line, and I work with a tab delimited data set with 8 tabs per line (or
9 words per line). 12 encompasses all of these.
Also changed the last rstrip code to append then reverse, rather than
doing insert(0). The strip() and rstrip() times are now comparable.
this is on par with a corresponding find, and nearly twice as fast
as split(sep, 1)
full tests, a unicode version, and documentation will follow to-
morrow.
The SIGCHECK macro defined here has always been bizarre, but
it apparently causes compiler warnings on "Sun Studio 11".
I believe the warnings are bogus, but it doesn't hurt to make
the macro definition saner.
Bugfix candidate (but I'm not going to bother).
made a copy of the string using PyString_FromStringAndSize(s, n) and modify
the copied string in-place. However, 1 (and 0) character strings are shared
from a cache. This cause "A".replace("A", "a") to change the cached version
of "A" -- used by everyone.
Now may the copy with NULL as the string and do the memcpy manually. I've
added regression tests to check if this happens in the future. Perhaps
there should be a PyString_Copy for this case?
both mystrtoul.c and longobject.c. Share the table instead. Also
cut its size by 64 entries (they had been used for an inscrutable
trick originally, but the code no longer tries to use that trick).
results in a 2.5x speedup on the stringbench count tests, and a 20x (!)
speedup on the stringbench search/find/contains test, compared to 2.5a2.
for more on the algorithm, see:
http://effbot.org/zone/stringlib.htm
if you get weird results, you can disable the new algoritm by undefining
USE_FAST in Objects/unicodeobject.c.
enjoy /F
speed up splitlines and strip with charsets; etc. rsplit is now as
fast as split in all our tests (reverse takes no time at all), and
splitlines() is nearly as fast as a plain split("\n") in our tests.
and we're not done yet... ;-)
Applied patch zombie-frames-2.diff from sf patch 876206 with updates for
Python 2.5 and also modified to retain the free_list to avoid the 67%
slow-down in pybench recursion test. 5% speed up in function call pybench.
compiler warnings on Windows (signed vs unsigned mismatch
in comparisons). Cleaned that up by switching more locals
to Py_ssize_t. Simplified overflow checking (it can _be_
simpler because while these things are declared as
Py_ssize_t, then should in fact never be negative).
about "%u", "%lu" and "%zu" formats.
Since PyString_FromFormat and PyErr_Format have exactly the same rules
(both inherited from PyString_FromFormatV), it would be good if someone
with more LaTeX Fu changed one of them to just point to the other.
Their docs were way out of synch before this patch, and I just did a
mass copy+paste to repair that.
Not a backport candidate (this is a new feature).
longobject.c: also fix an ssize_t problem
<a> could have been NULL, so hoist the size calc to not use <a>.
_ssl.c: under fail: self is DECREF'd, but it would have been NULL.
_elementtree.c: delete self if there was an error.
_csv.c: I'm not sure if lineterminator could have been anything other than
a string. However, other string method calls are checked, so check this
one too.
discussion.
There are two places of documentation that still mention __context__:
Doc/lib/libstdtypes.tex -- I wasn't quite sure how to rewrite that without
spending a whole lot of time thinking about it; and whatsnew, which Andrew
usually likes to change himself.
__delitem__, __setslice__ and __delslice__ hooks. This caused test_weakref
and test_userlist to fail in the p3yk branch (where UserList, like all
classes, is new-style) on amd64 systems, with open-ended slices: the
sys.maxint value for empty-endpoint was transformed into -1.
zfill stringmethods, so they can create strings larger than 2Gb on 64bit
systems (even win64.) The unicode versions of these methods already did this
right.
slot_tp_del(), but while the latter had to cater to types
that don't participate in GC, we know that generators do.
That allows strengthing an assert().
using a custom, nearly-identical macro. This probably changes how some of
these functions are compiled, which may result in fractionally slower (or
faster) execution. Considering the nature of traversal, visiting much of the
address space in unpredictable patterns, I'd argue the code readability and
maintainability is well worth it ;P
- In functions where we already hold the same object in differently typed
pointers, use the correctly typed pointer instead of casting the other
pointer a second time.
objects before initializing it. It might be linked
already if there was a Py_Initialize/Py_Finalize
cycle earlier; not unlinking it would break the global
list.
Py_VISIT: cast the `op` argument to PyObject* when calling
`visit()`. Else the caller has to pay too much attention to
this silly detail (e.g., frame_traverse needs to traverse
`struct _frame *` and `PyCodeObject *` pointers too).
problems: first, PyGen_NeedsFinalizing() had an off-by-one bug that
prevented it from ever saying a generator didn't need finalizing, and
second, frame objects cleared themselves in a way that caused their
owning generator to think they were still executable, causing a double
deallocation of objects on the value stack if there was still a loop
on the block stack. This revision also removes some unnecessary
close() operations from test_generators that are now appropriately
handled by the cycle collector.
to avoid confusing situations like:
>>> int("")
ValueError: invalid literal for int():
>>> int("2\n\n2")
ValueError: invalid literal for int(): 2
2
Also report the base used, to avoid:
ValueError: invalid literal for int(): 2
They now report:
>>> int("")
ValueError: invalid literal for int() with base 10: ''
>>> int("2\n\n2")
ValueError: invalid literal for int() with base 10: '2\n\n2'
>>> int("2", 2)
ValueError: invalid literal for int() with base 2: '2'
(Reporting the base could be avoided when base is 10, which is the default,
but hrm.) Another effect of these changes is that the errormessage can be
longer; before, it was cut off at about 250 characters. Now, it can be up to
four times as long, as the unrepr'ed string is cut off at 200 characters,
instead.
No tests were added or changed, since testing for exact errormsgs is (pardon
the pun) somewhat errorprone, and I consider not testing the exact text
preferable. The actually changed code is tested frequent enough in the
test_builtin test as it is (120 runs for each of ints and longs.)
interpolate PY_FORMAT_SIZE_T for refcount display
instead of casting refcounts to long.
I understand that gcc on some boxes delivers
nuisance warnings about this, but if any new ones
appear because of this they'll get fixed by magic
when the others get fixed.