a new comment) suggests there are almost certainly large input
integers in all non-binary input bases for which one Python digit
too few is initially allocated to hold the final result. Instead
of assert-failing when that happens, allocate more space. Alas,
I estimate it would take a few days to find a specific such case,
so this isn't backed up by a new test (not to mention that such
a case may take hours to run, since conversion time is quadratic
in the number of digits, and preliminary attempts suggested that
the smallest such inputs contain at least a million digits).
Make some functions that should have been static static.
Fix a bunch of refleaks by fixing the definition of
MiddlingExtendsException.
Remove all the __new__ implementations apart from
BaseException_new. Rewrite most code that needs it to cope with
NULL fields (such code could get exercised anyway, the
__new__-removal just makes it more likely). This involved
editing the code for WindowsError, which I can't test.
This fixes all the refleaks in at least the start of a regrtest
-R :: run.
Fix a number of problems with the need for speed code:
One is doing this sort of thing:
    Py_DECREF(self->field);
    self->field = newval;
    Py_INCREF(self->field);
without being very sure that self->field doesn't start with a
value that has a __del__, because that almost certainly can lead
to segfaults.
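The safe ordering looks roughly like this (a sketch, not the literal
patch): keep the old value alive until the new one is installed.

    /* Sketch only: hold the old value so any __del__ triggered by the
       decref cannot see self->field pointing at freed memory. */
    PyObject *tmp = self->field;
    Py_INCREF(newval);
    self->field = newval;
    Py_XDECREF(tmp);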
As self->args is constrained to be an exact tuple we may as well
exploit this fact consistently. This leads to quite a lot of
simplification (and, hey, probably better performance).
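For instance (a hypothetical sketch, not code from the patch), the
exact-tuple guarantee lets the PyTuple_GET_SIZE/PyTuple_GET_ITEM macros
stand in for the generic, type-checking sequence API:

    /* Hypothetical helper, assuming args is known to be an exact tuple
       (as self->args is): the fast macros need no type checks. */
    static PyObject *
    first_arg(PyObject *args)
    {
        if (PyTuple_GET_SIZE(args) > 0) {
            PyObject *item = PyTuple_GET_ITEM(args, 0);  /* borrowed */
            Py_INCREF(item);
            return item;
        }
        Py_RETURN_NONE;
    }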
Add some error checking in places lacking it.
Fix some rather strange indentation in the Unicode code.
Delete some trailing whitespace.
More to come, I haven't fixed all the reference leaks yet...
(If compiled without FAST search support, changed the pre-memcmp test
to check the last character as well as the first. This gave a 25%
speedup for my test case.)
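The idea, sketched (not the exact code):

    /* Sketch: most non-matches differ in the first or last character,
       so test both cheaply before paying for the full memcmp. */
    if (s[i] == sub[0] &&
        s[i + m - 1] == sub[m - 1] &&
        memcmp(s + i, sub, m) == 0) {
        /* match at offset i */
    }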
Rewrote the split algorithms so they stop when maxsplit gets to 0.
Previously they did a string match first then checked if the maxsplit
was reached. The new way prevents a needless string search.
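Schematically (hypothetical helper names, not the actual loop):

    /* Check the split budget before searching; once maxsplit reaches 0
       there is no point in running another substring search. */
    while (maxsplit-- > 0) {
        pos = find_sub(s, i, len, sep);      /* hypothetical helper */
        if (pos < 0)
            break;
        append_slice(result, s, i, pos);     /* hypothetical helper */
        i = pos + sep_len;
    }
    append_slice(result, s, i, len);         /* the remaining tail */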
The results list is now preallocated. Originally it allocated 0 items
and relied on list growth during append; now it preallocates 12 items so
the first few appends don't need list reallocs.
("Here are some words ."*2).split(None, 1) is 7% faster
("Here are some words ."*2).split() is is 15% faster
(Your milage may vary, see dealership for details.)
File parsing like this
    for line in f:
        count += len(line.split())
is also about 15% faster. There is a slowdown of about 3% for large
strings because of the additional overhead of checking if the append is
to a preallocated region of the list or not. This will be the rare case.
It could be improved with special case code but we decided it was not
useful enough.
There is a cost of 12*sizeof(PyObject *) bytes per list. For the normal
case of file parsing this is not a problem because the lists have
a short lifetime. We have not come up with cases where this is a problem
in real life.
I chose 12 because human text averages about 11 words per line in books,
one of my data sets averages 6.2 words with a final peak at 11 words per
line, and I work with a tab delimited data set with 8 tabs per line (or
9 words per line). 12 encompasses all of these.
Also changed the rsplit code to append then reverse, rather than
doing insert(0). The split() and rsplit() times are now comparable.
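The shape of that change, sketched (not the literal diff): insert(0)
shifts every existing element, so building a k-item result that way is
O(k**2), while append plus a single reverse at the end is linear.

    /* Sketch: collect items in reverse order with append ... */
    if (PyList_Append(result, item) < 0)
        goto onError;
    /* ... and after the loop, reverse once instead of insert(0) per item. */
    if (PyList_Reverse(result) < 0)
        goto onError;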
this is on par with a corresponding find, and nearly twice as fast
as split(sep, 1)
full tests, a unicode version, and documentation will follow tomorrow.
The SIGCHECK macro defined here has always been bizarre, but
it apparently causes compiler warnings on "Sun Studio 11".
I believe the warnings are bogus, but it doesn't hurt to make
the macro definition saner.
Bugfix candidate (but I'm not going to bother).
made a copy of the string using PyString_FromStringAndSize(s, n) and modified
the copied string in-place. However, 1 (and 0) character strings are shared
from a cache. This caused "A".replace("A", "a") to change the cached version
of "A" -- used by everyone.
Now make the copy with NULL as the string and do the memcpy manually. I've
added regression tests to check if this happens in the future. Perhaps
there should be a PyString_Copy for this case?
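The shape of the fix, as a minimal sketch:

    /* Sketch: passing NULL asks for a fresh, uninitialized buffer, so the
       cached single-character string objects are never handed back and the
       in-place memcpy can no longer clobber a shared string (for n == 0
       there is nothing to copy anyway). */
    PyObject *copy = PyString_FromStringAndSize(NULL, n);
    if (copy == NULL)
        return NULL;
    memcpy(PyString_AS_STRING(copy), s, n);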
both mystrtoul.c and longobject.c. Share the table instead. Also
cut its size by 64 entries (they had been used for an inscrutable
trick originally, but the code no longer tries to use that trick).
results in a 2.5x speedup on the stringbench count tests, and a 20x (!)
speedup on the stringbench search/find/contains test, compared to 2.5a2.
for more on the algorithm, see:
http://effbot.org/zone/stringlib.htm
if you get weird results, you can disable the new algorithm by undefining
USE_FAST in Objects/unicodeobject.c.
enjoy /F
speed up splitlines and strip with charsets; etc. rsplit is now as
fast as split in all our tests (reverse takes no time at all), and
splitlines() is nearly as fast as a plain split("\n") in our tests.
and we're not done yet... ;-)
Applied patch zombie-frames-2.diff from sf patch 876206 with updates for
Python 2.5 and also modified to retain the free_list to avoid the 67%
slow-down in pybench recursion test. 5% speed up in function call pybench.
compiler warnings on Windows (signed vs unsigned mismatch
in comparisons). Cleaned that up by switching more locals
to Py_ssize_t. Simplified overflow checking (it can _be_
simpler because while these things are declared as
Py_ssize_t, they should in fact never be negative).
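Illustratively (a sketch of the kind of change, not the actual diff):

    /* Sketch: a Py_ssize_t local compares cleanly against object sizes
       (no signed/unsigned warning), and because the values involved are
       never negative, an overflow check for a + b is a single compare. */
    Py_ssize_t i, n = PyString_GET_SIZE(self);
    for (i = 0; i < n; i++) {
        /* ... */
    }
    if (a > PY_SSIZE_T_MAX - b) {
        PyErr_SetString(PyExc_OverflowError, "result is too long");
        return NULL;
    }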