1645 Commits

Author SHA1 Message Date
Tim Peters 738eda742c k_mul: Rearranged computation for better cache use. Ignored overflow
(it's possible, but should be harmless -- this requires more thought,
and allocating enough space in advance to prevent it requires exactly
as much thought, to know exactly how much that is -- the end result
certainly fits in the allocated space -- hmm, but that's really all
the thought it needs!  borrows/carries out of the high digits really
are harmless).
2002-08-12 15:08:20 +00:00
Tim Peters 44121a6bc9 x_mul(): This failed to normalize its result.
k_mul():  This didn't allocate enough result space when one input had
more than twice as many bits as the other.  This was partly hidden by
the fact that x_mul() didn't normalize its result.

The Karatsuba recurrence is pretty much hosed if the inputs aren't
roughly the same size.  If one has at least twice as many bits as the
other, we get a degenerate case where the "high half" of the smaller
input is 0.  Added a special case for that, for speed; although
it helped, this can still be much slower than the "grade school" method.
It seems to take a really wild imbalance to trigger that; e.g., a
2**22-bit input times a 1000-bit input on my box runs about twice as slow
under k_mul than under x_mul.  This still needs to be addressed.

I'm also not sure that allocating a->ob_size + b->ob_size digits is
enough, given that this is computing k = (ah+al)*(bh+bl) instead of
k = (ah-al)*(bl-bh); i.e., it's certainly enough for the final result,
but it's vaguely possible that adding in the "artificially" large k may
overflow that temporarily.  If so, an assert will trigger in the debug
build, but we'll probably compute the right result anyway(!).
2002-08-12 06:17:58 +00:00
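
The recurrence discussed in the k_mul() entries above can be illustrated with a short Python sketch. This is not the C code from the commit: the function name karatsuba, the 64-bit cutoff, and the use of Python ints in place of digit vectors are all assumptions made for illustration. It uses the k = (ah+al)*(bh+bl) form mentioned in the message.

```python
def karatsuba(a, b, cutoff_bits=64):
    """Multiply non-negative ints using the (ah+al)*(bh+bl) recurrence.

    Illustrative only: cutoff_bits is an arbitrary, assumed threshold.
    """
    if a.bit_length() <= cutoff_bits or b.bit_length() <= cutoff_bits:
        # Small operand: use the built-in product as the base case.
        return a * b
    shift = max(a.bit_length(), b.bit_length()) // 2
    mask = (1 << shift) - 1
    ah, al = a >> shift, a & mask   # high and low halves of a
    bh, bl = b >> shift, b & mask   # high and low halves of b
    hi = karatsuba(ah, bh, cutoff_bits)
    lo = karatsuba(al, bl, cutoff_bits)
    k = karatsuba(ah + al, bh + bl, cutoff_bits)
    # k - hi - lo equals ah*bl + al*bh, the middle term.
    return (hi << (2 * shift)) + ((k - hi - lo) << shift) + lo
```

When one input has far more bits than the other, bh is 0 in this sketch, which is the degenerate case the message describes.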
Tim Peters 877a212678 Introduced helper functions v_iadd and v_isub, for in-place digit-vector
addition and subtraction.  Reworked the tail end of k_mul() to use them.
This saves oodles of one-shot longobject allocations (this is a triply-
recursive routine, so saving one allocation in the body saves 3**n
allocations at depth n; we actually save 2 allocations in the body).
2002-08-12 05:09:36 +00:00
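
A rough Python sketch of what an in-place digit-vector addition might look like follows. Only the name v_iadd comes from the commit message; the 15-bit digit width, the little-endian list layout, and the return value (the carry out) are assumptions for illustration, not the actual C signature.

```python
SHIFT = 15                 # assumed digit width, for illustration only
MASK = (1 << SHIFT) - 1

def v_iadd(x, y):
    """Add digit vector y into x in place; return the carry out of the top digit.

    Both vectors are little-endian lists of digits and len(x) >= len(y).
    """
    assert len(x) >= len(y)
    carry = 0
    for i in range(len(y)):
        carry += x[i] + y[i]
        x[i] = carry & MASK
        carry >>= SHIFT
    i = len(y)
    # Propagate any remaining carry through the higher digits of x.
    while carry and i < len(x):
        carry += x[i]
        x[i] = carry & MASK
        carry >>= SHIFT
        i += 1
    return carry
```

Because the sum is written back into x, no temporary result object is needed, which is the saving the message refers to.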
Tim Peters fc07e56844 k_mul(): Repaired another typo in another comment. 2002-08-12 02:54:10 +00:00
Tim Peters 18c15b9bbd k_mul(): Repaired typo in comment. 2002-08-12 02:43:58 +00:00
Tim Peters 5af4e6c739 Cautious introduction of a patch that started from
SF 560379:  Karatsuba multiplication.
Lots of things were changed from that.  This needs a lot more testing,
for correctness and speed, the latter especially when bit lengths are
unbalanced.  For now, the Karatsuba code gets invoked if and only if
envar KARAT exists.
2002-08-12 02:31:19 +00:00
Tim Peters da1a2212c8 int_lshift(): Simplified/sped overflow-checking. 2002-08-11 17:54:42 +00:00
Guido van Rossum 643d59cbd6 Use a better check for overflow from a<<b. 2002-08-11 14:04:13 +00:00
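
One common way to check a<<b for overflow is to shift the truncated result back and compare it with the original operand. The sketch below emulates that idea in Python for a fixed word size; whether the commits use exactly this test is an assumption, and the names to_signed and lshift_overflows and the 32-bit default are hypothetical.

```python
def to_signed(value, bits):
    """Wrap an int to a two's-complement signed value of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def lshift_overflows(a, b, bits=32):
    """Return True if a << b loses bits or changes sign in a `bits`-wide int."""
    if b >= bits:
        return a != 0
    c = to_signed(a << b, bits)   # what a fixed-width machine would keep
    return (c >> b) != a          # arithmetic shift back must recover a
```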
Marc-André Lemburg cc8764ca9d Add C API PyUnicode_FromOrdinal() which exposes unichr() at C level.
u'%c' will now raise a ValueError in case the argument is an
integer outside the valid range of Unicode code point ordinals.

Closes SF bug #593581.
2002-08-11 12:23:04 +00:00
Guido van Rossum 078151da90 Implement stage B0 of PEP 237: add warnings for operations that
currently return inconsistent results for ints and longs; in
particular: hex/oct/%u/%o/%x/%X of negative short ints, and x<<n that
either loses bits or changes sign.  (No warnings for repr() of a long,
though that will also change to lose the trailing 'L' eventually.)

This introduces some warnings in the test suite; I'll take care of
those later.
2002-08-11 04:24:12 +00:00
Tim Peters 3ddb856ed1 Fixed new typos, added a little info about ~sort versus "hint"s. 2002-08-10 07:04:01 +00:00
Guido van Rossum 40af889081 Disallow class assignment completely unless both old and new are heap
types.  This prevents nonsense like 2.__class__ = bool or
True.__class__ = int.
2002-08-10 05:42:07 +00:00
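
The restriction can be observed from Python code. The sketch below is written for a current Python 3 interpreter, which is an assumption (the commit targets 2.3); the helper try_set_class and the classes A and B are hypothetical names.

```python
def try_set_class(obj, new_class):
    """Return True if obj.__class__ could be reassigned, False on TypeError."""
    try:
        obj.__class__ = new_class
    except TypeError:
        return False
    return True

class A:
    pass

class B:
    pass

instance = A()
```

Assignment between the two heap types A and B is accepted, while try_set_class(2, bool) and try_set_class(True, int) are refused.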
Tim Peters e05f65a0c6 1. Combined the base and length arrays into a single array of structs.
This is friendlier for caches.

2. Cut MIN_GALLOP to 7, but added a per-sort min_gallop vrbl that adapts
   the "get into galloping mode" threshold higher when galloping isn't
   paying, and lower when it is.  There's no known case where this hurts.
   It's (of course) neutral for /sort, \sort and =sort.  It also happens
   to be neutral for !sort.  It cuts a tiny # of compares in 3sort and +sort.
   For *sort, it reduces the # of compares to better than what this used to
   do when MIN_GALLOP was hardcoded to 10 (it did about 0.1% more *sort
   compares before, but given how close we are to the limit, this is "a
   lot"!).  %sort used to do about 1.5% more compares, and ~sort about
   3.6% more.  Here are exact counts:

 i    *sort    3sort    +sort    %sort    ~sort    !sort
15   449235    33019    33016    51328   188720    65534  before
     448885    33016    33007    50426   182083    65534  after
      0.08%    0.01%    0.03%    1.79%    3.65%    0.00%  %ch from after

16   963714    65824    65809   103409   377634   131070
     962991    65821    65808   101667   364341   131070
      0.08%    0.00%    0.00%    1.71%    3.65%    0.00%

17  2059092   131413   131362   209130   755476   262142
    2057533   131410   131361   206193   728871   262142
      0.08%    0.00%    0.00%    1.42%    3.65%    0.00%

18  4380687   262440   262460   421998  1511174   524286
    4377402   262437   262459   416347  1457945   524286
      0.08%    0.00%    0.00%    1.36%    3.65%    0.00%

19  9285709   524581   524634   848590  3022584  1048574
    9278734   524580   524633   837947  2916107  1048574
      0.08%    0.00%    0.00%    1.27%    3.65%    0.00%

20 19621118  1048960  1048942  1715806  6045418  2097150
   19606028  1048958  1048941  1694896  5832445  2097150
      0.08%    0.00%    0.00%    1.23%    3.65%    0.00%

3. Added some key asserts I overlooked before.

4. Updated the doc file.
2002-08-10 05:21:15 +00:00
Tim Peters b80595f44a The samplesort-vs-mergesort #-of-comparisons comparisons were captured
before %sort was introduced.  Redid them (the numbers change, but the
conclusions don't).  Also did the samplesort counts with the released
2.2.1, as they're slightly different under the last CVS 2.3 samplesort
(some higher, some lower -- CVS had been changed to stop doing the
special-case business on recursive samplesort calls).
2002-08-10 03:04:33 +00:00
Fred Drake f16c3dc81b Add support for the iterator protocol to weakref proxy objects.
Part of fixing SF bug #591704.
2002-08-09 18:34:16 +00:00
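
A small sketch of iterating through a weakref proxy, assuming a current Python 3 interpreter; the Countdown class is a hypothetical example type.

```python
import weakref

class Countdown:
    """Hypothetical iterable used only to demonstrate proxy iteration."""
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))

target = Countdown(3)            # keep a strong reference alive
proxy = weakref.proxy(target)
values = list(proxy)             # iteration is forwarded to the referent
```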
Guido van Rossum f36921c4b0 Unicode replace() method with empty pattern argument should fail, like
it does for 8-bit strings.
2002-08-09 15:36:48 +00:00
Neil Schemenauer 3bc3f28dbe Only call sq_repeat if the object does not have a nb_multiply slot. One
example of where this changes behavior is when a new-style instance
defines '__mul__' and '__rmul__' and is multiplied by an int.  Before the
change the '__rmul__' method is never called, even if the int is the
left operand.
2002-08-09 15:20:48 +00:00
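
The behavior described above can be checked with a small class. This sketch assumes a current Python 3 interpreter; the class name Scaled and the tuple return values are illustrative choices so that the method actually called is visible.

```python
class Scaled:
    """Hypothetical class that records which multiplication hook ran."""
    def __init__(self, value):
        self.value = value

    def __mul__(self, other):
        return ("__mul__", self.value * other)

    def __rmul__(self, other):
        return ("__rmul__", other * self.value)

left_int = 3 * Scaled(2)     # int is the left operand
right_int = Scaled(2) * 3    # instance is the left operand
```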
Tim Peters 671764beb0 Repaired a braino in the description of bad minrun values. 2002-08-09 05:06:44 +00:00
Guido van Rossum 721f62e200 Major speedup for new-style class creation. Turns out there was some
trampolining going on with the tp_new descriptor, where the inherited
PyType_GenericNew was overwritten with the much slower slot_tp_new
which would end up calling tp_new_wrapper which would eventually call
PyType_GenericNew.  Add a special case for this to update_one_slot().

XXX Hope there isn't a loophole in this.  I'll buy the first person to
point out a bug in the reasoning a beer.

Backport candidate (but I won't do it).
2002-08-09 02:14:34 +00:00
Raymond Hettinger 48923c5533 Moved special case for tuples from iterobject.c to
tupleobject.c. Makes the code in iterobject.c cleaner
and speeds-up the general case by not checking for
tuples everytime.   SF Patch #592065.
2002-08-09 01:30:17 +00:00
Guido van Rossum 7bed213224 Significant speedup in new-style object creation: in slot_tp_new(),
intern the string "__new__" so we can call PyObject_GetAttr() rather
than PyObject_GetAttrString().  (Though it's a mystery why slot_tp_new
is being called when a class doesn't define __new__.  I'll look into
that tomorrow.)

2.2 backport candidate (but I won't do it).
2002-08-08 21:57:53 +00:00
Guido van Rossum febd61dc02 A modest speedup of object deallocation. call_finalizer() did rather
a lot of work: it had to save and restore the current exception around
a call to lookup_maybe(), because that could fail in rare cases, and
most objects don't have a __del__ method, so the whole exercise was
usually a waste of time.  Changed this to cache the __del__ method in
the type object just like all other special methods, in a new slot
tp_del.  So now subtype_dealloc() can test whether tp_del is NULL and
skip the whole exercise if it is.  The new slot doesn't need a new
flag bit: subtype_dealloc() is only called if the type was dynamically
allocated by type_new(), so it's guaranteed to have all current slots.
Types defined in C cannot fill in tp_del with a function of their own,
so there's no corresponding "wrapper".  (That functionality is already
available through tp_dealloc.)
2002-08-08 20:55:20 +00:00
Tim Peters 6c511e6d1c Added info about highwater heap-memory use for the sortperf.py tests; + a
couple of minor edits elsewhere.
2002-08-08 01:55:16 +00:00
Tim Peters 6063e2615f PyList_Reverse(): This was leaking a reference to Py_None on every call.
I believe I introduced this bug when I refactored the reversal code so
that the mergesort could use it too.  It's not a problem on the 2.2 branch.
2002-08-08 01:06:39 +00:00
Guido van Rossum 0906e07442 Fix a subtle bug in the trashcan code I added yesterday to
subtype_dealloc().

When call_finalizer() failed, it would return without going through
the trashcan end macro, thereby unbalancing the trashcan nesting level
counter, and thereby defeating the test case (slottrash() in
test_descr.py).  This in turn meant that the assert in the GC_UNTRACK
macro wasn't triggered by the slottrash() test despite a bug in the
code: _PyTrash_destroy_chain() calls the dealloc routine with an
object that's untracked, and the assert in the GC_UNTRACK macro would
fail on this; but because of an earlier test that resurrects an
object, causing call_finalizer() to fail and the trashcan nesting
level to be unbalanced, so _PyTrash_destroy_chain() was never called.
Calling the slottrash() test in isolation *did* trigger the assert,
however.

So the fix is twofold: (1) call the GC_UnTrack() function instead of
the GC_UNTRACK macro, because the function is safe when the object is
already untracked; (2) when call_finalizer() fails, jump to a label
that exits through the trashcan end macro, keeping the trashcan
nesting balanced.
2002-08-07 20:42:09 +00:00
Martin v. Löwis 3f19b10ca5 Replace abort with Py_FatalError. 2002-08-07 16:21:51 +00:00
Neal Norwitz 657d222700 Make more functions static 2002-08-06 22:12:52 +00:00
Neal Norwitz d8b995f5e8 Make readahead functions static 2002-08-06 21:50:54 +00:00
Guido van Rossum 22b1387c51 Fix SF bug 574207 (chained __slots__ dealloc segfault).
This is inspired by SF patch 581742 (by Jonathan Hogg, who also
submitted the bug report, and two other suggested patches), but
separates the non-GC case from the GC case to avoid testing for GC
several times.

Had to fix an assert() from call_finalizer() that asserted that the
object wasn't untracked, because it's possible that the object isn't
GC'ed!
2002-08-06 21:41:44 +00:00
Barry Warsaw 6a043f3fe8 PyUnicode_Contains(): The memcmp() call didn't take into account the
width of Py_UNICODE.  Good catch, MAL.
2002-08-06 19:03:17 +00:00
Barry Warsaw 817918cc3c Committing patch #591250 which provides "str1 in str2" when str1 is a
string longer than 1 character.
2002-08-06 16:58:21 +00:00
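
For illustration, the multi-character containment test looks like this in Python; the variable names are arbitrary.

```python
haystack = "karatsuba multiplication"

found = "multi" in haystack       # multi-character substring
missing = "timsort" in haystack   # substring that does not occur
```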
Guido van Rossum 7a6e95948c SF patch 580331 by Oren Tirosh: make file objects their own iterator.
For a file f, iter(f) now returns f (unless f is closed), and f.next()
is similar to f.readline() when EOF is not reached; however, f.next()
uses a readahead buffer that messes up the file position, so mixing
f.next() and f.readline() (or other methods) doesn't work right.
Calling f.seek() drops the readahead buffer, but other operations
don't.

The real purpose of this change is to reduce the confusion between
objects and their iterators.  By making a file its own iterator, it's
made clearer that using the iterator modifies the file object's state
(in particular the current position).

A nice side effect is that this speeds up "for line in f:" by not
having to use the xreadlines module.  The f.xreadlines() method is
still supported for backwards compatibility, though it is the same as
iter(f) now.

(I made some cosmetic changes to Oren's code, and added a test for
"file closed" to file_iternext() and file_iter().)
2002-08-06 15:55:28 +00:00
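
The "file is its own iterator" idea can be sketched as follows. This uses io.StringIO on Python 3 as a stand-in, which is an assumption: the commit concerns the 2.x file type, and the readahead-buffer caveat described above is not reproduced here.

```python
import io

f = io.StringIO("one\ntwo\nthree\n")

same_object = iter(f) is f   # the stream is its own iterator
first = next(f)              # consumes one line and advances the position
rest = list(f)               # iteration continues from the current position
```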
Raymond Hettinger bc552ce1b8 SF 582071 clarified the .split() method's docstring to note that sep=None
will trigger splitting on any whitespace.
2002-08-05 06:28:21 +00:00
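
A short illustration of the documented behavior, with arbitrary sample strings:

```python
text = "  alpha \t beta\n gamma  "

by_default = text.split()       # no separator: split on runs of whitespace
by_none = text.split(None)      # sep=None behaves the same way
by_space = "a  b".split(" ")    # explicit separator keeps empty fields
```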
Tim Peters 66860f6da4 Sped the usual case for sorting by calling PyObject_RichCompareBool
directly when no comparison function is specified.  This saves a layer
of function call on every compare then.  Measured speedups:

 i    2**i  *sort  \sort  /sort  3sort  +sort  %sort  ~sort  =sort  !sort
15   32768  12.5%   0.0%   0.0% 100.0%   0.0%  50.0% 100.0% 100.0% -50.0%
16   65536   8.7%   0.0%   0.0%   0.0%   0.0%   0.0%  12.5%   0.0%   0.0%
17  131072   8.0%  25.0%   0.0%  25.0%   0.0%  14.3%   5.9%   0.0%   0.0%
18  262144   6.3% -10.0%  12.5%  11.1%   0.0%   6.3%   5.6%  12.5%   0.0%
19  524288   5.3%   5.9%   0.0%   5.6%   0.0%   5.9%   5.4%   0.0%   2.9%
20 1048576   5.3%   2.9%   2.9%   5.1%   2.8%   1.3%   5.9%   2.9%   4.2%

The best indicators are those that take significant time (larger i), and
where sort doesn't do very few compares (so *sort and ~sort benefit most
reliably).  The large numbers are due to roundoff noise combined with
platform variability; e.g., the 14.3% speedup for %sort at i=17 reflects
a printed elapsed time of 0.18 seconds falling to 0.17, but a change in
the last digit isn't really meaningful (indeed, if it really took 0.175
seconds, one electron having a lazy nanosecond could shift it to either
value <wink>).  Similarly the 25% at 3sort i=17 was a meaningless change
from 0.05 to 0.04.  However, almost all the "meaningless changes" were
in the same direction, which is good.  The before-and-after times for
*sort are clearest:

before after
  0.18  0.16
  0.25  0.23
  0.54  0.50
  1.18  1.11
  2.57  2.44
  5.58  5.30
2002-08-04 17:47:26 +00:00
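
The *sort column in the speedup table appears to be computed as (before - after) / after. That formula is inferred from the numbers rather than stated in the message, so treat it as an assumption; the sketch below applies it to the before-and-after times listed above.

```python
before = [0.18, 0.25, 0.54, 1.18, 2.57, 5.58]
after = [0.16, 0.23, 0.50, 1.11, 2.44, 5.30]

def speedup_percent(b, a):
    """Percentage speedup relative to the 'after' time (assumed formula)."""
    return round(100.0 * (b - a) / a, 1)

speedups = [speedup_percent(b, a) for b, a in zip(before, after)]
```

Under the same formula, the change from 0.05 to 0.04 mentioned for 3sort works out to 25%.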
Tim Peters 6bdbc9e0b1 SF bug 590366: Small typo in listsort:ParseTuple
The PyArg_ParseTuple() error string still said "msort".  Changed to "sort".
2002-08-03 02:28:24 +00:00
Guido van Rossum f4be427c46 Tim found that once test_longexp has run, test_sort takes very much
longer to run than normal.  A profiler run showed that this was due to
PyFrame_New() taking up an unreasonable amount of time.  A little
thinking showed that this was due to the while loop clearing the space
available for the stack.  The solution is to only clear the local
variables (and cells and free variables), not the space available for
the stack, since anything beyond the stack top is considered to be
garbage anyway.  Also, use memset() instead of a while loop counting
backwards.  This should be a time savings for normal code too!  (By a
probably unmeasurable amount. :-)
2002-08-01 18:50:33 +00:00
Guido van Rossum 0dbab4c560 SF patch 588728 (Nathan Srebro).
The __delete__ method wrapper for descriptors was not supported.

(I added a test, too.)

2.2 bugfix candidate.
2002-08-01 14:39:25 +00:00
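
One reading of this fix is that calling a descriptor's __delete__ explicitly should work. The sketch below does that with a property on a current Python 3 interpreter; the class Owner and its attribute names are hypothetical, and the interpretation of the commit is an assumption.

```python
class Owner:
    """Hypothetical class with a deletable property."""
    def __init__(self):
        self._x = 1

    def _get_x(self):
        return self._x

    def _del_x(self):
        del self._x

    x = property(_get_x, None, _del_x)

owner = Owner()
descriptor = Owner.__dict__["x"]
descriptor.__delete__(owner)   # explicit call through the __delete__ wrapper
```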
Tim Peters a64dc245ac Replaced samplesort with a stable, adaptive mergesort. 2002-08-01 02:13:36 +00:00
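
Stability means that items comparing equal keep their original relative order. A minimal illustration with made-up records:

```python
records = [("pear", 2), ("apple", 1), ("fig", 2), ("kiwi", 1)]

# Sort by the count only; names with equal counts stay in input order.
by_count = sorted(records, key=lambda record: record[1])
```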
Tim Peters 92f81f2e63 Checking in the doc file for "timsort". There's way too much here to
stuff into code comments, and lots of it is going to be useful again (but
hard to predict exactly which parts of it ...).
2002-08-01 00:59:42 +00:00
Neal Norwitz cee5ca060b SF patch #587889, fix memory leak of tp_doc 2002-07-30 00:42:06 +00:00
Michael W. Hudson 56796f672f Fix for
[ 587875 ] crash on deleting extended slice

The array code got simpler, always a good thing!
2002-07-29 14:35:04 +00:00
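
Deleting an extended slice, the operation named in the bug title, looks like this. The sketch assumes the array module on a current Python 3 interpreter; the sample values are arbitrary.

```python
from array import array

values = array("i", range(10))
del values[::2]              # remove every second element, starting at index 0
remaining = list(values)
```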
Mark Hammond a290527376 Excise DL_IMPORT/EXPORT from object.h, and related files. This patch
also adds 'extern' to PyAPI_DATA rather than at each declaration, as
discussed with Tim and Guido.
2002-07-29 13:42:14 +00:00
Neal Norwitz 88fe4ff5a9 Fix the problem of not raising a TypeError exception when doing:
    '%g' % '1'
    '%d' % '1'

Add a test for these conditions.
Fix the test so that if no exception is raised, this is a failure.
2002-07-28 16:44:23 +00:00
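
A sketch of the kind of check described, written for a current Python 3 interpreter; the helper name raises_type_error is hypothetical. Note that it treats a missing exception as a failure, in the spirit of the test fix.

```python
def raises_type_error(fmt, arg):
    """Return True only if formatting arg with fmt raises TypeError."""
    try:
        fmt % arg
    except TypeError:
        return True
    return False   # no exception raised: the caller should treat this as failure
```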
Martin v. Löwis 673c0a2247 Patch #574867: Correct list.extend docstring. 2002-07-28 16:35:57 +00:00
Neal Norwitz 7beeed5dfd SF patch #577031, remove PyArg_Parse() since it's deprecated 2002-07-28 15:19:47 +00:00
Martin v. Löwis 75d2d94e0f Patch #554716: Use __va_copy where available. 2002-07-28 10:23:27 +00:00
Skip Montanaro 35b37a5c11 tighten up the unicode object's docstring a tad 2002-07-26 16:22:46 +00:00
Jeremy Hylton 73a088e3fa Don't be so hasty. If PyInt_AsLong() raises an error, don't set ValueError. 2002-07-25 16:43:29 +00:00
Jeremy Hylton f20fcf9fed Complain if __len__() returns < 0, just like classic classes.
Fixes SF bug #575773.

Bug fix candidate.
2002-07-25 16:06:15 +00:00
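
The complaint can be observed as follows on a current Python 3 interpreter, where it surfaces as a ValueError; the class BadLen and the helper len_error are hypothetical names.

```python
class BadLen:
    """Hypothetical class whose __len__ returns an invalid negative value."""
    def __len__(self):
        return -1

def len_error(obj):
    """Return the error message from len(obj), or None if it succeeds."""
    try:
        len(obj)
    except ValueError as exc:
        return str(exc)
    return None

message = len_error(BadLen())
```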
Michael W. Hudson 206d8f818f Silly typo. Not sure how that got in. 2002-07-19 15:52:38 +00:00