Commit Graph

1690 Commits

Author SHA1 Message Date
Guido van Rossum bf935fde15 string_contains(): speed up by avoiding function calls where
possible.  This always called PyUnicode_Check() and PyString_Check(),
at least one of which would call PyType_IsSubtype().  Also, this would
call PyString_Size() on known string objects.
2002-08-24 06:57:49 +00:00
Guido van Rossum 6248f441ea Speedup for PyObject_IsTrue(): check for True and False first.
Because all built-in tests return bools now, this is the most common
path!
2002-08-24 06:31:34 +00:00
Guido van Rossum 81912d4764 Speedup for PyObject_RichCompareBool(): PyObject_RichCompare() almost
always returns a bool, so avoid calling PyObject_IsTrue() in that
case.
2002-08-24 05:33:28 +00:00
Guido van Rossum 2023c9b84a Fix SF bug 599128, submitted by Inyeol Lee: .replace() would do the
wrong thing for a unicode subclass when there were zero string
replacements.  The example given in the SF bug report was only one way
to trigger this; replacing a string of length >= 2 that's not found is
another.  The code would actually write outside allocated memory if
replacement string was longer than the search string.

(I wonder how many more of these are lurking?  The unicode code base
is full of wonders.)

Bugfix candidate; this same bug is present in 2.2.1.
2002-08-23 18:50:21 +00:00
Guido van Rossum 8b1a6d694f Code by Inyeol Lee, submitted to SF bug 595350, to implement
the string/unicode method .replace() with a zero-lengt first argument.
Inyeol contributed tests for this too.
2002-08-23 18:21:28 +00:00
Tim Peters 0d2d87d202 long_format(), long_lshift(): Someone on c.l.py is trying to boost
SHIFT and MASK, and widen digit.  One problem is that code of the form

    digit << small_integer

implicitly assumes that the result fits in an int or unsigned int
(platform-dependent, but "int sized" in any case), since digit is
promoted "just" to int or unsigned via the usual integer promotions.
But if digit is typedef'ed as unsigned int, this loses information.
The cure for this is just to cast digit to twodigits first.
2002-08-20 19:00:22 +00:00
Guido van Rossum 76afbd9aa4 Fix some endcase bugs in unicode rfind()/rindex() and endswith().
These were reported and fixed by Inyeol Lee in SF bug 595350.  The
endswith() bug was already fixed in 2.3, but this adds some more test
cases.
2002-08-20 17:29:29 +00:00
Tim Peters 75585d4ec1 getinstclassname(): Squash new compiler wng in assert (comparison of
signed vs unsigned).
2002-08-20 14:31:35 +00:00
Guido van Rossum 45ec02aed1 SF patch 576101, by Oren Tirosh: alternative implementation of
interning.  I modified Oren's patch significantly, but the basic idea
and most of the implementation is unchanged.  Interned strings created
with PyString_InternInPlace() are now mortal, and you must keep a
reference to the resulting string around; use the new function
PyString_InternImmortal() to create immortal interned strings.
2002-08-19 21:43:18 +00:00
Guido van Rossum e3a8e7ed1d Call me anal, but there was a particular phrase that was speading to
comments everywhere that bugged me: /* Foo is inlined */ instead of
/* Inline Foo */.  Somehow the "is inlined" phrase always confused me
for half a second (thinking, "No it isn't" until I added the missing
"here").  The new phrase is hopefully unambiguous.
2002-08-19 19:26:42 +00:00
Guido van Rossum 056fbf422d Another modest speedup in PyObject_GenericGetAttr(): inline the call
to _PyType_Lookup().
2002-08-19 19:22:50 +00:00
Guido van Rossum 492b46f29e Make PyDescr_IsData() a macro. It's too simple to be a function.
Should save 4% on slot lookups.
2002-08-19 18:45:37 +00:00
Michael W. Hudson 69734a5272 Check in my ultra-shortlived patch #597220.
Move some debugging checks inside Py_DEBUG.

They were causing cache misses according to cachegrind.
2002-08-19 16:54:08 +00:00
Guido van Rossum c66ff4441e Inline call to _PyObject_GetDictPtr() in PyObject_GenericGetAttr().
This causes a modest speedup.
2002-08-19 16:50:48 +00:00
Guido van Rossum c588e9041a Simple but important optimization for descr_check(): instead of the
expensive and overly general PyObject_IsInstance(), call
PyObject_TypeCheck() which is a macro that often avoids a call, and if
it does make a call, calls the much more efficient PyType_IsSubtype().
This saved 6% on a benchmark for slot lookups.
2002-08-19 16:02:33 +00:00
Neal Norwitz b898d9fc9a Get this to compile again if Py_USING_UNICODE is not defined.
com_error() is static in Python/compile.c.
2002-08-16 23:20:39 +00:00
Guido van Rossum 84b2bed435 Squash a few calls to the hideously expensive PyObject_CallObject(o,a)
-- replace then with slightly faster PyObject_Call(o,a,NULL).  (The
difference is that the latter requires a to be a tuple; the former
allows other values and wraps them in a tuple if necessary; it
involves two more levels of C function calls to accomplish all that.)
2002-08-16 17:01:09 +00:00
Guido van Rossum 8e829200b1 Fix SF bug 595838 -- buffer in type_new() should not be static. Moved
to inner scope, too.
2002-08-16 03:47:49 +00:00
Tim Peters e417de0e56 Illustrating by example one good reason not to trust a proof <wink>. 2002-08-15 20:10:45 +00:00
Tim Peters ab86c2be24 k_mul() comments: In honor of Dijkstra, made the proof that "t3 fits"
rigorous instead of hoping for testing not to turn up counterexamples.
Call me heretical, but despite that I'm wholly confident in the proof,
and have done it two different ways now, I still put more faith in
testing ...
2002-08-15 20:06:00 +00:00
Tim Peters 9973d74b2d long_mul(): Simplified exit code. In particular, k_mul() returns a
normalized result, so no point to normalizing it again.  The number
of test+branches was also excessive.
2002-08-15 19:41:06 +00:00
Michael W. Hudson dd32a91cc0 This is my patch
[ 587993 ] SET_LINENO killer

Remove SET_LINENO.  Tracing is now supported by inspecting co_lnotab.

Many sundry changes to document and adapt to this change.
2002-08-15 14:59:02 +00:00
Jeremy Hylton 8b73542cf5 Reflow long lines. 2002-08-14 21:01:41 +00:00
Guido van Rossum 54df53a352 More changes of DeprecationWarning to FutureWarning. 2002-08-14 18:38:27 +00:00
Guido van Rossum 323a9cfc83 PyType_Ready(): initialize the base class a bit earlier, so that if we
copy the metatype from the base, the base actually has one!
2002-08-14 17:26:30 +00:00
Tim Peters 48d52c0fcc k_mul() comments: Simplified the simplified explanation of why ah*bh and
al*bl "always fit":  it's actually trivial given what came before.
2002-08-14 17:07:32 +00:00
Tim Peters 8e966ee49a k_mul() comments: Explained why there's always enough room to subtract
ah*bh and al*bl.  This is much easier than explaining why that's true
for (ah+al)*(bh+bl), and follows directly from the simple part of the
(ah+al)*(bh+bl) explanation.
2002-08-14 16:36:23 +00:00
Martin v. Löwis eb3f00aeeb Check for trailing backslash. Fixes #593656. 2002-08-14 08:22:50 +00:00
Martin v. Löwis 8a8da798a5 Patch #505705: Remove eval in pickle and cPickle. 2002-08-14 07:46:28 +00:00
Neal Norwitz 5dc2a37f0f Allow more docstrings to be removed during compilation 2002-08-13 22:19:13 +00:00
Tim Peters cba6e96929 Fixed error in new comment. 2002-08-13 20:42:00 +00:00
Tim Peters d6974a54ab k_mul(): The fix for (ah+al)*(bh+bl) spilling 1 bit beyond the allocated
space is no longer needed, so removed the code.  It was only possible when
a degenerate (ah->ob_size == 0) split happened, but after that fix went
in I added k_lopsided_mul(), which saves the body of k_mul() from seeing
a degenerate split.  So this removes code, and adds a honking long comment
block explaining why spilling out of bounds isn't possible anymore.  Note:
ff we end up spilling out of bounds anyway <wink>, an assert in v_iadd()
is certain to trigger.
2002-08-13 20:37:51 +00:00
Neal Norwitz d47714a727 Allow docstrings to be removed during compilation for *SLOT macro and friends 2002-08-13 19:01:38 +00:00
Neal Norwitz 858e34f649 Allow docstrings to be removed during compilation 2002-08-13 17:18:45 +00:00
Guido van Rossum 4571e9d42a Add an improvement wrinkle to Neil Schemenauer's change to int_mul
(rev. 2.86).  The other type is only disqualified from sq_repeat when
it has the CHECKTYPES flag.  This means that for extension types that
only support "old-style" numeric ops, such as Zope 2's ExtensionClass,
sq_repeat still trumps nb_multiply.
2002-08-13 10:05:56 +00:00
Guido van Rossum d8c8048f5e Fix comment for PyLong_AsUnsignedLong() to say that the return value
is an *unsigned* long.
2002-08-13 00:24:58 +00:00
Tim Peters 1203403743 k_lopsided_mul(): This allocated more space for bslice than necessary. 2002-08-12 22:10:00 +00:00
Tim Peters 6000464d08 Added new function k_lopsided_mul(), which is much more efficient than
k_mul() when inputs have vastly different sizes, and a little more
efficient when they're close to a factor of 2 out of whack.

I consider this done now, although I'll set up some more correctness
tests to run overnight.
2002-08-12 22:01:34 +00:00
Tim Peters 547607c4bf k_mul(): Moved an assert down. In a debug build, interrupting a
multiply via Ctrl+C could cause a NULL-pointer dereference due to
the assert.
2002-08-12 19:43:49 +00:00
Tim Peters 70b041bbe7 k_mul(): Heh -- I checked in two fixes for the last problem. Only keep
the good one <wink>.  Also checked in a test-aid by mistake.
2002-08-12 19:38:01 +00:00
Tim Peters d8b2173ef9 k_mul(): White-box testing turned up that (ah+al)*(bh+bl) can, in rare
cases, overflow the allocated result object by 1 bit.  In such cases,
it would have been brought back into range if we subtracted al*bl and
ah*bh from it first, but I don't want to do that because it hurts cache
behavior.  Instead we just ignore the excess bit when it appears -- in
effect, this is forcing unsigned mod BASE**(asize + bsize) arithmetic
in a case where that doesn't happen all by itself.
2002-08-12 19:30:26 +00:00
Guido van Rossum 3747a0f04c Fix MSVC warnings. 2002-08-12 19:25:08 +00:00
Guido van Rossum ad47da072a Refactor how __dict__ and __weakref__ interact with __slots__.
1. You can now have __dict__ and/or __weakref__ in your __slots__
   (before only __weakref__ was supported).  This is treated
   differently than before: it merely sets a flag that the object
   should support the corresponding magic.

2. Dynamic types now always have descriptors __dict__ and __weakref__
   thrust upon them.  If the type in fact does not support one or the
   other, that descriptor's __get__ method will raise AttributeError.

3. (This is the reason for all this; it fixes SF bug 575229, reported
   by Cesar Douady.)  Given this code:
      class A(object): __slots__ = []
      class B(object): pass
      class C(A, B): __slots__ = []
   the class object for C was broken; its size was less than that of
   B, and some descriptors on B could cause a segfault.  C now
   correctly inherits __weakrefs__ and __dict__ from B, even though A
   is the "primary" base (C.__base__ is A).

4. Some code cleanup, and a few comments added.
2002-08-12 19:05:44 +00:00
Tim Peters 115c888b97 x_mul(): Made life easier for C optimizers in the "grade school"
algorithm.  MSVC 6 wasn't impressed <wink>.

Something odd:  the x_mul algorithm appears to get substantially worse
than quadratic time as the inputs grow larger:

bits in each input   x_mul time   k_mul time
------------------   ----------   ----------
             15360         0.01         0.00
             30720         0.04         0.01
             61440         0.16         0.04
            122880         0.64         0.14
            245760         2.56         0.40
            491520        10.76         1.23
            983040        71.28         3.69
           1966080       459.31        11.07

That is, x_mul is perfectly quadratic-time until a little burp at
2.56->10.76, and after that goes to hell in a hurry.  Under Karatsuba,
doubling the input size "should take" 3 times longer instead of 4, and
that remains the case throughout this range.  I conclude that my "be nice
to the cache" reworkings of k_mul() are paying.
2002-08-12 18:25:43 +00:00
Tim Peters d64c1def7c k_mul() and long_mul(): I'm confident that the Karatsuba algorithm is
correct now, so added some final comments, did some cleanup, and enabled
it for all long-int multiplies.  The KARAT envar no longer matters,
although I left some #if 0'ed code in there for my own use (temporary).
k_mul() is still much slower than x_mul() if the inputs have very
differenent sizes, and that still needs to be addressed.
2002-08-12 17:36:03 +00:00
Tim Peters 738eda742c k_mul: Rearranged computation for better cache use. Ignored overflow
(it's possible, but should be harmless -- this requires more thought,
and allocating enough space in advance to prevent it requires exactly
as much thought, to know exactly how much that is -- the end result
certainly fits in the allocated space -- hmm, but that's really all
the thought it needs!  borrows/carries out of the high digits really
are harmless).
2002-08-12 15:08:20 +00:00
Tim Peters 44121a6bc9 x_mul(): This failed to normalize its result.
k_mul():  This didn't allocate enough result space when one input had
more than twice as many bits as the other.  This was partly hidden by
that x_mul() didn't normalize its result.

The Karatsuba recurrence is pretty much hosed if the inputs aren't
roughly the same size.  If one has at least twice as many bits as the
other, we get a degenerate case where the "high half" of the smaller
input is 0.  Added a special case for that, for speed, but despite that
it helped, this can still be much slower than the "grade school" method.
It seems to take a really wild imbalance to trigger that; e.g., a
2**22-bit input times a 1000-bit input on my box runs about twice as slow
under k_mul than under x_mul.  This still needs to be addressed.

I'm also not sure that allocating a->ob_size + b->ob_size digits is
enough, given that this is computing k = (ah+al)*(bh+bl) instead of
k = (ah-al)*(bl-bh); i.e., it's certainly enough for the final result,
but it's vaguely possible that adding in the "artificially" large k may
overflow that temporarily.  If so, an assert will trigger in the debug
build, but we'll probably compute the right result anyway(!).
2002-08-12 06:17:58 +00:00
Tim Peters 877a212678 Introduced helper functions v_iadd and v_isub, for in-place digit-vector
addition and subtraction.  Reworked the tail end of k_mul() to use them.
This saves oodles of one-shot longobject allocations (this is a triply-
recursive routine, so saving one allocation in the body saves 3**n
allocations at depth n; we actually save 2 allocations in the body).
2002-08-12 05:09:36 +00:00
Tim Peters fc07e56844 k_mul(): Repaired another typo in another comment. 2002-08-12 02:54:10 +00:00
Tim Peters 18c15b9bbd k_mul(): Repaired typo in comment. 2002-08-12 02:43:58 +00:00