Commit Graph

2748 Commits

Author SHA1 Message Date
Georg Brandl 5d59c09834 Patch #1567691: super() and new.instancemethod() now don't accept
keyword arguments any more (previously they accepted them, but didn't
use them).
2006-09-30 08:43:30 +00:00
Brett Cannon f6aa86e33b Allow exceptions to be directly sliced again
(e.g., ``BaseException(1,2,3)[0:2]``).

Discovered in Python 2.5.0 by Thomas Heller and reported to python-dev.  This
should be backported to 2.5 .
2006-09-20 18:43:13 +00:00
Brett Cannon ca2ca79d23 Remove the __unicode__ method from exceptions. Allows unicode() to be called
on exception classes.  Would require introducing a tp_unicode slot to make it
work otherwise.

Fixes bug #1551432 and will be backported.
2006-09-09 07:11:46 +00:00
Raymond Hettinger c563a1c32b Fix refcounts and add error checks. 2006-09-07 02:42:48 +00:00
Georg Brandl 38f6237dfe Bug #1542051: Exceptions now correctly call PyObject_GC_UnTrack.
Also make sure that every exception class has __module__ set to
'exceptions'.
2006-09-06 06:50:05 +00:00
Neal Norwitz a22975fb35 Fix SF bug #1546288, crash in dict_equal. 2006-09-05 02:24:03 +00:00
Tim Peters c10c9d0d6b "Conceptual" merge of rev 51711 from the 2.5 branch.
i_divmod():  As discussed on Python-Dev, changed the overflow
checking to live happily with recent gcc optimizations that
assume signed integer arithmetic never overflows.

This differs from the corresponding change on the 2.5 and 2.4
branches, using a less obscure approach, but one that /may/
tickle platform idiocies in their definitions of LONG_MIN.
The 2.4 + 2.5 change avoided introducing a dependence on
LONG_MIN, at the cost of substantially goofier code.
2006-09-05 02:18:09 +00:00
Raymond Hettinger a0c95fa4d8 Fix endcase for str.rpartition() 2006-09-04 15:32:48 +00:00
Brett Cannon 2b3666f737 Make sure memory is properly cleaned up in file_init.
Backport candidate.
2006-08-31 18:54:26 +00:00
Alex Martelli 348dc88097 Reverting the patch that tried to fix the issue whereby x**2 raises
OverflowError while x*x succeeds and produces infinity; apparently
these inconsistencies cannot be fixed across ``all'' platforms and
there's a widespread feeling that therefore ``every'' platform
should keep suffering forevermore.  Ah well.
2006-08-23 22:17:59 +00:00
Alex Martelli 20362a820b x**2 should about equal x*x (including for a float x such that the result is
inf) but didn't; added a test to test_float to verify that, and ignored the
ERANGE value for errno in the pow operation to make the new test pass (with
help from Marilyn Davis at the Google Python Sprint -- thanks!).
2006-08-23 20:42:02 +00:00
Neal Norwitz 17753ecbfa Patch #1541585: fix buffer overrun when performing repr() on
a unicode string in a build with wide unicode (UCS-4) support.

This code could be improved, so add an XXX comment.
2006-08-21 22:21:19 +00:00
Neal Norwitz 076d1e0c0b Fix a couple of ssize-t issues reported by Alexander Belopolsky on python-dev 2006-08-21 18:20:10 +00:00
Neal Norwitz 7fd9607bad Move initialization to after the asserts for non-NULL values.
Klocwork 286-287.

(I'm not backporting this, but if someone wants to, feel free.)
2006-08-19 04:28:55 +00:00
Neal Norwitz 6cbb726539 Move initialization of interned strings to before allocating the
object so we don't leak op.  (Fixes an earlier patch to this code)

Klockwork #350
2006-08-19 04:22:33 +00:00
Neal Norwitz 271a8689e9 Subclasses of int/long are allowed to define an __index__. 2006-08-15 06:29:03 +00:00
Georg Brandl 26a07b5198 Fix refleak introduced in rev. 51248. 2006-08-14 20:25:39 +00:00
Marc-André Lemburg 3a457790c7 Correct an accidentally removed previous patch. 2006-08-14 12:57:27 +00:00
Marc-André Lemburg 040f76b79c Slightly revised version of patch #1538956:
Replace UnicodeDecodeErrors raised during == and !=
compares of Unicode and other objects with a new
UnicodeWarning.

All other comparisons continue to raise exceptions.
Exceptions other than UnicodeDecodeErrors are also left
untouched.
2006-08-14 10:55:19 +00:00
Neal Norwitz af33f2d571 Can't return NULL from a void function. If there is a memory error,
about the best we can do is call PyErr_WriteUnraisable and go on.
We won't be able to do the call below either, so verify delstr is valid.
2006-08-14 00:59:03 +00:00
Neal Norwitz 56423e5762 Fix segfault when doing string formatting on subclasses of long if
__oct__, __hex__ don't return a string.

Klocwork 308
2006-08-13 18:11:08 +00:00
Neal Norwitz b09f4f578f Handle a whole lot of failures from PyString_FromInternedString().
Should fix most of Klocwork 234-272.
2006-08-13 18:10:28 +00:00
Neal Norwitz 1872b1c01f Fix a couple of bugs exposed by the new __index__ code. The 64-bit buildbots
were failing due to inappropriate clipping of numbers larger than 2**31
with new-style classes. (typeobject.c)  In reviewing the code for classic
classes, there were 2 problems.  Any negative value return could be returned.
Always return -1 if there was an error.  Also make the checks similar
with the new-style classes.  I believe this is correct for 32 and 64 bit
boxes, including Windows64.

Add a test of classic classes too.
2006-08-12 18:44:06 +00:00
Neal Norwitz 8a87f5d37e Patch #1538606, Patch to fix __index__() clipping.
I modified this patch some by fixing style, some error checking, and adding
XXX comments.  This patch requires review and some changes are to be expected.
I'm checking in now to get the greatest possible review and establish a
baseline for moving forward.  I don't want this to hold up release if possible.
2006-08-12 17:03:09 +00:00
Neal Norwitz b88cfad318 Check return of PyMem_MALLOC (garbage) is non-NULL.
Check seq in both portions of if/else.

Klocwork #289-290.
2006-08-12 03:16:54 +00:00
Neal Norwitz 421c1319ad Whoops, how did that get in there. :-) Revert all the parts of 51227 that were not supposed to go it. Only Modules/_ctypes/cfields.c was supposed to be changed 2006-08-12 02:12:30 +00:00
Neal Norwitz 6b4953fd3d Check returned pointer is valid.
Klocwork #233
2006-08-12 02:06:34 +00:00
Neal Norwitz 6f5ff3f3eb Klocwork made another run and found a bunch more problems.
This is the first batch of fixes that should be easy to verify based on context.

This fixes problem numbers: 220 (ast), 323-324 (symtable),
321-322 (structseq), 215 (array), 210 (hotshot), 182 (codecs), 209 (etree).
2006-08-12 01:43:40 +00:00
Armin Rigo 51fc8c456e Fix and test for an infinite C recursion. 2006-08-09 14:55:26 +00:00
Martin v. Löwis ab2f8f7bd5 __hash__ may now return long int; the final hash
value is obtained by invoking hash on the long int.
Fixes #1536021.
2006-08-09 07:57:39 +00:00
Thomas Heller ab1049c046 memcmp() can return values other than -1, 0, and +1 but tp_compare
must not.
2006-08-08 17:37:00 +00:00
Georg Brandl b0061c8e93 Remove "non-mapping" and "non-sequence" from TypeErrors raised by
PyMapping_Size and PySequence_Size.

Because len() tries first sequence, then mapping size, it will always
raise a "non-mapping object has no len" error which is confusing.
2006-08-08 11:56:21 +00:00
Andrew M. Kuchling 0067b5fa2b Typo fixes 2006-08-04 20:37:43 +00:00
Georg Brandl e9462c72bd Change fix for segfaulting property(), add a NEWS entry and a test. 2006-08-04 18:03:37 +00:00
Georg Brandl 45381938e9 Fix bug caused by first decrefing, then increfing. 2006-08-04 06:03:53 +00:00
Fred Drake 7a36f5f344 SF patch #1534048 (bug #1531003): fix typo in error message 2006-08-04 05:17:21 +00:00
Neal Norwitz c5e060dee6 _PyWeakref_GetWeakrefCount() now returns a Py_ssize_t instead of long. 2006-08-02 06:14:22 +00:00
Andrew M. Kuchling 5a51bf50b8 typo fix 2006-08-01 16:24:30 +00:00
Neal Norwitz a7edb11122 Whitespace normalization 2006-07-30 06:59:13 +00:00
Neal Norwitz f71ec5a0ac Bug #1515471: string.replace() accepts character buffers again.
Pass the char* and size around rather than PyObject's.
2006-07-30 06:57:04 +00:00
Andrew M. Kuchling 52740be425 [Bug #1414697] Change docstring of set/frozenset types to specify that the contents are unique. Raymond, please feel free to edit or revert. 2006-07-29 15:10:32 +00:00
Neal Norwitz 101bac205d Closure can't be NULL at this point since we know it's a tuple.
Reported by Klocwork # 74.
2006-07-27 03:55:39 +00:00
Neal Norwitz c09efa8444 Move the initialization of size_a down below the check for a being NULL.
Reported by Klocwork #106
2006-07-23 07:53:14 +00:00
Neal Norwitz e1fdb32ff2 Handle allocation failures gracefully. Found with failmalloc.
Many (all?) of these could be backported.
2006-07-21 05:32:28 +00:00
Neal Norwitz 1adbb50701 Move the initialization of some pointers earlier. The problem is
that if we call Py_DECREF(frame) like we do if allocating locals fails,
frame_dealloc() will try to use these bogus values and crash.
2006-07-21 05:31:02 +00:00
Neal Norwitz 48808a1d6c Add some asserts that we got good params passed 2006-07-21 05:29:58 +00:00
Neal Norwitz 04e39ec815 otherset is known to be non-NULL based on checks before and DECREF after.
DECREF otherset rather than XDECREF in error conditions too.

Reported by Klockwork #154.
2006-07-17 00:57:15 +00:00
Neal Norwitz b337bb541b Stop INCREFing name, then checking if it's NULL. name (f_name) should never
be NULL so assert it.  Fix one place where we could have passed NULL.

Reported by Klocwork #66.
2006-07-17 00:55:45 +00:00
Neal Norwitz ee4cc698ca PyFunction_SetDefaults() is documented as taking None or a tuple.
A NULL would crash the PyTuple_Check().  Now make NULL return a SystemError.

Reported by Klocwork #73.
2006-07-16 02:35:47 +00:00
Neal Norwitz fc28e0de58 Handle a NULL name properly.
Reported by Klocwork #67
2006-07-16 02:32:03 +00:00
Neal Norwitz 4b0a315c31 Use sizeof(buffer) instead of duplicating the constants to ensure they won't
be wrong.

The real change is to pass (bufsz - 1) to PyOS_ascii_formatd and 1
to strncat.  strncat copies n+1 bytes from src (not dest).

Reported by Klocwork #58.
2006-07-16 02:22:30 +00:00
Neal Norwitz ef02b9e144 a & b were dereffed above, so they are known to be valid pointers.
z is known to be NULL, nothing to DECREF.

Reported by Klockwork, #107.
2006-07-16 02:00:32 +00:00
Neal Norwitz 7e49c6eee8 Fix uninitialized memory read reported by Valgrind when running doctest.
This could happen if size == 0.
2006-07-12 05:27:46 +00:00
Kristján Valur Jónsson 74c3ea0a0f Fix build problems with the platform SDK on windows. It is not sufficient to test for the C compiler version when determining if we have the secure CRT from microsoft. Must test with an undocumented macro, __STDC_SECURE_LIB__ too. 2006-07-03 14:59:05 +00:00
Martin v. Löwis d5cfa5491a Put method-wrappers into trashcan. Fixes #927248. 2006-07-03 13:47:40 +00:00
Neal Norwitz 0f415dc57f Another problem reported by Coverity. Backport candidate. 2006-06-30 07:32:46 +00:00
Neal Norwitz b114984225 Fix refleak 2006-06-23 03:32:44 +00:00
Armin Rigo 53c1692f6a Fix for an obscure bug introduced by revs 46806 and 46808, with a test.
The problem of checking too eagerly for recursive calls is the
following: if a RuntimeError is caused by recursion, and if code needs
to normalize it immediately (as in the 2nd test), then
PyErr_NormalizeException() needs a call to the RuntimeError class to
instantiate it, and this hits the recursion limit again...  causing
PyErr_NormalizeException() to never finish.

Moved this particular recursion check to slot_tp_call(), which is not
involved in instantiating built-in exceptions.

Backport candidate.
2006-06-21 21:58:50 +00:00
Neal Norwitz 0f2783cb4c Use Py_ssize_t 2006-06-19 05:40:44 +00:00
Georg Brandl ccff785258 Patch #1507676: improve exception messages in abstract.c, object.c and typeobject.c. 2006-06-18 22:17:29 +00:00
Martin v. Löwis d825143be1 Patch #1455898: Incremental mode for "mbcs" codec. 2006-06-14 05:21:04 +00:00
Brett Cannon ea3912b0da If a classic class defined a __coerce__() method that just returned its two
arguments in reverse, the interpreter would infinitely recourse trying to get a
coercion that worked.  So put in a recursion check after a coercion is made and
the next call to attempt to use the coerced values.

Fixes bug #992017 and closes crashers/coerce.py .
2006-06-13 21:46:41 +00:00
Neal Norwitz de4c78a1d7 Initialize the type object so pychecker can't crash the interpreter. 2006-06-13 08:28:19 +00:00
Kristján Valur Jónsson f608317061 Fix the CRT argument error handling for VisualStudio .NET 2005. Install a CRT error handler and disable the assertion for debug builds. This causes CRT to set errno to EINVAL.
This update fixes crash cases in the test suite where the default CRT error handler would cause process exit.
2006-06-12 15:45:12 +00:00
Neal Norwitz b9845e72f9 Get rid of f_restricted too. Doc the other 4 ints that were already removed
at the NeedForSpeed sprint.
2006-06-12 02:11:18 +00:00
Neal Norwitz a00c0b97bf Don't leak the list object if there's an error allocating the item storage. Backport candidate 2006-06-12 02:08:41 +00:00
Neal Norwitz 7d5b6e8991 f_code can't be NULL based on Frame_New and other code that derefs it.
So there doesn't seem to be much point to checking here.
2006-06-11 05:48:14 +00:00
Neal Norwitz 8e6675a7dc Update doc to make it agree with code.
Bottom factor out some common code.
2006-06-11 05:47:14 +00:00
Skip Montanaro 9a8ae8f46b Suppress warning on MacOSX about possible use before set of proc. 2006-06-10 22:38:13 +00:00
Martin v. Löwis 0e8bd7e1cc Patch #1495999: Part two of Windows CE changes.
- update header checks, using autoconf
- provide dummies for getenv, environ, and GetVersion
- adjust MSC_VER check in socketmodule.c
2006-06-10 12:23:46 +00:00
Armin Rigo acd0d6d416 SF bug #1503294.
PyThreadState_GET() complains if the tstate is NULL, but only in debug mode.
2006-06-10 10:57:40 +00:00
Georg Brandl 90e27d38f5 Apply perky's fix for #1503157: "/".join([u"", u""]) raising OverflowError.
Also improve error message on overflow.
2006-06-10 06:40:50 +00:00
Brett Cannon 6946ea0be0 Fix bug introduced in rev. 46806 by not having variable declaration at the top of a block. 2006-06-09 22:45:54 +00:00
Brett Cannon 22565aac3b An object with __call__ as an attribute, when called, will have that attribute checked for __call__ itself, and will continue to look until it finds an object without the attribute. This can lead to an infinite recursion.
Closes bug #532646, again.  Will be backported.
2006-06-09 22:31:23 +00:00
Georg Brandl 242508160e RFE #1491485: str/unicode.endswith()/startswith() now accept a tuple as first argument. 2006-06-09 18:45:48 +00:00
Brett Cannon c48b0e6657 Fix inconsistency in naming within an enum. 2006-06-09 17:05:48 +00:00
Brett Cannon de3b052216 Buffer objects would return the read or write buffer for a wrapped object when
the char buffer was requested.  Now it actually returns the char buffer if
available or raises a TypeError if it isn't (as is raised for the other buffer
types if they are not present but requested).

Not a backport candidate since it does change semantics of the buffer object
(although it could be argued this is enough of a bug to bother backporting).
2006-06-08 17:00:45 +00:00
Georg Brandl 98b40ad590 Bug #1502805: don't alias file.__exit__ to file.close since the
latter can return something that's true.
2006-06-08 14:50:21 +00:00
Armin Rigo fd01d7933b (arre, arigo) SF bug #1350060
Give a consistent behavior for comparison and hashing of method objects
(both user- and built-in methods).  Now compares the 'self' recursively.
The hash was already asking for the hash of 'self'.
2006-06-08 10:56:24 +00:00
Brett Cannon ea229bd1ed Fix coding style guide bug. 2006-06-06 18:08:16 +00:00
Georg Brandl 9f16760666 Repair refleaks in unicodeobject. 2006-06-04 21:46:16 +00:00
Martin v. Löwis 3f767795f6 Patch #1359618: Speed-up charmap encoder. 2006-06-04 19:36:28 +00:00
Neal Norwitz 7a071939d9 SF #1499797, Fix for memory leak in WindowsError_str 2006-06-04 06:19:31 +00:00
Tim Peters 3eeb17346c _PyObject_DebugMalloc(): The return value should add
2*sizeof(size_t) now, not 8.  This probably accounts for
current disasters on the 64-bit buildbot slaves.
2006-06-04 03:38:04 +00:00
Tim Peters 9ea89d2a19 In a PYMALLOC_DEBUG build obmalloc adds extra debugging info
to each allocated block.  This was using 4 bytes for each such
piece of info regardless of platform.  This didn't really matter
before (proof: no bug reports, and the debug-build obmalloc would
have assert-failed if it was ever asked for a chunk of memory
>= 2**32 bytes), since container indices were plain ints.  But after
the Py_ssize_t changes, it's at least theoretically possible to
allocate a list or string whose guts exceed 2**32 bytes, and the
PYMALLOC_DEBUG routines would fail then (having only 4 bytes
to record the originally requested size).

Now we use sizeof(size_t) bytes for each of a PYMALLOC_DEBUG
build's extra debugging fields.  This won't make any difference
on 32-bit boxes, but will add 16 bytes to each allocation in
a debug build on a 64-bit box.
2006-06-04 03:26:02 +00:00
Neal Norwitz 38d4d4a35b Fix memory leak found by valgrind. 2006-06-02 04:50:49 +00:00
Tim Peters d770ebd286 Armin committed his patch while I was reviewing it (I'm sure
he didn't know this), so merged in some changes I made during
review.  Nothing material apart from changing a new `mask` local
from int to Py_ssize_t.  Mostly this is repairing comments that
were made incorrect, and adding new comments.  Also a few
minor code rewrites for clarity or helpful succinctness.
2006-06-01 15:50:44 +00:00
Armin Rigo 35f6d36951 [ 1497053 ] Let dicts propagate the exceptions in user __eq__().
[ 1456209 ] dictresize() vulnerability ( <- backport candidate ).
2006-06-01 13:19:12 +00:00
Georg Brandl 6b50c63a23 Correctly allocate complex types with tp_alloc. (bug #1498638) 2006-06-01 08:27:32 +00:00
Georg Brandl 85ac850834 Correctly unpickle 2.4 exceptions via __setstate__ (patch #1498571) 2006-06-01 06:39:19 +00:00
Neal Norwitz b16e4e7860 Remove ; at end of macro. There was a compiler recently that warned
about extra semi-colons.  It may have been the HP C compiler.
This file will trigger a bunch of those warnings now.
2006-06-01 05:32:49 +00:00
Fredrik Lundh 9e9ef9fa5a changed count to return 0 for slices outside the source string 2006-05-30 17:39:58 +00:00
Fredrik Lundh 93eff6fecd changed find/rfind to return -1 for matches outside the source string 2006-05-30 17:11:48 +00:00
Tim Peters 9faa3eda6b PyLong_FromString(): Continued fraction analysis (explained in
a new comment) suggests there are almost certainly large input
integers in all non-binary input bases for which one Python digit
too few is initally allocated to hold the final result.  Instead
of assert-failing when that happens, allocate more space.  Alas,
I estimate it would take a few days to find a specific such case,
so this isn't backed up by a new test (not to mention that such
a case may take hours to run, since conversion time is quadratic
in the number of digits, and preliminary attempts suggested that
the smallest such inputs contain at least a million digits).
2006-05-30 15:53:34 +00:00
Georg Brandl b0432bc032 Do the check for no keyword arguments in __init__ so that
subclasses of Exception can be supplied keyword args
2006-05-30 08:17:00 +00:00
Georg Brandl 861089fc49 Disallow keyword args for exceptions. 2006-05-30 07:34:45 +00:00
Georg Brandl 05f97bffac Add a test case for exception pickling. args is never NULL. 2006-05-30 07:13:29 +00:00
Georg Brandl ddba473e26 Restore exception pickle support. #1497319. 2006-05-30 07:04:55 +00:00
Tim Peters 33f4a6a31a dict_print(): So that Neal & I don't spend the rest of
our lives taking turns rewriting code that works ;-),
get rid of casting illusions by declaring a new variable
with the obvious type.
2006-05-30 05:23:59 +00:00
Tim Peters 638144305c dict_print(): Explicitly narrow the return value
from a (possibly) wider variable.
2006-05-30 05:04:59 +00:00
Neal Norwitz 5e1b45dc21 No DOWNCAST is required since sizeof(Py_ssize_t) >= sizeof(int) and Py_ReprEntr returns an int 2006-05-30 04:43:23 +00:00
Neal Norwitz d3881b026c Use Py_SAFE_DOWNCAST for safety. Fix format strings. Remove 2 more stray | in comment 2006-05-30 04:25:05 +00:00
Neal Norwitz 80af59cbd4 Remove stray | in comment 2006-05-30 04:19:21 +00:00
Tim Peters 9b10f7e0cb Convert relevant dict internals to Py_ssize_t.
I don't have a box with nearly enough RAM, or an OS,
that could get close to tickling this, though (requires
a dict w/ at least 2**31 entries).
2006-05-30 04:16:25 +00:00
Fredrik Lundh b51b470eb8 fixed "abc".count("", 100) == -96 error (hopefully, nobody's relying on
the current behaviour ;-)
2006-05-29 22:42:07 +00:00
Georg Brandl 96a8c3954c Make use of METH_O and METH_NOARGS where possible.
Use Py_UnpackTuple instead of PyArg_ParseTuple where possible.
2006-05-29 21:04:52 +00:00
Georg Brandl 2cfaa34dfa Correct some value converting strangenesses. 2006-05-29 19:39:45 +00:00
Georg Brandl c7c51147c7 Fix refleak in socketmodule. Replace bogus Py_BuildValue calls.
Fix refleak in exceptions.
2006-05-29 09:46:51 +00:00
Thomas Wouters c1282eef0c Make last patch valid C89 so Windows compilers can deal with it. 2006-05-28 21:32:12 +00:00
Michael W. Hudson 27596279a2 use the UnicodeError traversal and clearing functions in UnicodeError
subclasses.
2006-05-28 21:19:03 +00:00
Georg Brandl 43ab100cdc Fix refleaks in UnicodeError get and set methods. 2006-05-28 20:57:09 +00:00
Michael W. Hudson 96495ee6dd Quality control, meet exceptions.c, round two.
Make some functions that should have been static static.

Fix a bunch of refleaks by fixing the definition of
MiddlingExtendsException.

Remove all the __new__ implementations apart from
BaseException_new.  Rewrite most code that needs it to cope with
NULL fields (such code could get excercised anyway, the
__new__-removal just makes it more likely).  This involved
editing the code for WindowsError, which I can't test.

This fixes all the refleaks in at least the start of a regrtest
-R :: run.
2006-05-28 17:40:29 +00:00
Michael W. Hudson 22a80e7cb0 Quality control, meet exceptions.c.
Fix a number of problems with the need for speed code:

One is doing this sort of thing:

    Py_DECREF(self->field);
    self->field = newval;
    Py_INCREF(self->field);

without being very sure that self->field doesn't start with a
value that has a __del__, because that almost certainly can lead
to segfaults.

As self->args is constrained to be an exact tuple we may as well
exploit this fact consistently.  This leads to quite a lot of
simplification (and, hey, probably better performance).

Add some error checking in places lacking it.

Fix some rather strange indentation in the Unicode code.

Delete some trailing whitespace.

More to come, I haven't fixed all the reference leaks yet...
2006-05-28 15:51:40 +00:00
Fredrik Lundh 80f8e80c15 needforspeed: added Py_MEMCPY macro (currently tuned for Visual C only),
and use it for string copy operations.  this gives a 20% speedup on some
string benchmarks.
2006-05-28 12:06:46 +00:00
Richard Jones 2d555b356a move semicolons 2006-05-27 16:15:11 +00:00
Richard Jones c5b2a2e7b9 doc string additions and tweaks 2006-05-27 16:07:28 +00:00
Fredrik Lundh 0b7ef46950 needforspeed: stringlib refactoring: use find_slice for stringobject 2006-05-27 15:26:19 +00:00
Fredrik Lundh 60d8b18831 needforspeed: stringlib refactoring: changed find_obj to find_slice,
to enable use from stringobject
2006-05-27 15:20:22 +00:00
Fredrik Lundh c2d29c5a6d needforspeed: replace improvements, changed to Py_LOCAL_INLINE
where appropriate
2006-05-27 14:58:20 +00:00
Georg Brandl 94b8c122fd Remove spurious semicolons after macro invocations. 2006-05-27 14:41:55 +00:00
Andrew Dalke d49d5c49ba cleanup - removed trailing whitespace 2006-05-27 14:16:40 +00:00
Richard Jones 7b9558d37d Conversion of exceptions over from faked-up classes to new-style C types. 2006-05-27 12:29:24 +00:00
Martin v. Löwis 2e3f6b77d5 Revert bogus change committed in 46432 to this file. 2006-05-27 11:07:49 +00:00
Andrew Dalke e0df762719 fixed typo 2006-05-27 11:04:36 +00:00
Fredrik Lundh 2d23d5bf2e needforspeed: more stringlib refactoring 2006-05-27 10:05:10 +00:00
Martin v. Löwis d004fc810a Patch 1494554: Update numeric properties to Unicode 4.1. 2006-05-27 08:36:52 +00:00
Neal Norwitz d1b6cd7bfb Fix Coverity warnings.
- Check the correct variable (str_obj, not str) for NULL
 - sep_len was already verified it wasn't 0
2006-05-27 05:21:30 +00:00
Andrew Dalke 7e0a62ea90 Added description of why splitlines doesn't use the prealloc strategy 2006-05-26 22:49:03 +00:00
Andrew Dalke 5132407868 Added limits to the replace code so it does not count all of the matching
patterns in a string, only the number needed by the max limit.
2006-05-26 20:25:22 +00:00
Georg Brandl e4e023c4d3 Simplify calling. 2006-05-26 20:22:50 +00:00
Andrew M. Kuchling 07bbfc6a51 Comment typo 2006-05-26 19:51:10 +00:00
Fredrik Lundh e6e43c867d needforspeed: stringlib refactoring: use stringlib/find for string find 2006-05-26 19:48:07 +00:00
Fredrik Lundh c816281304 needforspeed: use a macro to fix slice indexes 2006-05-26 19:33:03 +00:00
Fredrik Lundh ce4eccb0c4 needforspeed: stringlib refactoring: use stringlib/find for unicode
find
2006-05-26 19:29:05 +00:00
Fredrik Lundh 58b5e84d52 needforspeed: stringlib refactoring, continued. added count and
find helpers; updated unicodeobject to use stringlib_count
2006-05-26 19:24:53 +00:00
Andrew Dalke c5da53ba78 substring split now uses /F's fast string matching algorithm.
(If compiled without FAST search support, changed the pre-memcmp test
   to check the last character as well as the first.  This gave a 25%
   speedup for my test case.)

Rewrote the split algorithms so they stop when maxsplit gets to 0.
Previously they did a string match first then checked if the maxsplit
was reached.  The new way prevents a needless string search.
2006-05-26 19:02:09 +00:00
Fredrik Lundh 9c0e9c089c needspeed: rpartition documentation, tests, and a bug fixes.
feel free to add more tests and improve the documentation.
2006-05-26 18:24:15 +00:00
Fredrik Lundh b3167cbcd7 needforspeed: added rpartition implementation 2006-05-26 18:15:38 +00:00
Fredrik Lundh be9f219e40 removed unnecessary include 2006-05-26 18:05:34 +00:00
Fredrik Lundh 3a65d87e8c needforspeed: remove remaining USE_FAST macros; if fastsearch was
broken, someone would have noticed by now ;-)
2006-05-26 17:31:41 +00:00
Fredrik Lundh c2032fb86a needforspeed: cleanup 2006-05-26 17:26:39 +00:00
Fredrik Lundh b947948c61 needforspeed: stringlib refactoring (in progress) 2006-05-26 17:22:38 +00:00
Fredrik Lundh a50d201bd9 needforspeed: stringlib refactoring (in progress) 2006-05-26 17:04:58 +00:00
Fredrik Lundh 7c940d1d68 needforspeed: use Py_LOCAL on a few more locals in stringobject.c 2006-05-26 16:32:42 +00:00
Andrew Dalke 02758d66ce Eeked out another 3% or so performance in split whitespace by cleaning up the algorithm. 2006-05-26 15:21:01 +00:00
Andrew Dalke 525eab3712 Changes to string.split/rsplit on whitespace to preallocate space in the
results list.

Originally it allocated 0 items and used the list growth during append.  Now
it preallocates 12 items so the first few appends don't need list reallocs.

("Here are some words ."*2).split(None, 1) is 7% faster
("Here are some words ."*2).split() is is 15% faster

  (Your milage may vary, see dealership for details.)

File parsing like this

    for line in f:
        count += len(line.split())

is also about 15% faster.  There is a slowdown of about 3% for large
strings because of the additional overhead of checking if the append is
to a preallocated region of the list or not.  This will be the rare case.
It could be improved with special case code but we decided it was not
useful enough.

There is a cost of 12*sizeof(PyObject *) bytes per list.  For the normal
case of file parsing this is not a problem because of the lists have
a short lifetime.  We have not come up with cases where this is a problem
in real life.

I chose 12 because human text averages about 11 words per line in books,
one of my data sets averages 6.2 words with a final peak at 11 words per
line, and I work with a tab delimited data set with 8 tabs per line (or
9 words per line).  12 encompasses all of these.

Also changed the last rstrip code to append then reverse, rather than
doing insert(0).  The strip() and rstrip() times are now comparable.
2006-05-26 14:00:45 +00:00
Fredrik Lundh 95e2a91615 use Py_LOCAL also for string and unicode objects 2006-05-26 11:38:15 +00:00
Fredrik Lundh f2c0dfdb13 needforspeed: use Py_ssize_t for the fastsearch counter and skip
length (thanks, neal!).  and yes, I've verified that this doesn't
slow things down ;-)
2006-05-26 10:27:17 +00:00
Fredrik Lundh 450277fef5 needforspeed: use METH_O for argument handling, which made partition some
~15% faster for the current tests (which is noticable faster than a corre-
sponding find call).  thanks to neal-who-never-sleeps for the tip.
2006-05-26 09:46:59 +00:00
Fredrik Lundh 06a69dd8ff needforspeed: partition implementation, part two.
feel free to improve the documentation and the docstrings.
2006-05-26 08:54:28 +00:00