Commit Graph

2536 Commits

Author SHA1 Message Date
Andrew Dalke 8c9091074b Fixed problem identified by Georg. The special-case in-place code for replace
made a copy of the string using PyString_FromStringAndSize(s, n) and modify
the copied string in-place.  However, 1 (and 0) character strings are shared
from a cache.  This cause "A".replace("A", "a") to change the cached version
of "A" -- used by everyone.

Now may the copy with NULL as the string and do the memcpy manually.  I've
added regression tests to check if this happens in the future.  Perhaps
there should be a PyString_Copy for this case?
2006-05-25 17:53:00 +00:00
Tim Peters da53afa1b0 A new table to help string->integer conversion was added yesterday to
both mystrtoul.c and longobject.c.  Share the table instead.  Also
cut its size by 64 entries (they had been used for an inscrutable
trick originally, but the code no longer tries to use that trick).
2006-05-25 17:34:03 +00:00
Fredrik Lundh e68955cf32 needforspeed: new replace implementation by Andrew Dalke. replace is
now about 3x faster on my machine, for the replace tests from string-
bench.
2006-05-25 17:08:14 +00:00
Fredrik Lundh 0c71f88fc9 needforspeed: check for overflow in replace (from Andrew Dalke) 2006-05-25 16:46:54 +00:00
Fredrik Lundh dfe503d3f0 needforspeed: _toupper/_tolower is a SUSv2 thing; fall back on ISO C
versions if they're not defined.
2006-05-25 16:10:12 +00:00
Kristján Valur Jónsson f94323fbb4 Added a new macro, Py_IS_FINITE(X). On windows there is an intrinsic for this and it is more efficient than to use !Py_IS_INFINITE(X) && !Py_IS_NAN(X). No change on other platforms 2006-05-25 15:53:30 +00:00
Fredrik Lundh 4b4e33ef14 needforspeed: make new upper/lower work properly for single-character
strings too... (thanks to georg brandl for spotting the exact problem
faster than anyone else)
2006-05-25 15:49:45 +00:00
Fredrik Lundh 39ccef607e needforspeed: speed up upper and lower for 8-bit string objects.
(the unicode versions of these are still 2x faster on windows,
though...)

based on work by Andrew Dalke, with tweaks by yours truly.
2006-05-25 15:22:03 +00:00
Tim Peters 696cf43b58 Heavily fiddled variant of patch #1442927: PyLong_FromString optimization.
``long(str, base)`` is now up to 6x faster for non-power-of-2 bases.  The
largest speedup is for inputs with about 1000 decimal digits.  Conversion
from non-power-of-2 bases remains quadratic-time in the number of input
digits (it was and remains linear-time for bases 2, 4, 8, 16 and 32).

Speedups at various lengths for decimal inputs, comparing 2.4.3 with
current trunk.  Note that it's actually a bit slower for 1-digit strings:

  len  speedup
 ----  -------
   1     -4.5%
   2      4.6%
   3      8.3%
   4     12.7%
   5     16.9%
   6     28.6%
   7     35.5%
   8     44.3%
   9     46.6%
  10     55.3%
  11     65.7%
  12     77.7%
  13     73.4%
  14     75.3%
  15     85.2%
  16    103.0%
  17     95.1%
  18    112.8%
  19    117.9%
  20    128.3%
  30    174.5%
  40    209.3%
  50    236.3%
  60    254.3%
  70    262.9%
  80    295.8%
  90    297.3%
 100    324.5%
 200    374.6%
 300    403.1%
 400    391.1%
 500    388.7%
 600    440.6%
 700    468.7%
 800    498.0%
 900    507.2%
1000    501.2%
2000    450.2%
3000    463.2%
4000    452.5%
5000    440.6%
6000    439.6%
7000    424.8%
8000    418.1%
9000    417.7%
2006-05-24 21:10:40 +00:00
Fredrik Lundh 347ee277aa needforspeed: refactored the replace code slightly; special-case
constant-length changes; use fastsearch to locate the first match.
2006-05-24 16:35:18 +00:00
Fredrik Lundh d5e0dc51cf needforspeedindeed: use fastsearch also for __contains__ 2006-05-24 15:11:01 +00:00
Fredrik Lundh 6471ee4f18 needforspeed: use "fastsearch" for count and findstring helpers. this
results in a 2.5x speedup on the stringbench count tests, and a 20x (!)
speedup on the stringbench search/find/contains test, compared to 2.5a2.

for more on the algorithm, see:

    http://effbot.org/zone/stringlib.htm

if you get weird results, you can disable the new algoritm by undefining
USE_FAST in Objects/unicodeobject.c.

enjoy /F
2006-05-24 14:28:11 +00:00
Fredrik Lundh 240bf2a8e4 use Py_ssize_t for string indexes (thanks, neal!) 2006-05-24 10:20:36 +00:00
Fredrik Lundh 7763351808 return 0 on misses, not -1. 2006-05-23 19:47:35 +00:00
Fredrik Lundh b63588c188 needforspeed: use append+reverse for rsplit, use "bloom filters" to
speed up splitlines and strip with charsets; etc.  rsplit is now as
fast as split in all our tests (reverse takes no time at all), and
splitlines() is nearly as fast as a plain split("\n") in our tests.
and we're not done yet... ;-)
2006-05-23 18:44:25 +00:00
Richard Jones a372711fcc fix broken merge 2006-05-23 18:32:11 +00:00
Richard Jones cebbefc98d Applied patch 1337051 by Neal Norwitz, saving 4 ints on frame objects. 2006-05-23 18:28:17 +00:00
Richard Jones 7c88dcc5ab Merge from rjones-funccall branch.
Applied patch zombie-frames-2.diff from sf patch 876206 with updates for
Python 2.5 and also modified to retain the free_list to avoid the 67%
slow-down in pybench recursion test. 5% speed up in function call pybench.
2006-05-23 10:37:38 +00:00
Fredrik Lundh 833bf9422e needforspeed: fixed unicode "in" operator to use same implementation
approach as find/index
2006-05-23 10:12:21 +00:00
Tim Peters 1bacc641a0 unicode_repeat(): Change type of local to Py_ssize_t,
since that's what it should be.
2006-05-23 05:47:16 +00:00
Tim Peters 286085c781 PyUnicode_Join(): Recent code changes introduced new
compiler warnings on Windows (signed vs unsigned mismatch
in comparisons).  Cleaned that up by switching more locals
to Py_ssize_t.  Simplified overflow checking (it can _be_
simpler because while these things are declared as
Py_ssize_t, then should in fact never be negative).
2006-05-22 19:17:04 +00:00
Fredrik Lundh 8a8e05a2b9 needforspeed: use memcpy for "long" strings; use a better algorithm
for long repeats.
2006-05-22 17:12:58 +00:00
Fredrik Lundh f1d60a5384 needforspeed: speed up unicode repeat, unicode string copy 2006-05-22 16:29:30 +00:00
Fredrik Lundh 763b50f9d9 docstring tweaks: count counts non-overlapping substrings, not
total number of occurences
2006-05-22 15:35:12 +00:00
Georg Brandl 7b90e168f3 Bug #1462152: file() now checks more thoroughly for invalid mode
strings and removes a possible "U" before passing the mode to the
C library function.
2006-05-18 07:01:27 +00:00
Neal Norwitz 1004a5339a Patch #1488312, Fix memory alignment problem on SPARC in unicode. Will backport 2006-05-15 07:17:23 +00:00
Tim Peters 8931ff1f67 Teach PyString_FromFormat, PyErr_Format, and PyString_FromFormatV
about "%u", "%lu" and "%zu" formats.

Since PyString_FromFormat and PyErr_Format have exactly the same rules
(both inherited from PyString_FromFormatV), it would be good if someone
with more LaTeX Fu changed one of them to just point to the other.
Their docs were way out of synch before this patch, and I just did a
mass copy+paste to repair that.

Not a backport candidate (this is a new feature).
2006-05-13 23:28:20 +00:00
Martin v. Löwis 822f34a848 Revert 43315: Printing of %zd must be signed. 2006-05-13 13:34:04 +00:00
Neal Norwitz c6a989ac3a Fix problems found by Coverity.
longobject.c: also fix an ssize_t problem
  <a> could have been NULL, so hoist the size calc to not use <a>.

_ssl.c: under fail: self is DECREF'd, but it would have been NULL.

_elementtree.c: delete self if there was an error.

_csv.c: I'm not sure if lineterminator could have been anything other than
a string.  However, other string method calls are checked, so check this
one too.
2006-05-10 06:57:58 +00:00
Guido van Rossum da5b701aee Get rid of __context__, per the latest changes to PEP 343 and python-dev
discussion.
There are two places of documentation that still mention __context__:
Doc/lib/libstdtypes.tex -- I wasn't quite sure how to rewrite that without
spending a whole lot of time thinking about it; and whatsnew, which Andrew
usually likes to change himself.
2006-05-02 19:47:52 +00:00
Neal Norwitz c4edb0ec81 SF #1479181: split open() and file() from being aliases for each other. 2006-05-02 04:43:14 +00:00
Martin v. Löwis 6685128b97 Fix more ssize_t issues. 2006-04-22 11:40:03 +00:00
Thomas Wouters 568f1d0eed Py_ssize_t issue; repr()'ing a very large string would result in a teensy
string, because of a cast to int.
2006-04-21 13:54:43 +00:00
Thomas Wouters 4e908107b0 Fix variable/format-char discrepancy in new-style class __getitem__,
__delitem__, __setslice__ and __delslice__ hooks. This caused test_weakref
and test_userlist to fail in the p3yk branch (where UserList, like all
classes, is new-style) on amd64 systems, with open-ended slices: the
sys.maxint value for empty-endpoint was transformed into -1.
2006-04-21 11:26:56 +00:00
Thomas Wouters dc5f808cbc Make s.replace() work with explicit counts exceeding 2Gb. 2006-04-19 15:38:01 +00:00
Thomas Wouters 4abb3660ca Use Py_ssize_t to hold the 'width' argument to the ljust, rjust, center and
zfill stringmethods, so they can create strings larger than 2Gb on 64bit
systems (even win64.) The unicode versions of these methods already did this
right.
2006-04-19 14:50:15 +00:00
Jeremy Hylton a4ebc135ac Refactor: Move code that uses co_lnotab from ceval to codeobject 2006-04-18 14:47:00 +00:00
Andrew M. Kuchling 2060d1bd27 Comment typo fix 2006-04-18 11:49:53 +00:00
Martin v. Löwis 45294a9562 Remove types from type_list if they have no objects
and unlist_types_without_objects is set.
Give dump_counts a FILE* argument.
2006-04-18 06:24:08 +00:00
Skip Montanaro 429433b30b C++ compiler cleanup: bunch-o-casts, plus use of unsigned loop index var in a couple places 2006-04-18 00:35:43 +00:00
Skip Montanaro 54e964d253 C++ compilation cleanup: Migrate declaration of
_PyObject_Call(Function|Method)_SizeT into Include/abstract.h.  This gets
them under the umbrella of the extern "C" { ... } block in that file.
2006-04-18 00:27:46 +00:00
Neal Norwitz 0e2cbabb8d No need to cast a Py_ssize_t, use %z in PyErr_Format 2006-04-17 05:56:32 +00:00
Thomas Wouters 715a4cdea2 Use %zd instead of %i as format character (in call to PyErr_Format) for
Py_ssize_t argument.
2006-04-16 22:04:49 +00:00
Tim Peters 81b092d0e6 gen_del(): Looks like much this was copy/pasted from
slot_tp_del(), but while the latter had to cater to types
that don't participate in GC, we know that generators do.
That allows strengthing an assert().
2006-04-15 22:59:10 +00:00
Tim Peters ffe2395777 Remove now-unused variables from tp_traverse and tp_clear methods. 2006-04-15 22:51:26 +00:00
Thomas Wouters c6e55068ca Use Py_VISIT in all tp_traverse methods, instead of traversing manually or
using a custom, nearly-identical macro. This probably changes how some of
these functions are compiled, which may result in fractionally slower (or
faster) execution. Considering the nature of traversal, visiting much of the
address space in unpredictable patterns, I'd argue the code readability and
maintainability is well worth it ;P
2006-04-15 21:47:09 +00:00
Thomas Wouters 447d095976 - Whitespace normalization
- In functions where we already hold the same object in differently typed
   pointers, use the correctly typed pointer instead of casting the other
   pointer a second time.
2006-04-15 21:41:56 +00:00
Thomas Wouters edf17d8798 Use Py_CLEAR instead of in-place DECREF/XDECREF or custom macros, for
tp_clear methods.
2006-04-15 17:28:34 +00:00
Martin v. Löwis ed8f783126 Clear dummy and emptyfrozenset, so that we don't have
dangling references in case of a Py_Initialize/Py_Finalize
cycle.
2006-04-15 12:47:23 +00:00
Martin v. Löwis c597d1b446 Unlink the structseq type from the global list of
objects before initializing it. It might be linked
already if there was a Py_Initialize/Py_Finalize
cycle earlier; not unlinking it would break the global
list.
2006-04-15 12:45:05 +00:00
Tim Peters adcd25e7fa frame_clear(): Explain why it's important to make the frame
look dead right at the start.  Use Py_CLEAR for four more
frame members.
2006-04-15 03:30:08 +00:00
Tim Peters de2acf6512 frame_traverse(): Use the standard Py_VISIT macro.
Py_VISIT:  cast the `op` argument to PyObject* when calling
`visit()`.  Else the caller has to pay too much attention to
this silly detail (e.g., frame_traverse needs to traverse
`struct _frame *` and `PyCodeObject *` pointers too).
2006-04-15 03:22:46 +00:00
Tim Peters a13131cf7f Trimmed trailing whitespace. 2006-04-15 03:15:24 +00:00
Phillip J. Eby 8ebb28df3a Fix SF#1470508: crash in generator cycle finalization. There were two
problems: first, PyGen_NeedsFinalizing() had an off-by-one bug that
prevented it from ever saying a generator didn't need finalizing, and
second, frame objects cleared themselves in a way that caused their
owning generator to think they were still executable, causing a double
deallocation of objects on the value stack if there was still a loop
on the block stack.  This revision also removes some unnecessary
close() operations from test_generators that are now appropriately
handled by the cycle collector.
2006-04-15 01:02:17 +00:00
Martin v. Löwis 5cb6936672 Make Py_BuildValue, PyObject_CallFunction and
PyObject_CallMethod aware of PY_SSIZE_T_CLEAN.
2006-04-14 09:08:42 +00:00
Martin v. Löwis 83687c98dc Change more occurrences of maxsplit to Py_ssize_t. 2006-04-13 08:52:56 +00:00
Martin v. Löwis 9c83076b7b Change maxsplit types to Py_ssize_t. 2006-04-13 08:37:17 +00:00
Martin v. Löwis b1ed7fac12 Replace INT_MAX with PY_SSIZE_T_MAX. 2006-04-13 07:52:27 +00:00
Martin v. Löwis 2a19074a9c Replace INT_MAX with PY_SSIZE_T_MAX where string length
are concerned.
2006-04-13 07:37:25 +00:00
Martin v. Löwis f15da6995b Remove another INT_MAX limitation 2006-04-13 07:24:50 +00:00
Martin v. Löwis 8ce358f5fe Replace most INT_MAX with PY_SSIZE_T_MAX. 2006-04-13 07:22:51 +00:00
Martin v. Löwis 412fb67368 Change more ints to Py_ssize_t. 2006-04-13 06:34:32 +00:00
Martin v. Löwis 80d2e591d5 Revert 34153: Py_UNICODE should not be signed. 2006-04-13 06:06:08 +00:00
Anthony Baxter ac6bd46d5c spread the extern "C" { } magic pixie dust around. Python itself builds now
using a C++ compiler. Still lots and lots of errors in the modules built by
setup.py, and a bunch of warnings from g++ in the core.
2006-04-13 02:06:09 +00:00
Anthony Baxter 3109d62da6 Add a cast to make code compile with a C++ compiler. 2006-04-13 01:07:27 +00:00
Phillip J. Eby 8920bf24f8 Don't set gi_frame to Py_None, use NULL instead, eliminating some insane
pointer dereferences.
2006-04-12 19:07:15 +00:00
Armin Rigo e170937af6 Ignore the references to the dummy objects used as deleted keys
in dicts and sets when computing the total number of references.
2006-04-12 17:06:05 +00:00
Neal Norwitz 017749c33d wrap docstrings so they are less than 80 columns. add spaces after commas. 2006-04-12 06:56:56 +00:00
Tim Peters a5a80cb4a4 gen_throw(): The caller doesn't own PyArg_ParseTuple()
"O" arguments, so must not decref them.  This accounts
for why running test_contextlib.test_main() in a loop
eventually tried to deallocate Py_None.
2006-04-12 06:44:36 +00:00
Thomas Wouters 9cb28bea04 Fix int() and long() to repr() their argument when formatting the exception,
to avoid confusing situations like:

>>> int("")
ValueError: invalid literal for int():
>>> int("2\n\n2")
ValueError: invalid literal for int(): 2

2

Also report the base used, to avoid:

ValueError: invalid literal for int(): 2

They now report:

>>> int("")
ValueError: invalid literal for int() with base 10: ''
>>> int("2\n\n2")
ValueError: invalid literal for int() with base 10: '2\n\n2'
>>> int("2", 2)
ValueError: invalid literal for int() with base 2: '2'

(Reporting the base could be avoided when base is 10, which is the default,
but hrm.) Another effect of these changes is that the errormessage can be
longer; before, it was cut off at about 250 characters. Now, it can be up to
four times as long, as the unrepr'ed string is cut off at 200 characters,
instead.

No tests were added or changed, since testing for exact errormsgs is (pardon
the pun) somewhat errorprone, and I consider not testing the exact text
preferable. The actually changed code is tested frequent enough in the
test_builtin test as it is (120 runs for each of ints and longs.)
2006-04-11 23:50:33 +00:00
Tim Peters cbd6f1896d _Py_PrintReferenceAddresses,_Py_PrintReferences:
interpolate PY_FORMAT_SIZE_T for refcount display
instead of casting refcounts to long.

I understand that gcc on some boxes delivers
nuisance warnings about this, but if any new ones
appear because of this they'll get fixed by magic
when the others get fixed.
2006-04-11 19:12:33 +00:00
Martin v. Löwis ee36d650bb Correct casts to char*. 2006-04-11 09:08:02 +00:00
Martin v. Löwis 72d206776d Remove "static forward" declaration. Move constructors
after the type objects.
2006-04-11 09:04:12 +00:00
Neal Norwitz 9b26122ec0 Get compiling again 2006-04-11 07:58:54 +00:00
Anthony Baxter a62862120d More low-hanging fruit. Still need to re-arrange some code (or find a better
solution) in the same way as listobject.c got changed. Hoping for a better
solution.
2006-04-11 07:42:36 +00:00
Anthony Baxter 377be11ee1 More C++-compliance. Note especially listobject.c - to get C++ to accept the
PyTypeObject structures, I had to make prototypes for the functions, and
move the structure definition ahead of the functions. I'd dearly like a better
way to do this - to change this would make for a massive set of changes to
the codebase.

There's still some warnings - this is purely to get rid of errors first.
2006-04-11 06:54:30 +00:00
Martin v. Löwis 0bc2ab9a20 Patch #837242: id() for large ptr should return a long. 2006-04-10 20:28:17 +00:00
Phillip J. Eby 2ba96610bf SF Patch #1463867: Improved generator finalization to allow generators
that are suspended outside of any try/except/finally blocks to be
garbage collected even if they are part of a cycle.  Generators that
suspend inside of an active try/except or try/finally block (including
those created by a ``with`` statement) are still not GC-able if they
are part of a cycle, however.
2006-04-10 17:51:05 +00:00
Neal Norwitz 7e957d38b7 Remove dead code (reported by HP compiler).
Can probably be backported if anyone cares.
2006-04-06 08:17:41 +00:00
Martin v. Löwis c48c8db110 Add PY_SSIZE_T_MIN, as suggested by Ralf W. Grosse-Kunstleve. 2006-04-05 18:21:17 +00:00
Thomas Wouters f4d8f39053 Make xrange more Py_ssize_t aware, by assuming a Py_ssize_t is always at
least as big as a long. I believe this to be a safe assumption that is being
made in many parts of CPython, but a check could be added.

len(xrange(sys.maxint)) works now, so fix the testsuite's odd exception for
64-bit platforms too. It also fixes 'zip(xrange(sys.maxint), it)' as a
portable-ish (if expensive) alternative to enumerate(it); since zip() now
calls len(), this was breaking on (real) 64-bit platforms. No additional
test was added for that behaviour.
2006-04-04 17:28:12 +00:00
Martin v. Löwis 54b42f185e Allow long integers in PySlice_GetIndices. 2006-04-03 11:38:08 +00:00
Neal Norwitz d08eaf4d1b Use Py_ssize_t in slices 2006-04-03 04:46:04 +00:00
Georg Brandl ed02eb6aa9 Bug #1177964: make file iterator raise MemoryError on too big files 2006-03-31 20:31:02 +00:00
Barry Warsaw 176014ffad SF patch #1458476 with modifications based on discussions in python-dev. This
adds the following API calls: PySet_Clear(), _PySet_Next(), and
_PySet_Update().  The latter two are considered non-public.  Tests and
documentation (for the public API) are included.
2006-03-30 22:45:35 +00:00
Armin Rigo 314861c568 Minor bugs in the __index__ code (PEP 357), with tests. 2006-03-30 14:04:02 +00:00
Georg Brandl ecdc0a9f46 That one was a mistake. 2006-03-30 12:19:07 +00:00
Georg Brandl 347b30042b Remove unnecessary casts in type object initializers. 2006-03-30 11:57:00 +00:00
Anthony Baxter 262c00a21e Fixed bug #1459029 - unicode reprs were double-escaped.
Backed out an old patch from 2000.
2006-03-30 10:53:17 +00:00
Armin Rigo 12bec1b985 fix a comment. 2006-03-28 19:27:56 +00:00
Raymond Hettinger 334b5b20f2 Tighten an overbroad and misleading assertion.
(Reported by Jim Jewett.)
2006-03-26 03:11:29 +00:00
Neal Norwitz 7fbd6916b6 Get rid of warnings on some platforms by using %u for a size_t. 2006-03-25 23:55:39 +00:00
Phillip J. Eby bee0712214 Support throw() of string exceptions. 2006-03-25 00:05:50 +00:00
Neal Norwitz badc086543 Stop duplicating code and handle slice indices consistently and correctly
wrt to ssize_t.
2006-03-23 06:03:08 +00:00
Tim Peters 8af92d1f6c Heh -- used the right format for a refcount, but forgot
to stop truncating it.
2006-03-23 05:41:24 +00:00
Tim Peters 4d073bb9a1 _Py_NegativeRefcount(): print the full value of ob_refcnt. 2006-03-23 05:38:33 +00:00
Neal Norwitz 29892cc386 Update function name to reflect params and stop casting to long to avoid losing data 2006-03-20 01:55:26 +00:00
Neal Norwitz 2aa9a5dfdd Use macro versions instead of function versions when we already know the type.
This will hopefully get rid of some Coverity warnings, be a hint to
developers, and be marginally faster.

Some asserts were added when the type is currently known, but depends
on values from another function.
2006-03-20 01:53:23 +00:00
Georg Brandl abd1ff8f1f Previously, Python code had no easy way to access the contents of a
cell object. Now, a ``cell_contents`` attribute has been added
(closes patch #1170323).
2006-03-18 07:59:59 +00:00
Georg Brandl 5c170fd4a9 Fix some missing checks after PyTuple_New, PyList_New, PyDict_New 2006-03-17 19:03:25 +00:00