cpython

Commit Graph

Author	SHA1	Message	Date
Fredrik Lundh	f2c0dfdb13	needforspeed: use Py_ssize_t for the fastsearch counter and skip length (thanks, neal!). and yes, I've verified that this doesn't slow things down ;-)	2006-05-26 10:27:17 +00:00
Fredrik Lundh	450277fef5	needforspeed: use METH_O for argument handling, which made partition some ~15% faster for the current tests (which is noticeably faster than a corresponding find call). thanks to neal-who-never-sleeps for the tip.	2006-05-26 09:46:59 +00:00
Fredrik Lundh	06a69dd8ff	needforspeed: partition implementation, part two. feel free to improve the documentation and the docstrings.	2006-05-26 08:54:28 +00:00
Fredrik Lundh	fe5bb7e6d9	needforspeed: partition for 8-bit strings. for some simple tests, this is on par with a corresponding find, and nearly twice as fast as split(sep, 1). full tests, a unicode version, and documentation will follow tomorrow.	2006-05-25 23:27:53 +00:00
Bob Ippolito	955b64c031	squelch gcc4 darwin/x86 compiler warnings	2006-05-25 20:52:38 +00:00
Fredrik Lundh	554da412a8	needforspeed: use insert+reverse instead of append	2006-05-25 19:19:05 +00:00
Jack Diederich	60cbb3fe49	* eliminate warning by reverting tmp_s type to 'const char*'	2006-05-25 18:47:15 +00:00
Fredrik Lundh	c3434b3834	needforspeed: use fastsearch also for find/index and contains. the related tests are now about 10x faster.	2006-05-25 18:44:29 +00:00
Andrew Dalke	598710c727	Added overflow test for adding two (very) large strings where the new string is over max Py_ssize_t. I have no way to test it on my box or any box I have access to. At least it doesn't break anything.	2006-05-25 18:18:39 +00:00
Andrew M. Kuchling	f344c94c85	Comment typo	2006-05-25 18:11:16 +00:00
Fredrik Lundh	af72237abc	needforspeed: use "fastsearch" for count. this results in a 3x speedup for the related stringbench tests.	2006-05-25 17:55:31 +00:00
Andrew Dalke	8c9091074b	Fixed problem identified by Georg. The special-case in-place code for replace made a copy of the string using PyString_FromStringAndSize(s, n) and modified the copied string in-place. However, 1 (and 0) character strings are shared from a cache. This caused "A".replace("A", "a") to change the cached version of "A" -- used by everyone. Now make the copy with NULL as the string and do the memcpy manually. I've added regression tests to check if this happens in the future. Perhaps there should be a PyString_Copy for this case?	2006-05-25 17:53:00 +00:00
Fredrik Lundh	e68955cf32	needforspeed: new replace implementation by Andrew Dalke. replace is now about 3x faster on my machine, for the replace tests from stringbench.	2006-05-25 17:08:14 +00:00
Fredrik Lundh	0c71f88fc9	needforspeed: check for overflow in replace (from Andrew Dalke)	2006-05-25 16:46:54 +00:00
Fredrik Lundh	dfe503d3f0	needforspeed: _toupper/_tolower is a SUSv2 thing; fall back on ISO C versions if they're not defined.	2006-05-25 16:10:12 +00:00
Fredrik Lundh	4b4e33ef14	needforspeed: make new upper/lower work properly for single-character strings too... (thanks to georg brandl for spotting the exact problem faster than anyone else)	2006-05-25 15:49:45 +00:00
Fredrik Lundh	39ccef607e	needforspeed: speed up upper and lower for 8-bit string objects. (the unicode versions of these are still 2x faster on windows, though...) based on work by Andrew Dalke, with tweaks by yours truly.	2006-05-25 15:22:03 +00:00
Fredrik Lundh	763b50f9d9	docstring tweaks: count counts non-overlapping substrings, not total number of occurrences	2006-05-22 15:35:12 +00:00
Tim Peters	8931ff1f67	Teach PyString_FromFormat, PyErr_Format, and PyString_FromFormatV about "%u", "%lu" and "%zu" formats. Since PyString_FromFormat and PyErr_Format have exactly the same rules (both inherited from PyString_FromFormatV), it would be good if someone with more LaTeX Fu changed one of them to just point to the other. Their docs were way out of synch before this patch, and I just did a mass copy+paste to repair that. Not a backport candidate (this is a new feature).	2006-05-13 23:28:20 +00:00
Martin v. Löwis	822f34a848	Revert 43315: Printing of %zd must be signed.	2006-05-13 13:34:04 +00:00
Thomas Wouters	568f1d0eed	Py_ssize_t issue; repr()'ing a very large string would result in a teensy string, because of a cast to int.	2006-04-21 13:54:43 +00:00
Thomas Wouters	dc5f808cbc	Make s.replace() work with explicit counts exceeding 2Gb.	2006-04-19 15:38:01 +00:00
Thomas Wouters	4abb3660ca	Use Py_ssize_t to hold the 'width' argument to the ljust, rjust, center and zfill stringmethods, so they can create strings larger than 2Gb on 64bit systems (even win64.) The unicode versions of these methods already did this right.	2006-04-19 14:50:15 +00:00
Skip Montanaro	429433b30b	C++ compiler cleanup: bunch-o-casts, plus use of unsigned loop index var in a couple places	2006-04-18 00:35:43 +00:00
Neal Norwitz	0e2cbabb8d	No need to cast a Py_ssize_t, use %z in PyErr_Format	2006-04-17 05:56:32 +00:00
Martin v. Löwis	5cb6936672	Make Py_BuildValue, PyObject_CallFunction and PyObject_CallMethod aware of PY_SSIZE_T_CLEAN.	2006-04-14 09:08:42 +00:00
Martin v. Löwis	83687c98dc	Change more occurrences of maxsplit to Py_ssize_t.	2006-04-13 08:52:56 +00:00
Martin v. Löwis	9c83076b7b	Change maxsplit types to Py_ssize_t.	2006-04-13 08:37:17 +00:00
Martin v. Löwis	8ce358f5fe	Replace most INT_MAX with PY_SSIZE_T_MAX.	2006-04-13 07:22:51 +00:00
Anthony Baxter	a62862120d	More low-hanging fruit. Still need to re-arrange some code (or find a better solution) in the same way as listobject.c got changed. Hoping for a better solution.	2006-04-11 07:42:36 +00:00
Neal Norwitz	7e957d38b7	Remove dead code (reported by HP compiler). Can probably be backported if anyone cares.	2006-04-06 08:17:41 +00:00
Georg Brandl	347b30042b	Remove unnecessary casts in type object initializers.	2006-03-30 11:57:00 +00:00
Neal Norwitz	7fbd6916b6	Get rid of warnings on some platforms by using %u for a size_t.	2006-03-25 23:55:39 +00:00
Neal Norwitz	2aa9a5dfdd	Use macro versions instead of function versions when we already know the type. This will hopefully get rid of some Coverity warnings, be a hint to developers, and be marginally faster. Some asserts were added when the type is currently known, but depends on values from another function.	2006-03-20 01:53:23 +00:00
Tim Peters	ae1d0c978d	Introduced symbol PY_FORMAT_SIZE_T. See the new comments in pyport.h. Changed PyString_FromFormatV() to use it instead of inlining its own maze of #if'ery.	2006-03-17 03:29:34 +00:00
Guido van Rossum	38fff8c4e4	Checking in the code for PEP 357. This was mostly written by Travis Oliphant. I've inspected it all; Neal Norwitz and MvL have also looked at it (in an earlier incarnation).	2006-03-07 18:50:55 +00:00
Hye-Shik Chang	4af5c8cee4	SF #1444030 : Fix several potential defects found by Coverity. (reviewed by Neal Norwitz)	2006-03-07 15:39:21 +00:00
Martin v. Löwis	725507b52e	Change int to Py_ssize_t in several places. Add (int) casts to silence compiler warnings. Raise Python exceptions for overflows.	2006-03-07 12:08:51 +00:00
Martin v. Löwis	15e62742fa	Revert backwards-incompatible const changes.	2006-02-27 16:46:16 +00:00
Thomas Wouters	977485d888	Use Py_ssize_t in helper function between Py_ssize_t-using functions.	2006-02-16 15:59:12 +00:00
Martin v. Löwis	eb079f1c25	Use Py_ssize_t for counts and sizes. Convert Py_ssize_t using PyInt_FromSsize_t	2006-02-16 14:32:27 +00:00
Martin v. Löwis	2c95cc6d72	Support %zd in PyErr_Format and PyString_FromFormat.	2006-02-16 06:54:25 +00:00
Martin v. Löwis	18e165558b	Merge ssize_t branch.	2006-02-15 17:27:45 +00:00
Jeremy Hylton	af68c874a6	Add const to several API functions that take char *. In C++, it's an error to pass a string literal to a char * function without a const_cast(). Rather than require every C++ extension module to put a cast around string literals, fix the API to state the const-ness. I focused on parts of the API where people usually pass literals: PyArg_ParseTuple() and friends, Py_BuildValue(), PyMethodDef, the type slots, etc. Predictably, there were a large set of functions that needed to be fixed as a result of these changes. The most pervasive change was to make the keyword args list passed to PyArg_ParseTupleAndKeywords() to be a const char *kwlist[]. One cast was required as a result of the changes: A type object mallocs the memory for its tp_doc slot and later frees it. PyTypeObject says that tp_doc is const char *; but if the type was created by type_new(), we know it is safe to cast to char *.	2005-12-10 18:50:16 +00:00
Michael W. Hudson	b2308bb9be	Fix bug: [ 1327110 ] wrong TypeError traceback in generator expressions by removing the code that can stomp on the users' TypeError raised by the iterable argument to ''.join() -- PySequence_Fast (now?) gives a perfectly reasonable message itself. Also, a couple of tests.	2005-10-21 11:45:01 +00:00
Neal Norwitz	95c1e5065c	SF bug #1331563: string_subscript doesn't check for failed PyMem_Malloc. Will backport	2005-10-20 04:15:52 +00:00
Georg Brandl	d45014b236	Fix PyString_Format so that the "%s" format works again when Unicode is not enabled.	2005-10-01 17:06:00 +00:00
Neil Schemenauer	ab61923637	Fix bug in last checkin (2.231). To match previous behavior, unicode subclasses should be substituted as-is and not have tp_str called on them.	2005-08-31 23:02:05 +00:00
Neil Schemenauer	cf52c07843	Change the %s format specifier for str objects so that it returns a unicode instance if the argument is not an instance of basestring and calling __str__ on the argument returns a unicode instance.	2005-08-12 17:34:58 +00:00
Raymond Hettinger	3296e696db	SF bug #1224347 : int/long unification and hex() Hex longs now print with lowercase letters like their int counterparts.	2005-06-29 23:29:56 +00:00
Raymond Hettinger	57e7447c44	* Beef-up tests for str.count(). * Speed-up str.count() by using memchr() to fly between first char matches.	2005-02-20 09:54:53 +00:00
Raymond Hettinger	7cbf1bcb3e	* Beef-up testing of str.__contains__() and str.find(). * Speed-up "x in y" where x has more than one character. The existing code made excessive calls to the expensive memcmp() function. The new code uses memchr() to rapidly find a start point for memcmp(). In addition to knowing that the first character is a match, the new code also checks that the last character is a match. This significantly reduces the incidence of false starts (saving memcmp() calls and making quadratic behavior less likely). Improves the timings on: python -m timeit -r7 -s"x='a'*1000" "'ab' in x" and python -m timeit -r7 -s"x='a'*1000" "'bc' in x". Once this code has proven itself, then string_find_internal() should refer to it rather than running its own version. Also, something similar may apply to unicode objects.	2005-02-20 04:07:08 +00:00
Michael W. Hudson	faa7648ffe	More bug #1077106 stuff, sorry -- modem induced impatience! This should go on whatever bugfix branches the other fetches up on.	2005-01-31 17:09:25 +00:00
Raymond Hettinger	561fbf138d	SF bug #1054139 : serious string hashing error in 2.4b1 _PyString_Resize() readied strings for mutation but did not invalidate the cached hash value.	2004-10-26 01:52:37 +00:00
Raymond Hettinger	674f241e9c	SF Patch #1007087 : Return new string for single subclass joins (Bug #1001011 ) (Patch contributed by Nick Coghlan.) Now joining string subtypes will always return a string. Formerly, if there were only one item, it was returned unchanged.	2004-08-23 23:23:54 +00:00
Armin Rigo	618fbf5469	This was quite a dark bug in my recent in-place string concatenation hack: it would resize interned strings in-place! This occurred because their reference counts do not have their expected value -- stringobject.c hacks them. Mea culpa.	2004-08-07 20:58:32 +00:00
Armin Rigo	79f7ad228b	Fixed some compiler warnings.	2004-08-07 19:27:39 +00:00
Jeremy Hylton	4c989ddc9c	Subclasses of string can no longer be interned. The semantics of interning were not clear here -- a subclass could be mutable, for example -- and had bugs. Explicitly interning a subclass of string via intern() will raise a TypeError. Internal operations that attempt to intern a string subclass will have no effect. Added a few tests to test_builtin that includes the old buggy code and verifies that calls like PyObject_SetAttr() don't fail. Perhaps these tests should have gone in test_string.	2004-08-07 19:20:05 +00:00
Marc-André Lemburg	1dffb120b7	.encode()/.decode() patch part 2.	2004-07-08 19:13:55 +00:00
Marc-André Lemburg	d2d4598ec2	Allow string and unicode return types from .encode()/.decode() methods on string and unicode objects. Added unicode.decode() which was missing for no apparent reason.	2004-07-08 17:57:32 +00:00
Tim Peters	e7c053233f	sizeof(char) is 1, by definition, so get rid of that expression in places it's just noise.	2004-06-27 17:24:49 +00:00
Martin v. Löwis	737ea82a5a	Patch #774665 : Make Python LC_NUMERIC agnostic.	2004-06-08 18:52:54 +00:00
Hye-Shik Chang	75c00efcc7	[SF #866875] Add a specialized routine for one character separators on str.split() and str.rsplit().	2004-01-05 00:29:51 +00:00
Skip Montanaro	ac4ea13a3a	There are places in Python which assume bytes have 8-bits. Formalize that a bit by checking the value of UCHAR_MAX in Include/Python.h. There was a check in Objects/stringobject.c. Remove that. (Note that we don't define UCHAR_MAX if it's not defined as the old test did.)	2003-12-22 16:31:41 +00:00
Hye-Shik Chang	3ae811b57d	Add rsplit method for str and unicode builtin types. SF feature request #801847. Original patch is written by Sean Reifschneider.	2003-12-15 18:49:53 +00:00
Guido van Rossum	6c9e130524	- Removed FutureWarnings related to hex/oct literals and conversions and left shifts. (Thanks to Kalle Svensson for SF patch 849227.) This addresses most of the remaining semantic changes promised by PEP 237, except for repr() of a long, which still shows the trailing 'L'. The PEP appears to promise warnings for operations that changed semantics compared to Python 2.3, but this is not implemented; we've suffered through enough warnings related to hex/oct literals and I think it's best to be silent now.	2003-11-29 23:52:13 +00:00
Raymond Hettinger	4f8f976576	Add optional fillchar argument to ljust(), rjust(), and center() string methods.	2003-11-26 08:21:35 +00:00
Fred Drake	d22bb6584d	Avoid confusing name for the 3rd argument to str.replace(). This closes SF bug #827260.	2003-10-22 02:56:40 +00:00
Martin v. Löwis	6828e18a6a	Patch #825679 : Clarify semantics of .isfoo on empty strings. Backported to 2.3.	2003-10-18 09:55:08 +00:00
Raymond Hettinger	9bfe533c69	SF bug #795506 : Wrong handling of string format code for float values. Adding missing support for '%F'. Will backport to 2.3.1.	2003-08-27 04:55:52 +00:00
Walter Dörwald	9ff3f03c3e	Fix whitespace.	2003-06-18 14:17:01 +00:00
Neal Norwitz	ffe33b7f24	Attempt to make all the various string strip methods the same. Doc - add doc for when functions were added * UserString * string object methods * string module functions 'chars' is used for the last parameter everywhere. These changes will be backported, since part of the changes have already been made, but they were inconsistent.	2003-04-10 22:35:32 +00:00
Guido van Rossum	a7132189d2	Reformat a few docstrings that caused line wraps in help() output.	2003-04-09 19:32:45 +00:00
Walter Dörwald	43440a621e	Fix PyString_Format() so that '%c' % u'a' returns u'a' instead of raising a TypeError. (From SF patch #710127) Add tests to verify this is fixed. Add various tests for '%c' % int.	2003-03-31 18:07:50 +00:00
Guido van Rossum	5d9113d8be	Implement appropriate __getnewargs__ for all immutable subclassable builtin types. The special handling for these can now be removed from save_newobj(). Add some testing for this. Also add support for setting the 'fast' flag on the Python Pickler class, which suppresses use of the memo.	2003-01-29 17:58:45 +00:00
Raymond Hettinger	5d5e7c0e34	SF patch #664192 bug #661913 : inconsistent error messages between string and unicode Patch by Christopher Blunck.	2003-01-15 05:32:57 +00:00
Raymond Hettinger	0a2f849b79	GvR's idea to use memset() for the most common special case of repeating a single character. Shaves another 10% off the running time by avoiding the lg2(N) loops and cache effects for the other cases.	2003-01-06 22:42:41 +00:00
Raymond Hettinger	698258a199	Optimize string_repeat. Christian Tismer pointed out the high cost of the loop overhead and function call overhead for 'c' * n where n is large. Accordingly, the new code only makes lg2(n) loops. Interestingly, 'c' * 1000 * 1000 ran a bit faster with old code. At some point, the loop and function call overhead became cheaper than invalidating the cache with lengthy memcpys. But for more typical sizes of n, the new code runs much faster and for larger values of n it runs only a bit slower.	2003-01-06 10:33:56 +00:00
Marc-André Lemburg	79f57833f3	Patch for bug #659709 : bogus computation of float length Python 2.2.x backport candidate. (This bug has been around since Python 1.6.)	2002-12-29 19:44:06 +00:00
Raymond Hettinger	ea3fdf44a2	SF patch #659536 : Use PyArg_UnpackTuple where possible. Obtain cleaner coding and a system wide performance boost by using the fast, pre-parsed PyArg_Unpack function instead of PyArg_ParseTuple function which is driven by a format string.	2002-12-29 16:33:45 +00:00
Martin v. Löwis	00b6127097	Patch #650653 : Raise always value error if the table is not 256 bytes long.	2002-12-12 20:03:19 +00:00
Martin v. Löwis	79acb9edfa	Patch #614055 : Support OpenVMS.	2002-12-06 12:48:53 +00:00
Neil Schemenauer	a6cd4e65d7	Add nb_remainder (i.e. __mod__) slot to str type. Fixes SF bug #615506 .	2002-11-18 16:09:38 +00:00
Neal Norwitz	80a1bf4b5d	Fix SF # 635969, No error "not all arguments converted" When mwh added extended slicing, strings and unicode became mappings. Thus, dict was set which prevented an error when doing: newstr = 'format without a percent' % string_value This fix raises an exception again when there are no formats and % with a string value.	2002-11-12 23:01:12 +00:00
Martin v. Löwis	a5f0907d79	Back out #479898 .	2002-10-11 05:37:59 +00:00
Guido van Rossum	049cd6b563	Fix a nasty endcase reported by Armin Rigo in SF bug 618623: '%2147483647d' % -123 segfaults. This was because an integer overflow in a comparison caused the string resize to be skipped. After fixing the overflow, this could call _PyString_Resize() with a negative size, so I (1) test for that and raise MemoryError instead; (2) also added a test for negative newsize to _PyString_Resize(), raising SystemError as for all bad arguments. An identical bug existed in unicodeobject.c, of course. Will backport to 2.2.2.	2002-10-11 00:43:48 +00:00
Guido van Rossum	8052f8921e	Undo this part of the previous checkin: Also fixed an error message -- %s argument has non-string str() doesn't make sense for %r, so the error message now differentiates between %s and %r. because PyObject_Repr() and PyObject_Str() ensure that this can never happen. Added a helpful comment instead.	2002-10-09 19:14:30 +00:00
Guido van Rossum	b00c07f038	The string formatting code has a test to switch to Unicode when %s sees a Unicode argument. Unfortunately this test was also executed for %r, because %s and %r share almost all of their code. This meant that, if u is a unicode object while repr(u) is an 8-bit string containing ASCII characters, '%r' % u is a unicode string containing only ASCII characters! Fixed by executing the test only for %s. Also fixed an error message -- %s argument has non-string str() doesn't make sense for %r, so the error message now differentiates between %s and %r.	2002-10-09 19:07:53 +00:00
Martin v. Löwis	bab9559d12	Include wctype.h.	2002-10-07 18:26:16 +00:00
Martin v. Löwis	fed2405cb5	Patch #479898 : Use multibyte C library for printing strings if available.	2002-10-07 13:55:50 +00:00
Guido van Rossum	efc1188239	Fix warnings on 64-bit platforms about casts from pointers to ints. Two of these were real bugs.	2002-09-12 14:43:41 +00:00
Martin v. Löwis	2412853f8e	Fix escaping of non-ASCII characters.	2002-09-09 06:17:05 +00:00
Walter Dörwald	8709a420c4	Check whether a string resize is necessary at the end of PyString_DecodeEscape(). This prevents a call to _PyString_Resize() for the empty string, which would result in a PyErr_BadInternalCall(), because the empty string has more than one reference. This closes SF bug http://www.python.org/sf/603937	2002-09-03 13:53:40 +00:00
Walter Dörwald	3aeb632c31	PEP 293 implementation (from SF patch http://www.python.org/sf/432401 )	2002-09-02 13:14:32 +00:00
Guido van Rossum	bf935fde15	string_contains(): speed up by avoiding function calls where possible. This always called PyUnicode_Check() and PyString_Check(), at least one of which would call PyType_IsSubtype(). Also, this would call PyString_Size() on known string objects.	2002-08-24 06:57:49 +00:00
Guido van Rossum	8b1a6d694f	Code by Inyeol Lee, submitted to SF bug 595350, to implement the string/unicode method .replace() with a zero-length first argument. Inyeol contributed tests for this too.	2002-08-23 18:21:28 +00:00
Guido van Rossum	76afbd9aa4	Fix some endcase bugs in unicode rfind()/rindex() and endswith(). These were reported and fixed by Inyeol Lee in SF bug 595350. The endswith() bug was already fixed in 2.3, but this adds some more test cases.	2002-08-20 17:29:29 +00:00
Guido van Rossum	45ec02aed1	SF patch 576101, by Oren Tirosh: alternative implementation of interning. I modified Oren's patch significantly, but the basic idea and most of the implementation is unchanged. Interned strings created with PyString_InternInPlace() are now mortal, and you must keep a reference to the resulting string around; use the new function PyString_InternImmortal() to create immortal interned strings.	2002-08-19 21:43:18 +00:00
Guido van Rossum	e3a8e7ed1d	Call me anal, but there was a particular phrase that was spreading to comments everywhere that bugged me: /* Foo is inlined */ instead of /* Inline Foo */. Somehow the "is inlined" phrase always confused me for half a second (thinking, "No it isn't" until I added the missing "here"). The new phrase is hopefully unambiguous.	2002-08-19 19:26:42 +00:00
Neal Norwitz	b898d9fc9a	Get this to compile again if Py_USING_UNICODE is not defined. com_error() is static in Python/compile.c.	2002-08-16 23:20:39 +00:00


334 Commits
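
Several of the needforspeed messages above (fe5bb7e6d9, 06a69dd8ff, 450277fef5) compare the then-new partition method against find and split(sep, 1). As a rough illustration of what is being compared, here is a minimal Python sketch of the same three-way split written on top of each of those calls. The helper names partition_via_find and partition_via_split are hypothetical, made up for this example; this is not the C implementation from those commits, and it says nothing about their timings.

```python
def partition_via_find(s, sep):
    """Emulate str.partition with str.find (illustrative helper only)."""
    if not sep:
        # str.partition itself rejects an empty separator the same way.
        raise ValueError("empty separator")
    i = s.find(sep)
    if i < 0:
        # Separator not found: whole string, then two empty strings.
        return (s, "", "")
    return (s[:i], sep, s[i + len(sep):])


def partition_via_split(s, sep):
    """Emulate str.partition with str.split(sep, 1) (illustrative helper only)."""
    parts = s.split(sep, 1)
    if len(parts) == 1:
        # split returned the input unchanged, so the separator was absent.
        return (s, "", "")
    return (parts[0], sep, parts[1])


if __name__ == "__main__":
    for text in ("key=value=more", "no separator here", ""):
        print(repr(text), partition_via_find(text, "="), text.partition("="))
```

Both helpers should agree with the built-in method on ordinary inputs; the built-in avoids the intermediate list that split creates and the separate slicing step that find requires.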