cpython

Commit Graph

Author	SHA1	Message	Date
Andrew Dalke	5132407868	Added limits to the replace code so it does not count all of the matching patterns in a string, only the number needed by the max limit.	2006-05-26 20:25:22 +00:00
Fredrik Lundh	e6e43c867d	needforspeed: stringlib refactoring: use stringlib/find for string find	2006-05-26 19:48:07 +00:00
Fredrik Lundh	58b5e84d52	needforspeed: stringlib refactoring, continued. added count and find helpers; updated unicodeobject to use stringlib_count	2006-05-26 19:24:53 +00:00
Andrew Dalke	c5da53ba78	substring split now uses /F's fast string matching algorithm. (If compiled without FAST search support, changed the pre-memcmp test to check the last character as well as the first. This gave a 25% speedup for my test case.) Rewrote the split algorithms so they stop when maxsplit gets to 0. Previously they did a string match first then checked if the maxsplit was reached. The new way prevents a needless string search.	2006-05-26 19:02:09 +00:00
Fredrik Lundh	b3167cbcd7	needforspeed: added rpartition implementation	2006-05-26 18:15:38 +00:00
Fredrik Lundh	3a65d87e8c	needforspeed: remove remaining USE_FAST macros; if fastsearch was broken, someone would have noticed by now ;-)	2006-05-26 17:31:41 +00:00
Fredrik Lundh	c2032fb86a	needforspeed: cleanup	2006-05-26 17:26:39 +00:00
Fredrik Lundh	b947948c61	needforspeed: stringlib refactoring (in progress)	2006-05-26 17:22:38 +00:00
Fredrik Lundh	a50d201bd9	needforspeed: stringlib refactoring (in progress)	2006-05-26 17:04:58 +00:00
Fredrik Lundh	7c940d1d68	needforspeed: use Py_LOCAL on a few more locals in stringobject.c	2006-05-26 16:32:42 +00:00
Andrew Dalke	02758d66ce	Eked out another 3% or so performance in split whitespace by cleaning up the algorithm.	2006-05-26 15:21:01 +00:00
Andrew Dalke	525eab3712	Changes to string.split/rsplit on whitespace to preallocate space in the results list. Originally it allocated 0 items and used the list growth during append. Now it preallocates 12 items so the first few appends don't need list reallocs. ("Here are some words ."*2).split(None, 1) is 7% faster; ("Here are some words ."*2).split() is 15% faster. (Your mileage may vary, see dealership for details.) File parsing like this: for line in f: count += len(line.split()) is also about 15% faster. There is a slowdown of about 3% for large strings because of the additional overhead of checking if the append is to a preallocated region of the list or not. This will be the rare case. It could be improved with special case code but we decided it was not useful enough. There is a cost of 12*sizeof(PyObject *) bytes per list. For the normal case of file parsing this is not a problem because the lists have a short lifetime. We have not come up with cases where this is a problem in real life. I chose 12 because human text averages about 11 words per line in books, one of my data sets averages 6.2 words with a final peak at 11 words per line, and I work with a tab delimited data set with 8 tabs per line (or 9 words per line). 12 encompasses all of these. Also changed the last rstrip code to append then reverse, rather than doing insert(0). The strip() and rstrip() times are now comparable.	2006-05-26 14:00:45 +00:00
Fredrik Lundh	95e2a91615	use Py_LOCAL also for string and unicode objects	2006-05-26 11:38:15 +00:00
Fredrik Lundh	f2c0dfdb13	needforspeed: use Py_ssize_t for the fastsearch counter and skip length (thanks, neal!). and yes, I've verified that this doesn't slow things down ;-)	2006-05-26 10:27:17 +00:00
Fredrik Lundh	450277fef5	needforspeed: use METH_O for argument handling, which made partition some ~15% faster for the current tests (which is noticeably faster than a corresponding find call). thanks to neal-who-never-sleeps for the tip.	2006-05-26 09:46:59 +00:00
Fredrik Lundh	06a69dd8ff	needforspeed: partition implementation, part two. feel free to improve the documentation and the docstrings.	2006-05-26 08:54:28 +00:00
Fredrik Lundh	fe5bb7e6d9	needforspeed: partition for 8-bit strings. for some simple tests, this is on par with a corresponding find, and nearly twice as fast as split(sep, 1). full tests, a unicode version, and documentation will follow tomorrow.	2006-05-25 23:27:53 +00:00
Bob Ippolito	955b64c031	squelch gcc4 darwin/x86 compiler warnings	2006-05-25 20:52:38 +00:00
Fredrik Lundh	554da412a8	needforspeed: use insert+reverse instead of append	2006-05-25 19:19:05 +00:00
Jack Diederich	60cbb3fe49	* eliminate warning by reverting tmp_s type to 'const char*'	2006-05-25 18:47:15 +00:00
Fredrik Lundh	c3434b3834	needforspeed: use fastsearch also for find/index and contains. the related tests are now about 10x faster.	2006-05-25 18:44:29 +00:00
Andrew Dalke	598710c727	Added overflow test for adding two (very) large strings where the new string is over max Py_ssize_t. I have no way to test it on my box or any box I have access to. At least it doesn't break anything.	2006-05-25 18:18:39 +00:00
Andrew M. Kuchling	f344c94c85	Comment typo	2006-05-25 18:11:16 +00:00
Fredrik Lundh	af72237abc	needforspeed: use "fastsearch" for count. this results in a 3x speedup for the related stringbench tests.	2006-05-25 17:55:31 +00:00
Andrew Dalke	8c9091074b	Fixed problem identified by Georg. The special-case in-place code for replace made a copy of the string using PyString_FromStringAndSize(s, n) and modified the copied string in-place. However, 1 (and 0) character strings are shared from a cache. This caused "A".replace("A", "a") to change the cached version of "A" -- used by everyone. Now make the copy with NULL as the string and do the memcpy manually. I've added regression tests to check if this happens in the future. Perhaps there should be a PyString_Copy for this case?	2006-05-25 17:53:00 +00:00
Fredrik Lundh	e68955cf32	needforspeed: new replace implementation by Andrew Dalke. replace is now about 3x faster on my machine, for the replace tests from stringbench.	2006-05-25 17:08:14 +00:00
Fredrik Lundh	0c71f88fc9	needforspeed: check for overflow in replace (from Andrew Dalke)	2006-05-25 16:46:54 +00:00
Fredrik Lundh	dfe503d3f0	needforspeed: _toupper/_tolower is a SUSv2 thing; fall back on ISO C versions if they're not defined.	2006-05-25 16:10:12 +00:00
Fredrik Lundh	4b4e33ef14	needforspeed: make new upper/lower work properly for single-character strings too... (thanks to georg brandl for spotting the exact problem faster than anyone else)	2006-05-25 15:49:45 +00:00
Fredrik Lundh	39ccef607e	needforspeed: speed up upper and lower for 8-bit string objects. (the unicode versions of these are still 2x faster on windows, though...) based on work by Andrew Dalke, with tweaks by yours truly.	2006-05-25 15:22:03 +00:00
Fredrik Lundh	763b50f9d9	docstring tweaks: count counts non-overlapping substrings, not total number of occurrences	2006-05-22 15:35:12 +00:00
Tim Peters	8931ff1f67	Teach PyString_FromFormat, PyErr_Format, and PyString_FromFormatV about "%u", "%lu" and "%zu" formats. Since PyString_FromFormat and PyErr_Format have exactly the same rules (both inherited from PyString_FromFormatV), it would be good if someone with more LaTeX Fu changed one of them to just point to the other. Their docs were way out of synch before this patch, and I just did a mass copy+paste to repair that. Not a backport candidate (this is a new feature).	2006-05-13 23:28:20 +00:00
Martin v. Löwis	822f34a848	Revert 43315: Printing of %zd must be signed.	2006-05-13 13:34:04 +00:00
Thomas Wouters	568f1d0eed	Py_ssize_t issue; repr()'ing a very large string would result in a teensy string, because of a cast to int.	2006-04-21 13:54:43 +00:00
Thomas Wouters	dc5f808cbc	Make s.replace() work with explicit counts exceeding 2Gb.	2006-04-19 15:38:01 +00:00
Thomas Wouters	4abb3660ca	Use Py_ssize_t to hold the 'width' argument to the ljust, rjust, center and zfill stringmethods, so they can create strings larger than 2Gb on 64bit systems (even win64.) The unicode versions of these methods already did this right.	2006-04-19 14:50:15 +00:00
Skip Montanaro	429433b30b	C++ compiler cleanup: bunch-o-casts, plus use of unsigned loop index var in a couple places	2006-04-18 00:35:43 +00:00
Neal Norwitz	0e2cbabb8d	No need to cast a Py_ssize_t, use %z in PyErr_Format	2006-04-17 05:56:32 +00:00
Martin v. Löwis	5cb6936672	Make Py_BuildValue, PyObject_CallFunction and PyObject_CallMethod aware of PY_SSIZE_T_CLEAN.	2006-04-14 09:08:42 +00:00
Martin v. Löwis	83687c98dc	Change more occurrences of maxsplit to Py_ssize_t.	2006-04-13 08:52:56 +00:00
Martin v. Löwis	9c83076b7b	Change maxsplit types to Py_ssize_t.	2006-04-13 08:37:17 +00:00
Martin v. Löwis	8ce358f5fe	Replace most INT_MAX with PY_SSIZE_T_MAX.	2006-04-13 07:22:51 +00:00
Anthony Baxter	a62862120d	More low-hanging fruit. Still need to re-arrange some code (or find a better solution) in the same way as listobject.c got changed. Hoping for a better solution.	2006-04-11 07:42:36 +00:00
Neal Norwitz	7e957d38b7	Remove dead code (reported by HP compiler). Can probably be backported if anyone cares.	2006-04-06 08:17:41 +00:00
Georg Brandl	347b30042b	Remove unnecessary casts in type object initializers.	2006-03-30 11:57:00 +00:00
Neal Norwitz	7fbd6916b6	Get rid of warnings on some platforms by using %u for a size_t.	2006-03-25 23:55:39 +00:00
Neal Norwitz	2aa9a5dfdd	Use macro versions instead of function versions when we already know the type. This will hopefully get rid of some Coverity warnings, be a hint to developers, and be marginally faster. Some asserts were added when the type is currently known, but depends on values from another function.	2006-03-20 01:53:23 +00:00
Tim Peters	ae1d0c978d	Introduced symbol PY_FORMAT_SIZE_T. See the new comments in pyport.h. Changed PyString_FromFormatV() to use it instead of inlining its own maze of #if'ery.	2006-03-17 03:29:34 +00:00
Guido van Rossum	38fff8c4e4	Checking in the code for PEP 357. This was mostly written by Travis Oliphant. I've inspected it all; Neal Norwitz and MvL have also looked at it (in an earlier incarnation).	2006-03-07 18:50:55 +00:00
Hye-Shik Chang	4af5c8cee4	SF #1444030 : Fix several potential defects found by Coverity. (reviewed by Neal Norwitz)	2006-03-07 15:39:21 +00:00
Martin v. Löwis	725507b52e	Change int to Py_ssize_t in several places. Add (int) casts to silence compiler warnings. Raise Python exceptions for overflows.	2006-03-07 12:08:51 +00:00
Martin v. Löwis	15e62742fa	Revert backwards-incompatible const changes.	2006-02-27 16:46:16 +00:00
Thomas Wouters	977485d888	Use Py_ssize_t in helper function between Py_ssize_t-using functions.	2006-02-16 15:59:12 +00:00
Martin v. Löwis	eb079f1c25	Use Py_ssize_t for counts and sizes. Convert Py_ssize_t using PyInt_FromSsize_t	2006-02-16 14:32:27 +00:00
Martin v. Löwis	2c95cc6d72	Support %zd in PyErr_Format and PyString_FromFormat.	2006-02-16 06:54:25 +00:00
Martin v. Löwis	18e165558b	Merge ssize_t branch.	2006-02-15 17:27:45 +00:00
Jeremy Hylton	af68c874a6	Add const to several API functions that take char *. In C++, it's an error to pass a string literal to a char* function without a const_cast(). Rather than require every C++ extension module to put a cast around string literals, fix the API to state the const-ness. I focused on parts of the API where people usually pass literals: PyArg_ParseTuple() and friends, Py_BuildValue(), PyMethodDef, the type slots, etc. Predictably, there were a large set of functions that needed to be fixed as a result of these changes. The most pervasive change was to make the keyword args list passed to PyArg_ParseTupleAndKeywords() to be a const char *kwlist[]. One cast was required as a result of the changes: A type object mallocs the memory for its tp_doc slot and later frees it. PyTypeObject says that tp_doc is const char *; but if the type was created by type_new(), we know it is safe to cast to char *.	2005-12-10 18:50:16 +00:00
Michael W. Hudson	b2308bb9be	Fix bug: [ 1327110 ] wrong TypeError traceback in generator expressions by removing the code that can stomp on the users' TypeError raised by the iterable argument to ''.join() -- PySequence_Fast (now?) gives a perfectly reasonable message itself. Also, a couple of tests.	2005-10-21 11:45:01 +00:00
Neal Norwitz	95c1e5065c	SF bug #1331563: string_subscript doesn't check for failed PyMem_Malloc. Will backport	2005-10-20 04:15:52 +00:00
Georg Brandl	d45014b236	Fix PyString_Format so that the "%s" format works again when Unicode is not enabled.	2005-10-01 17:06:00 +00:00
Neil Schemenauer	ab61923637	Fix bug in last checkin (2.231). To match previous behavior, unicode subclasses should be substituted as-is and not have tp_str called on them.	2005-08-31 23:02:05 +00:00
Neil Schemenauer	cf52c07843	Change the %s format specifier for str objects so that it returns a unicode instance if the argument is not an instance of basestring and calling __str__ on the argument returns a unicode instance.	2005-08-12 17:34:58 +00:00
Raymond Hettinger	3296e696db	SF bug #1224347 : int/long unification and hex() Hex longs now print with lowercase letters like their int counterparts.	2005-06-29 23:29:56 +00:00
Raymond Hettinger	57e7447c44	* Beef-up tests for str.count(). * Speed-up str.count() by using memchr() to fly between first char matches.	2005-02-20 09:54:53 +00:00
Raymond Hettinger	7cbf1bcb3e	* Beef-up testing of str.__contains__() and str.find(). * Speed-up "x in y" where x has more than one character. The existing code made excessive calls to the expensive memcmp() function. The new code uses memchr() to rapidly find a start point for memcmp(). In addition to knowing that the first character is a match, the new code also checks that the last character is a match. This significantly reduces the incidence of false starts (saving memcmp() calls and making quadratic behavior less likely). Improves the timings on: python -m timeit -r7 -s"x='a'*1000" "'ab' in x" and python -m timeit -r7 -s"x='a'*1000" "'bc' in x". Once this code has proven itself, then string_find_internal() should refer to it rather than running its own version. Also, something similar may apply to unicode objects.	2005-02-20 04:07:08 +00:00
Michael W. Hudson	faa7648ffe	More bug #1077106 stuff, sorry -- modem induced impatience! This should go on whatever bugfix branches the other fetches up on.	2005-01-31 17:09:25 +00:00
Raymond Hettinger	561fbf138d	SF bug #1054139 : serious string hashing error in 2.4b1 _PyString_Resize() readied strings for mutation but did not invalidate the cached hash value.	2004-10-26 01:52:37 +00:00
Raymond Hettinger	674f241e9c	SF Patch #1007087 : Return new string for single subclass joins (Bug #1001011 ) (Patch contributed by Nick Coghlan.) Now joining string subtypes will always return a string. Formerly, if there were only one item, it was returned unchanged.	2004-08-23 23:23:54 +00:00
Armin Rigo	618fbf5469	This was quite a dark bug in my recent in-place string concatenation hack: it would resize interned strings in-place! This occurred because their reference counts do not have their expected value -- stringobject.c hacks them. Mea culpa.	2004-08-07 20:58:32 +00:00
Armin Rigo	79f7ad228b	Fixed some compiler warnings.	2004-08-07 19:27:39 +00:00
Jeremy Hylton	4c989ddc9c	Subclasses of string can no longer be interned. The semantics of interning were not clear here -- a subclass could be mutable, for example -- and had bugs. Explicitly interning a subclass of string via intern() will raise a TypeError. Internal operations that attempt to intern a string subclass will have no effect. Added a few tests to test_builtin that includes the old buggy code and verifies that calls like PyObject_SetAttr() don't fail. Perhaps these tests should have gone in test_string.	2004-08-07 19:20:05 +00:00
Marc-André Lemburg	1dffb120b7	.encode()/.decode() patch part 2.	2004-07-08 19:13:55 +00:00
Marc-André Lemburg	d2d4598ec2	Allow string and unicode return types from .encode()/.decode() methods on string and unicode objects. Added unicode.decode() which was missing for no apparent reason.	2004-07-08 17:57:32 +00:00
Tim Peters	e7c053233f	sizeof(char) is 1, by definition, so get rid of that expression in places it's just noise.	2004-06-27 17:24:49 +00:00
Martin v. Löwis	737ea82a5a	Patch #774665 : Make Python LC_NUMERIC agnostic.	2004-06-08 18:52:54 +00:00
Hye-Shik Chang	75c00efcc7	[SF #866875] Add a specialized routine for one character separators on str.split() and str.rsplit().	2004-01-05 00:29:51 +00:00
Skip Montanaro	ac4ea13a3a	There are places in Python which assume bytes have 8-bits. Formalize that a bit by checking the value of UCHAR_MAX in Include/Python.h. There was a check in Objects/stringobject.c. Remove that. (Note that we don't define UCHAR_MAX if it's not defined as the old test did.)	2003-12-22 16:31:41 +00:00
Hye-Shik Chang	3ae811b57d	Add rsplit method for str and unicode builtin types. SF feature request #801847. Original patch is written by Sean Reifschneider.	2003-12-15 18:49:53 +00:00
Guido van Rossum	6c9e130524	- Removed FutureWarnings related to hex/oct literals and conversions and left shifts. (Thanks to Kalle Svensson for SF patch 849227.) This addresses most of the remaining semantic changes promised by PEP 237, except for repr() of a long, which still shows the trailing 'L'. The PEP appears to promise warnings for operations that changed semantics compared to Python 2.3, but this is not implemented; we've suffered through enough warnings related to hex/oct literals and I think it's best to be silent now.	2003-11-29 23:52:13 +00:00
Raymond Hettinger	4f8f976576	Add optional fillchar argument to ljust(), rjust(), and center() string methods.	2003-11-26 08:21:35 +00:00
Fred Drake	d22bb6584d	Avoid confusing name for the 3rd argument to str.replace(). This closes SF bug #827260.	2003-10-22 02:56:40 +00:00
Martin v. Löwis	6828e18a6a	Patch #825679 : Clarify semantics of .isfoo on empty strings. Backported to 2.3.	2003-10-18 09:55:08 +00:00
Raymond Hettinger	9bfe533c69	SF bug #795506 : Wrong handling of string format code for float values. Adding missing support for '%F'. Will backport to 2.3.1.	2003-08-27 04:55:52 +00:00
Walter Dörwald	9ff3f03c3e	Fix whitespace.	2003-06-18 14:17:01 +00:00
Neal Norwitz	ffe33b7f24	Attempt to make all the various string strip methods the same. Doc - add doc for when functions were added * UserString * string object methods * string module functions 'chars' is used for the last parameter everywhere. These changes will be backported, since part of the changes have already been made, but they were inconsistent.	2003-04-10 22:35:32 +00:00
Guido van Rossum	a7132189d2	Reformat a few docstrings that caused line wraps in help() output.	2003-04-09 19:32:45 +00:00
Walter Dörwald	43440a621e	Fix PyString_Format() so that '%c' % u'a' returns u'a' instead of raising a TypeError. (From SF patch #710127) Add tests to verify this is fixed. Add various tests for '%c' % int.	2003-03-31 18:07:50 +00:00
Guido van Rossum	5d9113d8be	Implement appropriate __getnewargs__ for all immutable subclassable builtin types. The special handling for these can now be removed from save_newobj(). Add some testing for this. Also add support for setting the 'fast' flag on the Python Pickler class, which suppresses use of the memo.	2003-01-29 17:58:45 +00:00
Raymond Hettinger	5d5e7c0e34	SF patch #664192 bug #661913 : inconsistent error messages between string and unicode Patch by Christopher Blunck.	2003-01-15 05:32:57 +00:00
Raymond Hettinger	0a2f849b79	GvR's idea to use memset() for the most common special case of repeating a single character. Shaves another 10% off the running time by avoiding the lg2(N) loops and cache effects for the other cases.	2003-01-06 22:42:41 +00:00
Raymond Hettinger	698258a199	Optimize string_repeat. Christian Tismer pointed out the high cost of the loop overhead and function call overhead for 'c' * n where n is large. Accordingly, the new code only makes lg2(n) loops. Interestingly, 'c' * 1000 * 1000 ran a bit faster with old code. At some point, the loop and function call overhead became cheaper than invalidating the cache with lengthy memcpys. But for more typical sizes of n, the new code runs much faster and for larger values of n it runs only a bit slower.	2003-01-06 10:33:56 +00:00
Marc-André Lemburg	79f57833f3	Patch for bug #659709 : bogus computation of float length Python 2.2.x backport candidate. (This bug has been around since Python 1.6.)	2002-12-29 19:44:06 +00:00
Raymond Hettinger	ea3fdf44a2	SF patch #659536 : Use PyArg_UnpackTuple where possible. Obtain cleaner coding and a system wide performance boost by using the fast, pre-parsed PyArg_Unpack function instead of PyArg_ParseTuple function which is driven by a format string.	2002-12-29 16:33:45 +00:00
Martin v. Löwis	00b6127097	Patch #650653 : Raise always value error if the table is not 256 bytes long.	2002-12-12 20:03:19 +00:00
Martin v. Löwis	79acb9edfa	Patch #614055 : Support OpenVMS.	2002-12-06 12:48:53 +00:00
Neil Schemenauer	a6cd4e65d7	Add nb_remainder (i.e. __mod__) slot to str type. Fixes SF bug #615506 .	2002-11-18 16:09:38 +00:00
Neal Norwitz	80a1bf4b5d	Fix SF # 635969, No error "not all arguments converted" When mwh added extended slicing, strings and unicode became mappings. Thus, dict was set which prevented an error when doing: newstr = 'format without a percent' % string_value This fix raises an exception again when there are no formats and % with a string value.	2002-11-12 23:01:12 +00:00
Martin v. Löwis	a5f0907d79	Back out #479898 .	2002-10-11 05:37:59 +00:00
Guido van Rossum	049cd6b563	Fix a nasty endcase reported by Armin Rigo in SF bug 618623: '%2147483647d' % -123 segfaults. This was because an integer overflow in a comparison caused the string resize to be skipped. After fixing the overflow, this could call _PyString_Resize() with a negative size, so I (1) test for that and raise MemoryError instead; (2) also added a test for negative newsize to _PyString_Resize(), raising SystemError as for all bad arguments. An identical bug existed in unicodeobject.c, of course. Will backport to 2.2.2.	2002-10-11 00:43:48 +00:00
Guido van Rossum	8052f8921e	Undo this part of the previous checkin: Also fixed an error message -- %s argument has non-string str() doesn't make sense for %r, so the error message now differentiates between %s and %r. because PyObject_Repr() and PyObject_Str() ensure that this can never happen. Added a helpful comment instead.	2002-10-09 19:14:30 +00:00
Guido van Rossum	b00c07f038	The string formatting code has a test to switch to Unicode when %s sees a Unicode argument. Unfortunately this test was also executed for %r, because %s and %r share almost all of their code. This meant that, if u is a unicode object while repr(u) is an 8-bit string containing ASCII characters, '%r' % u is a unicode string containing only ASCII characters! Fixed by executing the test only for %s. Also fixed an error message -- %s argument has non-string str() doesn't make sense for %r, so the error message now differentiates between %s and %r.	2002-10-09 19:07:53 +00:00
Martin v. Löwis	bab9559d12	Include wctype.h.	2002-10-07 18:26:16 +00:00
Martin v. Löwis	fed2405cb5	Patch #479898 : Use multibyte C library for printing strings if available.	2002-10-07 13:55:50 +00:00
Guido van Rossum	efc1188239	Fix warnings on 64-bit platforms about casts from pointers to ints. Two of these were real bugs.	2002-09-12 14:43:41 +00:00
Martin v. Löwis	2412853f8e	Fix escaping of non-ASCII characters.	2002-09-09 06:17:05 +00:00
Walter Dörwald	8709a420c4	Check whether a string resize is necessary at the end of PyString_DecodeEscape(). This prevents a call to _PyString_Resize() for the empty string, which would result in a PyErr_BadInternalCall(), because the empty string has more than one reference. This closes SF bug http://www.python.org/sf/603937	2002-09-03 13:53:40 +00:00
Walter Dörwald	3aeb632c31	PEP 293 implementation (from SF patch http://www.python.org/sf/432401 )	2002-09-02 13:14:32 +00:00
Guido van Rossum	bf935fde15	string_contains(): speed up by avoiding function calls where possible. This always called PyUnicode_Check() and PyString_Check(), at least one of which would call PyType_IsSubtype(). Also, this would call PyString_Size() on known string objects.	2002-08-24 06:57:49 +00:00
Guido van Rossum	8b1a6d694f	Code by Inyeol Lee, submitted to SF bug 595350, to implement the string/unicode method .replace() with a zero-length first argument. Inyeol contributed tests for this too.	2002-08-23 18:21:28 +00:00
Guido van Rossum	76afbd9aa4	Fix some endcase bugs in unicode rfind()/rindex() and endswith(). These were reported and fixed by Inyeol Lee in SF bug 595350. The endswith() bug was already fixed in 2.3, but this adds some more test cases.	2002-08-20 17:29:29 +00:00
Guido van Rossum	45ec02aed1	SF patch 576101, by Oren Tirosh: alternative implementation of interning. I modified Oren's patch significantly, but the basic idea and most of the implementation is unchanged. Interned strings created with PyString_InternInPlace() are now mortal, and you must keep a reference to the resulting string around; use the new function PyString_InternImmortal() to create immortal interned strings.	2002-08-19 21:43:18 +00:00
Guido van Rossum	e3a8e7ed1d	Call me anal, but there was a particular phrase that was spreading to comments everywhere that bugged me: /* Foo is inlined */ instead of /* Inline Foo */. Somehow the "is inlined" phrase always confused me for half a second (thinking, "No it isn't" until I added the missing "here"). The new phrase is hopefully unambiguous.	2002-08-19 19:26:42 +00:00
Neal Norwitz	b898d9fc9a	Get this to compile again if Py_USING_UNICODE is not defined. com_error() is static in Python/compile.c.	2002-08-16 23:20:39 +00:00
Guido van Rossum	54df53a352	More changes of DeprecationWarning to FutureWarning.	2002-08-14 18:38:27 +00:00
Martin v. Löwis	eb3f00aeeb	Check for trailing backslash. Fixes #593656 .	2002-08-14 08:22:50 +00:00
Martin v. Löwis	8a8da798a5	Patch #505705 : Remove eval in pickle and cPickle.	2002-08-14 07:46:28 +00:00
Guido van Rossum	078151da90	Implement stage B0 of PEP 237: add warnings for operations that currently return inconsistent results for ints and longs; in particular: hex/oct/%u/%o/%x/%X of negative short ints, and x<<n that either loses bits or changes sign. (No warnings for repr() of a long, though that will also change to lose the trailing 'L' eventually.) This introduces some warnings in the test suite; I'll take care of those later.	2002-08-11 04:24:12 +00:00
Barry Warsaw	817918cc3c	Committing patch #591250 which provides "str1 in str2" when str1 is a string of longer than 1 character.	2002-08-06 16:58:21 +00:00
Raymond Hettinger	bc552ce1b8	SF 582071 clarified the .split() method's docstring to note that sep=None will trigger splitting on any whitespace.	2002-08-05 06:28:21 +00:00
Neal Norwitz	88fe4ff5a9	Fix the problem of not raising a TypeError exception when doing: '%g' % '1' or '%d' % '1'. Add a test for these conditions. Fix the test so that if no exception is raised, this is a failure.	2002-07-28 16:44:23 +00:00
Neal Norwitz	7beeed5dfd	SF patch #577031 , remove PyArg_Parse() since it's deprecated	2002-07-28 15:19:47 +00:00
Martin v. Löwis	75d2d94e0f	Patch #554716 : Use __va_copy where available.	2002-07-28 10:23:27 +00:00
Jeremy Hylton	938ace69a0	staticforward bites the dust. The staticforward define was needed to support certain broken C compilers (notably SCO ODT 3.0, perhaps early AIX as well) that botched the static keyword when it was used with a forward declaration of a static initialized structure. Standard C allows the forward declaration with static, and we've decided to stop catering to broken C compilers. (In fact, we expect that the compilers are all fixed eight years later.) I'm leaving staticforward and statichere defined in object.h as static. This is only for backwards compatibility with C extensions that might still use it. XXX I haven't updated the documentation.	2002-07-17 16:30:39 +00:00
Tim Peters	3459251d5a	object.h special-build macro minefield: renamed all the new lexical helper macros to something saner, and used them appropriately in other files too, to reduce #ifdef blocks. classobject.c, instance_dealloc(): One of my worst Python Memories is trying to fix this routine a few years ago when COUNT_ALLOCS was defined but Py_TRACE_REFS wasn't. The special-build code here is way too complicated. Now it's much simpler. Difference: in a Py_TRACE_REFS build, the instance is no longer in the doubly-linked list of live objects while its __del__ method is executing, and that may be visible via sys.getobjects() called from a __del__ method. Tough -- the object is presumed dead while its __del__ is executing anyway, and not calling _Py_NewReference() at the start allows enormous code simplification. typeobject.c, call_finalizer(): The special-build instance_dealloc() pain apparently spread to here too via cut-'n-paste, and this is much simpler now too. In addition, I didn't understand why this routine was calling _PyObject_GC_TRACK() after a resurrection, since there's no plausible way _PyObject_GC_UNTRACK() could have been called on the object by this point. I suspect it was left over from pasting the instance_dealloc() code. Instead asserted that the object is still tracked. Caution: I suspect we don't have a test that actually exercises the subtype_dealloc() __del__-resurrected-me code.	2002-07-11 06:23:50 +00:00
Neal Norwitz	1f68fc7fa5	SF bug # 493951 string.{starts,ends}with vs slices Handle negative indices similar to slices.	2002-06-14 00:50:42 +00:00
Martin v. Löwis	14f8b4cfcb	Patch #568124 : Add doc string macros.	2002-06-13 20:33:02 +00:00
Michael W. Hudson	5efaf7eac8	This is my nearly two year old patch [ 400998 ] experimental support for extended slicing on lists somewhat spruced up and better tested than it was when I wrote it. Includes docs & tests. The whatsnew section needs expanding, and arrays should support extended slices -- later.	2002-06-11 10:55:12 +00:00
Neal Norwitz	32a7e7f6b6	Change name from string to basestring	2002-05-31 19:58:02 +00:00
Guido van Rossum	cacfc07d08	- A new type object, 'string', is added. This is a common base type for 'str' and 'unicode', and can be used instead of types.StringTypes, e.g. to test whether something is "a string": isinstance(x, string) is True for Unicode and 8-bit strings. This is an abstract base class and cannot be instantiated directly.	2002-05-24 19:01:59 +00:00
Raymond Hettinger	0ebac97058	Patch 549187. Improve string formatting error message.	2002-05-21 15:14:57 +00:00
Walter Dörwald	775c11f07a	Add #ifdef PY_USING_UNICODE sections, so that stringobject.c compiles again with --disable-unicode. Fixes SF bug http://www.python.org/sf/554912	2002-05-13 09:00:41 +00:00
Tim Peters	5de9842b34	Repair widespread misuse of _PyString_Resize. Since it's clear people don't understand how this function works, also beefed up the docs. The most common usage error is of this form (often spread out across gotos): if (_PyString_Resize(&s, n) < 0) { Py_DECREF(s); s = NULL; goto outtahere; } The error is that if _PyString_Resize runs out of memory, it automatically decrefs the input string object s (which also deallocates it, since its refcount must be 1 upon entry), and sets s to NULL. So if the "if" branch ever triggers, it's an error to call Py_DECREF(s): s is already NULL! A correct way to write the above is the simpler (and intended) if (_PyString_Resize(&s, n) < 0) goto outtahere; Bugfix candidate.	2002-04-27 18:44:32 +00:00
Walter Dörwald	de02bcb265	Apply patch diff.txt from SF feature request http://www.python.org/sf/444708 This adds the optional argument for str.strip to unicode.strip too and makes it possible to call str.strip with a unicode argument and unicode.strip with a str argument.	2002-04-22 17:42:37 +00:00
Walter Dörwald	0fe940c862	Return the original string only if it's a real str or unicode instance, otherwise make a copy.	2002-04-15 18:42:15 +00:00
Guido van Rossum	3aa3fc46c8	Remove 'const' from local variable declaration in string_zfill() -- it isn't constant, so why bother. Folded long lines. Whitespace normalization.	2002-04-15 13:48:52 +00:00
Walter Dörwald	068325ef92	Apply the second version of SF patch http://www.python.org/sf/536241 Add a method zfill to str, unicode and UserString and change Lib/string.py accordingly. This activates the zfill version in unicodeobject.c that was commented out and implements the same in stringobject.c. It also adds the test for unicode support in Lib/string.py back in and uses repr() instead of str() (as it was before Lib/string.py 1.62)	2002-04-15 13:36:47 +00:00
Guido van Rossum	018b0eb0f5	Partially implement SF feature request 444708. Add optional arg to string methods strip(), lstrip(), rstrip(). The optional arg specifies characters to delete. Also for UserString. Still to do: - Misc/NEWS - LaTeX docs (I did the docstrings though) - Unicode methods, and Unicode support in the string methods.	2002-04-13 00:56:08 +00:00
Neil Schemenauer	510492e985	Remove PyMalloc_New, _PyMalloc_MALLOC, and PyMalloc_Del.	2002-04-12 03:05:19 +00:00
Guido van Rossum	77f6a65eb0	Add the 'bool' type and its values 'False' and 'True', as described in PEP 285. Everything described in the PEP is here, and there is even some documentation. I had to fix 12 unit tests; all but one of these were printing Boolean outcomes that changed from 0/1 to False/True. (The exception is test_unicode.py, which did a type(x) == type(y) style comparison. I could've fixed that with a single line using issubtype(x, type(y)), but instead chose to be explicit about those places where a bool is expected. Still to do: perhaps more documentation; change standard library modules to return False/True from predicates.	2002-04-03 22:41:51 +00:00
Tim Peters	8deda70b16	Eliminate DONT_SHARE_SHORT_STRINGS.	2002-03-30 10:06:07 +00:00
Tim Peters	1f7df3595a	Remove the CACHE_HASH and INTERN_STRINGS preprocessor symbols.	2002-03-29 03:29:08 +00:00
Neil Schemenauer	dcc819a5c9	Use pymalloc if it's enabled.	2002-03-22 15:33:15 +00:00
Andrew MacIntyre	5e9c80d906	%#x/%#X format conversion cleanup (see patch #450267 ): Objects/ stringobject.c unicodeobject.c	2002-02-28 11:38:24 +00:00
Andrew MacIntyre	c487439aa7	OS/2 EMX port changes (Objects part of patch #450267 ): Objects/ fileobject.c stringobject.c unicodeobject.c This commit doesn't include the cleanup patches for stringobject.c and unicodeobject.c which are shown separately in the patch manager. Those patches will be regenerated and applied in a subsequent commit, so as to preserve a fallback position (this commit to those files).	2002-02-26 11:36:35 +00:00
Martin v. Löwis	1f803f782c	Updated patch #487906 : Revise inline docs.	2002-01-16 10:53:24 +00:00
Guido van Rossum	169192e818	SF patch #491049 (David Jacobs): Small PyString_FromString optimization PyString_FromString(): Since the length of the string is already being stored in size, changed the strcpy() to a memcpy() for a small speed improvement.	2001-12-10 15:45:54 +00:00
Tim Peters	62de65b25e	PyString_FromString: this requires its argument be non-NULL, but doesn't check it. Added an assert() to that effect.	2001-12-06 20:29:32 +00:00
Jeremy Hylton	7802a53e38	Little stuff. Add a missing DECREF in an obscure corner. If the str() or repr() of an object passed to a string interpolation -- e.g. "%s" % obj -- returns a non-string, the returned object was leaked. Repair an indentation glitch. Replace a bunch of PyString_AsString() calls (and their ilk) with macros.	2001-12-06 15:18:48 +00:00
Martin v. Löwis	8f1ea71eab	Add more inline documentation, as contributed in #487906 .	2001-12-03 08:24:52 +00:00
Tim Peters	9161c8b0a1	PyString_FromFormatV, string_repr: document why these use sprintf instead of PyOS_snprintf; add some relevant comments and asserts.	2001-12-03 01:55:38 +00:00

397 Commits