cpython

Commit Graph

Author	SHA1	Message	Date
Raymond Hettinger	9bfe533c69	SF bug #795506 : Wrong handling of string format code for float values. Adding missing support for '%F'. Will backport to 2.3.1.	2003-08-27 04:55:52 +00:00
Walter Dörwald	9ff3f03c3e	Fix whitespace.	2003-06-18 14:17:01 +00:00
Neal Norwitz	ffe33b7f24	Attempt to make all the various string strip methods the same. Doc - add doc for when functions were added * UserString * string object methods * string module functions 'chars' is used for the last parameter everywhere. These changes will be backported, since part of the changes have already been made, but they were inconsistent.	2003-04-10 22:35:32 +00:00
Guido van Rossum	a7132189d2	Reformat a few docstrings that caused line wraps in help() output.	2003-04-09 19:32:45 +00:00
Walter Dörwald	43440a621e	Fix PyString_Format() so that '%c' % u'a' returns u'a' instead of raising a TypeError. (From SF patch #710127) Add tests to verify this is fixed. Add various tests for '%c' % int.	2003-03-31 18:07:50 +00:00
Guido van Rossum	5d9113d8be	Implement appropriate __getnewargs__ for all immutable subclassable builtin types. The special handling for these can now be removed from save_newobj(). Add some testing for this. Also add support for setting the 'fast' flag on the Python Pickler class, which suppresses use of the memo.	2003-01-29 17:58:45 +00:00
Raymond Hettinger	5d5e7c0e34	SF patch #664192 bug #661913 : inconsistent error messages between string and unicode Patch by Christopher Blunck.	2003-01-15 05:32:57 +00:00
Raymond Hettinger	0a2f849b79	GvR's idea to use memset() for the most common special case of repeating a single character. Shaves another 10% off the running time by avoiding the lg2(N) loops and cache effects for the other cases.	2003-01-06 22:42:41 +00:00
Raymond Hettinger	698258a199	Optimize string_repeat. Christian Tismer pointed out the high cost of the loop overhead and function call overhead for 'c' * n where n is large. Accordingly, the new code only makes lg2(n) loops. Interestingly, 'c' * 1000 * 1000 ran a bit faster with old code. At some point, the loop and function call overhead became cheaper than invalidating the cache with lengthy memcpys. But for more typical sizes of n, the new code runs much faster and for larger values of n it runs only a bit slower.	2003-01-06 10:33:56 +00:00
Marc-André Lemburg	79f57833f3	Patch for bug #659709 : bogus computation of float length Python 2.2.x backport candidate. (This bug has been around since Python 1.6.)	2002-12-29 19:44:06 +00:00
Raymond Hettinger	ea3fdf44a2	SF patch #659536 : Use PyArg_UnpackTuple where possible. Obtain cleaner coding and a system wide performance boost by using the fast, pre-parsed PyArg_Unpack function instead of PyArg_ParseTuple function which is driven by a format string.	2002-12-29 16:33:45 +00:00
Martin v. Löwis	00b6127097	Patch #650653 : Always raise ValueError if the table is not 256 bytes long.	2002-12-12 20:03:19 +00:00
Martin v. Löwis	79acb9edfa	Patch #614055 : Support OpenVMS.	2002-12-06 12:48:53 +00:00
Neil Schemenauer	a6cd4e65d7	Add nb_remainder (i.e. __mod__) slot to str type. Fixes SF bug #615506 .	2002-11-18 16:09:38 +00:00
Neal Norwitz	80a1bf4b5d	Fix SF # 635969, No error "not all arguments converted" When mwh added extended slicing, strings and unicode became mappings. Thus, dict was set which prevented an error when doing: newstr = 'format without a percent' % string_value This fix raises an exception again when there are no formats and % with a string value.	2002-11-12 23:01:12 +00:00
Martin v. Löwis	a5f0907d79	Back out #479898 .	2002-10-11 05:37:59 +00:00
Guido van Rossum	049cd6b563	Fix a nasty endcase reported by Armin Rigo in SF bug 618623: '%2147483647d' % -123 segfaults. This was because an integer overflow in a comparison caused the string resize to be skipped. After fixing the overflow, this could call _PyString_Resize() with a negative size, so I (1) test for that and raise MemoryError instead; (2) also added a test for negative newsize to _PyString_Resize(), raising SystemError as for all bad arguments. An identical bug existed in unicodeobject.c, of course. Will backport to 2.2.2.	2002-10-11 00:43:48 +00:00
Guido van Rossum	8052f8921e	Undo this part of the previous checkin: Also fixed an error message -- %s argument has non-string str() doesn't make sense for %r, so the error message now differentiates between %s and %r. because PyObject_Repr() and PyObject_Str() ensure that this can never happen. Added a helpful comment instead.	2002-10-09 19:14:30 +00:00
Guido van Rossum	b00c07f038	The string formatting code has a test to switch to Unicode when %s sees a Unicode argument. Unfortunately this test was also executed for %r, because %s and %r share almost all of their code. This meant that, if u is a unicode object while repr(u) is an 8-bit string containing ASCII characters, '%r' % u is a unicode string containing only ASCII characters! Fixed by executing the test only for %s. Also fixed an error message -- %s argument has non-string str() doesn't make sense for %r, so the error message now differentiates between %s and %r.	2002-10-09 19:07:53 +00:00
Martin v. Löwis	bab9559d12	Include wctype.h.	2002-10-07 18:26:16 +00:00
Martin v. Löwis	fed2405cb5	Patch #479898 : Use multibyte C library for printing strings if available.	2002-10-07 13:55:50 +00:00
Guido van Rossum	efc1188239	Fix warnings on 64-bit platforms about casts from pointers to ints. Two of these were real bugs.	2002-09-12 14:43:41 +00:00
Martin v. Löwis	2412853f8e	Fix escaping of non-ASCII characters.	2002-09-09 06:17:05 +00:00
Walter Dörwald	8709a420c4	Check whether a string resize is necessary at the end of PyString_DecodeEscape(). This prevents a call to _PyString_Resize() for the empty string, which would result in a PyErr_BadInternalCall(), because the empty string has more than one reference. This closes SF bug http://www.python.org/sf/603937	2002-09-03 13:53:40 +00:00
Walter Dörwald	3aeb632c31	PEP 293 implementation (from SF patch http://www.python.org/sf/432401 )	2002-09-02 13:14:32 +00:00
Guido van Rossum	bf935fde15	string_contains(): speed up by avoiding function calls where possible. This always called PyUnicode_Check() and PyString_Check(), at least one of which would call PyType_IsSubtype(). Also, this would call PyString_Size() on known string objects.	2002-08-24 06:57:49 +00:00
Guido van Rossum	8b1a6d694f	Code by Inyeol Lee, submitted to SF bug 595350, to implement the string/unicode method .replace() with a zero-length first argument. Inyeol contributed tests for this too.	2002-08-23 18:21:28 +00:00
Guido van Rossum	76afbd9aa4	Fix some endcase bugs in unicode rfind()/rindex() and endswith(). These were reported and fixed by Inyeol Lee in SF bug 595350. The endswith() bug was already fixed in 2.3, but this adds some more test cases.	2002-08-20 17:29:29 +00:00
Guido van Rossum	45ec02aed1	SF patch 576101, by Oren Tirosh: alternative implementation of interning. I modified Oren's patch significantly, but the basic idea and most of the implementation is unchanged. Interned strings created with PyString_InternInPlace() are now mortal, and you must keep a reference to the resulting string around; use the new function PyString_InternImmortal() to create immortal interned strings.	2002-08-19 21:43:18 +00:00
Guido van Rossum	e3a8e7ed1d	Call me anal, but there was a particular phrase that was spreading to comments everywhere that bugged me: /* Foo is inlined */ instead of /* Inline Foo */. Somehow the "is inlined" phrase always confused me for half a second (thinking, "No it isn't" until I added the missing "here"). The new phrase is hopefully unambiguous.	2002-08-19 19:26:42 +00:00
Neal Norwitz	b898d9fc9a	Get this to compile again if Py_USING_UNICODE is not defined. com_error() is static in Python/compile.c.	2002-08-16 23:20:39 +00:00
Guido van Rossum	54df53a352	More changes of DeprecationWarning to FutureWarning.	2002-08-14 18:38:27 +00:00
Martin v. Löwis	eb3f00aeeb	Check for trailing backslash. Fixes #593656 .	2002-08-14 08:22:50 +00:00
Martin v. Löwis	8a8da798a5	Patch #505705 : Remove eval in pickle and cPickle.	2002-08-14 07:46:28 +00:00
Guido van Rossum	078151da90	Implement stage B0 of PEP 237: add warnings for operations that currently return inconsistent results for ints and longs; in particular: hex/oct/%u/%o/%x/%X of negative short ints, and x<<n that either loses bits or changes sign. (No warnings for repr() of a long, though that will also change to lose the trailing 'L' eventually.) This introduces some warnings in the test suite; I'll take care of those later.	2002-08-11 04:24:12 +00:00
Barry Warsaw	817918cc3c	Committing patch #591250 which provides "str1 in str2" when str1 is a string of longer than 1 character.	2002-08-06 16:58:21 +00:00
Raymond Hettinger	bc552ce1b8	SF 582071 clarified the .split() method's docstring to note that sep=None will trigger splitting on any whitespace.	2002-08-05 06:28:21 +00:00
Neal Norwitz	88fe4ff5a9	Fix the problem of not raising a TypeError exception when doing: '%g' % '1' or '%d' % '1'. Add a test for these conditions. Fix the test so that if no exception is raised, this is a failure.	2002-07-28 16:44:23 +00:00
Neal Norwitz	7beeed5dfd	SF patch #577031 , remove PyArg_Parse() since it's deprecated	2002-07-28 15:19:47 +00:00
Martin v. Löwis	75d2d94e0f	Patch #554716 : Use __va_copy where available.	2002-07-28 10:23:27 +00:00
Jeremy Hylton	938ace69a0	staticforward bites the dust. The staticforward define was needed to support certain broken C compilers (notably SCO ODT 3.0, perhaps early AIX as well) that botched the static keyword when it was used with a forward declaration of a static initialized structure. Standard C allows the forward declaration with static, and we've decided to stop catering to broken C compilers. (In fact, we expect that the compilers are all fixed eight years later.) I'm leaving staticforward and statichere defined in object.h as static. This is only for backwards compatibility with C extensions that might still use it. XXX I haven't updated the documentation.	2002-07-17 16:30:39 +00:00
Tim Peters	3459251d5a	object.h special-build macro minefield: renamed all the new lexical helper macros to something saner, and used them appropriately in other files too, to reduce #ifdef blocks. classobject.c, instance_dealloc(): One of my worst Python Memories is trying to fix this routine a few years ago when COUNT_ALLOCS was defined but Py_TRACE_REFS wasn't. The special-build code here is way too complicated. Now it's much simpler. Difference: in a Py_TRACE_REFS build, the instance is no longer in the doubly-linked list of live objects while its __del__ method is executing, and that may be visible via sys.getobjects() called from a __del__ method. Tough -- the object is presumed dead while its __del__ is executing anyway, and not calling _Py_NewReference() at the start allows enormous code simplification. typeobject.c, call_finalizer(): The special-build instance_dealloc() pain apparently spread to here too via cut-'n-paste, and this is much simpler now too. In addition, I didn't understand why this routine was calling _PyObject_GC_TRACK() after a resurrection, since there's no plausible way _PyObject_GC_UNTRACK() could have been called on the object by this point. I suspect it was left over from pasting the instance_dealloc() code. Instead asserted that the object is still tracked. Caution: I suspect we don't have a test that actually exercises the subtype_dealloc() __del__-resurrected-me code.	2002-07-11 06:23:50 +00:00
Neal Norwitz	1f68fc7fa5	SF bug # 493951 string.{starts,ends}with vs slices Handle negative indices similar to slices.	2002-06-14 00:50:42 +00:00
Martin v. Löwis	14f8b4cfcb	Patch #568124 : Add doc string macros.	2002-06-13 20:33:02 +00:00
Michael W. Hudson	5efaf7eac8	This is my nearly two year old patch [ 400998 ] experimental support for extended slicing on lists somewhat spruced up and better tested than it was when I wrote it. Includes docs & tests. The whatsnew section needs expanding, and arrays should support extended slices -- later.	2002-06-11 10:55:12 +00:00
Neal Norwitz	32a7e7f6b6	Change name from string to basestring	2002-05-31 19:58:02 +00:00
Guido van Rossum	cacfc07d08	- A new type object, 'string', is added. This is a common base type for 'str' and 'unicode', and can be used instead of types.StringTypes, e.g. to test whether something is "a string": isinstance(x, string) is True for Unicode and 8-bit strings. This is an abstract base class and cannot be instantiated directly.	2002-05-24 19:01:59 +00:00
Raymond Hettinger	0ebac97058	Patch 549187. Improve string formatting error message.	2002-05-21 15:14:57 +00:00
Walter Dörwald	775c11f07a	Add #ifdef Py_USING_UNICODE sections, so that stringobject.c compiles again with --disable-unicode. Fixes SF bug http://www.python.org/sf/554912	2002-05-13 09:00:41 +00:00
Tim Peters	5de9842b34	Repair widespread misuse of _PyString_Resize. Since it's clear people don't understand how this function works, also beefed up the docs. The most common usage error is of this form (often spread out across gotos): if (_PyString_Resize(&s, n) < 0) { Py_DECREF(s); s = NULL; goto outtahere; } The error is that if _PyString_Resize runs out of memory, it automatically decrefs the input string object s (which also deallocates it, since its refcount must be 1 upon entry), and sets s to NULL. So if the "if" branch ever triggers, it's an error to call Py_DECREF(s): s is already NULL! A correct way to write the above is the simpler (and intended) if (_PyString_Resize(&s, n) < 0) goto outtahere; Bugfix candidate.	2002-04-27 18:44:32 +00:00
Walter Dörwald	de02bcb265	Apply patch diff.txt from SF feature request http://www.python.org/sf/444708 This adds the optional argument for str.strip to unicode.strip too and makes it possible to call str.strip with a unicode argument and unicode.strip with a str argument.	2002-04-22 17:42:37 +00:00
Walter Dörwald	0fe940c862	Return the original string only if it's a real str or unicode instance, otherwise make a copy.	2002-04-15 18:42:15 +00:00
Guido van Rossum	3aa3fc46c8	Remove 'const' from local variable declaration in string_zfill() -- it isn't constant, so why bother. Folded long lines. Whitespace normalization.	2002-04-15 13:48:52 +00:00
Walter Dörwald	068325ef92	Apply the second version of SF patch http://www.python.org/sf/536241 Add a method zfill to str, unicode and UserString and change Lib/string.py accordingly. This activates the zfill version in unicodeobject.c that was commented out and implements the same in stringobject.c. It also adds the test for unicode support in Lib/string.py back in and uses repr() instead of str() (as it was before Lib/string.py 1.62)	2002-04-15 13:36:47 +00:00
Guido van Rossum	018b0eb0f5	Partially implement SF feature request 444708. Add optional arg to string methods strip(), lstrip(), rstrip(). The optional arg specifies characters to delete. Also for UserString. Still to do: - Misc/NEWS - LaTeX docs (I did the docstrings though) - Unicode methods, and Unicode support in the string methods.	2002-04-13 00:56:08 +00:00
Neil Schemenauer	510492e985	Remove PyMalloc_New, _PyMalloc_MALLOC, and PyMalloc_Del.	2002-04-12 03:05:19 +00:00
Guido van Rossum	77f6a65eb0	Add the 'bool' type and its values 'False' and 'True', as described in PEP 285. Everything described in the PEP is here, and there is even some documentation. I had to fix 12 unit tests; all but one of these were printing Boolean outcomes that changed from 0/1 to False/True. (The exception is test_unicode.py, which did a type(x) == type(y) style comparison. I could've fixed that with a single line using issubtype(x, type(y)), but instead chose to be explicit about those places where a bool is expected.) Still to do: perhaps more documentation; change standard library modules to return False/True from predicates.	2002-04-03 22:41:51 +00:00
Tim Peters	8deda70b16	Eliminate DONT_SHARE_SHORT_STRINGS.	2002-03-30 10:06:07 +00:00
Tim Peters	1f7df3595a	Remove the CACHE_HASH and INTERN_STRINGS preprocessor symbols.	2002-03-29 03:29:08 +00:00
Neil Schemenauer	dcc819a5c9	Use pymalloc if it's enabled.	2002-03-22 15:33:15 +00:00
Andrew MacIntyre	5e9c80d906	%#x/%#X format conversion cleanup (see patch #450267 ): Objects/ stringobject.c unicodeobject.c	2002-02-28 11:38:24 +00:00
Andrew MacIntyre	c487439aa7	OS/2 EMX port changes (Objects part of patch #450267 ): Objects/ fileobject.c stringobject.c unicodeobject.c This commit doesn't include the cleanup patches for stringobject.c and unicodeobject.c which are shown separately in the patch manager. Those patches will be regenerated and applied in a subsequent commit, so as to preserve a fallback position (this commit to those files).	2002-02-26 11:36:35 +00:00
Martin v. Löwis	1f803f782c	Updated patch #487906 : Revise inline docs.	2002-01-16 10:53:24 +00:00
Guido van Rossum	169192e818	SF patch #491049 (David Jacobs): Small PyString_FromString optimization PyString_FromString(): Since the length of the string is already being stored in size, changed the strcpy() to a memcpy() for a small speed improvement.	2001-12-10 15:45:54 +00:00
Tim Peters	62de65b25e	PyString_FromString: this requires its argument be non-NULL, but doesn't check it. Added an assert() to that effect.	2001-12-06 20:29:32 +00:00
Jeremy Hylton	7802a53e38	Little stuff. Add a missing DECREF in an obscure corner. If the str() or repr() of an object passed to a string interpolation -- e.g. "%s" % obj -- returns a non-string, the returned object was leaked. Repair an indentation glitch. Replace a bunch of PyString_AsString() calls (and their ilk) with macros.	2001-12-06 15:18:48 +00:00
Martin v. Löwis	8f1ea71eab	Add more inline documentation, as contributed in #487906 .	2001-12-03 08:24:52 +00:00
Tim Peters	9161c8b0a1	PyString_FromFormatV, string_repr: document why these use sprintf instead of PyOS_snprintf; add some relevant comments and asserts.	2001-12-03 01:55:38 +00:00
Martin v. Löwis	d132750206	Patch 487906: update inline docs.	2001-12-02 18:09:41 +00:00
Tim Peters	885d457709	sprintf -> PyOS_snprintf in some "obviously safe" cases. Also changed <>-style #includes to ""-style in some places where the former didn't make sense.	2001-11-28 20:27:42 +00:00
Guido van Rossum	5c66a26dee	Make the error message for unsupported operand types cleaner, in response to a message by Laura Creighton on c.l.py. E.g. >>> 0+'' TypeError: unsupported operand types for +: 'int' and 'str' (previously this did not mention the operand types) >>> ''+0 TypeError: cannot concatenate 'str' and 'int' objects	2001-10-22 04:12:44 +00:00
Tim Peters	c993315b18	SF bug [#468061 ] __str__ ignored in str subclass. object.c, PyObject_Str: Don't try to optimize anything except exact string objects here; in particular, let str subclasses go thru tp_str, same as non-str objects. This allows overrides of tp_str to take effect. stringobject.c: + string_print (str's tp_print): If the argument isn't an exact string object, get one from PyObject_Str. + string_str (str's tp_str): Make a genuine-string copy of the object if it's of a proper str subclass type. str() applied to a str subclass that doesn't override __str__ ends up here. test_descr.py: New str_of_str_subclass() test.	2001-10-16 20:18:24 +00:00
Fred Drake	2bae4face2	Remove extra "]" in splitlines() docstring. Reported by Neal Norwitz.	2001-10-13 15:57:55 +00:00
Guido van Rossum	9475a2310d	Enable GC for new-style instances. This touches lots of files, since many types were subclassable but had a xxx_dealloc function that called PyObject_DEL(self) directly instead of deferring to self->ob_type->tp_free(self). It is permissible to set tp_free in the type object directly to _PyObject_Del, for non-GC types, or to _PyObject_GC_Del, for GC types. Still, PyObject_DEL was a tad faster, so I'm fearing that our pystone rating is going down again. I'm not sure if doing something like void xxx_dealloc(PyObject *self) { if (PyXxxCheckExact(self)) PyObject_DEL(self); else self->ob_type->tp_free(self); } is any faster than always calling the else branch, so I haven't attempted that -- however those types whose own dealloc is fancier (int, float, unicode) do use this pattern.	2001-10-05 20:51:39 +00:00
Tim Peters	c15c4f1f39	SF bug [#467265 ] Compile errors on SuSe Linux on IBM/s390. Unknown whether this fixes it. - stringobject.c, PyString_FromFormatV: don't assume that va_list is of a type that can be copied via an initializer. - errors.c, PyErr_Format: add a va_end() to balance the va_start().	2001-10-02 21:32:07 +00:00
Guido van Rossum	2ed6bf87c9	Merge branch changes (coercion, rich comparisons) into trunk.	2001-09-27 20:30:07 +00:00
Guido van Rossum	bb77e6801e	Change string comparison so that it applies even when one (or both) arguments are subclasses of str, as long as they don't override rich comparison.	2001-09-24 16:51:54 +00:00
Tim Peters	111f60964e	If interning an instance of a string subclass, intern a real string object with the same value instead. This ensures that a string (or string subclass) object's ob_sinterned pointer is always a str (or NULL), and that the dict of interned strings only has strs as keys.	2001-09-12 07:54:51 +00:00
Tim Peters	af90b3e610	str_subtype_new, unicode_subtype_new: + These were leaving the hash fields at 0, which all string and unicode routines believe is a legitimate hash code. As a result, hash() applied to str and unicode subclass instances always returned 0, which in turn confused dict operations, etc. + Changed local names "new"; no point to antagonizing C++ compilers.	2001-09-12 05:18:58 +00:00
Tim Peters	8fa5dd0601	More bug 460020: lots of string optimizations inhibited for string subclasses, all "the usual" ones (slicing etc), plus replace, translate, ljust, rjust, center and strip. I don't know how to be sure they've all been caught. Question: Should we complain if someone tries to intern an instance of a string subclass? I hate to slow any code on those paths.	2001-09-12 02:18:30 +00:00
Tim Peters	5a49ade70e	More on SF bug [#460020 ] bug or feature: unicode() and subclasses. Repaired str(i) to return a genuine string when i is an instance of a str subclass. New PyString_CheckExact() macro.	2001-09-11 01:41:59 +00:00
Guido van Rossum	29d55a38ce	Fix a memory leak in str_subtype_new(). (All the other xxx_subtype_new() functions are OK, but I goofed up in this one. :-( )	2001-08-31 16:11:15 +00:00
Guido van Rossum	ae960afb5e	Make str and tuple types subclassable.	2001-08-30 03:11:59 +00:00
Barry Warsaw	7c47beb860	Two improvements suggested by Greg Stein: PyString_FromFormatV(): In the final resize at the end, we can use PyString_AS_STRING() since we know the object is a string and can avoid the typechecking. PyString_FromFormat(): GS sez: "For safety/propriety, you should call va_end() on the vargs variable."	2001-08-27 03:11:09 +00:00
Tim Peters	6af5bbb565	PyString_FromFormatV: Massage platform %p output to match what gcc does, at least in the first two characters. %p is ill-defined, and people will forever commit bad tests otherwise ("bad" in the sense that they fall over (at least on Windows) for lack of a leading '0x'; 5 of the 7 tests in test_repr.py failed on Windows for that reason this time around).	2001-08-25 03:02:28 +00:00
Barry Warsaw	dadace004b	PyString_FromFormat() and PyString_FromFormatV(): Largely ripped from PyErr_Format() these new C API methods can be used instead of sprintf()'s into hardcoded char* buffers. This allows us to fix many situations where long package, module, or class names get truncated in reprs. PyString_FromFormat() is the varargs variety. PyString_FromFormatV() is the va_list variety. Original PyErr_Format() code was modified to allow %p and %ld expansions. Many reprs were converted to this, checkins coming soon. Not changed: complex_repr(), float_repr(), float_print(), float_str(), int_repr(). There may be other candidates not yet converted. Closes patch #454743.	2001-08-24 18:32:06 +00:00
Martin v. Löwis	339d0f720e	Patch #445762 : Support --disable-unicode - Do not compile unicodeobject, unicodectype, and unicodedata if Unicode is disabled - check for Py_USING_UNICODE in all places that use Unicode functions - disables unicode literals, and the builtin functions - add the types.StringTypes list - remove Unicode literals from most tests.	2001-08-17 18:39:25 +00:00
Martin v. Löwis	e3eb1f2b23	Patch #427190 : Implement and use METH_NOARGS and METH_O.	2001-08-16 13:15:00 +00:00
Tim Peters	6d6c1a35e0	Merge of descr-branch back into trunk.	2001-08-02 04:15:00 +00:00
Jeremy Hylton	3ce45389bd	Add _PyUnicode_AsDefaultEncodedString to unicodeobject.h. And remove all the extern decls in the middle of .c files. Apparently, it was excluded from the header file because it is intended for internal use by the interpreter. It's still intended for internal use and documented as such in the header file.	2001-07-30 22:34:24 +00:00
Tim Peters	52e155e31b	Reformat decl of new _PyString_Join. Add NEWS blurb about repr() speedup.	2001-06-16 05:42:57 +00:00
Tim Peters	a7259597f1	SF bug 433228: repr(list) woes when len(list) big. Gave Python linear-time repr() implementations for dicts, lists, strings. This means, e.g., that repr(range(50000)) is no longer 50x slower than pprint.pprint() in 2.2 <wink>. I don't consider this a bugfix candidate, as it's a performance boost. Added _PyString_Join() to the internal string API. If we want that in the public API, fine, but then it requires runtime error checks instead of asserts.	2001-06-16 05:11:17 +00:00
Marc-André Lemburg	8c2133da7b	Fix for bug #432384 : Recursion in PyString_AsEncodedString?	2001-06-12 13:14:10 +00:00
Martin v. Löwis	cd35306a25	Patch #424335 : Implement string_richcompare, remove string_compare. Use new _PyString_Eq in lookdict_string.	2001-05-24 16:56:35 +00:00
Marc-André Lemburg	2d9204199f	This patch changes the way the string .encode() method works slightly and introduces a new method .decode(). The major change is that strg.encode() will no longer try to convert Unicode returns from the codec into a string, but instead pass along the Unicode object as-is. The same is now true for all other codec return types. The underlying C APIs were changed accordingly. Note that even though this does have the potential of breaking existing code, the chances are low since conversion from Unicode previously took place using the default encoding which is normally set to ASCII rendering this auto-conversion mechanism useless for most Unicode encodings. The good news is that you can now use .encode() and .decode() with much greater ease and that the door was opened for better accessibility of the builtin codecs. As demonstration of the new feature, the patch includes a few new codecs which allow string to string encoding and decoding (rot13, hex, zip, uu, base64). Written by Marc-Andre Lemburg. Copyright assigned to the PSF.	2001-05-15 12:00:02 +00:00
Tim Peters	9c012af3c3	Heh. I need a break. After this: stropmodule & stringobject were more out of synch than I realized, and I managed to break replace's "count" argument when it was 0. All is well again. Maybe. Bugfix candidate.	2001-05-10 00:32:57 +00:00
Tim Peters	4cd44ef4bf	Fudge. stropmodule and stringobject both had copies of the buggy mymemXXX stuff, and they were already out of synch. Fix the remaining bugs in both and get them back in synch. Bugfix release candidate.	2001-05-10 00:05:33 +00:00
Tim Peters	1a97d5f098	SF patch #416247 2.1c1 stringobject: unused vrbl cleanup. Thanks to Mark Favas.	2001-05-09 20:06:00 +00:00
Tim Peters	4862ab7bf4	Sheesh -- repair the dodge around "cast isn't an lvalue" complaints to restore correct semantics.	2001-05-09 08:43:21 +00:00
Tim Peters	9e897f41db	Mark Favas reported that gcc caught me using casts as lvalues. Dodge it.	2001-05-09 07:37:07 +00:00
Tim Peters	b4bbcd76ea	Ack! Restore the COUNT_ALLOCS one_strings code.	2001-05-09 00:31:40 +00:00
Tim Peters	cf5ad5d6f6	My change to string_item() left an extra reference to each 1-character interned string created by "string"[i]. Since they're immortal anyway, this was hard to notice, but it was still wrong <wink>.	2001-05-09 00:24:55 +00:00
Tim Peters	5b4d477568	Intern 1-character strings as soon as they're created. As-is, they aren't interned when created, so the cached versions generally aren't ever interned. With the patch, the Py_INCREF(t); *p = t; Py_DECREF(s); return; indirection block in PyString_InternInPlace() is never executed during a full run of the test suite, but was executed very many times before. So I'm trading more work when creating one-character strings for doing less work later. Note that the "more work" here can happen at most 256 times per program run, so it's trivial. The same reasoning accounts for the patch's simplification of string_item (the new version can call PyString_FromStringAndSize() no more than 256 times per run, so there's no point to inlining that stuff -- if we were serious about saving time here, we'd pre-initialize the characters vector so that no runtime testing at all was needed!).	2001-05-08 22:33:50 +00:00
Tim Peters	2cfe368283	Make unicode.join() work nice with iterators. This also required a change to string.join(), so that when the latter figures out in midstream that it really needs unicode.join() instead, unicode.join() can actually get all the sequence elements (i.e., there's no guarantee that the sequence passed to string.join() can be iterated over again by unicode.join(), so string.join() must not pass on the original sequence object anymore).	2001-05-05 05:36:48 +00:00
Marc-André Lemburg	542fe56cb9	Fix for bug #417030 : "print '%*s' fails for unicode string"	2001-05-02 14:21:53 +00:00
Guido van Rossum	189f1df301	Add a proper implementation for the tp_str slot (returning self, of course), so I can get rid of the special case for strings in PyObject_Str().	2001-05-01 16:51:53 +00:00
Tim Peters	b3d8d1f76c	A different approach to the problem reported in Patch #419651: Metrowerks on Mac adds 0x itself C std says %#x and %#X conversion of 0 do not add the 0x/0X base marker. Metrowerks apparently does. Mark Favas reported the same bug under a Compaq compiler on Tru64 Unix, but no other libc broken in this respect is known (known to be OK under MSVC and gcc). So just try the damn thing at runtime and see what the platform does. Note that we've always had bugs here, but never knew it before because a relevant test case didn't exist before 2.1.	2001-04-28 05:38:26 +00:00
Guido van Rossum	59d1d2b434	Iterators phase 1. This comprises: new slot tp_iter in type object, plus new flag Py_TPFLAGS_HAVE_ITER new C API PyObject_GetIter(), calls tp_iter new builtin iter(), with two forms: iter(obj), and iter(function, sentinel) new internal object types iterobject and calliterobject new exception StopIteration new opcodes for "for" loops, GET_ITER and FOR_ITER (also supported by dis.py) new magic number for .pyc files new special method for instances: __iter__() returns an iterator iteration over dictionaries: "for x in dict" iterates over the keys iteration over files: "for x in file" iterates over lines TODO: documentation test suite decide whether to use a different way to spell iter(function, sentinel) decide whether "for key in dict" is a good idea use iterators in map/filter/reduce, min/max, and elsewhere (in/not in?) speed tuning (make next() a slot tp_next???)	2001-04-20 19:13:02 +00:00
Tim Peters	fff5325078	Bug 415514 reported that e.g. "%#x" % 0 blew up, at heart because C sprintf supplies a base marker if and only if the value is not 0. I then fixed that, by tolerating C's inconsistency when it does %#x, and taking away that Python produced 0x0 when formatting 0L (the "long" flavor of 0) under %#x itself. But after talking with Guido, we agreed it would be better to supply 0x for the short int case too, despite that it's inconsistent with C, because C is inconsistent with itself and with Python's hex(0) (plus, while "%#x" % 0 didn't work before, "%#x" % 0L did, and returned "0x0"). Similarly for %#X conversion.	2001-04-12 18:38:48 +00:00
Tim Peters	711088d9b8	Fix for SF bug #415514 : "%#x" % 0 caused assertion failure/abort. http://sourceforge.net/tracker/index.php?func=detail&aid=415514&group_id=5470&atid=105470 For short ints, Python defers to the platform C library to figure out what %#x should do. The code asserted that the platform C returned a string beginning with "0x". However, that's not true when-- and only when --the value being formatted is 0. Changed the code to live with C's inconsistency here. In the meantime, the problem does not arise if you format a long 0 (0L) instead. However, that's because the code we wrote to do %#x conversions on longs produces a leading "0x" regardless of value. That's probably wrong too: we should drop leading "0x", for consistency with C, when (& only when) formatting 0L. So I changed the long formatting code to do that too.	2001-04-12 00:35:51 +00:00
Barry Warsaw	a903ad9855	_Py_ReleaseInternedStrings(): Private API function to decref and release the interned string dictionary. This is useful for memory use debugging because it eliminates a huge source of noise from the reports. Only defined when INTERN_STRINGS is defined.	2001-02-23 16:40:48 +00:00
Ka-Ping Yee	fa004ad36c	Show '\011', '\012', and '\015' as '\t', '\n', '\r' in strings. Switch from octal escapes to hex escapes for other nonprintable characters.	2001-01-24 17:19:08 +00:00
Tim Peters	19fe14e76a	Derivative of patch #102549 , "simpler, faster(!) implementation of string.join". Also fixes two long-standing bugs (present in 2.0): 1. .join() didn't check that the result size fit in an int. 2. string.join(s) when len(s)==1 returned s[0] regardless of s[0]'s type; e.g., "".join([3]) returned 3 (overly optimistic optimization). I resisted a keen temptation to make .join() apply str() automagically.	2001-01-19 03:03:47 +00:00
Marc-André Lemburg	3a645e4dd4	Added checks to prevent PyUnicode_Count() from dumping core in case the parameters are out of bounds and fixes error handling for .count(), .startswith() and .endswith() for the case of mixed string/Unicode objects. This patch adds Python style index semantics to PyUnicode_Count() indices (including the special handling of negative indices). The patch is an extended version of patch #103249 submitted by Michael Hudson (mwh) on SF. It also includes new test cases.	2001-01-16 11:54:12 +00:00
Andrew M. Kuchling	6ca8917758	[ Patch #102852 ] Make % error a bit more informative by indicating the index at which an unknown %-escape was found	2000-12-15 13:07:46 +00:00
Fred Drake	49312a52ec	Jeffrey D. Collins <tokeneater@users.sourceforge.net>: Fix type of the self parameter to some string object methods. This closes patch #102670.	2000-12-06 14:27:49 +00:00
Tim Peters	a3a3a030af	Fix for SF bug #123859 : %[duxXo] long formats inconsistent.	2000-11-30 05:22:44 +00:00
Guido van Rossum	2ccda8a7c4	SF patch #102548 , fix for bug #121013 , by mwh@users.sourceforge.net. Fixes a typo that caused "".join(u"this is a test") to dump core.	2000-11-27 18:46:26 +00:00
Fred Drake	661ea26b3d	Ka-Ping Yee <ping@lfw.org>: Changes to error messages to increase consistency & clarity. This (mostly) closes SourceForge patch #101839.	2000-10-24 19:57:45 +00:00
Marc-André Lemburg	53f3d4ac74	[ Bug #116174 ] using %% in cstrings sometimes fails with unicode params. Fix for the bug reported in Bug #116174 : "%% %s" % u"abc" failed due to the way string formatting delegated work to the Unicode formatting function.	2000-10-07 08:54:09 +00:00
Fred Drake	d5fadf75e4	Rationalize use of limits.h, moving the inclusion to Python.h. Add definitions of INT_MAX and LONG_MAX to pyport.h. Remove includes of limits.h and conditional definitions of INT_MAX and LONG_MAX elsewhere. This closes SourceForge patch #101659 and bug #115323.	2000-09-26 05:46:01 +00:00
Tim Peters	38fd5b6413	Derived from Martin's SF patch 110609: support unbounded ints in %d,i,u,x,X,o formats. Note a curious extension to the std C rules: x, X and o formatting can never produce a sign character in C, so the '+' and ' ' flags are meaningless for them. But unbounded ints can produce a sign character under these conversions (no fixed- width bitstring is wide enough to hold all negative values in 2's-comp form). So these flags become meaningful in Python when formatting a Python long which is too big to fit in a C long. This required shuffling around existing code, which hacked x and X conversions to death when both the '#' and '0' flags were specified: the hacks weren't strong enough to deal with the simultaneous possibility of the ' ' or '+' flags too, since signs were always meaningless before for x and X conversions. Isomorphic shuffling was required in unicodeobject.c. Also added dozens of non-trivial new unbounded-int test cases to test_format.py.	2000-09-21 05:43:11 +00:00
Marc-André Lemburg	d1ba443206	This patch adds a new Python C API called PyString_AsStringAndSize() which implements the automatic conversion from Unicode to a string object using the default encoding. The new API is then put to use to have eval() and exec accept Unicode objects as code parameter. This closes bugs #110924 and #113890. As side-effect, the traditional C APIs PyString_Size() and PyString_AsString() will also accept Unicode objects as parameters.	2000-09-19 21:04:18 +00:00
Tim Peters	8f422461b4	Fix for bug 113934. stringn and unicoden did no overflow checking at all, either to see whether the # of chars fit in an int, or that the amount of memory needed fit in a size_t. Checking these is expensive, but the alternative is silently wrong answers (as in the bug report) or core dumps (which were easy to provoke using Unicode strings).	2000-09-09 06:13:41 +00:00
Guido van Rossum	8586991099	REMOVED all CWI, CNRI and BeOpen copyright markings. This should match the situation in the 1.6b1 tree.	2000-09-01 23:29:29 +00:00
Barry Warsaw	4df762ff98	Insure properly identifies the `interned' dictionary as leaking at shutdown time, but CVS log entry for revision 2.45 explains why this is so. Simply include a comment so we don't have to re-figure it out again 5 years from now.	2000-08-16 23:41:01 +00:00
Peter Schneider-Kamp	7e01890986	merge Include/my.h into Include/pyport.h; marked my.h as obsolete	2000-07-31 15:28:04 +00:00
Thomas Wouters	7e47402264	Spelling fixes supplied by Rob W. W. Hooft. All these are fixes in either comments, docstrings or error messages. I fixed two minor things in test_winreg.py ("didn't" -> "Didn't" and "Didnt" -> "Didn't"). There is a minor style issue involved: Guido seems to have preferred English grammar (behaviour, honour) in a couple places. This patch changes that to American, which is the more prominent style in the source. I prefer English myself, so if English is preferred, I'd be happy to supply a patch myself ;)	2000-07-16 12:04:32 +00:00
Jeremy Hylton	03657cfdb0	replace PyXXX_Length calls with PyXXX_Size calls	2000-07-12 13:05:33 +00:00
Andrew M. Kuchling	bd9848d02f	Fix typo in error message	2000-07-12 02:58:28 +00:00
Jeremy Hylton	88887aa38e	small updates to string_join: use PyString_AS_STRING macro on local string object; when resizing string, make sure resized string will always be big enough; split string containing error message across two lines; add test to string_tests that causes resizing	2000-07-11 20:55:38 +00:00
Barry Warsaw	771d0675b6	string_join(): Some cleaning up of reference counting. In the seqlen==1 clause, before returning item, we need to DECREF seq. In the res=PyString... failure clause, we need to goto finally to also decref seq (and the DECREF of res in finally is changed to a XDECREF). Also, we need to DECREF seq just before the PyUnicode_Join() return.	2000-07-11 04:58:12 +00:00
Jeremy Hylton	4904829dbf	fix two refcount bugs in new string_join implementation: 1. PySequence_Fast_GET_ITEM is a macro and borrows a reference 2. The seq returned from PySequence_Fast must be decref'd	2000-07-11 03:28:17 +00:00
Jeremy Hylton	194e43e953	two changes to string_join: implementation -- use PySequence_Fast interface to iterate over elements; interface -- if instance object reports wrong length, ignore it; previous version raised an IndexError if reported length was too high	2000-07-10 21:30:28 +00:00
Tim Peters	c2e7da9859	Somebody started playing with const, so of course the outcome was cascades of warnings about mismatching const decls. Overall, I think const creates lots of headaches and solves almost nothing. Added enough consts to shut up the warnings, but this did require casting away const in one spot too (another usual outcome of starting down this path): the function mymemreplace can't return const char, but sometimes wants to return its first argument as-is, which latter must be declared const char in order to avoid const warnings at mymemreplace's call sites. So, in the case the function wants to return the first arg, that arg's declared constness must be subverted.	2000-07-09 08:02:21 +00:00
Fred Drake	ba09633e1e	ANSI-fication of the sources.	2000-07-09 07:04:36 +00:00
Marc-André Lemburg	63f3d17418	Added new codec APIs and a new interface method .encode() which works just like the Unicode one. The C APIs match the ones in the Unicode implementation, but were extended to be able to reuse the existing Unicode codecs for string purposes too. Conversions from string to Unicode and back are done using the default encoding.	2000-07-06 11:29:01 +00:00
Marc-André Lemburg	4027f8f4b3	Added new .isalpha() and .isalnum() methods to match the same ones on the Unicode objects. Note that the string versions use the (locale aware) C lib APIs isalpha() and isalnum().	2000-07-05 09:47:46 +00:00
Guido van Rossum	ffcc3813d8	Change copyright notice - 2nd try.	2000-06-30 23:58:06 +00:00
Guido van Rossum	fd71b9e9d4	Change copyright notice.	2000-06-30 23:50:40 +00:00
Marc-André Lemburg	f28dd83b86	Marc-Andre Lemburg <mal@lemburg.com>: New buffer overflow checks for formatting strings. By Trent Mick.	2000-06-30 10:29:57 +00:00
Fred Drake	396f6e0d6a	Fredrik Lundh <effbot@telia.com>: Simplify find code; this is a performance improvement on at least some platforms.	2000-06-20 15:47:54 +00:00
Marc-André Lemburg	60bc809d9a	Marc-Andre Lemburg <mal@lemburg.com>: Added code so that .isXXX() testing returns 0 for empty strings.	2000-06-14 09:18:32 +00:00
Andrew M. Kuchling	cb95a1470a	Patch from Michael Hudson: improve unclear error message	2000-06-09 14:04:53 +00:00
Fred Drake	b6a9ada757	Michael Hudson <mwh21@cam.ac.uk>: Removed PyErr_BadArgument() calls and replaced them with more useful error messages.	2000-06-01 03:12:13 +00:00
Guido van Rossum	c682140de7	Trent Mick: Fix the string methods that implement slice-like semantics with optional args (count, find, endswith, etc.) to properly handle indices outside [INT_MIN, INT_MAX]. Previously the "i" formatter for PyArg_ParseTuple was used to get the indices. These could overflow. This patch changes the string methods to use the "O&" formatter with the slice_index() function from ceval.c which is used to do the same job for Python code slices (e.g. 'abcabcabc'[0:1000000000L]). slice_index() is renamed _PyEval_SliceIndex() and is now exported. As well, the return values for success/fail were changed to make slice_index directly usable as required by the "O&" formatter. [GvR: shouldn't a similar patch be applied to unicodeobject.c?]	2000-05-08 14:08:05 +00:00
Guido van Rossum	b8f820c5a9	The methods islower(), isupper(), isspace(), isdigit() and istitle() gave bogus results for chars in the range 128-255, because their implementation was using signed characters. Fixed this by using unsigned character pointers (as opposed to using Py_CHARMASK()).	2000-05-05 20:44:24 +00:00
Guido van Rossum	b18618dab7	Vladimir Marangozov's long-awaited malloc restructuring. For more comments, read the patches@python.org archives. For documentation read the comments in mymalloc.h and objimpl.h. (This is not exactly what Vladimir posted to the patches list; I've made a few changes, and Vladimir sent me a fix in private email for a problem that only occurs in debug mode. I'm also holding back on his change to main.c, which seems unnecessary to me.)	2000-05-03 23:44:39 +00:00
Guido van Rossum	f0b7b04ae8	Marc-Andre Lemburg: The maxsplit functionality in .splitlines() was replaced by the keepends functionality which allows keeping the line end markers together with the string. Added support for '%r' % obj: this inserts repr(obj) rather than str(obj).	2000-04-11 15:39:26 +00:00
Guido van Rossum	90daa87569	Marc-Andre Lemburg: * string_contains now calls PyUnicode_Contains() only when the other operand is a Unicode string (not whenever it's not a string). * New format style '%r' inserts repr(arg) instead of str(arg). * '...%s...' % u"abc" now coerces to Unicode just like string methods. Care is taken not to reevaluate already formatted arguments -- only the first Unicode object appearing in the argument mapping is looked up twice. Added test cases for this to test_unicode.py.	2000-04-10 13:47:21 +00:00

315 Commits