cpython

Commit Graph

Author	SHA1	Message	Date
Neil Schemenauer	ce30bc9f49	Add nb_remainder (i.e. __mod__) slot to unicode type. Fixes SF bug #615506.	2002-11-18 16:10:18 +00:00
Neal Norwitz	80a1bf4b5d	Fix SF #635969, No error "not all arguments converted". When mwh added extended slicing, strings and unicode became mappings. Thus, dict was set, which prevented an error when doing: newstr = 'format without a percent' % string_value. This fix raises an exception again when there are no formats and % with a string value.	2002-11-12 23:01:12 +00:00
Marc-André Lemburg	9cd87aaa54	Fix for bug #626172: crash using unicode latin1 single char. Python 2.2.3 candidate.	2002-10-23 09:02:46 +00:00
Guido van Rossum	049cd6b563	Fix a nasty endcase reported by Armin Rigo in SF bug 618623: '%2147483647d' % -123 segfaults. This was because an integer overflow in a comparison caused the string resize to be skipped. After fixing the overflow, this could call _PyString_Resize() with a negative size, so I (1) test for that and raise MemoryError instead; (2) also added a test for negative newsize to _PyString_Resize(), raising SystemError as for all bad arguments. An identical bug existed in unicodeobject.c, of course. Will backport to 2.2.2.	2002-10-11 00:43:48 +00:00
Marc-André Lemburg	24e53b6d91	Add cast to avoid compiler warning.	2002-09-24 09:32:14 +00:00
Neal Norwitz	a0378e1eda	Fix part of SF bug #544248: gcc warning in unicodeobject.c. When --enable-unicode=ucs4, need to cast Py_UNICODE to a char.	2002-09-13 13:47:06 +00:00
Guido van Rossum	efc1188239	Fix warnings on 64-bit platforms about casts from pointers to ints. Two of these were real bugs.	2002-09-12 14:43:41 +00:00
Walter Dörwald	5c1ee17742	Change the unicode.translate docstring to document that Unicode strings (with arbitrary length) are allowed as entries in the unicode.translate mapping. Add a test case for multicharacter replacements. (Multicharacter replacements were enabled by the PEP 293 patch)	2002-09-04 20:31:32 +00:00
Walter Dörwald	3aeb632c31	PEP 293 implementation (from SF patch http://www.python.org/sf/432401)	2002-09-02 13:14:32 +00:00
Guido van Rossum	2023c9b84a	Fix SF bug 599128, submitted by Inyeol Lee: .replace() would do the wrong thing for a unicode subclass when there were zero string replacements. The example given in the SF bug report was only one way to trigger this; replacing a string of length >= 2 that's not found is another. The code would actually write outside allocated memory if replacement string was longer than the search string. (I wonder how many more of these are lurking? The unicode code base is full of wonders.) Bugfix candidate; this same bug is present in 2.2.1.	2002-08-23 18:50:21 +00:00
Guido van Rossum	8b1a6d694f	Code by Inyeol Lee, submitted to SF bug 595350, to implement the string/unicode method .replace() with a zero-length first argument. Inyeol contributed tests for this too.	2002-08-23 18:21:28 +00:00
Guido van Rossum	76afbd9aa4	Fix some endcase bugs in unicode rfind()/rindex() and endswith(). These were reported and fixed by Inyeol Lee in SF bug 595350. The endswith() bug was already fixed in 2.3, but this adds some more test cases.	2002-08-20 17:29:29 +00:00
Guido van Rossum	54df53a352	More changes of DeprecationWarning to FutureWarning.	2002-08-14 18:38:27 +00:00
Marc-André Lemburg	cc8764ca9d	Add C API PyUnicode_FromOrdinal() which exposes unichr() at C level. u'%c' will now raise a ValueError in case the argument is an integer outside the valid range of Unicode code point ordinals. Closes SF bug #593581.	2002-08-11 12:23:04 +00:00
Guido van Rossum	078151da90	Implement stage B0 of PEP 237: add warnings for operations that currently return inconsistent results for ints and longs; in particular: hex/oct/%u/%o/%x/%X of negative short ints, and x<<n that either loses bits or changes sign. (No warnings for repr() of a long, though that will also change to lose the trailing 'L' eventually.) This introduces some warnings in the test suite; I'll take care of those later.	2002-08-11 04:24:12 +00:00
Guido van Rossum	f36921c4b0	Unicode replace() method with empty pattern argument should fail, like it does for 8-bit strings.	2002-08-09 15:36:48 +00:00
Barry Warsaw	6a043f3fe8	PyUnicode_Contains(): The memcmp() call didn't take into account the width of Py_UNICODE. Good catch, MAL.	2002-08-06 19:03:17 +00:00
Barry Warsaw	817918cc3c	Committing patch #591250 which provides "str1 in str2" when str1 is a string of longer than 1 character.	2002-08-06 16:58:21 +00:00
Skip Montanaro	35b37a5c11	tighten up the unicode object's docstring a tad	2002-07-26 16:22:46 +00:00
Jeremy Hylton	938ace69a0	staticforward bites the dust. The staticforward define was needed to support certain broken C compilers (notably SCO ODT 3.0, perhaps early AIX as well) that botched the static keyword when it was used with a forward declaration of a static initialized structure. Standard C allows the forward declaration with static, and we've decided to stop catering to broken C compilers. (In fact, we expect that the compilers are all fixed eight years later.) I'm leaving staticforward and statichere defined in object.h as static. This is only for backwards compatibility with C extensions that might still use it. XXX I haven't updated the documentation.	2002-07-17 16:30:39 +00:00
Martin v. Löwis	6238d2b024	Patch #569753: Remove support for WIN16. Rename all occurrences of MS_WIN32 to MS_WINDOWS.	2002-06-30 15:26:10 +00:00
Neal Norwitz	20e72130c4	Fix typo in exception message	2002-06-13 21:25:17 +00:00
Martin v. Löwis	14f8b4cfcb	Patch #568124: Add doc string macros.	2002-06-13 20:33:02 +00:00
Michael W. Hudson	5efaf7eac8	This is my nearly two year old patch [ 400998 ] experimental support for extended slicing on lists somewhat spruced up and better tested than it was when I wrote it. Includes docs & tests. The whatsnew section needs expanding, and arrays should support extended slices -- later.	2002-06-11 10:55:12 +00:00
Marc-André Lemburg	4164439240	Fix a possible segfault. Found by Neal Norwitz.	2002-05-29 13:46:29 +00:00
Marc-André Lemburg	4da6fd63bc	Fix for bug [ 561796 ] string.find causes lazy error	2002-05-29 11:33:13 +00:00
Guido van Rossum	cacfc07d08	- A new type object, 'string', is added. This is a common base type for 'str' and 'unicode', and can be used instead of types.StringTypes, e.g. to test whether something is "a string": isinstance(x, string) is True for Unicode and 8-bit strings. This is an abstract base class and cannot be instantiated directly.	2002-05-24 19:01:59 +00:00
Raymond Hettinger	0ebac97058	Patch 549187. Improve string formatting error message.	2002-05-21 15:14:57 +00:00
Tim Peters	5de9842b34	Repair widespread misuse of _PyString_Resize. Since it's clear people don't understand how this function works, also beefed up the docs. The most common usage error is of this form (often spread out across gotos): if (_PyString_Resize(&s, n) < 0) { Py_DECREF(s); s = NULL; goto outtahere; } The error is that if _PyString_Resize runs out of memory, it automatically decrefs the input string object s (which also deallocates it, since its refcount must be 1 upon entry), and sets s to NULL. So if the "if" branch ever triggers, it's an error to call Py_DECREF(s): s is already NULL! A correct way to write the above is the simpler (and intended) if (_PyString_Resize(&s, n) < 0) goto outtahere; Bugfix candidate.	2002-04-27 18:44:32 +00:00
Tim Peters	602f740bc2	SF patch 549375: Compromise PyUnicode_EncodeUTF8. This implements ideas from Marc-Andre, Martin, Guido and me on Python-Dev. "Short" Unicode strings are encoded into a "big enough" stack buffer, then exactly as much string space as they turn out to need is allocated at the end. This should have speed benefits akin to Martin's "measure once, allocate once" strategy, but without needing a distinct measuring pass. "Long" Unicode strings allocate as much heap space as they could possibly need (4 x # Unicode chars), and do a realloc at the end to return the untouched excess. Since the overallocation is likely to be substantial, this shouldn't burden the platform realloc with unusably small excess blocks. Also simplified uses of the PyString_xyz functions. Also added a release-build check that 4*size doesn't overflow a C int. Sooner or later, that's going to happen.	2002-04-27 18:03:26 +00:00
Tim Peters	030a5cebf4	unicode_memchr(): Squashed gratuitous int-vs-size_t mismatch (which gives a compiler wng under MSVC because of the resulting signed-vs-unsigned comparison).	2002-04-22 19:00:10 +00:00
Walter Dörwald	de02bcb265	Apply patch diff.txt from SF feature request http://www.python.org/sf/444708 This adds the optional argument for str.strip to unicode.strip too and makes it possible to call str.strip with a unicode argument and unicode.strip with a str argument.	2002-04-22 17:42:37 +00:00
Tim Peters	0eca65c4c5	PyUnicode_EncodeUTF8(): tightened the memory asserts a bit, and at least tried to catch some possible arithmetic overflows in the debug build.	2002-04-21 17:28:06 +00:00
Martin v. Löwis	2a7ff35a07	Back out 2.140.	2002-04-21 09:59:45 +00:00
Tim Peters	7e3d961fc1	PyUnicode_EncodeUTF8: squash compiler wng. The difference of two pointers is a signed type. Changing "allocated" to a signed int makes undetected overflow more likely, but there was no overflow detection before either.	2002-04-21 03:26:37 +00:00
Martin v. Löwis	a4eb14b7a4	Patch #495401: Count number of required bytes for encoding UTF-8 before allocating the target buffer.	2002-04-20 13:44:01 +00:00
Walter Dörwald	0fe940c862	Return the original string only if it's a real str or unicode instance, otherwise make a copy.	2002-04-15 18:42:15 +00:00
Walter Dörwald	068325ef92	Apply the second version of SF patch http://www.python.org/sf/536241 Add a method zfill to str, unicode and UserString and change Lib/string.py accordingly. This activates the zfill version in unicodeobject.c that was commented out and implements the same in stringobject.c. It also adds the test for unicode support in Lib/string.py back in and uses repr() instead of str() (as it was before Lib/string.py 1.62)	2002-04-15 13:36:47 +00:00
Neil Schemenauer	58aa861fa2	Remove PyMalloc_*.	2002-04-12 03:07:20 +00:00
Marc-André Lemburg	68e69338ae	Bug fix for UTF-8 encoding bug (buffer overrun) #541828.	2002-04-10 20:36:13 +00:00
Marc-André Lemburg	ce0b664af2	Added test case for UTF-8 encoding bug #541828.	2002-04-10 17:18:02 +00:00
Guido van Rossum	77f6a65eb0	Add the 'bool' type and its values 'False' and 'True', as described in PEP 285. Everything described in the PEP is here, and there is even some documentation. I had to fix 12 unit tests; all but one of these were printing Boolean outcomes that changed from 0/1 to False/True. (The exception is test_unicode.py, which did a type(x) == type(y) style comparison. I could've fixed that with a single line using issubtype(x, type(y)), but instead chose to be explicit about those places where a bool is expected.) Still to do: perhaps more documentation; change standard library modules to return False/True from predicates.	2002-04-03 22:41:51 +00:00
Walter Dörwald	8c077227f2	Fix whitespace.	2002-03-25 11:16:18 +00:00
Neil Schemenauer	dcc819a5c9	Use pymalloc if it's enabled.	2002-03-22 15:33:15 +00:00
Martin v. Löwis	047c05ebc4	Do not insert characters for unicode-escape decoders if the error mode is "ignore". Fixes #529104.	2002-03-21 08:55:28 +00:00
Andrew MacIntyre	5e9c80d906	%#x/%#X format conversion cleanup (see patch #450267): Objects/ stringobject.c unicodeobject.c	2002-02-28 11:38:24 +00:00
Andrew MacIntyre	c487439aa7	OS/2 EMX port changes (Objects part of patch #450267): Objects/ fileobject.c stringobject.c unicodeobject.c This commit doesn't include the cleanup patches for stringobject.c and unicodeobject.c which are shown separately in the patch manager. Those patches will be regenerated and applied in a subsequent commit, so as to preserve a fallback position (this commit to those files).	2002-02-26 11:36:35 +00:00
Marc-André Lemburg	bd3be8f0ca	Fix to the UTF-8 encoder: it failed on 0-length input strings. Fix for the UTF-8 decoder: it will now accept isolated surrogates (previously it raised an exception which causes round-trips to fail). Added new tests for UTF-8 round-trip safety (we rely on UTF-8 for marshalling Unicode objects, so we better make sure it works for all Unicode code points, including isolated surrogates). Bumped the PYC magic in a non-standard way -- please review. This was needed because the old PYC format used illegal UTF-8 sequences for isolated high surrogates which now raise an exception.	2002-02-07 11:33:49 +00:00
Marc-André Lemburg	dc724d6e35	Cosmetics.	2002-02-06 18:20:19 +00:00
Marc-André Lemburg	e7c6ee4b8a	Whitespace fixes.	2002-02-06 18:18:03 +00:00


175 Commits
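
Several of the commits listed above change behaviour that is visible from Python code: multi-character "str1 in str2", zfill, strip with an argument, replace() with a zero-length first argument, and the "not all arguments converted" error. The sketch below exercises those behaviours against the str type of a current Python 3 interpreter. That is an assumption made for illustration: the 2002 commits targeted the Python 2.2/2.3 unicode type, whose behaviour may have differed in detail. The function name demo is hypothetical.

```python
# Illustrative only: runs against modern Python 3 str, not the
# Python 2.2/2.3 unicode type that the commits above modified.

def demo():
    """Collect a few string behaviours mentioned in the commit log."""
    results = {}

    # Multi-character substring containment ("str1 in str2").
    results["contains"] = "bc" in "abcd"

    # zfill pads with zeros on the left, keeping a leading sign in front.
    results["zfill"] = "-42".zfill(5)

    # strip with an explicit set of characters to remove.
    results["strip"] = "xxhixx".strip("x")

    # replace() with a zero-length first argument inserts the replacement
    # before every character and at the end.
    results["replace_empty"] = "abc".replace("", "-")

    # '%' with a string value but no format specifier raises an error.
    try:
        "format without a percent" % "value"
        results["percent_error"] = None
    except TypeError as exc:
        results["percent_error"] = str(exc)

    return results


if __name__ == "__main__":
    for name, value in demo().items():
        print(name, "->", repr(value))
```

Running the file prints one line per behaviour; the dictionary returned by demo() can also be inspected directly.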