Commit Graph

74 Commits

Author SHA1 Message Date
Thomas Wouters 3ccec68a05 Improve extended slicing support in builtin types and classes. Specifically:
- Specialcase extended slices that amount to a shallow copy the same way as
   is done for simple slices, in the tuple, string and unicode case.

 - Specialcase step-1 extended slices to optimize the common case for all
   involved types.

 - For lists, allow extended slice assignment of differing lengths as long
   as the step is 1. (Previously, 'l[:2:1] = []' failed even though
   'l[:2] = []' and 'l[:2:None] = []' do not.)

 - Implement extended slicing for buffer, array, structseq, mmap and
   UserString.UserString.

 - Implement slice-object support (but not non-step-1 slice assignment) for
   UserString.MutableString.

 - Add tests for all new functionality.
2007-08-28 15:28:19 +00:00
Neal Norwitz 5c9a81a3d8 Fix a bug when there was a newline in the string expandtabs was called on.
This also catches another condition that can overflow.

Will backport.
2007-06-11 02:16:10 +00:00
Kristján Valur Jónsson 170eee9d6a Fix those parts in the testsuite that assumed that sys.maxint would cause overflow on x64. Now the testsuite is well behaved on that platform. 2007-05-03 20:09:56 +00:00
Neal Norwitz 0d4c06e06e Whitespace normalization. Ugh, we really need to do this more often.
You might want to review this change as it's my first time.  Be gentle. :-)
2007-04-25 06:30:05 +00:00
Raymond Hettinger 4db5fe970c SF 1193128: Let str.translate(None) be an identity transformation 2007-04-12 04:10:00 +00:00
Raymond Hettinger a0c95fa4d8 Fix endcase for str.rpartition() 2006-09-04 15:32:48 +00:00
Neal Norwitz f71ec5a0ac Bug #1515471: string.replace() accepts character buffers again.
Pass the char* and size around rather than PyObject's.
2006-07-30 06:57:04 +00:00
Georg Brandl 90e27d38f5 Apply perky's fix for #1503157: "/".join([u"", u""]) raising OverflowError.
Also improve error message on overflow.
2006-06-10 06:40:50 +00:00
Georg Brandl 242508160e RFE #1491485: str/unicode.endswith()/startswith() now accept a tuple as first argument. 2006-06-09 18:45:48 +00:00
Tim Peters 80a18f0f9c Re-enable a new empty-string test added during the NFS sprint,
but disabled then because str and unicode strings gave different
results.  The implementations were repaired later during the
sprint, but the new test remained disabled.
2006-06-01 13:56:26 +00:00
Fredrik Lundh 9e9ef9fa5a changed count to return 0 for slices outside the source string 2006-05-30 17:39:58 +00:00
Fredrik Lundh 93eff6fecd changed find/rfind to return -1 for matches outside the source string 2006-05-30 17:11:48 +00:00
Fredrik Lundh b51b470eb8 fixed "abc".count("", 100) == -96 error (hopefully, nobody's relying on
the current behaviour ;-)
2006-05-29 22:42:07 +00:00
Fredrik Lundh 9c0e9c089c needspeed: rpartition documentation, tests, and a bug fixes.
feel free to add more tests and improve the documentation.
2006-05-26 18:24:15 +00:00
Andrew Dalke 725fe4089d Test for more edge strip cases; leading and trailing separator gets removed
even with strip(..., 0)
2006-05-26 16:22:52 +00:00
Andrew Dalke 669fa188b1 Added more rstrip tests, including for prealloc'ed arrays 2006-05-26 13:05:55 +00:00
Andrew Dalke 5cc6009f0d Test cases for off-by-one errors in string split with multicharacter pattern. 2006-05-26 12:31:00 +00:00
Andrew Dalke 005aee2c39 I like tests.
The new split functions use a preallocated list.  Added tests which exceed
the preallocation size, to exercise list appends/resizes.

Also added more edge case tests.
2006-05-26 12:28:15 +00:00
Andrew Dalke 03fb444990 Added split whitespace checks for characters other than space. 2006-05-26 11:15:22 +00:00
Andrew Dalke 984b971341 Added a few more test cases for whitespace split. These strings have leading whitespace. 2006-05-26 11:11:38 +00:00
Fredrik Lundh 06a69dd8ff needforspeed: partition implementation, part two.
feel free to improve the documentation and the docstrings.
2006-05-26 08:54:28 +00:00
Tim Peters d95d593f47 Whitespace normalization. 2006-05-25 21:52:19 +00:00
Fredrik Lundh 0c71f88fc9 needforspeed: check for overflow in replace (from Andrew Dalke) 2006-05-25 16:46:54 +00:00
Andrew Dalke 2bddcbf10e Added tests for implementation error we came up with in the need for speed sprint. 2006-05-25 16:30:52 +00:00
Tim Peters f4049089c5 Disable the damn empty-string replace test -- it can't
be make to pass now for unicode if it passes for str, or
vice versa.
2006-05-24 21:00:45 +00:00
Tim Peters beaec0c3a1 We can't leave the checked-in tests broken. 2006-05-24 20:27:18 +00:00
Andrew Dalke e5488ec01e Added a slew of test for string replace, based various corner cases from
the Need For Speed sprint coding.  Includes commented out overflow tests
which will be uncommented once the code is fixed.

This test will break the 8-bit string tests because
    "".replace("", "A") == "" when it should == "A"

We have a fix for it, which should be added tomorrow.
2006-05-24 18:55:37 +00:00
Martin v. Löwis 18e165558b Merge ssize_t branch. 2006-02-15 17:27:45 +00:00
Michael W. Hudson b2308bb9be Fix bug:
[ 1327110 ] wrong TypeError traceback in generator expressions

by removing the code that can stomp on the users' TypeError raised by the
iterable argument to ''.join() -- PySequence_Fast (now?) gives a perfectly
reasonable message itself.  Also, a couple of tests.
2005-10-21 11:45:01 +00:00
Walter Dörwald 6eea789fd2 Disable encoding/decoding test, if unicode is disabled. 2005-07-28 16:49:15 +00:00
Raymond Hettinger 57e7447c44 * Beef-up tests for str.count().
* Speed-up str.count() by using memchr() to fly between first char matches.
2005-02-20 09:54:53 +00:00
Raymond Hettinger 7cbf1bcb3e * Beef-up testing of str.__contains__() and str.find().
* Speed-up "x in y" where x has more than one character.

The existing code made excessive calls to the expensive memcmp() function.
The new code uses memchr() to rapidly find a start point for memcmp().
In addition to knowing that the first character is a match, the new code
also checks that the last character is a match.  This significantly reduces
the incidence of false starts (saving memcmp() calls and making quadratic
behavior less likely).

Improves the timings on:
    python -m timeit -r7 -s"x='a'*1000" "'ab' in x"
    python -m timeit -r7 -s"x='a'*1000" "'bc' in x"

Once this code has proven itself, then string_find_internal() should refer
to it rather than running its own version.  Also, something similar may
apply to unicode objects.
2005-02-20 04:07:08 +00:00
Raymond Hettinger 561fbf138d SF bug #1054139: serious string hashing error in 2.4b1
_PyString_Resize() readied strings for mutation but did not invalidate
the cached hash value.
2004-10-26 01:52:37 +00:00
Tim Peters 108f137519 test_bug1001011(): Verify that
s.join([t]) is t

for (s, t) in (str, str), (unicode, unicode), and (str, unicode).
For (unicode, str), verify that it's *not* t (the result is promoted
to unicode instead).  Also verify that when t is a subclass of str or
unicode that "the right thing" happens.
2004-08-27 05:36:07 +00:00
Walter Dörwald 57d88e5abd Move test_bug1001011() to string_tests.MixinStrUnicodeTest so that
it can be used for str and unicode. Drop the test for
   "".join([s]) is s
because this is an implementation detail (and doesn't work for unicode)
2004-08-26 16:53:04 +00:00
Hye-Shik Chang e9ddfbb412 SF #989185: Drop unicode.iswide() and unicode.width() and add
unicodedata.east_asian_width().  You can still implement your own
simple width() function using it like this:
    def width(u):
        w = 0
        for c in unicodedata.normalize('NFC', u):
            cwidth = unicodedata.east_asian_width(c)
            if cwidth in ('W', 'F'): w += 2
            else: w += 1
        return w
2004-08-04 07:38:35 +00:00
Hye-Shik Chang 5f5125997b Add iswide() and width() method for UserString according as the
addition to unicode objects.
2004-06-04 03:18:12 +00:00
Hye-Shik Chang 75c00efcc7 [SF #866875] Add a specialized routine for one character
separaters on str.split() and str.rsplit().
2004-01-05 00:29:51 +00:00
Hye-Shik Chang 7fc4cf57b8 Fix unicode.rsplit()'s bug that ignores separater on the end of string when
using specialized splitter for 1 char sep.
2003-12-23 09:10:16 +00:00
Hye-Shik Chang 3ae811b57d Add rsplit method for str and unicode builtin types.
SF feature request #801847.
Original patch is written by Sean Reifschneider.
2003-12-15 18:49:53 +00:00
Raymond Hettinger 4f8f976576 Add optional fillchar argument to ljust(), rjust(), and center() string methods. 2003-11-26 08:21:35 +00:00
Raymond Hettinger 9bfe533c69 SF bug #795506: Wrong handling of string format code for float values.
Adding missing support for '%F'.

Will backport to 2.3.1.
2003-08-27 04:55:52 +00:00
Tim Peters 0eadaac7dc Whitespace normalization. 2003-04-24 16:02:54 +00:00
Neal Norwitz ffe33b7f24 Attempt to make all the various string *strip methods the same.
* Doc - add doc for when functions were added
 * UserString
 * string object methods
 * string module functions
'chars' is used for the last parameter everywhere.

These changes will be backported, since part of the changes
have already been made, but they were inconsistent.
2003-04-10 22:35:32 +00:00
Walter Dörwald 43440a621e Fix PyString_Format() so that '%c' % u'a' returns u'a'
instead of raising a TypeError. (From SF patch #710127)

Add tests to verify this is fixed.

Add various tests for '%c' % int.
2003-03-31 18:07:50 +00:00
Walter Dörwald 97951de77c Add two tests for simple error cases. 2003-03-26 14:31:25 +00:00
Neal Norwitz 15ff0e9342 Get test to work on alpha 2003-02-23 23:15:26 +00:00
Walter Dörwald 0fd583ce4d Port all string tests to PyUnit and share as much tests
between str, unicode, UserString and the string module
as possible. This increases code coverage in stringobject.c
from 83% to 86% and should help keep the string classes
in sync in the future. From SF patch #662807
2003-02-21 12:53:50 +00:00
Martin v. Löwis 00b6127097 Patch #650653: Raise always value error if the table is not 256 bytes long. 2002-12-12 20:03:19 +00:00
Neil Schemenauer b981df9943 check for str.__mod__ 2002-11-18 16:12:11 +00:00