Commit Graph

45 Commits

Author SHA1 Message Date
Walter Dörwald 6eea789fd2 Disable encoding/decoding test, if unicode is disabled. 2005-07-28 16:49:15 +00:00
Raymond Hettinger 57e7447c44 * Beef-up tests for str.count().
* Speed-up str.count() by using memchr() to fly between first char matches.
2005-02-20 09:54:53 +00:00
Raymond Hettinger 7cbf1bcb3e * Beef-up testing of str.__contains__() and str.find().
* Speed-up "x in y" where x has more than one character.

The existing code made excessive calls to the expensive memcmp() function.
The new code uses memchr() to rapidly find a start point for memcmp().
In addition to knowing that the first character is a match, the new code
also checks that the last character is a match.  This significantly reduces
the incidence of false starts (saving memcmp() calls and making quadratic
behavior less likely).

Improves the timings on:
    python -m timeit -r7 -s"x='a'*1000" "'ab' in x"
    python -m timeit -r7 -s"x='a'*1000" "'bc' in x"

Once this code has proven itself, then string_find_internal() should refer
to it rather than running its own version.  Also, something similar may
apply to unicode objects.
2005-02-20 04:07:08 +00:00
Raymond Hettinger 561fbf138d SF bug #1054139: serious string hashing error in 2.4b1
_PyString_Resize() readied strings for mutation but did not invalidate
the cached hash value.
2004-10-26 01:52:37 +00:00
Tim Peters 108f137519 test_bug1001011(): Verify that
s.join([t]) is t

for (s, t) in (str, str), (unicode, unicode), and (str, unicode).
For (unicode, str), verify that it's *not* t (the result is promoted
to unicode instead).  Also verify that when t is a subclass of str or
unicode that "the right thing" happens.
2004-08-27 05:36:07 +00:00
Walter Dörwald 57d88e5abd Move test_bug1001011() to string_tests.MixinStrUnicodeTest so that
it can be used for str and unicode. Drop the test for
   "".join([s]) is s
because this is an implementation detail (and doesn't work for unicode)
2004-08-26 16:53:04 +00:00
Hye-Shik Chang e9ddfbb412 SF #989185: Drop unicode.iswide() and unicode.width() and add
unicodedata.east_asian_width().  You can still implement your own
simple width() function using it like this:
    def width(u):
        w = 0
        for c in unicodedata.normalize('NFC', u):
            cwidth = unicodedata.east_asian_width(c)
            if cwidth in ('W', 'F'): w += 2
            else: w += 1
        return w
2004-08-04 07:38:35 +00:00
Hye-Shik Chang 5f5125997b Add iswide() and width() method for UserString according as the
addition to unicode objects.
2004-06-04 03:18:12 +00:00
Hye-Shik Chang 75c00efcc7 [SF #866875] Add a specialized routine for one character
separaters on str.split() and str.rsplit().
2004-01-05 00:29:51 +00:00
Hye-Shik Chang 7fc4cf57b8 Fix unicode.rsplit()'s bug that ignores separater on the end of string when
using specialized splitter for 1 char sep.
2003-12-23 09:10:16 +00:00
Hye-Shik Chang 3ae811b57d Add rsplit method for str and unicode builtin types.
SF feature request #801847.
Original patch is written by Sean Reifschneider.
2003-12-15 18:49:53 +00:00
Raymond Hettinger 4f8f976576 Add optional fillchar argument to ljust(), rjust(), and center() string methods. 2003-11-26 08:21:35 +00:00
Raymond Hettinger 9bfe533c69 SF bug #795506: Wrong handling of string format code for float values.
Adding missing support for '%F'.

Will backport to 2.3.1.
2003-08-27 04:55:52 +00:00
Tim Peters 0eadaac7dc Whitespace normalization. 2003-04-24 16:02:54 +00:00
Neal Norwitz ffe33b7f24 Attempt to make all the various string *strip methods the same.
* Doc - add doc for when functions were added
 * UserString
 * string object methods
 * string module functions
'chars' is used for the last parameter everywhere.

These changes will be backported, since part of the changes
have already been made, but they were inconsistent.
2003-04-10 22:35:32 +00:00
Walter Dörwald 43440a621e Fix PyString_Format() so that '%c' % u'a' returns u'a'
instead of raising a TypeError. (From SF patch #710127)

Add tests to verify this is fixed.

Add various tests for '%c' % int.
2003-03-31 18:07:50 +00:00
Walter Dörwald 97951de77c Add two tests for simple error cases. 2003-03-26 14:31:25 +00:00
Neal Norwitz 15ff0e9342 Get test to work on alpha 2003-02-23 23:15:26 +00:00
Walter Dörwald 0fd583ce4d Port all string tests to PyUnit and share as much tests
between str, unicode, UserString and the string module
as possible. This increases code coverage in stringobject.c
from 83% to 86% and should help keep the string classes
in sync in the future. From SF patch #662807
2003-02-21 12:53:50 +00:00
Martin v. Löwis 00b6127097 Patch #650653: Raise always value error if the table is not 256 bytes long. 2002-12-12 20:03:19 +00:00
Neil Schemenauer b981df9943 check for str.__mod__ 2002-11-18 16:12:11 +00:00
Guido van Rossum 8b1a6d694f Code by Inyeol Lee, submitted to SF bug 595350, to implement
the string/unicode method .replace() with a zero-lengt first argument.
Inyeol contributed tests for this too.
2002-08-23 18:21:28 +00:00
Raymond Hettinger c35491ee3a Moved inplace add and multiply methods from UserString to MutableString.
Closes SF Bug #592573 where inplace add mutated a UserString.
Added unittests to verify the bug is cleared.
2002-08-09 01:37:06 +00:00
Raymond Hettinger 8da9da0ccc Revised the test suite for 'contains' to use the test() function argument
rather than vereq().  While it was effectively testing regular strings, it
ignored the test() function argument when called by test_userstring.py.
2002-08-09 00:43:38 +00:00
Tim Peters 469cdad822 Whitespace normalization. 2002-08-08 20:19:19 +00:00
Barry Warsaw 817918cc3c Committing patch #591250 which provides "str1 in str2" when str1 is a
string of longer than 1 character.
2002-08-06 16:58:21 +00:00
Barry Warsaw 408b6d34de Complete the absolute import patch for the test suite. All relative
imports of test modules now import from the test package.  Other
related oddities are also fixed (like DeprecationWarning filters that
weren't specifying the full import part, etc.).  Also did a general
code cleanup to remove all "from test.test_support import *"'s.  Other
from...import *'s weren't changed.
2002-07-30 23:27:12 +00:00
Neal Norwitz 1f68fc7fa5 SF bug # 493951 string.{starts,ends}with vs slices
Handle negative indices similar to slices.
2002-06-14 00:50:42 +00:00
Tim Peters 8ac1495a6a Whitespace normalization. 2002-05-23 15:15:30 +00:00
Michael W. Hudson f207277167 More --disable-unicode stuff.
I'm getting better at vi!
2002-05-20 14:48:16 +00:00
Walter Dörwald de02bcb265 Apply patch diff.txt from SF feature request
http://www.python.org/sf/444708

This adds the optional argument for str.strip
to unicode.strip too and makes it possible
to call str.strip with a unicode argument
and unicode.strip with a str argument.
2002-04-22 17:42:37 +00:00
Tim Peters 863ac44b74 Whitespace normalization. 2002-04-16 01:38:40 +00:00
Walter Dörwald 068325ef92 Apply the second version of SF patch http://www.python.org/sf/536241
Add a method zfill to str, unicode and UserString and change
Lib/string.py accordingly.

This activates the zfill version in unicodeobject.c that was
commented out and implements the same in stringobject.c. It also
adds the test for unicode support in Lib/string.py back in and
uses repr() instead() of str() (as it was before Lib/string.py 1.62)
2002-04-15 13:36:47 +00:00
Guido van Rossum 018b0eb0f5 Partially implement SF feature request 444708.
Add optional arg to string methods strip(), lstrip(), rstrip().
The optional arg specifies characters to delete.

Also for UserString.

Still to do:

- Misc/NEWS
- LaTeX docs (I did the docstrings though)
- Unicode methods, and Unicode support in the string methods.
2002-04-13 00:56:08 +00:00
Andrew M. Kuchling c6c9c4a10f Add two tests for string.zfill 2002-03-29 16:00:13 +00:00
Martin v. Löwis 339d0f720e Patch #445762: Support --disable-unicode
- Do not compile unicodeobject, unicodectype, and unicodedata if Unicode is disabled
- check for Py_USING_UNICODE in all places that use Unicode functions
- disables unicode literals, and the builtin functions
- add the types.StringTypes list
- remove Unicode literals from most tests.
2001-08-17 18:39:25 +00:00
Marc-André Lemburg 2d9204199f This patch changes the way the string .encode() method works slightly
and introduces a new method .decode().

The major change is that strg.encode() will no longer try to convert
Unicode returns from the codec into a string, but instead pass along
the Unicode object as-is. The same is now true for all other codec
return types. The underlying C APIs were changed accordingly.

Note that even though this does have the potential of breaking
existing code, the chances are low since conversion from Unicode
previously took place using the default encoding which is normally
set to ASCII rendering this auto-conversion mechanism useless for
most Unicode encodings.

The good news is that you can now use .encode() and .decode() with
much greater ease and that the door was opened for better accessibility
of the builtin codecs.

As demonstration of the new feature, the patch includes a few new
codecs which allow string to string encoding and decoding (rot13,
hex, zip, uu, base64).

Written by Marc-Andre Lemburg. Copyright assigned to the PSF.
2001-05-15 12:00:02 +00:00
Tim Peters 1a7b3eee94 SF bug #422088: [OSF1 alpha] string.replace().
Platform blew up on "123".replace("123", "").  Michael Hudson pinned the
blame on platform malloc(0) returning NULL.
This is a candidate for all bugfix releases.
2001-05-09 23:00:26 +00:00
Eric S. Raymond fc170b1fd5 String method conversion. 2001-02-09 11:51:27 +00:00
Marc-André Lemburg fde66e1bcc Fixed .capitalize() method of Unicode objects to work like the
corresponding string method. Added tests for this too.

Patch written by Marc-Andre Lemburg. Copyright assigned to Guido van Rossum.
2001-01-29 11:14:16 +00:00
Tim Peters d2bf3b7ca6 Whitespace normalization. Leaving tokenize_tests.py alone for now. 2001-01-18 02:22:22 +00:00
Marc-André Lemburg 3a645e4dd4 Added checks to prevent PyUnicode_Count() from dumping core
in case the parameters are out of bounds and fixes error handling
for .count(), .startswith() and .endswith() for the case of
mixed string/Unicode objects.

This patch adds Python style index semantics to PyUnicode_Count()
indices (including the special handling of negative indices).

The patch is an extended version of patch #103249 submitted
by Michael Hudson (mwh) on SF. It also includes new test cases.
2001-01-16 11:54:12 +00:00
Jeremy Hylton 88887aa38e small updates to string_join:
use PyString_AS_STRING macro on local string object
    when resizing string, make sure resized string will always be big enough
    split string containing error message across two lines
add test to string_tests that causes resizing
2000-07-11 20:55:38 +00:00
Jeremy Hylton 20f41b6456 add more tests of string.join variants to run_method_tests 2000-07-11 03:31:55 +00:00
Jeremy Hylton f82b04ecbb factor out test definitions to string_tests module
test_string and test_userstring run same tests for string methods
2000-07-10 17:08:42 +00:00