Commit Graph

231 Commits

Author SHA1 Message Date
Antoine Pitrou 2c3b2302ad Issue #13134: optimize finding single-character strings using memchr 2011-10-11 20:29:21 +02:00
Antoine Pitrou 798b4df812 test_unicode was forgetting to run the common string tests for str.find() 2011-10-08 22:42:00 +02:00
Antoine Pitrou c0bbe7d38a test_unicode was forgetting to run the common string tests for str.find() 2011-10-08 22:41:35 +02:00
Victor Stinner 1d972ad12a Mark 'abc'.expandtabs() optimization as specific to CPython
Also improve the str.replace(a, a) test
2011-10-07 13:31:46 +02:00
Victor Stinner 59de0ee9e0 str.replace(a, a) is now returning str unchanged if a is a 2011-10-07 10:01:28 +02:00
Ezio Melotti a9860aeb08 #13054: fix usage of sys.maxunicode after PEP-393. 2011-10-04 19:06:00 +03:00
Antoine Pitrou e19aa388e8 When expandtabs() would be a no-op, don't create a duplicate string 2011-10-04 16:04:01 +02:00
Victor Stinner 07ac3ebd7b Optimize unicode_subtype_new(): don't encode to wchar_t and decode from wchar_t
Rewrite unicode_subtype_new(): allocate directly the right type.
2011-10-01 16:16:43 +02:00
Benjamin Peterson 811c2f1369 remove "fast-path" for (i)adding strings
These were just an artifact of the old unicode concatenation hack and likely
just penalized other kinds of adding. Also, this fixes __(i)add__ on string
subclasses.
2011-09-30 21:31:21 -04:00
Martin v. Löwis 287eca658d Fix struct sizes. Drop -1, since the resulting string was actually the largest one
that could be allocated.
2011-09-28 10:03:28 +02:00
Martin v. Löwis d63a3b8beb Implement PEP 393. 2011-09-28 07:41:54 +02:00
Ezio Melotti a3fbde3504 Merge indentation fix and skip decorator with 3.2. 2011-08-23 00:40:09 +03:00
Ezio Melotti a5c92b4714 Fix indentation and add a skip decorator. 2011-08-23 00:37:08 +03:00
Ezio Melotti 6f2a683a0c #9200: merge with 3.2. 2011-08-22 20:31:11 +03:00
Ezio Melotti 93e7afc5d9 #9200: The str.is* methods now work with strings that contain non-BMP characters even in narrow Unicode builds. 2011-08-22 14:08:38 +03:00
Benjamin Peterson f8e7543df9 merge 3.2 (#12732) 2011-08-12 22:18:19 -05:00
Benjamin Peterson f413b80806 in narrow builds, make sure to test codepoints as identifier characters (closes #12732)
This fixes the use of Unicode identifiers outside the BMP in narrow builds.
2011-08-12 22:17:18 -05:00
Eric V. Smith c12469df22 Merge from 3.2. 2011-07-18 14:08:55 -04:00
Eric V. Smith 12ebefc9d3 Closes #12579. Positional fields with str.format_map() now raise a ValueError instead of SystemError. 2011-07-18 14:03:41 -04:00
Senthil Kumaran bc9d8f838b merge from 3.2 2011-07-03 21:05:25 -07:00
Senthil Kumaran 9ebe08d2f6 Fix closes issue12471 - wrong TypeError message when '%i' format spec was used. 2011-07-03 21:03:16 -07:00
Ezio Melotti bf1253b25a #6780: merge with 3.2. 2011-04-26 06:45:24 +03:00
Ezio Melotti f2b3f780a1 #6780: merge with 3.1. 2011-04-26 06:40:59 +03:00
Ezio Melotti ba42fd5801 #6780: fix starts/endswith error message to mention that tuples are accepted too. 2011-04-26 06:09:45 +03:00
Eric V. Smith b9cd3531c4 Issue 9856: Change object.__format__ with a non-empty format string from a PendingDeprecationWarning to a DeprecationWarning. 2011-03-12 10:08:48 -05:00
Victor Stinner 6d970f4713 Issue #10831: PyUnicode_FromFormat() supports %li, %lli and %zi formats 2011-03-02 00:04:25 +00:00
Victor Stinner 968654515f Issue #10829: Refactor PyUnicode_FromFormat()
* Use the same function to parse the format string in the 3 steps
 * Fix crashes on invalid format strings
2011-03-01 23:44:09 +00:00
Victor Stinner 2b574a2332 Merged revisions 88697 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r88697 | victor.stinner | 2011-03-01 23:46:52 +0100 (Tue, 01 Mar 2011) | 4 lines

  Issue #11246: Fix PyUnicode_FromFormat("%V")

  Decode the byte string from UTF-8 (with replace error handler) instead of
  ISO-8859-1 (in strict mode). Patch written by Ray Allen.
........
2011-03-01 22:48:49 +00:00
Victor Stinner 2512a8b62e Issue #11246: Fix PyUnicode_FromFormat("%V")
Decode the byte string from UTF-8 (with replace error handler) instead of
ISO-8859-1 (in strict mode). Patch written by Ray Allen.
2011-03-01 22:46:52 +00:00
Marc-André Lemburg 8f36af7a4c Normalize the encoding names for Latin-1 and UTF-8 to
'latin-1' and 'utf-8'.

These are optimized in the Python Unicode implementation
to result in more direct processing, bypassing the codec
registry.

Also see issue11303.
2011-02-25 15:42:01 +00:00
Victor Stinner 659eb84457 Merged revisions 88481 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r88481 | victor.stinner | 2011-02-21 22:13:44 +0100 (Mon, 21 Feb 2011) | 4 lines

  Fix PyUnicode_FromFormatV("%c") for non-BMP char

  Issue #10830: Fix PyUnicode_FromFormatV("%c") for non-BMP characters on
  narrow build.
........
2011-02-23 12:14:22 +00:00
Victor Stinner 5ed8b2c737 Fix PyUnicode_FromFormatV("%c") for non-BMP char
Issue #10830: Fix PyUnicode_FromFormatV("%c") for non-BMP characters on
narrow build.
2011-02-21 21:13:44 +00:00
Eric Smith a1eac7218b Issue #11302: missing type check on _string.formatter_field_name_split and _string.formatter_parser caused crash.
Original patch by haypo, reviewed by me, okayed by Georg.
2011-01-29 11:15:35 +00:00
Victor Stinner ca1e7ec344 test_unicode: use ctypes to test PyUnicode_FromFormat()
Instead of _testcapi.format_unicode() because it has a limited API: it requires
exactly one argument of type unicode.
2011-01-05 00:19:28 +00:00
Alexander Belopolsky 942af5a9a4 Issue #10557: Fixed error messages from float() and other numeric
types.  Added a new API function, PyUnicode_TransformDecimalToASCII(),
which transforms non-ASCII decimal digits in a Unicode string to their
ASCII equivalents.
2010-12-04 03:38:46 +00:00
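The Python-visible effect of the change described in 942af5a9a4 can be sketched as follows. PyUnicode_TransformDecimalToASCII() is a C API function and is not called here; the `ascii_digits` line only emulates the same digit mapping with `unicodedata.decimal()`, and the variable names are illustrative, not taken from the commit.

```python
import unicodedata

# ARABIC-INDIC DIGIT ONE, TWO, THREE (non-ASCII decimal digits)
arabic_indic = "\u0661\u0662\u0663"

# The numeric constructors accept any Unicode decimal digit.
as_int = int(arabic_indic)
as_float = float(arabic_indic + ".5")

# Emulation of the digit-to-ASCII transformation in pure Python
# (an assumption for illustration; the real work happens in C).
ascii_digits = "".join(str(unicodedata.decimal(ch)) for ch in arabic_indic)

print(as_int, as_float, ascii_digits)
```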
Ezio Melotti ed3a7d2d60 #10273: Rename assertRegexpMatches and assertRaisesRegexp to assertRegex and assertRaisesRegex. 2010-12-01 02:32:32 +00:00
Antoine Pitrou 0662bc297a Fix tests when ctypes isn't available 2010-11-22 16:19:04 +00:00
Ezio Melotti 19f2aeba67 Merged revisions 86596 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r86596 | ezio.melotti | 2010-11-20 21:04:17 +0200 (Sat, 20 Nov 2010) | 1 line

  #9424: Replace deprecated assert* methods in the Python test suite.
........
2010-11-21 01:30:29 +00:00
Ezio Melotti b3aedd4862 #9424: Replace deprecated assert* methods in the Python test suite. 2010-11-20 19:04:17 +00:00
Eric Smith 72f6620859 Removed unused test classes from test_format_map(). 2010-11-06 14:43:26 +00:00
Eric Smith 27bbca6f79 Issue #6081: Add str.format_map. str.format_map(mapping) is similar to str.format(**mapping), except mapping does not get converted to a dict. 2010-11-04 17:06:58 +00:00
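A minimal sketch of the difference noted in 27bbca6f79: because str.format_map() uses the mapping as-is, a dict subclass keeps its `__missing__` hook, whereas str.format(**mapping) copies the entries into a plain dict first. The `DefaultingMap` class and the sample strings are hypothetical, chosen only for illustration.

```python
class DefaultingMap(dict):
    """Hypothetical mapping that supplies a placeholder for missing keys."""

    def __missing__(self, key):
        return "<" + key + ">"


fields = DefaultingMap(name="Ada")
template = "{name} works on {project}"

# format_map() looks keys up on the mapping itself, so __missing__ runs.
greeting = template.format_map(fields)

# format(**fields) unpacks into a plain dict, so the missing key raises.
missing_key = None
try:
    template.format(**fields)
except KeyError as exc:
    missing_key = exc.args[0]

print(greeting)
print(missing_key)
```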
Antoine Pitrou 43ffd5c013 Merged revisions 85861 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r85861 | antoine.pitrou | 2010-10-27 20:52:48 +0200 (Wed, 27 Oct 2010) | 3 lines

  Recode modules from latin-1 to utf-8
........
2010-10-27 18:54:06 +00:00
Antoine Pitrou d72402effc Recode modules from latin-1 to utf-8 2010-10-27 18:52:48 +00:00
Victor Stinner 9a90900da5 PyUnicode_FromFormatV(): Fix %A format
It was not completely implemented. Add a test.
2010-10-18 20:59:24 +00:00
Martin v. Löwis baecd7243a Upgrade to Unicode 6.0.0.
makeunicodedata.py: download all data files from unicode.org,
  switch to extracting Unihan data from zip file.
  Read linebreakprops and derivednormalizationprops even for
  old versions, even though they are not used in delta records.
test_unicode.py: U+11000 is now assigned, use U+14000 instead.
2010-10-11 22:42:28 +00:00
Victor Stinner 46c7b3b283 Issue #8670: Rename testcapi unicode test methods
* test_aswidechar() => unicode_aswidechar()
 * test_aswidecharstring() => unicode_aswidecharstring()
2010-10-02 11:49:31 +00:00
Victor Stinner ea3f305a25 Oops, revert unwanted _testcapi changes of r85174 2010-10-02 11:46:20 +00:00
Victor Stinner 749261e241 Issue #8670: ctypes.c_wchar supports non-BMP characters with 32 bits wchar_t 2010-10-02 11:25:35 +00:00
Victor Stinner 5593d8aeb4 Issue #8670: PyUnicode_AsWideChar() and PyUnicode_AsWideCharString() replace
UTF-16 surrogate pairs by single non-BMP characters for 16 bits Py_UNICODE
and 32 bits wchar_t (eg. Linux in narrow build).
2010-10-02 11:11:27 +00:00
Victor Stinner 1c24bd0252 Issue #8870: PyUnicode_AsWideCharString() doesn't count the trailing nul character
And write unit tests for PyUnicode_AsWideChar() and PyUnicode_AsWideCharString().
2010-10-02 11:03:13 +00:00