Serhiy Storchaka
f15ffe0ee5
Add tests for issue #18183 .
2013-06-12 09:28:20 +03:00
Serhiy Storchaka
31b1c8bbde
Add tests for issue #18183 .
2013-06-12 09:20:44 +03:00
Benjamin Peterson
3164f5d565
merge 3.3 ( #18183 )
2013-06-10 09:24:01 -07:00
Benjamin Peterson
7e30373126
remove MAX_MAXCHAR because it's unsafe for computing maximum codepoint value (see #18183 )
2013-06-10 09:19:46 -07:00
Benjamin Peterson
d2b58a9880
only recursively expand in the format spec ( closes #17644 )
2013-05-17 17:34:30 -05:00
Benjamin Peterson
4d94474ba3
rewrite the parsing of field names to be more consistent wrt recursive expansion
2013-05-17 18:22:31 -05:00
Benjamin Peterson
48953632df
merge 3.3
2013-05-17 17:35:28 -05:00
Victor Stinner
8cecc8c262
Issue #7330 : Implement width and precision (ex: "%5.3s") for the format string
...
of PyUnicode_FromFormat() function, original patch written by Ysj Ray.
2013-05-06 23:11:54 +02:00
Victor Stinner
9fc5981ea2
Issue #17615 : Add tests comparing Unicode strings of different kinds
...
Kinds: ascii, latin, bmp, astral.
2013-04-08 22:34:43 +02:00
Ezio Melotti
09d9d0f385
Merge DeprecationWarnings silencing in test_unicode from 3.3.
2013-02-21 00:01:44 +02:00
Ezio Melotti
51e243f22e
Silence DeprecationWarnings in test_unicode.
2013-02-20 23:56:01 +02:00
Victor Stinner
cfd2c1b4cc
(Merge 3.3) Issue #17137 : When a Unicode string is resized, the internal wide
...
character string (wstr) format is now cleared.
2013-02-07 23:17:34 +01:00
Victor Stinner
bbbac2ec34
Issue #17137 : When a Unicode string is resized, the internal wide character
...
string (wstr) format is now cleared.
2013-02-07 23:12:46 +01:00
Ezio Melotti
5b1acc0dff
#16910 : merge with 3.3.
2013-01-10 07:46:29 +02:00
Ezio Melotti
0dceb560b6
#16910 : test_bytes, test_unicode, and test_userstring now work with unittest test discovery. Patch by Zachary Ware.
2013-01-10 07:43:26 +02:00
Andrew Svetlov
2cd8ce4690
Issue #9856 : Replace deprecation warnings with raising TypeError in object.__format__
...
Patch by Florent Xicluna.
2012-12-23 14:27:17 +02:00
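A minimal sketch of the behaviour this entry describes, checked against current CPython 3 semantics rather than this exact revision: a type that only inherits object.__format__ accepts an empty format string but raises TypeError for a non-empty one. The names `Plain` and `try_format` are hypothetical and exist only for illustration.

```python
class Plain:
    """Hypothetical class that inherits object.__format__ unchanged."""


def try_format(value, spec):
    """Return the formatted string, or a short description of the TypeError."""
    try:
        return format(value, spec)
    except TypeError as exc:
        return "TypeError: " + str(exc)


# An empty format spec is accepted and behaves like str(value).
empty_spec_result = format(Plain(), "")

# A non-empty format spec is rejected by object.__format__.
nonempty_spec_result = try_format(Plain(), "10")
print(nonempty_spec_result)
```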
Chris Jerdonek
d675a2c48a
Merge from 3.3: Improve str() and object.__str__() docs (issue #13538 ).
2012-11-20 17:53:17 -08:00
Chris Jerdonek
5fae0e5854
Improve str() and object.__str__() documentation (issue #13538 ).
2012-11-20 17:45:51 -08:00
Ezio Melotti
cfa9636404
#8271 : merge with 3.3.
2012-11-04 23:23:09 +02:00
Ezio Melotti
f7ed5d111b
#8271 : the utf-8 decoder now outputs the correct number of U+FFFD characters when used with the "replace" error handler on invalid utf-8 sequences. Patch by Serhiy Storchaka, tests by Ezio Melotti.
2012-11-04 23:21:38 +02:00
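The effect of this fix can be sketched roughly as follows, assuming a Python version that includes it: a truncated multi-byte sequence is replaced by a single U+FFFD, while each byte that can never start a UTF-8 sequence gets its own U+FFFD. The helper name `count_replacements` and the sample byte strings are illustrative assumptions, not taken from the commit.

```python
REPLACEMENT = "\ufffd"


def count_replacements(data):
    """Decode with the "replace" handler and count the U+FFFD characters."""
    return data.decode("utf-8", errors="replace").count(REPLACEMENT)


# First two bytes of a three-byte sequence (U+20AC), followed by ASCII "A".
truncated = b"\xe2\x82" + b"A"

# Two bytes that are never valid anywhere in UTF-8.
invalid = b"\xff\xfe"

print(count_replacements(truncated), count_replacements(invalid))
```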
Mark Dickinson
61254b9391
Issue #14700 : merge tests from 3.3.
2012-10-28 10:23:08 +00:00
Mark Dickinson
2a83f16e5e
Issue #14700 : merge tests from 3.2.
2012-10-28 10:22:22 +00:00
Mark Dickinson
fb90c0934c
Issue #14700 : Fix buggy overflow checks for large precision and width in new-style and old-style formatting.
2012-10-28 10:18:03 +00:00
Victor Stinner
15a1136547
Issue #16147 : PyUnicode_FromFormatV() no longer needs to allocate a buffer
...
on the heap to format numbers.
2012-10-06 23:48:20 +02:00
Victor Stinner
e215d960be
Issue #16147 : Rewrite PyUnicode_FromFormatV() to use _PyUnicodeWriter API
...
* Simplify the code: replace 4 steps with a single step using the
_PyUnicodeWriter API. PyUnicode_Format() has the same design. This avoids
storing intermediate results, which required allocating an array of pointers on
the heap.
* Use the _PyUnicodeWriter API for speed (and its convenient API):
overallocate the buffer to reduce the number of "realloc()" calls
* Implement "width" and "precision" in Python, don't rely on sprintf(). This
avoids the need for a temporary buffer allocated on the heap: only a small
buffer allocated on the stack is used.
* Add _PyUnicodeWriter_WriteCstr() function
* Split PyUnicode_FromFormatV() into two functions: add
unicode_fromformat_arg().
* Inline parse_format_flags(): the format of an argument is now parsed only
once, so a subfunction is no longer needed.
* Optimize PyUnicode_FromFormatV() for characters between two "%" arguments:
search for the next "%" and copy the substring in one chunk, instead of copying
character by character.
2012-10-06 23:03:36 +02:00
Benjamin Peterson
4eda93723e
add another testcase
2012-08-05 15:05:34 -07:00
Brett Cannon
acc0c181a8
Remove a now worthless test.
2012-05-12 17:40:28 -04:00
Victor Stinner
f59c28c930
unicode_writer_finish() checks string consistency
2012-05-09 03:24:14 +02:00
Victor Stinner
ece58deb9f
Close #14648 : Correctly compute maxchar in str.format() for substring
2012-04-23 23:36:38 +02:00
Benjamin Peterson
80d07f8251
inherit maxchar of field value where needed ( closes #14648 )
2012-04-23 10:55:29 -04:00
Eric V. Smith
97722c4132
str.format_map tests don't do what they say: fix to actually implement the intent of the test. Closes #13450 . Patch by Akira Li.
2012-03-12 15:26:21 -07:00
Eric V. Smith
edbb6ca084
str.format_map tests don't do what they say: fix to actually implement the intent of the test. Closes #13450 .
2012-03-12 15:16:22 -07:00
Benjamin Peterson
d5890c8db5
add str.casefold() ( closes #13752 )
2012-01-14 13:23:30 -05:00
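As a brief, hedged illustration of what str.casefold() offers over str.lower(): case folding is intended for caseless comparison, so it maps "ß" to "ss" where lower() leaves it alone. The helper name `caseless_equal` is hypothetical.

```python
def caseless_equal(left, right):
    """Compare two strings without regard to case, using full case folding."""
    return left.casefold() == right.casefold()


# lower() keeps the sharp s, so these two spellings differ after lowercasing.
lower_matches = "Straße".lower() == "STRASSE".lower()

# casefold() expands the sharp s to "ss", so the spellings compare equal.
casefold_matches = caseless_equal("Straße", "STRASSE")

print(lower_matches, casefold_matches)
```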
Benjamin Peterson
b2bf01d824
use full unicode mappings for upper/lower/title case ( #12736 )
...
Also broaden the category of characters that count as lowercase/uppercase.
2012-01-11 18:17:06 -05:00
Victor Stinner
6345be9a14
Close #13093 : PyUnicode_EncodeDecimal() no longer supports error handlers
...
other than "strict". The caller was unable to compute the
size of the output buffer: it depends on the error handler.
2011-11-25 20:09:01 +01:00
Victor Stinner
b84d723509
(Merge 3.2) Issue #13093 : Fix error handling on PyUnicode_EncodeDecimal()
2011-11-22 01:50:07 +01:00
Victor Stinner
c814a38f3f
Add a test on str.__getnewargs__()
...
It indirectly tests PyUnicode_Copy(): ensure that the string is a copy.
2011-11-22 01:06:15 +01:00
Victor Stinner
42bf77537e
Rewrite PyUnicode_EncodeDecimal() to use the new Unicode API
...
Add tests for PyUnicode_EncodeDecimal() and
PyUnicode_TransformDecimalToASCII().
2011-11-21 22:52:58 +01:00
Victor Stinner
040e16e3e8
"unicode_internal" codec has been deprecated: fix related tests
2011-11-15 22:44:05 +01:00
Antoine Pitrou
78edf7576e
Issue #13333 : The UTF-7 decoder now accepts lone surrogates
...
(the encoder already accepts them).
2011-11-15 01:44:16 +01:00
Antoine Pitrou
5418ee0b9a
Issue #13333 : The UTF-7 decoder now accepts lone surrogates
...
(the encoder already accepts them).
2011-11-15 01:42:21 +01:00
Ezio Melotti
40dc919b0d
Fix range in test.
2011-11-11 17:00:46 +02:00
Antoine Pitrou
51f6648a31
Make test more inclusive
2011-11-11 13:35:44 +01:00
Antoine Pitrou
dffab19218
Enable commented out test
2011-11-11 13:31:59 +01:00
Antoine Pitrou
2c3b2302ad
Issue #13134 : optimize finding single-character strings using memchr
2011-10-11 20:29:21 +02:00
Antoine Pitrou
798b4df812
test_unicode was forgetting to run the common string tests for str.find()
2011-10-08 22:42:00 +02:00
Antoine Pitrou
c0bbe7d38a
test_unicode was forgetting to run the common string tests for str.find()
2011-10-08 22:41:35 +02:00
Victor Stinner
1d972ad12a
Mark 'abc'.expandtab() optimization as specific to CPython
...
Also improve the str.replace(a, a) test
2011-10-07 13:31:46 +02:00
Victor Stinner
59de0ee9e0
str.replace(a, a) is now returning str unchanged if a is a
2011-10-07 10:01:28 +02:00
Ezio Melotti
a9860aeb08
#13054 : fix usage of sys.maxunicode after PEP-393.
2011-10-04 19:06:00 +03:00
Antoine Pitrou
e19aa388e8
When expandtabs() would be a no-op, don't create a duplicate string
2011-10-04 16:04:01 +02:00
Victor Stinner
07ac3ebd7b
Optimize unicode_subtype_new(): don't encode to wchar_t and decode from wchar_t
...
Rewrite unicode_subtype_new(): allocate directly the right type.
2011-10-01 16:16:43 +02:00
Benjamin Peterson
811c2f1369
remove "fast-path" for (i)adding strings
...
These were just an artifact of the old unicode concatenation hack and likely
just penalized other kinds of adding. Also, this fixes __(i)add__ on string
subclasses.
2011-09-30 21:31:21 -04:00
Martin v. Löwis
287eca658d
Fix struct sizes. Drop -1, since the resulting string was actually the largest one
...
that could be allocated.
2011-09-28 10:03:28 +02:00
Martin v. Löwis
d63a3b8beb
Implement PEP 393.
2011-09-28 07:41:54 +02:00
Ezio Melotti
a3fbde3504
Merge indentation fix and skip decorator with 3.2.
2011-08-23 00:40:09 +03:00
Ezio Melotti
a5c92b4714
Fix indentation and add a skip decorator.
2011-08-23 00:37:08 +03:00
Ezio Melotti
6f2a683a0c
#9200 : merge with 3.2.
2011-08-22 20:31:11 +03:00
Ezio Melotti
93e7afc5d9
#9200 : The str.is* methods now work with strings that contain non-BMP characters even in narrow Unicode builds.
2011-08-22 14:08:38 +03:00
Benjamin Peterson
f8e7543df9
merge 3.2 ( #12732 )
2011-08-12 22:18:19 -05:00
Benjamin Peterson
f413b80806
in narrow builds, make sure to test codepoints as identifier characters ( closes #12732 )
...
This fixes the use of Unicode identifiers outside the BMP in narrow builds.
2011-08-12 22:17:18 -05:00
Victor Stinner
ab1d16b456
Issue #13093 : Fix error handling on PyUnicode_EncodeDecimal()
...
* Add tests for PyUnicode_EncodeDecimal() and PyUnicode_TransformDecimalToASCII()
* Remove the unused "e" variable in replace()
2011-11-22 01:45:37 +01:00
Eric V. Smith
c12469df22
Merge from 3.2.
2011-07-18 14:08:55 -04:00
Eric V. Smith
12ebefc9d3
Closes #12579 . Positional fields with str.format_map() now raise a ValueError instead of SystemError.
2011-07-18 14:03:41 -04:00
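A small sketch of the behaviour named in this entry, assuming a Python version that includes the fix: str.format_map() has no positional arguments to draw from, so a template with a positional field raises ValueError, while named fields are looked up in the mapping. The helper name `format_map_error` is hypothetical.

```python
def format_map_error(template, mapping):
    """Return the ValueError raised by format_map(), or None if it succeeds."""
    try:
        template.format_map(mapping)
    except ValueError as exc:
        return exc
    return None


# Named fields are resolved through the mapping.
named_result = "{a}".format_map({"a": 2})

# Positional fields cannot be resolved and are rejected with ValueError.
positional_error = format_map_error("{}", {"a": 2})
mixed_error = format_map_error("{a} {}", {"a": 2, "b": 1})

print(named_result, type(positional_error).__name__)
```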
Senthil Kumaran
bc9d8f838b
merge from 3.2
2011-07-03 21:05:25 -07:00
Senthil Kumaran
9ebe08d2f6
Fix closes issue12471 - wrong TypeError message when '%i' format spec was used.
2011-07-03 21:03:16 -07:00
Ezio Melotti
bf1253b25a
#6780 : merge with 3.2.
2011-04-26 06:45:24 +03:00
Ezio Melotti
f2b3f780a1
#6780 : merge with 3.1.
2011-04-26 06:40:59 +03:00
Ezio Melotti
ba42fd5801
#6780 : fix starts/endswith error message to mention that tuples are accepted too.
2011-04-26 06:09:45 +03:00
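For context, a minimal sketch of the tuple form that the corrected error message refers to: str.startswith() and str.endswith() accept either a single string or a tuple of candidate strings, and reject other argument types with TypeError. The helper name `has_text_suffix` and the suffix list are illustrative assumptions.

```python
TEXT_SUFFIXES = (".txt", ".md", ".rst")  # hypothetical list of suffixes


def has_text_suffix(filename):
    """Return True if the filename ends with any of the known text suffixes."""
    return filename.endswith(TEXT_SUFFIXES)


def rejects_argument(value):
    """Return True if str.startswith() refuses the argument with TypeError."""
    try:
        "example".startswith(value)
    except TypeError:
        return True
    return False


print(has_text_suffix("notes.md"), has_text_suffix("image.png"))
```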
Eric V. Smith
b9cd3531c4
Issue 9856: Change object.__format__ with a non-empty format string from a PendingDeprecationWarning to a DeprecationWarning.
2011-03-12 10:08:48 -05:00
Victor Stinner
6d970f4713
Issue #10831 : PyUnicode_FromFormat() supports %li, %lli and %zi formats
2011-03-02 00:04:25 +00:00
Victor Stinner
968654515f
Issue #10829 : Refactor PyUnicode_FromFormat()
...
* Use the same function to parse the format string in the 3 steps
* Fix crashes on invalid format strings
2011-03-01 23:44:09 +00:00
Victor Stinner
2b574a2332
Merged revisions 88697 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r88697 | victor.stinner | 2011-03-01 23:46:52 +0100 (mar., 01 mars 2011) | 4 lines
Issue #11246 : Fix PyUnicode_FromFormat("%V")
Decode the byte string from UTF-8 (with replace error handler) instead of
ISO-8859-1 (in strict mode). Patch written by Ray Allen.
........
2011-03-01 22:48:49 +00:00
Victor Stinner
2512a8b62e
Issue #11246 : Fix PyUnicode_FromFormat("%V")
...
Decode the byte string from UTF-8 (with replace error handler) instead of
ISO-8859-1 (in strict mode). Patch written by Ray Allen.
2011-03-01 22:46:52 +00:00
Marc-André Lemburg
8f36af7a4c
Normalize the encoding names for Latin-1 and UTF-8 to
...
'latin-1' and 'utf-8'.
These are optimized in the Python Unicode implementation
to result in more direct processing, bypassing the codec
registry.
Also see issue11303.
2011-02-25 15:42:01 +00:00
Victor Stinner
659eb84457
Merged revisions 88481 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r88481 | victor.stinner | 2011-02-21 22:13:44 +0100 (lun., 21 févr. 2011) | 4 lines
Fix PyUnicode_FromFormatV("%c") for non-BMP char
Issue #10830 : Fix PyUnicode_FromFormatV("%c") for non-BMP characters on
narrow build.
........
2011-02-23 12:14:22 +00:00
Victor Stinner
5ed8b2c737
Fix PyUnicode_FromFormatV("%c") for non-BMP char
...
Issue #10830 : Fix PyUnicode_FromFormatV("%c") for non-BMP characters on
narrow build.
2011-02-21 21:13:44 +00:00
Eric Smith
a1eac7218b
Issue #11302 : missing type check on _string.formatter_field_name_split and _string.formatter_parser caused crash.
...
Original patch by haypo, reviewed by me, okayed by Georg.
2011-01-29 11:15:35 +00:00
Victor Stinner
ca1e7ec344
test_unicode: use ctypes to test PyUnicode_FromFormat()
...
Instead of _testcapi.format_unicode() because it has a limited API: it requires
exactly one argument of type unicode.
2011-01-05 00:19:28 +00:00
Alexander Belopolsky
942af5a9a4
Issue #10557 : Fixed error messages from float() and other numeric
...
types. Added a new API function, PyUnicode_TransformDecimalToASCII(),
which transforms non-ASCII decimal digits in a Unicode string to their
ASCII equivalents.
2010-12-04 03:38:46 +00:00
Ezio Melotti
ed3a7d2d60
#10273 : Rename assertRegexpMatches and assertRaisesRegexp to assertRegex and assertRaisesRegex.
2010-12-01 02:32:32 +00:00
Antoine Pitrou
0662bc297a
Fix tests when ctypes isn't available
2010-11-22 16:19:04 +00:00
Ezio Melotti
19f2aeba67
Merged revisions 86596 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r86596 | ezio.melotti | 2010-11-20 21:04:17 +0200 (Sat, 20 Nov 2010) | 1 line
#9424 : Replace deprecated assert* methods in the Python test suite.
........
2010-11-21 01:30:29 +00:00
Ezio Melotti
b3aedd4862
#9424 : Replace deprecated assert* methods in the Python test suite.
2010-11-20 19:04:17 +00:00
Eric Smith
72f6620859
Removed unused test classes from test_format_map().
2010-11-06 14:43:26 +00:00
Eric Smith
27bbca6f79
Issue #6081 : Add str.format_map. str.format_map(mapping) is similar to str.format(**mapping), except mapping does not get converted to a dict.
2010-11-04 17:06:58 +00:00
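The difference described in this entry can be sketched as follows: because str.format_map() uses the mapping directly, a dict subclass keeps its `__missing__` hook, whereas str.format(**mapping) copies the items into a plain dict and loses it. The class `DefaultingMap` and the function `render` are hypothetical names introduced only for this illustration.

```python
class DefaultingMap(dict):
    """Hypothetical mapping that supplies a placeholder for missing keys."""

    def __missing__(self, key):
        return "<" + key + ">"


def render(template, values):
    """Format the template, leaving a visible marker for unknown fields."""
    return template.format_map(DefaultingMap(values))


rendered = render("{greeting}, {name}!", {"greeting": "Hello"})

# With ** unpacking the mapping is converted to a plain dict, so the
# __missing__ hook is never consulted and the lookup fails.
try:
    "{greeting}, {name}!".format(**DefaultingMap({"greeting": "Hello"}))
    unpacking_failed = False
except KeyError:
    unpacking_failed = True

print(rendered)
```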
Antoine Pitrou
43ffd5c013
Merged revisions 85861 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r85861 | antoine.pitrou | 2010-10-27 20:52:48 +0200 (mer., 27 oct. 2010) | 3 lines
Recode modules from latin-1 to utf-8
........
2010-10-27 18:54:06 +00:00
Antoine Pitrou
d72402effc
Recode modules from latin-1 to utf-8
2010-10-27 18:52:48 +00:00
Victor Stinner
9a90900da5
PyUnicode_FromFormatV(): Fix %A format
...
It was not completely implemented. Add a test.
2010-10-18 20:59:24 +00:00
Martin v. Löwis
baecd7243a
Upgrade to Unicode 6.0.0.
...
makeunicodedata.py: download all data files from unicode.org,
switch to extracting Unihan data from zip file.
Read linebreakprops and derivednormalizationprops even for
old versions, even though they are not used in delta records.
test_unicode.py: U+11000 is now assigned, use U+14000 instead.
2010-10-11 22:42:28 +00:00
Victor Stinner
46c7b3b283
Issue #8670 : Rename testcapi unicode test methods
...
* test_aswidechar() => unicode_aswidechar()
* test_aswidecharstring() => unicode_aswidecharstring()
2010-10-02 11:49:31 +00:00
Victor Stinner
ea3f305a25
Oops, revert unwanted _testcapi changes of r85174
2010-10-02 11:46:20 +00:00
Victor Stinner
749261e241
Issue #8670 : ctypes.c_wchar supports non-BMP characters with 32 bits wchar_t
2010-10-02 11:25:35 +00:00
Victor Stinner
5593d8aeb4
Issue #8670 : PyUnicode_AsWideChar() and PyUnicode_AsWideCharString() replace
...
UTF-16 surrogate pairs by single non-BMP characters for 16 bits Py_UNICODE
and 32 bits wchar_t (eg. Linux in narrow build).
2010-10-02 11:11:27 +00:00
Victor Stinner
1c24bd0252
Issue #8870 : PyUnicode_AsWideCharString() doesn't count the trailing nul character
...
And write unit tests for PyUnicode_AsWideChar() and PyUnicode_AsWideCharString().
2010-10-02 11:03:13 +00:00
Eric Smith
e4d6317c87
Issue 7994: Make object.__format__() raise a PendingDeprecationWarning
...
if the format string is not empty. Manually merge r79596 and r84772
from 2.x.
Also, apparently test_format() from test_builtin never made it into
3.x. I've added it as well. It tests the basic format()
infrastructure.
2010-09-13 20:48:43 +00:00
Florent Xicluna
a87b383ac1
Reenable test_ucs4 and remove some duplicated lines.
2010-09-13 02:28:18 +00:00
Victor Stinner
4c7db315df
Issue #9738 , #9836 : Fix refleak introduced by r84704
2010-09-12 07:51:18 +00:00
Victor Stinner
1205f2774e
Issue #9738 : PyUnicode_FromFormat() and PyErr_Format() raise an error on
...
a non-ASCII byte in the format string.
Also document the encoding.
2010-09-11 00:54:47 +00:00
Amaury Forgeot d'Arc
324ac65ceb
#5127 : Even on narrow unicode builds, the C functions that access the Unicode
...
Database (Py_UNICODE_TOLOWER, Py_UNICODE_ISDECIMAL, and others) now accept
and return characters from the full Unicode range (Py_UCS4).
The differences from Python code are few:
- unicodedata.numeric(), unicodedata.decimal() and unicodedata.digit()
now return the correct value for large code points
- repr() may consider more characters as printable.
2010-08-18 20:44:58 +00:00