Victor Stinner
e215d960be
Issue #16147 : Rewrite PyUnicode_FromFormatV() to use _PyUnicodeWriter API
* Simplify the code: replace the 4 steps with a single step using the
_PyUnicodeWriter API. PyUnicode_Format() has the same design. This avoids
storing intermediate results, which required allocating an array of pointers
on the heap.
* Use the _PyUnicodeWriter API for speed (and its convenient API):
overallocate the buffer to reduce the number of "realloc()" calls
* Implement "width" and "precision" in Python, don't rely on sprintf(). This
avoids the need for a temporary buffer allocated on the heap: only a small
buffer allocated on the stack is used.
* Add _PyUnicodeWriter_WriteCstr() function
* Split PyUnicode_FromFormatV() into two functions: add
unicode_fromformat_arg().
* Inline parse_format_flags(): the format of an argument is now parsed only
once, so a subfunction is no longer needed.
* Optimize PyUnicode_FromFormatV() for characters between two "%" arguments:
search for the next "%" and copy the substring in one chunk, instead of
copying character by character.
2012-10-06 23:03:36 +02:00
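The chunk-copy idea from the last bullet above can be sketched in Python. This is only an illustration, not the C code from the commit: `format_chunks` is a hypothetical helper, and it handles just `%s`, `%d` and `%%`.

```python
def format_chunks(fmt, args):
    """Toy %-formatter (hypothetical): literal text between "%" arguments
    is copied as one slice instead of character by character."""
    out = []
    args = iter(args)
    pos = 0
    while pos < len(fmt):
        nxt = fmt.find("%", pos)       # search for the next "%"
        if nxt == -1:
            out.append(fmt[pos:])      # no more arguments: copy the tail
            break
        if nxt > pos:
            out.append(fmt[pos:nxt])   # copy the literal run in one chunk
        spec = fmt[nxt + 1:nxt + 2]
        if spec == "%":
            out.append("%")
        elif spec == "s":
            out.append(str(next(args)))
        elif spec == "d":
            out.append(str(int(next(args))))
        else:
            raise ValueError("unsupported format at index %d" % nxt)
        pos = nxt + 2
    return "".join(out)


message = format_chunks("hello %s, you are %d%% done", ["bob", 30])
```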
Benjamin Peterson
4eda93723e
add another testcase
2012-08-05 15:05:34 -07:00
Brett Cannon
acc0c181a8
Remove a now worthless test.
2012-05-12 17:40:28 -04:00
Victor Stinner
f59c28c930
unicode_writer_finish() checks string consistency
2012-05-09 03:24:14 +02:00
Victor Stinner
ece58deb9f
Close #14648 : Compute correctly maxchar in str.format() for substrin
2012-04-23 23:36:38 +02:00
Benjamin Peterson
80d07f8251
inherit maxchar of field value where needed ( closes #14648 )
2012-04-23 10:55:29 -04:00
Eric V. Smith
97722c4132
str.format_map tests don't do what they say: fix to actually implement the intent of the test. Closes #13450 . Patch by Akira Li.
2012-03-12 15:26:21 -07:00
Eric V. Smith
edbb6ca084
str.format_map tests don't do what they say: fix to actually implement the intent of the test. Closes #13450 .
2012-03-12 15:16:22 -07:00
Benjamin Peterson
d5890c8db5
add str.casefold() ( closes #13752 )
2012-01-14 13:23:30 -05:00
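A short sketch of what str.casefold() is for, assuming Python 3.3 or later; the sample strings are chosen only for illustration:

```python
# str.casefold() applies full Unicode case folding, which is more
# aggressive than str.lower() and is meant for caseless comparison.
german = "Straße"

lowered = german.lower()        # keeps the sharp s
folded = german.casefold()      # folds the sharp s to "ss"

# Caseless comparison succeeds only with casefold().
matches_with_lower = lowered == "STRASSE".lower()
matches_with_casefold = folded == "STRASSE".casefold()
```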
Benjamin Peterson
b2bf01d824
use full unicode mappings for upper/lower/title case ( #12736 )
Also broaden the category of characters that count as lowercase/uppercase.
2012-01-11 18:17:06 -05:00
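As a small illustration of full case mappings (assuming Python 3.3 or later), a single character can map to several characters when its case is changed:

```python
# With full Unicode case mappings, upper() may change the string length.
sharp_s = "\u00df"       # LATIN SMALL LETTER SHARP S
fi_ligature = "\ufb01"   # LATIN SMALL LIGATURE FI

upper_sharp_s = sharp_s.upper()
upper_ligature = fi_ligature.upper()
```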
Victor Stinner
6345be9a14
Close #13093 : PyUnicode_EncodeDecimal() doesn't support error handlers
other than "strict" anymore. The caller was unable to compute the
size of the output buffer: it depends on the error handler.
2011-11-25 20:09:01 +01:00
Victor Stinner
b84d723509
(Merge 3.2) Issue #13093 : Fix error handling on PyUnicode_EncodeDecimal()
2011-11-22 01:50:07 +01:00
Victor Stinner
c814a38f3f
Add a test on str.__getnewargs__()
It indirectly tests PyUnicode_Copy(): ensure that the string is a copy.
2011-11-22 01:06:15 +01:00
Victor Stinner
42bf77537e
Rewrite PyUnicode_EncodeDecimal() to use the new Unicode API
Add tests for PyUnicode_EncodeDecimal() and
PyUnicode_TransformDecimalToASCII().
2011-11-21 22:52:58 +01:00
Victor Stinner
040e16e3e8
"unicode_internal" codec has been deprecated: fix related tests
2011-11-15 22:44:05 +01:00
Antoine Pitrou
78edf7576e
Issue #13333 : The UTF-7 decoder now accepts lone surrogates
(the encoder already accepts them).
2011-11-15 01:44:16 +01:00
Antoine Pitrou
5418ee0b9a
Issue #13333 : The UTF-7 decoder now accepts lone surrogates
(the encoder already accepts them).
2011-11-15 01:42:21 +01:00
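The behavior described in the two UTF-7 entries above can be checked with a round trip. This is a sketch assuming a CPython version that includes the change; the sample string is arbitrary:

```python
# A lone surrogate followed by an ordinary character.
lone = "\ud801x"

encoded = lone.encode("utf-7")      # the encoder already accepted surrogates
decoded = encoded.decode("utf-7")   # the decoder now accepts them as well

# For contrast, the strict UTF-8 encoder rejects lone surrogates.
try:
    lone.encode("utf-8")
    utf8_rejects = False
except UnicodeEncodeError:
    utf8_rejects = True
```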
Ezio Melotti
40dc919b0d
Fix range in test.
2011-11-11 17:00:46 +02:00
Antoine Pitrou
51f6648a31
Make test more inclusive
2011-11-11 13:35:44 +01:00
Antoine Pitrou
dffab19218
Enable commented out test
2011-11-11 13:31:59 +01:00
Antoine Pitrou
2c3b2302ad
Issue #13134 : optimize finding single-character strings using memchr
2011-10-11 20:29:21 +02:00
Antoine Pitrou
798b4df812
test_unicode was forgetting to run the common string tests for str.find()
2011-10-08 22:42:00 +02:00
Antoine Pitrou
c0bbe7d38a
test_unicode was forgetting to run the common string tests for str.find()
2011-10-08 22:41:35 +02:00
Victor Stinner
1d972ad12a
Mark 'abc'.expandtabs() optimization as specific to CPython
Also improve the str.replace(a, a) test
2011-10-07 13:31:46 +02:00
Victor Stinner
59de0ee9e0
str.replace(a, a) is now returning str unchanged if a is a
2011-10-07 10:01:28 +02:00
Ezio Melotti
a9860aeb08
#13054 : fix usage of sys.maxunicode after PEP-393.
2011-10-04 19:06:00 +03:00
Antoine Pitrou
e19aa388e8
When expandtabs() would be a no-op, don't create a duplicate string
2011-10-04 16:04:01 +02:00
Victor Stinner
07ac3ebd7b
Optimize unicode_subtype_new(): don't encode to wchar_t and decode from wchar_t
Rewrite unicode_subtype_new(): directly allocate the right type.
2011-10-01 16:16:43 +02:00
Benjamin Peterson
811c2f1369
remove "fast-path" for (i)adding strings
These were just an artifact of the old unicode concatenation hack and likely
just penalized other kinds of adding. Also, this fixes __(i)add__ on string
subclasses.
2011-09-30 21:31:21 -04:00
Martin v. Löwis
287eca658d
Fix struct sizes. Drop -1, since the resulting string was actually the largest one
that could be allocated.
2011-09-28 10:03:28 +02:00
Martin v. Löwis
d63a3b8beb
Implement PEP 393.
2011-09-28 07:41:54 +02:00
Ezio Melotti
a3fbde3504
Merge indentation fix and skip decorator with 3.2.
2011-08-23 00:40:09 +03:00
Ezio Melotti
a5c92b4714
Fix indentation and add a skip decorator.
2011-08-23 00:37:08 +03:00
Ezio Melotti
6f2a683a0c
#9200 : merge with 3.2.
2011-08-22 20:31:11 +03:00
Ezio Melotti
93e7afc5d9
#9200 : The str.is* methods now work with strings that contain non-BMP characters even in narrow Unicode builds.
2011-08-22 14:08:38 +03:00
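A minimal check of the str.is* methods on characters outside the BMP. It assumes Python 3.3 or later, where PEP 393 removed narrow builds; the sample code points are chosen only for illustration:

```python
# Code points above U+FFFF (outside the Basic Multilingual Plane).
deseret_capital = "\U00010400"   # DESERET CAPITAL LETTER LONG I
deseret_small = "\U00010428"     # DESERET SMALL LETTER LONG I
math_digit = "\U0001D7F6"        # MATHEMATICAL MONOSPACE DIGIT ZERO

results = {
    "isupper": deseret_capital.isupper(),
    "islower": deseret_small.islower(),
    "isalpha": deseret_capital.isalpha(),
    "isdigit": math_digit.isdigit(),
}
```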
Benjamin Peterson
f8e7543df9
merge 3.2 ( #12732 )
2011-08-12 22:18:19 -05:00
Benjamin Peterson
f413b80806
in narrow builds, make sure to test codepoints as identifier characters ( closes #12732 )
This fixes the use of Unicode identifiers outside the BMP in narrow builds.
2011-08-12 22:17:18 -05:00
Victor Stinner
ab1d16b456
Issue #13093 : Fix error handling on PyUnicode_EncodeDecimal()
* Add tests for PyUnicode_EncodeDecimal() and PyUnicode_TransformDecimalToASCII()
* Remove the unused "e" variable in replace()
2011-11-22 01:45:37 +01:00
Eric V. Smith
c12469df22
Merge from 3.2.
2011-07-18 14:08:55 -04:00
Eric V. Smith
12ebefc9d3
Closes #12579 . Positional fields with str.format_map() now raise a ValueError instead of SystemError.
2011-07-18 14:03:41 -04:00
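A short sketch of str.format_map() with named and positional fields. The `Default` class is a hypothetical helper used only to show that the mapping is consulted directly:

```python
class Default(dict):
    """Hypothetical mapping that supplies a placeholder for missing keys."""

    def __missing__(self, key):
        return "<" + key + ">"


# Named fields are looked up in the mapping, including via __missing__.
named = "{user} has {count} items".format_map(Default(user="ann"))

# Positional fields have no meaning for a mapping and raise ValueError.
try:
    "{0} items".format_map({"count": 3})
    positional_error = None
except ValueError as exc:
    positional_error = exc
```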
Senthil Kumaran
bc9d8f838b
merge from 3.2
2011-07-03 21:05:25 -07:00
Senthil Kumaran
9ebe08d2f6
Closes issue12471: fix wrong TypeError message when '%i' format spec was used.
2011-07-03 21:03:16 -07:00
Ezio Melotti
bf1253b25a
#6780 : merge with 3.2.
2011-04-26 06:45:24 +03:00
Ezio Melotti
f2b3f780a1
#6780 : merge with 3.1.
2011-04-26 06:40:59 +03:00
Ezio Melotti
ba42fd5801
#6780 : fix starts/endswith error message to mention that tuples are accepted too.
2011-04-26 06:09:45 +03:00
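For reference, a small example of the tuple form that the error message now mentions; the file name is made up for illustration:

```python
name = "report.csv"

# A tuple of suffixes is accepted: True if any of them matches.
is_table = name.endswith((".csv", ".tsv"))

# Other sequences, such as a list, are rejected with a TypeError
# whose message mentions that a tuple is accepted.
try:
    name.endswith([".csv", ".tsv"])
    message = None
except TypeError as exc:
    message = str(exc)
```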
Eric V. Smith
b9cd3531c4
Issue 9856: Change object.__format__ with a non-empty format string from a PendingDeprecationWarning to a DeprecationWarning.
2011-03-12 10:08:48 -05:00
Victor Stinner
6d970f4713
Issue #10831 : PyUnicode_FromFormat() supports %li, %lli and %zi formats
2011-03-02 00:04:25 +00:00
Victor Stinner
968654515f
Issue #10829 : Refactor PyUnicode_FromFormat()
* Use the same function to parse the format string in the 3 steps
* Fix crashes on invalid format strings
2011-03-01 23:44:09 +00:00
Victor Stinner
2b574a2332
Merged revisions 88697 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r88697 | victor.stinner | 2011-03-01 23:46:52 +0100 (Tue, 01 Mar 2011) | 4 lines
Issue #11246 : Fix PyUnicode_FromFormat("%V")
Decode the byte string from UTF-8 (with replace error handler) instead of
ISO-8859-1 (in strict mode). Patch written by Ray Allen.
........
2011-03-01 22:48:49 +00:00
Victor Stinner
2512a8b62e
Issue #11246 : Fix PyUnicode_FromFormat("%V")
Decode the byte string from UTF-8 (with replace error handler) instead of
ISO-8859-1 (in strict mode). Patch written by Ray Allen.
2011-03-01 22:46:52 +00:00