Mark Dickinson
75d3600466
Issue #14700 : Fix buggy overflow checks for large precision and width in new-style and old-style formatting.
2012-10-28 10:00:46 +00:00
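As a rough illustration of what 75d3600466 guards against, the sketch below passes a width and a precision larger than sys.maxsize to both old-style (%) and new-style (format) formatting. It assumes a Python 3 interpreter rather than the 2.x branch this log covers. The helper name formatting_error is hypothetical, and the exact exception type is an assumption, so the helper accepts either ValueError or OverflowError.

```python
import sys


def formatting_error(func):
    """Return the name of the exception raised by func(), or None if it succeeds."""
    try:
        func()
    except (ValueError, OverflowError) as exc:
        return type(exc).__name__
    return None


# One past the largest value that fits in a Py_ssize_t.
huge = sys.maxsize + 1

results = {
    # Old-style formatting: "%.<huge>f" % value and "%<huge>f" % value.
    "old-style precision": formatting_error(lambda: "%.{}f".format(huge) % 2.34),
    "old-style width": formatting_error(lambda: "%{}f".format(huge) % 2.34),
    # New-style formatting: format(value, ".<huge>f") and format(value, "<huge>f").
    "new-style precision": formatting_error(lambda: format(2.34, ".{}f".format(huge))),
    "new-style width": formatting_error(lambda: format(2.34, "{}f".format(huge))),
}

for label, error_name in results.items():
    print(label, "->", error_name)
```

Each case is expected to fail cleanly with an exception instead of silently wrapping around to a small width or precision.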
Victor Stinner
975134e2a2
Issue #13093 : Fix error handling on PyUnicode_EncodeDecimal()
...
Add tests for PyUnicode_EncodeDecimal()
2011-11-22 01:54:19 +01:00
Antoine Pitrou
30402549de
Issue #13333 : The UTF-7 decoder now accepts lone surrogates
...
(the encoder already accepts them).
2011-11-15 01:49:40 +01:00
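A minimal round-trip sketch of the behavior described in 30402549de, assuming a Python 3 interpreter (the commit itself targets the 2.x branch): a lone surrogate is encoded to UTF-7 and decoded back.

```python
# A lone high surrogate with no matching low surrogate.
lone = "\ud800"

# The encoder accepts it, and the decoder is expected to accept the result.
encoded = lone.encode("utf-7")
decoded = encoded.decode("utf-7")

print(encoded, decoded == lone)
```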
Ezio Melotti
12682b10a7
#9200 : backport tests but run them on wide builds only.
2011-08-22 23:46:30 +03:00
Ezio Melotti
ea7b6f6e2a
#12266 : move the tests in test_unicode.
2011-08-15 10:04:28 +03:00
Ezio Melotti
e3685f6b1b
#6780 : fix starts/endswith error message to mention that tuples are accepted too.
2011-04-26 05:12:51 +03:00
Ezio Melotti
370d85cee4
Python 2 can encode/decode surrogates to utf-8. Add a test for this.
2011-02-28 01:42:29 +00:00
Antoine Pitrou
b27ddc72ea
Merged revisions 85861 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r85861 | antoine.pitrou | 2010-10-27 20:52:48 +0200 (mer., 27 oct. 2010) | 3 lines
Recode modules from latin-1 to utf-8
........
2010-10-27 18:58:04 +00:00
Florent Xicluna
c0c0b14671
Strengthen test_unicode with explicit type checking for assertEqual tests.
2010-09-13 08:53:00 +00:00
Florent Xicluna
60d512c3b0
Check PendingDeprecationWarning after issue #7994 .
2010-09-13 08:21:43 +00:00
Florent Xicluna
9b90cd1f7b
Merged revisions 84470-84471,84566-84567,84759 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r84470 | florent.xicluna | 2010-09-03 22:00:37 +0200 (ven., 03 sept. 2010) | 1 line
Strengthen BytesWarning tests.
........
r84471 | florent.xicluna | 2010-09-03 22:23:40 +0200 (ven., 03 sept. 2010) | 1 line
Typo
........
r84566 | florent.xicluna | 2010-09-06 22:27:15 +0200 (lun., 06 sept. 2010) | 1 line
typo
........
r84567 | florent.xicluna | 2010-09-06 22:27:55 +0200 (lun., 06 sept. 2010) | 1 line
typo
........
r84759 | florent.xicluna | 2010-09-13 04:28:18 +0200 (lun., 13 sept. 2010) | 1 line
Reenable test_ucs4 and remove some duplicated lines.
........
2010-09-13 07:46:37 +00:00
Stefan Krah
0b9201fa1c
Sub-issue of #9036 : Fix incorrect use of Py_CHARMASK.
2010-07-19 18:06:46 +00:00
Benjamin Peterson
eabdeba25e
use unicode literals
2010-06-07 22:33:09 +00:00
Benjamin Peterson
13e934acc0
correctly overflow when indexes are too large
2010-06-07 22:23:23 +00:00
Ezio Melotti
ab2eb0ee84
Add a NEWS entry for r81758 and clarify a comment.
2010-06-05 19:21:32 +00:00
Ezio Melotti
e57e50c8e7
Update PyUnicode_DecodeUTF8 from RFC 2279 to RFC 3629.
...
1) #8271 : when a byte sequence is invalid, only the start byte and all the
valid continuation bytes are now replaced by U+FFFD, instead of replacing
the number of bytes specified by the start byte.
See http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf (pages 94-95);
2) 5- and 6-bytes-long UTF-8 sequences are now considered invalid (no changes
in behavior);
3) Add code and tests to reject surrogates (U+D800-U+DFFF) as defined in
RFC 3629, but leave it commented out since it's not backward compatible;
4) Change the error messages "unexpected code byte" to "invalid start byte"
and "invalid data" to "invalid continuation byte";
5) Add an extensive set of tests in test_unicode;
6) Fix test_codeccallbacks because it was failing after this change.
2010-06-05 17:51:07 +00:00
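The decoder changes listed in e57e50c8e7 can be sketched as follows. This is an illustration under assumptions, not the commit's own test code: it runs on Python 3, whose UTF-8 decoder follows the same RFC 3629 rules, and the helper names decode_replace and decode_error_reason are hypothetical.

```python
def decode_replace(data):
    """Decode UTF-8, substituting U+FFFD for each invalid sequence."""
    return data.decode("utf-8", "replace")


def decode_error_reason(data):
    """Return the error reason from strict decoding, or None if data is valid."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.reason
    return None


# 0xF0 announces a 4-byte sequence, but the next byte is plain ASCII.
# Only the start byte is replaced; the "A" survives (point 1).
truncated = b"\xf0A"

# 0xF8 used to introduce a 5-byte sequence under RFC 2279; it is now
# an invalid start byte (points 2 and 4).
five_byte = b"\xf8\x88\x80\x80\x80"

print(repr(decode_replace(truncated)), decode_error_reason(truncated))
print(repr(decode_replace(five_byte)), decode_error_reason(five_byte))
```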
Georg Brandl
f0757a2937
#8016 : add the CP858 codec (approved by Benjamin). (Also add CP720 to the tests, it was missing there.)
2010-05-24 21:29:07 +00:00
Victor Stinner
c7790ed163
Fix the NEWS about my last commit: a unicode subclass can now override the
...
__unicode__ method (and not the __str__ method).
Also simplify the test case.
2010-03-22 12:36:28 +00:00
Victor Stinner
95affc4449
Issue #1583863 : A unicode subclass can now override the __str__ method
2010-03-22 12:24:37 +00:00
Florent Xicluna
6de9e938a5
Issue #7849 : Now the utility ``check_warnings`` verifies if the warnings are
...
actually raised. A new utility ``check_py3k_warnings`` deals with py3k warnings.
2010-03-07 12:18:33 +00:00
Victor Stinner
f20f9c299e
Issue #7649 : Fix u'%c' % char for character in range 0x80..0xFF
...
=> raise an UnicodeDecodeError. Patch written by Ezio Melotti.
2010-02-23 23:16:07 +00:00
Ezio Melotti
aa98058cc4
use assert[Not]In where appropriate
2010-01-23 23:04:36 +00:00
Antoine Pitrou
5b7139aab4
Issue #7462 : Implement the stringlib fast search algorithm for the `rfind`,
...
`rindex`, `rsplit` and `rpartition` methods. Patch by Florent Xicluna.
2010-01-02 21:12:58 +00:00
R. David Murray
0a0a1a842c
Issue #1680159 : unicode coercion during an 'in' operation was masking
...
any errors that might occur during coercion of the left operand and
turning them into a TypeError with a message text that was confusing in
the given context. This patch lets any errors through, as was already
done during coercion of the right hand side.
2009-12-14 16:28:26 +00:00
Benjamin Peterson
332d721750
add keyword arguments support to str/unicode encode and decode #6300
2009-09-18 21:14:55 +00:00
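A small usage sketch for the keyword-argument support added in 332d721750, written for Python 3, where str.encode and bytes.decode accept the same encoding and errors keywords. The sample string is made up for illustration.

```python
text = "caf\u00e9"

# Both arguments can be passed by name instead of by position.
encoded = text.encode(encoding="utf-8", errors="strict")
decoded = encoded.decode(encoding="utf-8", errors="strict")

# Naming the error handler makes the intent of a lossy conversion explicit.
lossy = text.encode(encoding="ascii", errors="replace")

print(encoded, decoded, lossy)
```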
Benjamin Peterson
5c8da86f3a
convert usage of fail* to assert*
2009-06-30 22:57:08 +00:00
Eric Smith
4b94b192ff
Issue 6089: str.format raises SystemError.
2009-05-23 13:56:13 +00:00
Antoine Pitrou
653dece278
Issue #4426 : The UTF-7 decoder was too strict and didn't accept some legal sequences.
...
Patch by Nick Barnes and Victor Stinner.
2009-05-04 18:32:32 +00:00
Eric Smith
2ace4cf813
Unicode format tests weren't actually testing unicode. This was probably due to the original backport from py3k.
2009-03-14 14:37:38 +00:00
Eric Smith
6f42edb682
Issue 5237, Allow auto-numbered replacement fields in str.format() strings.
...
For simple uses of str.format(), this makes the typing easier. Hopefully this
will help in the adoption of str.format().
For example:
'The {} is {}'.format('sky', 'blue')
You can mix and match auto-numbering and named replacement fields:
'The {} is {color}'.format('sky', color='blue')
But you can't mix and match auto-numbering and specified numbering:
'The {0} is {}'.format('sky', 'blue')
ValueError: cannot switch from manual field specification to automatic field numbering
Will port to 3.1.
2009-03-14 11:57:26 +00:00
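The three cases from the 6f42edb682 message can be run together as below. This is a sketch for Python 3, where the same auto-numbering rules apply; the variable names are illustrative only.

```python
# Auto-numbered fields are filled from the positional arguments in order.
auto = "The {} is {}".format("sky", "blue")

# Auto-numbered and named fields can be mixed.
mixed = "The {} is {color}".format("sky", color="blue")

# Explicitly numbered and auto-numbered fields cannot be mixed.
try:
    "The {0} is {}".format("sky", "blue")
except ValueError as exc:
    message = str(exc)
else:
    message = None

print(auto)
print(mixed)
print(message)
```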
Antoine Pitrou
187ac1bda4
#3601 : test_unicode.test_raiseMemError fails in UCS4
...
Reviewed by Benjamin Peterson on IRC.
2008-09-05 22:04:54 +00:00
Antoine Pitrou
fd7c43e7be
#3556 : test_raiseMemError consumes an insane amount of memory
2008-08-17 17:01:49 +00:00
Amaury Forgeot d'Arc
06847b13ca
Correct a crash when two successive unicode allocations fail with a MemoryError:
...
the freelist contained half-initialized objects with freed pointers.
The comment
/* XXX UNREF/NEWREF interface should be more symmetrical */
was copied from tupleobject.c, and appears in some other places.
I sign the petition.
2008-07-31 23:39:05 +00:00
Antoine Pitrou
4982d5d04a
#2242 : utf7 decoding crashes on bogus input on some Windows/MSVC versions
2008-07-25 17:45:59 +00:00
Amaury Forgeot d'Arc
9a0d3462fc
#1477 : ur'\U0010FFFF' raised in narrow unicode builds.
...
Corrected the raw-unicode-escape codec to use UTF-16 surrogates in
this case, just like the unicode-escape codec.
2008-03-23 09:55:29 +00:00
Christian Heimes
c5f05e45cf
Patch #2167 from calvin: Remove unused imports
2008-02-23 17:40:11 +00:00
Eric Smith
bc32fee029
Added code to correct combining str and unicode in ''.format(). Added test case.
2008-02-18 18:02:34 +00:00
Eric Smith
a9f7d62480
Backport of PEP 3101, Advanced String Formatting, from py3k.
...
Highlights:
- Adding PyObject_Format.
- Adding string.Format class.
- Adding __format__ for str, unicode, int, long, float, datetime.
- Adding builtin format.
- Adding ''.format and u''.format.
- str/unicode fixups for formatters.
The files in Objects/stringlib that implement PEP 3101 (stringdefs.h,
unicodedefs.h, formatter.h, string_format.h) are identical in trunk
and py3k. Any changes from here on should be made to trunk, and
changes will propagate to py3k.
2008-02-17 19:46:49 +00:00
Kurt B. Kaiser
db98f3632a
Fix failing unicode test caused by change to ast.c at r56441
2007-07-18 19:58:42 +00:00
Neal Norwitz
ba965deea8
Prevent these tests from running on Win64 since they don't apply there either
2007-06-11 02:14:39 +00:00
Neal Norwitz
7dbd2a3720
Prevent expandtabs() on string and unicode objects from causing a segfault when
...
a large width is passed on 32-bit platforms. Found by Google.
It would be good for people to review this especially carefully and verify
I don't have an off by one error and there is no other way to cause overflow.
2007-06-09 03:36:34 +00:00
Collin Winter
c2898c5a67
Standardize on test.test_support.run_unittest() (as opposed to a mix of run_unittest() and run_suite()). Also, add functionality to run_unittest() that admits usage of unittest.TestLoader.loadTestsFromModule().
2007-04-25 17:29:52 +00:00
Neal Norwitz
17753ecbfa
Patch #1541585 : fix buffer overrun when performing repr() on
...
a unicode string in a build with wide unicode (UCS-4) support.
This code could be improved, so add an XXX comment.
2006-08-21 22:21:19 +00:00
Tim Peters
4511a713d5
Whitespace normalization.
2006-05-03 04:46:14 +00:00
Georg Brandl
de9b624fb9
Bug #1473625 : stop cPickle making float dumps locale dependent in protocol 0.
...
On the way, add a decorator to test_support to facilitate running single
test functions in different locales with automatic cleanup.
2006-04-30 11:13:56 +00:00
Anthony Baxter
67b6d516ce
Fixed bug #1459029 - unicode reprs were double-escaped.
2006-03-30 10:54:07 +00:00
Georg Brandl
da6b107745
Check in the test of patch #1400181 .
2006-01-20 17:48:54 +00:00
Hye-Shik Chang
835b243c71
Bug #1379994 : Fix *unicode_escape codecs to encode r'\' as r'\\'
...
just like string codecs.
2005-12-17 04:38:31 +00:00
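A minimal sketch of the escaping rule fixed in 835b243c71, assuming Python 3 and showing only the unicode_escape codec: a single backslash is encoded as two backslashes, so the result decodes back to the original string.

```python
# One backslash character.
backslash = "\\"

# Encoding doubles the backslash so that decoding is unambiguous.
encoded = backslash.encode("unicode_escape")
decoded = encoded.decode("unicode_escape")

print(encoded, decoded == backslash)
```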
Neal Norwitz
430f68b447
Move registration of the codec search function to the module scope
...
so it is only executed once. Otherwise the same search function is
repeatedly added to the codec search path when regrtest is run with -R
and leaks are reported.
2005-11-24 22:00:56 +00:00
Neil Schemenauer
cf52c07843
Change the %s format specifier for str objects so that it returns a
...
unicode instance if the argument is not an instance of basestring and
calling __str__ on the argument returns a unicode instance.
2005-08-12 17:34:58 +00:00