1) #8271: when a byte sequence is invalid, only the start byte and all the
valid continuation bytes are now replaced by U+FFFD, instead of replacing
the number of bytes specified by the start byte.
See http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf (pages 94-95);
2) 5- and 6-bytes-long UTF-8 sequences are now considered invalid (no changes
in behavior);
3) Change the error messages "unexpected code byte" to "invalid start byte"
and "invalid data" to "invalid continuation byte";
4) Add an extensive set of tests in test_unicode;
5) Fix test_codeccallbacks because it was failing after this change.
svn+ssh://pythondev@svn.python.org/python/trunk
........
r81465 | georg.brandl | 2010-05-22 06:29:19 -0500 (Sat, 22 May 2010) | 2 lines
Issue #3924: Ignore cookies with invalid "version" field in cookielib.
........
r81466 | georg.brandl | 2010-05-22 06:31:16 -0500 (Sat, 22 May 2010) | 1 line
Underscore the name of an internal utility function.
........
r81468 | georg.brandl | 2010-05-22 06:43:25 -0500 (Sat, 22 May 2010) | 1 line
#8635: document enumerate() start parameter in docstring.
........
r81679 | benjamin.peterson | 2010-06-03 16:21:03 -0500 (Thu, 03 Jun 2010) | 1 line
use a set for membership testing
........
r81735 | michael.foord | 2010-06-05 06:46:59 -0500 (Sat, 05 Jun 2010) | 1 line
Extract error message truncating into a method (unittest.TestCase._truncateMessage).
........
r81760 | michael.foord | 2010-06-05 14:38:42 -0500 (Sat, 05 Jun 2010) | 1 line
Issue 8302. SkipTest exception is setUpClass or setUpModule is now reported as a skip rather than an error.
........
r81868 | benjamin.peterson | 2010-06-09 14:45:04 -0500 (Wed, 09 Jun 2010) | 1 line
fix code formatting
........
r82183 | benjamin.peterson | 2010-06-23 15:29:26 -0500 (Wed, 23 Jun 2010) | 1 line
cpython only gc tests
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r82157 | benjamin.peterson | 2010-06-22 14:16:37 -0500 (Tue, 22 Jun 2010) | 1 line
remove INT_MAX assertions; they can fail with large Py_ssize_t #9058
........
mode raises unicode errors. The encoder only supports "strict" and "replace"
error handlers, the decoder only supports "strict" and "ignore" error handlers.
svn+ssh://pythondev@svn.python.org/python/trunk
........
r81907 | antoine.pitrou | 2010-06-11 23:42:26 +0200 (ven., 11 juin 2010) | 5 lines
Issue #8941: decoding big endian UTF-32 data in UCS-2 builds could crash
the interpreter with characters outside the Basic Multilingual Plane
(higher than 0x10000).
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r81820 | benjamin.peterson | 2010-06-07 17:23:23 -0500 (Mon, 07 Jun 2010) | 1 line
correctly overflow when indexes are too large
........
(instances of int, float, complex, decimal.Decimal and
fractions.Fraction) that makes it easy to maintain the invariant that
hash(x) == hash(y) whenever x and y have equal value.
objects. (1) The comparison could incorrectly return True in some cases
(2**53+1 == complex(2**53) == 2**53), breaking transivity of equality.
(2) The comparison raised an OverflowError for large integers, leading
to unpredictable exceptions when combining integers and complex objects
in sets or dicts.
Patch by Meador Inge.
_PyUnicode_AsDefaultEncodedString() uses a magical PyUnicode attribute to
automatically destroy PyUnicode_EncodeUTF8() result when the unicode string is
destroyed.
object to Py_FileSystemDefaultEncoding with the "surrogateescape" error
handler, return a bytes object. If Py_FileSystemDefaultEncoding is not set,
fall back to UTF-8.