svn+ssh://svn.python.org/python/branches/py3k
........
r83444 | georg.brandl | 2010-08-01 22:51:02 +0200 (Sun, 01 Aug 2010) | 1 line
Revert r83395: it introduces test failures and is unnecessary, since we now have to nul-terminate the string anyway.
........
svn+ssh://svn.python.org/python/branches/py3k
........
r83395 | georg.brandl | 2010-08-01 10:49:18 +0200 (Sun, 01 Aug 2010) | 1 line
#8821: do not rely on Unicode strings being terminated with \u0000; instead, explicitly check the range before looking for a second surrogate character.
........
r83417 | georg.brandl | 2010-08-01 20:38:26 +0200 (Sun, 01 Aug 2010) | 1 line
#5776: fix mistakes in python specfile. (Probably nobody uses it anyway.)
........
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r82413 | ezio.melotti | 2010-07-01 10:32:02 +0300 (Thu, 01 Jul 2010) | 13 lines
Update PyUnicode_DecodeUTF8 from RFC 2279 to RFC 3629.
1) #8271: when a byte sequence is invalid, only the start byte and all the
valid continuation bytes are now replaced by U+FFFD, instead of replacing
the number of bytes specified by the start byte.
See http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf (pages 94-95);
2) 5- and 6-byte-long UTF-8 sequences are now considered invalid (no changes
in behavior);
3) Change the error messages "unexpected code byte" to "invalid start byte"
and "invalid data" to "invalid continuation byte";
4) Add an extensive set of tests in test_unicode;
5) Fix test_codeccallbacks because it was failing after this change.
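
As an illustration of points 1 to 3, here is a minimal sketch of what a current
CPython 3 interpreter reports for such input. The helper name decode_reason is
hypothetical and is not part of this commit.

```python
def decode_reason(data: bytes) -> str:
    """Return the UnicodeDecodeError reason for strict UTF-8 decoding.

    Hypothetical helper for illustration; returns "" if decoding succeeds.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.reason
    return ""


# 0xF8 would open a 5-byte sequence under RFC 2279; RFC 3629 forbids it.
five_byte_reason = decode_reason(b"\xf8\x88\x80\x80\x80")

# 0xE2 announces a 3-byte sequence, but "A" is not a continuation byte.
bad_continuation_reason = decode_reason(b"\xe2A")

# Only the start byte and its valid continuation byte are replaced by a
# single U+FFFD; the following "A" is decoded normally.
replaced = b"\xe2\x82A".decode("utf-8", "replace")
```

With the "replace" handler the trailing "A" survives, where the old behavior
would have replaced as many bytes as the start byte announced.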
........
r82468 | ezio.melotti | 2010-07-03 07:52:19 +0300 (Sat, 03 Jul 2010) | 1 line
Update comment about surrogates.
........
svn+ssh://pythondev@svn.python.org/python/branches/py3k
................
r79281 | victor.stinner | 2010-03-22 13:50:40 +0100 (Mon, 22 Mar 2010) | 16 lines
Merged revisions 79278,79280 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r79278 | victor.stinner | 2010-03-22 13:24:37 +0100 (Mon, 22 Mar 2010) | 2 lines
Issue #1583863: A unicode subclass can now override the __str__ method
........
r79280 | victor.stinner | 2010-03-22 13:36:28 +0100 (Mon, 22 Mar 2010) | 5 lines
Fix the NEWS about my last commit: a unicode subclass can now override the
__unicode__ method (and not the __str__ method).
Also simplify the test case.
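
The commit above concerns the Python 2 unicode type and its __unicode__ method.
As a rough Python 3 analogue only (an assumption, since str plays the role of
unicode there), a subclass that overrides __str__ is honored by str(). The class
name Shouting is hypothetical.

```python
class Shouting(str):
    """Hypothetical str subclass that overrides the string conversion."""

    def __str__(self):
        # Return a plain str built from the underlying value.
        return self.upper()


value = Shouting("hello")

# str() dispatches to the subclass's __str__ instead of returning the
# underlying value unchanged.
converted = str(value)
```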
........
................
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r77395 | benjamin.peterson | 2010-01-09 15:45:28 -0600 (Sat, 09 Jan 2010) | 2 lines
Python strings ending with '\0' should not be equivalent to their C counterparts in PyUnicode_CompareWithASCIIString
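
A minimal sketch of the intended behavior, calling the C function through ctypes.
This assumes a CPython build where ctypes.pythonapi exposes the symbol; the
argtypes and restype below mirror the documented C signature.

```python
import ctypes

# int PyUnicode_CompareWithASCIIString(PyObject *uni, const char *string)
compare = ctypes.pythonapi.PyUnicode_CompareWithASCIIString
compare.argtypes = [ctypes.py_object, ctypes.c_char_p]
compare.restype = ctypes.c_int

# Identical content compares equal (result 0).
same = compare("abc", b"abc")

# The Python string has an extra trailing NUL character, so it must not
# compare equal to the C string "abc", whose NUL is only a terminator.
trailing_nul = compare("abc\0", b"abc")
```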
........
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r73698 | amaury.forgeotdarc | 2009-06-30 00:36:49 +0200 (Tue, 30 Jun 2009) | 7 lines
#6373: SystemError in str.encode('latin1', 'surrogateescape')
if the string contains unpaired surrogates.
(In debug build, crash in assert())
This can happen with normal processing, if python starts with utf-8,
then calls sys.setfilesystemencoding('latin-1')
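
For context, a small sketch of how the surrogateescape handler behaves with
latin-1 on a current CPython 3. It is not necessarily the exact input from the
report; the variable names are illustrative.

```python
# Latin-1 bytes that are invalid as UTF-8: the undecodable byte 0xE9 is
# smuggled through as the lone surrogate U+DCE9.
raw = b"caf\xe9"
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler turns U+DCE9 back into byte 0xE9.
round_trip = text.encode("latin-1", "surrogateescape")

# Surrogates outside U+DC80..U+DCFF cannot be escaped; a regular
# UnicodeEncodeError is expected rather than a SystemError.
try:
    "\ud800".encode("latin-1", "surrogateescape")
    lone_high_surrogate_rejected = False
except UnicodeEncodeError:
    lone_high_surrogate_rejected = True
```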
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r73190 | georg.brandl | 2009-06-04 01:23:45 +0200 (Thu, 04 Jun 2009) | 2 lines
Avoid PendingDeprecationWarnings emitted by deprecated unittest methods.
........
r73213 | georg.brandl | 2009-06-04 12:15:57 +0200 (Thu, 04 Jun 2009) | 1 line
#5967: note that the C slicing APIs do not support negative indices.
........
r73257 | georg.brandl | 2009-06-06 19:50:05 +0200 (Sat, 06 Jun 2009) | 1 line
#6211: elaborate a bit on ways to call the function.
........
r73258 | georg.brandl | 2009-06-06 19:51:31 +0200 (Sat, 06 Jun 2009) | 1 line
#6204: use a real reference instead of "see later".
........
r73260 | georg.brandl | 2009-06-06 20:21:58 +0200 (Sat, 06 Jun 2009) | 1 line
#6224: s/JPython/Jython/, and remove one link to a module nine years old.
........
r73275 | georg.brandl | 2009-06-07 22:37:52 +0200 (Sun, 07 Jun 2009) | 1 line
Add Ezio.
........
r73294 | georg.brandl | 2009-06-08 15:34:52 +0200 (Mon, 08 Jun 2009) | 1 line
#6194: O_SHLOCK/O_EXLOCK are not really more platform independent than lockf().
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r72283 | antoine.pitrou | 2009-05-04 20:32:32 +0200 (Mon, 04 May 2009) | 4 lines
Issue #4426: The UTF-7 decoder was too strict and didn't accept some legal sequences.
Patch by Nick Barnes and Victor Stinner.
........
r72284 | antoine.pitrou | 2009-05-04 20:32:50 +0200 (Mon, 04 May 2009) | 3 lines
Add Nick Barnes to ACKS.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r72260 | walter.doerwald | 2009-05-04 00:36:33 +0200 (Mon, 04 May 2009) | 5 lines
Issue #5108: Handle %s like %S and %R in PyUnicode_FromFormatV(): Call
PyUnicode_DecodeUTF8() once, remember the result and output it in a second
step. This avoids problems with a UTF-8 byte count that ignores the effect of
using the replace error handler in PyUnicode_DecodeUTF8().
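
The counting problem can be modeled in Python. This is a sketch, not the C code:
naive_char_count is a hypothetical stand-in for any size estimate made from the
raw bytes before decoding.

```python
def naive_char_count(data: bytes) -> int:
    """Estimate the character count by skipping continuation bytes.

    Hypothetical helper: counts every byte outside 0x80..0xBF as the start
    of one character, which is only right for well-formed UTF-8.
    """
    return sum(1 for byte in data if not 0x80 <= byte <= 0xBF)


data = b"a\x80b"  # 0x80 is a stray continuation byte

# The estimate ignores the stray byte entirely.
predicted = naive_char_count(data)

# The replace handler turns the stray byte into U+FFFD, so the decoded
# string is longer than the estimate.
decoded = data.decode("utf-8", "replace")
```

Decoding once and reusing the result, as the commit describes, sidesteps the
mismatch between the estimate and the decoded length.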
........
Addresses the float -> string conversion, using David Gay's code which
was added in Mark Dickinson's checkin r71663.
Also addresses these, which are intertwined with the short repr
changes:
- Issue #5772: format(1e100, '<') produces '1e+100', not '1.0e+100'
- Issue #5515: 'n' formatting with commas no longer works poorly
with leading zeros.
- PEP 378 Format Specifier for Thousands Separator: implemented
for floats.
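
For illustration, the formatting behaviors listed above as seen on a current
CPython 3 (a sketch; the variable names are illustrative):

```python
# Issue #5772: an alignment-only format spec keeps the short repr form.
sci = format(1e100, "<")

# PEP 378: the "," option groups the integer part of a float by thousands.
grouped = format(1234567.891, ",")
grouped_fixed = format(1234567.0, ",.2f")

# Short float repr: the shortest string that round-trips to the same float.
short = repr(0.1)
```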
This is incomplete, but I want to get some version into the next alpha.
I am still working on:
- Documentation.
- More tests.
- Implement for floats.
In addition, there's an existing bug with 'n' formatting that carries forward
to thousands grouping (issue 5515).
svn+ssh://pythondev@svn.python.org/python/trunk
........
r70499 | hirokazu.yamamoto | 2009-03-21 19:32:52 +0900 | 1 line
There is no macro named SIZEOF_SSIZE_T. Should use SIZEOF_SIZE_T instead.
........
sizeof(Py_UNICODE) == 2, PyUnicode_FromWideChar now converts
each character outside the BMP to the appropriate surrogate pair.
Thanks Victor Stinner for the patch.
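
The surrogate pair arithmetic referred to above can be sketched as follows. This
is the standard UTF-16 computation written in Python, not the C code from the
patch; to_surrogate_pair is a hypothetical name.

```python
def to_surrogate_pair(code_point: int) -> tuple:
    """Split a code point outside the BMP into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the 20-bit offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


pair = to_surrogate_pair(0x1F600)

# Cross-check against the interpreter's own UTF-16 encoder.
encoded = "\U0001F600".encode("utf-16-be")
```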
common cases are optimized thanks to a dedicated fast path and a moderate
amount of loop unrolling.
This will especially help text I/O (we already register a 30% speedup on large
reads on the io-c branch).
Also fix len() to return number of items rather than length in bytes.
I'm sorry, but I could not work on this without reindenting some of the
surrounding code. The indentation in memoryobject.c is a mess;
I'll open a separate bug for it.
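
The len() fix mentioned above can be checked from Python on a current CPython 3.
A short sketch using an array of C ints, whose item size is platform dependent:

```python
from array import array

values = array("i", [1, 2, 3])
view = memoryview(values)

# len() reports the number of items, not the size of the buffer in bytes.
item_count = len(view)

# The byte length is available separately through nbytes.
byte_count = view.nbytes
```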
The approach used is similar to what is currently used in the version
of unicodeobject.c in Python 2.x. The only difference is we use
_PyBytes_Resize instead of _PyString_Resize.