A Unicode string can now be a PyASCIIObject, PyCompactUnicodeObject or
PyUnicodeObject. Aliasing a PyASCIIObject* or PyCompactUnicodeObject* to
PyUnicodeObject* is wrong
Use PyUnicode_IS_ASCII instead of PyUnicode_IS_COMPACT_ASCII, so the following
test can be removed:
PyUnicode_DATA(op) == (((PyCompactUnicodeObject *)(op))->utf8)
* Create copy_characters() function which doesn't check for the maximum
character in release mode
* _PyUnicode_CheckConsistency() is no more static to be able to use it
in _PyUnicode_FormatAdvanced() (in formatter_unicode.c)
* _PyUnicode_CheckConsistency() checks the string hash
ucs1, ucs2 and ucs4 libraries have to scan created substring to find the
maximum character, whereas it is not need to ASCII strings. Because ASCII
strings are common, it is useful to optimize ASCII.
* str[a:b] doesn't scan the string for the maximum character if the string
is ascii only
* PyUnicode_FromKindAndData() stops if we are sure that we cannot use a
shorter character type. For example, _PyUnicode_FromUCS1() stops if we
have at least one character in range U+0080-U+00FF
* Change its prototype: PyObject* instead of PyUnicodeoObject*.
* Remove an old assertion, the result of PyUnicode_READY (_PyUnicode_Ready)
must be checked instead
types. Added a new API function, PyUnicode_TransformDecimalToASCII(),
which transforms non-ASCII decimal digits in a Unicode string to their
ASCII equivalents.
* PyUnicode_EncodeFSDefault(), PyUnicode_DecodeFSDefaultAndSize() and
PyUnicode_DecodeFSDefault() use the locale encoding instead of UTF-8 if
Py_FileSystemDefaultEncoding is NULL
* redecode_filenames() functions and _Py_code_object_list (issue #9630)
are no more needed: remove them
Database (Py_UNICODE_TOLOWER, Py_UNICODE_ISDECIMAL, and others) now accept
and return characters from the full Unicode range (Py_UCS4).
The differences from Python code are few:
- unicodedata.numeric(), unicodedata.decimal() and unicodedata.digit()
now return the correct value for large code points
- repr() may consider more characters as printable.
It's a ParseTuple converter: decode bytes objects to unicode using
PyUnicode_DecodeFSDefaultAndSize(); str objects are output as-is.
* Don't specify surrogateescape error handler in the comments nor the
documentation, but PyUnicode_DecodeFSDefaultAndSize() and
PyUnicode_EncodeFSDefault() because these functions use strict error handler
for the mbcs encoding (on Windows).
* Remove PyUnicode_FSConverter() comment in unicodeobject.c to avoid
inconsistency with unicodeobject.h.
object to Py_FileSystemDefaultEncoding with the "surrogateescape" error
handler, return a bytes object. If Py_FileSystemDefaultEncoding is not set,
fall back to UTF-8.
* Add paragraph titles to c-api/unicode.rst.
* Fix PyUnicode_DecodeFSDefault*() comment: it now uses the "surrogateescape"
error handler (and not "replace")
* Remove "The function is intended to be used for paths and file names only
during bootstrapping process where the codecs are not set up." from
PyUnicode_FSConverter() comment: it is used after the bootstrapping and for
other purposes than file names
svn+ssh://pythondev@svn.python.org/python/trunk
........
r72283 | antoine.pitrou | 2009-05-04 20:32:32 +0200 (lun., 04 mai 2009) | 4 lines
Issue #4426: The UTF-7 decoder was too strict and didn't accept some legal sequences.
Patch by Nick Barnes and Victor Stinner.
........
r72284 | antoine.pitrou | 2009-05-04 20:32:50 +0200 (lun., 04 mai 2009) | 3 lines
Add Nick Barnes to ACKS.
........
Addresses the float -> string conversion, using David Gay's code which
was added in Mark Dickinson's checkin r71663.
Also addresses these, which are intertwined with the short repr
changes:
- Issue #5772: format(1e100, '<') produces '1e+100', not '1.0e+100'
- Issue #5515: 'n' formatting with commas no longer works poorly
with leading zeros.
- PEP 378 Format Specifier for Thousands Separator: implemented
for floats.
This is incomplete, but I want to get some version into the next alpha. I am still working on:
Documentation.
More tests.
Implement for floats.
In addition, there's an existing bug with 'n' formatting that carries forward to thousands grouping (issue 5515).
svn+ssh://pythondev@svn.python.org/python/trunk
........
r68167 | vinay.sajip | 2009-01-02 12:53:04 -0600 (Fri, 02 Jan 2009) | 1 line
Minor documentation changes relating to NullHandler, the module used for handlers and references to ConfigParser.
........
r68276 | tarek.ziade | 2009-01-03 18:04:49 -0600 (Sat, 03 Jan 2009) | 1 line
fixed#1702551: distutils sdist was not pruning VCS directories under win32
........
r68292 | skip.montanaro | 2009-01-04 04:36:58 -0600 (Sun, 04 Jan 2009) | 3 lines
If user configures --without-gcc give preference to $CC instead of blindly
assuming the compiler will be "cc".
........
r68293 | tarek.ziade | 2009-01-04 04:37:52 -0600 (Sun, 04 Jan 2009) | 1 line
using clearer syntax
........
r68344 | marc-andre.lemburg | 2009-01-05 13:43:35 -0600 (Mon, 05 Jan 2009) | 7 lines
Fix#4846 (Py_UNICODE_ISSPACE causes linker error) by moving the declaration
into the extern "C" section.
Add a few more comments and apply some minor edits to make the file contents
fit the original structure again.
........
PyUnicode_AsStringAndSize -> _PyUnicode_AsStringAndSize to mark
them for interpreter internal use only.
We'll have to rework these APIs or create new ones for the
purpose of accessing the UTF-8 representation of Unicode objects
for 3.1.
The repr() of a string now contains printable Unicode characters unescaped.
The new ascii() builtin can be used to get a repr() with only ASCII characters in it.
PEP and patch were written by Atsuo Ishimoto.
Use faster PyUnicode_FromEncodedObject() for bytes/bytearray.decode().
Add new PyCodec_KnownEncoding() API.
Add new PyUnicode_AsDecodedUnicode() and PyUnicode_AsEncodedUnicode() APIs.
Add missing PyUnicode_AsDecodedObject() to unicodeobject.h
Fix punicode codec to also work on memoryviews.
svn+ssh://pythondev@svn.python.org/python/trunk
When forward porting this, I added _PyUnicode_InsertThousandsGrouping.
........
r63078 | eric.smith | 2008-05-11 15:52:48 -0400 (Sun, 11 May 2008) | 14 lines
Addresses issue 2802: 'n' formatting for integers.
Adds 'n' as a format specifier for integers, to mirror the same
specifier which is already available for floats. 'n' is the same as
'd', but inserts the current locale-specific thousands grouping.
I added this as a stringlib function, but it's only used by str type,
not unicode. This is because of an implementation detail in
unicode.format(), which does its own str->unicode conversion. But the
unicode version will be needed in 3.0, and it may be needed by other
code eventually in 2.6 (maybe decimal?), so I left it as a stringlib
implementation. As long as the unicode version isn't instantiated,
there's no overhead for this.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r60790 | raymond.hettinger | 2008-02-14 10:32:45 +0100 (Thu, 14 Feb 2008) | 4 lines
Add diagnostic message to help figure-out why SocketServer tests occasionally crash
when trying to remove a pid that in not in the activechildren list.
........
r60791 | raymond.hettinger | 2008-02-14 11:46:57 +0100 (Thu, 14 Feb 2008) | 1 line
Add fixed-point examples to the decimal FAQ
........
r60792 | raymond.hettinger | 2008-02-14 12:01:10 +0100 (Thu, 14 Feb 2008) | 1 line
Improve rst markup
........
r60794 | raymond.hettinger | 2008-02-14 12:57:25 +0100 (Thu, 14 Feb 2008) | 1 line
Show how to remove exponents.
........
r60795 | raymond.hettinger | 2008-02-14 13:05:42 +0100 (Thu, 14 Feb 2008) | 1 line
Fix markup.
........
r60797 | christian.heimes | 2008-02-14 13:47:33 +0100 (Thu, 14 Feb 2008) | 1 line
Implemented Martin's suggestion to clear the free lists during the garbage collection of the highest generation.
........
r60798 | raymond.hettinger | 2008-02-14 13:49:37 +0100 (Thu, 14 Feb 2008) | 1 line
Simplify moneyfmt() recipe.
........
r60810 | raymond.hettinger | 2008-02-14 20:02:39 +0100 (Thu, 14 Feb 2008) | 1 line
Fix markup
........
r60811 | raymond.hettinger | 2008-02-14 20:30:30 +0100 (Thu, 14 Feb 2008) | 1 line
No need to register subclass of ABCs.
........
r60814 | thomas.heller | 2008-02-14 22:00:28 +0100 (Thu, 14 Feb 2008) | 1 line
Try to correct a markup error that does hide the following paragraph.
........
r60822 | christian.heimes | 2008-02-14 23:40:11 +0100 (Thu, 14 Feb 2008) | 1 line
Use a static and interned string for __subclasscheck__ and __instancecheck__ as suggested by Thomas Heller in #2115
........
r60827 | christian.heimes | 2008-02-15 07:57:08 +0100 (Fri, 15 Feb 2008) | 1 line
Fixed repr() and str() of complex numbers. Complex suffered from the same problem as floats but I forgot to test and fix them.
........
r60830 | christian.heimes | 2008-02-15 09:20:11 +0100 (Fri, 15 Feb 2008) | 2 lines
Bug #2111: mmap segfaults when trying to write a block opened with PROT_READ
Thanks to Thomas Herve for the fix.
........
r60835 | eric.smith | 2008-02-15 13:14:32 +0100 (Fri, 15 Feb 2008) | 1 line
In PyNumber_ToBase, changed from an assert to returning an error when PyObject_Index() returns something other than an int or long. It should never be possible to trigger this, as PyObject_Index checks to make sure it returns an int or long.
........
r60837 | skip.montanaro | 2008-02-15 20:03:59 +0100 (Fri, 15 Feb 2008) | 8 lines
Two new functions:
* place_summary_first copies the regrtest summary to the front of the file
making it easier to scan quickly for problems.
* count_failures gets the actual count of the number of failing tests, not
just a 1 (some failures) or 0 (no failures).
........
r60840 | raymond.hettinger | 2008-02-15 22:21:25 +0100 (Fri, 15 Feb 2008) | 1 line
Update example to match the current syntax.
........
r60841 | amaury.forgeotdarc | 2008-02-15 22:22:45 +0100 (Fri, 15 Feb 2008) | 8 lines
Issue #2115: __slot__ attributes setting was 10x slower.
Also correct a possible crash using ABCs.
This change is exactly the same as an optimisation
done 5 years ago, but on slot *access*:
http://svn.python.org/view?view=rev&rev=28297
........
r60842 | amaury.forgeotdarc | 2008-02-15 22:27:44 +0100 (Fri, 15 Feb 2008) | 2 lines
Temporarily let these tests pass
........
r60843 | kurt.kaiser | 2008-02-15 22:56:36 +0100 (Fri, 15 Feb 2008) | 2 lines
ScriptBinding event handlers weren't returning 'break'. Patch 2050, Tal Einat.
........
r60844 | kurt.kaiser | 2008-02-15 23:25:09 +0100 (Fri, 15 Feb 2008) | 4 lines
Configured selection highlighting colors were ignored; updating highlighting
in the config dialog would cause non-Python files to be colored as if they
were Python source; improve use of ColorDelagator. Patch 1334. Tal Einat.
........
r60845 | amaury.forgeotdarc | 2008-02-15 23:44:20 +0100 (Fri, 15 Feb 2008) | 9 lines
Re-enable tests, they were failing since gc.collect() clears the various freelists.
They still remain fragile.
For example, a call to assertEqual currently does not make any allocation
(which surprised me at first).
But this can change when gc.collect also deletes the numerous "zombie frames"
attached to each function.
........