makeunicodedata.py: download all data files from unicode.org,
switch to extracting Unihan data from zip file.
Read linebreakprops and derivednormalizationprops even for
old versions, even though they are not used in delta records.
test:unicode.py: U+11000 is now assigned, use U+14000 instead.
Database (Py_UNICODE_TOLOWER, Py_UNICODE_ISDECIMAL, and others) now accept
and return characters from the full Unicode range (Py_UCS4).
The differences from Python code are few:
- unicodedata.numeric(), unicodedata.decimal() and unicodedata.digit()
now return the correct value for large code points
- repr() may consider more characters as printable.
svn+ssh://pythondev@svn.python.org/python/trunk
........
r79494 | florent.xicluna | 2010-03-30 10:24:06 +0200 (mar, 30 mar 2010) | 2 lines
#7643: Unicode codepoints VT (0x0B) and FF (0x0C) are linebreaks according to Unicode Standard Annex #14.
........
r79496 | florent.xicluna | 2010-03-30 18:29:03 +0200 (mar, 30 mar 2010) | 2 lines
Highlight the change of behavior related to r79494. Now VT and FF are linebreaks.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r78982 | florent.xicluna | 2010-03-15 15:00:58 +0100 (lun, 15 mar 2010) | 2 lines
Remove py3k deprecation warnings from these Unicode tools.
........
r78986 | florent.xicluna | 2010-03-15 19:08:58 +0100 (lun, 15 mar 2010) | 3 lines
Issue #7783 and #7787: open_urlresource invalidates the outdated files from the local cache.
Use this feature to fix test_normalization.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r78806 | benjamin.peterson | 2010-03-08 16:15:11 -0600 (Mon, 08 Mar 2010) | 1 line
set svn:eol-style on various files
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r75272 | amaury.forgeotdarc | 2009-10-06 21:56:32 +0200 (mar., 06 oct. 2009) | 5 lines
#1571184: makeunicodedata.py now generates the functions _PyUnicode_ToNumeric,
_PyUnicode_IsLinebreak and _PyUnicode_IsWhitespace.
It now also parses the Unihan.txt for numeric values.
........
r75273 | amaury.forgeotdarc | 2009-10-06 22:02:09 +0200 (mar., 06 oct. 2009) | 2 lines
Add Anders Chrigstrom to Misc/ACKS for his work on unicodedata.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r74000 | amaury.forgeotdarc | 2009-07-13 22:01:11 +0200 (lun., 13 juil. 2009) | 4 lines
#1616979: Add the cp720 (Arabic DOS) encoding.
Since there is no official mapping file from unicode.org,
the codec file is generated on Windows with the new genwincodec.py script.
........
r74001 | amaury.forgeotdarc | 2009-07-13 22:03:21 +0200 (lun., 13 juil. 2009) | 2 lines
NEWS entry for r74000.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r72054 | antoine.pitrou | 2009-04-27 23:53:26 +0200 (lun., 27 avril 2009) | 5 lines
Issue #1734234: Massively speedup `unicodedata.normalize()` when the
string is already in normalized form, by performing a quick check beforehand.
Original patch by Rauli Ruohonen.
........
The repr() of a string now contains printable Unicode characters unescaped.
The new ascii() builtin can be used to get a repr() with only ASCII characters in it.
PEP and patch were written by Atsuo Ishimoto.