96 Commits

Author SHA1 Message Date
Andrew Kuchling 9d5c071060 #1097797: add the original mapping file 2013-11-10 21:46:02 -05:00
Andrew Kuchling 695f07b27b Fix some PEP8-formatting problems in the generated code 2013-11-10 21:45:24 -05:00
Benjamin Peterson 94d08d908b upgrade unicode db to 6.3.0 (closes #19221) 2013-10-10 17:24:45 -04:00
Ezio Melotti d640fe2af5 #18803: merge with 3.3. 2013-08-26 01:33:30 +03:00
Ezio Melotti 7c4a7e6f3c #18803: fix more typos. Patch by Févry Thibault. 2013-08-26 01:32:56 +03:00
Antoine Pitrou 9ed5f27266 Issue #18722: Remove uses of the "register" keyword in C code. 2013-08-13 20:18:52 +02:00
Serhiy Storchaka 302b8c31ec Issue #15239: Make mkstringprep.py work again on Python 3. 2013-06-09 17:11:48 +03:00
Serhiy Storchaka e7275ffa4c Issue #15239: Make mkstringprep.py work again on Python 3. 2013-06-09 17:08:00 +03:00
Antoine Pitrou e9631e5d3a Issue #15378: Fix Tools/unicode/comparecodecs.py. Patch by Serhiy Storchaka. 2012-10-17 16:14:40 +02:00
Antoine Pitrou 31605ace0d Issue #15378: Fix Tools/unicode/comparecodecs.py. Patch by Serhiy Storchaka. 2012-10-17 16:13:55 +02:00
Antoine Pitrou 1eff0fc3cd Issue #15378: Fix Tools/unicode/comparecodecs.py. Patch by Serhiy Storchaka. 2012-10-17 16:12:30 +02:00
Benjamin Peterson b8350f1c7d upgrade to UCD 6.2 2012-09-29 13:47:39 -04:00
Florent Xicluna c20740109d Some cleanup in the Tools directory. 2012-07-07 17:03:54 +02:00
Antoine Pitrou aaefac76dd Issue #14874: Restore charmap decoding speed to pre-PEP 393 levels.
Patch by Serhiy Storchaka.
2012-06-16 22:48:21 +02:00
Benjamin Peterson 71f660e00f update to Unicode 6.1 2012-02-20 22:24:29 -05:00
Benjamin Peterson ad9c569825 delta encoding of upper/lower/title makes a glorious return (#12736) 2012-01-15 21:19:20 -05:00
Benjamin Peterson d5890c8db5 add str.casefold() (closes #13752) 2012-01-14 13:23:30 -05:00
Benjamin Peterson b2bf01d824 use full unicode mappings for upper/lower/title case (#12736)
Also broaden the category of characters that count as lowercase/uppercase.
2012-01-11 18:17:06 -05:00
Ezio Melotti 931b8aac80 #12753: Add support for Unicode name aliases and named sequences. 2011-10-21 21:57:36 +03:00
Ezio Melotti a9860aeb08 #13054: fix usage of sys.maxunicode after PEP-393. 2011-10-04 19:06:00 +03:00
Ezio Melotti 2a1e926d63 Fix ResourceWarnings in makeunicodedata.py. 2011-09-30 08:46:25 +03:00
Ezio Melotti 3b3499ba69 #11565: Merge with 3.1. 2011-03-16 11:35:38 +02:00
Ezio Melotti 13925008dc #11565: Fix several typos. Patch by Piotr Kasprzyk. 2011-03-16 11:05:33 +02:00
Georg Brandl 49857f8a93 Add updated .hgeol file and fix newlines in the 3.2 branch. 2011-03-05 15:11:35 +01:00
Alexander Belopolsky 827fdaae30 Issue #10552: Partially fixed a sort error in Tools/unicode/gencodec.py 2010-11-30 16:56:15 +00:00
Martin v. Löwis 5cbc71e50a Issue #10459: Update CJK character names to Unicode 6.0. 2010-11-22 09:00:02 +00:00
Martin v. Löwis baecd7243a Upgrade to Unicode 6.0.0.
makeunicodedata.py: download all data files from unicode.org,
  switch to extracting Unihan data from zip file.
  Read linebreakprops and derivednormalizationprops even for
  old versions, even though they are not used in delta records.
test_unicode.py: U+11000 is now assigned, use U+14000 instead.
2010-10-11 22:42:28 +00:00
Amaury Forgeot d'Arc feb7307db4 #9210: remove --with-wctype-functions configure option.
The internal unicode database is now always used.

(after 5 years: see
  http://mail.python.org/pipermail/python-dev/2004-December/050193.html
)
2010-09-12 22:42:57 +00:00
Amaury Forgeot d'Arc 324ac65ceb #5127: Even on narrow unicode builds, the C functions that access the Unicode
Database (Py_UNICODE_TOLOWER, Py_UNICODE_ISDECIMAL, and others) now accept
and return characters from the full Unicode range (Py_UCS4).

The differences from Python code are few:
- unicodedata.numeric(), unicodedata.decimal() and unicodedata.digit()
  now return the correct value for large code points
- repr() may consider more characters as printable.
2010-08-18 20:44:58 +00:00
Florent Xicluna 806d8cf0e8 Merged revisions 79494,79496 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r79494 | florent.xicluna | 2010-03-30 10:24:06 +0200 (Tue, 30 Mar 2010) | 2 lines

  #7643: Unicode codepoints VT (0x0B) and FF (0x0C) are linebreaks according to Unicode Standard Annex #14.
........
  r79496 | florent.xicluna | 2010-03-30 18:29:03 +0200 (Tue, 30 Mar 2010) | 2 lines

  Highlight the change of behavior related to r79494.  Now VT and FF are linebreaks.
........
2010-03-30 19:34:18 +00:00
Florent Xicluna f089fd67fc Merged revisions 78982,78986 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r78982 | florent.xicluna | 2010-03-15 15:00:58 +0100 (Mon, 15 Mar 2010) | 2 lines

  Remove py3k deprecation warnings from these Unicode tools.
........
  r78986 | florent.xicluna | 2010-03-15 19:08:58 +0100 (Mon, 15 Mar 2010) | 3 lines

  Issue #7783 and #7787: open_urlresource invalidates the outdated files from the local cache.
  Use this feature to fix test_normalization.
........
2010-03-19 14:25:03 +00:00
Florent Xicluna faa663f03d Fixed a failure in test_bigmem.
Merged revision 79059 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r79059 | florent.xicluna | 2010-03-18 22:50:06 +0100 (Thu, 18 Mar 2010) | 2 lines

  Issue #8024: Update the Unicode database to 5.2
........
2010-03-19 13:37:08 +00:00
Florent Xicluna f1789dee30 Revert Unicode UCD 5.2 upgrade in 3.x. It broke repr() for unicode objects, and gave failures in test_bigmem. Revert 79062, 79065 and 79083. 2010-03-19 01:17:46 +00:00
Florent Xicluna 8c8042734a Missing update from previous changeset r79062. 2010-03-18 22:19:01 +00:00
Benjamin Peterson 90f5ba538b convert shebang lines: python -> python3 2010-03-11 22:53:45 +00:00
Benjamin Peterson 75ad1fc089 Merged revisions 78806 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r78806 | benjamin.peterson | 2010-03-08 16:15:11 -0600 (Mon, 08 Mar 2010) | 1 line

  set svn:eol-style on various files
........
2010-03-08 22:17:58 +00:00
Amaury Forgeot d'Arc 919765a095 Merged revisions 75396 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r75396 | amaury.forgeotdarc | 2009-10-13 23:29:34 +0200 (Tue, 13 Oct 2009) | 3 lines

  #7112: Fix compilation warning in unicodetype_db.h
  makeunicodedata now generates double literals
........
2009-10-13 23:18:53 +00:00
Amaury Forgeot d'Arc 7d52079395 Merged revisions 75272-75273 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r75272 | amaury.forgeotdarc | 2009-10-06 21:56:32 +0200 (Tue, 06 Oct 2009) | 5 lines

  #1571184: makeunicodedata.py now generates the functions _PyUnicode_ToNumeric,
  _PyUnicode_IsLinebreak and _PyUnicode_IsWhitespace.

  It now also parses the Unihan.txt for numeric values.
........
  r75273 | amaury.forgeotdarc | 2009-10-06 22:02:09 +0200 (Tue, 06 Oct 2009) | 2 lines

  Add Anders Chrigstrom to Misc/ACKS for his work on unicodedata.
........
2009-10-06 21:03:20 +00:00
Amaury Forgeot d'Arc d8840860df Oops, really pass a bytes string to the ctypes function. 2009-07-13 20:48:07 +00:00
Amaury Forgeot d'Arc 8b84ea0aa4 Merged revisions 74000-74001 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r74000 | amaury.forgeotdarc | 2009-07-13 22:01:11 +0200 (Mon, 13 Jul 2009) | 4 lines

  #1616979: Add the cp720 (Arabic DOS) encoding.
  Since there is no official mapping file from unicode.org,
  the codec file is generated on Windows with the new genwincodec.py script.
........
  r74001 | amaury.forgeotdarc | 2009-07-13 22:03:21 +0200 (Mon, 13 Jul 2009) | 2 lines

  NEWS entry for r74000.
........
2009-07-13 20:38:21 +00:00
Antoine Pitrou 7a0fedfd1d Merged revisions 72054 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r72054 | antoine.pitrou | 2009-04-27 23:53:26 +0200 (Mon, 27 Apr 2009) | 5 lines

  Issue #1734234: Massively speed up `unicodedata.normalize()` when the
  string is already in normalized form, by performing a quick check beforehand.
  Original patch by Rauli Ruohonen.
........
2009-04-27 22:31:40 +00:00
Walter Dörwald 1b08b30743 Merged revisions 71894 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r71894 | walter.doerwald | 2009-04-25 16:03:16 +0200 (Sat, 25 Apr 2009) | 4 lines

  Issue #5828 (Invalid behavior of unicode.lower): Fixed bogus logic in
  makeunicodedata.py and regenerated the Unicode database (This fixes
  u'\u1d79'.lower() == '\x00').
........
2009-04-25 14:13:56 +00:00
Benjamin Peterson 09832740d1 fix isprintable() on space characters #5126 2009-03-26 17:15:46 +00:00
Mark Dickinson a56c467ac3 Issue #1717: Remove cmp. Stage 1: remove all uses of cmp and __cmp__ from
the standard library and tests.
2009-01-27 18:17:45 +00:00
Martin v. Löwis 93cbca33f2 Merged revisions 66362 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r66362 | martin.v.loewis | 2008-09-10 15:38:12 +0200 (Wed, 10 Sep 2008) | 3 lines

  Issue #3811: The Unicode database was updated to 5.1.
  Reviewed by Fredrik Lundh and Marc-Andre Lemburg.
........
2008-09-10 14:08:48 +00:00
Georg Brandl d52429fb49 Issue #3282: str.isprintable() should return False for undefined Unicode characters. 2008-07-04 15:55:02 +00:00
Martin v. Löwis 59683e8529 Merged revisions 64226 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r64226 | martin.v.loewis | 2008-06-13 09:47:47 +0200 (Fri, 13 Jun 2008) | 2 lines

  Make more symbols static.
........
2008-06-13 07:50:45 +00:00
Georg Brandl 559e5d7f4d #2630: Implement PEP 3138.
The repr() of a string now contains printable Unicode characters unescaped.
The new ascii() builtin can be used to get a repr() with only ASCII characters in it.

PEP and patch were written by Atsuo Ishimoto.
2008-06-11 18:37:52 +00:00
Georg Brandl a26f8ca668 Revert r63934 -- it was mixing two patches. 2008-06-04 13:01:30 +00:00
Georg Brandl f954c4b9fb Remove meaning of -ttt, but still accept -t option on cmdline for compatibility. 2008-06-04 11:41:32 +00:00