
65 Commits

Author SHA1 Message Date
Miss Islington (bot) 0e2b76ea4e
bpo-29456: Fix bugs in unicodedata.normalize: u1176, u11a7 and u11c3 (GH-1958)
The Hangul composition check used inclusive boundaries where exclusive ones are required:
the second character must lie in [0x1161, 0x1176), not [0x1161, 0x1176], and the third
character in (0x11A7, 0x11C3), not [0x11A7, 0x11C3].
(cherry picked from commit d134809cd3)

Co-authored-by: Wonsup Yoon <pusnow@me.com>
2018-06-15 05:21:55 -07:00
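
The boundary fix described in this entry can be illustrated with a minimal sketch of the algorithmic Hangul composition step. The constants follow the Hangul composition algorithm in the Unicode Standard; the function name `compose_hangul` is hypothetical, and this is not CPython's C implementation. It covers only the L+V(+T) jamo case and ignores a precomposed LV syllable followed by T.

```python
# Constants from the Unicode Standard's Hangul composition algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28


def compose_hangul(text):
    """Compose conjoining jamo (L+V, optionally +T) into syllables.

    Hypothetical helper for illustration; it handles only the Hangul
    part of NFC composition.
    """
    out = []
    i = 0
    while i < len(text):
        code = ord(text[i])
        # Leading consonant in [L_BASE, L_BASE + L_COUNT) followed by a
        # vowel in the half-open range [0x1161, 0x1176).
        if (L_BASE <= code < L_BASE + L_COUNT
                and i + 1 < len(text)
                and V_BASE <= ord(text[i + 1]) < V_BASE + V_COUNT):
            l_index = code - L_BASE
            v_index = ord(text[i + 1]) - V_BASE
            syllable = S_BASE + (l_index * V_COUNT + v_index) * T_COUNT
            i += 2
            # Trailing consonant in the open range (0x11A7, 0x11C3).
            if i < len(text) and T_BASE < ord(text[i]) < T_BASE + T_COUNT:
                syllable += ord(text[i]) - T_BASE
                i += 1
            out.append(chr(syllable))
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

With these bounds, U+1176, U+11A7 and U+11C3 are left alone instead of being folded into a syllable, which matches the three code points named in the commit title.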
Miss Islington (bot) 4705ea38c9 update to Unicode 11.0.0 (closes bpo-33778) (GH-7439) (GH-7470)
Also, standardize indentation of generated tables.
(cherry picked from commit 7c69c1c0fb)

Co-authored-by: Benjamin Peterson <benjamin@python.org>
2018-06-07 03:36:22 -04:00
Benjamin Peterson 279a96206f bpo-30736: upgrade to Unicode 10.0 (#2344)
Straightforward. While we're at it, though, strip trailing whitespace from generated tables.
2017-06-22 22:31:08 -07:00
Benjamin Peterson 6775231597 Unicode 9.0.0
Not completely mechanical since support for East Asian Width changes—emoji
codepoints became Wide—had to be added to unicodedata.
2016-09-14 23:53:47 -07:00
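
As a small illustration of the East Asian Width change mentioned in this entry, the following sketch queries the width class of an emoji code point. It assumes an interpreter whose bundled database is Unicode 9.0 or later; the variable names are illustrative only.

```python
import unicodedata

# U+1F600 GRINNING FACE is classified as Wide ("W") in Unicode 9.0 and later.
emoji_width = unicodedata.east_asian_width("\U0001F600")

# An ASCII letter is Narrow ("Na") for comparison.
ascii_width = unicodedata.east_asian_width("a")

print(unicodedata.unidata_version, emoji_width, ascii_width)
```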
Berker Peksag 33a7fcc066 Issue #23981: Update test_unicodedata to use script_helpers
Patch by Christie.
2015-10-22 03:29:10 +03:00
Benjamin Peterson 4801383c29 upgrade to Unicode 8.0.0 2015-06-27 15:45:56 -05:00
Zachary Ware 38c707e7e0 Issue #21741: Update 147 test modules to use test discovery.
I have compared output between pre- and post-patch runs of these tests
to make sure there's nothing missing and nothing broken, on both
Windows and Linux.  The only differences I found were actually tests
that were previously *not* run.
2015-04-13 15:00:43 -05:00
Benjamin Peterson 96baaae46f for some reason, you don't get the right checksum from an incremental build 2014-07-06 22:07:08 -07:00
Benjamin Peterson 3032ed7cb1 upgrade to unicode 7.0.0 2014-07-06 13:04:20 -07:00
Antoine Pitrou 73abc527eb Fix expected checksum for new unicodedata (after full rebuild) 2013-10-11 21:40:55 +02:00
Benjamin Peterson 94d08d908b upgrade unicode db to 6.3.0 (closes #19221) 2013-10-10 17:24:45 -04:00
Benjamin Peterson b8350f1c7d upgrade to UCD 6.2 2012-09-29 13:47:39 -04:00
Benjamin Peterson 71f660e00f update to Unicode 6.1 2012-02-20 22:24:29 -05:00
Benjamin Peterson b2bf01d824 use full unicode mappings for upper/lower/title case (#12736)
Also broaden the category of characters that count as lowercase/uppercase.
2012-01-11 18:17:06 -05:00
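
A brief sketch of what "full" case mappings mean in practice: some characters map to more than one character when their case is changed. This assumes Python 3.3 or later, where the change in this entry is available; the variable names are illustrative.

```python
# Full case mappings can change the length of a string.
sharp_s = "\u00df"      # LATIN SMALL LETTER SHARP S
ligature_fi = "\ufb01"  # LATIN SMALL LIGATURE FI

upper_sharp_s = sharp_s.upper()  # two characters: "SS"
upper_fi = ligature_fi.upper()   # two characters: "FI"

print(len(sharp_s), len(upper_sharp_s), upper_fi)
```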
Ezio Melotti f503673c4d Move UCS4-specific tests with the "normal" tests. 2011-09-29 03:14:56 +03:00
Alexander Belopolsky 86f65d5dbb Issue #10254: Fixed a crash and a regression introduced by the implementation of PRI 29. 2010-12-23 02:27:37 +00:00
Ezio Melotti b3aedd4862 #9424: Replace deprecated assert* methods in the Python test suite. 2010-11-20 19:04:17 +00:00
Antoine Pitrou 849e12bfe9 Fix resource warning in test_unicodedata. Patch by Brian Brazil. 2010-10-30 14:24:33 +00:00
Martin v. Löwis baecd7243a Upgrade to Unicode 6.0.0.
makeunicodedata.py: download all data files from unicode.org,
  switch to extracting Unihan data from zip file.
  Read linebreakprops and derivednormalizationprops even for
  old versions, even though they are not used in delta records.
test_unicode.py: U+11000 is now assigned, use U+14000 instead.
2010-10-11 22:42:28 +00:00
Amaury Forgeot d'Arc 7e44b6b0c5 Add more tests to unicodedata with large code points
(the other functions were not affected by the recent change)
2010-08-18 22:07:15 +00:00
Amaury Forgeot d'Arc 56ab01b66a Fix stupid typo in test. 2010-08-18 21:12:52 +00:00
Amaury Forgeot d'Arc 324ac65ceb #5127: Even on narrow unicode builds, the C functions that access the Unicode
Database (Py_UNICODE_TOLOWER, Py_UNICODE_ISDECIMAL, and others) now accept
and return characters from the full Unicode range (Py_UCS4).

The differences from Python code are few:
- unicodedata.numeric(), unicodedata.decimal() and unicodedata.digit()
  now return the correct value for large code points
- repr() may consider more characters as printable.
2010-08-18 20:44:58 +00:00
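
The effect on large code points can be checked with a short sketch like the one below. The two sample characters are chosen here for illustration (they lie outside the Basic Multilingual Plane); they are not taken from the commit itself.

```python
import unicodedata

aegean_one = "\U00010107"  # AEGEAN NUMBER ONE
mono_seven = "\U0001D7FD"  # MATHEMATICAL MONOSPACE DIGIT SEVEN

# All three lookups work for code points above U+FFFF.
numeric_value = unicodedata.numeric(aegean_one)
decimal_value = unicodedata.decimal(mono_seven)
digit_value = unicodedata.digit(mono_seven)

print(numeric_value, decimal_value, digit_value)
```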
Mark Dickinson 388122d43b Issue #9337: Make float.__str__ identical to float.__repr__.
(And similarly for complex numbers.)
2010-08-04 20:56:28 +00:00
Florent Xicluna 806d8cf0e8 Merged revisions 79494,79496 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r79494 | florent.xicluna | 2010-03-30 10:24:06 +0200 (mar, 30 mar 2010) | 2 lines

  #7643: Unicode codepoints VT (0x0B) and FF (0x0C) are linebreaks according to Unicode Standard Annex #14.
........
  r79496 | florent.xicluna | 2010-03-30 18:29:03 +0200 (mar, 30 mar 2010) | 2 lines

  Highlight the change of behavior related to r79494.  Now VT and FF are linebreaks.
........
2010-03-30 19:34:18 +00:00
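
The behavior change for VT and FF is visible through `str.splitlines()`, as in this minimal sketch (variable names are illustrative):

```python
# VT (0x0B) and FF (0x0C) are treated as line boundaries by str.splitlines().
text = "first\x0bsecond\x0cthird"
lines = text.splitlines()

print(lines)
```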
Florent Xicluna faa663f03d Fixed a failure in test_bigmem.
Merged revision 79059 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r79059 | florent.xicluna | 2010-03-18 22:50:06 +0100 (jeu, 18 mar 2010) | 2 lines

  Issue #8024: Update the Unicode database to 5.2
........
2010-03-19 13:37:08 +00:00
Florent Xicluna f1789dee30 Revert Unicode UCD 5.2 upgrade in 3.x. It broke repr() for unicode objects, and gave failures in test_bigmem. Revert 79062, 79065 and 79083. 2010-03-19 01:17:46 +00:00
Florent Xicluna 0106250f0d Fix bad unicodedata checksum merge from trunk in r79062 2010-03-19 00:03:01 +00:00
Florent Xicluna 657de43f97 Merged revisions 79059 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r79059 | florent.xicluna | 2010-03-18 22:50:06 +0100 (jeu, 18 mar 2010) | 2 lines

  Issue #8024: Update the Unicode database to 5.2
........
2010-03-18 22:11:01 +00:00
Victor Stinner 931bb02d96 oops, fix the test of my previous commit about unicodedata and PR #29 (r78647) 2010-03-04 12:47:32 +00:00
Victor Stinner 7ed9c4c910 Merged revisions 78646 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r78646 | victor.stinner | 2010-03-04 13:09:33 +0100 (jeu., 04 mars 2010) | 5 lines

  Issue #1054943: Fix unicodedata.normalize('NFC', text) for the Public Review
  Issue #29.

  PR #29 was released in February 2004!
........
2010-03-04 12:14:57 +00:00
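
A minimal sketch of the corrected behavior, assuming a Python version that includes this fix: under the Public Review Issue #29 rule, a combining mark sitting between two characters blocks their composition, so NFC leaves such a sequence unchanged. The sample strings are illustrative.

```python
import unicodedata

# A combining grave accent (U+0300) between a Hangul leading consonant
# and a vowel blocks composition of the two jamo.
blocked_hangul = "\u1100\u0300\u1161"

# Same idea with Oriya vowel signs: U+0B47 and U+0B3E must not combine
# across the intervening U+0300.
blocked_oriya = "\u0b47\u0300\u0b3e"

hangul_nfc = unicodedata.normalize("NFC", blocked_hangul)
oriya_nfc = unicodedata.normalize("NFC", blocked_oriya)

print(hangul_nfc == blocked_hangul, oriya_nfc == blocked_oriya)
```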
Benjamin Peterson 577473fe68 use assert[Not]In where appropriate
A patch from Dave Malcolm.
2010-01-19 00:09:57 +00:00
Amaury Forgeot d'Arc 7d52079395 Merged revisions 75272-75273 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r75272 | amaury.forgeotdarc | 2009-10-06 21:56:32 +0200 (mar., 06 oct. 2009) | 5 lines

  #1571184: makeunicodedata.py now generates the functions _PyUnicode_ToNumeric,
  _PyUnicode_IsLinebreak and _PyUnicode_IsWhitespace.

  It now also parses the Unihan.txt for numeric values.
........
  r75273 | amaury.forgeotdarc | 2009-10-06 22:02:09 +0200 (mar., 06 oct. 2009) | 2 lines

  Add Anders Chrigstrom to Misc/ACKS for his work on unicodedata.
........
2009-10-06 21:03:20 +00:00
Benjamin Peterson c9c0f201fe convert old fail* assertions to assert* 2009-06-30 23:06:06 +00:00
Martin v. Löwis 54d9d07806 Rename the surrogates handler to surrogatepass. 2009-05-10 09:33:21 +00:00
Martin v. Löwis db12d454e6 Issue #3672: Reject surrogates in utf-8 codec; add surrogates error
handler.
2009-05-02 18:52:14 +00:00
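
The two entries above can be illustrated together: the UTF-8 codec rejects lone surrogates by default, and the error handler (named `surrogatepass` after the rename) lets them through. A minimal sketch, with illustrative variable names:

```python
lone_surrogate = "\ud800"

# The strict UTF-8 codec refuses to encode a lone surrogate.
try:
    lone_surrogate.encode("utf-8")
    strict_rejected = False
except UnicodeEncodeError:
    strict_rejected = True

# The surrogatepass handler encodes it anyway and can decode it back.
encoded = lone_surrogate.encode("utf-8", "surrogatepass")
round_trip = encoded.decode("utf-8", "surrogatepass")

print(strict_rejected, encoded, round_trip == lone_surrogate)
```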
Walter Dörwald e250775d53 Merged revisions 71972 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r71972 | walter.doerwald | 2009-04-26 21:11:43 +0200 (So, 26 Apr 2009) | 2 lines

  Fix typo.
........
2009-04-26 19:16:11 +00:00
Martin v. Löwis 71efeb7cbf Merged revisions 71947 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r71947 | martin.v.loewis | 2009-04-26 02:53:18 +0200 (So, 26 Apr 2009) | 3 lines

  Issue #4971: Fix titlecase for characters that are their own
  titlecase, but not their own uppercase.
........
2009-04-26 01:02:07 +00:00
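
One character of the kind this fix concerns is U+01C5, a titlecase digraph that is its own titlecase form but has a different uppercase form. The choice of this character is an illustration, not something stated in the commit.

```python
# U+01C5 LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
titlecase_dz = "\u01c5"

as_title = titlecase_dz.title()  # stays U+01C5
as_upper = titlecase_dz.upper()  # becomes U+01C4
as_lower = titlecase_dz.lower()  # becomes U+01C6

print(as_title == titlecase_dz, hex(ord(as_upper)), hex(ord(as_lower)))
```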
Walter Dörwald 1b08b30743 Merged revisions 71894 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r71894 | walter.doerwald | 2009-04-25 16:03:16 +0200 (Sa, 25 Apr 2009) | 4 lines

  Issue #5828 (Invalid behavior of unicode.lower): Fixed bogus logic in
  makeunicodedata.py and regenerated the Unicode database (This fixes
  u'\u1d79'.lower() == '\x00').
........
2009-04-25 14:13:56 +00:00
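
The regression named in this entry can be checked directly. A minimal sketch in Python 3 syntax (the original report used the Python 2 `u''` prefix):

```python
# U+1D79 LATIN SMALL LETTER INSULAR G is already lowercase, so lower()
# must return it unchanged rather than a NUL character.
insular_g = "\u1d79"
lowered = insular_g.lower()

print(lowered == insular_g, lowered != "\x00")
```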
Benjamin Peterson 6f7fad16bc Merged revisions 67320 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r67320 | benjamin.peterson | 2008-11-21 16:27:24 -0600 (Fri, 21 Nov 2008) | 4 lines

  don't segfault when \N escapes are used and unicodedata fails to load

  Fixes #4367
........
2008-11-21 22:58:57 +00:00
Martin v. Löwis 93cbca33f2 Merged revisions 66362 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r66362 | martin.v.loewis | 2008-09-10 15:38:12 +0200 (Mi, 10 Sep 2008) | 3 lines

  Issue #3811: The Unicode database was updated to 5.1.
  Reviewed by Fredrik Lundh and Marc-Andre Lemburg.
........
2008-09-10 14:08:48 +00:00
Walter Dörwald f342bfcbd4 Change all functions that expect one unicode character to accept a pair of
surrogates in narrow builds. Fixes issue #1706460. (Port of r63899).
2008-06-03 11:45:02 +00:00
Benjamin Peterson ee8712cda4 #2621 rename test.test_support to test.support 2008-05-20 21:35:26 +00:00
Guido van Rossum 254348e201 Rename buffer -> bytearray. 2007-11-21 19:29:53 +00:00
Guido van Rossum 98297ee781 Merging the py3k-pep3137 branch back into the py3k branch.
No detailed change log; just check out the change log for the py3k-pep3137
branch.  The most obvious changes:

  - str8 renamed to bytes (PyString at the C level);
  - bytes renamed to buffer (PyBytes at the C level);
  - PyString and PyUnicode are no longer compatible.

I.e. we now have an immutable bytes type and a mutable bytes type.

The behavior of PyString was modified quite a bit, to make it more
bytes-like.  Some changes are still on the to-do list.
2007-11-06 21:34:58 +00:00
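
The split into an immutable and a mutable bytes type can be sketched as follows. The example uses the present-day names, `bytes` and `bytearray` (the mutable type was called `buffer` at the time of this merge and renamed afterwards); variable names are illustrative.

```python
immutable = b"abc"
mutable = bytearray(b"abc")

# The mutable type supports item assignment.
mutable[0] = ord("x")

# The immutable type does not.
try:
    immutable[0] = ord("x")
    assignment_failed = False
except TypeError:
    assignment_failed = True

print(bytes(mutable), assignment_failed)
```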
Georg Brandl bd1c68c94f Patch #1303: Adapt str8 constructor to bytes (now buffer) one. 2007-10-24 18:55:37 +00:00
Guido van Rossum 9c62772d5e Changes in anticipation of stricter str vs. bytes enforcement. 2007-08-27 18:31:48 +00:00
Guido van Rossum 806c2469cb Merged revisions 56753-56781 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/p3yk

................
  r56760 | neal.norwitz | 2007-08-05 18:55:39 -0700 (Sun, 05 Aug 2007) | 178 lines

  Merged revisions 56477-56759 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r56485 | facundo.batista | 2007-07-21 17:13:00 -0700 (Sat, 21 Jul 2007) | 5 lines


    Selectively enable tests for asyncore.readwrite based on the presence
    of poll support in the select module (since this is the only case in
    which readwrite can be called). [GSoC - Alan McIntyre]
  ........
    r56488 | nick.coghlan | 2007-07-22 03:18:07 -0700 (Sun, 22 Jul 2007) | 1 line

    Add explicit relative import tests for runpy.run_module
  ........
    r56509 | nick.coghlan | 2007-07-23 06:41:45 -0700 (Mon, 23 Jul 2007) | 5 lines

    Correctly cleanup sys.modules after executing runpy relative import
    tests
    Restore Python 2.4 ImportError when attempting to execute a package
    (as imports cannot be guaranteed to work properly if you try it)
  ........
    r56519 | nick.coghlan | 2007-07-24 06:07:38 -0700 (Tue, 24 Jul 2007) | 1 line

    Tweak runpy test to do a better job of confirming that sys has been manipulated correctly
  ........
    r56520 | nick.coghlan | 2007-07-24 06:58:28 -0700 (Tue, 24 Jul 2007) | 1 line

    Fix an incompatibility between the -i and -m command line switches as reported on python-dev by PJE - runpy.run_module now leaves any changes it makes to the sys module intact after the function terminates
  ........
    r56523 | nick.coghlan | 2007-07-24 07:39:23 -0700 (Tue, 24 Jul 2007) | 1 line

    Try to get rid of spurious failure in test_resource on the Debian buildbots by changing the file size limit before attempting to close the file
  ........
    r56533 | facundo.batista | 2007-07-24 14:20:42 -0700 (Tue, 24 Jul 2007) | 7 lines


    New tests for basic behavior of smtplib.SMTP and
    smtpd.DebuggingServer. Change to use global host & port number
    variables. Modified the 'server' to take a string to send back in
    order to vary test server responses. Added a test for the reaction of
    smtplib.SMTP to a non-200 HELO response. [GSoC - Alan McIntyre]
  ........
    r56538 | nick.coghlan | 2007-07-25 05:57:48 -0700 (Wed, 25 Jul 2007) | 1 line

    More buildbot cleanup - let the OS assign the port for test_urllib2_localnet
  ........
    r56539 | nick.coghlan | 2007-07-25 06:18:58 -0700 (Wed, 25 Jul 2007) | 1 line

    Add a temporary diagnostic message before a strange failure on the alpha Debian buildbot
  ........
    r56543 | martin.v.loewis | 2007-07-25 09:24:23 -0700 (Wed, 25 Jul 2007) | 2 lines

    Change location of the package index to pypi.python.org/pypi
  ........
    r56551 | georg.brandl | 2007-07-26 02:36:25 -0700 (Thu, 26 Jul 2007) | 2 lines

    tabs, newlines and crs are valid XML characters.
  ........
    r56553 | nick.coghlan | 2007-07-26 07:03:00 -0700 (Thu, 26 Jul 2007) | 1 line

    Add explicit test for a misbehaving math.floor
  ........
    r56561 | mark.hammond | 2007-07-26 21:52:32 -0700 (Thu, 26 Jul 2007) | 3 lines

    In consultation with Kristjan Jonsson, only define WINVER and _WINNT_WIN32
    if (a) we are building Python itself and (b) no one previously defined them
  ........
    r56562 | mark.hammond | 2007-07-26 22:08:54 -0700 (Thu, 26 Jul 2007) | 2 lines

    Correctly detect AMD64 architecture on VC2003
  ........
    r56566 | nick.coghlan | 2007-07-27 03:36:30 -0700 (Fri, 27 Jul 2007) | 1 line

    Make test_math error messages more meaningful for small discrepancies in results
  ........
    r56588 | martin.v.loewis | 2007-07-27 11:28:22 -0700 (Fri, 27 Jul 2007) | 2 lines

    Bug #978833: Close https sockets by releasing the _ssl object.
  ........
    r56601 | martin.v.loewis | 2007-07-28 00:03:05 -0700 (Sat, 28 Jul 2007) | 3 lines

    Bug #1704793: Return UTF-16 pair if unicodedata.lookup cannot
    represent the result in a single character.
  ........
    r56604 | facundo.batista | 2007-07-28 07:21:22 -0700 (Sat, 28 Jul 2007) | 9 lines


    Moved all of the capture_server socket setup code into the try block
    so that the event gets set if a failure occurs during server setup
    (otherwise the test will block forever).  Changed to let the OS assign
    the server port number, and client side of test waits for port number
    assignment before proceeding. The test data in DispatcherWithSendTests
    is also sent in multiple send() calls instead of one to make sure this
    works properly. [GSoC - Alan McIntyre]
  ........
    r56611 | georg.brandl | 2007-07-29 01:26:10 -0700 (Sun, 29 Jul 2007) | 2 lines

    Clarify PEP 343 description.
  ........
    r56614 | georg.brandl | 2007-07-29 02:11:15 -0700 (Sun, 29 Jul 2007) | 2 lines

    try-except-finally is new in 2.5.
  ........
    r56617 | facundo.batista | 2007-07-29 07:23:08 -0700 (Sun, 29 Jul 2007) | 9 lines


    Added tests for asynchat classes simple_producer & fifo, and the
    find_prefix_at_end function. Check behavior of a string given as a
    producer.  Added tests for behavior of asynchat.async_chat when given
    int, long, and None terminator arguments. Added usepoll attribute to
    TestAsynchat to allow running the asynchat tests with poll support
    chosen whether it's available or not (improves coverage of asyncore
    code). [GSoC - Alan McIntyre]
  ........
    r56620 | georg.brandl | 2007-07-29 10:38:35 -0700 (Sun, 29 Jul 2007) | 2 lines

    Bug #1763149: use proper slice syntax in docstring.
     (backport)
  ........
    r56624 | mark.hammond | 2007-07-29 17:45:29 -0700 (Sun, 29 Jul 2007) | 4 lines

    Correct use of Py_BUILD_CORE - now make sure it is defined before it is
    referenced, and also fix definition of _WIN32_WINNT.
    Resolves patch 1761803.
  ........
    r56632 | facundo.batista | 2007-07-30 20:03:34 -0700 (Mon, 30 Jul 2007) | 8 lines


    When running asynchat tests on OS X (darwin), the test client now
    overrides asyncore.dispatcher.handle_expt to do nothing, since
    select.poll gives a POLLHUP error at the completion of these tests.
    Added timeout & count arguments to several asyncore.loop calls to
    avoid the possibility of a test hanging up a build. [GSoC - Alan
    McIntyre]
  ........
    r56633 | nick.coghlan | 2007-07-31 06:38:01 -0700 (Tue, 31 Jul 2007) | 1 line

    Eliminate RLock race condition reported in SF bug #1764059
  ........
    r56636 | martin.v.loewis | 2007-07-31 12:57:56 -0700 (Tue, 31 Jul 2007) | 2 lines

    Define _BSD_SOURCE, to get access to POSIX extensions on OpenBSD 4.1+.
  ........
    r56653 | facundo.batista | 2007-08-01 16:18:36 -0700 (Wed, 01 Aug 2007) | 9 lines


    Allow the OS to select a free port for each test server. For
    DebuggingServerTests, construct SMTP objects with a localhost argument
    to avoid abysmally long FQDN lookups (not relevant to items under
    test) on some machines that would cause the test to fail. Moved server
    setup code in the server function inside the try block to avoid the
    possibility of setup failure hanging the test.  Minor edits to conform
    to PEP 8. [GSoC - Alan McIntyre]
  ........
    r56681 | matthias.klose | 2007-08-02 14:33:13 -0700 (Thu, 02 Aug 2007) | 2 lines

    - Allow Emacs 22 for building the documentation in info format.
  ........
    r56689 | neal.norwitz | 2007-08-02 23:46:29 -0700 (Thu, 02 Aug 2007) | 1 line

    Py_ssize_t is defined regardless of HAVE_LONG_LONG.  Will backport
  ........
    r56727 | hyeshik.chang | 2007-08-03 21:10:18 -0700 (Fri, 03 Aug 2007) | 3 lines

    Fix gb18030 codec's bug that doesn't map two-byte characters on
    GB18030 extension in encoding. (bug reported by Bjorn Stabell)
  ........
    r56751 | neal.norwitz | 2007-08-04 20:23:31 -0700 (Sat, 04 Aug 2007) | 7 lines

    Handle errors when generating a warning.
    The value is always written to the returned pointer if getting it was
    successful, even if a warning causes an error. (This probably doesn't matter
    as the caller will probably discard the value.)

    Will backport.
  ........
................
2007-08-06 23:33:07 +00:00
Walter Dörwald 85d8e421a6 Fix test_unicodedata.py. 2007-05-23 20:11:33 +00:00
Guido van Rossum 805365ee39 Merged revisions 55007-55179 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/p3yk

........
  r55077 | guido.van.rossum | 2007-05-02 11:54:37 -0700 (Wed, 02 May 2007) | 2 lines

  Use the new print syntax, at least.
........
  r55142 | fred.drake | 2007-05-04 21:27:30 -0700 (Fri, 04 May 2007) | 1 line

  remove old cruftiness
........
  r55143 | fred.drake | 2007-05-04 21:52:16 -0700 (Fri, 04 May 2007) | 1 line

  make this work with the new Python
........
  r55162 | neal.norwitz | 2007-05-06 22:29:18 -0700 (Sun, 06 May 2007) | 1 line

  Get asdl code gen working with Python 2.3.  Should continue to work with 3.0
........
  r55164 | neal.norwitz | 2007-05-07 00:00:38 -0700 (Mon, 07 May 2007) | 1 line

  Verify checkins to p3yk (sic) branch go to 3000 list.
........
  r55166 | neal.norwitz | 2007-05-07 00:12:35 -0700 (Mon, 07 May 2007) | 1 line

  Fix this test so it runs again by importing warnings_test properly.
........
  r55167 | neal.norwitz | 2007-05-07 01:03:22 -0700 (Mon, 07 May 2007) | 8 lines

  So long xrange.  range() now supports values that are outside
  -sys.maxint to sys.maxint.  floats raise a TypeError.

  This has been sitting for a long time.  It probably has some problems and
  needs cleanup.  Objects/rangeobject.c now uses 4-space indents since
  it is almost completely new.
........
  r55171 | guido.van.rossum | 2007-05-07 10:21:26 -0700 (Mon, 07 May 2007) | 4 lines

  Fix two tests that were previously depending on significant spaces
  at the end of a line (and before that on Python 2.x print behavior
  that has no exact equivalent in 3.0).
........
2007-05-07 22:24:25 +00:00
Guido van Rossum 84fc66dd02 Rename 'unicode' to 'str' in its tp_name field. Rename 'str' to 'str8'.
Change all occurrences of unichr to chr.
2007-05-03 17:18:26 +00:00