24 Commits

Author SHA1 Message Date
Amaury Forgeot d'Arc d0052d17b1 #1571184: makeunicodedata.py now generates the functions _PyUnicode_ToNumeric,
_PyUnicode_IsLinebreak and _PyUnicode_IsWhitespace.

It now also parses the Unihan.txt for numeric values.
2009-10-06 19:56:32 +00:00
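
The generated lookups named in this commit surface in Python through `unicodedata.numeric()`, `str.isspace()` and `str.splitlines()`. The following is a minimal sketch, assuming a Python 3 interpreter whose Unicode database includes the Unihan numeric values; the sample characters are chosen for illustration and do not come from the commit itself.

```python
import unicodedata

# VULGAR FRACTION ONE HALF: numeric value listed in UnicodeData.txt.
half = unicodedata.numeric('\u00bd')

# CJK ideograph for "three": its numeric value comes from the Unihan data.
three = unicodedata.numeric('\u4e09')

# EM SPACE counts as whitespace.
is_space = '\u2003'.isspace()

# LINE SEPARATOR (U+2028) counts as a line break.
lines = 'a\u2028b'.splitlines()

print(half, three, is_space, lines)
```
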
Benjamin Peterson 5c8da86f3a convert usage of fail* to assert* 2009-06-30 22:57:08 +00:00
Walter Dörwald 4c69da2879 Fix typo. 2009-04-26 19:11:43 +00:00
Martin v. Löwis 99f277933e Issue #4971: Fix titlecase for characters that are their own
titlecase, but not their own uppercase.
2009-04-26 00:53:18 +00:00
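
One character family that fits this description is the DŽ digraph, which has separate uppercase, titlecase and lowercase code points. A short sketch, assuming Python 3 string semantics; the choice of U+01C4 to U+01C6 as the example is an assumption for illustration, not something the commit message states.

```python
# U+01C4 (uppercase), U+01C5 (titlecase) and U+01C6 (lowercase)
# are three case forms of the same digraph.
upper_form, title_form, lower_form = '\u01c4', '\u01c5', '\u01c6'

# Every form should titlecase to U+01C5, including U+01C5 itself.
titled = [c.title() for c in (upper_form, title_form, lower_form)]

# U+01C5 is its own titlecase, but its uppercase is U+01C4.
uppered = title_form.upper()

print([hex(ord(c)) for c in titled], hex(ord(uppered)))
```
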
Walter Dörwald 5d98ec76bb Issue #5828 (Invalid behavior of unicode.lower): Fixed bogus logic in
makeunicodedata.py and regenerated the Unicode database (This fixes
u'\u1d79'.lower() == '\x00').
2009-04-25 14:03:16 +00:00
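
The symptom described above can be checked directly. This is a sketch for a Python 3 interpreter, where the `u''` prefix is no longer needed; the scan over the Basic Multilingual Plane is an illustrative addition and not part of the commit.

```python
ch = '\u1d79'  # LATIN SMALL LETTER INSULAR G, already lowercase

# With a correct database, lowercasing leaves the character unchanged
# instead of producing a NUL character.
lowered = ch.lower()

# Scan the BMP (skipping U+0000 itself) for any character whose case
# conversions contain a NUL.
bogus = [
    cp for cp in range(1, 0x10000)
    if '\x00' in chr(cp).lower() + chr(cp).upper() + chr(cp).title()
]

print(repr(lowered), bogus)
```
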
Benjamin Peterson c078f929cb don't segfault when \N escapes are used and unicodedata fails to load
Fixes #4367
2008-11-21 22:27:24 +00:00
Martin v. Löwis 24329ba176 Issue #3811: The Unicode database was updated to 5.1.
Reviewed by Fredrik Lundh and Marc-Andre Lemburg.
2008-09-10 13:38:12 +00:00
Walter Dörwald a2a89a8712 Change all functions that expect one unicode character to accept a pair of
surrogates in narrow builds. Fixes issue #1706460.
2008-06-02 20:36:03 +00:00
Martin v. Löwis f1e0b3f630 Bug #1704793: Return UTF-16 pair if unicodedata.lookup cannot
represent the result in a single character.
2007-07-28 07:03:05 +00:00
Martin v. Löwis d004fc810a Patch 1494554: Update numeric properties to Unicode 4.1. 2006-05-27 08:36:52 +00:00
Georg Brandl bffb0bc064 In stdlib, use hashlib instead of deprecated md5 and sha modules. 2006-04-30 08:57:35 +00:00
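
For readers unfamiliar with the migration, `hashlib` constructors take the place of the old `md5.new()` and `sha.new()` calls. A minimal sketch under Python 3, where the input must be bytes; the input value is arbitrary and chosen only for illustration.

```python
import hashlib

# One-shot hashing: pass the data to the constructor.
digest = hashlib.sha1(b'abc').hexdigest()

# Incremental hashing: feed the data in pieces with update().
h = hashlib.sha1()
h.update(b'a')
h.update(b'bc')
incremental_digest = h.hexdigest()

print(digest)
```
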
Martin v. Löwis 480f1bb67b Update Unicode database to Unicode 4.1. 2006-03-09 23:38:20 +00:00
Georg Brandl 7eb4b7d177 Fix all wrong instances of "it's". 2005-07-22 21:49:32 +00:00
Hye-Shik Chang e9ddfbb412 SF #989185: Drop unicode.iswide() and unicode.width() and add
unicodedata.east_asian_width().  You can still implement your own
simple width() function using it like this:
    import unicodedata

    def width(u):
        # Wide ('W') and fullwidth ('F') characters take two columns.
        w = 0
        for c in unicodedata.normalize('NFC', u):
            if unicodedata.east_asian_width(c) in ('W', 'F'):
                w += 2
            else:
                w += 1
        return w
2004-08-04 07:38:35 +00:00
Martin v. Löwis 61e40bd897 Special case normalization of empty strings. Fixes #924361.
Backported to 2.3.
2004-04-17 19:36:48 +00:00
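
The behavior this commit special-cases can be exercised as follows. A sketch, assuming Python 3 and the four standard normalization forms; the `FORMS` tuple and `results` name are illustrative.

```python
import unicodedata

FORMS = ('NFC', 'NFD', 'NFKC', 'NFKD')

# Normalizing the empty string should return an empty string
# for every normalization form.
results = {form: unicodedata.normalize(form, '') for form in FORMS}

print(results)
```
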
Walter Dörwald 21d3a32b99 Combine the functionality of test_support.run_unittest()
and test_support.run_classtests() into run_unittest()
and use it wherever possible.

Also don't use "from test.test_support import ...", but
"from test import test_support" in a few spots.

From SF patch #662807.
2003-05-01 17:45:56 +00:00
Tim Peters 669454e9dc Whitespace normalization. 2003-03-07 17:30:48 +00:00
Walter Dörwald 37c4728c64 Port test_ucn and test_unicodedata to PyUnit. Add a few tests for error
cases increasing coverage in unicodedata.c from 87% to 95%
(when the normalization tests are run). From SF patch #662807.
2003-02-26 14:49:41 +00:00
Barry Warsaw 04f357cffe Get rid of relative imports in all unittests. Now anything that
imports e.g. test_support must do so using an absolute package name
such as "import test.test_support" or "from test import test_support".

This also updates the README in Lib/test, and gets rid of the
duplicate data directory in Lib/test/data (replaced by
Lib/email/test/data).

Now Tim and Jack can have at it. :)
2002-07-23 19:04:11 +00:00
Marc-André Lemburg 3661908a6a This patch removes all uses of "assert" in the regression test suite
and replaces them with a new API verify(). As a result the regression
suite will also perform its tests in optimization mode.

Written by Marc-Andre Lemburg. Copyright assigned to Guido van Rossum.
2001-01-17 19:11:13 +00:00
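
The motivation is that `assert` statements are stripped when Python runs with `-O`, so tests built on them silently check nothing in optimization mode. The sketch below shows what a `verify()`-style helper could look like; the `TestFailed` exception name and the default message are assumptions for illustration, not a copy of the actual helper.

```python
class TestFailed(Exception):
    """Raised when a verified condition does not hold (hypothetical name)."""


def verify(condition, reason='test failed'):
    # An ordinary function call is not removed by -O, unlike an
    # assert statement, so the check runs in every mode.
    if not condition:
        raise TestFailed(reason)


verify(1 + 1 == 2)  # passes silently

try:
    verify(1 + 1 == 3, 'arithmetic is broken')
except TestFailed as exc:
    failure_message = str(exc)
```
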
Fred Drake 004d5e6880 Make reindent.py happy (convert everything to 4-space indents!). 2000-10-23 17:22:08 +00:00
Marc-André Lemburg 67ceca7add Fixed encoding to use an endianness independent format. 2000-09-27 12:24:34 +00:00
Marc-André Lemburg 6a20ee7dec Added test suite for the complete Unicode database. The test previously
only tested a few cases.
2000-09-26 16:18:58 +00:00
Guido van Rossum 24bdb0474f Marc-Andre Lemburg:
The attached patch set includes a workaround to get Python with
Unicode to compile on BSDI 4.x (courtesy Thomas Wouters; the cause
is a bug in the BSDI wchar.h header file) and Python interfaces
for the MBCS codec donated by Mark Hammond.

Also included are some minor corrections w/r to the docs of
the new "es" and "es#" parser markers (use PyMem_Free() instead
of free(); thanks to Mark Hammond for finding these).

The unicodedata tests are now in a separate file
(test_unicodedata.py) to avoid problems if the module cannot
be found.
2000-03-28 20:29:59 +00:00