15 Commits

Author SHA1 Message Date
Florent Xicluna 2e0a53fdf6 Issue #8024: Update the Unicode database to 5.2 2010-03-18 21:50:06 +00:00
Amaury Forgeot d'Arc 5c92d4301d #7112: Fix compilation warning in unicodetype_db.h
makeunicodedata now generates double literals
2009-10-13 21:29:34 +00:00
Amaury Forgeot d'Arc d0052d17b1 #1571184: makeunicodedata.py now generates the functions _PyUnicode_ToNumeric,
_PyUnicode_IsLinebreak and _PyUnicode_IsWhitespace.

It now also parses the Unihan.txt for numeric values.
2009-10-06 19:56:32 +00:00
Walter Dörwald 5d98ec76bb Issue #5828 (Invalid behavior of unicode.lower): Fixed bogus logic in
makeunicodedata.py and regenerated the Unicode database (This fixes
u'\u1d79'.lower() == '\x00').
2009-04-25 14:03:16 +00:00
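
The symptom named in the commit above can be checked with a short regression test. This is a minimal sketch in Python 3 syntax (the commit itself predates Python 3 strings), and the helper name `lower_is_sane` is hypothetical, not part of makeunicodedata.py or the test suite.

```python
def lower_is_sane(ch: str) -> bool:
    """Return True if lowercasing ch does not yield a NUL character.

    Hypothetical helper: it only checks for the symptom described in
    Issue #5828, where lower() returned '\\x00' for some characters.
    """
    return "\x00" not in ch.lower()


# U+1D79 is the character named in the commit message.
print(lower_is_sane("\u1d79"))
```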
Martin v. Löwis 24329ba176 Issue #3811: The Unicode database was updated to 5.1.
Reviewed by Fredrik Lundh and Marc-Andre Lemburg.
2008-09-10 13:38:12 +00:00
Martin v. Löwis 480f1bb67b Update Unicode database to Unicode 4.1. 2006-03-09 23:38:20 +00:00
Hye-Shik Chang e9ddfbb412 SF #989185: Drop unicode.iswide() and unicode.width() and add
unicodedata.east_asian_width().  You can still implement your own
simple width() function using it like this:
    def width(u):
        w = 0
        for c in unicodedata.normalize('NFC', u):
            cwidth = unicodedata.east_asian_width(c)
            if cwidth in ('W', 'F'): w += 2
            else: w += 1
        return w
2004-08-04 07:38:35 +00:00
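
For reference, the width() recipe from the commit above can be written as a self-contained Python 3 function. This is a sketch under the same assumptions as the original recipe: wide ('W') and fullwidth ('F') characters count as two columns and everything else as one, so combining marks that survive NFC normalization and ambiguous-width characters are each counted as one column.

```python
import unicodedata


def width(text: str) -> int:
    """Approximate the display width of text in terminal columns."""
    total = 0
    # Normalize first so that composable sequences collapse into one character.
    for ch in unicodedata.normalize("NFC", text):
        # 'W' (wide) and 'F' (fullwidth) occupy two columns; the rest one.
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            total += 2
        else:
            total += 1
    return total


print(width("abc"))
print(width("日本語"))
```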
Hye-Shik Chang 974ed7cfa5 - SF #962502: Add two more methods for unicode type; width() and
iswide() for east asian width manipulation. (Inspired by David
Goodger, Reviewed by Martin v. Loewis)
- Move _PyUnicode_TypeRecord.flags to the end of the struct so that
no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
2004-06-02 16:49:17 +00:00
Martin v. Löwis b5c980b802 Add unidata_version. Bump generator version number. 2002-11-25 09:13:37 +00:00
Martin v. Löwis d5169bad94 Regenerate from Unicode 3.2.0 to include all First/Last ranges. 2002-11-24 23:10:08 +00:00
Martin v. Löwis 9def6a3a77 Update to Unicode 3.2 database. 2002-10-18 16:11:54 +00:00
Fredrik Lundh 9e9bcda547 forgot to check in the new makeunicodedata.py script 2001-01-21 17:01:31 +00:00
Fredrik Lundh fad27aee11 Added 38,642 missing characters to the Unicode database (first-last
ranges) -- but thanks to the 2.0 compression scheme, this doesn't add
a single byte to the resulting binaries (!)

Closes bug #117524
2000-11-03 20:24:15 +00:00
Fredrik Lundh 375732cd41 - don't set the titlecase flag for uppercase letters (sorry, tim) 2000-09-25 23:03:34 +00:00
Fredrik Lundh 69b58e2772 unicode database compression, step 3:
- use unidb compression for the unicodectype module.  smaller, faster,
  and slightly more portable...

(note: this commit doesn't include the unicodectype.c file itself; I'm
still waiting for the reviewers...)
2000-09-25 21:12:34 +00:00