Amaury Forgeot d'Arc
d0052d17b1
#1571184: makeunicodedata.py now generates the functions _PyUnicode_ToNumeric,
_PyUnicode_IsLinebreak and _PyUnicode_IsWhitespace.
It now also parses the Unihan.txt for numeric values.
2009-10-06 19:56:32 +00:00
Martin v. Löwis
99f277933e
Issue #4971: Fix titlecase for characters that are their own
titlecase, but not their own uppercase.
2009-04-26 00:53:18 +00:00
Martin v. Löwis
24329ba176
Issue #3811: The Unicode database was updated to 5.1.
Reviewed by Fredrik Lundh and Marc-Andre Lemburg.
2008-09-10 13:38:12 +00:00
Martin v. Löwis
d004fc810a
Patch 1494554: Update numeric properties to Unicode 4.1.
2006-05-27 08:36:52 +00:00
Marc-André Lemburg
2cb94aba12
Enhance the performance of two important Unicode character
type lookups: whitespace and linebreak.
These lookup tables are from the Python 1.6 version with the addition
of the 205F code point which was added as whitespace code point to Unicode
since then.
2005-10-20 19:06:35 +00:00
Hye-Shik Chang
e9ddfbb412
SF #989185: Drop unicode.iswide() and unicode.width() and add
unicodedata.east_asian_width(). You can still implement your own
simple width() function using it like this:
    def width(u):
        w = 0
        for c in unicodedata.normalize('NFC', u):
            cwidth = unicodedata.east_asian_width(c)
            if cwidth in ('W', 'F'): w += 2
            else: w += 1
        return w
2004-08-04 07:38:35 +00:00
Hye-Shik Chang
974ed7cfa5
- SF #962502: Add two more methods for unicode type; width() and
iswide() for east asian width manipulation. (Inspired by David
Goodger, Reviewed by Martin v. Loewis)
- Move _PyUnicode_TypeRecord.flags to the end of the struct so that
no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
2004-06-02 16:49:17 +00:00
Hye-Shik Chang
7db07e6972
Fix gcc 3.3 warnings related to Py_UNICODE_WIDE.
2003-12-29 01:36:01 +00:00
Martin v. Löwis
edf368c351
Make lower/upper/title work for non-BMP characters.
2002-10-18 16:40:36 +00:00
Martin v. Löwis
9def6a3a77
Update to Unicode 3.2 database.
2002-10-18 16:11:54 +00:00
Fredrik Lundh
72b068566a
removed "register const" from scalar arguments to the unicode
predicates
2001-06-27 22:08:26 +00:00
Fredrik Lundh
8f4558583f
use Py_UNICODE_WIDE instead of USE_UCS4_STORAGE and Py_UNICODE_SIZE
tests.
2001-06-27 18:59:43 +00:00
Martin v. Löwis
ce9b5a55e1
Encode surrogates in UTF-8 even for a wide Py_UNICODE.
Implement sys.maxunicode.
Explicitly wrap around upper/lower computations for wide Py_UNICODE.
When decoding large characters with UTF-8, represent expected test
results using the \U notation.
2001-06-27 06:28:56 +00:00
Fredrik Lundh
ee13dba1aa
more unicode tweaks: fix unicodectype for sizeof(Py_UNICODE) >
sizeof(int)
2001-06-26 20:36:12 +00:00
Fredrik Lundh
9e7dd4c185
unicode database compression, step 3:
- use unidb compression for the unicodectype module. smaller, faster,
and slightly more portable...
2000-09-25 21:48:13 +00:00
Trent Mick
8a74e5fc2c
Add the current Win64 compiler to the list of those that need the
huge switch statement broken up. This will probably not be necessary when
the Win64 compiler matures.
2000-08-12 19:37:27 +00:00
Guido van Rossum
16b1ad9c7d
Changing the CNRI copyright notice according to CNRI's instructions.
This is a notice without a date, which apparently is not a claim to
copyright but only advice to the reader. IANAL. :-)
2000-08-03 16:24:25 +00:00
Jack Jansen
56cdce3070
Conditionally (currently on ifdef macintosh) break the large switch up
into 1000-case smaller ones.
2000-07-06 13:57:38 +00:00
Marc-André Lemburg
f3938f55c7
Added new lookup API which matches all alphabetic Unicode characters,
i.e. the ones with category 'Ll','Lu','Lt','Lo','Lm'.
2000-07-05 09:48:59 +00:00
Guido van Rossum
dc742b3184
Marc-Andre Lemburg:
Added a few missing whitespace Unicode char mappings.
Thanks to Brian Hooper.
2000-04-11 15:39:02 +00:00
Guido van Rossum
603484d759
Unicode character type helpers, written by Marc-Andre Lemburg.
2000-03-10 22:52:46 +00:00