Hye-Shik Chang
e9ddfbb412
SF #989185 : Drop unicode.iswide() and unicode.width() and add
unicodedata.east_asian_width(). You can still implement your own
simple width() function using it like this:
    def width(u):
        w = 0
        for c in unicodedata.normalize('NFC', u):
            cwidth = unicodedata.east_asian_width(c)
            if cwidth in ('W', 'F'): w += 2
            else: w += 1
        return w
2004-08-04 07:38:35 +00:00
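As a rough illustration of the replacement described in the commit above, here is a minimal Python 3 sketch of the same idea. The name `display_width` and the sample strings are illustrative, not part of the commit. Like the original snippet, it assumes every character outside the Wide ('W') and Fullwidth ('F') classes takes one column, so ambiguous-width and combining characters are also counted as one.

```python
import unicodedata

def display_width(text):
    """Approximate column count: Wide/Fullwidth characters count as two."""
    total = 0
    # Normalize first so precomposed forms are measured, as in the commit message.
    for ch in unicodedata.normalize("NFC", text):
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            total += 2
        else:
            total += 1
    return total

print(display_width("abc"))     # three narrow characters
print(display_width("日本語"))  # three wide characters
```

A real terminal may render ambiguous-width characters differently depending on locale, so treat the result as an estimate.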
Hye-Shik Chang
974ed7cfa5
- SF #962502 : Add two more methods for unicode type; width() and
iswide() for east asian width manipulation. (Inspired by David
Goodger, Reviewed by Martin v. Loewis)
- Move _PyUnicode_TypeRecord.flags to the end of the struct so that
no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
2004-06-02 16:49:17 +00:00
Hye-Shik Chang
7db07e6972
Fix gcc 3.3 warnings related to Py_UNICODE_WIDE.
2003-12-29 01:36:01 +00:00
Martin v. Löwis
edf368c351
Make lower/upper/title work for non-BMP characters.
2002-10-18 16:40:36 +00:00
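To make the effect of the change above concrete, the following sketch checks case mapping for a character outside the Basic Multilingual Plane on a current Python 3 interpreter. The Deseret letters are chosen here only as an example and are not mentioned in the commit.

```python
# U+10400 DESERET CAPITAL LETTER LONG I and its lowercase partner U+10428.
capital = "\U00010400"
small = "\U00010428"

# Both code points lie above U+FFFF, i.e. outside the BMP.
print(ord(capital) > 0xFFFF, ord(small) > 0xFFFF)

# Case conversion maps between the two code points.
print(capital.lower() == small)
print(small.upper() == capital)
print(small.title() == capital)
```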
Martin v. Löwis
9def6a3a77
Update to Unicode 3.2 database.
2002-10-18 16:11:54 +00:00
Fredrik Lundh
72b068566a
removed "register const" from scalar arguments to the unicode
predicates
2001-06-27 22:08:26 +00:00
Fredrik Lundh
8f4558583f
use Py_UNICODE_WIDE instead of USE_UCS4_STORAGE and Py_UNICODE_SIZE
tests.
2001-06-27 18:59:43 +00:00
Martin v. Löwis
ce9b5a55e1
Encode surrogates in UTF-8 even for a wide Py_UNICODE.
Implement sys.maxunicode.
Explicitly wrap around upper/lower computations for wide Py_UNICODE.
When decoding large characters with UTF-8, represent expected test
results using the \U notation.
2001-06-27 06:28:56 +00:00
Fredrik Lundh
ee13dba1aa
more unicode tweaks: fix unicodectype for sizeof(Py_UNICODE) >
sizeof(int)
2001-06-26 20:36:12 +00:00
Fredrik Lundh
9e7dd4c185
unicode database compression, step 3:
- use unidb compression for the unicodectype module. smaller, faster,
and slightly more portable...
2000-09-25 21:48:13 +00:00
Trent Mick
8a74e5fc2c
Add the current Win64 compiler to the list of those that need the
huge switch statement broken up. This will probably not be necessary when
the Win64 compiler matures.
2000-08-12 19:37:27 +00:00
Guido van Rossum
16b1ad9c7d
Changing the CNRI copyright notice according to CNRI's instructions.
This is a notice without a date, which apparently is not a claim to
copyright but only advice to the reader. IANAL. :-)
2000-08-03 16:24:25 +00:00
Jack Jansen
56cdce3070
Conditionally (currently on ifdef macintosh) break the large switch up
into 1000-case smaller ones.
2000-07-06 13:57:38 +00:00
Marc-André Lemburg
f3938f55c7
Added new lookup API which matches all alphabetic Unicode characters,
i.e. the ones with category 'Ll','Lu','Lt','Lo','Lm'.
2000-07-05 09:48:59 +00:00
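The category-based definition of "alphabetic" in the commit above can be sketched at the Python level with the `unicodedata` module. The helper name `is_alphabetic` is hypothetical. The commit itself added a C-level lookup, so this only mirrors the rule and is not the implementation.

```python
import unicodedata

# The five letter categories named in the commit message.
LETTER_CATEGORIES = {"Ll", "Lu", "Lt", "Lo", "Lm"}

def is_alphabetic(ch):
    """Return True if the single character ch has a letter general category."""
    return unicodedata.category(ch) in LETTER_CATEGORIES

for sample in ["a", "Σ", "中", "1", " "]:
    # str.isalpha() follows the same category rule, so the two should agree.
    print(repr(sample), is_alphabetic(sample), sample.isalpha())
```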
Guido van Rossum
dc742b3184
Marc-Andre Lemburg:
Added a few missing whitespace Unicode char mappings.
Thanks to Brian Hooper.
2000-04-11 15:39:02 +00:00
Guido van Rossum
603484d759
Unicode character type helpers, written by Marc-Andre Lemburg.
2000-03-10 22:52:46 +00:00