cpython

Commit Graph

Author	SHA1	Message	Date
Florent Xicluna	2e0a53fdf6	Issue #8024: Update the Unicode database to 5.2	2010-03-18 21:50:06 +00:00
Amaury Forgeot d'Arc	d0052d17b1	#1571184: makeunicodedata.py now generates the functions _PyUnicode_ToNumeric, _PyUnicode_IsLinebreak and _PyUnicode_IsWhitespace. It now also parses the Unihan.txt for numeric values.	2009-10-06 19:56:32 +00:00
Antoine Pitrou	e988e286b2	Issue #1734234: Massively speedup `unicodedata.normalize()` when the string is already in normalized form, by performing a quick check beforehand. Original patch by Rauli Ruohonen.	2009-04-27 21:53:26 +00:00
Martin v. Löwis	24329ba176	Issue #3811: The Unicode database was updated to 5.1. Reviewed by Fredrik Lundh and Marc-Andre Lemburg.	2008-09-10 13:38:12 +00:00
Martin v. Löwis	111c180674	Make more symbols static.	2008-06-13 07:47:47 +00:00
Martin v. Löwis	480f1bb67b	Update Unicode database to Unicode 4.1.	2006-03-09 23:38:20 +00:00
Hye-Shik Chang	e9ddfbb412	SF #989185 : Drop unicode.iswide() and unicode.width() and add unicodedata.east_asian_width(). You can still implement your own simple width() function using it like this: def width(u): w = 0 for c in unicodedata.normalize('NFC', u): cwidth = unicodedata.east_asian_width(c) if cwidth in ('W', 'F'): w += 2 else: w += 1 return w	2004-08-04 07:38:35 +00:00
Hye-Shik Chang	974ed7cfa5	- SF #962502 : Add two more methods for unicode type; width() and iswide() for east asian width manipulation. (Inspired by David Goodger, Reviewed by Martin v. Loewis) - Move _PyUnicode_TypeRecord.flags to the end of the struct so that no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)	2004-06-02 16:49:17 +00:00
Martin v. Löwis	b5c980b802	Add unidata_version. Bump generator version number.	2002-11-25 09:13:37 +00:00
Martin v. Löwis	d5169bad94	Regenerate from Unicode 3.2.0 to include all First/Last ranges.	2002-11-24 23:10:08 +00:00
Martin v. Löwis	677bde2dd1	Patch #626485: Support Unicode normalization.	2002-11-23 22:08:15 +00:00
Martin v. Löwis	9def6a3a77	Update to Unicode 3.2 database.	2002-10-18 16:11:54 +00:00
Fredrik Lundh	7b7dd107b3	compress unicode decomposition tables (this saves another 55k)	2001-01-21 22:41:08 +00:00
Fredrik Lundh	9e9bcda547	forgot to check in the new makeunicodedata.py script	2001-01-21 17:01:31 +00:00
Fredrik Lundh	fad27aee11	Added 38,642 missing characters to the Unicode database (first-last ranges) -- but thanks to the 2.0 compression scheme, this doesn't add a single byte to the resulting binaries (!) Closes bug #117524	2000-11-03 20:24:15 +00:00
Fredrik Lundh	cfcea49218	unicode database compression, step 2: - fixed attributions - moved decomposition data to a separate table, in preparation for step 3 (which won't happen before 2.0 final, promise!) - use relative paths in the generator script I have a lot more stuff in the works for 2.1, but let's leave that for another day...	2000-09-25 08:07:06 +00:00
Fredrik Lundh	eedb5764a5	unicode database compression, step 1: - use unidb compression for the unicodedata module. on Windows, the new unidatabase module is 120k, down from nearly 600k.	2000-09-24 21:28:28 +00:00

17 Commits
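
The message for commit e9ddfbb412 carries a small `width()` helper whose line breaks were lost in this listing. A minimal runnable sketch of that helper follows, using only `unicodedata.normalize()` and `unicodedata.east_asian_width()` from the standard library. The docstring and the sample strings are additions for illustration, not part of the commit, and the helper is only an approximation: combining marks that do not compose under NFC still count as one column.

```python
import unicodedata


def width(u):
    """Approximate display width of u, counting wide/fullwidth characters as 2."""
    w = 0
    # Normalize to NFC first so that composable sequences become one character.
    for c in unicodedata.normalize('NFC', u):
        cwidth = unicodedata.east_asian_width(c)
        if cwidth in ('W', 'F'):  # Wide or Fullwidth
            w += 2
        else:
            w += 1
    return w


if __name__ == "__main__":
    # Illustrative inputs (not from the commit message).
    for sample in ("abc", "日本語", "e\u0301"):
        print(repr(sample), width(sample))
```

Treating every other East Asian Width class (including Ambiguous) as one column follows the commit's own version; a terminal that renders ambiguous characters as wide would need a different rule.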