cpython

Commit Graph

Author	SHA1	Message	Date
James Gerity	def828995a	fixes gh-109559: Update `unicodedata` for Unicode 15.1.0 (GH-109560) --------- Co-authored-by: Benjamin Peterson <benjamin@python.org>	2023-09-19 22:07:47 -07:00
Benjamin Peterson	fd1e477f53	closes gh-96734: Update to Unicode 15.0.0. (GH-96809)	2022-09-13 15:45:12 -07:00
Carl Friedrich Bolz-Tereick	9c197bc8bf	GH-96172 fix unicodedata.east_asian_width being wrong on unassigned code points (#96207)	2022-08-26 19:29:39 +03:00
Carl Friedrich Bolz-Tereick	2d9f252c0c	gh-96019: Fix caching of decompositions in makeunicodedata (GH-96020)	2022-08-19 12:20:44 +03:00
Benjamin Peterson	024fda47d4	closes bpo-45190: Update Unicode data to version 14.0.0. (GH-28336)	2021-09-14 11:00:38 -07:00
Benjamin Peterson	051b9d08d1	closes bpo-39926: Update Unicode to 13.0.0. (GH-18910)	2020-03-10 20:41:34 -07:00
Benjamin Peterson	3aca40d3cb	closes bpo-36861: Update Unicode database to 12.1.0. (GH-13214) Adds ㋿.	2019-05-08 20:59:35 -07:00
Inada Naoki	6fec905de5	bpo-36642: make unicodedata const (GH-12855)	2019-04-17 08:40:34 +09:00
Benjamin Peterson	738c19f4c5	closes bpo-33376: Update to Unicode 12.0.0. (GH-12256)	2019-03-09 16:25:55 -08:00
Benjamin Peterson	7c69c1c0fb	update to Unicode 11.0.0 (closes bpo-33778) (GH-7439) Also, standardize indentation of generated tables.	2018-06-06 20:14:28 -07:00
Benjamin Peterson	279a96206f	bpo-30736: upgrade to Unicode 10.0 (#2344) Straightforward. While we're at it, though, strip trailing whitespace from generated tables.	2017-06-22 22:31:08 -07:00
Benjamin Peterson	6775231597	Unicode 9.0.0 Not completely mechanical since support for East Asian Width changes—emoji codepoints became Wide—had to be added to unicodedata.	2016-09-14 23:53:47 -07:00
Benjamin Peterson	4801383c29	upgrade to Unicode 8.0.0	2015-06-27 15:45:56 -05:00
Benjamin Peterson	3032ed7cb1	upgrade to unicode 7.0.0	2014-07-06 13:04:20 -07:00
Benjamin Peterson	94d08d908b	upgrade unicode db to 6.3.0 (closes #19221)	2013-10-10 17:24:45 -04:00
Benjamin Peterson	b8350f1c7d	upgrade to UCD 6.2	2012-09-29 13:47:39 -04:00
Benjamin Peterson	71f660e00f	update to Unicode 6.1	2012-02-20 22:24:29 -05:00
Martin v. Löwis	baecd7243a	Upgrade to Unicode 6.0.0. makeunicodedata.py: download all data files from unicode.org, switch to extracting Unihan data from zip file. Read linebreakprops and derivednormalizationprops even for old versions, even though they are not used in delta records. test:unicode.py: U+11000 is now assigned, use U+14000 instead.	2010-10-11 22:42:28 +00:00
Florent Xicluna	faa663f03d	Fixed a failure in test_bigmem. Merged revision 79059 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r79059 | florent.xicluna | 2010-03-18 22:50:06 +0100 (jeu, 18 mar 2010) | 2 lines Issue #8024: Update the Unicode database to 5.2 ........	2010-03-19 13:37:08 +00:00
Florent Xicluna	f1789dee30	Revert Unicode UCD 5.2 upgrade in 3.x. It broke repr() for unicode objects, and gave failures in test_bigmem. Revert 79062, 79065 and 79083.	2010-03-19 01:17:46 +00:00
Florent Xicluna	657de43f97	Merged revisions 79059 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r79059 | florent.xicluna | 2010-03-18 22:50:06 +0100 (jeu, 18 mar 2010) | 2 lines Issue #8024: Update the Unicode database to 5.2 ........	2010-03-18 22:11:01 +00:00
Amaury Forgeot d'Arc	7d52079395	Merged revisions 75272-75273 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r75272 | amaury.forgeotdarc | 2009-10-06 21:56:32 +0200 (mar., 06 oct. 2009) | 5 lines #1571184: makeunicodedata.py now generates the functions _PyUnicode_ToNumeric, _PyUnicode_IsLinebreak and _PyUnicode_IsWhitespace. It now also parses the Unihan.txt for numeric values. ........ r75273 | amaury.forgeotdarc | 2009-10-06 22:02:09 +0200 (mar., 06 oct. 2009) | 2 lines Add Anders Chrigstrom to Misc/ACKS for his work on unicodedata. ........	2009-10-06 21:03:20 +00:00
Antoine Pitrou	7a0fedfd1d	Merged revisions 72054 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r72054 | antoine.pitrou | 2009-04-27 23:53:26 +0200 (lun., 27 avril 2009) | 5 lines Issue #1734234: Massively speedup `unicodedata.normalize()` when the string is already in normalized form, by performing a quick check beforehand. Original patch by Rauli Ruohonen. ........	2009-04-27 22:31:40 +00:00
Martin v. Löwis	93cbca33f2	Merged revisions 66362 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r66362 | martin.v.loewis | 2008-09-10 15:38:12 +0200 (Mi, 10 Sep 2008) | 3 lines Issue #3811: The Unicode database was updated to 5.1. Reviewed by Fredrik Lundh and Marc-Andre Lemburg. ........	2008-09-10 14:08:48 +00:00
Martin v. Löwis	59683e8529	Merged revisions 64226 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r64226 | martin.v.loewis | 2008-06-13 09:47:47 +0200 (Fr, 13 Jun 2008) | 2 lines Make more symbols static. ........	2008-06-13 07:50:45 +00:00
Martin v. Löwis	480f1bb67b	Update Unicode database to Unicode 4.1.	2006-03-09 23:38:20 +00:00
Hye-Shik Chang	e9ddfbb412	SF #989185: Drop unicode.iswide() and unicode.width() and add unicodedata.east_asian_width(). You can still implement your own simple width() function using it like this: def width(u): w = 0 for c in unicodedata.normalize('NFC', u): cwidth = unicodedata.east_asian_width(c) if cwidth in ('W', 'F'): w += 2 else: w += 1 return w	2004-08-04 07:38:35 +00:00
Hye-Shik Chang	974ed7cfa5	- SF #962502: Add two more methods for unicode type; width() and iswide() for east asian width manipulation. (Inspired by David Goodger, Reviewed by Martin v. Loewis) - Move _PyUnicode_TypeRecord.flags to the end of the struct so that no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)	2004-06-02 16:49:17 +00:00
Martin v. Löwis	b5c980b802	Add unidata_version. Bump generator version number.	2002-11-25 09:13:37 +00:00
Martin v. Löwis	d5169bad94	Regenerate from Unicode 3.2.0 to include all First/Last ranges.	2002-11-24 23:10:08 +00:00
Martin v. Löwis	677bde2dd1	Patch #626485: Support Unicode normalization.	2002-11-23 22:08:15 +00:00
Martin v. Löwis	9def6a3a77	Update to Unicode 3.2 database.	2002-10-18 16:11:54 +00:00
Fredrik Lundh	7b7dd107b3	compress unicode decomposition tables (this saves another 55k)	2001-01-21 22:41:08 +00:00
Fredrik Lundh	9e9bcda547	forgot to check in the new makeunicodedata.py script	2001-01-21 17:01:31 +00:00
Fredrik Lundh	fad27aee11	Added 38,642 missing characters to the Unicode database (first-last ranges) -- but thanks to the 2.0 compression scheme, this doesn't add a single byte to the resulting binaries (!) Closes bug #117524	2000-11-03 20:24:15 +00:00
Fredrik Lundh	cfcea49218	unicode database compression, step 2: - fixed attributions - moved decomposition data to a separate table, in preparation for step 3 (which won't happen before 2.0 final, promise!) - use relative paths in the generator script I have a lot more stuff in the works for 2.1, but let's leave that for another day...	2000-09-25 08:07:06 +00:00
Fredrik Lundh	eedb5764a5	unicode database compression, step 1: - use unidb compression for the unicodedata module. on Windows, the new unidatabase module is 120k, down from nearly 600k.	2000-09-24 21:28:28 +00:00

37 Commits
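
The message for commit e9ddfbb412 carries a small width() helper whose line breaks were lost in the table cell. A plausible reconstruction for current Python 3 is sketched below. The type hints, docstring, and the names `text`, `ch`, and `total` are additions for illustration, not part of the original commit message; the logic (normalize to NFC, count Wide and Fullwidth characters as two columns, everything else as one) follows the message as written.

```python
import unicodedata


def width(text: str) -> int:
    """Approximate column width of text, per the sketch in commit e9ddfbb412."""
    total = 0
    # Normalize first so precomposed forms are measured, as in the commit message.
    for ch in unicodedata.normalize("NFC", text):
        # 'W' (Wide) and 'F' (Fullwidth) characters take two columns.
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            total += 2
        else:
            total += 1
    return total
```

This remains a simple approximation: combining marks and other zero-width characters are still counted as one column each, so it should not be relied on for exact terminal layout.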