Benjamin Peterson
b027c6cae0
fix possible overflow bugs in unicodedata (closes #23367)
2015-03-02 11:17:05 -05:00
Ezio Melotti
6d0f0f299b
#18803: fix more typos. Patch by Févry Thibault.
2013-08-26 01:31:30 +03:00
Ezio Melotti
419e23cbb0
#18466: fix more typos. Patch by Févry Thibault.
2013-08-17 16:56:09 +03:00
Ezio Melotti
67c563e2f1
#16681: use "bidirectional class" instead of "bidirectional category" in the docstring too.
2012-12-14 20:12:25 +02:00
Antoine Pitrou
44b3b5457a
Remove all other uses of the C tolower()/toupper() which could break with a Turkish locale.
(except in the strop module, which is deprecated anyway)
2011-10-04 13:55:37 +02:00
Alexander Belopolsky
dce6cf353c
Merged revisions 87442 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r87442 | alexander.belopolsky | 2010-12-22 21:27:37 -0500 (Wed, 22 Dec 2010) | 1 line
Issue #10254: Fixed a crash and a regression introduced by the implementation of PRI 29.
........
2010-12-28 15:47:56 +00:00
Martin v. Löwis
e03c7787b9
Issue #10459: Update CJK character names to Unicode 5.2.
2010-11-22 10:53:46 +00:00
Antoine Pitrou
c83ea137d7
Untabify C files. Will watch buildbots.
2010-05-09 14:46:46 +00:00
Larry Hastings
402b73fb8d
Backported PyCapsule from 3.1, and converted most uses of
CObject to PyCapsule.
2010-03-25 00:54:54 +00:00
Ezio Melotti
0d0b80bc3e
Link specifically to the UCD version 5.2.0.
2010-03-23 00:38:12 +00:00
Ezio Melotti
ae735a763e
Update the version number of the Unicode Database in a few more places.
2010-03-22 23:07:32 +00:00
Victor Stinner
7c924ec925
Issue #1054943: Fix unicodedata.normalize('NFC', text) for the Public Review
Issue #29.
PR #29 was released in February 2004!
2010-03-04 12:09:33 +00:00
Amaury Forgeot d'Arc
d0052d17b1
#1571184: makeunicodedata.py now generates the functions _PyUnicode_ToNumeric,
_PyUnicode_IsLinebreak and _PyUnicode_IsWhitespace.
It now also parses the Unihan.txt for numeric values.
2009-10-06 19:56:32 +00:00
Antoine Pitrou
e988e286b2
Issue #1734234: Massively speed up `unicodedata.normalize()` when the
string is already in normalized form, by performing a quick check beforehand.
Original patch by Rauli Ruohonen.
2009-04-27 21:53:26 +00:00
Martin v. Löwis
24329ba176
Issue #3811: The Unicode database was updated to 5.1.
Reviewed by Fredrik Lundh and Marc-Andre Lemburg.
2008-09-10 13:38:12 +00:00
Gregory P. Smith
dd96db63f6
This reverts r63675 based on the discussion in this thread:
http://mail.python.org/pipermail/python-dev/2008-June/079988.html
Python 2.6 should stick with PyString_* in its codebase. The PyBytes_* names
in the spirit of 3.0 are available via a #define only. See the email thread.
2008-06-09 04:58:54 +00:00
Walter Dörwald
a2a89a8712
Change all functions that expect one unicode character to accept a pair of
surrogates in narrow builds. Fixes issue #1706460.
2008-06-02 20:36:03 +00:00
Christian Heimes
593daf545b
Renamed PyString to PyBytes
2008-05-26 12:51:38 +00:00
Christian Heimes
e93237dfcc
#1629: Renamed Py_Size, Py_Type and Py_Refcnt to Py_SIZE, Py_TYPE and Py_REFCNT. Macros for b/w compatibility are available.
2007-12-19 02:37:44 +00:00
Martin v. Löwis
f1e0b3f630
Bug #1704793: Return UTF-16 pair if unicodedata.lookup cannot
represent the result in a single character.
2007-07-28 07:03:05 +00:00
Martin v. Löwis
6819210b9e
PEP 3123: Provide forward compatibility with Python 3.0, while keeping
backwards compatibility. Add Py_Refcnt, Py_Type, Py_Size, and
PyVarObject_HEAD_INIT.
2007-07-21 06:55:02 +00:00
Walter Dörwald
6fc2382883
Replace C++ comment with C comment (fixes SF bug #1593525).
2006-11-09 16:23:26 +00:00
Neal Norwitz
b45f351832
I'm not sure why this code allocates this string for the error message.
I think it would be better to always use snprintf and have the format
limit the size of the name appropriately (like %.200s).
Klocwork #340
2006-08-12 01:57:47 +00:00
Martin v. Löwis
789c09d2cd
Update dangling references to the 3.2 database to
mention that this is UCD 4.1 now.
2006-08-10 19:04:00 +00:00
Neal Norwitz
37f694f21b
No functional change. Add comment and assert to describe why there cannot be overflow which was reported by Klocwork. Discussed on python-dev
2006-07-27 04:04:50 +00:00
Martin v. Löwis
d004fc810a
Patch 1494554: Update numeric properties to Unicode 4.1.
2006-05-27 08:36:52 +00:00
Neal Norwitz
88c97845c6
No reason to export get_decomp_record, make static
2006-04-17 00:36:29 +00:00
Martin v. Löwis
3c6e4188ed
Support NFD of very long strings.
2006-04-13 06:36:31 +00:00
Neal Norwitz
65c05b20e9
Get rid of warnings about using chars as subscripts
on Alpha (and possibly other platforms) by using Py_CHARMASK().
2006-04-10 02:17:47 +00:00
Martin v. Löwis
c350912990
Adjust CJK Ideograph range to Unicode 4.1.
2006-03-11 12:16:23 +00:00
Martin v. Löwis
0e2f9b2dfb
Fix refcounting bug.
2006-03-10 11:29:32 +00:00
Martin v. Löwis
5bd7c02298
Avoid forward-declaring the methods array.
Rename unicodedata.db* to unicodedata.ucd*
2006-03-10 11:20:04 +00:00
Martin v. Löwis
480f1bb67b
Update Unicode database to Unicode 4.1.
2006-03-09 23:38:20 +00:00
Thomas Wouters
1e365b265a
Remove gcc (4.0.x) warning about uninitialized value by explicitly setting
the sentinel value in the main function, rather than the helper. This
function could possibly do with an early-out if any of the helper calls ends
up with a len of 0, but I doubt it really matters (how common are malformed
hangul syllables, really?)
2006-03-01 21:58:30 +00:00
Martin v. Löwis
8b291e2d66
Patch #1213831: Fix typo in unicodedata._getcode.
Will backport to Python 2.4.
2005-09-18 08:17:56 +00:00
Hye-Shik Chang
4c560ea05b
Correct URL to the official UnicodeData 3.2.0 resource. (Reported
by Darek Suchojad)
2005-06-04 07:31:48 +00:00
Hye-Shik Chang
cf18a5d67b
Fill docstrings for module and functions, extracted from the tex
documentation. (Patch #1173245, Contributed by Jeremy Yallop)
2005-04-04 16:32:07 +00:00
Hye-Shik Chang
e9ddfbb412
SF #989185: Drop unicode.iswide() and unicode.width() and add
unicodedata.east_asian_width(). You can still implement your own
simple width() function using it like this:
    def width(u):
        w = 0
        for c in unicodedata.normalize('NFC', u):
            cwidth = unicodedata.east_asian_width(c)
            if cwidth in ('W', 'F'): w += 2
            else: w += 1
        return w
2004-08-04 07:38:35 +00:00
Hye-Shik Chang
69dc1c8f6a
Fix typo.
2004-07-15 04:30:25 +00:00
Martin v. Löwis
61e40bd897
Special case normalization of empty strings. Fixes #924361.
Backported to 2.3.
2004-04-17 19:36:48 +00:00
Martin v. Löwis
d2171d2ba4
Overallocate target buffer for normalization earlier. Fixes #834676.
Backported to 2.3.
2003-11-06 20:47:57 +00:00
Neal Norwitz
e9c571f968
Fix SF bug #694816, remove comparison of unsigned value < 0
2003-02-28 03:14:37 +00:00
Martin v. Löwis
2fb661fb80
Remove C++ comment.
2002-12-07 14:56:36 +00:00
Martin v. Löwis
b5c980b802
Add unidata_version. Bump generator version number.
2002-11-25 09:13:37 +00:00
Martin v. Löwis
8d93ca1383
Verify that the code in CJK UNIFIED IDEOGRAPH- actually denotes an ideograph.
2002-11-23 22:10:29 +00:00
Martin v. Löwis
677bde2dd1
Patch #626485: Support Unicode normalization.
2002-11-23 22:08:15 +00:00
Martin v. Löwis
ef7fe2e813
Implement names for CJK unified ideographs. Add name to KeyError output.
Verify that the lookup for an existing name succeeds.
2002-11-23 18:01:32 +00:00
Martin v. Löwis
2f4be4e38a
Fix off-by-one error.
2002-11-23 17:11:06 +00:00
Martin v. Löwis
7d41e29c58
Patch #626548: Support Hangul syllable names.
2002-11-23 12:22:32 +00:00
Martin v. Löwis
9def6a3a77
Update to Unicode 3.2 database.
2002-10-18 16:11:54 +00:00