Commit Graph

49 Commits

Author SHA1 Message Date
Martin v. Löwis bd28ca65d6 Bug #1704793: Raise KeyError if unicodedata.lookup cannot
represent the result in a single character.
2007-07-28 07:01:43 +00:00
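The KeyError behaviour mentioned in this commit can be handled from Python as follows. This is a minimal sketch, assuming a current Python 3 interpreter, where every code point fits in one str element and the narrow-build case behind the bug no longer arises. `safe_lookup` is a hypothetical helper, not part of `unicodedata`.

```python
import unicodedata


def safe_lookup(name, default=None):
    """Return the character called `name`, or `default` if lookup fails.

    Hypothetical convenience wrapper; only unicodedata.lookup is a real API.
    """
    try:
        return unicodedata.lookup(name)
    except KeyError:
        # unicodedata.lookup reports an unknown name with KeyError.
        return default
```

Callers that prefer the exception can call `unicodedata.lookup` directly and catch `KeyError` themselves.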
Walter Dörwald f2d5c6d117 Backport checkin:
Replace C++ comment with C comment (fixes SF bug #1593525).
2006-11-09 16:30:39 +00:00
Neal Norwitz b45f351832 I'm not sure why this code allocates this string for the error message.
I think it would be better to always use snprintf and have the format
limit the size of the name appropriately (like %.200s).

Klocwork #340
2006-08-12 01:57:47 +00:00
Martin v. Löwis 789c09d2cd Update dangling references to the 3.2 database to
mention that this is UCD 4.1 now.
2006-08-10 19:04:00 +00:00
Neal Norwitz 37f694f21b No functional change. Add comment and assert to describe why there cannot be overflow which was reported by Klocwork. Discussed on python-dev 2006-07-27 04:04:50 +00:00
Martin v. Löwis d004fc810a Patch 1494554: Update numeric properties to Unicode 4.1. 2006-05-27 08:36:52 +00:00
Neal Norwitz 88c97845c6 No reason to export get_decomp_record, make static 2006-04-17 00:36:29 +00:00
Martin v. Löwis 3c6e4188ed Support NFD of very long strings. 2006-04-13 06:36:31 +00:00
Neal Norwitz 65c05b20e9 Get rid of warnings about using chars as subscripts
on Alpha (and possibly other platforms) by using Py_CHARMASK().
2006-04-10 02:17:47 +00:00
Martin v. Löwis c350912990 Adjust CJK Ideograph range to Unicode 4.1. 2006-03-11 12:16:23 +00:00
Martin v. Löwis 0e2f9b2dfb Fix refcounting bug. 2006-03-10 11:29:32 +00:00
Martin v. Löwis 5bd7c02298 Avoid forward-declaring the methods array.
Rename unicodedata.db* to unicodedata.ucd*
2006-03-10 11:20:04 +00:00
Martin v. Löwis 480f1bb67b Update Unicode database to Unicode 4.1. 2006-03-09 23:38:20 +00:00
Thomas Wouters 1e365b265a Remove gcc (4.0.x) warning about uninitialized value by explicitly setting
the sentinel value in the main function, rather than the helper. This
function could possibly do with an early-out if any of the helper calls ends
up with a len of 0, but I doubt it really matters (how common are malformed
hangul syllables, really?)
2006-03-01 21:58:30 +00:00
Martin v. Löwis 8b291e2d66 Patch #1213831: Fix typo in unicodedata._getcode.
Will backport to Python 2.4.
2005-09-18 08:17:56 +00:00
Hye-Shik Chang 4c560ea05b Correct URL to the official UnicodeData 3.2.0 resource. (Reported
by Darek Suchojad)
2005-06-04 07:31:48 +00:00
Hye-Shik Chang cf18a5d67b Fill docstrings for module and functions, extracted from the tex
documentation.  (Patch #1173245, Contributed by Jeremy Yallop)
2005-04-04 16:32:07 +00:00
Hye-Shik Chang e9ddfbb412 SF #989185: Drop unicode.iswide() and unicode.width() and add
unicodedata.east_asian_width().  You can still implement your own
simple width() function using it like this:
    import unicodedata

    def width(u):
        w = 0
        for c in unicodedata.normalize('NFC', u):
            # Wide ('W') and fullwidth ('F') characters take two columns.
            if unicodedata.east_asian_width(c) in ('W', 'F'):
                w += 2
            else:
                w += 1
        return w
2004-08-04 07:38:35 +00:00
Hye-Shik Chang 69dc1c8f6a Fix typo. 2004-07-15 04:30:25 +00:00
Martin v. Löwis 61e40bd897 Special case normalization of empty strings. Fixes #924361.
Backported to 2.3.
2004-04-17 19:36:48 +00:00
Martin v. Löwis d2171d2ba4 Overallocate the target buffer for normalization earlier. Fixes #834676.
Backported to 2.3.
2003-11-06 20:47:57 +00:00
Neal Norwitz e9c571f968 Fix SF bug #694816, remove comparison of unsigned value < 0 2003-02-28 03:14:37 +00:00
Martin v. Löwis 2fb661fb80 Remove C++ comment. 2002-12-07 14:56:36 +00:00
Martin v. Löwis b5c980b802 Add unidata_version. Bump generator version number. 2002-11-25 09:13:37 +00:00
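The `unidata_version` attribute added here is a string naming the Unicode database version the module was built from. A small sketch of reading it, assuming the dotted numeric form used by current CPython releases; `database_version` is a hypothetical helper:

```python
import unicodedata


def database_version():
    """Return unicodedata.unidata_version as a tuple of ints.

    Assumes the dotted numeric format (for example "4.1.0").
    """
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))
```

Comparing tuples avoids the pitfalls of comparing version strings lexically.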
Martin v. Löwis 8d93ca1383 Verify that the code in CJK UNIFIED IDEOGRAPH- actually denotes an ideograph. 2002-11-23 22:10:29 +00:00
Martin v. Löwis 677bde2dd1 Patch #626485: Support Unicode normalization. 2002-11-23 22:08:15 +00:00
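The normalization support introduced by this patch is exposed as `unicodedata.normalize(form, string)`. A brief sketch of the four standard forms, assuming Python 3 string literals:

```python
import unicodedata

composed = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# NFD splits the character into a base letter plus a combining accent.
decomposed = unicodedata.normalize("NFD", composed)

# NFC recombines the sequence into the single precomposed character.
recomposed = unicodedata.normalize("NFC", decomposed)

# The compatibility forms also replace characters such as ligatures.
ligature = "\ufb01"  # LATIN SMALL LIGATURE FI
nfkc = unicodedata.normalize("NFKC", ligature)
nfkd = unicodedata.normalize("NFKD", ligature)
```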
Martin v. Löwis ef7fe2e813 Implement names for CJK unified ideographs. Add name to KeyError output.
Verify that the lookup for an existing name succeeds.
2002-11-23 18:01:32 +00:00
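Names of CJK unified ideographs are not stored individually; they follow the pattern "CJK UNIFIED IDEOGRAPH-" plus the code point in hexadecimal. The sketch below rebuilds such a name in Python. `cjk_name` is a hypothetical helper that does not check whether the code point really is an ideograph, and it does not reflect how the C code is written.

```python
import unicodedata


def cjk_name(ch):
    """Build the derived name for a CJK unified ideograph.

    Hypothetical helper: no range check is performed on `ch`.
    """
    return "CJK UNIFIED IDEOGRAPH-%04X" % ord(ch)
```

For a real ideograph the result matches `unicodedata.name(ch)`, and `unicodedata.lookup` maps the name back to the character.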
Martin v. Löwis 2f4be4e38a Fix off-by-one error. 2002-11-23 17:11:06 +00:00
Martin v. Löwis 7d41e29c58 Patch #626548: Support Hangul syllable names. 2002-11-23 12:22:32 +00:00
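Hangul syllable names are derived arithmetically from the code point, following the algorithm in the Unicode Standard. The sketch below is a Python rendering of that algorithm, not a transcription of the C code in this patch; the function and constant names are illustrative.

```python
import unicodedata

S_BASE = 0xAC00                # first precomposed Hangul syllable
V_COUNT, T_COUNT = 21, 28      # number of vowels and of trailing consonants
N_COUNT = V_COUNT * T_COUNT    # syllables per leading consonant (588)
S_COUNT = 19 * N_COUNT         # total syllables (11172)

# Jamo short names from the Unicode Standard; empty strings are intentional.
LEADS = ["G", "GG", "N", "D", "DD", "R", "M", "B", "BB", "S",
         "SS", "", "J", "JJ", "C", "K", "T", "P", "H"]
VOWELS = ["A", "AE", "YA", "YAE", "EO", "E", "YEO", "YE", "O", "WA", "WAE",
          "OE", "YO", "U", "WEO", "WE", "WI", "YU", "EU", "YI", "I"]
TRAILS = ["", "G", "GG", "GS", "N", "NJ", "NH", "D", "L", "LG", "LM", "LB",
          "LS", "LT", "LP", "LH", "M", "B", "BS", "S", "SS", "NG", "J", "C",
          "K", "T", "P", "H"]


def hangul_syllable_name(ch):
    """Return the derived name of a precomposed Hangul syllable."""
    s_index = ord(ch) - S_BASE
    if not 0 <= s_index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable: %r" % ch)
    lead = s_index // N_COUNT
    vowel = (s_index % N_COUNT) // T_COUNT
    trail = s_index % T_COUNT
    return "HANGUL SYLLABLE " + LEADS[lead] + VOWELS[vowel] + TRAILS[trail]
```

The result can be compared against `unicodedata.name(ch)` for any syllable in the range U+AC00 to U+D7A3.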
Martin v. Löwis 9def6a3a77 Update to Unicode 3.2 database. 2002-10-18 16:11:54 +00:00
Mark Hammond 62b1ab1b31 Replace DL_IMPORT with PyMODINIT_FUNC and remove "/export:init..." link
command line for Windows builds.  This should allow MSVC to import and
build the Python MSVC6 project files without error.
2002-07-23 06:31:15 +00:00
Martin v. Löwis 14f8b4cfcb Patch #568124: Add doc string macros. 2002-06-13 20:33:02 +00:00
Andrew MacIntyre 74a3bec592 _Py prefix is verboten for static entry points 2002-06-13 11:55:14 +00:00
Fred Drake a2bd8d3816 Remove direct manipulation of the module dict. 2002-04-03 21:39:26 +00:00
Andrew MacIntyre 7bf6833e17 OS/2 EMX port changes (Modules part of patch #450267):
Modules/
    _hotshot.c
    dbmmodule.c
    fcntlmodule.c
    main.c
    pwdmodule.c
    readline.c
    selectmodule.c
    signalmodule.c
    termios.c
    timemodule.c
    unicodedata.c
2002-03-03 02:59:16 +00:00
Tim Peters 69b83b113f unicodedata_decomposition(): sprintf -> PyOS_snprintf. 2001-11-30 07:23:05 +00:00
Fred Drake 6a16ea07b8 Kill a warning on the SGI compiler.
This is part of SF patch #434992.
2001-07-19 21:11:13 +00:00
Fred Drake f585bef504 Be a bit more strict in setting up the export of the C API for this
module; do not attempt to insert the API object into the module dict
if there was an error creating it.
2001-03-03 19:41:55 +00:00
Fredrik Lundh b95896b2d2 renamed internal functions to avoid name clashes under OpenVMS
(fixes bug #132815)
2001-02-18 22:06:17 +00:00
Fredrik Lundh ae7636753e stupid typo (for some reason, this only caused problems on OpenVMS). 2001-02-18 11:41:49 +00:00
Fredrik Lundh 06d126803c Move uchhash functionality into unicodedata (after the recent
crop of changes, the files are small enough to do this).  Also
adds "name" and "lookup" functions to unicodedata.
2001-01-24 07:59:11 +00:00
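The `name` and `lookup` functions added in this commit are inverses for every character that has a name. A small sketch, assuming Python 3; `round_trips` is a hypothetical helper used only for illustration:

```python
import unicodedata


def round_trips(ch):
    """Return True if `ch` has a name that lookup() maps back to `ch`."""
    # The second argument to name() is returned for unnamed characters.
    name = unicodedata.name(ch, None)
    if name is None:
        return False
    return unicodedata.lookup(name) == ch
```

Control characters such as U+0000 have no name, so the helper returns False for them instead of raising.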
Fredrik Lundh b2dfd73bdc Unicode nits: Don't include unicodedatabase.h no more. And make sure
to build *all* tables in makeunicodedata.py.
2001-01-21 23:31:52 +00:00
Fredrik Lundh 7b7dd107b3 compress unicode decomposition tables (this saves another 55k) 2001-01-21 22:41:08 +00:00
Fredrik Lundh cfcea49218 unicode database compression, step 2:
- fixed attributions
- moved decomposition data to a separate table, in preparation
  for step 3 (which won't happen before 2.0 final, promise!)
- use relative paths in the generator script

I have a lot more stuff in the works for 2.1, but let's leave
that for another day...
2000-09-25 08:07:06 +00:00
Fredrik Lundh a4287c29b3 unicode database compression, step 1:
- use unidb compression for the unicodedata module.  on Windows,
  the new unidatabase module is 120k, down from nearly 600k.
2000-09-24 21:45:34 +00:00
Guido van Rossum 16b1ad9c7d Changing the CNRI copyright notice according to CNRI's instructions.
This is a notice without a date, which apparently is not a claim to
copyright but only advice to the reader.  IANAL. :-)
2000-08-03 16:24:25 +00:00
Thomas Wouters f3f33dcf03 Bunch of minor ANSIfications: 'void initfunc()' -> 'void initfunc(void)',
and a couple of functions that were missed in the previous batches. Not
terribly tested, but very carefully scrutinized, three times.

All these were found by the little findkrc.py that I posted to python-dev,
which means there might be more lurking. Cases such as this:

long
func(a, b)
	long a;
	long b; /* flagword */
{

and other cases where the last ; in the argument list isn't followed by a
newline and an opening curly bracket. Regexps to catch all are welcome, of
course ;)
2000-07-21 06:00:07 +00:00
Guido van Rossum 8a16054240 Marc-Andre Lemburg: The large unicode database table is broken in
pages of 4k entries each. This should fix compiler problems on some
platforms.
2000-03-31 17:26:12 +00:00
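Splitting one large table into fixed-size pages keeps each array small while lookups stay cheap. The sketch below illustrates the idea in Python with a page size of 4096 entries, taken from the "4k entries" wording above. The function names and data layout are assumptions for illustration and are not taken from the C source.

```python
PAGE_SIZE = 4096  # "4k entries" per page, as described in the commit message


def split_into_pages(table, page_size=PAGE_SIZE):
    """Break a flat sequence into consecutive pages of `page_size` entries."""
    return [table[i:i + page_size] for i in range(0, len(table), page_size)]


def paged_get(pages, index, page_size=PAGE_SIZE):
    """Fetch entry `index` from a paged table."""
    page, offset = divmod(index, page_size)
    return pages[page][offset]
```

The final page may be shorter than the others when the table length is not a multiple of the page size.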
Guido van Rossum 2a70a3a8fc Module unicodedata -- Provides access to the Unicode 3.0 data base.
Written by Marc-Andre Lemburg.
2000-03-10 23:10:21 +00:00