cpython/Modules/cjkcodecs
Georg Brandl 3dbca81c9b Merged revisions 65012,65035,65037-65040,65048,65057,65077,65091-65095,65097-65099,65127-65128,65131,65133-65136,65139,65149-65151,65155,65158-65159,65176-65178,65183-65184,65187-65190,65192,65194 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r65012 | jesse.noller | 2008-07-16 15:24:06 +0200 (Wed, 16 Jul 2008) | 2 lines

  Apply patch for issue 3090: ARCHFLAGS parsing incorrect
........
  r65035 | georg.brandl | 2008-07-16 23:19:28 +0200 (Wed, 16 Jul 2008) | 2 lines

  #3045: fix pydoc behavior for TEMP path with spaces.
........
  r65037 | georg.brandl | 2008-07-16 23:31:41 +0200 (Wed, 16 Jul 2008) | 2 lines

  #1608818: errno can get set by every call to readdir().
........
  r65038 | georg.brandl | 2008-07-17 00:04:20 +0200 (Thu, 17 Jul 2008) | 2 lines

  #3305: self->stream can be NULL.
........
  r65039 | georg.brandl | 2008-07-17 00:09:17 +0200 (Thu, 17 Jul 2008) | 2 lines

  #3345: fix docstring.
........
  r65040 | georg.brandl | 2008-07-17 00:33:18 +0200 (Thu, 17 Jul 2008) | 2 lines

  #3312: fix two sqlite3 crashes.
........
  r65048 | georg.brandl | 2008-07-17 01:35:54 +0200 (Thu, 17 Jul 2008) | 2 lines

  #3388: add a paragraph about using "with" for file objects.
........
  r65057 | gregory.p.smith | 2008-07-17 05:13:05 +0200 (Thu, 17 Jul 2008) | 2 lines

  news note for r63052
........
  r65077 | jesse.noller | 2008-07-17 23:01:05 +0200 (Thu, 17 Jul 2008) | 3 lines

  Fix issue 3395, update _debugInfo to be _debug_info
........
  r65091 | ronald.oussoren | 2008-07-18 07:48:03 +0200 (Fri, 18 Jul 2008) | 2 lines

  Last bit of a fix for issue3381 (addon for my patch in r65061)
........
  r65092 | vinay.sajip | 2008-07-18 10:59:06 +0200 (Fri, 18 Jul 2008) | 1 line

  Issue #3389: Allow resolving dotted names for handlers in logging configuration files. Thanks to Philip Jenvey for the patch.
........
  r65093 | vinay.sajip | 2008-07-18 11:00:00 +0200 (Fri, 18 Jul 2008) | 1 line

  Issue #3389: Allow resolving dotted names for handlers in logging configuration files. Thanks to Philip Jenvey for the patch.
........
  r65094 | vinay.sajip | 2008-07-18 11:00:35 +0200 (Fri, 18 Jul 2008) | 1 line

  Issue #3389: Allow resolving dotted names for handlers in logging configuration files. Thanks to Philip Jenvey for the patch.
........
  r65095 | vinay.sajip | 2008-07-18 11:01:10 +0200 (Fri, 18 Jul 2008) | 1 line

  Issue #3389: Allow resolving dotted names for handlers in logging configuration files. Thanks to Philip Jenvey for the patch.
........
  r65097 | georg.brandl | 2008-07-18 12:20:59 +0200 (Fri, 18 Jul 2008) | 2 lines

  Remove duplicate entry in __all__.
........
  r65098 | georg.brandl | 2008-07-18 12:29:30 +0200 (Fri, 18 Jul 2008) | 2 lines

  Correct attribute name.
........
  r65099 | georg.brandl | 2008-07-18 13:15:06 +0200 (Fri, 18 Jul 2008) | 3 lines

  Document the different meaning of precision for {:f} and {:g}.
  Also document how inf and nan are formatted. #3404.
........
  r65127 | raymond.hettinger | 2008-07-19 02:42:03 +0200 (Sat, 19 Jul 2008) | 1 line

  Improve accuracy of gamma test function
........
  r65128 | raymond.hettinger | 2008-07-19 02:43:00 +0200 (Sat, 19 Jul 2008) | 1 line

  Add recipe to the itertools docs.
........
  r65131 | georg.brandl | 2008-07-19 12:08:55 +0200 (Sat, 19 Jul 2008) | 2 lines

  #3378: in case of no memory, don't leak even more memory. :)
........
  r65133 | georg.brandl | 2008-07-19 14:39:10 +0200 (Sat, 19 Jul 2008) | 3 lines

  #3302: fix segfaults when passing None for arguments that can't
  be NULL for the C functions.
........
  r65134 | georg.brandl | 2008-07-19 14:46:12 +0200 (Sat, 19 Jul 2008) | 2 lines

  #3303: fix crash with invalid Py_DECREF in strcoll().
........
  r65135 | georg.brandl | 2008-07-19 15:00:22 +0200 (Sat, 19 Jul 2008) | 3 lines

  #3319: don't raise ZeroDivisionError if number of rounds is so
  low that benchtime is zero.
........
  r65136 | georg.brandl | 2008-07-19 15:09:42 +0200 (Sat, 19 Jul 2008) | 3 lines

  #3323: mention that if inheriting from a class without __slots__,
  the subclass will have a __dict__ available too.
........
  r65139 | georg.brandl | 2008-07-19 15:48:44 +0200 (Sat, 19 Jul 2008) | 2 lines

  Add ordering info for findall and finditer.
........
  r65149 | raymond.hettinger | 2008-07-20 01:21:57 +0200 (Sun, 20 Jul 2008) | 1 line

  Fix compress() recipe in docs to use itertools.
........
  r65150 | raymond.hettinger | 2008-07-20 01:58:47 +0200 (Sun, 20 Jul 2008) | 1 line

  Clean-up itertools docs and recipes.
........
  r65151 | gregory.p.smith | 2008-07-20 02:22:08 +0200 (Sun, 20 Jul 2008) | 9 lines

  fix issue3120 - don't truncate handles on 64-bit Windows.

  This is still messy, realistically PC/_subprocess.c should never cast pointers
  to python numbers and back at all.

  I don't have a 64-bit windows build environment because microsoft apparently
  thinks that should cost money.  Time to watch the buildbots.  It builds and
  passes tests on 32-bit windows.
........
  r65155 | georg.brandl | 2008-07-20 13:50:29 +0200 (Sun, 20 Jul 2008) | 2 lines

  #926501: add info where to put the docstring.
........
  r65158 | neal.norwitz | 2008-07-20 21:35:23 +0200 (Sun, 20 Jul 2008) | 1 line

  Fix a couple of names in error messages that were wrong
........
  r65159 | neal.norwitz | 2008-07-20 22:39:36 +0200 (Sun, 20 Jul 2008) | 1 line

  Fix misspeeld method name (negative)
........
  r65176 | amaury.forgeotdarc | 2008-07-21 23:36:24 +0200 (Mon, 21 Jul 2008) | 4 lines

  Increment version number in NEWS file, and move items that were added after 2.6b2.

  (I thought there was a script to automate this kind of updates)
........
  r65177 | amaury.forgeotdarc | 2008-07-22 00:00:38 +0200 (Tue, 22 Jul 2008) | 5 lines

  Issue2378: pdb would delete free variables when stepping into a class statement.

  The problem was introduced by r53954, the correction is to restore the symmetry between
  PyFrame_FastToLocals and PyFrame_LocalsToFast
........
  r65178 | benjamin.peterson | 2008-07-22 00:05:34 +0200 (Tue, 22 Jul 2008) | 1 line

  don't use assert statement
........
  r65183 | ronald.oussoren | 2008-07-22 09:06:00 +0200 (Tue, 22 Jul 2008) | 2 lines

  Fix buglet in fix for issue3381
........
  r65184 | ronald.oussoren | 2008-07-22 09:06:33 +0200 (Tue, 22 Jul 2008) | 2 lines

  Fix build issue on OSX 10.4, somehow this wasn't committed before.
........
  r65187 | raymond.hettinger | 2008-07-22 20:54:02 +0200 (Tue, 22 Jul 2008) | 1 line

  Remove out-of-date section on Exact/Inexact.
........
  r65188 | raymond.hettinger | 2008-07-22 21:00:47 +0200 (Tue, 22 Jul 2008) | 1 line

  Tuples now have both count() and index().
........
  r65189 | raymond.hettinger | 2008-07-22 21:03:05 +0200 (Tue, 22 Jul 2008) | 1 line

  Fix credits for math.sum()
........
  r65190 | raymond.hettinger | 2008-07-22 21:18:50 +0200 (Tue, 22 Jul 2008) | 1 line

  One more attribution.
........
  r65192 | benjamin.peterson | 2008-07-23 01:44:37 +0200 (Wed, 23 Jul 2008) | 1 line

  remove unneeded import
........
  r65194 | benjamin.peterson | 2008-07-23 15:25:06 +0200 (Wed, 23 Jul 2008) | 1 line

  use isinstance
........
2008-07-23 16:10:53 +00:00
..
README - Modernize code to use Py_ssize_t more intensively. 2006-03-04 16:08:19 +00:00
_codecs_cn.c Merged revisions 56753-56781 via svnmerge from 2007-08-06 23:33:07 +00:00
_codecs_hk.c Merged revisions 60481,60485,60489-60492,60494-60496,60498-60499,60501-60503,60505-60506,60508-60509,60523-60524,60532,60543,60545,60547-60548,60552,60554,60556-60559,60561-60562,60569,60571-60572,60574,60576-60583,60585-60586,60589,60591,60594-60595,60597-60598,60600-60601,60606-60612,60615,60617-60678 via svnmerge from 2008-02-09 02:18:51 +00:00
_codecs_iso2022.c Merged revisions 59488-59511 via svnmerge from 2007-12-15 01:27:15 +00:00
_codecs_jp.c Merge the rest of the trunk. 2006-06-08 15:35:45 +00:00
_codecs_kr.c Merged revisions 57152-57220 via svnmerge from 2007-08-20 19:06:03 +00:00
_codecs_tw.c - Modernize code to use Py_ssize_t more intensively. 2006-03-04 16:08:19 +00:00
alg_jisx0201.h - Modernize code to use Py_ssize_t more intensively. 2006-03-04 16:08:19 +00:00
cjkcodecs.h Implement PEP 3121: new module initialization and finalization API. 2008-06-11 05:26:20 +00:00
emu_jisx0213_2000.h - Modernize code to use Py_ssize_t more intensively. 2006-03-04 16:08:19 +00:00
mappings_cn.h - Modernize code to use Py_ssize_t more intensively. 2006-03-04 16:08:19 +00:00
mappings_hk.h Merged revisions 60481,60485,60489-60492,60494-60496,60498-60499,60501-60503,60505-60506,60508-60509,60523-60524,60532,60543,60545,60547-60548,60552,60554,60556-60559,60561-60562,60569,60571-60572,60574,60576-60583,60585-60586,60589,60591,60594-60595,60597-60598,60600-60601,60606-60612,60615,60617-60678 via svnmerge from 2008-02-09 02:18:51 +00:00
mappings_jisx0213_pair.h - Modernize code to use Py_ssize_t more intensively. 2006-03-04 16:08:19 +00:00
mappings_jp.h - Modernize code to use Py_ssize_t more intensively. 2006-03-04 16:08:19 +00:00
mappings_kr.h - Modernize code to use Py_ssize_t more intensively. 2006-03-04 16:08:19 +00:00
mappings_tw.h - Modernize code to use Py_ssize_t more intensively. 2006-03-04 16:08:19 +00:00
multibytecodec.c Merged revisions 65012,65035,65037-65040,65048,65057,65077,65091-65095,65097-65099,65127-65128,65131,65133-65136,65139,65149-65151,65155,65158-65159,65176-65178,65183-65184,65187-65190,65192,65194 via svnmerge from 2008-07-23 16:10:53 +00:00
multibytecodec.h Merge p3yk branch with the trunk up to revision 45595. This breaks a fair 2006-04-21 10:40:58 +00:00

README

To generate or modify mapping headers
-------------------------------------
Mapping headers are imported from CJKCodecs as pre-generated form.
If you need to tweak or add something on it, please look at tools/
subdirectory of CJKCodecs' distribution.



Notes on implmentation characteristics of each codecs
-----------------------------------------------------

1) Big5 codec

  The big5 codec maps the following characters as cp950 does rather
  than conforming Unicode.org's that maps to 0xFFFD.

    BIG5        Unicode     Description

    0xA15A      0x2574      SPACING UNDERSCORE
    0xA1C3      0xFFE3      SPACING HEAVY OVERSCORE
    0xA1C5      0x02CD      SPACING HEAVY UNDERSCORE
    0xA1FE      0xFF0F      LT DIAG UP RIGHT TO LOW LEFT
    0xA240      0xFF3C      LT DIAG UP LEFT TO LOW RIGHT
    0xA2CC      0x5341      HANGZHOU NUMERAL TEN
    0xA2CE      0x5345      HANGZHOU NUMERAL THIRTY

  Because unicode 0x5341, 0x5345, 0xFF0F, 0xFF3C is mapped to another
  big5 codes already, a roundtrip compatibility is not guaranteed for
  them.


2) cp932 codec

  To conform to Windows's real mapping, cp932 codec maps the following
  codepoints in addition of the official cp932 mapping.

    CP932     Unicode     Description

    0x80      0x80        UNDEFINED
    0xA0      0xF8F0      UNDEFINED
    0xFD      0xF8F1      UNDEFINED
    0xFE      0xF8F2      UNDEFINED
    0xFF      0xF8F3      UNDEFINED


3) euc-jisx0213 codec

  The euc-jisx0213 codec maps JIS X 0213 Plane 1 code 0x2140 into
  unicode U+FF3C instead of U+005C as on unicode.org's mapping.
  Because euc-jisx0213 has REVERSE SOLIDUS on 0x5c already and A140
  is shown as a full width character, mapping to U+FF3C can make
  more sense.

  The euc-jisx0213 codec is enabled to decode JIS X 0212 codes on
  codeset 2. Because JIS X 0212 and JIS X 0213 Plane 2 don't have
  overlapped by each other, it doesn't bother standard conformations
  (and JIS X 0213 Plane 2 is intended to use so.) On encoding
  sessions, the codec will try to encode kanji characters in this
  order:

    JIS X 0213 Plane 1 -> JIS X 0213 Plane 2 -> JIS X 0212


4) euc-jp codec

  The euc-jp codec is a compatibility instance on these points:
   - U+FF3C FULLWIDTH REVERSE SOLIDUS is mapped to EUC-JP A1C0 (vice versa)
   - U+00A5 YEN SIGN is mapped to EUC-JP 0x5c. (one way)
   - U+203E OVERLINE is mapped to EUC-JP 0x7e. (one way)


5) shift-jis codec

  The shift-jis codec is mapping 0x20-0x7e area to U+20-U+7E directly
  instead of using JIS X 0201 for compatibility. The differences are:
   - U+005C REVERSE SOLIDUS is mapped to SHIFT-JIS 0x5c.
   - U+007E TILDE is mapped to SHIFT-JIS 0x7e.
   - U+FF3C FULL-WIDTH REVERSE SOLIDUS is mapped to SHIFT-JIS 815f.