Commit Graph

72 Commits

Author SHA1 Message Date
Marc-André Lemburg 9ab8818c87 Rearranged mappings to value sorting order. 2004-12-10 21:54:35 +00:00
Walter Dörwald 69652035bc SF patch #998993: The UTF-8 and the UTF-16 stateful decoders now support
decoding incomplete input (when the input stream is temporarily exhausted).
codecs.StreamReader now implements buffering, which enables proper
readline support for the UTF-16 decoders. codecs.StreamReader.read()
has a new argument chars which specifies the number of characters to
return. codecs.StreamReader.readline() and codecs.StreamReader.readlines()
have a new argument keepends. Trailing "\n"s will be stripped from the lines
if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and
PyUnicode_DecodeUTF16Stateful.
2004-09-07 20:24:22 +00:00
Tim Peters d1b7827216 Whitespace normalization. 2004-08-07 06:03:09 +00:00
Marc-André Lemburg c759f070ef Added new codecs and aliases for ISO_8859-11, ISO_8859-16 and
TIS-620.

Closes SF bug #1001895: Adding missing ISO 8859 codecs, especially Thai.
2004-08-05 12:43:30 +00:00
Tim Peters c0cbc8611b Whitespace normalization. 2004-07-31 21:17:37 +00:00
Marc-André Lemburg 17b6d28c64 New codec: [ 996067 ] hp-roman8 codec 2004-07-28 15:37:54 +00:00
Marc-André Lemburg cd8a4cb3d3 Added new codec hp-roman8 submitted as patch [ 996067 ] hp-roman8 codec. 2004-07-28 15:35:29 +00:00
Hye-Shik Chang 2bb146f2f4 Bring CJKCodecs 1.1 into trunk. This completely reorganizes source
and installed layouts to make maintenance simple and easy.  And it
also adds four new codecs; big5hkscs, euc-jis-2004, shift-jis-2004
and iso2022-jp-2004.
2004-07-18 03:06:29 +00:00
Tim Peters 4e0e1b6a54 Whitespace normalization. 2004-07-07 20:54:48 +00:00
Martin v. Löwis 708b4dacf4 Convert input to a string object. Fixes #909230.
Backported 2.3.
2004-03-23 23:40:36 +00:00
Hye-Shik Chang 5c5316f111 Add a new unicode codec: ptcp154 (Kazakh) 2004-03-19 08:06:07 +00:00
Marc-André Lemburg 361d66de5d Fix wrong character mapping in koi8_u: SF bug #902501. 2004-02-23 09:00:43 +00:00
Marc-André Lemburg c83dddf7fe Let the default encodings search function lookup aliases before trying the codec import. This allows applications to install codecs which override (non-special-cased) builtin codecs. 2004-01-20 09:40:14 +00:00
Marc-André Lemburg 5c94d33077 Add some more code page aliases needed for completeness. 2004-01-20 09:38:52 +00:00
Hye-Shik Chang b619e4b36c Fix a typo: s/iso_3022/iso2022/ 2004-01-20 09:33:30 +00:00
Hye-Shik Chang 3e2a306920 Add CJK codecs support as discussed on python-dev. (SF #873597)
Several style fixes are suggested by Martin v. Loewis and
Marc-Andre Lemburg. Thanks!
2004-01-17 14:29:29 +00:00
Raymond Hettinger 0ad142aba0 Revert previous change. MAL preferred the old version. 2003-12-01 13:26:46 +00:00
Raymond Hettinger a45517065a Simplifed the code. 2003-12-01 10:41:02 +00:00
Raymond Hettinger 9edae346dd Fix typo in the comments. 2003-09-24 03:57:36 +00:00
Raymond Hettinger 9a80c5dbc4 Added codec for bz2 compression. 2003-09-23 20:21:01 +00:00
Martin v. Löwis 0d8e16c7ad Support trailing dots in DNS names. Fixes #782510. Will backport to 2.3. 2003-08-05 06:19:47 +00:00
Skip Montanaro 5d6ceb4aae more generic reference to python interpreter 2003-07-22 14:37:42 +00:00
Marc-André Lemburg 2820125935 Remove usage of re module from encodings package search function. 2003-05-16 17:07:51 +00:00
Tim Peters 0eadaac7dc Whitespace normalization. 2003-04-24 16:02:54 +00:00
Martin v. Löwis 2548c730c1 Implement IDNA (Internationalized Domain Names in Applications). 2003-04-18 10:39:54 +00:00
Martin v. Löwis 7fb697b5d2 Revert Patch #670715: iconv support. 2003-04-03 04:49:12 +00:00
Neal Norwitz 6156a2d07c Handle iconv initialization erorrs 2003-02-28 20:00:42 +00:00
Martin v. Löwis 9789aefa61 Patch #670715: Universal Unicode Codec for POSIX iconv. 2003-01-26 11:30:36 +00:00
Tim Peters 6578dc925f Whitespace normalization. 2002-12-24 18:31:27 +00:00
Neal Norwitz d8407a7031 Add new encoding for Ukrainian Cyrillic 2002-10-17 22:15:33 +00:00
Guido van Rossum c8c6065231 When looking for an alias, first look for the normalized name (which
still may contain dots), then if that doesn't exist look for the name
with dots replaced by underscores.  This is a little more forgiving.
2002-10-04 20:49:05 +00:00
Marc-André Lemburg 8dc5ff2e5a Undo the removal. Guido mentioned that the encoding name is in active
by some email headers.
2002-10-04 16:30:42 +00:00
Marc-André Lemburg 68fc27385d Remove unneeded alias. 2002-10-04 15:57:03 +00:00
Marc-André Lemburg a40ea75625 Fix doc-string. 2002-10-04 11:58:24 +00:00
Marc-André Lemburg 9d158bb66f Adapt lookup names to new more general encoding name normalization
scheme.
2002-10-04 11:51:39 +00:00
Marc-André Lemburg 7012673d67 Extending the encoding name normalization to handle more non-alphanumeric
characters.
2002-10-04 11:45:38 +00:00
Guido van Rossum 479f3d3d2a Oops, must convert hyphens to underscores in keys of aliases dict. 2002-09-26 20:08:23 +00:00
Guido van Rossum b7a88e533d Add yet another alias for ASCII found in the field. Will backport to
2.2.2.
2002-09-25 16:44:34 +00:00
Tim Peters 280488b9a3 Whitespace normalization. 2002-08-23 18:19:30 +00:00
Martin v. Löwis 8a8da798a5 Patch #505705: Remove eval in pickle and cPickle. 2002-08-14 07:46:28 +00:00
Tim Peters 469cdad822 Whitespace normalization. 2002-08-08 20:19:19 +00:00
Martin v. Löwis b9e0764d8b Revert #571603 since it is ok to import codecs that are not subdirectories
of encodings. Skip modules that don't have a getregentry function.
2002-07-29 14:05:24 +00:00
Martin v. Löwis fc4c24c142 Patch #571603: Refer to encodings package explicitly. 2002-07-28 11:31:33 +00:00
Marc-André Lemburg a83ffa89f2 Palm OS encoding from Sjoerd Mullender 2002-07-12 14:36:22 +00:00
Marc-André Lemburg 3ccb09cba3 Fix for bug #222395: UTF-16 et al. don't handle .readline().
They now raise an NotImplementedError to hint to the truth ;-)
2002-04-05 12:12:00 +00:00
Marc-André Lemburg a0af63b242 Corrected import behaviour for codecs which live outside the encodings
package.
2002-02-11 17:43:46 +00:00
Marc-André Lemburg 462004e90a Add IANA character set aliases to the encodings alias dictionary
and make alias lookup lazy.

Note that only those IANA character set aliases were added for which
we actually have codecs in the encodings package.
2002-02-10 21:36:20 +00:00
Martin v. Löwis 79d802d58c Patch #487275: Add windows-1251 charset alias. 2001-12-02 12:24:19 +00:00
Marc-André Lemburg 35b0cb09d7 Python part of the UTF-7 codec by Brian Quinlan. 2001-09-20 12:56:14 +00:00
Marc-André Lemburg c60e6f7771 Patch #435971: UTF-7 codec by Brian Quinlan. 2001-09-20 10:35:46 +00:00