Hye-Shik Chang
5c5316f111
Add a new unicode codec: ptcp154 (Kazakh)
2004-03-19 08:06:07 +00:00
Marc-André Lemburg
361d66de5d
Fix wrong character mapping in koi8_u: SF bug #902501.
2004-02-23 09:00:43 +00:00
Marc-André Lemburg
c83dddf7fe
Let the default encodings search function look up aliases before trying the codec import. This allows applications to install codecs which override (non-special-cased) builtin codecs.
2004-01-20 09:40:14 +00:00
Marc-André Lemburg
5c94d33077
Add some more code page aliases needed for completeness.
2004-01-20 09:38:52 +00:00
Hye-Shik Chang
b619e4b36c
Fix a typo: s/iso_3022/iso2022/
2004-01-20 09:33:30 +00:00
Hye-Shik Chang
3e2a306920
Add CJK codecs support as discussed on python-dev. (SF #873597)
Several style fixes are suggested by Martin v. Loewis and
Marc-Andre Lemburg. Thanks!
2004-01-17 14:29:29 +00:00
Raymond Hettinger
0ad142aba0
Revert previous change. MAL preferred the old version.
2003-12-01 13:26:46 +00:00
Raymond Hettinger
a45517065a
Simplified the code.
2003-12-01 10:41:02 +00:00
Raymond Hettinger
9edae346dd
Fix typo in the comments.
2003-09-24 03:57:36 +00:00
Raymond Hettinger
9a80c5dbc4
Added codec for bz2 compression.
2003-09-23 20:21:01 +00:00
Martin v. Löwis
0d8e16c7ad
Support trailing dots in DNS names. Fixes #782510. Will backport to 2.3.
2003-08-05 06:19:47 +00:00
Skip Montanaro
5d6ceb4aae
more generic reference to python interpreter
2003-07-22 14:37:42 +00:00
Marc-André Lemburg
2820125935
Remove usage of re module from encodings package search function.
2003-05-16 17:07:51 +00:00
Tim Peters
0eadaac7dc
Whitespace normalization.
2003-04-24 16:02:54 +00:00
Martin v. Löwis
2548c730c1
Implement IDNA (Internationalized Domain Names in Applications).
2003-04-18 10:39:54 +00:00
Martin v. Löwis
7fb697b5d2
Revert Patch #670715: iconv support.
2003-04-03 04:49:12 +00:00
Neal Norwitz
6156a2d07c
Handle iconv initialization errors
2003-02-28 20:00:42 +00:00
Martin v. Löwis
9789aefa61
Patch #670715: Universal Unicode Codec for POSIX iconv.
2003-01-26 11:30:36 +00:00
Tim Peters
6578dc925f
Whitespace normalization.
2002-12-24 18:31:27 +00:00
Neal Norwitz
d8407a7031
Add new encoding for Ukrainian Cyrillic
2002-10-17 22:15:33 +00:00
Guido van Rossum
c8c6065231
When looking for an alias, first look for the normalized name (which
still may contain dots), then if that doesn't exist look for the name
with dots replaced by underscores. This is a little more forgiving.
2002-10-04 20:49:05 +00:00
Marc-André Lemburg
8dc5ff2e5a
Undo the removal. Guido mentioned that the encoding name is in active
...
by some email headers.
2002-10-04 16:30:42 +00:00
Marc-André Lemburg
68fc27385d
Remove unneeded alias.
2002-10-04 15:57:03 +00:00
Marc-André Lemburg
a40ea75625
Fix doc-string.
2002-10-04 11:58:24 +00:00
Marc-André Lemburg
9d158bb66f
Adapt lookup names to new more general encoding name normalization
scheme.
2002-10-04 11:51:39 +00:00
Marc-André Lemburg
7012673d67
Extending the encoding name normalization to handle more non-alphanumeric
characters.
2002-10-04 11:45:38 +00:00
Guido van Rossum
479f3d3d2a
Oops, must convert hyphens to underscores in keys of aliases dict.
2002-09-26 20:08:23 +00:00
Guido van Rossum
b7a88e533d
Add yet another alias for ASCII found in the field. Will backport to
2.2.2.
2002-09-25 16:44:34 +00:00
Tim Peters
280488b9a3
Whitespace normalization.
2002-08-23 18:19:30 +00:00
Martin v. Löwis
8a8da798a5
Patch #505705: Remove eval in pickle and cPickle.
2002-08-14 07:46:28 +00:00
Tim Peters
469cdad822
Whitespace normalization.
2002-08-08 20:19:19 +00:00
Martin v. Löwis
b9e0764d8b
Revert #571603 since it is ok to import codecs that are not subdirectories
of encodings. Skip modules that don't have a getregentry function.
2002-07-29 14:05:24 +00:00
Martin v. Löwis
fc4c24c142
Patch #571603: Refer to encodings package explicitly.
2002-07-28 11:31:33 +00:00
Marc-André Lemburg
a83ffa89f2
Palm OS encoding from Sjoerd Mullender
2002-07-12 14:36:22 +00:00
Marc-André Lemburg
3ccb09cba3
Fix for bug #222395: UTF-16 et al. don't handle .readline().
They now raise a NotImplementedError to hint to the truth ;-)
2002-04-05 12:12:00 +00:00
Marc-André Lemburg
a0af63b242
Corrected import behaviour for codecs which live outside the encodings
package.
2002-02-11 17:43:46 +00:00
Marc-André Lemburg
462004e90a
Add IANA character set aliases to the encodings alias dictionary
and make alias lookup lazy.
Note that only those IANA character set aliases were added for which
we actually have codecs in the encodings package.
2002-02-10 21:36:20 +00:00
Martin v. Löwis
79d802d58c
Patch #487275: Add windows-1251 charset alias.
2001-12-02 12:24:19 +00:00
Marc-André Lemburg
35b0cb09d7
Python part of the UTF-7 codec by Brian Quinlan.
2001-09-20 12:56:14 +00:00
Marc-André Lemburg
c60e6f7771
Patch #435971: UTF-7 codec by Brian Quinlan.
2001-09-20 10:35:46 +00:00
Marc-André Lemburg
26e3b681b2
Patch #462635 by Andrew Kuchling correcting bugs in the new
codecs -- the self argument does matter for Python functions (it
does not for C functions which most other codecs use).
2001-09-20 10:33:38 +00:00
Marc-André Lemburg
816a1b75b7
Fixed search function error reporting in the encodings package
__init__.py module to raise errors which can be caught as LookupErrors
as well as SystemErrors.
Modified the error messages to include more information about the
failing module.
2001-09-19 11:52:07 +00:00
Andrew M. Kuchling
fd6608bcea
Fix typo (PyChecker)
2001-08-13 13:48:55 +00:00
Martin v. Löwis
9b75dca192
Expose nl_langinfo through locale where available.
2001-08-10 13:58:50 +00:00
Marc-André Lemburg
92b550cdd8
This patch by Martin v. Loewis changes the UTF-16 codec to only
write a BOM at the start of the stream and also to only read it as
BOM at the start of a stream.
Subsequent reading/writing of BOMs will read/write the BOM as ZWNBSP
character. This is in sync with the Unicode specifications.
Note that UTF-16 files will now *have* to start with a BOM mark
in order to be readable by the codec.
2001-06-19 20:07:51 +00:00
Martin v. Löwis
13b8bc5478
Patch #429957: Add support for cp1140, which is identical to cp037,
with the addition of the euro character.
Also added a few EBCDIC aliases.
2001-06-07 19:39:25 +00:00
Mark Hammond
194bfb2805
Add some useful Windows encodings - patch #423221.
2001-06-04 02:31:23 +00:00
Marc-André Lemburg
716cf91839
Moved the encoding map building logic from the individual mapping
codec files to codecs.py and added logic so that multi mappings
in the decoding maps now result in mappings to None (undefined mapping)
in the encoding maps.
2001-05-16 09:41:45 +00:00
Guido van Rossum
acfdf156aa
Add quoted-printable codec
2001-05-15 15:34:07 +00:00
Marc-André Lemburg
2d9204199f
This patch changes the way the string .encode() method works slightly
and introduces a new method .decode().
The major change is that strg.encode() will no longer try to convert
Unicode returns from the codec into a string, but instead pass along
the Unicode object as-is. The same is now true for all other codec
return types. The underlying C APIs were changed accordingly.
Note that even though this does have the potential of breaking
existing code, the chances are low since conversion from Unicode
previously took place using the default encoding which is normally
set to ASCII, rendering this auto-conversion mechanism useless for
most Unicode encodings.
The good news is that you can now use .encode() and .decode() with
much greater ease and that the door was opened for better accessibility
of the builtin codecs.
As demonstration of the new feature, the patch includes a few new
codecs which allow string to string encoding and decoding (rot13,
hex, zip, uu, base64).
Written by Marc-Andre Lemburg. Copyright assigned to the PSF.
2001-05-15 12:00:02 +00:00