cpython

Commit Graph

Author	SHA1	Message	Date
Amaury Forgeot d'Arc	70dda76cde	#1616979: Add the cp720 (Arabic DOS) encoding. Since there is no official mapping file from unicode.org, the codec file is generated on Windows with the new genwincodec.py script.	2009-07-13 20:01:11 +00:00
Christian Heimes	082c9b0267	Fixed bug #1915: Python compiles with --enable-unicode=no again. However several extension methods and modules do not work without unicode support.	2008-01-23 14:20:50 +00:00
Amaury Forgeot d'Arc	5087980c1e	The incremental decoder for utf-7 must preserve its state between calls. Solves issue1460. Might not be a backport candidate: a new API function was added, and some code may rely on details in utf-7.py.	2007-11-20 23:31:27 +00:00
Walter Dörwald	183744d6b9	Fix for #1444: utf_8_sig.StreamReader was (indirectly through decode()) calling codecs.utf_8_decode() with final==True, which failed with incomplete byte sequences. Fix and test by James G. Sack.	2007-11-19 12:41:10 +00:00
Georg Brandl	4cdceac760	Fix #883466: don't allow Unicode as arguments to quopri and uu codecs.	2007-09-03 07:16:46 +00:00
Walter Dörwald	6e39080649	Backport r57105 and r57145 from the py3k branch: UTF-32 codecs.	2007-08-17 16:41:28 +00:00
Walter Dörwald	4234827e99	Fix utf-8-sig incremental decoder, which didn't recognise a BOM when the first chunk fed to the decoder started with a BOM, but was longer than 3 bytes.	2007-04-12 10:35:00 +00:00
Brett Cannon	fa6521b4fd	Make the __import__ call in encodings.__init__ absolute with a level 0 call.	2007-02-16 19:33:01 +00:00
Brett Cannon	971a012ce1	Update the encoding package's search function to use absolute imports when calling __import__. This helps make the expected search locations for encoding modules be more explicit. One could use an explicit value for __path__ when making the call to __import__ to force the exact location searched for encodings. This would give the most strict search path possible if one is worried about malicious code being imported. The unfortunate side-effect of that is that if __path__ was modified on 'encodings' on purpose in a safe way it would not be picked up in future __import__ calls.	2007-02-15 22:54:39 +00:00
Georg Brandl	4ba9e5bdc7	Patch #1634778: add missing encoding aliases for iso8859_15 and iso8859_16.	2007-01-27 17:59:42 +00:00
Walter Dörwald	39b8b6afb5	Change decode() so that it works with a buffer (i.e. unicode(..., 'utf-8-sig')) SF bug #1601501.	2006-11-23 05:03:56 +00:00
Georg Brandl	2c9838e30f	Bug #1586613: fix zlib and bz2 codecs' incremental en/decoders.	2006-10-29 14:39:09 +00:00
Georg Brandl	a92979a1db	Bug #1446043: correctly raise a LookupError if an encoding name given to encodings.search_function() contains a dot.	2006-09-30 11:22:28 +00:00
Neal Norwitz	391e5f4c9f	importing types is not necessary if we use isinstance	2006-08-25 01:52:49 +00:00
Martin v. Löwis	961b91bd3c	Correction of patch #1455898: In the mbcs decoder, set final=False for stream decoder, but final=True for the decode function.	2006-08-02 13:53:55 +00:00
Martin v. Löwis	0eac11826a	Make import/lookup of mbcs fail on non-Windows systems.	2006-06-15 06:45:05 +00:00
Martin v. Löwis	d825143be1	Patch #1455898: Incremental mode for "mbcs" codec.	2006-06-14 05:21:04 +00:00
Walter Dörwald	c6f5b3ad6c	errors is an attribute in the incremental decoder not an argument.	2006-06-13 12:04:43 +00:00
Walter Dörwald	6b6e2bb8b1	Fix passing errors to the encoder and decoder functions.	2006-06-13 12:02:12 +00:00
Tim Peters	c7d14452a4	Whitespace normalization.	2006-06-04 23:43:53 +00:00
Martin v. Löwis	3f767795f6	Patch #1359618: Speed-up charmap encoder.	2006-06-04 19:36:28 +00:00
Walter Dörwald	78a0be6ab3	Add a BufferedIncrementalEncoder class that can be used for implementing an incremental encoder that must retain part of the data between calls to the encode() method. Fix the incremental encoder and decoder for the IDNA encoding. This closes SF patch #1453235.	2006-04-14 18:25:39 +00:00
Walter Dörwald	a40cf31de6	Make error message less misleading for u"a..b".encode("idna").	2006-04-14 17:00:36 +00:00
Walter Dörwald	6493699c0d	Make raise statements PEP 8 compatible.	2006-04-14 15:22:27 +00:00
Walter Dörwald	a8da934069	Whitespace.	2006-03-27 09:02:04 +00:00
Hye-Shik Chang	e2ac4abd01	Patch #1443155: Add the incremental codecs support for CJK codecs. (reviewed by Walter Dörwald)	2006-03-26 02:34:59 +00:00
Guido van Rossum	f8480a7856	Instead of relative imports, use (implicitly) absolute ones.	2006-03-15 23:08:13 +00:00
Tim Peters	f99b8162a2	Whitespace normalization.	2006-03-15 18:08:37 +00:00
Walter Dörwald	13ed60b504	Fix typo.	2006-03-15 13:36:50 +00:00
Walter Dörwald	abb02e5994	Patch #1436130: codecs.lookup() now returns a CodecInfo object (a subclass of tuple) that provides incremental decoders and encoders (a way to use stateful codecs without the stream API). Functions codecs.getincrementaldecoder() and codecs.getincrementalencoder() have been added.	2006-03-15 11:35:15 +00:00
Guido van Rossum	87de069e4e	Use relative imports in a few places where I noticed the need. (Ideally, all packages in Python 2.5 will use the relative import syntax for all their relative import needs.)	2006-03-15 04:33:54 +00:00
Martin v. Löwis	5bd7c02298	Avoid forward-declaring the methods array. Rename unicodedata.db* to unicodedata.ucd*	2006-03-10 11:20:04 +00:00
Martin v. Löwis	480f1bb67b	Update Unicode database to Unicode 4.1.	2006-03-09 23:38:20 +00:00
Marc-André Lemburg	fe4b34cc4b	Fix the encodings package codec search function to only search inside its own package. Fixes problem reported in patch #1433198. Add codec search function for codec test codec.	2006-02-19 15:22:22 +00:00
Martin v. Löwis	412ed3b8a7	Patch #1177307: UTF-8-Sig codec.	2006-01-08 10:45:39 +00:00
Tim Peters	536cf99536	Whitespace normalization.	2005-12-25 23:18:31 +00:00
Marc-André Lemburg	d9cf593b49	Cosmetic change: make all hex literals use upper case hex so that they look more like the Unicode Consortium files. Add ending new-line to all source files.	2005-10-24 12:14:59 +00:00
Marc-André Lemburg	3c72ded23d	Removed the decoding_map from the codecs where this is possible. Replaced the tis_620, cp1140 and koi8_u codecs with new ones based on custom mapping files.	2005-10-24 12:07:49 +00:00
Marc-André Lemburg	0f00ba8bd8	Replace the old EBCDIC codecs with new ones using the decoding table.	2005-10-21 14:35:35 +00:00
Marc-André Lemburg	7797be7b3b	Alias iso8859_1 to latin_1 which is the same encoding, but has a much faster codec implementation.	2005-10-21 14:02:28 +00:00
Marc-André Lemburg	75c9e8392e	Add a few more Mac OS encodings. The mapping tables for these are available at ftp.unicode.org.	2005-10-21 13:58:32 +00:00
Marc-André Lemburg	a1129f4b9b	Replace the old charmap codecs with new ones generated from the current mapping tables available at ftp.unicode.org. These new codecs include and use character decoding tables which speeds up decoding by a few factors.	2005-10-21 13:49:12 +00:00
Walter Dörwald	007f8dfde2	Bug #1245379: Add "unicode-1-1-utf-7" as an alias for "utf-7" as specified by RFC 1642.	2005-10-09 19:42:27 +00:00
Neal Norwitz	4ce69a5b06	No need to import exceptions, they are builtins	2005-09-01 00:45:28 +00:00
Martin v. Löwis	8b59514e57	Make IDNA return an empty string when the input is empty. Fixes #1163178. Will backport to 2.4.	2005-08-25 11:03:38 +00:00
Walter Dörwald	729c31f5c3	Reset internal buffers when seek() is called. This fixes SF bug #1156259.	2005-03-14 19:06:30 +00:00
Walter Dörwald	e1a0391b49	Fix wrong variable name.	2004-12-29 13:11:10 +00:00
Marc-André Lemburg	9ab8818c87	Rearranged mappings to value sorting order.	2004-12-10 21:54:35 +00:00
Walter Dörwald	69652035bc	SF patch #998993: The UTF-8 and the UTF-16 stateful decoders now support decoding incomplete input (when the input stream is temporarily exhausted). codecs.StreamReader now implements buffering, which enables proper readline support for the UTF-16 decoders. codecs.StreamReader.read() has a new argument chars which specifies the number of characters to return. codecs.StreamReader.readline() and codecs.StreamReader.readlines() have a new argument keepends. Trailing "\n"s will be stripped from the lines if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and PyUnicode_DecodeUTF16Stateful.	2004-09-07 20:24:22 +00:00
Tim Peters	d1b7827216	Whitespace normalization.	2004-08-07 06:03:09 +00:00

119 Commits