Commit Graph

9 Commits

Author SHA1 Message Date
Victor Stinner 082a65ab1f Merged revisions 83198 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/release27-maint

........
  r83198 | victor.stinner | 2010-07-28 03:39:45 +0200 (mer., 28 juil. 2010) | 3 lines

  Issue #6213: Implement getstate() and setstate() methods of utf-8-sig and
  utf-16 incremental encoders.
........
2010-07-28 01:55:43 +00:00
Victor Stinner 09c0f24f39 Merged revisions 81471-81472 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81471 | victor.stinner | 2010-05-22 15:37:56 +0200 (sam., 22 mai 2010) | 7 lines

  Issue #6268: More bugfixes about BOM, UTF-16 and UTF-32

   * Fix seek() method of codecs.open(), don't write the BOM twice after seek(0)
   * Fix reset() method of codecs, UTF-16, UTF-32 and StreamWriter classes
   * test_codecs: use "w+" mode instead of "wt+". "t" mode is not supported by
     Solaris or Windows, but does it really exist? I found it the in the issue.
........
  r81472 | victor.stinner | 2010-05-22 15:44:25 +0200 (sam., 22 mai 2010) | 4 lines

  Fix my last commit (r81471) about codecs

  Rememder: don't touch the code just before a commit
........
2010-05-22 16:52:13 +00:00
Walter Dörwald abb02e5994 Patch #1436130: codecs.lookup() now returns a CodecInfo object (a subclass
of tuple) that provides incremental decoders and encoders (a way to use
stateful codecs without the stream API). Functions
codecs.getincrementaldecoder() and codecs.getincrementalencoder() have
been added.
2006-03-15 11:35:15 +00:00
Walter Dörwald 729c31f5c3 Reset internal buffers when seek() is called. This fixes SF bug #1156259. 2005-03-14 19:06:30 +00:00
Walter Dörwald 69652035bc SF patch #998993: The UTF-8 and the UTF-16 stateful decoders now support
decoding incomplete input (when the input stream is temporarily exhausted).
codecs.StreamReader now implements buffering, which enables proper
readline support for the UTF-16 decoders. codecs.StreamReader.read()
has a new argument chars which specifies the number of characters to
return. codecs.StreamReader.readline() and codecs.StreamReader.readlines()
have a new argument keepends. Trailing "\n"s will be stripped from the lines
if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and
PyUnicode_DecodeUTF16Stateful.
2004-09-07 20:24:22 +00:00
Tim Peters 469cdad822 Whitespace normalization. 2002-08-08 20:19:19 +00:00
Marc-André Lemburg 3ccb09cba3 Fix for bug #222395: UTF-16 et al. don't handle .readline().
They now raise an NotImplementedError to hint to the truth ;-)
2002-04-05 12:12:00 +00:00
Marc-André Lemburg 92b550cdd8 This patch by Martin v. Loewis changes the UTF-16 codec to only
write a BOM at the start of the stream and also to only read it as
BOM at the start of a stream.

Subsequent reading/writing of BOMs will read/write the BOM as ZWNBSP
character. This is in sync with the Unicode specifications.

Note that UTF-16 files will now *have* to start with a BOM mark
in order to be readable by the codec.
2001-06-19 20:07:51 +00:00
Guido van Rossum 0229bf6001 Marc-Andre Lemburg: Unicode encodings. 2000-03-10 23:17:24 +00:00