svn+ssh://pythondev@svn.python.org/python/trunk
........
r81471 | victor.stinner | 2010-05-22 15:37:56 +0200 (sam., 22 mai 2010) | 7 lines
Issue #6268: More bugfixes about BOM, UTF-16 and UTF-32
* Fix seek() method of codecs.open(), don't write the BOM twice after seek(0)
* Fix reset() method of codecs, UTF-16, UTF-32 and StreamWriter classes
* test_codecs: use "w+" mode instead of "wt+". "t" mode is not supported by
Solaris or Windows, but does it really exist? I found it the in the issue.
........
r81472 | victor.stinner | 2010-05-22 15:44:25 +0200 (sam., 22 mai 2010) | 4 lines
Fix my last commit (r81471) about codecs
Rememder: don't touch the code just before a commit
........
of tuple) that provides incremental decoders and encoders (a way to use
stateful codecs without the stream API). Functions
codecs.getincrementaldecoder() and codecs.getincrementalencoder() have
been added.
decoding incomplete input (when the input stream is temporarily exhausted).
codecs.StreamReader now implements buffering, which enables proper
readline support for the UTF-16 decoders. codecs.StreamReader.read()
has a new argument chars which specifies the number of characters to
return. codecs.StreamReader.readline() and codecs.StreamReader.readlines()
have a new argument keepends. Trailing "\n"s will be stripped from the lines
if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and
PyUnicode_DecodeUTF16Stateful.
write a BOM at the start of the stream and also to only read it as
BOM at the start of a stream.
Subsequent reading/writing of BOMs will read/write the BOM as ZWNBSP
character. This is in sync with the Unicode specifications.
Note that UTF-16 files will now *have* to start with a BOM mark
in order to be readable by the codec.