even if there is an exception in later lines, resulting in
correct line numbers for decoding errors in source code. Fixes#1178484.
Will backport to 2.4.
decoding incomplete input (when the input stream is temporarily exhausted).
codecs.StreamReader now implements buffering, which enables proper
readline support for the UTF-16 decoders. codecs.StreamReader.read()
has a new argument chars which specifies the number of characters to
return. codecs.StreamReader.readline() and codecs.StreamReader.readlines()
have a new argument keepends. Trailing "\n"s will be stripped from the lines
if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and
PyUnicode_DecodeUTF16Stateful.
and installed layouts to make maintenance simple and easy. And it
also adds four new codecs; big5hkscs, euc-jis-2004, shift-jis-2004
and iso2022-jp-2004.
error handers in the Unicode codecs: Negative
positions are treated as being relative to the end of
the input and out of bounds positions result in an
IndexError.
Also update the PEP and include an explanation of
this in the documentation for codecs.register_error.
Fixes a small bug in iconv_codecs: if the position
from the callback is negative *add* it to the size
instead of substracting it.
From SF patch #677429.
The errors attribute can be changed after the reader/writer
is created.
For encoding there are two additional errors values:
"xmlcharrefreplace" and "backslashreplace".
These values can be extended via register_error().
BOM_UTF32, BOM_UTF32_LE and BOM_UTF32_BE that represent the Byte
Order Mark in UTF-8, UTF-16 and UTF-32 encodings for little and
big endian systems.
The old names BOM32_* and BOM64_* were off by a factor of 2.
This closes SF bug http://www.python.org/sf/555360