Merge with 3.3
commit f567727abc
@@ -78,7 +78,11 @@ It defines the following functions:
 reference (for encoding only)
 * ``'backslashreplace'``: replace with backslashed escape sequences (for
 encoding only)
-* ``'surrogateescape'``: replace with surrogate U+DCxx, see :pep:`383`
+* ``'surrogateescape'``: on decoding, replace with code points in the Unicode
+Private Use Area ranging from U+DC80 to U+DCFF. These private code
+points will then be turned back into the same bytes when the
+``surrogateescape`` error handler is used when encoding the data.
+(See :pep:`383` for more.)
 
 as well as any other error handling name defined via :func:`register_error`.
 
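The round trip described by the new ``'surrogateescape'`` text can be sketched as follows. This is a minimal illustration, not part of the commit; the sample bytes are made up. Note that the code points U+DC80 to U+DCFF are, strictly speaking, lone low-surrogate code points rather than Private Use Area code points, which is why a strict encoder refuses them.

```python
# Bytes that are not valid UTF-8 (0xff and 0xfe never occur in UTF-8).
raw = b"caf\xff\xfe"

# On decoding, each undecodable byte 0xNN becomes the code point U+DCNN.
text = raw.decode("utf-8", errors="surrogateescape")
print([hex(ord(ch)) for ch in text])

# On encoding with the same handler, those code points are turned back
# into the original bytes, so the round trip is lossless.
restored = text.encode("utf-8", errors="surrogateescape")
print(restored == raw)

# Without the handler, the escaped code points cannot be encoded.
try:
    text.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
print(strict_failed)
```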
@@ -905,16 +905,36 @@ are always available. They are listed here in alphabetical order.
 the list of supported encodings.
 
 *errors* is an optional string that specifies how encoding and decoding
-errors are to be handled--this cannot be used in binary mode. Pass
-``'strict'`` to raise a :exc:`ValueError` exception if there is an encoding
-error (the default of ``None`` has the same effect), or pass ``'ignore'`` to
-ignore errors. (Note that ignoring encoding errors can lead to data loss.)
-``'replace'`` causes a replacement marker (such as ``'?'``) to be inserted
-where there is malformed data. When writing, ``'xmlcharrefreplace'``
-(replace with the appropriate XML character reference) or
-``'backslashreplace'`` (replace with backslashed escape sequences) can be
-used. Any other error handling name that has been registered with
-:func:`codecs.register_error` is also valid.
+errors are to be handled--this cannot be used in binary mode.
+A variety of standard error handlers are available, though any
+error handling name that has been registered with
+:func:`codecs.register_error` is also valid. The standard names
+are:
+
+* ``'strict'`` to raise a :exc:`ValueError` exception if there is
+an encoding error. The default value of ``None`` has the same
+effect.
+
+* ``'ignore'`` ignores errors. Note that ignoring encoding errors
+can lead to data loss.
+
+* ``'replace'`` causes a replacement marker (such as ``'?'``) to be inserted
+where there is malformed data.
+
+* ``'surrogateescape'`` will represent any incorrect bytes as code
+points in the Unicode Private Use Area ranging from U+DC80 to
+U+DCFF. These private code points will then be turned back into
+the same bytes when the ``surrogateescape`` error handler is used
+when writing data. This is useful for processing files in an
+unknown encoding.
+
+* ``'xmlcharrefreplace'`` is only supported when writing to a file.
+Characters not supported by the encoding are replaced with the
+appropriate XML character reference ``&#nnn;``.
+
+* ``'backslashreplace'`` (also only supported when writing)
+replaces unsupported characters with Python's backslashed escape
+sequences.
 
 .. index::
 single: universal newlines; open() built-in function
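The handlers listed for the *errors* argument of open() can be compared with a short script. This is only a sketch under stated assumptions: the file name ``demo.txt`` and the sample strings are hypothetical, and a temporary directory stands in for a real data file.

```python
import os
import tempfile

text = "naïve ✓"  # contains two characters that ASCII cannot represent

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    # Write the same text with each handler and read back the raw bytes.
    results = {}
    for handler in ("ignore", "replace", "xmlcharrefreplace", "backslashreplace"):
        with open(path, "w", encoding="ascii", errors=handler) as f:
            f.write(text)
        with open(path, "rb") as f:
            results[handler] = f.read()

    # errors=None behaves like 'strict': the write raises a ValueError
    # subclass (UnicodeEncodeError).
    try:
        with open(path, "w", encoding="ascii") as f:
            f.write(text)
        strict_raised = False
    except ValueError:
        strict_raised = True

    # 'surrogateescape': read Latin-1 bytes as if they were UTF-8, then
    # write them back unchanged without knowing the real encoding.
    original = "naïve".encode("latin-1")
    with open(path, "wb") as f:
        f.write(original)
    with open(path, encoding="utf-8", errors="surrogateescape") as f:
        decoded = f.read()
    with open(path, "w", encoding="utf-8", errors="surrogateescape") as f:
        f.write(decoded)
    with open(path, "rb") as f:
        round_tripped = f.read()

for handler, data in results.items():
    print(handler, data)
print(round_tripped == original)
```

The last step is the "files in an unknown encoding" case from the text: the undecodable byte survives the read and write cycle even though it was never valid UTF-8.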
@@ -105,6 +105,7 @@ class Codec:
 Python will use the official U+FFFD REPLACEMENT
 CHARACTER for the builtin Unicode codecs on
 decoding and '?' on encoding.
+'surrogateescape' - replace with private codepoints U+DCnn.
 'xmlcharrefreplace' - Replace with the appropriate XML
 character reference (only for encoding).
 'backslashreplace' - Replace with backslashed escape sequences
@@ -168,8 +168,8 @@ PyDoc_STRVAR(open_doc,
 "'strict' to raise a ValueError exception if there is an encoding error\n"
 "(the default of None has the same effect), or pass 'ignore' to ignore\n"
 "errors. (Note that ignoring encoding errors can lead to data loss.)\n"
-"See the documentation for codecs.register for a list of the permitted\n"
-"encoding error strings.\n"
+"See the documentation for codecs.register or run 'help(codecs.Codec)'\n"
+"for a list of the permitted encoding error strings.\n"
 "\n"
 "newline controls how universal newlines works (it only applies to text\n"
 "mode). It can be None, '', '\\n', '\\r', and '\\r\\n'. It works as\n"
@@ -642,8 +642,9 @@ PyDoc_STRVAR(textiowrapper_doc,
 "encoding gives the name of the encoding that the stream will be\n"
 "decoded or encoded with. It defaults to locale.getpreferredencoding(False).\n"
 "\n"
-"errors determines the strictness of encoding and decoding (see the\n"
-"codecs.register) and defaults to \"strict\".\n"
+"errors determines the strictness of encoding and decoding (see\n"
+"help(codecs.Codec) or the documentation for codecs.register) and\n"
+"defaults to \"strict\".\n"
 "\n"
 "newline controls how line endings are handled. It can be None, '',\n"
 "'\\n', '\\r', and '\\r\\n'. It works as follows:\n"