Merge with 3.3
commit f567727abc
@@ -78,7 +78,11 @@ It defines the following functions:
 reference (for encoding only)
 * ``'backslashreplace'``: replace with backslashed escape sequences (for
 encoding only)
-* ``'surrogateescape'``: replace with surrogate U+DCxx, see :pep:`383`
+* ``'surrogateescape'``: on decoding, replace with code points in the Unicode
+Private Use Area ranging from U+DC80 to U+DCFF. These private code
+points will then be turned back into the same bytes when the
+``surrogateescape`` error handler is used when encoding the data.
+(See :pep:`383` for more.)
 
 as well as any other error handling name defined via :func:`register_error`.
 
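Review note: the round trip described in the new ``'surrogateescape'`` text can be checked with a minimal sketch like the one below. The sample bytes are made up for illustration. One caveat on wording: the added text calls U+DC80 to U+DCFF a Private Use Area, but in Unicode terms that range lies in the low-surrogate block, which is how :pep:`383` describes it.

```python
# Illustrative input: 0xE9 is "é" in Latin-1 but a stray byte in UTF-8.
raw = b"caf\xe9 au lait"

# Decoding: each undecodable byte 0xNN is mapped to the code point U+DCNN.
text = raw.decode("utf-8", errors="surrogateescape")
escaped = [hex(ord(ch)) for ch in text if 0xDC80 <= ord(ch) <= 0xDCFF]

# Encoding with the same handler turns U+DCNN back into the original byte.
round_tripped = text.encode("utf-8", errors="surrogateescape")

print(escaped)               # ['0xdce9']
print(round_tripped == raw)  # True
```

Encoding the same string with ``errors='strict'`` would raise ``UnicodeEncodeError``, since lone surrogates are not valid in UTF-8.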
@@ -905,16 +905,36 @@ are always available. They are listed here in alphabetical order.
 the list of supported encodings.
 
 *errors* is an optional string that specifies how encoding and decoding
-errors are to be handled--this cannot be used in binary mode. Pass
-``'strict'`` to raise a :exc:`ValueError` exception if there is an encoding
-error (the default of ``None`` has the same effect), or pass ``'ignore'`` to
-ignore errors. (Note that ignoring encoding errors can lead to data loss.)
-``'replace'`` causes a replacement marker (such as ``'?'``) to be inserted
-where there is malformed data. When writing, ``'xmlcharrefreplace'``
-(replace with the appropriate XML character reference) or
-``'backslashreplace'`` (replace with backslashed escape sequences) can be
-used. Any other error handling name that has been registered with
-:func:`codecs.register_error` is also valid.
+errors are to be handled--this cannot be used in binary mode.
+A variety of standard error handlers are available, though any
+error handling name that has been registered with
+:func:`codecs.register_error` is also valid. The standard names
+are:
+
+* ``'strict'`` to raise a :exc:`ValueError` exception if there is
+an encoding error. The default value of ``None`` has the same
+effect.
+
+* ``'ignore'`` ignores errors. Note that ignoring encoding errors
+can lead to data loss.
+
+* ``'replace'`` causes a replacement marker (such as ``'?'``) to be inserted
+where there is malformed data.
+
+* ``'surrogateescape'`` will represent any incorrect bytes as code
+points in the Unicode Private Use Area ranging from U+DC80 to
+U+DCFF. These private code points will then be turned back into
+the same bytes when the ``surrogateescape`` error handler is used
+when writing data. This is useful for processing files in an
+unknown encoding.
+
+* ``'xmlcharrefreplace'`` is only supported when writing to a file.
+Characters not supported by the encoding are replaced with the
+appropriate XML character reference ``&#nnn;``.
+
+* ``'backslashreplace'`` (also only supported when writing)
+replaces unsupported characters with Python's backslashed escape
+sequences.
 
 .. index::
 single: universal newlines; open() built-in function
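Review note: the handler list added to the ``open()`` documentation can be exercised with a small sketch such as the following. The scratch file, its contents, and the helper ``read_with`` are illustrative assumptions, not part of the commit.

```python
import os
import tempfile

# Write bytes that are not valid UTF-8 to a scratch file (0xFF never occurs in UTF-8).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"abc\xffdef")

def read_with(errors):
    """Hypothetical helper: read the scratch file as UTF-8 with the given handler."""
    with open(path, encoding="utf-8", errors=errors) as f:
        return f.read()

try:
    read_with("strict")
    strict_result = "no error"
except ValueError as exc:  # UnicodeDecodeError is a subclass of ValueError
    strict_result = type(exc).__name__

ignored = read_with("ignore")            # the bad byte is dropped
replaced = read_with("replace")          # the bad byte becomes U+FFFD
escaped = read_with("surrogateescape")   # the bad byte becomes U+DCFF

# The two write-only handlers, applied to a character ASCII cannot represent.
with open(path, "w", encoding="ascii", errors="xmlcharrefreplace") as f:
    f.write("caf\u00e9")
with open(path, "rb") as f:
    xml_bytes = f.read()

with open(path, "w", encoding="ascii", errors="backslashreplace") as f:
    f.write("caf\u00e9")
with open(path, "rb") as f:
    backslash_bytes = f.read()

os.remove(path)

print(strict_result)    # UnicodeDecodeError
print(repr(ignored))    # 'abcdef'
print(repr(replaced))   # 'abc\ufffddef'
print(xml_bytes)        # b'caf&#233;'
print(backslash_bytes)  # b'caf\\xe9'
```

The ``'ignore'`` result shows the data loss the documentation warns about: the original byte cannot be recovered from ``'abcdef'``, whereas the ``'surrogateescape'`` result can be written back unchanged.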
@@ -105,6 +105,7 @@ class Codec:
 Python will use the official U+FFFD REPLACEMENT
 CHARACTER for the builtin Unicode codecs on
 decoding and '?' on encoding.
+'surrogateescape' - replace with private codepoints U+DCnn.
 'xmlcharrefreplace' - Replace with the appropriate XML
 character reference (only for encoding).
 'backslashreplace' - Replace with backslashed escape sequences
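Review note: the docstring context above says ``'replace'`` behaves differently in the two directions. A short sketch, with made-up sample data, of what that looks like next to the newly listed ``'surrogateescape'``:

```python
# Decoding with 'replace' inserts U+FFFD REPLACEMENT CHARACTER.
decoded = b"abc\xff".decode("utf-8", errors="replace")

# Encoding with 'replace' inserts '?' for characters the codec cannot represent.
encoded = "caf\u00e9".encode("ascii", errors="replace")

# 'surrogateescape' maps an undecodable byte 0xNN to U+DCNN instead.
surrogate = b"\x80".decode("ascii", errors="surrogateescape")

print(repr(decoded))    # 'abc\ufffd'
print(encoded)          # b'caf?'
print(hex(ord(surrogate)))  # 0xdc80
```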
@@ -168,8 +168,8 @@ PyDoc_STRVAR(open_doc,
 "'strict' to raise a ValueError exception if there is an encoding error\n"
 "(the default of None has the same effect), or pass 'ignore' to ignore\n"
 "errors. (Note that ignoring encoding errors can lead to data loss.)\n"
-"See the documentation for codecs.register for a list of the permitted\n"
-"encoding error strings.\n"
+"See the documentation for codecs.register or run 'help(codecs.Codec)'\n"
+"for a list of the permitted encoding error strings.\n"
 "\n"
 "newline controls how universal newlines works (it only applies to text\n"
 "mode). It can be None, '', '\\n', '\\r', and '\\r\\n'. It works as\n"
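Review note: besides reading ``help(codecs.Codec)``, one way to check whether a string is a permitted error name is ``codecs.lookup_error``, and new names are added with ``codecs.register_error``. The sketch below assumes a hypothetical handler named ``'underscore-demo'``; neither that name nor the function is part of the commit.

```python
import codecs

# The standard handler names listed in the documentation hunks of this commit.
standard = ["strict", "ignore", "replace", "surrogateescape",
            "xmlcharrefreplace", "backslashreplace"]
known = {name: callable(codecs.lookup_error(name)) for name in standard}

# An unregistered name raises LookupError.
try:
    codecs.lookup_error("no-such-handler")
    unknown_ok = True
except LookupError:
    unknown_ok = False

def underscore_errors(exc):
    """Hypothetical handler: replace each unencodable character with '_'."""
    if isinstance(exc, UnicodeEncodeError):
        # Return the replacement text and the position at which to resume.
        return ("_" * (exc.end - exc.start), exc.end)
    raise exc

codecs.register_error("underscore-demo", underscore_errors)
result = "na\u00efve caf\u00e9".encode("ascii", errors="underscore-demo")

print(all(known.values()))  # True
print(unknown_ok)           # False
print(result)               # b'na_ve caf_'
```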
@@ -642,8 +642,9 @@ PyDoc_STRVAR(textiowrapper_doc,
 "encoding gives the name of the encoding that the stream will be\n"
 "decoded or encoded with. It defaults to locale.getpreferredencoding(False).\n"
 "\n"
-"errors determines the strictness of encoding and decoding (see the\n"
-"codecs.register) and defaults to \"strict\".\n"
+"errors determines the strictness of encoding and decoding (see\n"
+"help(codecs.Codec) or the documentation for codecs.register) and\n"
+"defaults to \"strict\".\n"
 "\n"
 "newline controls how line endings are handled. It can be None, '',\n"
 "'\\n', '\\r', and '\\r\\n'. It works as follows:\n"
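Review note: the ``errors`` argument described in the ``TextIOWrapper`` docstring can be tried without touching the filesystem. A minimal sketch, using an in-memory ``io.BytesIO`` buffer with made-up contents:

```python
import io

# Wrap a byte stream containing one byte that is invalid in UTF-8.
raw = io.BytesIO(b"ok\xff\n")
reader = io.TextIOWrapper(raw, encoding="utf-8", errors="replace")
content = reader.read()

# When errors is omitted, the wrapper reports the default handler name.
default_errors = io.TextIOWrapper(io.BytesIO(), encoding="utf-8").errors

print(repr(content))   # 'ok\ufffd\n'
print(default_errors)  # strict
```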