mirror of https://github.com/python/cpython
bpo-28749: Fixed the documentation of the mapping codec APIs. (#487)
Added the documentation for PyUnicode_Translate().
parent 909a6f626f
commit c85a26628c
@@ -1399,72 +1399,45 @@ Character Map Codecs
This codec is special in that it can be used to implement many different codecs
(and this is in fact what was done to obtain most of the standard codecs
included in the :mod:`encodings` package). The codec uses mapping to encode and
decode characters.

Decoding mappings must map single string characters to single Unicode
characters, integers (which are then interpreted as Unicode ordinals) or ``None``
(meaning "undefined mapping" and causing an error).

Encoding mappings must map single Unicode characters to single string
characters, integers (which are then interpreted as Latin-1 ordinals) or ``None``
(meaning "undefined mapping" and causing an error).

The mapping objects provided must only support the __getitem__ mapping
interface.

If a character lookup fails with a LookupError, the character is copied as-is
meaning that its ordinal value will be interpreted as Unicode or Latin-1 ordinal
resp. Because of this, mappings only need to contain those mappings which map
characters to different code points.
decode characters. The mapping objects provided must support the
:meth:`__getitem__` mapping interface; dictionaries and sequences work well.
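
As a rough illustration of "dictionaries and sequences work well", the sketch below uses ``str.translate()``, taken here as the Python-level counterpart of the translate API documented in this commit. The table names are hypothetical and chosen only for this example.

```python
# Sketch: any object that supports __getitem__ can serve as a mapping table.

# A dictionary keyed by Unicode ordinal (hypothetical example table).
dict_table = {ord("a"): "A", ord("b"): None}

# A sequence indexed by Unicode ordinal: entries 0..31 are None, so those
# code points are deleted. Indexing past the end raises IndexError, which is
# a LookupError subclass, so every other character is left untouched.
list_table = [None] * 32

dict_result = "abc".translate(dict_table)       # 'a' mapped, 'b' deleted, 'c' kept
list_result = "a\tb\nc".translate(list_table)   # tab and newline deleted

print(dict_result)
print(list_result)
```

A sequence is convenient for dense tables that start at ordinal 0, while a dictionary suits sparse tables.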

These are the mapping codec APIs:

.. c:function:: PyObject* PyUnicode_DecodeCharmap(const char *s, Py_ssize_t size, \
.. c:function:: PyObject* PyUnicode_DecodeCharmap(const char *data, Py_ssize_t size, \
PyObject *mapping, const char *errors)

Create a Unicode object by decoding *size* bytes of the encoded string *s* using
the given *mapping* object. Return *NULL* if an exception was raised by the
codec. If *mapping* is *NULL* latin-1 decoding will be done. Else it can be a
dictionary mapping byte or a unicode string, which is treated as a lookup table.
Byte values greater than the length of the string and U+FFFE "characters" are
treated as "undefined mapping".
Create a Unicode object by decoding *size* bytes of the encoded string *data*
using the given *mapping* object. Return *NULL* if an exception was raised
by the codec.

If *mapping* is *NULL*, Latin-1 decoding will be applied. Else
*mapping* must map byte ordinals (integers in the range from 0 to 255)
to Unicode strings, integers (which are then interpreted as Unicode
ordinals) or ``None``. Unmapped data bytes -- ones which cause a
:exc:`LookupError`, as well as ones which get mapped to ``None``,
``0xFFFE`` or ``'\ufffe'``, are treated as undefined mappings and cause
an error.
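
The decoding rules above can be sketched from Python with ``codecs.charmap_decode()``, which is assumed here to apply the same mapping semantics as ``PyUnicode_DecodeCharmap``. The table contents and the ``is_undefined`` helper are hypothetical and exist only for this example.

```python
import codecs

# Hypothetical decoding table: byte ordinal -> str, int (Unicode ordinal) or None.
decoding_map = {
    0x41: "a",      # byte 0x41 decodes to the string "a"
    0x42: 0x20AC,   # byte 0x42 decodes to U+20AC, given as an ordinal
    0x43: None,     # explicitly undefined
    0x44: 0xFFFE,   # 0xFFFE is also treated as undefined
}

decoded, consumed = codecs.charmap_decode(b"AB", "strict", decoding_map)

def is_undefined(data):
    """Return True if decoding *data* fails with the 'strict' error handler."""
    try:
        codecs.charmap_decode(data, "strict", decoding_map)
    except UnicodeDecodeError:
        return True
    return False

print(decoded, consumed)
print(is_undefined(b"C"), is_undefined(b"D"), is_undefined(b"E"))
```

Byte ``0x45`` is absent from the table, so the lookup raises a ``LookupError`` internally and the byte is reported as undefined, just like the entries mapped to ``None`` and ``0xFFFE``.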


.. c:function:: PyObject* PyUnicode_AsCharmapString(PyObject *unicode, PyObject *mapping)

Encode a Unicode object using the given *mapping* object and return the result
as a Python string object. Error handling is "strict". Return *NULL* if an
Encode a Unicode object using the given *mapping* object and return the
result as a bytes object. Error handling is "strict". Return *NULL* if an
exception was raised by the codec.

The following codec API is special in that it maps Unicode to Unicode.


.. c:function:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, \
PyObject *table, const char *errors)

Translate a :c:type:`Py_UNICODE` buffer of the given *size* by applying a
character mapping *table* to it and return the resulting Unicode object. Return
*NULL* when an exception was raised by the codec.

The *mapping* table must map Unicode ordinal integers to Unicode ordinal
integers or ``None`` (causing deletion of the character).

Mapping tables need only provide the :meth:`__getitem__` interface; dictionaries
and sequences work well. Unmapped character ordinals (ones which cause a
:exc:`LookupError`) are left untouched and are copied as-is.

.. deprecated-removed:: 3.3 4.0
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
:c:func:`PyUnicode_Translate` or :ref:`generic codec based API
<codec-registry>`
The *mapping* object must map Unicode ordinal integers to bytes objects,
integers in the range from 0 to 255 or ``None``. Unmapped character
ordinals (ones which cause a :exc:`LookupError`) as well as ones mapped to
``None`` are treated as "undefined mapping" and cause an error.
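
A minimal sketch of the encoding rules, using ``codecs.charmap_encode()`` as the assumed Python-level counterpart of ``PyUnicode_AsCharmapString``. The table contents and the ``fails_strict`` helper are hypothetical.

```python
import codecs

# Hypothetical encoding table: Unicode ordinal -> bytes, int in 0..255, or None.
encoding_map = {
    ord("a"): 0x41,           # ordinal mapped to a single byte value
    ord("\u20ac"): b"EUR",    # ordinal mapped to a bytes object
    ord("x"): None,           # explicitly undefined
}

encoded, consumed = codecs.charmap_encode("a\u20ac", "strict", encoding_map)

def fails_strict(text):
    """Return True if encoding *text* fails with the 'strict' error handler."""
    try:
        codecs.charmap_encode(text, "strict", encoding_map)
    except UnicodeEncodeError:
        return True
    return False

print(encoded, consumed)
print(fails_strict("x"), fails_strict("z"))
```

Unlike translation, a character that is missing from the table (``"z"`` here) is an error rather than being copied through.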


.. c:function:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, \
PyObject *mapping, const char *errors)

Encode the :c:type:`Py_UNICODE` buffer of the given *size* using the given
*mapping* object and return a Python string object. Return *NULL* if an
exception was raised by the codec.
*mapping* object and return the result as a bytes object. Return *NULL* if
an exception was raised by the codec.

.. deprecated-removed:: 3.3 4.0
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
@@ -1472,6 +1445,34 @@ The following codec API is special in that it maps Unicode to Unicode.
:c:func:`PyUnicode_AsEncodedString`.


The following codec API is special in that it maps Unicode to Unicode.

.. c:function:: PyObject* PyUnicode_Translate(PyObject *unicode, \
PyObject *mapping, const char *errors)

Translate a Unicode object using the given *mapping* object and return the
resulting Unicode object. Return *NULL* if an exception was raised by the
codec.

The *mapping* object must map Unicode ordinal integers to Unicode strings,
integers (which are then interpreted as Unicode ordinals) or ``None``
(causing deletion of the character). Unmapped character ordinals (ones
which cause a :exc:`LookupError`) are left untouched and are copied as-is.
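
The three kinds of target value described above can be sketched with ``str.translate()``, assumed here to follow the same mapping rules as ``PyUnicode_Translate``. The table is a hypothetical example.

```python
# Hypothetical translation table covering each kind of target value.
table = {
    ord("a"): "XY",     # ordinal -> Unicode string (may be longer than one character)
    ord("b"): 0x263A,   # ordinal -> Unicode ordinal
    ord("c"): None,     # ordinal -> None deletes the character
}

# "d" has no entry, so the lookup fails and the character is copied as-is.
translated = "abcd".translate(table)
print(translated)
```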


.. c:function:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, \
PyObject *mapping, const char *errors)

Translate a :c:type:`Py_UNICODE` buffer of the given *size* by applying a
character *mapping* table to it and return the resulting Unicode object.
Return *NULL* when an exception was raised by the codec.

.. deprecated-removed:: 3.3 4.0
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
:c:func:`PyUnicode_Translate` or :ref:`generic codec based API
<codec-registry>`


MBCS codecs for Windows
"""""""""""""""""""""""
@@ -1613,50 +1613,41 @@ PyAPI_FUNC(PyObject*) PyUnicode_EncodeASCII(

This codec uses mappings to encode and decode characters.

Decoding mappings must map single string characters to single
Unicode characters, integers (which are then interpreted as Unicode
ordinals) or None (meaning "undefined mapping" and causing an
error).
Decoding mappings must map byte ordinals (integers in the range from 0 to
255) to Unicode strings, integers (which are then interpreted as Unicode
ordinals) or None. Unmapped data bytes (ones which cause a LookupError)
as well as ones mapped to None, 0xFFFE or '\ufffe' are treated as "undefined
mapping" and cause an error.

Encoding mappings must map single Unicode characters to single
string characters, integers (which are then interpreted as Latin-1
ordinals) or None (meaning "undefined mapping" and causing an
error).

If a character lookup fails with a LookupError, the character is
copied as-is meaning that its ordinal value will be interpreted as
Unicode or Latin-1 ordinal resp. Because of this mappings only need
to contain those mappings which map characters to different code
points.
Encoding mappings must map Unicode ordinal integers to bytes objects,
integers in the range from 0 to 255 or None. Unmapped character
ordinals (ones which cause a LookupError) as well as ones mapped to
None are treated as "undefined mapping" and cause an error.

*/

PyAPI_FUNC(PyObject*) PyUnicode_DecodeCharmap(
const char *string, /* Encoded string */
Py_ssize_t length, /* size of string */
PyObject *mapping, /* character mapping
(char ordinal -> unicode ordinal) */
PyObject *mapping, /* decoding mapping */
const char *errors /* error handling */
);

PyAPI_FUNC(PyObject*) PyUnicode_AsCharmapString(
PyObject *unicode, /* Unicode object */
PyObject *mapping /* character mapping
(unicode ordinal -> char ordinal) */
PyObject *mapping /* encoding mapping */
);

#ifndef Py_LIMITED_API
PyAPI_FUNC(PyObject*) PyUnicode_EncodeCharmap(
const Py_UNICODE *data, /* Unicode char buffer */
Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
PyObject *mapping, /* character mapping
(unicode ordinal -> char ordinal) */
PyObject *mapping, /* encoding mapping */
const char *errors /* error handling */
) Py_DEPRECATED(3.3);
PyAPI_FUNC(PyObject*) _PyUnicode_EncodeCharmap(
PyObject *unicode, /* Unicode object */
PyObject *mapping, /* character mapping
(unicode ordinal -> char ordinal) */
PyObject *mapping, /* encoding mapping */
const char *errors /* error handling */
);
#endif
@@ -1665,8 +1656,8 @@ PyAPI_FUNC(PyObject*) _PyUnicode_EncodeCharmap(
character mapping table to it and return the resulting Unicode
object.

The mapping table must map Unicode ordinal integers to Unicode
ordinal integers or None (causing deletion of the character).
The mapping table must map Unicode ordinal integers to Unicode strings,
Unicode ordinal integers or None (causing deletion of the character).

Mapping tables may be dictionaries or sequences. Unmapped character
ordinals (ones which cause a LookupError) are left untouched and
@@ -1964,8 +1955,8 @@ PyAPI_FUNC(PyObject*) PyUnicode_RSplit(
/* Translate a string by applying a character mapping table to it and
return the resulting Unicode object.

The mapping table must map Unicode ordinal integers to Unicode
ordinal integers or None (causing deletion of the character).
The mapping table must map Unicode ordinal integers to Unicode strings,
Unicode ordinal integers or None (causing deletion of the character).

Mapping tables may be dictionaries or sequences. Unmapped character
ordinals (ones which cause a LookupError) are left untouched and