Issue #17844: Refactor the documentation of Python-specific encodings.

Add links to encoders and decoders for binary-to-binary codecs.
Serhiy Storchaka 2013-05-22 15:28:30 +03:00
parent cb1a0283c3
commit 54f70923a3
2 changed files with 105 additions and 76 deletions

@ -1098,88 +1098,112 @@ particular, the following variants typically exist:
| utf_8_sig | | all languages |
+-----------------+--------------------------------+--------------------------------+
A number of codecs are specific to Python, so their codec names have no meaning
outside Python. Some of them don't convert from Unicode strings to byte strings,
but instead use the property of the Python codecs machinery that any bijective
function with one argument can be considered as an encoding.
Python Specific Encodings
-------------------------
For the codecs listed below, the result in the "encoding" direction is always a
byte string. The result of the "decoding" direction is listed as operand type in
the table.
A number of predefined codecs are specific to Python, so their codec names have
no meaning outside Python. These are listed in the tables below based on the
expected input and output types (note that while text encodings are the most
common use case for codecs, the underlying codec infrastructure supports
arbitrary data transforms rather than just text encodings). For asymmetric
codecs, the stated purpose describes the encoding direction.
.. tabularcolumns:: |l|p{0.3\linewidth}|l|p{0.3\linewidth}|
The following codecs provide unicode-to-str encoding [#encoding-note]_ and
str-to-unicode decoding [#decoding-note]_, similar to the Unicode text
encodings.
+--------------------+---------------------------+----------------+---------------------------+
| Codec              | Aliases                   | Operand type   | Purpose                   |
+====================+===========================+================+===========================+
| base64_codec       | base64, base-64           | byte string    | Convert operand to MIME   |
|                    |                           |                | base64 (the result always |
|                    |                           |                | includes a trailing       |
|                    |                           |                | ``'\n'``)                 |
+--------------------+---------------------------+----------------+---------------------------+
| bz2_codec          | bz2                       | byte string    | Compress the operand      |
|                    |                           |                | using bz2                 |
+--------------------+---------------------------+----------------+---------------------------+
| hex_codec          | hex                       | byte string    | Convert operand to        |
|                    |                           |                | hexadecimal               |
|                    |                           |                | representation, with two  |
|                    |                           |                | digits per byte           |
+--------------------+---------------------------+----------------+---------------------------+
| idna               |                           | Unicode string | Implements :rfc:`3490`,   |
|                    |                           |                | see also                  |
|                    |                           |                | :mod:`encodings.idna`     |
+--------------------+---------------------------+----------------+---------------------------+
| mbcs               | dbcs                      | Unicode string | Windows only: Encode      |
|                    |                           |                | operand according to the  |
|                    |                           |                | ANSI codepage (CP_ACP)    |
+--------------------+---------------------------+----------------+---------------------------+
| palmos             |                           | Unicode string | Encoding of PalmOS 3.5    |
+--------------------+---------------------------+----------------+---------------------------+
| punycode           |                           | Unicode string | Implements :rfc:`3492`    |
+--------------------+---------------------------+----------------+---------------------------+
| quopri_codec       | quopri, quoted-printable, | byte string    | Convert operand to MIME   |
|                    | quotedprintable           |                | quoted printable          |
+--------------------+---------------------------+----------------+---------------------------+
| raw_unicode_escape |                           | Unicode string | Produce a string that is  |
|                    |                           |                | suitable as raw Unicode   |
|                    |                           |                | literal in Python source  |
|                    |                           |                | code                      |
+--------------------+---------------------------+----------------+---------------------------+
| rot_13             | rot13                     | Unicode string | Returns the Caesar-cypher |
|                    |                           |                | encryption of the operand |
+--------------------+---------------------------+----------------+---------------------------+
| string_escape      |                           | byte string    | Produce a string that is  |
|                    |                           |                | suitable as string        |
|                    |                           |                | literal in Python source  |
|                    |                           |                | code                      |
+--------------------+---------------------------+----------------+---------------------------+
| undefined          |                           | any            | Raise an exception for    |
|                    |                           |                | all conversions. Can be   |
|                    |                           |                | used as the system        |
|                    |                           |                | encoding if no automatic  |
|                    |                           |                | :term:`coercion` between  |
|                    |                           |                | byte and Unicode strings  |
|                    |                           |                | is desired.               |
+--------------------+---------------------------+----------------+---------------------------+
| unicode_escape     |                           | Unicode string | Produce a string that is  |
|                    |                           |                | suitable as Unicode       |
|                    |                           |                | literal in Python source  |
|                    |                           |                | code                      |
+--------------------+---------------------------+----------------+---------------------------+
| unicode_internal   |                           | Unicode string | Return the internal       |
|                    |                           |                | representation of the     |
|                    |                           |                | operand                   |
+--------------------+---------------------------+----------------+---------------------------+
| uu_codec           | uu                        | byte string    | Convert the operand using |
|                    |                           |                | uuencode                  |
+--------------------+---------------------------+----------------+---------------------------+
| zlib_codec         | zip, zlib                 | byte string    | Compress the operand      |
|                    |                           |                | using gzip                |
+--------------------+---------------------------+----------------+---------------------------+
.. tabularcolumns:: |l|L|L|
+--------------------+---------------------------+---------------------------+
| Codec              | Aliases                   | Purpose                   |
+====================+===========================+===========================+
| idna               |                           | Implements :rfc:`3490`,   |
|                    |                           | see also                  |
|                    |                           | :mod:`encodings.idna`     |
+--------------------+---------------------------+---------------------------+
| mbcs               | dbcs                      | Windows only: Encode      |
|                    |                           | operand according to the  |
|                    |                           | ANSI codepage (CP_ACP)    |
+--------------------+---------------------------+---------------------------+
| palmos             |                           | Encoding of PalmOS 3.5    |
+--------------------+---------------------------+---------------------------+
| punycode           |                           | Implements :rfc:`3492`    |
+--------------------+---------------------------+---------------------------+
| raw_unicode_escape |                           | Produce a string that is  |
|                    |                           | suitable as raw Unicode   |
|                    |                           | literal in Python source  |
|                    |                           | code                      |
+--------------------+---------------------------+---------------------------+
| rot_13             | rot13                     | Returns the Caesar-cypher |
|                    |                           | encryption of the operand |
+--------------------+---------------------------+---------------------------+
| undefined          |                           | Raise an exception for    |
|                    |                           | all conversions. Can be   |
|                    |                           | used as the system        |
|                    |                           | encoding if no automatic  |
|                    |                           | :term:`coercion` between  |
|                    |                           | byte and Unicode strings  |
|                    |                           | is desired.               |
+--------------------+---------------------------+---------------------------+
| unicode_escape     |                           | Produce a string that is  |
|                    |                           | suitable as Unicode       |
|                    |                           | literal in Python source  |
|                    |                           | code                      |
+--------------------+---------------------------+---------------------------+
| unicode_internal   |                           | Return the internal       |
|                    |                           | representation of the     |
|                    |                           | operand                   |
+--------------------+---------------------------+---------------------------+
.. versionadded:: 2.3
   The ``idna`` and ``punycode`` encodings.
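As a rough illustration of the text codecs in the table above, the following sketch encodes a few Unicode strings and decodes the results again. The sample strings are arbitrary choices, not part of this change, and the snippet is written with the intent that it behaves the same on Python 2.7 and Python 3:

```python
# Illustrative only: the sample strings below are arbitrary assumptions.
name = u"b\u00fccher"  # "bucher" with u-umlaut, written as an escape

# punycode implements RFC 3492 and works on a single label.
puny = name.encode("punycode")
# idna implements RFC 3490 and converts each dot-separated label.
host = (name + u".example").encode("idna")
# unicode_escape yields ASCII text suitable for a Unicode literal.
escaped = u"\u20ac".encode("unicode_escape")

print(repr(puny))
print(repr(host))
print(repr(escaped))

# Decoding reverses each transformation.
assert puny.decode("punycode") == name
assert host.decode("idna") == name + u".example"
assert escaped.decode("unicode_escape") == u"\u20ac"
```

In each case the encoded result is a byte string, and decoding it with the same codec name should give back the original Unicode string.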
The following codecs provide str-to-str encoding and decoding
[#decoding-note]_.
.. tabularcolumns:: |l|L|L|L|
+--------------------+---------------------------+---------------------------+------------------------------+
| Codec              | Aliases                   | Purpose                   | Encoder/decoder              |
+====================+===========================+===========================+==============================+
| base64_codec       | base64, base-64           | Convert operand to MIME   | :meth:`base64.b64encode`,    |
|                    |                           | base64 (the result always | :meth:`base64.b64decode`     |
|                    |                           | includes a trailing       |                              |
|                    |                           | ``'\n'``)                 |                              |
+--------------------+---------------------------+---------------------------+------------------------------+
| bz2_codec          | bz2                       | Compress the operand      | :meth:`bz2.compress`,        |
|                    |                           | using bz2                 | :meth:`bz2.decompress`       |
+--------------------+---------------------------+---------------------------+------------------------------+
| hex_codec          | hex                       | Convert operand to        | :meth:`base64.b16encode`,    |
|                    |                           | hexadecimal               | :meth:`base64.b16decode`     |
|                    |                           | representation, with two  |                              |
|                    |                           | digits per byte           |                              |
+--------------------+---------------------------+---------------------------+------------------------------+
| quopri_codec       | quopri, quoted-printable, | Convert operand to MIME   | :meth:`quopri.encodestring`, |
|                    | quotedprintable           | quoted printable          | :meth:`quopri.decodestring`  |
+--------------------+---------------------------+---------------------------+------------------------------+
| string_escape      |                           | Produce a string that is  |                              |
|                    |                           | suitable as string        |                              |
|                    |                           | literal in Python source  |                              |
|                    |                           | code                      |                              |
+--------------------+---------------------------+---------------------------+------------------------------+
| uu_codec           | uu                        | Convert the operand using | :meth:`uu.encode`,           |
|                    |                           | uuencode                  | :meth:`uu.decode`            |
+--------------------+---------------------------+---------------------------+------------------------------+
| zlib_codec         | zip, zlib                 | Compress the operand      | :meth:`zlib.compress`,       |
|                    |                           | using gzip                | :meth:`zlib.decompress`      |
+--------------------+---------------------------+---------------------------+------------------------------+
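A minimal sketch of how the byte-oriented codecs above might be used, and how their output compares with functions from the underlying modules. It goes through ``codecs.encode`` and ``codecs.decode`` with the full codec names so that it should run on Python 2.7 as well as recent Python 3 releases. The sample data is an arbitrary assumption:

```python
import base64
import codecs
import zlib

data = b"hello"  # arbitrary sample input (an assumption for illustration)

# hex_codec: two hexadecimal digits per input byte.
hexed = codecs.encode(data, "hex_codec")

# base64_codec: MIME base64; the result ends with a newline, which
# base64.b64encode() on its own does not add.
b64 = codecs.encode(data, "base64_codec")

# zlib_codec: compressed output that zlib.decompress() can read back.
packed = codecs.encode(data * 100, "zlib_codec")

print(repr(hexed))
print(repr(b64))
print(len(packed) < len(data * 100))

assert codecs.decode(hexed, "hex_codec") == data
assert b64.rstrip(b"\n") == base64.b64encode(data)
assert zlib.decompress(packed) == data * 100
```

The comparison with ``base64.b64encode`` strips the trailing newline first, since the codec output includes one and the plain function call does not.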
.. [#encoding-note] str objects are also accepted as input in place of unicode
   objects. They are implicitly converted to unicode by decoding them using
   the default encoding. If this conversion fails, it may lead to encoding
   operations raising :exc:`UnicodeDecodeError`.

.. [#decoding-note] unicode objects are also accepted as input in place of str
   objects. They are implicitly converted to str by encoding them using the
   default encoding. If this conversion fails, it may lead to decoding
   operations raising :exc:`UnicodeEncodeError`.
:mod:`encodings.idna` --- Internationalized Domain Names in Applications
------------------------------------------------------------------------

@ -26,12 +26,17 @@ IDLE
- Issue #14146: Highlight source line while debugging on Windows.
Tests
-----
- Issue #11995: test_pydoc doesn't import all sys.path modules anymore.
Documentation
-------------
- Issue #17844: Refactor the documentation of Python-specific encodings.
Add links to encoders and decoders for binary-to-binary codecs.
What's New in Python 2.7.5?
===========================