#1753718: clarify RFC compliance and bytes/string argument types.

Patch includes contributions by Isobel Hooper, incorporating suggestions from Paul Winkler. Reviewed by Martin Panter. In addition to accepting the corrections for the RFC compliance wording, I went through and corrected all the argument and return types, and made the pattern of how the arguments and return types are documented consistent. So, this patch also addresses #20782, though I had forgotten about that issue and its patch.
2015-12-23 21:17:17 -05:00 · a198645fa0
parent a17ca19d34
commit a198645fa0
1 changed file with 71 additions and 57 deletions
--- a/Doc/library/base64.rst
+++ b/Doc/library/base64.rst
@@ -21,13 +21,19 @@ safely sent by email, used as parts of URLs, or included as part of an HTTP
 POST request.  The encoding algorithm is not the same as the
 :program:`uuencode` program.

-There are two :rfc:`3548` interfaces provided by this module.  The modern
-interface supports encoding and decoding ASCII byte string objects using all
-three :rfc:`3548` defined alphabets (normal, URL-safe, and filesystem-safe).
-Additionally, the decoding functions of the modern interface also accept
-Unicode strings containing only ASCII characters. The legacy interface provides
-for encoding and decoding to and from file-like objects as well as byte
-strings, but only using the Base64 standard alphabet.
+There are two interfaces provided by this module.  The modern interface
+supports encoding :term:`bytes-like objects <bytes-like object>` to ASCII
+:class:`bytes`, and decoding :term:`bytes-like objects <bytes-like object>` or
+strings containing ASCII to :class:`bytes`.  All three :rfc:`3548` defined
+alphabets (normal, URL-safe, and filesystem-safe) are supported.
+
+The legacy interface does not support decoding from strings, but it does
+provide functions for encoding and decoding to and from :term:`file objects
+<file object>`.  It only supports the Base64 standard alphabet, and it adds
+newlines every 76 characters as per :rfc:`2045`.  Note that if you are looking
+for :rfc:`2045` support you probably want to be looking at the :mod:`email`
+package instead.
+

 .. versionchanged:: 3.3
   ASCII-only Unicode strings are now accepted by the decoding functions of
@@ -41,26 +47,26 @@ The modern interface provides:

 .. function:: b64encode(s, altchars=None)

-   Encode a byte string using Base64.
+   Encode the :term:`bytes-like object` *s* using Base64 and return the encoded
+   :class:`bytes`.

-   *s* is the string to encode.  Optional *altchars* must be a string of at least
+   Optional *altchars* must be a :term:`bytes-like object` of at least
   length 2 (additional characters are ignored) which specifies an alternative
   alphabet for the ``+`` and ``/`` characters.  This allows an application to e.g.
   generate URL or filesystem safe Base64 strings.  The default is ``None``, for
   which the standard Base64 alphabet is used.

-   The encoded byte string is returned.
-

 .. function:: b64decode(s, altchars=None, validate=False)

-   Decode a Base64 encoded byte string.
+   Decode the Base64 encoded :term:`bytes-like object` or ASCII string
+   *s* and return the decoded :class:`bytes`.

-   *s* is the byte string to decode.  Optional *altchars* must be a string of
+   Optional *altchars* must be a :term:`bytes-like object` or ASCII string of
   at least length 2 (additional characters are ignored) which specifies the
   alternative alphabet used instead of the ``+`` and ``/`` characters.

-   The decoded string is returned.  A :exc:`binascii.Error` exception is raised
+   A :exc:`binascii.Error` exception is raised
   if *s* is incorrectly padded.

   If *validate* is ``False`` (the default), non-base64-alphabet characters are
@@ -71,38 +77,44 @@ The modern interface provides:

 .. function:: standard_b64encode(s)

-   Encode byte string *s* using the standard Base64 alphabet.
+   Encode :term:`bytes-like object` *s* using the standard Base64 alphabet
+   and return the encoded :class:`bytes`.


 .. function:: standard_b64decode(s)

-   Decode byte string *s* using the standard Base64 alphabet.
+   Decode :term:`bytes-like object` or ASCII string *s* using the standard
+   Base64 alphabet and return the decoded :class:`bytes`.


 .. function:: urlsafe_b64encode(s)

-   Encode byte string *s* using a URL-safe alphabet, which substitutes ``-`` instead of
-   ``+`` and ``_`` instead of ``/`` in the standard Base64 alphabet.  The result
+   Encode :term:`bytes-like object` *s* using a URL-safe alphabet, which
+   substitutes ``-`` instead of ``+`` and ``_`` instead of ``/`` in the
+   standard Base64 alphabet, and return the encoded :class:`bytes`.  The result
   can still contain ``=``.


 .. function:: urlsafe_b64decode(s)

-   Decode byte string *s* using a URL-safe alphabet, which substitutes ``-`` instead of
-   ``+`` and ``_`` instead of ``/`` in the standard Base64 alphabet.
+   Decode :term:`bytes-like object` or ASCII string *s* using a URL-safe
+   alphabet, which substitutes ``-`` instead of ``+`` and ``_`` instead of
+   ``/`` in the standard Base64 alphabet, and return the decoded
+   :class:`bytes`.


 .. function:: b32encode(s)

-   Encode a byte string using Base32.  *s* is the string to encode.  The encoded string
-   is returned.
+   Encode the :term:`bytes-like object` *s* using Base32 and return the
+   encoded :class:`bytes`.


 .. function:: b32decode(s, casefold=False, map01=None)

-   Decode a Base32 encoded byte string.
+   Decode the Base32 encoded :term:`bytes-like object` or ASCII string *s* and
+   return the decoded :class:`bytes`.

-   *s* is the byte string to decode.  Optional *casefold* is a flag specifying
+   Optional *casefold* is a flag specifying
   whether a lowercase alphabet is acceptable as input.  For security purposes,
   the default is ``False``.

@@ -113,46 +125,45 @@ The modern interface provides:
   digit 0 is always mapped to the letter O).  For security purposes the default is
   ``None``, so that 0 and 1 are not allowed in the input.

-   The decoded byte string is returned.  A :exc:`binascii.Error` is raised if *s* is
+   A :exc:`binascii.Error` is raised if *s* is
   incorrectly padded or if there are non-alphabet characters present in the
-   string.
+   input.


 .. function:: b16encode(s)

-   Encode a byte string using Base16.
-
-   *s* is the string to encode.  The encoded byte string is returned.
+   Encode the :term:`bytes-like object` *s* using Base16 and return the
+   encoded :class:`bytes`.


 .. function:: b16decode(s, casefold=False)

-   Decode a Base16 encoded byte string.
+   Decode the Base16 encoded :term:`bytes-like object` or ASCII string *s* and
+   return the decoded :class:`bytes`.

-   *s* is the string to decode.  Optional *casefold* is a flag specifying whether a
+   Optional *casefold* is a flag specifying whether a
   lowercase alphabet is acceptable as input.  For security purposes, the default
   is ``False``.

-   The decoded byte string is returned.  A :exc:`TypeError` is raised if *s* were
+   A :exc:`TypeError` is raised if *s* is
   incorrectly padded or if there are non-alphabet characters present in the
-   string.
+   input.


 .. function:: a85encode(s, *, foldspaces=False, wrapcol=0, pad=False, adobe=False)

-   Encode a byte string using Ascii85.
-
-   *s* is the string to encode. The encoded byte string is returned.
+   Encode the :term:`bytes-like object` *s* using Ascii85 and return the
+   encoded :class:`bytes`.

   *foldspaces* is an optional flag that uses the special short sequence 'y'
   instead of 4 consecutive spaces (ASCII 0x20) as supported by 'btoa'. This
   feature is not supported by the "standard" Ascii85 encoding.

-   *wrapcol* controls whether the output should have newline (``'\n'``)
+   *wrapcol* controls whether the output should have newline (``b'\n'``)
   characters added to it. If this is non-zero, each output line will be
   at most this many characters long.

-   *pad* controls whether the input string is padded to a multiple of 4
+   *pad* controls whether the input is padded to a multiple of 4
   before encoding. Note that the ``btoa`` implementation always pads.

   *adobe* controls whether the encoded byte sequence is framed with ``<~``
@@ -163,9 +174,8 @@ The modern interface provides:

 .. function:: a85decode(s, *, foldspaces=False, adobe=False, ignorechars=b' \\t\\n\\r\\v')

-   Decode an Ascii85 encoded byte string.
-
-   *s* is the byte string to decode.
+   Decode the Ascii85 encoded :term:`bytes-like object` or ASCII string *s* and
+   return the decoded :class:`bytes`.

   *foldspaces* is a flag that specifies whether the 'y' short sequence
   should be accepted as shorthand for 4 consecutive spaces (ASCII 0x20).
@@ -174,7 +184,8 @@ The modern interface provides:
   *adobe* controls whether the input sequence is in Adobe Ascii85 format
   (i.e. is framed with <~ and ~>).

-   *ignorechars* should be a byte string containing characters to ignore
+   *ignorechars* should be a :term:`bytes-like object` or ASCII string
+   containing characters to ignore
   from the input. This should only contain whitespace characters, and by
   default contains all whitespace characters in ASCII.

@@ -183,18 +194,19 @@ The modern interface provides:

 .. function:: b85encode(s, pad=False)

-   Encode a byte string using base85, as used in e.g. git-style binary
-   diffs.
+   Encode the :term:`bytes-like object` *s* using base85 (as used in e.g.
+   git-style binary diffs) and return the encoded :class:`bytes`.

-   If *pad* is true, the input is padded with "\\0" so its length is a
-   multiple of 4 characters before encoding.
+   If *pad* is true, the input is padded with ``b'\0'`` so its length is a
+   multiple of 4 bytes before encoding.

   .. versionadded:: 3.4


 .. function:: b85decode(b)

-   Decode base85-encoded byte string.  Padding is implicitly removed, if
+   Decode the base85-encoded :term:`bytes-like object` or ASCII string *b* and
+   return the decoded :class:`bytes`.  Padding is implicitly removed, if
   necessary.

   .. versionadded:: 3.4
@@ -214,15 +226,15 @@ The legacy interface:

   Decode the contents of the binary *input* file and write the resulting binary
   data to the *output* file. *input* and *output* must be :term:`file objects
-   <file object>`. *input* will be read until ``input.read()`` returns an empty
-   bytes object.
+   <file object>`. *input* will be read until ``input.readline()`` returns an
+   empty bytes object.


 .. function:: decodebytes(s)
              decodestring(s)

-   Decode the byte string *s*, which must contain one or more lines of base64
-   encoded data, and return a byte string containing the resulting binary data.
+   Decode the :term:`bytes-like object` *s*, which must contain one or more
+   lines of base64 encoded data, and return the decoded :class:`bytes`.
   ``decodestring`` is a deprecated alias.

   .. versionadded:: 3.1
@@ -233,17 +245,19 @@ The legacy interface:
   Encode the contents of the binary *input* file and write the resulting base64
   encoded data to the *output* file. *input* and *output* must be :term:`file
   objects <file object>`. *input* will be read until ``input.read()`` returns
-   an empty bytes object. :func:`encode` returns the encoded data plus a trailing
-   newline character (``b'\n'``).
+   an empty bytes object. :func:`encode` inserts a newline character (``b'\n'``)
+   after every 76 bytes of the output, as well as ensuring that the output
+   always ends with a newline, as per :rfc:`2045` (MIME).


 .. function:: encodebytes(s)
              encodestring(s)

-   Encode the byte string *s*, which can contain arbitrary binary data, and
-   return a byte string containing one or more lines of base64-encoded data.
-   :func:`encodebytes` returns a string containing one or more lines of
-   base64-encoded data always including an extra trailing newline (``b'\n'``).
+   Encode the :term:`bytes-like object` *s*, which can contain arbitrary binary
+   data, and return :class:`bytes` containing the base64-encoded data, with newlines
+   (``b'\n'``) inserted after every 76 bytes of output, and ensuring that
+   there is a trailing newline, as per :rfc:`2045` (MIME).
+
   ``encodestring`` is a deprecated alias.
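
The argument and return types described in this patch can be checked with a short script. The following is an illustrative sketch, not part of the commit; it assumes a Python 3 interpreter where ``encodebytes`` and ``decodebytes`` are available, and the variable names are made up for the example.

```python
import base64

raw = bytes(range(8))

# Modern interface: any bytes-like object goes in, ASCII bytes come out.
encoded = base64.b64encode(bytearray(raw))
encoded_from_view = base64.b64encode(memoryview(raw))

# Modern decoding accepts bytes-like objects or ASCII-only str.
decoded_from_bytes = base64.b64decode(encoded)
decoded_from_str = base64.b64decode(encoded.decode("ascii"))

# The URL-safe alphabet swaps '+' for '-' and '/' for '_'; '=' padding stays.
standard = base64.b64encode(b"\xfb\xff")
url_safe = base64.urlsafe_b64encode(b"\xfb\xff")

# Legacy interface: output is wrapped at 76 characters per line and
# always ends with a newline, following RFC 2045.
wrapped = base64.encodebytes(b"x" * 100)
line_lengths = [len(line) for line in wrapped.splitlines()]

# Legacy decoding does not accept str, only bytes-like objects.
try:
    base64.decodebytes("eHh4\n")
    legacy_accepts_str = True
except TypeError:
    legacy_accepts_str = False
```

Here 100 input bytes produce one full 76-character line plus a shorter final line, which is the line-wrapping behaviour the revised ``encode`` and ``encodebytes`` entries describe.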