:mod:`base64` --- RFC 3548: Base16, Base32, Base64 Data Encodings
=================================================================
.. module:: base64
:synopsis: RFC 3548: Base16, Base32, Base64 Data Encodings
.. index::
pair: base64; encoding
single: MIME; base64 encoding
This module provides data encoding and decoding as specified in :rfc:`3548`.
This standard defines the Base16, Base32, and Base64 algorithms for encoding
and decoding arbitrary binary strings into ASCII-only byte strings that can be
safely sent by email, used as parts of URLs, or included as part of an HTTP
POST request. The Base64 encoding is not the same as the format produced by the
:program:`uuencode` program.
There are two interfaces provided by this module. The modern interface
supports encoding and decoding ASCII byte string objects using all three
alphabets. Additionally, the decoding functions of the modern interface also
accept Unicode strings containing only ASCII characters. The legacy interface
provides for encoding and decoding to and from file-like objects as well as
byte strings, but only using the Base64 standard alphabet.
.. versionchanged:: 3.3
ASCII-only Unicode strings are now accepted by the decoding functions of
the modern interface.
.. versionchanged:: 3.4
Any :term:`bytes-like object`\ s are now accepted by all
encoding and decoding functions in this module.
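As a brief illustration of the input types described above, the following sketch assumes Python 3.4 or later; the variable names are illustrative only.

```python
import base64

# Encoding accepts any bytes-like object, not only bytes.
from_bytes = base64.b64encode(b'data')
from_bytearray = base64.b64encode(bytearray(b'data'))
from_view = base64.b64encode(memoryview(b'data'))

# Decoding additionally accepts a str containing only ASCII characters.
from_str = base64.b64decode('ZGF0YQ==')

# Encoding a str is an error; encode the text to bytes first.
try:
    base64.b64encode('data')
    str_rejected = False
except TypeError:
    str_rejected = True
```

The result of every encoding function is a :class:`bytes` object, regardless of the input type.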
The modern interface provides:
.. function:: b64encode(s, altchars=None)
Encode a byte string using Base64.
*s* is the byte string to encode. Optional *altchars* must be a byte string of
length 2 which specifies an alternative alphabet for the ``+`` and ``/``
characters. This allows an application to generate, for example, URL-safe or
filesystem-safe Base64 strings. The default is ``None``, for
which the standard Base64 alphabet is used.
The encoded byte string is returned.
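A minimal sketch of the effect of *altchars*; the three input bytes are chosen only because they happen to encode to ``+`` and ``/`` in the standard alphabet.

```python
import base64

raw = b'\xfb\xff\xfe'

# Standard alphabet: the 6-bit values 62 and 63 map to '+' and '/'.
standard = base64.b64encode(raw)

# Alternative alphabet: '-' replaces '+', and '_' replaces '/'.
custom = base64.b64encode(raw, altchars=b'-_')
```

Here ``standard`` is ``b'+//+'`` and ``custom`` is ``b'-__-'``.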
.. function:: b64decode(s, altchars=None, validate=False)
Decode a Base64 encoded byte string.
*s* is the byte string to decode. Optional *altchars* must be a byte string of
length 2 which specifies the
alternative alphabet used instead of the ``+`` and ``/`` characters.
The decoded byte string is returned. A :exc:`binascii.Error` exception is raised
if *s* is incorrectly padded.
If *validate* is ``False`` (the default), non-base64-alphabet characters are
discarded prior to the padding check. If *validate* is ``True``,
non-base64-alphabet characters in the input result in a
:exc:`binascii.Error`.
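The difference between the two *validate* settings can be sketched as follows. The input is the encoded form of ``b'data to be encoded'`` with a newline inserted in the middle; the helper variable names are illustrative.

```python
import base64
import binascii

noisy = b'ZGF0YSB0byBi\nZSBlbmNvZGVk'

# Default: the newline is not in the Base64 alphabet and is discarded.
lenient = base64.b64decode(noisy)

# validate=True: the same newline is rejected.
try:
    base64.b64decode(noisy, validate=True)
    strict_rejected = False
except binascii.Error:
    strict_rejected = True

# Missing padding is an error in either mode.
try:
    base64.b64decode(b'ZGF0YQ')
    padding_rejected = False
except binascii.Error:
    padding_rejected = True
```

Passing ``validate=True`` is a reasonable choice when the input comes from an untrusted source and silent acceptance of stray characters is not wanted.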
.. function:: standard_b64encode(s)
Encode byte string *s* using the standard Base64 alphabet.
.. function:: standard_b64decode(s)
Decode byte string *s* using the standard Base64 alphabet.
.. function:: urlsafe_b64encode(s)
Encode byte string *s* using a URL-safe alphabet, which substitutes ``-`` for
``+`` and ``_`` for ``/`` in the standard Base64 alphabet. The result
can still contain ``=``.
.. function:: urlsafe_b64decode(s)
Decode byte string *s* using a URL-safe alphabet, which substitutes ``-`` for
``+`` and ``_`` for ``/`` in the standard Base64 alphabet.
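A short sketch of the URL-safe pair. Note that the padding character ``=`` is still produced, so callers that embed the result in a URL may need to handle it themselves.

```python
import base64

raw = b'\xfb\xff\xfe'

# '-' and '_' appear where the standard alphabet would use '+' and '/'.
token = base64.urlsafe_b64encode(raw)
restored = base64.urlsafe_b64decode(token)

# Input whose length is not a multiple of 3 is still padded with '='.
padded = base64.urlsafe_b64encode(b'data')
```

Here ``token`` is ``b'-__-'`` and ``padded`` is ``b'ZGF0YQ=='``.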
.. function:: b32encode(s)
Encode a byte string using Base32. *s* is the byte string to encode. The encoded
byte string is returned.
.. function:: b32decode(s, casefold=False, map01=None)
Decode a Base32 encoded byte string.
*s* is the byte string to decode. Optional *casefold* is a flag specifying
whether a lowercase alphabet is acceptable as input. For security purposes,
the default is ``False``.
:rfc:`3548` allows for optional mapping of the digit 0 (zero) to the letter O
(oh), and for optional mapping of the digit 1 (one) to either the letter I (eye)
or letter L (el). The optional argument *map01*, when not ``None``, specifies
which letter the digit 1 should be mapped to (when *map01* is not ``None``, the
digit 0 is always mapped to the letter O). For security purposes the default is
``None``, so that 0 and 1 are not allowed in the input.
The decoded byte string is returned. A :exc:`binascii.Error` is raised if *s* is
incorrectly padded or if there are non-alphabet characters present in the
string.
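The *casefold* and *map01* options can be sketched as follows. The all-``O`` input is an artificial example chosen so that the digit-to-letter mapping is easy to see.

```python
import base64
import binascii

encoded = base64.b32encode(b'hello')    # upper-case Base32 alphabet
lowered = encoded.lower()

# Lower-case input is rejected unless casefold=True is given.
try:
    base64.b32decode(lowered)
    lower_rejected = False
except binascii.Error:
    lower_rejected = True
folded = base64.b32decode(lowered, casefold=True)

# With map01 set, the digit 0 is read as the letter O (and the digit 1 as
# the given letter) before decoding.
reference = base64.b32decode(b'OOOOOOOO')
mapped = base64.b32decode(b'00000000', map01=b'L')

# Without map01, the digits 0 and 1 are not part of the alphabet.
try:
    base64.b32decode(b'00000000')
    digits_rejected = False
except binascii.Error:
    digits_rejected = True
```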
.. function:: b16encode(s)
Encode a byte string using Base16.
*s* is the string to encode. The encoded byte string is returned.
.. function:: b16decode(s, casefold=False)
Decode a Base16 encoded byte string.
*s* is the string to decode. Optional *casefold* is a flag specifying whether a
lowercase alphabet is acceptable as input. For security purposes, the default
is ``False``.
The decoded byte string is returned. A :exc:`binascii.Error` is raised if *s* is
incorrectly padded or if there are non-alphabet characters present in the
string.
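A minimal Base16 sketch showing the upper-case output and the effect of *casefold*:

```python
import base64
import binascii

raw = b'\xde\xad\xbe\xef'
encoded = base64.b16encode(raw)         # always upper-case hex digits

# Lower-case hex digits are rejected by default.
try:
    base64.b16decode(b'deadbeef')
    lower_rejected = False
except binascii.Error:
    lower_rejected = True

# casefold=True accepts them.
folded = base64.b16decode(b'deadbeef', casefold=True)
```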
.. function:: a85encode(s, *, foldspaces=False, wrapcol=0, pad=False, adobe=False)
Encode a byte string using Ascii85.
*s* is the string to encode. The encoded byte string is returned.
*foldspaces* is an optional flag that uses the special short sequence ``y``
instead of 4 consecutive spaces (ASCII 0x20) as supported by ``btoa``. This
feature is not supported by the "standard" Ascii85 encoding.
*wrapcol* controls whether the output should have newline (``b'\n'``)
characters added to it. If this is non-zero, each output line will be
at most this many characters long.
*pad* controls whether the input is padded to a multiple of 4 bytes
before encoding. Note that the ``btoa`` implementation always pads.
*adobe* controls whether the encoded byte sequence is framed with ``<~``
and ``~>``, which is used by the Adobe implementation.
.. versionadded:: 3.4
.. function:: a85decode(s, *, foldspaces=False, adobe=False, ignorechars=b' \t\n\r\v')
Decode an Ascii85 encoded byte string.
*s* is the byte string to decode.
*foldspaces* is a flag that specifies whether the ``y`` short sequence
should be accepted as shorthand for 4 consecutive spaces (ASCII 0x20).
This feature is not supported by the "standard" Ascii85 encoding.
*adobe* controls whether the input sequence is in Adobe Ascii85 format
(i.e. is framed with ``<~`` and ``~>``).
*ignorechars* should be a byte string containing characters to ignore
from the input. This should only contain whitespace characters, and by
default contains all whitespace characters in ASCII.
.. versionadded:: 3.4
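The Ascii85 options can be sketched as follows, assuming Python 3.4 or later. The payloads are arbitrary and the variable names are illustrative.

```python
import base64

# adobe=True frames the encoded data with <~ and ~>.
framed = base64.a85encode(b'data', adobe=True)
unframed = base64.a85decode(framed, adobe=True)

# foldspaces=True encodes a group of four spaces as the single byte 'y'.
spaces = base64.a85encode(b'    ', foldspaces=True)
unfolded = base64.a85decode(spaces, foldspaces=True)

# wrapcol limits the length of each output line; the newlines it adds are
# among the characters that a85decode ignores by default.
payload = bytes(range(1, 65))
wrapped = base64.a85encode(payload, wrapcol=20)
longest_line = max(len(line) for line in wrapped.split(b'\n'))
unwrapped = base64.a85decode(wrapped)
```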
.. function:: b85encode(s, pad=False)
Encode a byte string using base85, as used in e.g. git-style binary
diffs.
If *pad* is true, the input is padded with null bytes (``b'\0'``) so that its length is a
multiple of 4 bytes before encoding.
.. versionadded:: 3.4
.. function:: b85decode(b)
Decode the base85-encoded byte string *b*. Padding is implicitly removed, if
necessary.
.. versionadded:: 3.4
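A small sketch of how *pad* affects a round trip, assuming Python 3.4 or later. When the input was padded before encoding, the null bytes added by *pad* become part of the encoded data and are therefore present after decoding.

```python
import base64

raw = b'abc'                            # 3 bytes, not a multiple of 4

unpadded = base64.b85encode(raw)
padded = base64.b85encode(raw, pad=True)

# Without pad, decoding restores the original bytes exactly.
plain_roundtrip = base64.b85decode(unpadded)

# With pad, the input was extended to 4 bytes before encoding.
padded_roundtrip = base64.b85decode(padded)
```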
.. note::
Both Base85 and Ascii85 have an expansion factor of 5 to 4 (5 Base85 or
Ascii85 characters can encode 4 binary bytes), while the better-known
Base64 has an expansion factor of 4 to 3 (4 Base64 characters encode 3
binary bytes). They are therefore more
efficient when space is expensive. They differ in details such as the
character map used for encoding.
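The expansion factors can be checked with a quick measurement. This sketch uses an arbitrary 1200-byte payload, a length divisible by 3, 4 and 5 so that no encoding needs padding, and one without zero bytes so that the Ascii85 shorthand for all-zero groups does not apply.

```python
import base64

payload = bytes(range(1, 241)) * 5      # 1200 bytes, none of them zero

sizes = {
    'base16': len(base64.b16encode(payload)),   # 2 characters per byte
    'base32': len(base64.b32encode(payload)),   # 8 characters per 5 bytes
    'base64': len(base64.b64encode(payload)),   # 4 characters per 3 bytes
    'ascii85': len(base64.a85encode(payload)),  # 5 characters per 4 bytes
    'base85': len(base64.b85encode(payload)),   # 5 characters per 4 bytes
}
```

For this payload the encoded lengths are 2400, 1920, 1600, 1500 and 1500 bytes respectively.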
The legacy interface:
.. function:: decode(input, output)
Decode the contents of the binary *input* file and write the resulting binary
data to the *output* file. *input* and *output* must be :term:`file objects
<file object>`. *input* will be read until ``input.read()`` returns an empty
bytes object.
.. function:: decodebytes(s)
decodestring(s)
Decode the byte string *s*, which must contain one or more lines of base64
encoded data, and return a byte string containing the resulting binary data.
``decodestring`` is a deprecated alias.
.. versionadded:: 3.1
.. function:: encode(input, output)
Encode the contents of the binary *input* file and write the resulting base64
encoded data to the *output* file. *input* and *output* must be :term:`file
objects <file object>`. *input* will be read until ``input.read()`` returns
an empty bytes object. :func:`encode` does not return the encoded data; it
writes it to *output* in lines of at most 76 characters, each terminated by a
newline character (``b'\n'``).
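The file-based functions can be exercised with in-memory files. This sketch uses :class:`io.BytesIO` objects in place of real binary files; the payload is arbitrary.

```python
import base64
import io

payload = b'x' * 100

# Encode from one binary file object into another.
source = io.BytesIO(payload)
encoded = io.BytesIO()
base64.encode(source, encoded)

# The output is split into newline-terminated lines.
lines = encoded.getvalue().splitlines()
longest_line = max(len(line) for line in lines)

# Decode it again, reading from the start of the encoded data.
encoded.seek(0)
decoded = io.BytesIO()
base64.decode(encoded, decoded)
```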
.. function:: encodebytes(s)
encodestring(s)
Encode the byte string *s*, which can contain arbitrary binary data, and
return a byte string containing one or more lines of base64-encoded data.
Each line, including the last, ends with a newline (``b'\n'``).
``encodestring`` is a deprecated alias.
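The following sketch contrasts :func:`encodebytes` with :func:`b64encode` on the same input. Only the non-deprecated names are used.

```python
import base64

raw = b'data to be encoded'

# The legacy function appends a newline; the modern one does not.
legacy = base64.encodebytes(raw)
modern = base64.b64encode(raw)

# decodebytes accepts the newline-terminated form.
restored = base64.decodebytes(legacy)
```

Here ``legacy`` is ``b'ZGF0YSB0byBiZSBlbmNvZGVk\n'`` and ``modern`` is the same value without the trailing newline.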
An example usage of the module:
>>> import base64
>>> encoded = base64.b64encode(b'data to be encoded')
>>> encoded
b'ZGF0YSB0byBiZSBlbmNvZGVk'
>>> data = base64.b64decode(encoded)
>>> data
b'data to be encoded'
.. seealso::
Module :mod:`binascii`
Support module containing ASCII-to-binary and binary-to-ASCII conversions.
:rfc:`1521` - MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies
Section 5.2, "Base64 Content-Transfer-Encoding," provides the definition of the
base64 encoding.