:mod:`bz2` --- Support for :program:`bzip2` compression
=======================================================
.. module:: bz2
:synopsis: Interfaces for bzip2 compression and decompression.
.. moduleauthor:: Gustavo Niemeyer <niemeyer@conectiva.com>
.. moduleauthor:: Nadeem Vawda <nadeem.vawda@gmail.com>
.. sectionauthor:: Gustavo Niemeyer <niemeyer@conectiva.com>
.. sectionauthor:: Nadeem Vawda <nadeem.vawda@gmail.com>
**Source code:** :source:`Lib/bz2.py`

--------------

This module provides a comprehensive interface for compressing and
decompressing data using the bzip2 compression algorithm.
The :mod:`bz2` module contains:
* The :func:`.open` function and :class:`BZ2File` class for reading and
writing compressed files.
* The :class:`BZ2Compressor` and :class:`BZ2Decompressor` classes for
incremental (de)compression.
* The :func:`compress` and :func:`decompress` functions for one-shot
(de)compression.
All of the classes in this module may safely be accessed from multiple threads.
(De)compression of files
------------------------
.. function:: open(filename, mode='r', compresslevel=9, encoding=None, errors=None, newline=None)
Open a bzip2-compressed file in binary or text mode, returning a :term:`file
object`.
As with the constructor for :class:`BZ2File`, the *filename* argument can be
an actual filename (a :class:`str` or :class:`bytes` object), or an existing
file object to read from or write to.
The *mode* argument can be any of ``'r'``, ``'rb'``, ``'w'``, ``'wb'``,
``'x'``, ``'xb'``, ``'a'`` or ``'ab'`` for binary mode, or ``'rt'``,
``'wt'``, ``'xt'``, or ``'at'`` for text mode. The default is ``'rb'``.
The *compresslevel* argument is an integer from 1 to 9, as for the
:class:`BZ2File` constructor.
For binary mode, this function is equivalent to the :class:`BZ2File`
constructor: ``BZ2File(filename, mode, compresslevel=compresslevel)``. In
this case, the *encoding*, *errors* and *newline* arguments must not be
provided.
For text mode, a :class:`BZ2File` object is created, and wrapped in an
:class:`io.TextIOWrapper` instance with the specified encoding, error
handling behavior, and line ending(s).
.. versionadded:: 3.3
.. versionchanged:: 3.4
The ``'x'`` (exclusive creation) mode was added.
.. versionchanged:: 3.6
Accepts a :term:`path-like object`.
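The text and binary modes described above can be sketched as follows. This is a minimal illustration rather than a canonical recipe: the file name ``example.txt.bz2`` and the temporary directory are assumptions made only for the example.

```python
import bz2
import os
import tempfile

# Hypothetical file name inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.txt.bz2")

    # Text mode: str objects are encoded and then compressed on write.
    with bz2.open(path, "wt", encoding="utf-8") as f:
        f.write("first line\nsecond line\n")

    # Text mode: data is decompressed and then decoded on read.
    with bz2.open(path, "rt", encoding="utf-8") as f:
        lines = f.read().splitlines()

    # Binary mode (the default): reads return decompressed bytes.
    with bz2.open(path) as f:
        raw = f.read()
```

Passing *encoding* explicitly in text mode keeps the result independent of the platform's default encoding.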
.. class:: BZ2File(filename, mode='r', buffering=None, compresslevel=9)
Open a bzip2-compressed file in binary mode.
If *filename* is a :class:`str` or :class:`bytes` object, open the named file
directly. Otherwise, *filename* should be a :term:`file object`, which will
be used to read or write the compressed data.
The *mode* argument can be either ``'r'`` for reading (default), ``'w'`` for
overwriting, ``'x'`` for exclusive creation, or ``'a'`` for appending. These
can equivalently be given as ``'rb'``, ``'wb'``, ``'xb'`` and ``'ab'``
respectively.
If *filename* is a file object (rather than an actual file name), a mode of
``'w'`` does not truncate the file, and is instead equivalent to ``'a'``.
The *buffering* argument is ignored. Its use has been deprecated since Python 3.0.
If *mode* is ``'w'``, ``'x'`` or ``'a'``, *compresslevel* can be an integer between
``1`` and ``9`` specifying the level of compression: ``1`` produces the
least compression, and ``9`` (default) produces the most compression.
If *mode* is ``'r'``, the input file may be the concatenation of multiple
compressed streams.
:class:`BZ2File` provides all of the members specified by
:class:`io.BufferedIOBase`, except for :meth:`detach` and :meth:`truncate`.
Iteration and the :keyword:`with` statement are supported.
:class:`BZ2File` also provides the following method:
.. method:: peek([n])
Return buffered data without advancing the file position. At least one
byte of data will be returned (unless at EOF). The exact number of bytes
returned is unspecified.
.. note:: While calling :meth:`peek` does not change the file position of
the :class:`BZ2File`, it may change the position of the underlying file
object (e.g. if the :class:`BZ2File` was constructed by passing a file
object for *filename*).
.. versionadded:: 3.3
.. deprecated:: 3.0
The keyword argument *buffering* was deprecated and is now ignored.
.. versionchanged:: 3.1
Support for the :keyword:`with` statement was added.
.. versionchanged:: 3.3
The :meth:`fileno`, :meth:`readable`, :meth:`seekable`, :meth:`writable`,
:meth:`read1` and :meth:`readinto` methods were added.
.. versionchanged:: 3.3
Support was added for *filename* being a :term:`file object` instead of an
actual filename.
.. versionchanged:: 3.3
The ``'a'`` (append) mode was added, along with support for reading
multi-stream files.
.. versionchanged:: 3.4
The ``'x'`` (exclusive creation) mode was added.
.. versionchanged:: 3.5
The :meth:`~io.BufferedIOBase.read` method now accepts an argument of
``None``.
.. versionchanged:: 3.6
Accepts a :term:`path-like object`.
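The following sketch illustrates several of the behaviours described for :class:`BZ2File`: passing a file object as *filename*, the fact that ``'w'`` does not truncate a file object, reading a multi-stream input, and :meth:`peek`. An in-memory :class:`io.BytesIO` buffer is assumed here in place of a real file purely to keep the example self-contained.

```python
import bz2
import io

# An in-memory buffer stands in for a real file object.
buffer = io.BytesIO()

# Write a first compressed stream.
with bz2.BZ2File(buffer, "w") as f:
    f.write(b"spam\n")

# Opening the same file object with 'w' again does not truncate it,
# so this adds a second compressed stream after the first.
with bz2.BZ2File(buffer, "w") as f:
    f.write(b"eggs\n")

# Read both streams back as one sequence of lines.
with bz2.BZ2File(io.BytesIO(buffer.getvalue())) as f:
    head = f.peek(1)   # at least one byte; the file position is unchanged
    lines = list(f)    # iteration yields lines across both streams
```

Because :meth:`peek` does not advance the position, the subsequent iteration still starts at the first byte of the decompressed data.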
Incremental (de)compression
---------------------------
.. class:: BZ2Compressor(compresslevel=9)
Create a new compressor object. This object may be used to compress data
incrementally. For one-shot compression, use the :func:`compress` function
instead.
*compresslevel*, if given, must be an integer between ``1`` and ``9``. The
default is ``9``.
.. method:: compress(data)
Provide data to the compressor object. Returns a chunk of compressed data
if possible, or an empty byte string otherwise.
When you have finished providing data to the compressor, call the
:meth:`flush` method to finish the compression process.
.. method:: flush()
Finish the compression process. Returns the compressed data left in
internal buffers.
The compressor object may not be used after this method has been called.
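A typical :meth:`compress` / :meth:`flush` sequence might look like the sketch below. The helper name ``compress_chunks`` and the sample input are hypothetical and exist only for illustration.

```python
import bz2


def compress_chunks(chunks, compresslevel=9):
    """Yield compressed output for an iterable of bytes chunks.

    Hypothetical helper, not part of the bz2 module.
    """
    compressor = bz2.BZ2Compressor(compresslevel)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # compress() may return b"" while data is still buffered
            yield out
    # flush() must be called exactly once, after the last chunk.
    yield compressor.flush()


chunks = [b"a" * 1000, b"b" * 1000, b"c" * 1000]
compressed = b"".join(compress_chunks(chunks))
```

The concatenated output forms a single bzip2 stream, so it can be passed to :func:`decompress` in one call.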
.. class:: BZ2Decompressor()
Create a new decompressor object. This object may be used to decompress data
incrementally. For one-shot decompression, use the :func:`decompress` function
instead.
.. note::
This class does not transparently handle inputs containing multiple
compressed streams, unlike :func:`decompress` and :class:`BZ2File`. If
you need to decompress a multi-stream input with :class:`BZ2Decompressor`,
you must use a new decompressor for each stream.
.. method:: decompress(data, max_length=-1)
Decompress *data* (a :term:`bytes-like object`), returning
uncompressed data as bytes. Some of *data* may be buffered
internally, for use in later calls to :meth:`decompress`. The
returned data should be concatenated with the output of any
previous calls to :meth:`decompress`.
If *max_length* is nonnegative, returns at most *max_length*
bytes of decompressed data. If this limit is reached and further
output can be produced, the :attr:`~.needs_input` attribute will
be set to ``False``. In this case, the next call to
:meth:`~.decompress` may provide *data* as ``b''`` to obtain
more of the output.
If all of the input data was decompressed and returned (either
because this was less than *max_length* bytes, or because
*max_length* was negative), the :attr:`~.needs_input` attribute
will be set to ``True``.
Attempting to decompress data after the end of stream is reached
raises an :exc:`EOFError`. Any data found after the end of the
stream is ignored and saved in the :attr:`~.unused_data` attribute.
.. versionchanged:: 3.5
Added the *max_length* parameter.
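The interaction between *max_length* and :attr:`~.needs_input` can be sketched as a loop that bounds the size of each decompressed piece. The helper name ``decompress_limited`` and the 1024-byte limit are assumptions made for this example, and the sketch handles a single compressed stream only.

```python
import bz2


def decompress_limited(data, chunk_size=1024):
    """Yield decompressed pieces of at most chunk_size bytes.

    Hypothetical helper for a single-stream input.
    """
    decompressor = bz2.BZ2Decompressor()
    piece = decompressor.decompress(data, max_length=chunk_size)
    while True:
        if piece:
            yield piece
        # Stop at end of stream, or when all supplied input was consumed.
        if decompressor.eof or decompressor.needs_input:
            break
        # More output is pending: pass b"" to retrieve the next piece.
        piece = decompressor.decompress(b"", max_length=chunk_size)


original = bytes(range(256)) * 40
pieces = list(decompress_limited(bz2.compress(original)))
```

Checking :attr:`~.eof` before calling :meth:`~.decompress` again avoids the :exc:`EOFError` raised once the end of the stream has been reached.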
.. attribute:: eof
``True`` if the end-of-stream marker has been reached.
.. versionadded:: 3.3
.. attribute:: unused_data
Data found after the end of the compressed stream.
If this attribute is accessed before the end of the stream has been
reached, its value will be ``b''``.
.. attribute:: needs_input
``False`` if the :meth:`.decompress` method can provide more
decompressed data before requiring new compressed input.
.. versionadded:: 3.5
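Because :class:`BZ2Decompressor` stops at the end of the first stream, a multi-stream input can be handled by starting a new decompressor on :attr:`unused_data`, as the note above describes. The helper name ``decompress_multistream`` is hypothetical, and the sketch assumes the input consists only of complete bzip2 streams with no trailing non-bzip2 data.

```python
import bz2


def decompress_multistream(data):
    """Decompress concatenated bzip2 streams, one decompressor per stream.

    Hypothetical helper, not part of the bz2 module.
    """
    results = []
    while data:
        decompressor = bz2.BZ2Decompressor()
        results.append(decompressor.decompress(data))
        if not decompressor.eof:
            raise ValueError("input ended before the end-of-stream marker")
        # Whatever followed this stream is the start of the next one.
        data = decompressor.unused_data
    return b"".join(results)


streams = bz2.compress(b"one ") + bz2.compress(b"two")
combined = decompress_multistream(streams)
```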
One-shot (de)compression
------------------------
.. function:: compress(data, compresslevel=9)
Compress *data*.
*compresslevel*, if given, must be an integer between ``1`` and ``9``. The
default is ``9``.
For incremental compression, use a :class:`BZ2Compressor` instead.
.. function:: decompress(data)
Decompress *data*.
If *data* is the concatenation of multiple compressed streams, decompress
all of the streams.
For incremental decompression, use a :class:`BZ2Decompressor` instead.
.. versionchanged:: 3.3
Support for multi-stream inputs was added.
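A short round trip through the one-shot functions, including a multi-stream input, could look like this. The sample data is an assumption chosen only to be easily compressible.

```python
import bz2

# Repetitive sample input, chosen so that it compresses well.
original = b"bzip2 works well on repetitive input. " * 100

compressed = bz2.compress(original, compresslevel=9)
restored = bz2.decompress(compressed)

# decompress() also accepts the concatenation of several streams.
joined = bz2.decompress(bz2.compress(b"one ") + bz2.compress(b"two"))
```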