:mod:`bz2` --- Compression compatible with :program:`bzip2`
===========================================================

.. module:: bz2
   :synopsis: Interface to compression and decompression routines compatible with bzip2.
.. moduleauthor:: Gustavo Niemeyer <niemeyer@conectiva.com>
.. sectionauthor:: Gustavo Niemeyer <niemeyer@conectiva.com>


.. versionadded:: 2.3

This module provides a comprehensive interface for the bz2 compression library.
It implements a complete file interface, one-shot (de)compression functions, and
types for sequential (de)compression.

For other archive formats, see the :mod:`gzip`, :mod:`zipfile`, and
:mod:`tarfile` modules.

Here is a summary of the features offered by the bz2 module:

* The :class:`BZ2File` class implements a complete file interface, including
  :meth:`~BZ2File.readline`, :meth:`~BZ2File.readlines`,
  :meth:`~BZ2File.writelines`, :meth:`~BZ2File.seek`, etc.;

* The :class:`BZ2File` class implements emulated :meth:`~BZ2File.seek` support;

* The :class:`BZ2File` class implements universal newline support;

* The :class:`BZ2File` class offers an optimized line iteration using the
  readahead algorithm borrowed from file objects;

* Sequential (de)compression is supported by the :class:`BZ2Compressor` and
  :class:`BZ2Decompressor` classes;

* One-shot (de)compression is supported by the :func:`compress` and
  :func:`decompress` functions;

* Thread safety is provided by an individual locking mechanism.


(De)compression of files
------------------------

Handling of compressed files is offered by the :class:`BZ2File` class.

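
As a rough illustration of file handling with :class:`BZ2File`, the following
sketch writes a payload to a compressed file and reads it back. The scratch
directory, the file name ``example.bz2`` and the payload are assumptions made
for the example only. Byte-string literals are used so that the same code also
runs on later Python versions.

```python
import bz2
import os
import tempfile

# Hypothetical scratch location; any writable path would do.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "example.bz2")
payload = b"hello, bzip2\n" * 100

# Mode 'w' creates the file, or truncates it if it already exists.
out = bz2.BZ2File(path, "w")
try:
    out.write(payload)
finally:
    # Because of buffering, close() is what guarantees the data is on disk.
    out.close()

# Mode 'r' is the default, so it is omitted here.
src = bz2.BZ2File(path)
try:
    restored = src.read()  # no size argument: read until EOF
finally:
    src.close()

# Clean up the scratch file and directory.
os.remove(path)
os.rmdir(workdir)
```

After this runs, ``restored`` should hold the same bytes as ``payload``.
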

.. class:: BZ2File(filename[, mode[, buffering[, compresslevel]]])

   Open a bz2 file. Mode can be either ``'r'`` or ``'w'``, for reading (default)
   or writing. When opened for writing, the file will be created if it doesn't
   exist, and truncated otherwise. If *buffering* is given, ``0`` means
   unbuffered, and larger numbers specify the buffer size; the default is
   ``0``. If *compresslevel* is given, it must be a number between ``1`` and
   ``9``; the default is ``9``. Add a ``'U'`` to mode to open the file for input
   with universal newline support. Any line ending in the input file will be
   seen as a ``'\n'`` in Python. Also, a file so opened gains the attribute
   :attr:`newlines`; the value for this attribute is one of ``None`` (no newline
   read yet), ``'\r'``, ``'\n'``, ``'\r\n'`` or a tuple containing all the
   newline types seen. Universal newlines are available only when
   reading. Instances support iteration in the same way as normal :class:`file`
   instances.


   .. method:: close()

      Close the file. Sets data attribute :attr:`closed` to true. A closed file
      cannot be used for further I/O operations. :meth:`close` may be called
      more than once without error.


   .. method:: read([size])

      Read at most *size* uncompressed bytes, returned as a string. If the
      *size* argument is negative or omitted, read until EOF is reached.


   .. method:: readline([size])

      Return the next line from the file, as a string, retaining newline. A
      non-negative *size* argument limits the maximum number of bytes to return
      (an incomplete line may be returned then). Return an empty string at EOF.


   .. method:: readlines([size])

      Return a list of lines read. The optional *size* argument, if given, is an
      approximate bound on the total number of bytes in the lines returned.


   .. method:: xreadlines()

      For backward compatibility. :class:`BZ2File` objects now include the
      performance optimizations previously implemented in the :mod:`xreadlines`
      module.


      .. deprecated:: 2.3
         This exists only for compatibility with the method by this name on
         :class:`file` objects, which is deprecated. Use ``for line in file``
         instead.


   .. method:: seek(offset[, whence])

      Move to new file position. Argument *offset* is a byte count. Optional
      argument *whence* defaults to ``os.SEEK_SET`` or ``0`` (offset from start
      of file; offset should be ``>= 0``); other values are ``os.SEEK_CUR`` or
      ``1`` (move relative to current position; offset can be positive or
      negative), and ``os.SEEK_END`` or ``2`` (move relative to end of file;
      offset is usually negative, although many platforms allow seeking beyond
      the end of a file).

      Note that seeking of bz2 files is emulated, and depending on the
      parameters the operation may be extremely slow.


   .. method:: tell()

      Return the current file position, an integer (may be a long integer).


   .. method:: write(data)

      Write string *data* to file. Note that due to buffering, :meth:`close` may
      be needed before the file on disk reflects the data written.


   .. method:: writelines(sequence_of_strings)

      Write the sequence of strings to the file. Note that newlines are not
      added. The sequence can be any iterable object producing strings. This is
      equivalent to calling :meth:`write` for each string.

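
The line-oriented and positioning methods of :class:`BZ2File` can be combined
roughly as follows. This is only a sketch: the scratch directory, the file name
``lines.bz2`` and the three sample lines are assumptions made for the example.

```python
import bz2
import os
import tempfile

# Hypothetical scratch location; any writable path would do.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "lines.bz2")

# writelines() does not add newlines, so each item carries its own.
out = bz2.BZ2File(path, "w")
try:
    out.writelines([b"first\n", b"second\n", b"third\n"])
finally:
    out.close()

src = bz2.BZ2File(path, "r")
try:
    first = src.readline()         # the trailing newline is retained
    position = src.tell()          # byte offset within the uncompressed data
    rest = [line for line in src]  # iteration yields the remaining lines
    src.seek(0)                    # emulated seek back to the start
    everything = src.readlines()
finally:
    src.close()

# Clean up the scratch file and directory.
os.remove(path)
os.rmdir(workdir)
```

Rewinding with ``seek(0)`` is cheap on a file this small, but because seeking is
emulated, the same call on a large archive may take noticeably longer.
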

Sequential (de)compression
--------------------------

Sequential compression and decompression are done using the classes
:class:`BZ2Compressor` and :class:`BZ2Decompressor`.


.. class:: BZ2Compressor([compresslevel])

   Create a new compressor object. This object may be used to compress data
   sequentially. If you want to compress data in one shot, use the
   :func:`compress` function instead. The *compresslevel* parameter, if given,
   must be a number between ``1`` and ``9``; the default is ``9``.


   .. method:: compress(data)

      Provide more data to the compressor object. It will return chunks of
      compressed data whenever possible. When you've finished providing data to
      compress, call the :meth:`flush` method to finish the compression process,
      and return what is left in internal buffers.


   .. method:: flush()

      Finish the compression process and return what is left in internal
      buffers. You must not use the compressor object after calling this method.

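
A minimal sketch of sequential compression with :class:`BZ2Compressor`: data is
fed in pieces, and :meth:`flush` is called once at the end. The three sample
chunks are assumptions made for the example; in practice they would come from a
file, a socket or a similar source.

```python
import bz2

# Hypothetical input arriving in three pieces.
chunks = [b"alpha " * 1000, b"beta " * 1000, b"gamma " * 1000]

compressor = bz2.BZ2Compressor(9)
pieces = []
for chunk in chunks:
    # compress() may return an empty string while data is still buffered.
    pieces.append(compressor.compress(chunk))

# flush() returns whatever is left in the internal buffers.
# The compressor object must not be used after this call.
pieces.append(compressor.flush())

compressed = b"".join(pieces)
```

The joined result forms a complete bz2 stream, so it can be passed to
:func:`decompress` or written to a file as is.
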


.. class:: BZ2Decompressor()

   Create a new decompressor object. This object may be used to decompress data
   sequentially. If you want to decompress data in one shot, use the
   :func:`decompress` function instead.


   .. method:: decompress(data)

      Provide more data to the decompressor object. It will return chunks of
      decompressed data whenever possible. If you try to decompress data after
      the end of stream is found, :exc:`EOFError` will be raised. If any data
      was found after the end of stream, it will be ignored and saved in the
      :attr:`unused_data` attribute.

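
The behaviour of :class:`BZ2Decompressor` at the end of a stream can be sketched
as follows. The payload, the ``TRAILING`` marker and the split point are
assumptions made for the example. The stream is deliberately fed in two calls,
with extra bytes appended after the compressed data.

```python
import bz2

# Hypothetical payload followed by unrelated trailing bytes.
original = b"streamed payload " * 500
trailer = b"TRAILING"
packed = bz2.compress(original)
half = len(packed) // 2

decompressor = bz2.BZ2Decompressor()
parts = [
    # First call: only part of the stream is available, output may be empty.
    decompressor.decompress(packed[:half]),
    # Second call: the rest of the stream plus bytes that follow it.
    decompressor.decompress(packed[half:] + trailer),
]
restored = b"".join(parts)

# Bytes found after the end of stream are kept here, not decompressed.
leftover = decompressor.unused_data

# Feeding more data once the end of stream was found raises EOFError.
try:
    decompressor.decompress(b"more")
    raised = False
except EOFError:
    raised = True
```

Here ``restored`` should equal ``original``, ``leftover`` should equal
``trailer``, and ``raised`` should be true.
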

One-shot (de)compression
------------------------

One-shot compression and decompression are provided through the :func:`compress`
and :func:`decompress` functions.


.. function:: compress(data[, compresslevel])

   Compress *data* in one shot. If you want to compress data sequentially, use
   an instance of :class:`BZ2Compressor` instead. The *compresslevel* parameter,
   if given, must be a number between ``1`` and ``9``; the default is ``9``.


.. function:: decompress(data)

   Decompress *data* in one shot. If you want to decompress data sequentially,
   use an instance of :class:`BZ2Decompressor` instead.

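
A short sketch of the one-shot functions, assuming the whole input fits in
memory. The sample text is an assumption made for the example.

```python
import bz2

# Hypothetical, highly repetitive input.
data = b"The quick brown fox jumps over the lazy dog.\n" * 200

# An explicit compresslevel between 1 and 9 may be given.
fast = bz2.compress(data, 1)

# Without the second argument, compresslevel defaults to 9.
small = bz2.compress(data)

# Both results decompress back to the original input.
roundtrip_fast = bz2.decompress(fast)
roundtrip_small = bz2.decompress(small)
```

Both calls produce valid bz2 data for the same input; which level gives the
smaller result depends on the data being compressed.
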