2002-11-05 12:50:05 -04:00
|
|
|
\section{\module{bz2} ---
|
|
|
|
Compression compatible with \program{bzip2}}
|
|
|
|
|
|
|
|
\declaremodule{builtin}{bz2}
|
|
|
|
\modulesynopsis{Interface to compression and decompression
|
|
|
|
routines compatible with \program{bzip2}.}
|
|
|
|
\moduleauthor{Gustavo Niemeyer}{niemeyer@conectiva.com}
|
|
|
|
\sectionauthor{Gustavo Niemeyer}{niemeyer@conectiva.com}
|
|
|
|
|
|
|
|
\versionadded{2.3}
|
|
|
|
|
|
|
|
This module provides a comprehensive interface for the bz2 compression library.
|
|
|
|
It implements a complete file interface, one-shot (de)compression functions,
|
|
|
|
and types for sequential (de)compression.
|
|
|
|
|
|
|
|
Here is a resume of the features offered by the bz2 module:
|
|
|
|
|
|
|
|
\begin{itemize}
|
|
|
|
\item \class{BZ2File} class implements a complete file interface, including
|
|
|
|
\method{readline()}, \method{readlines()}, \method{xreadlines()},
|
|
|
|
\method{writelines()}, \method{seek()}, etc;
|
|
|
|
\item \class{BZ2File} class implements emulated \method{seek()} support;
|
|
|
|
\item \class{BZ2File} class implements universal newline support;
|
|
|
|
\item \class{BZ2File} class offers an optimized line iteration using
|
|
|
|
the readahead algorithm borrowed from file objects;
|
2002-11-05 19:55:27 -04:00
|
|
|
\item \class{BZ2File} class inherits from the builtin file type
|
2002-11-05 13:54:02 -04:00
|
|
|
(\code{issubclass(BZ2File, file)} is \code{True});
|
2002-11-05 12:50:05 -04:00
|
|
|
\item Sequential (de)compression supported by \class{BZ2Compressor} and
|
|
|
|
\class{BZ2Decompressor} classes;
|
|
|
|
\item One-shot (de)compression supported by \function{compress()} and
|
|
|
|
\function{decompress()} functions;
|
|
|
|
\item Thread safety uses individual locking mechanism;
|
|
|
|
\item Complete inline documentation;
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
|
|
\subsection{(De)compression of files}
|
|
|
|
|
|
|
|
Handling of compressed files is offered by the \class{BZ2File} class.
|
|
|
|
|
2002-11-05 13:54:02 -04:00
|
|
|
\begin{classdesc}{BZ2File}{filename\optional{, mode\optional{,
|
|
|
|
buffering\optional{, compresslevel}}}}
|
|
|
|
Open a bz2 file. Mode can be either \code{'r'} or \code{'w'}, for reading
|
2002-11-05 12:50:05 -04:00
|
|
|
(default) or writing. When opened for writing, the file will be created if
|
2002-11-05 13:54:02 -04:00
|
|
|
it doesn't exist, and truncated otherwise. If \var{buffering} is given,
|
|
|
|
\code{0} means unbuffered, and larger numbers specify the buffer size;
|
|
|
|
the default is \code{0}. If
|
|
|
|
\var{compresslevel} is given, must be a number between \code{1} and
|
|
|
|
\code{9}; the default is \code{9}.
|
2002-11-05 12:50:05 -04:00
|
|
|
Add a \code{'U'} to mode to open the file for input with universal newline
|
|
|
|
support. Any line ending in the input file will be seen as a
|
2002-11-05 13:54:02 -04:00
|
|
|
\character{\textbackslash n}
|
2002-11-05 12:50:05 -04:00
|
|
|
in Python. Also, a file so opened gains the attribute \member{newlines};
|
|
|
|
the value for this attribute is one of \code{None} (no newline read yet),
|
|
|
|
\code{'\textbackslash r'}, \code{'\textbackslash n'},
|
|
|
|
\code{'\textbackslash r\textbackslash n'} or a tuple containing all the
|
|
|
|
newline types seen. Universal newlines are available only when reading.
|
|
|
|
\end{classdesc}
|
|
|
|
|
|
|
|
\begin{methoddesc}[BZ2File]{close}{}
|
|
|
|
Close the file. Sets data attribute \member{closed} to true. A closed file
|
|
|
|
cannot be used for further I/O operations. \method{close()} may be called
|
|
|
|
more than once without error.
|
|
|
|
\end{methoddesc}
|
|
|
|
|
|
|
|
\begin{methoddesc}[BZ2File]{read}{\optional{size}}
|
|
|
|
Read at most \var{size} uncompressed bytes, returned as a string. If the
|
|
|
|
\var{size} argument is negative or omitted, read until EOF is reached.
|
|
|
|
\end{methoddesc}
|
|
|
|
|
|
|
|
\begin{methoddesc}[BZ2File]{readline}{\optional{size}}
|
|
|
|
Return the next line from the file, as a string, retaining newline.
|
|
|
|
A non-negative \var{size} argument limits the maximum number of bytes to
|
|
|
|
return (an incomplete line may be returned then). Return an empty
|
|
|
|
string at EOF.
|
|
|
|
\end{methoddesc}
|
|
|
|
|
|
|
|
\begin{methoddesc}[BZ2File]{readlines}{\optional{size}}
|
|
|
|
Return a list of lines read. The optional \var{size} argument, if given,
|
|
|
|
is an approximate bound on the total number of bytes in the lines returned.
|
|
|
|
\end{methoddesc}
|
|
|
|
|
|
|
|
\begin{methoddesc}[BZ2File]{xreadlines}{}
|
|
|
|
For backward compatibility. \class{BZ2File} objects now include the
|
|
|
|
performance optimizations previously implemented in the \module{xreadlines}
|
|
|
|
module.
|
|
|
|
\end{methoddesc}
|
|
|
|
|
|
|
|
\begin{methoddesc}[BZ2File]{\_\_iter\_\_}{}
|
2002-11-05 19:55:27 -04:00
|
|
|
Iterate through the file lines. Iteration optimization is implemented
|
2002-11-05 12:50:05 -04:00
|
|
|
using the same readahead algorithm available in \class{file} objects.
|
|
|
|
\end{methoddesc}
|
|
|
|
|
2002-11-05 13:54:02 -04:00
|
|
|
\begin{methoddesc}[BZ2File]{seek}{offset\optional{, whence}}
|
2002-11-05 12:50:05 -04:00
|
|
|
Move to new file position. Argument \var{offset} is a byte count. Optional
|
|
|
|
argument \var{whence} defaults to \code{0} (offset from start of file,
|
|
|
|
offset should be \code{>= 0}); other values are \code{1} (move relative to
|
|
|
|
current position, positive or negative), and \code{2} (move relative to end
|
|
|
|
of file, usually negative, although many platforms allow seeking beyond
|
|
|
|
the end of a file).
|
|
|
|
|
|
|
|
Note that seeking of bz2 files is emulated, and depending on the parameters
|
|
|
|
the operation may be extremely slow.
|
|
|
|
\end{methoddesc}
|
|
|
|
|
|
|
|
\begin{methoddesc}[BZ2File]{tell}{}
|
|
|
|
Return the current file position, an integer (may be a long integer).
|
|
|
|
\end{methoddesc}
|
|
|
|
|
|
|
|
\begin{methoddesc}[BZ2File]{write}{data}
|
|
|
|
Write string \var{data} to file. Note that due to buffering, \method{close()}
|
|
|
|
may be needed before the file on disk reflects the data written.
|
|
|
|
\end{methoddesc}
|
|
|
|
|
|
|
|
\begin{methoddesc}[BZ2File]{writelines}{sequence_of_strings}
|
|
|
|
Write the sequence of strings to the file. Note that newlines are not added.
|
|
|
|
The sequence can be any iterable object producing strings. This is equivalent
|
|
|
|
to calling write() for each string.
|
|
|
|
\end{methoddesc}
|
|
|
|
|
|
|
|
|
|
|
|
\subsection{Sequential (de)compression}
|
|
|
|
|
|
|
|
Sequential compression and decompression is done using the classes
|
|
|
|
\class{BZ2Compressor} and \class{BZ2Decompressor}.
|
|
|
|
|
2002-11-05 13:54:02 -04:00
|
|
|
\begin{classdesc}{BZ2Compressor}{\optional{compresslevel}}
|
2002-11-05 12:50:05 -04:00
|
|
|
Create a new compressor object. This object may be used to compress
|
|
|
|
data sequentially. If you want to compress data in one shot, use the
|
|
|
|
\function{compress()} function instead. The \var{compresslevel} parameter,
|
2002-11-05 13:54:02 -04:00
|
|
|
if given, must be a number between \code{1} and \code{9}; the default
|
|
|
|
is \code{9}.
|
2002-11-05 12:50:05 -04:00
|
|
|
\end{classdesc}
|
|
|
|
|
|
|
|
\begin{methoddesc}[BZ2Compressor]{compress}{data}
|
|
|
|
Provide more data to the compressor object. It will return chunks of compressed
|
|
|
|
data whenever possible. When you've finished providing data to compress, call
|
|
|
|
the \method{flush()} method to finish the compression process, and return what
|
|
|
|
is left in internal buffers.
|
|
|
|
\end{methoddesc}
|
|
|
|
|
|
|
|
\begin{methoddesc}[BZ2Compressor]{flush}{}
|
|
|
|
Finish the compression process and return what is left in internal buffers. You
|
|
|
|
must not use the compressor object after calling this method.
|
|
|
|
\end{methoddesc}
|
|
|
|
|
|
|
|
\begin{classdesc}{BZ2Decompressor}{}
|
|
|
|
Create a new decompressor object. This object may be used to decompress
|
|
|
|
data sequentially. If you want to decompress data in one shot, use the
|
|
|
|
\function{decompress()} function instead.
|
|
|
|
\end{classdesc}
|
|
|
|
|
|
|
|
\begin{methoddesc}[BZ2Decompressor]{decompress}{data}
|
|
|
|
Provide more data to the decompressor object. It will return chunks of
|
|
|
|
decompressed data whenever possible. If you try to decompress data after the
|
|
|
|
end of stream is found, \exception{EOFError} will be raised. If any data was
|
|
|
|
found after the end of stream, it'll be ignored and saved in
|
|
|
|
\member{unused\_data} attribute.
|
|
|
|
\end{methoddesc}
|
|
|
|
|
|
|
|
|
|
|
|
\subsection{One-shot (de)compression}
|
|
|
|
|
2002-11-05 19:55:27 -04:00
|
|
|
One-shot compression and decompression is provided through the
|
2002-11-05 12:50:05 -04:00
|
|
|
\function{compress()} and \function{decompress()} functions.
|
|
|
|
|
2002-11-05 13:54:02 -04:00
|
|
|
\begin{funcdesc}{compress}{data\optional{, compresslevel}}
|
2002-11-05 12:50:05 -04:00
|
|
|
Compress \var{data} in one shot. If you want to compress data sequentially,
|
|
|
|
use an instance of \class{BZ2Compressor} instead. The \var{compresslevel}
|
2002-11-05 13:54:02 -04:00
|
|
|
parameter, if given, must be a number between \code{1} and \code{9};
|
|
|
|
the default is \code{9}.
|
2002-11-05 12:50:05 -04:00
|
|
|
\end{funcdesc}
|
|
|
|
|
2002-11-05 13:54:02 -04:00
|
|
|
\begin{funcdesc}{decompress}{data}
|
2002-11-05 12:50:05 -04:00
|
|
|
Decompress \var{data} in one shot. If you want to decompress data
|
|
|
|
sequentially, use an instance of \class{BZ2Decompressor} instead.
|
|
|
|
\end{funcdesc}
|