:mod:`gzip` --- Support for :program:`gzip` files
=================================================
.. module:: gzip
   :synopsis: Interfaces for gzip compression and decompression using file objects.

This module provides a simple interface to compress and decompress files just
like the GNU programs :program:`gzip` and :program:`gunzip` would.
The data compression is provided by the :mod:`zlib` module.

The :mod:`gzip` module provides the :class:`GzipFile` class, which is modeled
after Python's file object.  The :class:`GzipFile` class reads and writes
:program:`gzip`\ -format files, automatically compressing or decompressing the
data so that it looks like an ordinary file object.

Note that additional file formats which can be decompressed by the
:program:`gzip` and :program:`gunzip` programs, such as those produced by
:program:`compress` and :program:`pack`, are not supported by this module.
For other archive formats, see the :mod:`bz2`, :mod:`zipfile`, and
:mod:`tarfile` modules.
The module defines the following items:

.. class:: GzipFile([filename[, mode[, compresslevel[, fileobj]]]])

   Constructor for the :class:`GzipFile` class, which simulates most of the methods
   of a file object, with the exception of the :meth:`readinto` and
   :meth:`truncate` methods.  At least one of *fileobj* and *filename* must be
   given a non-trivial value.

   The new class instance is based on *fileobj*, which can be a regular file, a
   :class:`StringIO` object, or any other object which simulates a file.  It
   defaults to ``None``, in which case *filename* is opened to provide a file
   object.

   When *fileobj* is not ``None``, the *filename* argument is only used to be
   included in the :program:`gzip` file header, which may include the original
   filename of the uncompressed file.  It defaults to the filename of *fileobj*, if
   discernible; otherwise, it defaults to the empty string, and in this case the
   original filename is not included in the header.

   The *mode* argument can be any of ``'r'``, ``'rb'``, ``'a'``, ``'ab'``, ``'w'``,
   or ``'wb'``, depending on whether the file will be read or written.  The default
   is the mode of *fileobj* if discernible; otherwise, the default is ``'rb'``.  If
   the ``'b'`` flag is not given, it is added to the mode to ensure the file is
   opened in binary mode for cross-platform portability.

   The *compresslevel* argument is an integer from ``1`` to ``9`` controlling the
   level of compression; ``1`` is fastest and produces the least compression, and
   ``9`` is slowest and produces the most compression.  The default is ``9``.

   Calling a :class:`GzipFile` object's :meth:`close` method does not close
   *fileobj*, since you might wish to append more material after the compressed
   data.  This also allows you to pass a :class:`StringIO` object opened for
   writing as *fileobj*, and retrieve the resulting memory buffer using the
   :class:`StringIO` object's :meth:`getvalue` method.

   :class:`GzipFile` supports iteration.

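
The *fileobj* behaviour described above for :class:`GzipFile` can be sketched as
follows.  This is only an illustration under stated assumptions: it assumes
Python 3, where :class:`io.BytesIO` plays the role of the in-memory file object,
and the names ``buffer``, ``example.txt``, and ``lines`` are made up for the
example.

```python
import gzip
import io

# Assumption: Python 3, where io.BytesIO is the in-memory binary file object.
buffer = io.BytesIO()

# Because fileobj is given, filename is only recorded in the gzip header.
gz = gzip.GzipFile(filename="example.txt", mode="wb",
                   compresslevel=9, fileobj=buffer)
gz.write(b"first line\nsecond line\n")
gz.close()  # finishes the gzip stream but leaves `buffer` open

# The compressed bytes can still be retrieved from the underlying buffer.
compressed = buffer.getvalue()

# Read the data back, iterating over the decompressed stream line by line.
reader = gzip.GzipFile(mode="rb", fileobj=io.BytesIO(compressed))
lines = [line for line in reader]
reader.close()
```

After this runs, ``buffer`` is still open, so more data could be appended to it
after the compressed stream.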

.. function:: open(filename[, mode[, compresslevel]])

   This is a shorthand for ``GzipFile(filename, mode, compresslevel)``.
   The *filename* argument is required; *mode* defaults to ``'rb'`` and
   *compresslevel* defaults to ``9``.

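
As a rough sketch of how the *compresslevel* argument of :func:`open` might be
explored, the following writes the same data at levels ``1`` and ``9`` and
records the resulting file sizes.  The temporary directory, the file names, and
the sample data are assumptions made for the example, and the exact sizes depend
on the data and the underlying :mod:`zlib` build.

```python
import gzip
import os
import tempfile

# Hypothetical sample data and scratch directory for the illustration.
data = b"The quick brown fox jumps over the lazy dog.\n" * 1000
tmpdir = tempfile.mkdtemp()

sizes = {}
for level in (1, 9):
    path = os.path.join(tmpdir, "sample-%d.gz" % level)
    f = gzip.open(path, "wb", level)  # same as GzipFile(path, "wb", level)
    f.write(data)
    f.close()
    sizes[level] = os.path.getsize(path)

# mode defaults to 'rb', so the last file written can be read back directly.
f = gzip.open(path)
restored = f.read()
f.close()
```

``sizes`` then maps each level to the size of the compressed file in bytes,
which can be compared against ``len(data)``.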

.. _gzip-usage-examples:

Examples of usage
-----------------

Example of how to read a compressed file::

   import gzip
   f = gzip.open('/home/joe/file.txt.gz', 'rb')
   file_content = f.read()
   f.close()

Example of how to create a compressed GZIP file::

   import gzip
   # The file is opened in binary mode, so the content is a bytes literal.
   content = b"Lots of content here"
   f = gzip.open('/home/joe/file.txt.gz', 'wb')
   f.write(content)
   f.close()

Example of how to GZIP compress an existing file::

   import gzip
   f_in = open('/home/joe/file.txt', 'rb')
   f_out = gzip.open('/home/joe/file.txt.gz', 'wb')
   f_out.writelines(f_in)
   f_out.close()
   f_in.close()

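
One possible variation on compressing an existing file copies the data in
fixed-size chunks with :func:`shutil.copyfileobj` instead of line by line, and
uses ``try``/``finally`` so both files are closed even if the copy fails.  This
is a sketch, not the module's recommended method; the temporary paths and the
sample content are assumptions made so the example is self-contained.

```python
import gzip
import os
import shutil
import tempfile

# Hypothetical input file created only for this illustration.
tmpdir = tempfile.mkdtemp()
src_path = os.path.join(tmpdir, "file.txt")
dst_path = src_path + ".gz"
with open(src_path, "wb") as f:
    f.write(b"some example content\n" * 100)

# Copy the uncompressed bytes into the gzip file chunk by chunk.
f_in = open(src_path, "rb")
f_out = gzip.open(dst_path, "wb")
try:
    shutil.copyfileobj(f_in, f_out)
finally:
    f_out.close()
    f_in.close()

# Decompress again to confirm the round trip.
check = gzip.open(dst_path, "rb")
try:
    roundtrip = check.read()
finally:
    check.close()
```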

.. seealso::

   Module :mod:`zlib`
      The basic data compression module needed to support the :program:`gzip` file
      format.