Fixed the signed/unsigned confusions when dealing with files >= 2GB.
4GB is still a hard limitation of the gzip file format, though.
Testing this was a bitch on Win98SE due to frequent system freezes. It
didn't freeze while running gzip, it kept freezing while trying to *create*
a > 2GB test file! This wasn't Python's doing. I don't know of a
reasonable way to test this functionality in regrtest.py, so I'm not
checking in a test case (a test case would necessarily require creating
a 2GB+ file first, using gzip to zip it, using gzip to unzip it again,
and then compare before-and-after; so >4GB free space would be required,
and a loooong time; I did all this "by hand" once).
Bugfix candidate, I guess.
*this* set of patches is Ka-Ping's final sweep:
The attached patches update the standard library so that all modules
have docstrings beginning with one-line summaries.
A new docstring was added to formatter. The docstring for os.py
was updated to mention nt, os2, ce in addition to posix, dos, mac.
who writes:
Here is batch 2, as a big collection of CVS context diffs.
Along with moving comments into docstrings, i've added a
couple of missing docstrings and attempted to make sure more
module docstrings begin with a one-line summary.
I did not add docstrings to the methods in profile.py for
fear of upsetting any careful optimizations there, though
i did move class documentation into class docstrings.
The convention i'm using is to leave credits/version/copyright
type of stuff in # comments, and move the rest of the descriptive
stuff about module usage into module docstrings. Hope this is
okay.
1. Jack Jansen reports that on the Mac, the time may be negative, and
solves this by adding a write32u() function that writes an unsigned
long.
2. On 64-bit platforms the CRC comparison fails; I've fixed this by
casting both values to be compared to "unsigned long" i.e. modulo
0x100000000L.
allow using the 'a' flag as a mode for opening a GzipFile. gzip
files, surprisingly enough, can be concatenated and then decompressed;
the effect is to concatenate the two chunks of data.
If we support it on writing, it should also be supported on reading.
This *wasn't* trivial, and required rearranging the code in the
reading path, particularly the _read() method.
Raise IOError instead of RuntimeError in two cases, 'Not a gzipped file'
and 'Unknown compression method'
problem was a couple of bugs in the readline implementation.
1. Include the '\n' in the string returned by readline
2. Bug calculating new buffer size in _unread
Also remove unncessary import of StringIO
Here's my suggested replacement for gzip.py for 1.5.1. I've
re-implemeted methods readline and readlines, added an _unread, and
tweaked read and _read.
I tried a more complicated buffer scheme for unread (using a list of
strings and string.join), but it was more complicated and slower.
This version is a lot faster than the current version and is still
pretty simple.