On platforms that do not support (symbolic) links, tarfile offers a
work-around and extracts a link in an archive as the regular file the link is
pointing to. On other platforms, this code was accidentally executed even
after the link had been successfully extracted which failed due to the already
existing link.
On platforms that do not support (symbolic) links, tarfile offers a
work-around and extracts a link in an archive as the regular file the link is
pointing to. On other platforms, this code was accidentally executed even
after the link had been successfully extracted which failed due to the already
existing link.
The nti() function that converts a number field from a tar header to a number
failed to decode GNU tar specific base-256 fields. I also added support for
decoding and encoding negative base-256 number fields.
The nti() function that converts a number field from a tar header to a number
failed to decode GNU tar specific base-256 fields. I also added support for
decoding and encoding negative base-256 number fields.
tarfile unnecessarily checked the existence of numerical user and group ids on
extraction. If one of them did not exist the respective id of the current user
(i.e. root) was used for the file and ownership information was lost. (Patch
by Sebastien Luttringer)
'latin-1' and 'utf-8'.
These are optimized in the Python Unicode implementation
to result in more direct processing, bypassing the codec
registry.
Also see issue11303.
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r88528 | lars.gustaebel | 2011-02-23 12:42:22 +0100 (Wed, 23 Feb 2011) | 16 lines
Issue #11224: Improved sparse file read support (r85916) introduced a
regression in _FileInFile which is used in file-like objects returned
by TarFile.extractfile(). The inefficient design of the
_FileInFile.read() method causes various dramatic side-effects and
errors:
- The data segment of a file member is read completely into memory
every(!) time a small block is accessed. This is not only slow
but may cause unexpected MemoryErrors with very large files.
- Reading members from compressed tar archives is even slower
because of the excessive backwards seeking which is done when the
same data segment is read over and over again.
- As a backwards seek on a TarFile opened in stream mode is not
possible, using extractfile() fails with a StreamError.
........
regression in _FileInFile which is used in file-like objects returned
by TarFile.extractfile(). The inefficient design of the
_FileInFile.read() method causes various dramatic side-effects and
errors:
- The data segment of a file member is read completely into memory
every(!) time a small block is accessed. This is not only slow
but may cause unexpected MemoryErrors with very large files.
- Reading members from compressed tar archives is even slower
because of the excessive backwards seeking which is done when the
same data segment is read over and over again.
- As a backwards seek on a TarFile opened in stream mode is not
possible, using extractfile() fails with a StreamError.
keyword-only argument. The preceding positional argument was deprecated,
so it made no sense to add filter as a positional argument.
(Patch reviewed by Brian Curtin and Anthony Long.)
extensions. Thus, in addition to GNUTYPE_SPARSE headers, sparse
information in pax headers created by GNU tar can now be decoded.
All three formats 0.0, 0.1 and 1.0 are supported.
On filesystems that support this, holes in files are now restored
whenever a sparse member is extracted.
uname and gname field.
If tarfile creates a new archive and adds a file with a
uid/gid that doesn't have a corresponding name on the
system (e.g. because the user/group account was deleted) it
uses the empty string in the uname/gname field now instead
of "root". Using "root" as the default was a bad idea
because on extraction the uname/gname fields are supposed
to override the uid/gid fields. So, all archive members
with nameless uids/gids belonged to the root user after
extraction.
using WindowsError in a try/except. Only add WindowsError to the list of
exceptions to catch when we are actually running on Windows.
Additionally, add a call that was left out in test_posixpath.
Thanks Amaury, Antoine, and Jason.
Added Windows support for os.symlink when run on Windows 6.0 or greater,
aka Vista. Previous Windows versions will raise NotImplementedError
when trying to symlink.
Includes numerous test updates and additions to test_os, including
a symlink_support module because of the fact that privilege escalation
is required in order to run the tests to ensure that the user is able
to create symlinks. By default, accounts do not have the required
privilege, so the escalation code will have to be exposed later (or
documented on how to do so). I'll be following up with that work next.
Note that the tests use ctypes, which was agreed on during the PyCon
language summit.
svn+ssh://pythondev@svn.python.org/python/trunk
........
r81667 | lars.gustaebel | 2010-06-03 14:34:14 +0200 (Thu, 03 Jun 2010) | 8 lines
Issue #8741: Fixed the TarFile.makelink() method that is responsible
for extracting symbolic and hard link entries as regular files as a
work-around on platforms that do not support filesystem links.
This stopped working reliably after a change in r74571. I also added
a few tests for this functionality.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r81663 | lars.gustaebel | 2010-06-03 11:56:22 +0200 (Thu, 03 Jun 2010) | 4 lines
Issue #8833: tarfile created hard link entries with a size
field != 0 by mistake. The associated testcase did not
expose this bug because it was broken too.
........
tarfile is now able to read and write pax headers with a
"hdrcharset=BINARY" record. This record was introduced in
POSIX.1-2008 as a method to store unencoded binary strings that
cannot be translated to UTF-8. In practice, this is just a workaround
that allows a tar implementation to store filenames that do not
comply with the current filesystem encoding and thus cannot be
decoded correctly.
Additionally, tarfile works around a bug in current versions of GNU
tar: undecodable filenames are stored as-is in a pax header without a
"hdrcharset" record being added. Technically, these headers are
invalid, but tarfile manages to read them correctly anyway.
svn+ssh://pythondev@svn.python.org/python/trunk
........
r78623 | lars.gustaebel | 2010-03-03 12:55:48 +0100 (Wed, 03 Mar 2010) | 3 lines
Issue #7232: Add support for the context manager protocol
to the TarFile class.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r76780 | lars.gustaebel | 2009-12-13 12:32:27 +0100 (Sun, 13 Dec 2009) | 21 lines
Issue #7357: No longer suppress fatal extraction errors by
default.
TarFile's errorlevel argument controls how errors are
handled that occur during extraction. There are three
possible levels 0, 1 and 2. If errorlevel is set to 1 or 2
fatal errors (e.g. a full filesystem) are raised as
exceptions. If it is set to 0, which is the default value,
extraction errors are suppressed, and error messages are
written to the debug log instead. But, if the debug log is
not activated, which is the default as well, all these
errors go unnoticed.
The original intention was to imitate GNU tar which tries
to extract as many members as possible instead of stopping
on the first error. It turns out that this is no good
default behaviour for a tar library. This patch simply
changes the default value for the errorlevel argument from
0 to 1, so that fatal extraction errors are raised as
EnvironmentError exceptions.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r76443 | lars.gustaebel | 2009-11-22 19:30:53 +0100 (Sun, 22 Nov 2009) | 24 lines
Issue #6123: Fix opening empty archives and files.
(Note that an empty archive is not the same as an empty file. An
empty archive contains no members and is correctly terminated with an
EOF block full of zeros. An empty file contains no data at all.)
The problem was that although tarfile was able to create empty
archives, it failed to open them raising a ReadError. On the other
hand, tarfile opened empty files without error in most read modes and
presented them as empty archives. (However, some modes still raised
errors: "r|gz" raised ReadError, but "r:gz" worked, "r:bz2" even
raised EOFError.)
In order to get a more fine-grained control over the various internal
error conditions I now split up the HeaderError exception into a
number of meaningful sub-exceptions. This makes it easier in the
TarFile.next() method to react to the different conditions in the
correct way.
The visible change in its behaviour now is that tarfile will open
empty archives correctly and raise ReadError consistently for empty
files.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r76381 | lars.gustaebel | 2009-11-18 21:24:54 +0100 (Wed, 18 Nov 2009) | 3 lines
Issue #7341: Close the internal file object in the TarFile
constructor in case of an error.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r74750 | lars.gustaebel | 2009-09-12 12:28:15 +0200 (Sat, 12 Sep 2009) | 9 lines
Issue #6856: Add a filter keyword argument to TarFile.add().
The filter argument must be a function that takes a TarInfo
object argument, changes it and returns it again. If the
function returns None the TarInfo object will be excluded
from the archive.
The exclude argument is deprecated from now on, because it
does something similar but is not as flexible.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r74571 | lars.gustaebel | 2009-08-28 21:23:44 +0200 (Fri, 28 Aug 2009) | 7 lines
Issue #6054: Do not normalize stored pathnames.
No longer use tarfile.normpath() on pathnames. Store pathnames
unchanged, i.e. do not remove "./", "../" and "//" occurrences.
However, still convert absolute to relative paths.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r70523 | lars.gustaebel | 2009-03-22 21:09:33 +0100 (Sun, 22 Mar 2009) | 5 lines
Issue #5068: Fixed the tarfile._BZ2Proxy.read() method that would loop
forever on incomplete input. That caused tarfile.open() to hang when used
with mode 'r' or 'r:bz2' and a fileobj argument that contained no data or
partial bzip2 compressed data.
........