127 Commits

Author SHA1 Message Date
Lars Gustäbel 7a919e9930 Issue #13815: TarFile.extractfile() now returns io.BufferedReader objects.
The ExFileObject class was removed, some of its code went into _FileInFile.
2012-05-05 18:15:03 +02:00
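
A minimal sketch of what the change above means for callers, assuming Python 3.3 or later. The helper names (`build_archive`, `read_member_lines`) and the member name are invented for this illustration and are not part of the commit.

```python
import io
import tarfile


def build_archive() -> bytes:
    """Create a tiny uncompressed tar archive in memory (illustrative helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        data = b"hello\nworld\n"
        info = tarfile.TarInfo(name="greeting.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_member_lines(archive: bytes, name: str) -> list:
    """Read a member through extractfile() and return its lines."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        member = tar.extractfile(name)
        # Assumption: Python 3.3+, where the returned object is a buffered reader.
        assert isinstance(member, io.BufferedReader)
        return member.readlines()
```

Because the returned object is a regular buffered binary reader, the usual `read()`, `readline()` and iteration methods can be used on it.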
Lars Gustäbel 9f478c021d Merge with 3.2: Issue #14160: TarFile.extractfile() failed to resolve symbolic
links when the links were not located in an archive subdirectory.
2012-04-24 21:09:17 +02:00
Lars Gustäbel 1ef9eda7bc Issue #14160: TarFile.extractfile() failed to resolve symbolic links
when the links were not located in an archive subdirectory.
2012-04-24 21:04:40 +02:00
Lars Gustäbel c5e1199f38 Issue #5689: Avoid excessive memory usage by using the default lzma preset. 2012-01-18 14:01:17 +01:00
Lars Gustäbel dee45e20f6 Issue #12926: Fix a bug in tarfile's link extraction.
On platforms that do not support (symbolic) links, tarfile offers a
work-around and extracts a link in an archive as the regular file the link is
pointing to. On other platforms, this code was accidentally executed even
after the link had been successfully extracted which failed due to the already
existing link.
2012-01-05 18:48:06 +01:00
Lars Gustäbel 8f771a4716 Merge from 3.2: Issue #12926: Fix a bug in tarfile's link extraction.
On platforms that do not support (symbolic) links, tarfile offers a
work-around and extracts a link in an archive as the regular file the link is
pointing to. On other platforms, this code was accidentally executed even
after the link had been successfully extracted which failed due to the already
existing link.
2012-01-05 18:53:00 +01:00
Eli Bendersky 74c503b40d use io.SEEK_* constants instead of os.SEEK_* where an IO stream is seeked, leaving the os.SEEK_* constants only for os.lseek, as documented 2012-01-03 06:26:13 +02:00
Lars Gustäbel 0a9dd2f11d Issue #5689: Add support for lzma compression to the tarfile module. 2011-12-10 20:38:14 +01:00
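
A short sketch of using the lzma support through the `"w:xz"` and `"r:xz"` mode strings. It assumes a Python build where the `lzma` module is available; `roundtrip_xz` and the member name `data.bin` are hypothetical names used only here.

```python
import io
import tarfile


def roundtrip_xz(payload: bytes) -> bytes:
    """Write payload into an xz-compressed tar archive and read it back."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:xz") as tar:
        info = tarfile.TarInfo("data.bin")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r:xz") as tar:
        return tar.extractfile("data.bin").read()
```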
Lars Gustäbel bb44b73e17 Remove no longer needed work-around for bz2 file object support. 2011-12-06 13:44:10 +01:00
Lars Gustäbel 45fb082180 Merge with 3.2: Correctly detect bzip2 compressed streams with blocksizes other than 900k. 2011-12-06 13:00:58 +01:00
Lars Gustäbel ed1ac587df Correctly detect bzip2 compressed streams with blocksizes other than 900k. 2011-12-06 12:56:38 +01:00
Florent Xicluna 68f71a34f4 Simplify and remove a few dependencies on 'errno', thanks to PEP 3151. 2011-10-28 16:06:23 +02:00
Lars Gustäbel 01277d166a Merge with 3.2: Issue #13158: Fix decoding and encoding of base-256 number fields in tarfile.
The nti() function that converts a number field from a tar header to a number
failed to decode GNU tar specific base-256 fields. I also added support for
decoding and encoding negative base-256 number fields.
2011-10-14 12:53:10 +02:00
Lars Gustäbel ac3d137a30 Issue #13158: Fix decoding and encoding of base-256 number fields in tarfile.
The nti() function that converts a number field from a tar header to a number
failed to decode GNU tar specific base-256 fields. I also added support for
decoding and encoding negative base-256 number fields.
2011-10-14 12:46:40 +02:00
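
The base-256 encoding mentioned above can be sketched as follows. This is a standalone illustration, not the module's `nti()` code. It assumes the convention of a leading marker byte (0o200 for non-negative, 0o377 for negative values) followed by a big-endian payload. The function names and the default field width of 8 bytes are choices made for this example.

```python
def decode_base256(field: bytes) -> int:
    """Decode a base-256 number field with a leading marker byte."""
    marker, payload = field[0], field[1:]
    value = int.from_bytes(payload, "big")
    if marker == 0o200:
        # Non-negative number: the payload is the value itself.
        return value
    if marker == 0o377:
        # Negative number: the payload holds the value offset by 256**len(payload).
        return value - 256 ** len(payload)
    raise ValueError("not a base-256 field")


def encode_base256(n: int, digits: int = 8) -> bytes:
    """Encode n into a field of `digits` bytes (marker byte plus payload)."""
    width = digits - 1
    limit = 256 ** width
    if 0 <= n < limit:
        return b"\x80" + n.to_bytes(width, "big")
    if -limit <= n < 0:
        return b"\xff" + (limit + n).to_bytes(width, "big")
    raise ValueError("number does not fit into the field")
```

Such fields matter for values that do not fit into the octal representation, for example a size of 8**11 bytes or a negative timestamp.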
Lars Gustäbel 24757851b7 Merge with 3.2: Issue #12841: Fix tarfile extraction of non-existent uids/gids. 2011-09-05 16:59:44 +02:00
Lars Gustäbel 2e7ddd374b Issue #12841: Fix tarfile extraction of non-existent uids/gids.
tarfile unnecessarily checked the existence of numerical user and group ids on
extraction. If one of them did not exist the respective id of the current user
(i.e. root) was used for the file and ownership information was lost. (Patch
by Sebastien Luttringer)
2011-09-05 16:58:14 +02:00
Georg Brandl 74b6abf61f Merge with 3.2. 2011-08-13 11:48:40 +02:00
Georg Brandl 3abb372c81 Fix #11513: wrong exception handling for the case that GzipFile itself raises an IOError. 2011-08-13 11:48:12 +02:00
Senthil Kumaran a2250e61db Merge from 3.2: Fix closes Issue #11439: Remove the SVN keywords from the code, as they are no longer applicable in hg. Patch contributed by Neil Muller. 2011-07-28 23:39:08 +08:00
Senthil Kumaran 7c9719cf74 Fix closes Issue #11439: Remove the SVN keywords from the code, as they are no longer applicable in hg. Patch contributed by Neil Muller. 2011-07-28 22:32:49 +08:00
Benjamin Peterson 8c6f88efa2 remove __version__s dependent on subversion keyword expansion (closes #12221) 2011-05-31 20:52:17 -05:00
Marc-André Lemburg 8f36af7a4c Normalize the encoding names for Latin-1 and UTF-8 to
'latin-1' and 'utf-8'.

These are optimized in the Python Unicode implementation
to result in more direct processing, bypassing the codec
registry.

Also see issue11303.
2011-02-25 15:42:01 +00:00
Lars Gustäbel 9f6cbe09cc Merged revisions 88528 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r88528 | lars.gustaebel | 2011-02-23 12:42:22 +0100 (Wed, 23 Feb 2011) | 16 lines

  Issue #11224: Improved sparse file read support (r85916) introduced a
  regression in _FileInFile which is used in file-like objects returned
  by TarFile.extractfile(). The inefficient design of the
  _FileInFile.read() method causes various dramatic side-effects and
  errors:

    - The data segment of a file member is read completely into memory
      every(!) time a small block is accessed. This is not only slow
      but may cause unexpected MemoryErrors with very large files.
    - Reading members from compressed tar archives is even slower
      because of the excessive backwards seeking which is done when the
      same data segment is read over and over again.
    - As a backwards seek on a TarFile opened in stream mode is not
      possible, using extractfile() fails with a StreamError.
........
2011-02-23 11:52:31 +00:00
Lars Gustäbel dd071045e7 Issue #11224: Improved sparse file read support (r85916) introduced a
regression in _FileInFile which is used in file-like objects returned
by TarFile.extractfile(). The inefficient design of the
_FileInFile.read() method causes various dramatic side-effects and
errors:

  - The data segment of a file member is read completely into memory
    every(!) time a small block is accessed. This is not only slow
    but may cause unexpected MemoryErrors with very large files.
  - Reading members from compressed tar archives is even slower
    because of the excessive backwards seeking which is done when the
    same data segment is read over and over again.
  - As a backwards seek on a TarFile opened in stream mode is not
    possible, using extractfile() fails with a StreamError.
2011-02-23 11:42:22 +00:00
Raymond Hettinger a63a312a3f Issue #11014: Make 'filter' argument in tarfile.TarFile.add() into a
keyword-only argument.  The preceding positional argument was deprecated,
so it made no sense to add filter as a positional argument.

(Patch reviewed by Brian Curtin and Anthony Long.)
2011-01-26 20:34:14 +00:00
Martin v. Löwis 16f344df36 Issue #10184: Touch directories only once when extracting a tarfile. 2010-11-01 21:39:13 +00:00
Antoine Pitrou e1eca4e3f5 Issue #10233: Close file objects in a timely manner in the tarfile module
and its test suite.
2010-10-29 23:49:49 +00:00
Lars Gustäbel 9cbdd75ec5 Add read support for all missing variants of the GNU sparse
extensions. Thus, in addition to GNUTYPE_SPARSE headers, sparse
information in pax headers created by GNU tar can now be decoded.
All three formats 0.0, 0.1 and 1.0 are supported.
On filesystems that support this, holes in files are now restored
whenever a sparse member is extracted.
2010-10-29 09:08:19 +00:00
Lars Gustäbel 331b8002f0 Issue #9065: no longer use "root" as the default for the
uname and gname field.

If tarfile creates a new archive and adds a file with a
uid/gid that doesn't have a corresponding name on the
system (e.g. because the user/group account was deleted) it
uses the empty string in the uname/gname field now instead
of "root". Using "root" as the default was a bad idea
because on extraction the uname/gname fields are supposed
to override the uid/gid fields. So, all archive members
with nameless uids/gids belonged to the root user after
extraction.
2010-10-04 15:18:47 +00:00
Brian Curtin 82df53e932 Fix a line that got hacked up by r82659. 2010-09-24 21:04:05 +00:00
Antoine Pitrou 605c293031 Further tarfile / test_tarfile cleanup 2010-09-23 20:15:14 +00:00
Antoine Pitrou 95f5560b46 Try to fix test_tarfile issues on Windows buildbots by closing file
objects explicitly instead of letting them linger on.
2010-09-23 18:36:46 +00:00
Brian Curtin 16633fa497 Fix the breakage of Lib/tarfile.py on non-Windows platforms due to
using WindowsError in a try/except. Only add WindowsError to the list of
exceptions to catch when we are actually running on Windows.

Additionally, add a call that was left out in test_posixpath.

Thanks Amaury, Antoine, and Jason.
2010-07-09 13:54:27 +00:00
Brian Curtin d40e6f70a5 Implement #1578269. Patch by Jason R. Coombs.
Added Windows support for os.symlink when run on Windows 6.0 or greater,
aka Vista. Previous Windows versions will raise NotImplementedError
when trying to symlink.

Includes numerous test updates and additions to test_os, including
a symlink_support module because of the fact that privilege escalation
is required in order to run the tests to ensure that the user is able
to create symlinks. By default, accounts do not have the required
privilege, so the escalation code will have to be exposed later (or
documented on how to do so). I'll be following up with that work next.

Note that the tests use ctypes, which was agreed on during the PyCon
language summit.
2010-07-08 21:39:08 +00:00
Victor Stinner 0f35e2c0f4 Issue #8784: Set tarfile default encoding to 'utf-8' on Windows.
Note: file system encoding cannot be None anymore (since r81190, issue #8610).
2010-06-11 23:46:47 +00:00
Lars Gustäbel 1b51272b1b Merged revisions 81667 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81667 | lars.gustaebel | 2010-06-03 14:34:14 +0200 (Thu, 03 Jun 2010) | 8 lines

  Issue #8741: Fixed the TarFile.makelink() method that is responsible
  for extracting symbolic and hard link entries as regular files as a
  work-around on platforms that do not support filesystem links.

  This stopped working reliably after a change in r74571. I also added
  a few tests for this functionality.
........
2010-06-03 12:45:16 +00:00
Lars Gustäbel 2470ff19e6 Merged revisions 81663 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81663 | lars.gustaebel | 2010-06-03 11:56:22 +0200 (Thu, 03 Jun 2010) | 4 lines

  Issue #8833: tarfile created hard link entries with a size
  field != 0 by mistake. The associated testcase did not
  expose this bug because it was broken too.
........
2010-06-03 10:11:52 +00:00
Lars Gustäbel 1465cc2887 Issue #8633: Support for POSIX.1-2008 binary pax headers.
tarfile is now able to read and write pax headers with a
"hdrcharset=BINARY" record. This record was introduced in
POSIX.1-2008 as a method to store unencoded binary strings that
cannot be translated to UTF-8. In practice, this is just a workaround
that allows a tar implementation to store filenames that do not
comply with the current filesystem encoding and thus cannot be
decoded correctly.
Additionally, tarfile works around a bug in current versions of GNU
tar: undecodable filenames are stored as-is in a pax header without a
"hdrcharset" record being added. Technically, these headers are
invalid, but tarfile manages to read them correctly anyway.
2010-05-17 18:02:50 +00:00
Victor Stinner de629d46f2 Issue #8390: tarfile uses surrogateescape as the default error handler
(instead of replace in read mode or strict in write mode)
2010-05-05 21:43:57 +00:00
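
To illustrate why this error handler is useful for member names, the sketch below shows the codec behaviour only; it does not call into tarfile. `roundtrip_name` is a hypothetical helper, and the byte string stands for a Latin-1 encoded filename that is not valid UTF-8.

```python
def roundtrip_name(raw: bytes, encoding: str = "utf-8") -> bytes:
    """Decode and re-encode a raw filename with the surrogateescape handler."""
    # Undecodable bytes are mapped to lone surrogates instead of raising.
    text = raw.decode(encoding, "surrogateescape")
    # Encoding with the same handler restores the original bytes.
    return text.encode(encoding, "surrogateescape")


def lossy_name(raw: bytes, encoding: str = "utf-8") -> bytes:
    """The same round trip with 'replace', which loses the original bytes."""
    return raw.decode(encoding, "replace").encode(encoding, "replace")
```

With `surrogateescape` the original bytes survive a read/write cycle, while `replace` substitutes U+FFFD and the name can no longer be reconstructed.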
Ronald Oussoren 94f25283c9 Remove traces of MacOS9 support.
Fix for issue #7908
2010-05-05 19:11:21 +00:00
Lars Gustäbel d6eb70b7b4 Merged revisions 80616 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r80616 | lars.gustaebel | 2010-04-29 17:23:38 +0200 (Thu, 29 Apr 2010) | 4 lines

  Issue #8464: tarfile.open(name, mode="w|") no longer creates
  files with execute permissions set.
........
2010-04-29 15:37:02 +00:00
Benjamin Peterson 90f5ba538b convert shebang lines: python -> python3 2010-03-11 22:53:45 +00:00
Lars Gustäbel 0138581c43 Merged revisions 78623 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r78623 | lars.gustaebel | 2010-03-03 12:55:48 +0100 (Wed, 03 Mar 2010) | 3 lines

  Issue #7232: Add support for the context manager protocol
  to the TarFile class.
........
2010-03-03 12:08:54 +00:00
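
A brief sketch of the context manager protocol in use. The helpers `make_archive` and `list_members` are invented for this example; the archive is kept in memory so the snippet has no filesystem dependencies.

```python
import io
import tarfile


def make_archive(names) -> bytes:
    """Build an in-memory archive with empty members (illustrative helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            tar.addfile(tarfile.TarInfo(name))
    return buf.getvalue()


def list_members(archive: bytes):
    """Return the member names and whether the TarFile was closed on exit."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        names = tar.getnames()
    # Leaving the with block closes the TarFile, also when an exception occurs.
    return names, tar.closed
```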
Antoine Pitrou 77b338be20 Issue #4757: `zlib.compress` and other methods in the zlib module now
raise a TypeError when given an `str` object (rather than a `bytes`-like
object).  Patch by Victor Stinner and Florent Xicluna.
2009-12-14 18:00:06 +00:00
Lars Gustäbel 365aff3a9c Merged revisions 76780 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r76780 | lars.gustaebel | 2009-12-13 12:32:27 +0100 (Sun, 13 Dec 2009) | 21 lines

  Issue #7357: No longer suppress fatal extraction errors by
  default.

  TarFile's errorlevel argument controls how errors are
  handled that occur during extraction. There are three
  possible levels 0, 1 and 2. If errorlevel is set to 1 or 2
  fatal errors (e.g. a full filesystem) are raised as
  exceptions. If it is set to 0, which is the default value,
  extraction errors are suppressed, and error messages are
  written to the debug log instead. But, if the debug log is
  not activated, which is the default as well, all these
  errors go unnoticed.

  The original intention was to imitate GNU tar which tries
  to extract as many members as possible instead of stopping
  on the first error. It turns out that this is not a good
  default behaviour for a tar library. This patch simply
  changes the default value for the errorlevel argument from
  0 to 1, so that fatal extraction errors are raised as
  EnvironmentError exceptions.
........
2009-12-13 11:42:29 +00:00
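
A small sketch showing how the errorlevel setting can be inspected and overridden when opening an archive. `empty_archive` and `open_with_errorlevel` are hypothetical helpers; the snippet only looks at the attribute and does not provoke an extraction error.

```python
import io
import tarfile


def empty_archive() -> bytes:
    """Create an in-memory archive without members (illustrative helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w"):
        pass
    return buf.getvalue()


def open_with_errorlevel(archive: bytes, errorlevel=None) -> tarfile.TarFile:
    """Open an archive, passing errorlevel only when the caller sets one."""
    kwargs = {} if errorlevel is None else {"errorlevel": errorlevel}
    return tarfile.open(fileobj=io.BytesIO(archive), mode="r:", **kwargs)
```

Callers that want the old lenient behaviour can pass `errorlevel=0` explicitly; `errorlevel=2` additionally raises non-fatal errors.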
Lars Gustäbel 9520a430ef Merged revisions 76443 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r76443 | lars.gustaebel | 2009-11-22 19:30:53 +0100 (Sun, 22 Nov 2009) | 24 lines

  Issue #6123: Fix opening empty archives and files.

  (Note that an empty archive is not the same as an empty file. An
  empty archive contains no members and is correctly terminated with an
  EOF block full of zeros. An empty file contains no data at all.)

  The problem was that although tarfile was able to create empty
  archives, it failed to open them raising a ReadError. On the other
  hand, tarfile opened empty files without error in most read modes and
  presented them as empty archives. (However, some modes still raised
  errors: "r|gz" raised ReadError, but "r:gz" worked, "r:bz2" even
  raised EOFError.)

  In order to get a more fine-grained control over the various internal
  error conditions I now split up the HeaderError exception into a
  number of meaningful sub-exceptions. This makes it easier in the
  TarFile.next() method to react to the different conditions in the
  correct way.

  The visible change in its behaviour now is that tarfile will open
  empty archives correctly and raise ReadError consistently for empty
  files.
........
2009-11-22 18:48:49 +00:00
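
The distinction between an empty archive and an empty file can be sketched like this. `make_empty_archive` and `classify` are names made up for the example, and both inputs are held in memory.

```python
import io
import tarfile


def make_empty_archive() -> bytes:
    """An archive with no members, terminated by zero-filled blocks."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w"):
        pass
    return buf.getvalue()


def classify(data: bytes) -> str:
    """Report whether data opens as an uncompressed tar archive."""
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as tar:
            return "archive with %d members" % len(tar.getmembers())
    except tarfile.ReadError:
        return "not an archive"
```

Here `classify(make_empty_archive())` takes the first branch, while `classify(b"")` ends up in the `ReadError` handler.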
Lars Gustäbel 7b465390fa Merged revisions 76381 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r76381 | lars.gustaebel | 2009-11-18 21:24:54 +0100 (Wed, 18 Nov 2009) | 3 lines

  Issue #7341: Close the internal file object in the TarFile
  constructor in case of an error.
........
2009-11-18 20:29:25 +00:00
Lars Gustäbel 049d2aa952 Merged revisions 74750 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r74750 | lars.gustaebel | 2009-09-12 12:28:15 +0200 (Sat, 12 Sep 2009) | 9 lines

  Issue #6856: Add a filter keyword argument to TarFile.add().

  The filter argument must be a function that takes a TarInfo
  object argument, changes it and returns it again. If the
  function returns None the TarInfo object will be excluded
  from the archive.
  The exclude argument is deprecated from now on, because it
  does something similar but is not as flexible.
........
2009-09-12 10:44:00 +00:00
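
A sketch of a filter function as described above: it drops some members by returning None and modifies the metadata of the rest. The function names, the `.pyc` rule, the file names and the archive prefix `pkg` are all assumptions made for this illustration.

```python
import io
import os
import tarfile
import tempfile


def anonymize(info: tarfile.TarInfo):
    """Exclude bytecode files and strip ownership from everything else."""
    if info.name.endswith(".pyc"):
        return None  # returning None excludes the member
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    return info


def demo() -> list:
    """Archive a temporary directory through the filter and list the result."""
    with tempfile.TemporaryDirectory() as directory:
        for name in ("a.py", "a.pyc"):
            with open(os.path.join(directory, name), "wb"):
                pass
        buf = io.BytesIO()
        with tarfile.open(fileobj=buf, mode="w") as tar:
            # filter is passed by keyword; it is called once per member.
            tar.add(directory, arcname="pkg", filter=anonymize)
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return sorted(tar.getnames())
```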
Lars Gustäbel bfdfdda106 Merged revisions 74571 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r74571 | lars.gustaebel | 2009-08-28 21:23:44 +0200 (Fri, 28 Aug 2009) | 7 lines

  Issue #6054: Do not normalize stored pathnames.

  No longer use tarfile.normpath() on pathnames. Store pathnames
  unchanged, i.e. do not remove "./", "../" and "//" occurrences.
  However, still convert absolute to relative paths.
........
2009-08-28 19:59:59 +00:00
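
The effect of storing pathnames unchanged can be sketched with member names given directly as TarInfo objects. `stored_names` is a hypothetical helper, and the sample names are chosen only to contain "./", "../" and "//".

```python
import io
import tarfile


def stored_names(names) -> list:
    """Write empty members with the given names and read the names back."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            tar.addfile(tarfile.TarInfo(name))
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return tar.getnames()
```

Since names such as "../e.txt" are kept as they are, code that extracts untrusted archives has to check member paths itself.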
Lars Gustäbel 42e0091208 Merged revisions 70523 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r70523 | lars.gustaebel | 2009-03-22 21:09:33 +0100 (Sun, 22 Mar 2009) | 5 lines

  Issue #5068: Fixed the tarfile._BZ2Proxy.read() method that would loop
  forever on incomplete input. That caused tarfile.open() to hang when used
  with mode 'r' or 'r:bz2' and a fileobj argument that contained no data or
  partial bzip2 compressed data.
........
2009-03-22 20:34:29 +00:00