59 Commits

Author SHA1 Message Date
Lars Gustäbel 0f4a14b56f TarFile.__init__() no longer fails if no name argument is passed and
the fileobj argument has no usable name attribute (e.g. StringIO).

(will backport to 2.5)
2007-08-28 12:31:09 +00:00
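
The behaviour described in this commit can be sketched as follows. This is a minimal illustration on modern Python 3, where io.BytesIO stands in for the StringIO object mentioned in the message; the helper name build_archive_in_memory is hypothetical and not part of tarfile.

```python
import io
import tarfile


def build_archive_in_memory(payload: bytes) -> bytes:
    """Write a one-member tar archive into a buffer that has no .name attribute."""
    buf = io.BytesIO()  # no usable name attribute, like StringIO in the commit
    # No name argument is passed; only fileobj is given.
    with tarfile.TarFile(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    # The caller-supplied buffer stays open after the TarFile is closed.
    return buf.getvalue()


archive = build_archive_in_memory(b"hello")
with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
    names = tar.getnames()
```

Here names should contain the single member written above, which shows that neither writing nor reading needs a file name.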
Lars Gustäbel 104490e615 Added exclude keyword argument to the TarFile.add() method. 2007-06-18 11:42:11 +00:00
Lars Gustäbel a0fcb9384e Added errors argument to TarFile class that allows the user to
specify an error handling scheme for character conversion. Additional
scheme "utf-8" in read mode. Unicode input filenames are now
supported by design. The values of the pax_headers dictionary are now
limited to unicode objects.

Fixed: The prefix field is no longer used in PAX_FORMAT (in
conformance with POSIX).
Fixed: In read mode use a possible pax header size field.
Fixed: Strip trailing slashes from pax header name values.
Fixed: Give values in user-specified pax_headers precedence when
writing.

Added unicode tests. Added pax/regtype4 member to testtar.tar with all
possible number fields in a pax header.

Added two chapters to the documentation about the different formats
tarfile.py supports and how unicode issues are handled.
2007-05-27 19:49:30 +00:00
Brett Cannon 6cef076ba5 Remove direct calls to file's constructor and replace them with calls to
open(), as this is considered best practice.
2007-05-25 20:17:15 +00:00
Lars Gustäbel c64e40215d This is the implementation of POSIX.1-2001 (pax) format read/write
support.

The TarInfo class now contains all necessary logic to process and
create tar header data which has been moved there from the TarFile
class. The fromtarfile() method was added. The new path and linkpath
properties are aliases for the name and linkname attributes in
correspondence to the pax naming scheme.

The TarFile constructor and classmethods now accept a number of
keyword arguments which could only be set as attributes before (e.g.
dereference, ignore_zeros). The encoding and pax_headers arguments
were added for pax support. There is a new tarinfo keyword argument
that allows using subclassed TarInfo objects in TarFile.

The boolean TarFile.posix attribute is deprecated, because now three
tar formats are supported. Instead, the desired format for writing is
specified using the constants USTAR_FORMAT, GNU_FORMAT and PAX_FORMAT
as the format keyword argument. This change affects TarInfo.tobuf()
as well.

The test suite has been heavily reorganized and partially rewritten.
A new testtar.tar was added that contains sample data in many formats
from 4 different tar programs.

Some bugs and quirks that also have been fixed:
Directory names no longer have a trailing slash in TarInfo.name or
TarFile.getnames().
Adding the same file twice does not create a hardlink file member.
The TarFile constructor no longer needs a name argument.
The TarFile._mode attribute was renamed to mode and contains either
'r', 'w' or 'a'.
2007-03-13 10:47:19 +00:00
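
A short sketch of the format keyword argument, the pax_headers argument, and the path alias described in this commit. It assumes modern Python 3 and an in-memory buffer; the member name and the "comment" header are made-up sample values.

```python
import io
import tarfile

buf = io.BytesIO()

# Select the output format with the format keyword instead of the
# deprecated boolean posix attribute. pax_headers supplies global headers.
with tarfile.open(
    fileobj=buf,
    mode="w",
    format=tarfile.PAX_FORMAT,
    pax_headers={"comment": "demo"},  # sample value, purely illustrative
) as tar:
    data = b"x"
    info = tarfile.TarInfo(name="dir/file.txt")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    member = tar.getmember("dir/file.txt")
    member_path = member.path            # alias for member.name
    global_comment = tar.pax_headers.get("comment")
```

Swapping tarfile.PAX_FORMAT for tarfile.USTAR_FORMAT or tarfile.GNU_FORMAT selects one of the other two writers; the global pax header is only meaningful for the pax format.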
Lars Gustäbel 3f8aca1164 Patch #1652681: create nonexistent files in append mode and
allow appending to empty files.
2007-02-06 18:38:13 +00:00
Lars Gustäbel d2e22903d3 Patch #1507247: tarfile.py: use current umask for intermediate
directories.
2007-01-23 11:17:33 +00:00
Lars Gustäbel a7ba6fc548 Patch #1504073: Fix tarfile.open() for mode "r" with a fileobj argument.
Will backport to 2.5.
2006-12-27 10:30:46 +00:00
Lars Gustäbel a4b2381b20 Patch #1262036: Prevent TarFiles from being added to themselves under
certain conditions.

Will backport to 2.5.
2006-12-23 17:57:23 +00:00
Lars Gustäbel 6baa502769 Patch #1230446: tarfile.py: fix ExFileObject so that read() and tell()
work correctly together with readline().

Will backport to 2.5.
2006-12-23 16:40:13 +00:00
Georg Brandl ded1c4df0b Testcase for patch #1484695. 2006-12-20 11:55:16 +00:00
Georg Brandl ebbeed781d Patch #1484695: The tarfile module now raises a HeaderError exception
if a buffer given to frombuf() is invalid.
2006-12-19 22:06:46 +00:00
Georg Brandl 87fa559479 Patch #1610437: fix a tarfile bug with long filename headers. 2006-12-06 22:21:18 +00:00
Georg Brandl 3354f285b9 Patch #1583880: fix tarfile's problems with long names and posix/
GNU modes.
2006-10-29 09:16:12 +00:00
Georg Brandl a32e0a099b Patch [ 1583506 ] tarfile.py: 100-char filenames are truncated 2006-10-24 16:54:16 +00:00
Georg Brandl 35207712dc Fix tarfile depending on buggy int('1\0', base) behavior. 2006-10-12 12:03:07 +00:00
Neal Norwitz 8a519392d5 Fix bug #1543303, tarfile adds padding that breaks gunzip.
Patch # 1543897.

Will backport to 2.5
2006-08-21 17:59:46 +00:00
Tim Peters a05f6e244a _Stream.close(): Try to kill struct.pack() warnings when
writing the crc to file on the "PPC64 Debian trunk" buildbot
when running test_tarfile.

This is again a case where the native zlib crc is an unsigned
32-bit int, but the Python wrapper implicitly casts it to
signed C long, so that "the sign bit looks different" on
different platforms.
2006-08-02 05:20:08 +00:00
Neal Norwitz 4a5fbda66d Part of SF patch #1484695. This removes dead code. The chksum was
already verified in .frombuf() on the lines above. If there was
a problem, an exception would have been raised, so there was no way
this condition could have been true.
2006-07-10 00:23:17 +00:00
Georg Brandl e895318ee2 Always close BZ2Proxy object. Remove unnecessary struct usage. 2006-05-27 14:02:03 +00:00
Tim Peters 8a299d25ec Whitespace normalization. 2006-05-19 19:16:34 +00:00
Georg Brandl e4751e3cdc Amendments to patch #1484695. 2006-05-18 06:11:19 +00:00
Georg Brandl 49c8f4cf36 [ 1488881 ] tarfile.py: support for file-objects and bz2 (cp. #1488634) 2006-05-15 19:30:35 +00:00
Georg Brandl 38c6a22f38 Patch #1484695: Update the tarfile module to version 0.8. This fixes
a couple of issues, notably handling of long file names using the
GNU LONGNAME extension.
2006-05-10 16:26:03 +00:00
Neal Norwitz f339654280 Patch #1338314, Bug #1336623: fix tarfile so it can extract
REGTYPE directories from tarfiles written by old programs.

Will backport.
2005-10-28 05:52:22 +00:00
Neal Norwitz b0e32e2b71 Fix SF bug # 1330039, patch # 1331635 from Lars Gustaebel (tarfile maintainer)
Problem: if two files are assigned the same inode
number by the filesystem, the second one will be added
as a hardlink to the first, which means that the
content will be lost.

The patched code checks whether the file's st_nlink is
greater than 1, so hardlinks are created only for files
that actually have several links pointing to them, which
is what GNU tar does.

Will backport.
2005-10-20 04:50:13 +00:00
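
The check described in this commit can be sketched roughly as below. This is not the tarfile source: classify_member and seen_inodes are hypothetical names, and SimpleNamespace objects stand in for os.stat() results so the example runs without touching the filesystem.

```python
import stat
from types import SimpleNamespace


def classify_member(st, arcname, seen_inodes):
    """Decide whether a file should be stored as a hardlink member.

    st          -- object with st_mode, st_ino, st_dev, st_nlink (like os.stat())
    arcname     -- name the file will get inside the archive
    seen_inodes -- dict mapping (st_ino, st_dev) to the first arcname seen
    """
    if not stat.S_ISREG(st.st_mode):
        return "other"
    inode = (st.st_ino, st.st_dev)
    # Only treat the file as a hardlink if it really has several links.
    if st.st_nlink > 1 and inode in seen_inodes and seen_inodes[inode] != arcname:
        return "hardlink"
    seen_inodes[inode] = arcname
    return "regular"


def fake_stat(ino, nlink):
    # Stand-in for os.stat(); values are invented for the demonstration.
    return SimpleNamespace(st_mode=stat.S_IFREG | 0o644, st_ino=ino,
                           st_dev=1, st_nlink=nlink)


# Two distinct files that the filesystem reports with the same inode number.
seen = {}
colliding = [classify_member(fake_stat(7, 1), "a", seen),
             classify_member(fake_stat(7, 1), "b", seen)]

# Two names that genuinely point at the same file.
seen = {}
linked = [classify_member(fake_stat(9, 2), "c", seen),
          classify_member(fake_stat(9, 2), "d", seen)]
```

With the st_nlink test in place, the colliding pair is stored as two regular files and keeps its content, while the genuinely linked pair still produces a hardlink member.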
Raymond Hettinger a1d09e2ded Minor cleanup. 2005-09-11 16:34:05 +00:00
Martin v. Löwis 5dbdc59577 Patch #1168594: set sizes of non-regular files to zero. Fixes #1167128.
Will backport to 2.4.
2005-08-27 10:07:56 +00:00
Martin v. Löwis faffa15842 Revert previous checkin. 2005-08-24 06:43:09 +00:00
Martin v. Löwis bc3b06087c Patch #1262036: Make tarfile name absolute. Fixes #1257255.
Will backport to 2.4.
2005-08-24 06:06:52 +00:00
Georg Brandl 7eb4b7d177 Fix all wrong instances of "it's". 2005-07-22 21:49:32 +00:00
Tim Peters eba28bea9b Whitespace normalization. 2005-03-28 01:08:02 +00:00
Martin v. Löwis 78be7df9e4 Patch #918101: Add tarfile open mode r|* for auto-detection of the
stream compression; add, for symmetry reasons, r:* as a synonym of r.
2005-03-05 12:47:42 +00:00
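
A minimal sketch of the r|* mode on modern Python 3: a gzip-compressed archive is built in memory and then read back as a stream without naming the compression. The helper names make_gzip_tar and read_stream are hypothetical.

```python
import io
import tarfile


def make_gzip_tar(name: str, data: bytes) -> io.BytesIO:
    """Build a small gzip-compressed tar archive in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


def read_stream(stream) -> dict:
    """Read every regular file from a tar stream, auto-detecting compression."""
    contents = {}
    # "r|*" reads sequentially and works out the compression itself.
    with tarfile.open(fileobj=stream, mode="r|*") as tar:
        for member in tar:
            if member.isfile():
                contents[member.name] = tar.extractfile(member).read()
    return contents


stream_contents = read_stream(make_gzip_tar("note.txt", b"streamed"))
```

Because the stream is read strictly front to back, each member has to be consumed while iterating; seeking back to an earlier member is not possible in this mode.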
Martin v. Löwis 00a73e7715 Patch #1043890: tarfile: add extractall() method. 2005-03-04 19:40:34 +00:00
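
A brief sketch of extractall() on modern Python 3, using a temporary directory and an in-memory archive. The helper names are hypothetical, and passing filter="data" when the interpreter supports it is an assumption about current good practice, not something this commit introduced.

```python
import io
import os
import tarfile
import tempfile


def make_archive(files: dict) -> io.BytesIO:
    """Build an uncompressed tar archive in memory from {name: bytes}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


def extract_everything(archive, dest: str) -> list:
    """Extract all members into dest and return the resulting file names."""
    # Newer Python versions offer extraction filters; use one if available.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(fileobj=archive, mode="r") as tar:
        tar.extractall(path=dest, **extra)
    return sorted(os.listdir(dest))


with tempfile.TemporaryDirectory() as tmp:
    extracted = extract_everything(
        make_archive({"a.txt": b"A", "b.txt": b"B"}), tmp
    )
```

Archives from untrusted sources can contain paths that escape the destination directory, so extracting them without inspection or a filter is best avoided.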
Martin v. Löwis 637431bf14 Patch #1103407: Properly deal with tarfile iterators when untarring
symbolic links on Windows. Fixes #1100429. Will backport to 2.4.
2005-03-03 23:12:42 +00:00
Martin v. Löwis df24153f65 Patch #1107973: tarfile.ExFileObject iterators. 2005-03-03 08:17:42 +00:00
Guido van Rossum 75b64e65f1 Use decorators. 2005-01-16 00:16:11 +00:00
Raymond Hettinger a617271dbd Use cStringIO where available. 2004-12-31 19:15:26 +00:00
Andrew M. Kuchling 8bc462fcaf [Patch #1043972, for bug #1017553] filemode() returns an incorrect value for the mode 07111 2004-10-20 11:48:42 +00:00
Martin v. Löwis f3c5611fef Patch #1029061: Always extract member names from the tarinfo. 2004-09-18 09:08:52 +00:00
Martin v. Löwis c11d6f13ae Patch #1014992: Never return more than a line from readline.
Will backport to 2.3.
2004-08-25 10:52:58 +00:00
Martin v. Löwis c234a52458 Flush bz2 data even if nothing had been written so far. Fixes #1013882.
Will backport to 2.3.
2004-08-22 21:28:33 +00:00
Martin v. Löwis 61d77e0d97 Replace tricky and/or with straight-forward if:else: 2004-08-20 06:35:46 +00:00
Martin v. Löwis 75b9da4aaf Patch #995126: Correct directory size, and generate GNU tarfiles by default. 2004-08-18 13:57:44 +00:00
Neal Norwitz 0260519c52 Remove unused variables 2004-07-20 22:31:34 +00:00
Neal Norwitz d96d1015ef SF #918101, allow files >= 8 GB using GNU extension 2004-07-20 22:23:02 +00:00
Neal Norwitz a4f651a2ae SF #857297 and 916874, improve handling of hard links when extracting 2004-07-20 22:07:44 +00:00
Neal Norwitz 0662f8a5ea SF #846659, fix bufsize violation and GNU longname/longlink extensions 2004-07-20 21:54:18 +00:00
Andrew M. Kuchling 864bba1981 [Patch 988444]
Read multiple special headers
- fixed/improved handling of extended/special headers
in read-mode (adding new extended headers should be
less painful now).
- improved nts() function.
- removed TarFile.chunks data structure which is not
(and was never) needed.
- fixed TarInfo.tobuf(), fields could overflow with too
large values, values are now clipped.
2004-07-10 22:02:11 +00:00
Andrew M. Kuchling 6e4f7a82da [Bug #812325] tarfile.close() can write out more bytes to the output
than are specified by the buffer size. The patch calls .__write()
to ensure that any full blocks are written out.
2004-01-02 15:44:29 +00:00