forever on incomplete input. That caused tarfile.open() to hang when used
with mode 'r' or 'r:bz2' and a fileobj argument that contained no data or
partial bzip2 compressed data.
specify an error handling scheme for character conversion. Additional
scheme "utf-8" in read mode. Unicode input filenames are now
supported by design. The values of the pax_headers dictionary are now
limited to unicode objects.
Fixed: The prefix field is no longer used in PAX_FORMAT (in
conformance with POSIX).
Fixed: In read mode use a possible pax header size field.
Fixed: Strip trailing slashes from pax header name values.
Fixed: Give values in user-specified pax_headers precedence when
writing.
Added unicode tests. Added pax/regtype4 member to testtar.tar all
possible number fields in a pax header.
Added two chapters to the documentation about the different formats
tarfile.py supports and how unicode issues are handled.
support.
The TarInfo class now contains all necessary logic to process and
create tar header data which has been moved there from the TarFile
class. The fromtarfile() method was added. The new path and linkpath
properties are aliases for the name and linkname attributes in
correspondence to the pax naming scheme.
The TarFile constructor and classmethods now accept a number of
keyword arguments which could only be set as attributes before (e.g.
dereference, ignore_zeros). The encoding and pax_headers arguments
were added for pax support. There is a new tarinfo keyword argument
that allows using subclassed TarInfo objects in TarFile.
The boolean TarFile.posix attribute is deprecated, because now three
tar formats are supported. Instead, the desired format for writing is
specified using the constants USTAR_FORMAT, GNU_FORMAT and PAX_FORMAT
as the format keyword argument. This change affects TarInfo.tobuf()
as well.
The test suite has been heavily reorganized and partially rewritten.
A new testtar.tar was added that contains sample data in many formats
from 4 different tar programs.
Some bugs and quirks that also have been fixed:
Directory names do no longer have a trailing slash in TarInfo.name or
TarFile.getnames().
Adding the same file twice does not create a hardlink file member.
The TarFile constructor does no longer need a name argument.
The TarFile._mode attribute was renamed to mode and contains either
'r', 'w' or 'a'.
failures on Windows buildbots, but it's hard to know how since the regrtest
failure output is useless here, and it never fails when a buildbot slave runs
test_tarfile the second time in verbose mode.
the file object in binary mode.
The Windows buildbot slaves shouldn't swap themselves to death
anymore. However, test_tarfile may still fail because of a
temp directory left behind from a previous failing run.
Windows buildbot owners may need to remove that directory
by hand.
Problem: if two files are assigned the same inode
number by the filesystem, the second one will be added
as a hardlink to the first, which means that the
content will be lost.
The patched code checks if the file's st_nlink is
greater 1. So only for files that actually have several
links pointing to them hardlinks will be created, which
is what GNU tar does.
Will backport.
code use proper functions to get paths.
Changed the name of tar file that is searched for to be absolute (i.e., not use
os.extsep) since filename is locked in based on name of file in CVS
(testtar.tar).
Closes bug #731403 .
and test_support.run_classtests() into run_unittest()
and use it wherever possible.
Also don't use "from test.test_support import ...", but
"from test import test_support" in a few spots.
From SF patch #662807.
- the test was sloppy about filenames: "0-REGTYPE-TEXT" was used where
the archive held "/0-REGTYPE-TEXT".
- tarfile extracts all files in binary mode, but the test expected to be able to
read and compare text files in text mode. Use universal text mode.