Commit Graph

225 Commits

Author SHA1 Message Date
Petr Viktorin af53046995
gh-102950: Implement PEP 706 – Filter for tarfile.extractall (#102953) 2023-04-24 10:58:06 +02:00
Oleg Iarygin 56d055a0d8
gh-74468: [tarfile] Fix incorrect name attribute of ExFileObject (GH-102424)
Co-authored-by: Simeon Visser <svisser@users.noreply.github.com>
2023-03-27 16:21:07 -07:00
Nick Drozd 024ac542d7
bpo-45975: Simplify some while-loops with walrus operator (GH-29347) 2022-11-26 14:33:25 -08:00
Sam Ezeh 78365b8e28
gh-91078: Return None from TarFile.next when the tarfile is empty (GH-91850)
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
2022-11-26 09:57:05 -08:00
Nikita Sobolev faf7dfa656
gh-99325: Remove unused `NameError` handling (#99326) 2022-11-11 09:56:57 +00:00
Yaron de Leeuw 50cd4b6959
bpo-26253: Add compressionlevel to tarfile stream (GH-2962)
`tarfile` already accepts a compressionlevel argument for creating
files. This patch adds the same for stream-based tarfile usage.
The default is 9, the value that was previously hard-coded.
2022-06-25 11:43:54 +03:00
Chris Fernald c1e19421c2
gh-91387: Strip trailing slash from tarfile longname directories (GH-32423)
Co-authored-by: Brett Cannon <brett@python.org>
2022-06-17 15:38:41 -07:00
Joshua Root bf2d44ffb0
bpo-45863: tarfile: don't zero out header fields unnecessarily (GH-29693)
Numeric fields of type float, notably mtime, can't be represented
exactly in the ustar header, so the pax header is used. But it is
helpful to set them to the nearest int (i.e. second rather than
nanosecond precision mtimes) in the ustar header as well, for the
benefit of unarchivers that don't understand the pax header.

Add test for tarfile.TarInfo.create_pax_header to confirm correct
behaviour.
2022-02-09 18:06:19 +01:00
Andrzej Mateja 128ab092ca
bpo-44289: Keep argument file object's current position in tarfile.is_tarfile (GH-26488) 2022-02-09 08:19:16 -08:00
andrei kulakov cfadcc31ea
bpo-21987: Fix TarFile.getmember getting a dir with a trailing slash (GH-30283) 2022-01-21 09:40:32 +02:00
Jack DeVries b6fe857250
bpo-39039: tarfile raises descriptive exception from zlib.error (GH-27766)
* during tarfile parsing, a zlib error indicates invalid data
* tarfile.open now raises a descriptive exception from the zlib error
* this makes it clear to the user that they may be trying to open a
  corrupted tar file
2021-09-29 11:25:48 +02:00
Anthony Sottile 9aea31dedd
bpo-8978: improve tarfile.open error message when lzma / bz2 are missing (GH-24850)
Automerge-Triggered-By: GH:pablogsal
2021-04-27 10:39:01 -07:00
Ethan Furman b5a6db9111
bpo-39717: [tarfile] update nested exception raising (GH-23739)
- `from None` if the new exception uses, or doesn't need, the previous one
- `from e` if the previous exception is still relevant
2020-12-12 13:26:44 -08:00
Julien Palard 4fedd7123e
bpo-12800: tarfile: Restore fix from 011525ee9 (GH-21409)
Restore fix from 011525ee92.
2020-11-25 10:23:17 +01:00
Andrey Doroschenko ec42789e6e
bpo-39693: mention KeyError in tarfile extractfile documentation (GH-18639)
Co-authored-by: Andrey Darascheka <andrei.daraschenka@leverx.com>
2020-10-20 10:05:01 -04:00
Artem Bulgakov 22748a83d9
bpo-41316: Make tarfile follow specs for FNAME (GH-21511)
tarfile writes full path to FNAME field of GZIP format instead of just basename if user specified absolute path. Some archive viewers may process file incorrectly. Also it creates security issue because anyone can know structure of directories on system and know username or other personal information.

RFC1952 says about FNAME:
This is the original name of the file being compressed, with any directory components removed.

So tarfile must remove directory names from FNAME and write only basename of file.

Automerge-Triggered-By: @jaraco
2020-09-07 09:46:33 -07:00
Rishi 5a8d121a1f
bpo-39017: Avoid infinite loop in the tarfile module (GH-21454)
Avoid infinite loop when reading specially crafted TAR files using the tarfile module
(CVE-2019-20907).
2020-07-15 13:51:00 +02:00
William Chargin 674935b8ca
bpo-18819: tarfile: only set device fields for device files (GH-18080)
The GNU docs describe the `devmajor` and `devminor` fields of the tar
header struct only in the context of character and block special files,
suggesting that in other cases they are not populated. Typical utilities
behave accordingly; this patch teaches `tarfile` to do the same.
2020-02-12 11:56:02 -08:00
Serhiy Storchaka 9017e0bd5e bpo-39430: Fix race condition in lazy imports in tarfile. (GH-18161)
Use `from ... import ...` to ensure module is fully loaded before accessing its attributes.
2020-01-24 09:55:52 -08:00
William Woodruff dd754caf14 bpo-29435: Allow is_tarfile to take a filelike obj (GH-18090)
`is_tarfile()` now supports `name` being a file or file-like object.
2020-01-22 18:24:16 -08:00
Raymond Hettinger a694f23948
Add missing docstrings for TarInfo objects (#12555) 2019-03-27 13:16:34 -07:00
CAM Gerlach e680c3db80 bpo-36268: Change default tar format to pax from GNU. (GH-12355) 2019-03-21 16:44:51 +02:00
Anthony Sottile 8377cd4fcd Clean up code which checked presence of os.{stat,lstat,chmod} (#11643) 2019-02-25 23:32:27 +01:00
INADA Naoki 8d130913cb
bpo-34043: Optimize tarfile uncompress performance (GH-8089)
tarfile._Stream has two buffer for compressed and uncompressed data.
Those buffers are not aligned so unnecessary bytes slicing happens
for every reading chunks.

This commit bypass compressed buffering.

In this benchmark [1], user time become 250ms from 300ms.

[1]: https://bugs.python.org/msg320763
2018-07-06 14:06:00 +09:00
hajoscher 12a08c4760 bpo-34010: Fix tarfile read performance regression (GH-8020)
During buffered read, use a list followed by join instead of extending a bytes object.
This is how it was done before but changed in commit b506dc32c1.
2018-07-04 17:13:18 +09:00
INADA Naoki 461a1c4b49
bpo-33842: Remove tarfile.filemode (GH-7661) 2018-06-28 17:10:36 +09:00
Joffrey F 72d9b2be36 bpo-32713: Fix tarfile.itn for large/negative float values. (GH-5434) 2018-02-27 02:02:21 +02:00
Bernhard M. Wiedemann 84521047e4 bpo-30693: zip+tarfile: sort directory listing (#2263)
tarfile and zipfile now sort directory listing to generate tar and zip archives
in a more reproducible way.

See also https://reproducible-builds.org/docs/stable-inputs/ on that topic.
2018-01-31 11:17:10 +01:00
Mike 53f7a7c281 bpo-32297: Few misspellings found in Python source code comments. (#4803)
* Fix multiple typos in code comments

* Add spacing in comments (test_logging.py, test_math.py)

* Fix spaces at the beginning of comments in test_logging.py
2017-12-14 13:04:53 +02:00
Alex Gaynor c7cc14a825 Remove two legacy constants which hopefully have no consumers (#1087)
The data contained in them is nonsensical
2017-04-11 22:41:42 -04:00
Serhiy Storchaka 150cd1916a bpo-29958: Minor improvements to zipfile and tarfile CLI. (#944) 2017-04-07 18:56:12 +03:00
Serhiy Storchaka bdf6b910f9 bpo-29776: Use decorator syntax for properties. (#585) 2017-03-19 08:40:32 +02:00
Serhiy Storchaka 4f76fb16b7 Issue #29210: Removed support of deprecated argument "exclude" in
tarfile.TarFile.add().
2017-01-13 13:25:24 +02:00
Xavier de Gaye f44abdab1e Issue #26937: The chown() method of the tarfile.TarFile class does not fail now
when the grp module cannot be imported, as for example on Android platforms.
2016-12-09 09:33:09 +01:00
Serhiy Storchaka 2f4453eff8 Issue #28449: tarfile.open() with mode "r" or "r:" now tries to open a tar
file with compression before trying to open it without compression.  Otherwise
it had 50% chance failed with ignore_zeros=True.
2016-10-30 20:56:23 +02:00
Serhiy Storchaka a89d22aff3 Issue #28449: tarfile.open() with mode "r" or "r:" now tries to open a tar
file with compression before trying to open it without compression.  Otherwise
it had 50% chance failed with ignore_zeros=True.
2016-10-30 20:52:29 +02:00
Łukasz Langa 04bedfa3ce Issue #27199: TarFile expose copyfileobj bufsize to improve throughput
Patch by Jason Fried.
2016-09-09 19:48:14 -07:00
Larry Hastings 10108a7b9a Issue #27355: Removed support for Windows CE. It was never finished,
and Windows CE is no longer a relevant platform for Python.
2016-09-05 15:11:23 -07:00
Łukasz Langa 5135e9ed51 Merge 3.5, issue #27194 2016-06-11 16:56:18 -07:00
Łukasz Langa e7f27481a8 Issue #27194: superfluous truncate calls in tarfile.py slow down extraction
Patch by Jason Fried.
2016-06-11 16:42:36 -07:00
Lars Gustäbel 7c3e6848f2 Issue #24838: Merge tarfile fix from 3.5. 2016-04-19 08:53:14 +02:00
Lars Gustäbel 0f450abec4 Issue #24838: tarfile's ustar and gnu formats now correctly calculate name and
link field limits for multibyte character encodings like utf-8.
2016-04-19 08:43:17 +02:00
Serhiy Storchaka b6a9c9761c Issue #26778: Fixed "a/an/and" typos in code comment, documentation and error
messages.
2016-04-17 09:39:28 +03:00
Serhiy Storchaka 6a7b3a77b4 Issue #26778: Fixed "a/an/and" typos in code comment and documentation. 2016-04-17 08:32:47 +03:00
Martin Panter 2d2d08d2cc Issue #22468: Merge gettarinfo() doc from 3.5 2016-02-19 23:46:59 +00:00
Martin Panter f817a48d17 Issues #22468, #21996, #22208: Clarify gettarinfo() and TarInfo usage
* The Windows-specific binary notice was probably a Python 2 thing
* Make it more obvious gettarinfo() is based on stat(), and that non-ordinary
  files may need special care
* The file name must be text; suggest dummy arcname as a workaround
* Indicate TarInfo may be used directly, not just via gettarinfo()
2016-02-19 23:34:56 +00:00
Martin Panter 104dcdab59 Issue #23883: Add missing APIs to tarfile.__all__
Patch by Joel Taddei and Jacek Kołodziej.
2016-01-16 06:59:13 +00:00
Serhiy Storchaka a254921cd4 Issue #22227: The TarFile iterator is reimplemented using generator.
This implementation is simpler that using class.
2015-12-19 09:43:14 +02:00
Martin Panter b82032f935 Issue #22341: Drop Python 2 workaround and document CRC initial value
Also align the parameter naming in binascii to be consistent with zlib.
2015-12-11 05:19:29 +00:00
Lars Gustäbel e12aa62d68 Merge with 3.4: Issue #24259: tarfile now raises a ReadError if an archive is truncated inside a data segment. 2015-07-06 09:29:41 +02:00