Commit Graph

252 Commits

Author SHA1 Message Date
morotti 6efd95c465
gh-117151: increase default buffer size of shutil.copyfileobj() to 256k. (GH-119783)
* gh-117151: increase default buffer size of shutil.copyfileobj() to 256k.

it was set to 16k in the 1990s.
it was raised to 64k in 2019. the discussion at the time mentioned another 5% improvement by raising to 128k and settled for a very conservative setting.

it's 2024 now, I think it should be revisited to match modern hardware. I am measuring 0-15% performance improvement when raising to 256k on various types of disk. there is no downside as far as I can tell.

this function is only intended for sequential copy of full files (or file like objects). it's the typical use case that benefits from larger operations.

for reference, I came across this function while trying to profile pip that is using it to copy files when installing python packages.

* add news

---------

Co-authored-by: rmorotti <romain.morotti@man.com>
2024-10-04 16:51:22 -07:00
Jakub Kulík b169cf394f
gh-86009: Fix solaris detection in `_USE_CP_SENDFILE` check (GH-124289) 2024-09-24 22:23:17 +02:00
Jakub Kulík 8f82d9aa21
bpo-41843: Reenable use of sendfile in shutil module on Solaris (GH-23893) 2024-09-19 14:47:05 +00:00
Peter Bierma 9dbd123755
gh-123084: Turn `shutil.ExecError` into a deprecated alias of `RuntimeError` (#123125) 2024-08-21 00:39:24 +00:00
Xie Yanbo b6c80e21c7
Fix typos in comments and docstring (#122720)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-08-07 07:39:16 +01:00
Barney Gale 69058e20e4
GH-73991: Use same signature for `shutil._rmtree_[un]safe()`. (#120517)
Preparatory work for moving `_rmtree_unsafe()` and `_rmtree_safe_fd()` to
`pathlib._os` so that they can be used from both `shutil` and `pathlib`.

Move implementation-specific setup from `rmtree()` into the safe/unsafe
functions, and give them the same signature `(path, dir_fd, onexc)`.

In the tests, mock `os.open` rather than `_rmtree_safe_fd()` to ensure the
FD-based walk is used, and replace a couple references to
`shutil._use_fd_functions` with `shutil.rmtree.avoids_symlink_attacks`
(which has the same value).

No change of behaviour.
2024-06-18 22:15:18 +01:00
Barney Gale 53b1981fb0
GH-89727: Fix `shutil.rmtree()` recursion error on deep trees (#119808)
Implement `shutil._rmtree_safe_fd()` using a list as a stack to avoid emitting recursion errors on deeply nested trees.

`shutil._rmtree_unsafe()` was fixed in a150679f90.
2024-06-01 19:49:12 +01:00
Barney Gale a150679f90
GH-89727: Partially fix `shutil.rmtree()` recursion error on deep trees (#119634)
Make `shutil._rmtree_unsafe()` call `os.walk()`, which is implemented
without recursion.

`shutil._rmtree_safe_fd()` is not affected and can still raise a recursion
error.

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
2024-05-29 20:11:30 +00:00
Andrew Zipperer a6b610a94b
docs: typo: tiny grammar change: "pointed by" -> "pointed to by" (#118411)
* docs: tiny grammar change: "pointed by" -> "pointed to by"

This commit uses "file pointed to by" to replace "file pointed by" in
 - doc for shutil.copytree
 - docstring for shutil.copytree
 - docstring _abc.PathBase.open
 - docstring for pathlib.Path.open
 - doc for os.copy_file_range
 - doc for os.splice

The docs use "file pointed to by" more frequently than
"file pointed by". So, this commit replaces the uses of
"file pointed by" in order to make the uses consistent
through the docs.

```bash
$ grep -ri 'pointed to by' cpython/
```
yields more results than
```bash
$ grep -ri 'pointed by' cpython/
```

Separately:

There are two occurrences of "tree pointed by":
 - cpython/Doc/library/xml.etree.elementtree.rst for
     `xml.etree.ElementInclude.include`
 - cpython/Lib/xml/etree/ElementInclude.py for `include`

For those uses of "tree pointed by", I expect "tree pointed to by"
instead. However, I found enough uses online of (a) "tree pointed by"
rather than (b) "tree pointed to by" to convince me that (a) is in
common use.

So, this commit does not replace those occurrences of "tree pointed by"
to "tree pointed to by". But I will replace them if a reviewer
believes it is correct to replace them.

* docs: typo: "exists and executable" -> "exists and is executable"

---------

Co-authored-by: Andrew-Zipperer <atzipperer@gmail.com>
2024-05-02 05:37:12 +00:00
tahia 8974a63f5e
bpo-18108: Adding dir_fd and follow_symlinks keyword args to shutil.chown (GH-15811)
* Adding dir_fd and follow_symlinks keyword args to shutil.chown
* Extending test_shutil.TestShutil.test_chown to include new kwargs
* Updating shutil.chown documentation

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Berker Peksag <berker.peksag@gmail.com>
Co-authored-by: Zachary Ware <zachary.ware@gmail.com>
2024-04-22 18:23:36 +00:00
Serhiy Storchaka aa7bcf284f
gh-116401: Fix blocking os.fwalk() and shutil.rmtree() on opening a named pipe (GH-116421) 2024-03-13 11:40:28 +02:00
Malcolm Smith 872c0714fc
gh-71052: Change Android's `sys.platform` from "linux" to "android"
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
2024-03-11 19:25:39 +00:00
Dai Wentao da8f9fb2ea
gh-113803: Fix inaccurate documentation for shutil.move when dst is an existing directory (#113837)
* fix the usage of dst and destination in shutil.move doc
* update shutil.move doc
2024-02-04 13:42:58 -05:00
Jeffrey Kintscher c66b577d9f
bpo-26791: Update shutil.move() to provide the same symlink move behavior as the mv shell when moving a symlink into a directory that is the target of the symlink (GH-21759) 2023-12-27 16:23:42 +00:00
Serhiy Storchaka 6e02d79f96
gh-113188: Fix shutil.copymode() on Windows (GH-113189)
Previously it worked differently if dst is a symbolic link:
it modified the permission bits of dst itself rather than the file
it points to if follow_symlinks is true or src is not a symbolic link,
and did nothing if follow_symlinks is false and src is a symbolic link.

Also document similar changes in shutil.copystat().
2023-12-23 11:07:54 +00:00
Zackery Spytz 11d88a178b
bpo-35332: Handle os.close() errors in shutil.rmtree() (GH-23766)
* Ignore os.close() errors when ignore_errors is True.
* Pass os.close() errors to the error handler if specified.
* os.close no longer retried after error.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2023-12-05 19:09:39 +02:00
Serhiy Storchaka 563ccded6e
gh-94692: Only catch OSError in shutil.rmtree() (#112756)
Previously a symlink attack resistant version of shutil.rmtree() could ignore
or pass to the error handler arbitrary exception when invalid arguments
were provided.
2023-12-05 16:40:49 +01:00
Jeffrey Kintscher 268415bbb3
gh-81441: shutil.rmtree() FileNotFoundError race condition (GH-14064)
Ignore missing files and directories while enumerating
directory entries in shutil.rmtree().

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2023-12-05 09:33:51 +00:00
Irit Katriel 97857ac058
gh-112645: remove deprecation warning for use of onerror in shutil.rmtree (#112659) 2023-12-03 14:02:37 +00:00
Alex Waygood bfe7e72522
gh-109653: Defer importing `warnings` in several modules (#110286) 2023-10-04 06:09:43 +01:00
Charles Machalow 29b875bb93
gh-109590: Update shutil.which on Windows to prefer a PATHEXT extension on executable files (GH-109995)
The default arguments for shutil.which() request an executable file, but extensionless files are not executable on Windows and should be ignored.
2023-10-02 09:27:30 +01:00
Victor Stinner 0def8c712b
gh-109748: Fix again venv test_zippath_from_non_installed_posix() (#110149)
Call also copy_python_src_ignore() on listdir() names.

shutil.copytree(): replace set() with an empty tuple. An empty tuple
becomes a constant in the compiler and checking if an item is in an
empty tuple is cheap.
2023-09-30 20:23:26 +02:00
6t8k a86df298df
gh-99203: shutil.make_archive(): restore select CPython <= 3.10.5 behavior (GH-99802)
Restore following CPython <= 3.10.5 behavior of shutil.make_archive()
that went away as part of gh-93160:

Do not create an empty archive if root_dir is not a directory, and, in
that case, raise FileNotFoundError or NotADirectoryError regardless
of format choice. Beyond the brought-back behavior, the function may
now also raise these exceptions in dry_run mode.
2023-08-16 10:00:03 +03:00
Steve Dower cda1bd3c9d
gh-88745: Add _winapi.CopyFile2 and update shutil.copy2 to use it (GH-105055) 2023-05-30 11:00:29 +01:00
Allan Lago 3df3b91e6a
gh-82814: fix shutil access error on WSL (#103790)
gh-82814: Adds `errno.EACCES` to the list of ignored errors on
`_copyxattr`.  EPERM and EACCES are different constants but
in general should be treated the same.

News entry authored by: Gregory P. Smith <greg@krypto.org>
2023-04-25 00:45:38 +00:00
Petr Viktorin af53046995
gh-102950: Implement PEP 706 – Filter for tarfile.extractall (#102953) 2023-04-24 10:58:06 +02:00
Irit Katriel 8026cda10c
gh-102828: set stacklevel on deprecation warning (#103422) 2023-04-11 09:31:39 +01:00
Charles Machalow 935aa45235
GH-75586: Make shutil.which() on Windows more consistent with the OS (GH-103179) 2023-04-04 23:24:13 +01:00
Irit Katriel 7f760c2fca
gh-102828: emit deprecation warning for onerror arg to shutil.rmtree (#102850) 2023-03-21 11:08:46 +00:00
Irit Katriel d51a6dc28e
gh-102828: add onexc arg to shutil.rmtree. Deprecate onerror. (#102829) 2023-03-19 18:33:51 +00:00
Nick Drozd 024ac542d7
bpo-45975: Simplify some while-loops with walrus operator (GH-29347) 2022-11-26 14:33:25 -08:00
Charles Machalow 1b2de89bce
gh-99547: Add isjunction methods for checking if a path is a junction (GH-99548) 2022-11-22 17:19:34 +00:00
Zackery Spytz 5ff81da6d3
bpo-38523: ignore_dangling_symlinks does not apply recursively (GH-22937) 2022-11-07 11:45:16 +00:00
Serhiy Storchaka e3ef400be7
gh-74696: Pass root_dir to custom archivers which support it (GH-94251)
Co-authored-by: Éric <merwok@netwok.org>
2022-10-05 12:48:59 +03:00
Oleg Iarygin ed4441567e
gh-94844: Add pathlib support to shutil archive management (GH-94846)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
2022-07-20 18:55:12 +03:00
Serhiy Storchaka fda4b2f063
gh-74696: Do not change the current working directory in shutil.make_archive() if possible (GH-93160)
It is no longer changed when create a zip or tar archive.

It is still changed for custom archivers registered with shutil.register_archive_format()
if root_dir is not None.

Co-authored-by: Éric <merwok@netwok.org>
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
2022-06-22 10:47:25 +02:00
Lucinda May Phipps 87b9b4e060
shutil._copyfileobj_readinto: tiny speedup (#92377) 2022-05-20 08:14:05 -07:00
Jack DeVries f33e2c87a8
gh-88513: clarify shutil.copytree's dirs_exist_ok arg (GH-91434)
* add a paragraph to document this kwarg in detail
* update docstring in the source accordingly
2022-04-11 17:57:52 -07:00
Serhiy Storchaka 02fbaf4887
bpo-46245: Add optional parameter dir_fd in shutil.rmtree() (GH-30365) 2022-03-09 14:29:33 +02:00
Lital Natan b77158b4da
bpo-39327: Close file descriptors as soon as possible in shutil.rmtree (GH-31384)
It fixes the "Text File Busy" OSError when using 'rmtree' on a
windows-managed filesystem in via the VirtualBox shared folder
(and possible other scenarios like a windows-managed network file
system).
2022-02-20 18:02:10 +02:00
Filipe Laíns 236e301b8a
bpo-42174: fallback to sane values if the columns or lines are 0 in get_terminal_size (GH-29046)
I considered only falling back when both were 0, but that still seems
wrong, and the highly popular rich[1] library does it this way, so I
thought we should probably inherit that behavior.

[1] https://github.com/willmcgugan/rich

Signed-off-by: Filipe Laíns <lains@riseup.net>

Co-authored-by: Łukasz Langa <lukasz@langa.pl>
2021-10-19 20:42:13 +02:00
andrei kulakov b7eac52b46
bpo-45234: Fix FileNotFound exception raised instead of IsADirectoryError in shutil.copyfile() (GH-28421)
This was a regression from fixing BPO-43219.
2021-09-21 23:53:07 +02:00
Anthony Sottile b08c48e617
bpo-33671 fix orphaned comment in shutil.copyfileobj (GH-27516) 2021-07-31 15:15:45 -04:00
andrei kulakov 248173cc04
bpo-43219: shutil.copyfile, raise a less confusing exception instead of IsADirectoryError (GH-27049)
Fixes the misleading IsADirectoryError to be FileNotFoundError.
2021-07-09 20:47:41 -07:00
Igor Bolshakov f32c7950e0
bpo-43650: Fix MemoryError on zip.read in shutil._unpack_zipfile for large files (GH-25058)
`shutil.unpack_archive()` tries to read the whole file into memory, making no use of any kind of smaller buffer. Process crashes for really large files: I.e. archive: ~1.7G, unpacked: ~10G. Before the crash it can easily take away all available RAM on smaller systems. Had to pull the code form `zipfile.Zipfile.extractall()` to fix this

Automerge-Triggered-By: GH:gpshead
2021-05-17 01:28:21 -07:00
Giampaolo Rodola c0190137ca
bpo-43743 add comment stating _USE_CP_SENDFILE should not be removed (#26024) 2021-05-10 22:45:06 +02:00
Victor Stinner d72e8d4875
bpo-41718: subprocess imports grp and pwd on demand (GH-24987)
The shutil and subprocess modules now only import the grp and pwd
modules when they are needed, which is not the case by default in
subprocess.
2021-03-23 17:42:51 +01:00
Winson Luk 132131b404
bpo-42782: Fail fast for permission errors in shutil.move() (GH-24001)
* Fail fast in shutil.move() to avoid creating destination directories on failure.

Co-authored-by: Zackery Spytz <zspytz@gmail.com>
2021-03-02 12:53:15 -08:00
Michal Čihař e59b2deffd
bpo-42014: shutil.rmtree: call onerror with correct function (GH-22585)
The onerror is supposed to be called with failed function, but in this case lstat is wrongly used instead of open.

Not sure if this needs bug or not...

Automerge-Triggered-By: GH:hynek
2020-11-10 08:06:02 -08:00
Christopher Marchfelder da6f098188
bpo-40592: shutil.which will not return None anymore if ; is the last char in PATHEXT (GH-20088)
shutil.which will not return None anymore for empty str in PATHEXT
Empty PATHEXT will now be defaulted to _WIN_DEFAULT_PATHEXT
2020-10-23 11:08:24 +01:00