Commit Graph

218 Commits

Author SHA1 Message Date
Oleg Iarygin ed4441567e
gh-94844: Add pathlib support to shutil archive management (GH-94846)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
2022-07-20 18:55:12 +03:00
Serhiy Storchaka fda4b2f063
gh-74696: Do not change the current working directory in shutil.make_archive() if possible (GH-93160)
It is no longer changed when create a zip or tar archive.

It is still changed for custom archivers registered with shutil.register_archive_format()
if root_dir is not None.

Co-authored-by: Éric <merwok@netwok.org>
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
2022-06-22 10:47:25 +02:00
Lucinda May Phipps 87b9b4e060
shutil._copyfileobj_readinto: tiny speedup (#92377) 2022-05-20 08:14:05 -07:00
Jack DeVries f33e2c87a8
gh-88513: clarify shutil.copytree's dirs_exist_ok arg (GH-91434)
* add a paragraph to document this kwarg in detail
* update docstring in the source accordingly
2022-04-11 17:57:52 -07:00
Serhiy Storchaka 02fbaf4887
bpo-46245: Add optional parameter dir_fd in shutil.rmtree() (GH-30365) 2022-03-09 14:29:33 +02:00
Lital Natan b77158b4da
bpo-39327: Close file descriptors as soon as possible in shutil.rmtree (GH-31384)
It fixes the "Text File Busy" OSError when using 'rmtree' on a
windows-managed filesystem in via the VirtualBox shared folder
(and possible other scenarios like a windows-managed network file
system).
2022-02-20 18:02:10 +02:00
Filipe Laíns 236e301b8a
bpo-42174: fallback to sane values if the columns or lines are 0 in get_terminal_size (GH-29046)
I considered only falling back when both were 0, but that still seems
wrong, and the highly popular rich[1] library does it this way, so I
thought we should probably inherit that behavior.

[1] https://github.com/willmcgugan/rich

Signed-off-by: Filipe Laíns <lains@riseup.net>

Co-authored-by: Łukasz Langa <lukasz@langa.pl>
2021-10-19 20:42:13 +02:00
andrei kulakov b7eac52b46
bpo-45234: Fix FileNotFound exception raised instead of IsADirectoryError in shutil.copyfile() (GH-28421)
This was a regression from fixing BPO-43219.
2021-09-21 23:53:07 +02:00
Anthony Sottile b08c48e617
bpo-33671 fix orphaned comment in shutil.copyfileobj (GH-27516) 2021-07-31 15:15:45 -04:00
andrei kulakov 248173cc04
bpo-43219: shutil.copyfile, raise a less confusing exception instead of IsADirectoryError (GH-27049)
Fixes the misleading IsADirectoryError to be FileNotFoundError.
2021-07-09 20:47:41 -07:00
Igor Bolshakov f32c7950e0
bpo-43650: Fix MemoryError on zip.read in shutil._unpack_zipfile for large files (GH-25058)
`shutil.unpack_archive()` tries to read the whole file into memory, making no use of any kind of smaller buffer. Process crashes for really large files: I.e. archive: ~1.7G, unpacked: ~10G. Before the crash it can easily take away all available RAM on smaller systems. Had to pull the code form `zipfile.Zipfile.extractall()` to fix this

Automerge-Triggered-By: GH:gpshead
2021-05-17 01:28:21 -07:00
Giampaolo Rodola c0190137ca
bpo-43743 add comment stating _USE_CP_SENDFILE should not be removed (#26024) 2021-05-10 22:45:06 +02:00
Victor Stinner d72e8d4875
bpo-41718: subprocess imports grp and pwd on demand (GH-24987)
The shutil and subprocess modules now only import the grp and pwd
modules when they are needed, which is not the case by default in
subprocess.
2021-03-23 17:42:51 +01:00
Winson Luk 132131b404
bpo-42782: Fail fast for permission errors in shutil.move() (GH-24001)
* Fail fast in shutil.move() to avoid creating destination directories on failure.

Co-authored-by: Zackery Spytz <zspytz@gmail.com>
2021-03-02 12:53:15 -08:00
Michal Čihař e59b2deffd
bpo-42014: shutil.rmtree: call onerror with correct function (GH-22585)
The onerror is supposed to be called with failed function, but in this case lstat is wrongly used instead of open.

Not sure if this needs bug or not...

Automerge-Triggered-By: GH:hynek
2020-11-10 08:06:02 -08:00
Christopher Marchfelder da6f098188
bpo-40592: shutil.which will not return None anymore if ; is the last char in PATHEXT (GH-20088)
shutil.which will not return None anymore for empty str in PATHEXT
Empty PATHEXT will now be defaulted to _WIN_DEFAULT_PATHEXT
2020-10-23 11:08:24 +01:00
Saiyang Gou 7514f4f625
bpo-39184: Add audit events to functions in `fcntl`, `msvcrt`, `os`, `resource`, `shutil`, `signal`, `syslog` (GH-18407) 2020-02-13 07:47:42 +00:00
mbarkhau 88704334e5 bpo-39390 shutil: fix argument types for ignore callback (GH-18122) 2020-01-24 15:51:16 +01:00
Bruno P. Kinoshita 9bbcbc9f6d bpo-38688, shutil.copytree: consume iterator and create list of entries to prevent infinite recursion (GH-17098) 2019-11-27 09:10:37 +08:00
Giampaolo Rodola 94e165096f
bpo-38319: Fix shutil._fastcopy_sendfile(): set sendfile() max block size (GH-16491) 2019-10-01 11:40:54 +08:00
Maxwell A McKinnon cf57cabef8 bpo-32689: Updates shutil.move to allow for Path objects to be used as source arg (GH-15326)
Important work originally done by @emilyemorehouse two years ago and nearly ready to go in.

This bug has affected many people and in some cases has been a dealbreaker to the adoption of the otherwise wonderful pathlib and PEP519. https://stackoverflow.com/questions/33625931/copy-file-with-pathlib-in-python.

This adds the outstanding test request from that PR @vstinner (https://github.com/python/cpython/pull/5393).

Test fails without the change, passes with it, along with every other test in test_shutil.

Some variants were experimented with to make the one line change and the most performant one was picked.


# Added Test for PathLike directory destination, the current fail case

```
Lib/test/test_shutil.py::TestMove::test_move_file_pathlike FAILED                                                               [100%]

============================================================== FAILURES ===============================================================
__________________________________________________ TestMove.test_move_file_pathlike ___________________________________________________

self = <test.test_shutil.TestMove testMethod=test_move_file_pathlike>

    def test_move_file_pathlike(self):
        # Move a file to another location on the same filesystem.
        src = pathlib.Path(self.src_file)
>       self._check_move_file(src, self.dst_dir, self.dst_file)

Lib/test/test_shutil.py:1563:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Lib/test/test_shutil.py:1545: in _check_move_file
    shutil.move(src, dst)
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/shutil.py:562: in move
    real_dst = os.path.join(dst, _basename(src))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

path = PosixPath('/var/folders/r2/psq74t5x3nbfzlph8bh2pvdw0000gn/T/tmp9ie0wh9_/foo')

    def _basename(path):
        # A basename() variant which first strips the trailing slash, if present.
        # Thus we always get the last component of the path, even for directories.
        sep = os.path.sep + (os.path.altsep or '')
>       return os.path.basename(path.rstrip(sep))
E       AttributeError: 'PosixPath' object has no attribute 'rstrip'

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/shutil.py:526: AttributeError
============================================== 1 failed, 102 deselected in 0.30 seconds ===============================================
```

After change:

```
========================================================= test session starts =========================================================
platform darwin -- Python 3.7.4, pytest-5.0.1, py-1.8.0, pluggy-0.12.0 -- /Users/maxwellmckinnon/.venvs/TA3.7/bin/python3.7
cachedir: .pytest_cache
rootdir: /Users/maxwellmckinnon/dev/cpython
plugins: cov-2.7.1, mock-1.10.4
collected 103 items / 102 deselected / 1 selected

Lib/test/test_shutil.py::TestMove::test_move_file_pathlike PASSED                                                               [100%]

============================================== 1 passed, 102 deselected in 0.06 seconds ===============================================
```

Running all the tests in test_shutil.py
```
╰─ pytest Lib/test/test_shutil.py -v
========================================================= test session starts =========================================================
platform darwin -- Python 3.7.4, pytest-5.0.1, py-1.8.0, pluggy-0.12.0 -- /Users/maxwellmckinnon/.venvs/TA3.7/bin/python3.7
cachedir: .pytest_cache
rootdir: /Users/maxwellmckinnon/dev/cpython
plugins: cov-2.7.1, mock-1.10.4
collected 103 items

Lib/test/test_shutil.py::TestShutil::test_chown PASSED                                                                          [  0%]
Lib/test/test_shutil.py::TestShutil::test_copy PASSED                                                                           [  1%]
...
Lib/test/test_shutil.py::TermsizeTests::test_stty_match SKIPPED                                                                 [ 99%]
Lib/test/test_shutil.py::PublicAPITests::test_module_all_attribute PASSED                                                       [100%]

================================================ 96 passed, 7 skipped in 1.25 seconds =================================================
```

# Performance Considerations
Is it considered poor form to get rid of _basename altogether and make use of pathlib in the move function? I'm not sure if the idea is for all these modules to strictly avoid circular dependencies. They are already using os.path which is just as much a citizen in 3.8 as pathlib right?

e.g.

`real_dst = os.path.join(dst, _basename(src))`
becomes
`real_dst = Path(dst) / Path(src).name`

I've looked around and familiarized myself, and I now think importing pathlib here is fine. My only remaining concern is that of performance.

Here's the performance difference for this step. 

```
In [46]: %timeit real_dst = os.path.join("a/b/c", _basename('b/'))
2.71 µs ± 62.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [47]: %timeit real_dst = Path("a/b/c") / Path('b/').name
12.4 µs ± 65.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
```

Is 10us significant or insignificant compared to the least expensive operation this function will do? I don't know. Let's find out.

```
In [55]: %timeit os.rename('/tmp/a/a.txt', '/tmp/a/b.txt'); os.rename('/tmp/a/b.txt', '/tmp/a/a.txt')
124 µs ± 2.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
62us to rename. 10us seems significant enough that we wouldn't want to favor the Path sugar suggestion. 16% speed decrease from adding the 10us.

What do people think? I was hoping to get to use pathlib.Path here, but I suspect for this low level move, it should be as fast as possible, and 16% is not worth one line of sugary code to me.



https://bugs.python.org/issue32689



Automerge-Triggered-By: @gvanrossum
2019-09-30 19:41:16 -07:00
Boris Verhovsky 9488a5289d Clarify that shutil's copy functions can accept path-like values (GH-15141) 2019-09-09 08:51:56 -07:00
Ned Deily 7fcc2088a5 bpo-37834: Prevent shutil.rmtree exception (GH-15602)
when built on non-Windows system without fd system call support,
like older versions of macOS.
2019-08-29 23:20:03 +02:00
Steve Dower df2d4a6f3d
bpo-37834: Normalise handling of reparse points on Windows (GH-15231)
bpo-37834: Normalise handling of reparse points on Windows
* ntpath.realpath() and nt.stat() will traverse all supported reparse points (previously was mixed)
* nt.lstat() will let the OS traverse reparse points that are not name surrogates (previously would not traverse any reparse point)
* nt.[l]stat() will only set S_IFLNK for symlinks (previous behaviour)
* nt.readlink() will read destinations for symlinks and junction points only

bpo-1311: os.path.exists('nul') now returns True on Windows
* nt.stat('nul').st_mode is now S_IFCHR (previously was an error)
2019-08-21 15:27:33 -07:00
Steve Dower 60419a7e96
bpo-37363: Add audit events for a range of modules (GH-14301) 2019-06-24 08:42:54 -07:00
Serhiy Storchaka e9b51c0ad8
bpo-26660, bpo-35144: Fix permission errors in TemporaryDirectory cleanup. (GH-10320)
TemporaryDirectory.cleanup() failed when non-writeable or non-searchable
files or directories were created inside a temporary directory.
2019-05-31 11:30:37 +03:00
Giampaolo Rodola 413d955f8e
bpo-36610: shutil.copyfile(): use sendfile() on Linux only (GH-13675)
...and avoid using it on Solaris as it can raise EINVAL if offset is equal or bigger than the size of the file
2019-05-30 14:05:41 +08:00
Ying Wang a16387ab2d bpo-24564: shutil.copystat(): ignore EINVAL on os.setxattr() (GH-13369) 2019-05-30 11:25:31 +08:00
Olexa Bilaniuk 79efbb7193 bpo-24538: Fix bug in shutil involving the copying of xattrs to read-only files. (PR-13212)
Extended attributes can only be set on user-writeable files, but shutil previously
first chmod()ed the destination file to the source's permissions and then tried to
copy xattrs. This will cause failures if attempting to copy read-only files with
xattrs, as occurs with Git clones on Lustre FS.
2019-05-10 11:22:06 +08:00
Victor Stinner 197f0447e3
bpo-35755: Don't say "to mimick Unix which command behavior" (GH-12861) 2019-04-17 17:44:06 +02:00
Victor Stinner 228a3c99bd
bpo-35755: shutil.which() uses os.confstr("CS_PATH") (GH-12858)
shutil.which() and distutils.spawn.find_executable() now use
os.confstr("CS_PATH") if available instead of os.defpath, if the PATH
environment variable is not set.

Don't use os.confstr("CS_PATH") nor os.defpath if the PATH
environment variable is set to an empty string to mimick Unix 'which'
command behavior.

Changes:

* find_executable() now starts by checking for the executable in the
  current working directly case. Add an explicit
  "if not path: return None".
* Add tests for PATH='' (empty string), PATH=':' and for PATHEXT.
2019-04-17 16:26:36 +02:00
Inada Naoki 4f19030618
bpo-36103: change default buffer size of shutil.copyfileobj() (GH-12115)
It is changed from 16KiB to 64KiB.  The previous default value
is used since 1990.

coreutils chose 128 KiB as minimum buffer size for block device I/O.

But shutil.copyfileobj() can be used for non block devices.
So I choose more conservative value.

As my quick benchmark, performance difference between 64KiB and
128 KiB is up to ~5%.  On the other hand, performance difference
between 32 KiB and 64 KiB can be more than 10% when file is fully
buffered.

This is why 64 KiB is rational value.
2019-03-02 13:31:01 +09:00
Giampaolo Rodola c606a9cbd4
bpo-35652: shutil.copytree(copy_function=...) erroneously pass DirEntry instead of path str (GH-11997) 2019-02-26 12:04:41 +01:00
Anthony Sottile 8377cd4fcd Clean up code which checked presence of os.{stat,lstat,chmod} (#11643) 2019-02-25 23:32:27 +01:00
Giampaolo Rodola 3b0abb0196 bpo-33671: allow setting shutil.copyfile() bufsize globally (GH-12016) 2019-02-25 08:46:40 +09:00
Cheryl Sabella 5680f6546d bpo-18283: Add support for bytes to shutil.which (GH-11818) 2019-02-13 12:25:10 +01:00
jab 9e00d9e88f bpo-20849: add dirs_exist_ok arg to shutil.copytree (patch by Josh Bronson) 2018-12-28 19:03:40 +01:00
Giampaolo Rodola 19c46a4c96
bpo-33695 shutil.copytree() + os.scandir() cache (#7874) 2018-11-12 06:18:15 -08:00
Srinivas Thatiparthy (శ్రీనివాస్ తాటిపర్తి) 2d1bc537fe bpo-35202: Remove unused imports in Lib directory. (GH-10445) 2018-11-10 07:45:28 +02:00
Zsolt Cserna 4f399be0e7 bpo-34260, shutil: fix copy2 and copystat documentation (GH-8523)
Fix the documentation of copy2, as it does not copy file ownership (user and
group), only mode, mtime, atime and flags.

The original text was confusing to developers as it suggested that this
command is the same as 'cp -p', but according to cp(1), '-p' copies file
ownership as well.

Clarify which metadata is copied by shutil.copystat in its docstring.
2018-10-23 12:09:50 +02:00
Giampaolo Rodola c7f02a9659
bpo-33671 / shutil.copyfile: use memoryview() with dynamic size on Windows (#7681)
bpo-33671
* use memoryview() with size == file size on Windows, see https://github.com/python/cpython/pull/7160#discussion_r195405230
* release intermediate (sliced) memoryview immediately
* replace "OSX" occurrences with "macOS"
* add some unittests for copyfileobj()
2018-06-19 08:27:29 -07:00
Giampaolo Rodola 4a172ccc73
bpo-33671: efficient zero-copy for shutil.copy* functions (Linux, OSX and Win) (#7160)
* have shutil.copyfileobj use sendfile() if possible

* refactoring: use ctx manager

* add test with non-regular file obj

* emulate case where file size can't be determined

* reference _copyfileobj_sendfile directly

* add test for offset() at certain position

* add test for empty file

* add test for non regular file dst

* small refactoring

* leave copyfileobj() alone in order to not introduce any incompatibility

* minor refactoring

* remove old test

* update docstring

* update docstring; rename exception class

* detect platforms which only support file to socket zero copy

* don't run test on platforms where file-to-file zero copy is not supported

* use tempfiles

* reset verbosity

* add test for smaller chunks

* add big file size test

* add comment

* update doc

* update whatsnew doc

* update doc

* catch Exception

* remove unused import

* add test case for error on second sendfile() call

* turn docstring into comment

* add one more test

* update comment

* add Misc/NEWS entry

* get rid of COPY_BUFSIZE; it belongs to another PR

* update doc

* expose posix._fcopyfile() for OSX

* merge from linux branch

* merge from linux branch

* expose fcopyfile

* arg clinic for the win implementation

* convert path type to path_t

* expose CopyFileW

* fix windows tests

* release GIL

* minor refactoring

* update doc

* update comment

* update docstrings

* rename functions

* rename test classes

* update doc

* update doc

* update docstrings and comments

* avoid do import nt|posix modules if unnecessary

* set nt|posix modules to None if not available

* micro speedup

* update description

* add doc note

* use better wording in doc

* rename function using 'fastcopy' prefix instead of 'zerocopy'

* use :ref: in rst doc

* change wording in doc

* add test to make sure sendfile() doesn't get called aymore in case it doesn't support file to file copies

* move CopyFileW in _winapi and actually expose CopyFileExW instead

* fix line endings

* add tests for mode bits

* add docstring

* remove test file mode class; let's keep it for later when Istart addressing OSX fcopyfile() specific copies

* update doc to reflect new changes

* update doc

* adjust tests on win

* fix argument clinic error

* update doc

* OSX: expose copyfile(3) instead of fcopyfile(3); also expose flags arg to python

* osx / copyfile: use path_t instead of char

* do not set dst name in the OSError exception in order to remain consistent with platforms which cannot do that (e.g. linux)

* add same file test

* add test for same file

* have osx copyfile() pre-emptively check if src and dst are the same, otherwise it will return immedialtey and src file content gets deleted

* turn PermissionError into appropriate SameFileError

* expose ERROR_SHARING_VIOLATION in order to raise more appropriate SameFileError

* honour follow_symlinks arg when using CopyFileEx

* update Misc/NEWS

* expose CreateDirectoryEx mock

* change C type

* CreateDirectoryExW actual implementation

* provide specific makedirs() implementation for win

* fix typo

* skeleton for SetNamedSecurityInfo

* get security info for src path

* finally set security attrs

* add unit tests

* mimick os.makedirs() behavior and raise if dst dir exists

* set 2 paths for OSError object

* set 2 paths for OSError object

* expand windows test

* in case of exception on os.sendfile() set filename and filename2 exception attributes

* set 2 filenames (src, dst) for OSError in case copyfile() fails on OSX

* update doc

* do not use CreateDirectoryEx() in copytree() if source dir is a symlink (breaks test_copytree_symlink_dir); instead just create a plain dir and remain consistent with POSIX implementation

* use bytearray() and readinto()

* use memoryview() with bytearray()

* refactoring + introduce a new _fastcopy_binfileobj() fun

* remove CopyFileEx and other C wrappers

* remove code related to CopyFileEx

* Recognize binary files in copyfileobj()
...and use fastest _fastcopy_binfileobj() when possible

* set 1MB copy bufsize on win; also add a global _COPY_BUFSIZE variable

* use ctx manager for memoryview()

* update doc

* remove outdated doc

* remove last CopyFileEx remnants

* OSX - use fcopyfile(3) instead of copyfile(3)

...as an extra safety measure: in case src/dst are "exotic" files (non
regular or living on a network fs etc.) we better fail on open() instead
of copyfile(3) as we're not quite sure what's gonna happen in that
case.

* update doc
2018-06-12 23:04:50 +02:00
Serhiy Storchaka d4d79bc1ff
bpo-28564: Use os.scandir() in shutil.rmtree(). (#4085)
This speeds up it to 20-40%.
2017-11-04 14:16:35 +02:00
Jelle Zijlstra a12df7b7d4 bpo-30218: support path-like objects in shutil.unpack_archive() (GH-1367)
Thanks to Jelle Zijlstra for the patch.
2017-05-05 14:27:12 -07:00
Serhiy Storchaka 5affd23e6f bpo-29762: More use "raise from None". (#569)
This hides unwanted implementation details from tracebacks.
2017-04-05 09:37:24 +03:00
Serhiy Storchaka 9bb6fe5274 Issue #14061: Misc fixes and cleanups in archiving code in shutil.
Imporoved the documentation and tests for make_archive() and unpack_archive().
Improved error handling when corresponding compress module is not available.
Brake circular dependency between shutil and tarfile modules.
2016-12-16 19:00:55 +02:00
Serhiy Storchaka 20cdffd830 Issue #14061: Misc fixes and cleanups in archiving code in shutil.
Imporoved the documentation and tests for make_archive() and unpack_archive().
Improved error handling when corresponding compress module is not available.
Brake circular dependency between shutil and tarfile modules.
2016-12-16 18:58:33 +02:00
Serhiy Storchaka 7fc92bb38a Issue #28488: shutil.make_archive() no longer adds entry "./" to ZIP archive. 2016-10-23 15:57:42 +03:00
Serhiy Storchaka 666de77727 Issue #28488: shutil.make_archive() no longer adds entry "./" to ZIP archive. 2016-10-23 15:55:09 +03:00
Martin Panter 0be894b2f6 Issue #27895: Spelling fixes (Contributed by Ville Skyttä). 2016-09-07 12:03:06 +00:00