Re-order table of corresponding functions with the following priorities:
1. Pure functionality is at the top
2. `os.path` functions are shown before `os` functions
3. Similar functionality is kept together
4. Functionality follows docs order where possible
Add a few missed correspondences:
- `os.path.isjunction` and `Path.is_junction`
- `os.path.ismount` and `Path.is_mount`
- `os.lstat()` and `Path.lstat()`
- `os.lchmod()` and `Path.lchmod()`
Also add footnotes describing a few differences.
Corrected some grammar and spelling issues in documentation.
Co-authored-by: Russell Keith-Magee <russell@keith-magee.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Add *preserve_metadata* keyword-only argument to `pathlib.Path.copytree()`,
defaulting to false. When set to true, we copy timestamps, permissions,
extended attributes and flags where available, like `shutil.copystat()`.
Add a `Path.rmtree()` method that removes an entire directory tree, like
`shutil.rmtree()`. The signature of the optional *on_error* argument
matches the `Path.walk()` argument of the same name, but differs from the
*onexc* and *onerror* arguments to `shutil.rmtree()`. Consistency within
pathlib is probably more important.
In the private pathlib ABCs, we add an implementation based on `walk()`.
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Add *preserve_metadata* keyword-only argument to `pathlib.Path.copy()`, defaulting to false. When set to true, we copy timestamps, permissions, extended attributes and flags where available, like `shutil.copystat()`. The argument has no effect on Windows, where metadata is always copied.
Internally (in the pathlib ABCs), path types gain `_readable_metadata` and `_writable_metadata` attributes. These sets of strings describe what kinds of metadata can be retrieved and stored. We take an intersection of `source._readable_metadata` and `target._writable_metadata` to minimise reads/writes. A new `_read_metadata()` method accepts a set of metadata keys and returns a dict with those keys, and a new `_write_metadata()` method accepts a dict of metadata. We *might* make these public in future, but it's hard to justify while the ABCs are still private.
Check for `ERROR_INVALID_PARAMETER` when calling `_winapi.CopyFile2()` and
raise `UnsupportedOperation`. In `Path.copy()`, handle this exception and
fall back to the `PathBase.copy()` implementation.
Add dedicated subsection for `home()`, `expanduser()`, `cwd()`,
`absolute()`, `resolve()` and `readlink()`. The position of this section
keeps all the `Path` constructors (`Path()`, `Path.from_uri()`,
`Path.home()` and `Path.cwd()`) near the top. Within the section, closely
related methods are kept adjacent. Specifically:
-.`home()` and `expanduser()` (the former calls the latter)
- `cwd()` and `absolute()` (the former calls the latter)
- `absolute()` and `resolve()` (both make paths absolute)
- `resolve()` and `readlink()` (both read symlink targets)
- Ditto `cwd()` and `absolute()`
- Ditto `absolute()` and `resolve()`
The "Other methods" section is removed.
Add dedicated subsection for `pathlib.owner()`, `group()`, `chmod()` and
`lchmod()`.
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Add `pathlib.Path.copytree()` method, which recursively copies one
directory to another.
This differs from `shutil.copytree()` in the following respects:
1. Our method has a *follow_symlinks* argument, whereas shutil's has a
*symlinks* argument with an inverted meaning.
2. Our method lacks something like a *copy_function* argument. It always
uses `Path.copy()` to copy files.
3. Our method lacks something like a *ignore_dangling_symlinks* argument.
Instead, users can filter out danging symlinks with *ignore*, or
ignore exceptions with *on_error*
4. Our *ignore* argument is a callable that accepts a single path object,
whereas shutil's accepts a path and a list of child filenames.
5. We add an *on_error* argument, which is a callable that accepts
an `OSError` instance. (`Path.walk()` also accepts such a callable).
Co-authored-by: Nice Zombies <nineteendo19d0@gmail.com>
Add support for not following symlinks in `pathlib.Path.copy()`.
On Windows we add the `COPY_FILE_COPY_SYMLINK` flag is following symlinks is disabled. If the source is symlink to a directory, this call will fail with `ERROR_ACCESS_DENIED`. In this case we add `COPY_FILE_DIRECTORY` to the flags and retry. This can fail on old Windowses, which we note in the docs.
No news as `copy()` was only just added.
Add a `Path.copy()` method that copies the content of one file to another.
This method is similar to `shutil.copyfile()` but differs in the following ways:
- Uses `fcntl.FICLONE` where available (see GH-81338)
- Uses `os.copy_file_range` where available (see GH-81340)
- Uses `_winapi.CopyFile2` where available, even though this copies more metadata than the other implementations. This makes `WindowsPath.copy()` more similar to `shutil.copy2()`.
The method is presently _less_ specified than the `shutil` functions to allow OS-specific optimizations that might copy more or less metadata.
Incorporates code from GH-81338 and GH-93152.
Co-authored-by: Eryk Sun <eryksun@gmail.com>
Add dedicated subsection for `pathlib.Path.touch()`, `mkdir()`,
`symlink_to()` and `hardlink_to()`. Also note that `open()`, `write_text()`
and `write_bytes()` are often used to create files.
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
pathlib now treats "`.`" as a valid file extension (suffix). This brings
it in line with `os.path.splitext()`.
In the (private) pathlib ABCs, we add a new `ParserBase.splitext()` method
that splits a path into a `(root, ext)` pair, like `os.path.splitext()`.
This method is called by `PurePathBase.stem`, `suffix`, etc. In a future
version of pathlib, we might make these base classes public, and so users
will be able to define their own `splitext()` method to control file
extension splitting.
In `pathlib.PurePath` we add optimised `stem`, `suffix` and `suffixes`
properties that don't use `splitext()`, which avoids computing the path
base name twice.
Suppress all `OSError` exceptions from `pathlib.Path.exists()` and `is_*()`
rather than a selection of more common errors as we do presently. Also
adjust the implementations to call `os.path.exists()` etc, which are much
faster on Windows thanks to GH-101196.
Since 6258844c, paths that might not exist can be fed into pathlib's
globbing implementation, which will call `os.scandir()` / `os.lstat()` only
when strictly necessary. This allows us to drop an initial `self.is_dir()`
call, which saves a `stat()`.
Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
Replace tri-state `follow_symlinks` with boolean `recurse_symlinks` argument. The new argument controls whether symlinks are followed when expanding recursive `**` wildcards. The possible argument values correspond as follows:
follow_symlinks recurse_symlinks
=============== ================
False N/A
None False
True True
We therefore drop support for not following symlinks when expanding non-recursive pattern parts; it wasn't requested in the original issue, and it's a feature not found in any shells.
This makes the API a easier to grok by eliminating `None` as an option.
No news blurb as `follow_symlinks` was new in 3.13.
Explain the `full_match()` / `glob()` / `rglob()` pattern language in its own section. Move `rglob()` documentation under `glob()` and reduce duplicated text.
Return files and directories from `pathlib.Path.glob()` if the pattern ends
with `**`. This is more compatible with `PurePath.full_match()` and with
other glob implementations such as bash and `glob.glob()`. Users can add a
trailing slash to match only directories.
In my previous patch I added a `FutureWarning` with the intention of fixing
this in Python 3.15. Upon further reflection I think this was an
unnecessarily cautious remedy to a clear bug.
Add `ntpath.isreserved()`, which identifies reserved pathnames such as "NUL", "AUX" and "CON".
Deprecate `pathlib.PurePath.is_reserved()`.
---------
Co-authored-by: Eryk Sun <eryksun@gmail.com>
Co-authored-by: Brett Cannon <brett@python.org>
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
In 49f90ba we added support for the recursive wildcard `**` in
`pathlib.PurePath.match()`. This should allow arbitrary prefix and suffix
matching, like `p.match('foo/**')` or `p.match('**/foo')`, but there's a
problem: for relative patterns only, `match()` implicitly inserts a `**`
token on the left hand side, causing all patterns to match from the right.
As a result, it's impossible to match relative patterns from the left:
`PurePath('foo/bar').match('bar/**')` is true!
This commit reverts the changes to `match()`, and instead adds a new
`full_match()` method that:
- Allows empty patterns
- Supports the recursive wildcard `**`
- Matches the *entire* path when given a relative pattern
Remove a double negative in the documentation of `mkdir()`'s *exist_ok*
parameter.
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Allow `os.PathLike` objects to be passed as patterns to `pathlib.Path.glob()` and `rglob()`. (It's already possible to use them in `PurePath.match()`)
While we're in the area:
- Allow empty glob patterns in `PathBase` (but not `Path`)
- Speed up globbing in `PathBase` by generating paths with trailing slashes only as a final step, rather than for every intermediate directory.
- Simplify and speed up handling of rare patterns involving both `**` and `..` segments.
This is a very soft deprecation of `PurePath.as_uri()`. We instead document
it as a `Path` method, and add a couple of sentences mentioning that it's
also available in `PurePath`.
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
This method supports file URIs (including variants) as described in RFC 8089, such as URIs generated by `pathlib.Path.as_uri()` and `urllib.request.pathname2url()`.
The method is added to `Path` rather than `PurePath` because it uses `os.fsdecode()`, and so its results vary from system to system. I intend to deprecate `PurePath.as_uri()` and move it to `Path` for the same reason.
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Brings `pathlib.Path.is_dir()` and `in line with `os.DirEntry.is_dir()`, which
will be important for implementing generic path walking and globbing.
Likewise `is_file()`.
This new exception type is raised instead of `NotImplementedError` when
a path operation is not supported. It can be raised from `Path.readlink()`,
`symlink_to()`, `hardlink_to()`, `owner()` and `group()`. In a future
version of pathlib, it will be raised by `AbstractPath` for these methods
and others, such as `AbstractPath.mkdir()` and `unlink()`.
This commit introduces a 'walk-and-match' strategy for handling glob patterns that include a non-terminal `**` wildcard, such as `**/*.py`. For this example, the previous implementation recursively walked directories using `os.scandir()` when it expanded the `**` component, and then **scanned those same directories again** when expanded the `*.py` component. This is wasteful.
In the new implementation, any components following a `**` wildcard are used to build a `re.Pattern` object, which is used to filter the results of the recursive walk. A pattern like `**/*.py` uses half the number of `os.scandir()` calls; a pattern like `**/*/*.py` a third, etc.
This new algorithm does not apply if either:
1. The *follow_symlinks* argument is set to `None` (its default), or
2. The pattern contains `..` components.
In these cases we fall back to the old implementation.
This commit also replaces selector classes with selector functions. These generators directly yield results rather calling through to their successors. A new internal `Path._glob()` method takes care to chain these generators together, which simplifies the lazy algorithm and slightly improves performance. It should also be easier to understand and maintain.