Commit Graph

77 Commits

Author SHA1 Message Date
Daniel Hollas 2304774465
gh-118761: Speedup pathlib import by deferring shutil (#123520)
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
2024-09-01 15:44:48 +01:00
Barney Gale 7bd6ebf696
GH-73991: Prune `pathlib.Path.copy()` and `copy_into()` arguments (#123337)
Remove *ignore* and *on_error* arguments from `pathlib.Path.copy[_into]()`,
because these arguments are under-designed. Specifically:

- *ignore* is appropriated from `shutil.copytree()`, but it's not clear
  how it should apply when the user copies a non-directory. We've changed
  the callback signature from the `shutil` version, but I'm not confident
  the new signature is as good as it can be.
- *on_error* is a generalisation of `shutil.copytree()`'s error handling,
  which is to accumulate exceptions and raise a single `shutil.Error` at
  the end. It's not obvious which solution is better.

Additionally, this arguments may be challenging to implement in future user
subclasses of `PathBase`, which might utilise a native recursive copying
method.
2024-08-26 17:05:34 +01:00
Barney Gale 033d537cd4
GH-73991: Make `pathlib.Path.delete()` private. (#123315)
Per feedback from Paul Moore on GH-123158, it's better to defer making
`Path.delete()` public than ship it with under-designed error handling
capabilities.

We leave a remnant `_delete()` method, which is used by `move()`. Any
functionality not needed by `move()` is deleted.
2024-08-26 16:26:34 +01:00
Barney Gale c68a93c582
GH-73991: Add `pathlib.Path.copy_into()` and `move_into()` (#123314)
These two methods accept an *existing* directory path, onto which we join
the source path's base name to form the final target path.

A possible alternative implementation is to check for directories in
`copy()` and `move()` and adjust the target path, which is done in several
`shutil` functions. This behaviour is helpful in a shell context, but
less so in a stored program that explicitly specifies destinations. For
example, a user that calls `Path('foo.py').copy('bar.py')` might not
imagine that `bar.py/foo.py` would be created, but under the alternative
implementation this will happen if `bar.py` is an existing directory.
2024-08-26 14:14:23 +01:00
Barney Gale 625d0705b9
GH-73991: Add `pathlib.Path.move()` (#122073)
Add a `Path.move()` method that moves a file or directory tree, and returns a new `Path` instance pointing to the target.

This method is similar to `shutil.move()`, except that it doesn't accept a *copy_function* argument, and it doesn't check whether the destination is an existing directory.
2024-08-25 16:51:51 +01:00
Barney Gale c4ee4e756a
GH-122890: Fix low-level error handling in `pathlib.Path.copy()` (#122897)
Give unique names to our low-level FD copying functions, and try each one
in turn. Handle errors appropriately for each implementation:

- `fcntl.FICLONE`: suppress `EBADF`, `EOPNOTSUPP`, `ETXTBSY`, `EXDEV`
- `posix._fcopyfile`: suppress `EBADF`, `ENOTSUP`
- `os.copy_file_range`: suppress `ETXTBSY`, `EXDEV`
- `os.sendfile`: suppress `ENOTSOCK`
2024-08-24 15:11:39 +01:00
Barney Gale d7ae4dc5c1
GH-73991: Disallow copying directory into itself via `pathlib.Path.copy()` (#122924) 2024-08-23 20:03:11 +01:00
Cody Maloney 35d8ac7cd7
GH-120754: Disable buffering in Path.read_bytes (#122111)
`Path.read_bytes()` is used to read a whole file. buffering /
BufferedIO is focused around making small, possibly interleaved,
read/write efficient which doesn't add value in this case.

On my Mac, running the benchmark:

```python
import pyperf
from pathlib import Path

def read_all(all_paths):
    for p in all_paths:
        p.read_bytes()

def read_file(path_obj):
    path_obj.read_bytes()

all_rst = list(Path("Doc").glob("**/*.rst"))
all_py = list(Path(".").glob("**/*.py"))
assert all_rst, "Should have found rst files"
assert all_py, "Should have found python source files"

runner = pyperf.Runner()
runner.bench_func("read_file_small", read_file, Path("Doc/howto/clinic.rst"))
runner.bench_func("read_file_large", read_file, Path("Doc/c-api/typeobj.rst"))
```

before:
```python
.....................
read_file_small: Mean +- std dev: 6.80 us +- 0.07 us
.....................
read_file_large: Mean +- std dev: 10.8 us +- 0.2 us
````

after:
```python
.....................
read_file_small: Mean +- std dev: 5.67 us +- 0.05 us
.....................
read_file_large: Mean +- std dev: 9.77 us +- 0.52 us
```
2024-08-16 13:52:41 -07:00
Barney Gale a6644d4464
GH-73991: Rework `pathlib.Path.copytree()` into `copy()` (#122369)
Rename `pathlib.Path.copy()` to `_copy_file()` (i.e. make it private.)

Rename `pathlib.Path.copytree()` to `copy()`, and add support for copying
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `delete()` methods (which will also
accept any type of file.)

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
2024-08-11 22:43:18 +01:00
Barney Gale 98dba73010
GH-73991: Rework `pathlib.Path.rmtree()` into `delete()` (#122368)
Rename `pathlib.Path.rmtree()` to `delete()`, and add support for deleting
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `copy()` methods (which will also
accept any type of file.)
2024-08-07 01:34:44 +01:00
Виталий Дмитриев c4e8196940
Fix duplicated words 'begins with a' in pathlib docstring (#122732) 2024-08-06 18:38:33 +01:00
Barney Gale c4c7097e64
GH-73991: Support preserving metadata in `pathlib.Path.copytree()` (#121438)
Add *preserve_metadata* keyword-only argument to `pathlib.Path.copytree()`,
defaulting to false. When set to true, we copy timestamps, permissions,
extended attributes and flags where available, like `shutil.copystat()`.
2024-07-20 23:32:52 +01:00
Barney Gale 094375b9b7
GH-73991: Add `pathlib.Path.rmtree()` (#119060)
Add a `Path.rmtree()` method that removes an entire directory tree, like
`shutil.rmtree()`. The signature of the optional *on_error* argument
matches the `Path.walk()` argument of the same name, but differs from the
*onexc* and *onerror* arguments to `shutil.rmtree()`. Consistency within
pathlib is probably more important.

In the private pathlib ABCs, we add an implementation based on `walk()`.

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2024-07-20 20:14:13 +00:00
Barney Gale 88fc0655d4
GH-73991: Support preserving metadata in `pathlib.Path.copy()` (#120806)
Add *preserve_metadata* keyword-only argument to `pathlib.Path.copy()`, defaulting to false. When set to true, we copy timestamps, permissions, extended attributes and flags where available, like `shutil.copystat()`. The argument has no effect on Windows, where metadata is always copied.

Internally (in the pathlib ABCs), path types gain `_readable_metadata` and `_writable_metadata` attributes. These sets of strings describe what kinds of metadata can be retrieved and stored. We take an intersection of `source._readable_metadata` and `target._writable_metadata` to minimise reads/writes. A new `_read_metadata()` method accepts a set of metadata keys and returns a dict with those keys, and a new `_write_metadata()` method accepts a dict of metadata. We *might* make these public in future, but it's hard to justify while the ABCs are still private.
2024-07-06 17:18:39 +01:00
Barney Gale f09d184821
GH-73991: Support copying directory symlinks on older Windows (#120807)
Check for `ERROR_INVALID_PARAMETER` when calling `_winapi.CopyFile2()` and
raise `UnsupportedOperation`. In `Path.copy()`, handle this exception and
fall back to the `PathBase.copy()` implementation.
2024-07-03 04:30:29 +01:00
Barney Gale 35e998f560
GH-73991: Add `pathlib.Path.copytree()` (#120718)
Add `pathlib.Path.copytree()` method, which recursively copies one
directory to another.

This differs from `shutil.copytree()` in the following respects:

1. Our method has a *follow_symlinks* argument, whereas shutil's has a
   *symlinks* argument with an inverted meaning.
2. Our method lacks something like a *copy_function* argument. It always
   uses `Path.copy()` to copy files.
3. Our method lacks something like a *ignore_dangling_symlinks* argument.
   Instead, users can filter out danging symlinks with *ignore*, or
   ignore exceptions with *on_error*
4. Our *ignore* argument is a callable that accepts a single path object,
   whereas shutil's accepts a path and a list of child filenames.
5. We add an *on_error* argument, which is a callable that accepts
   an `OSError` instance. (`Path.walk()` also accepts such a callable).

Co-authored-by: Nice Zombies <nineteendo19d0@gmail.com>
2024-06-23 22:01:12 +01:00
Barney Gale 20d5b84f57
GH-73991: Add follow_symlinks argument to `pathlib.Path.copy()` (#120519)
Add support for not following symlinks in `pathlib.Path.copy()`.

On Windows we add the `COPY_FILE_COPY_SYMLINK` flag is following symlinks is disabled. If the source is symlink to a directory, this call will fail with `ERROR_ACCESS_DENIED`. In this case we add `COPY_FILE_DIRECTORY` to the flags and retry. This can fail on old Windowses, which we note in the docs.

No news as `copy()` was only just added.
2024-06-19 00:59:54 +00:00
Barney Gale 7c38097add
GH-73991: Add `pathlib.Path.copy()` (#119058)
Add a `Path.copy()` method that copies the content of one file to another.

This method is similar to `shutil.copyfile()` but differs in the following ways:

- Uses `fcntl.FICLONE` where available (see GH-81338)
- Uses `os.copy_file_range` where available (see GH-81340)
- Uses `_winapi.CopyFile2` where available, even though this copies more metadata than the other implementations. This makes `WindowsPath.copy()` more similar to `shutil.copy2()`.

The method is presently _less_ specified than the `shutil` functions to allow OS-specific optimizations that might copy more or less metadata.

Incorporates code from GH-81338 and GH-93152.

Co-authored-by: Eryk Sun <eryksun@gmail.com>
2024-06-14 17:15:49 +01:00
Barney Gale 242c7498e5
GH-116380: Move pathlib-specific code from `glob` to `pathlib._abc`. (#120011)
In `glob._Globber`, move pathlib-specific methods to `pathlib._abc.PathGlobber` and replace them with abstract methods. Rename `glob._Globber` to `glob._GlobberBase`. As a result, the `glob` module is no longer befouled by code that can only ever apply to pathlib.

No change of behaviour.
2024-06-07 17:59:34 +01:00
Barney Gale e83ce850f4
pathlib ABCs: remove duplicate `realpath()` implementation. (#119178)
Add private `posixpath._realpath()` function, which is a generic version of `realpath()` that can be parameterised with string tokens (`sep`, `curdir`, `pardir`) and query functions (`getcwd`, `lstat`, `readlink`). Also add support for limiting the number of symlink traversals.

In the private `pathlib._abc.PathBase` class, call `posixpath._realpath()` and remove our re-implementation of the same algorithm.

No change to any public APIs, either in `posixpath` or `pathlib`.

Co-authored-by: Nice Zombies <nineteendo19d0@gmail.com>
2024-06-05 18:54:50 +01:00
Barney Gale 7ff61f51b6
GH-119169: Implement `pathlib.Path.walk()` using `os.walk()` (#119573)
For silly reasons, pathlib's generic implementation of `walk()` currently
resides in `glob._Globber`. This commit moves it into
`pathlib._abc.PathBase.walk()` where it really belongs, and makes
`pathlib.Path.walk()` call `os.walk()`.
2024-05-29 20:51:04 +00:00
Barney Gale e418fc3a6e
GH-82805: Fix handling of single-dot file extensions in pathlib (#118952)
pathlib now treats "`.`" as a valid file extension (suffix). This brings
it in line with `os.path.splitext()`.

In the (private) pathlib ABCs, we add a new `ParserBase.splitext()` method
that splits a path into a `(root, ext)` pair, like `os.path.splitext()`.
This method is called by `PurePathBase.stem`, `suffix`, etc. In a future
version of pathlib, we might make these base classes public, and so users
will be able to define their own `splitext()` method to control file
extension splitting.

In `pathlib.PurePath` we add optimised `stem`, `suffix` and `suffixes`
properties that don't use `splitext()`, which avoids computing the path
base name twice.
2024-05-25 21:01:36 +01:00
Barney Gale 3c28510b98
GH-119113: Raise `TypeError` from `pathlib.PurePath.with_suffix(None)` (#119124)
Restore behaviour from 3.12 when `path.with_suffix(None)` is called.
2024-05-19 17:04:56 +01:00
Kirill Podoprigora 31a28cbae0
gh-119049: Defer `import warnings` in `pathlib._local` (#119111) 2024-05-17 17:12:02 +01:00
Barney Gale 7d8725ac6f
GH-74033: Drop deprecated `pathlib.Path` keyword arguments (#118793)
Remove support for supplying keyword arguments to `pathlib.Path()`. This
has been deprecated since Python 3.12.
2024-05-14 20:14:07 +00:00
Barney Gale fbe6a0988f
GH-101357: Suppress `OSError` from `pathlib.Path.exists()` and `is_*()` (#118243)
Suppress all `OSError` exceptions from `pathlib.Path.exists()` and `is_*()`
rather than a selection of more common errors as we do presently. Also
adjust the implementations to call `os.path.exists()` etc, which are much
faster on Windows thanks to GH-101196.
2024-05-14 17:53:15 +00:00
Barney Gale f772d0d08a
GH-78707: Drop deprecated `pathlib.PurePath.[is_]relative_to()` arguments (#118780)
Remove support for supplying additional positional arguments to
`PurePath.relative_to()` and `is_relative_to()`. This has been deprecated
since Python 3.12.
2024-05-10 15:53:46 +00:00
Barney Gale b4bdf83cc6
GH-116380: Revert move of pathlib globbing code to `pathlib._glob` (#118678)
The previous change made the `glob` module slower to import, because it
imported `pathlib._glob` and hence the rest of `pathlib`.

Reverts a40f557d7b.
2024-05-07 00:32:48 +00:00
Barney Gale d8d94911e2
Move pathlib implementation out of `__init__.py` (#118582)
Use the `__init__.py` file only for imports that define the API, following the example of asyncio.
2024-05-05 20:57:19 +01:00
Barney Gale a40f557d7b
GH-116380: Move pathlib globbing implementation into `pathlib._glob` (#118562)
Moving this code under the `pathlib` package makes it quite a lot easier
to backport in the `pathlib-abc` PyPI package. It was a bit foolish of me
to add it to `glob` in the first place.

Also add `translate()` to `__all__` in `glob`. This function is new in
3.13, so there's no NEWS needed.
2024-05-03 20:29:25 +00:00
Andrew Zipperer a6b610a94b
docs: typo: tiny grammar change: "pointed by" -> "pointed to by" (#118411)
* docs: tiny grammar change: "pointed by" -> "pointed to by"

This commit uses "file pointed to by" to replace "file pointed by" in
 - doc for shutil.copytree
 - docstring for shutil.copytree
 - docstring _abc.PathBase.open
 - docstring for pathlib.Path.open
 - doc for os.copy_file_range
 - doc for os.splice

The docs use "file pointed to by" more frequently than
"file pointed by". So, this commit replaces the uses of
"file pointed by" in order to make the uses consistent
through the docs.

```bash
$ grep -ri 'pointed to by' cpython/
```
yields more results than
```bash
$ grep -ri 'pointed by' cpython/
```

Separately:

There are two occurrences of "tree pointed by":
 - cpython/Doc/library/xml.etree.elementtree.rst for
     `xml.etree.ElementInclude.include`
 - cpython/Lib/xml/etree/ElementInclude.py for `include`

For those uses of "tree pointed by", I expect "tree pointed to by"
instead. However, I found enough uses online of (a) "tree pointed by"
rather than (b) "tree pointed to by" to convince me that (a) is in
common use.

So, this commit does not replace those occurrences of "tree pointed by"
to "tree pointed to by". But I will replace them if a reviewer
believes it is correct to replace them.

* docs: typo: "exists and executable" -> "exists and is executable"

---------

Co-authored-by: Andrew-Zipperer <atzipperer@gmail.com>
2024-05-02 05:37:12 +00:00
Barney Gale 15fbd53ba9
GH-112855: Speed up `pathlib.PurePath` pickling (#112856)
The second item in the tuple returned from `__reduce__()` is a tuple of arguments to supply to path constructor. Previously we returned the `parts` tuple here, which entailed joining, parsing and normalising the path object, and produced a compact pickle representation.

With this patch, we instead return a tuple of paths that were originally given to the path constructor. This makes pickling much faster (at the expense of compactness).

It's worth noting that, in the olden times, pathlib performed this parsing/normalization up-front in every case, and so using `parts` for pickling was almost free. Nowadays pathlib only parses/normalises paths when it's necessary or advantageous to do so (e.g. computing a path parent, or iterating over a directory, respectively).
2024-04-20 17:46:52 +01:00
Barney Gale a74f117dab
GH-115060: Speed up `pathlib.Path.glob()` by omitting initial `stat()` (#117831)
Since 6258844c, paths that might not exist can be fed into pathlib's
globbing implementation, which will call `os.scandir()` / `os.lstat()` only
when strictly necessary. This allows us to drop an initial `self.is_dir()`
call, which saves a `stat()`.

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
2024-04-14 00:08:03 +01:00
Barney Gale 30f0643e36
GH-117727: Speed up `pathlib.Path.iterdir()` by using `os.scandir()` (#117728)
Replace use of `os.listdir()` with `os.scandir()`. Forgo setting `_drv`,
`_root` and `_tail_cached`, as these usually aren't needed. Use
`os.DirEntry.path` to set `_str`.
2024-04-12 22:02:39 +00:00
Barney Gale 0eb52f5f26
GH-115060: Speed up `pathlib.Path.glob()` by not scanning literal parts (#117732)
Don't bother calling `os.scandir()` to scan for literal pattern segments,
like `foo` in `foo/*.py`. Instead, append the segment(s) as-is and call
through to the next selector with `exists=False`, which signals that the
path might not exist. Subsequent selectors will call `os.scandir()` or
`os.lstat()` to filter out missing paths as needed.
2024-04-12 22:19:21 +01:00
Barney Gale 0cc71bde00
GH-117586: Speed up `pathlib.Path.walk()` by working with strings (#117726)
Move `pathlib.Path.walk()` implementation into `glob._Globber`. The new
`glob._Globber.walk()` classmethod works with strings internally, which is
a little faster than generating `Path` objects and keeping them normalized.
The `pathlib.Path.walk()` method converts the strings back to path objects.

In the private pathlib ABCs, our existing subclass of `_Globber` ensures
that `PathBase` instances are used throughout.

Follow-up to #117589.
2024-04-11 01:26:53 +01:00
Barney Gale 6258844c27
GH-117586: Speed up `pathlib.Path.glob()` by working with strings (#117589)
Move pathlib globbing implementation into a new private class: `glob._Globber`. This class implements fast string-based globbing. It's called by `pathlib.Path.glob()`, which then converts strings back to path objects.

In the private pathlib ABCs, add a `pathlib._abc.Globber` subclass that works with `PathBase` objects rather than strings, and calls user-defined path methods like `PathBase.stat()` rather than `os.stat()`.

This sets the stage for two more improvements:

- GH-115060: Query non-wildcard segments with `lstat()`
- GH-116380: Unify `pathlib` and `glob` implementations of globbing.

No change to the implementations of `glob.glob()` and `glob.iglob()`.
2024-04-10 20:43:07 +01:00
Barney Gale 6150bb2412
GH-77609: Add recurse_symlinks argument to `pathlib.Path.glob()` (#117311)
Replace tri-state `follow_symlinks` with boolean `recurse_symlinks` argument. The new argument controls whether symlinks are followed when expanding recursive `**` wildcards. The possible argument values correspond as follows:

    follow_symlinks  recurse_symlinks
    ===============  ================
    False            N/A
    None             False
    True             True

We therefore drop support for not following symlinks when expanding non-recursive pattern parts; it wasn't requested in the original issue, and it's a feature not found in any shells.

This makes the API a easier to grok by eliminating `None` as an option.

No news blurb as `follow_symlinks` was new in 3.13.
2024-04-05 18:51:54 +00:00
Barney Gale 752e18389e
GH-114575: Rename `PurePath.pathmod` to `PurePath.parser` (#116513)
And rename the private base class from `PathModuleBase` to `ParserBase`.
2024-03-31 19:14:48 +01:00
Barney Gale 1dce0073da
pathlib ABCs: follow all symlinks in `PathBase.glob()` (#116293)
Switch the default value of *follow_symlinks* from `None` to `True` in
`pathlib._abc.PathBase.glob()` and `rglob()`. This speeds up recursive
globbing.

No change to the public pathlib classes.
2024-03-04 02:26:33 +00:00
Barney Gale e3dedeae7a
GH-114610: Fix `pathlib.PurePath.with_stem('')` handling of file extensions (#114612)
Raise `ValueError` if `with_stem('')` is called on a path with a file
extension. Paths may only have an empty stem if they also have an empty
suffix.
2024-02-24 19:37:03 +00:00
Barney Gale 6f93b4df92
GH-115060: Speed up `pathlib.Path.glob()` by removing redundant regex matching (#115061)
When expanding and filtering paths for a `**` wildcard segment, build an `re.Pattern` object from the subsequent pattern parts, rather than the entire pattern, and match against the `os.DirEntry` object prior to instantiating a path object. Also skip compiling a pattern when expanding a `*` wildcard segment.
2024-02-10 18:12:34 +00:00
Barney Gale 1b1f8398d0
GH-106747: Make pathlib ABC globbing more consistent with `glob.glob()` (#115056)
When expanding `**` wildcards, ensure we add a trailing slash to the
topmost directory path. This matches `glob.glob()` behaviour:

    >>> glob.glob('dirA/**', recursive=True)
    ['dirA/', 'dirA/dirB', 'dirA/dirB/dirC']

This does not affect `pathlib.Path.glob()`, because trailing slashes aren't
supported in pathlib proper.
2024-02-06 02:48:18 +00:00
Barney Gale 574291963f
pathlib ABCs: drop partial, broken, untested support for `bytes` paths. (#114777)
Methods like `full_match()`, `glob()`, etc, are difficult to make work with
byte paths, and it's not worth the effort. This patch makes `PurePathBase`
raise `TypeError` when given non-`str` path segments.
2024-01-31 00:59:33 +00:00
Barney Gale 1667c28686
pathlib ABCs: raise `UnsupportedOperation` directly. (#114776)
Raise `UnsupportedOperation` directly, rather than via an `_unsupported()`
helper, to give human readers and IDEs/typecheckers/etc a bigger hint that
these methods are abstract.
2024-01-31 00:38:01 +00:00
Barney Gale fda7445ca5
GH-70303: Make `pathlib.Path.glob('**')` return both files and directories (#114684)
Return files and directories from `pathlib.Path.glob()` if the pattern ends
with `**`. This is more compatible with `PurePath.full_match()` and with
other glob implementations such as bash and `glob.glob()`. Users can add a
trailing slash to match only directories.

In my previous patch I added a `FutureWarning` with the intention of fixing
this in Python 3.15. Upon further reflection I think this was an
unnecessarily cautious remedy to a clear bug.
2024-01-30 19:52:53 +00:00
Barney Gale 809eed4805
GH-114610: Fix `pathlib._abc.PurePathBase.with_suffix('.ext')` handling of stems (#114613)
Raise `ValueError` if `with_suffix('.ext')` is called on a path without a
stem. Paths may only have a non-empty suffix if they also have a non-empty
stem.

ABC-only bugfix; no effect on public classes.
2024-01-30 14:25:16 +00:00
Barney Gale 823a38a960
GH-79634: Speed up pathlib globbing by removing `joinpath()` call. (#114623)
Remove `self.joinpath('')` call that should have been removed in 6313cdde.

This makes `PathBase.glob('')` yield itself *without* adding a trailing slash. It's hard to say whether this is more or less correct, but at least everything else is faster, and there's no behaviour change in the public classes where empty glob patterns are disallowed.
2024-01-27 19:59:51 +00:00
Barney Gale 7e31d6dea2
gh-88569: add `ntpath.isreserved()` (#95486)
Add `ntpath.isreserved()`, which identifies reserved pathnames such as "NUL", "AUX" and "CON".

Deprecate `pathlib.PurePath.is_reserved()`.

---------

Co-authored-by: Eryk Sun <eryksun@gmail.com>
Co-authored-by: Brett Cannon <brett@python.org>
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
2024-01-26 18:14:24 +00:00
Barney Gale b69548a0f5
GH-73435: Add `pathlib.PurePath.full_match()` (#114350)
In 49f90ba we added support for the recursive wildcard `**` in
`pathlib.PurePath.match()`. This should allow arbitrary prefix and suffix
matching, like `p.match('foo/**')` or `p.match('**/foo')`, but there's a
problem: for relative patterns only, `match()` implicitly inserts a `**`
token on the left hand side, causing all patterns to match from the right.
As a result, it's impossible to match relative patterns from the left:
`PurePath('foo/bar').match('bar/**')` is true!

This commit reverts the changes to `match()`, and instead adds a new
`full_match()` method that:

- Allows empty patterns
- Supports the recursive wildcard `**`
- Matches the *entire* path when given a relative pattern
2024-01-26 01:12:46 +00:00