Commit Graph

122 Commits

Author SHA1 Message Date
Barney Gale 15de493395
GH-107465: Add `pathlib.Path.from_uri()` classmethod. (#107640)
This method supports file URIs (including variants) as described in RFC 8089, such as URIs generated by `pathlib.Path.as_uri()` and `urllib.request.pathname2url()`.

The method is added to `Path` rather than `PurePath` because it uses `os.fsdecode()`, and so its results vary from system to system. I intend to deprecate `PurePath.as_uri()` and move it to `Path` for the same reason.

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
2023-10-01 16:14:02 +01:00
Barney Gale ecd813f054
GH-109187: Improve symlink loop handling in `pathlib.Path.resolve()` (GH-109192)
Treat symlink loops like other errors: in strict mode, raise `OSError`, and
in non-strict mode, do not raise any exception.
2023-09-26 17:57:17 +01:00
Barney Gale ec0a0d2bd9
GH-70303: Emit FutureWarning when pathlib glob pattern ends with `**` (GH-105413)
In a future Python release, patterns with this ending will match both files
and directories. Users may add a trailing slash to remove the warning.
2023-08-04 23:12:12 +00:00
Barney Gale c6c5665ee0
GH-100502: Add `pathlib.PurePath.pathmod` attribute (GH-106533)
This instance attribute stores the implementation of `os.path` used for
low-level path operations: either `posixpath` or `ntpath`.
2023-07-19 18:59:55 +01:00
Ned Batchelder f014f1567c
docs: clarify Path.suffix (GH-106650) 2023-07-13 20:24:54 +01:00
Barney Gale 219effa876
GH-105793: Add follow_symlinks argument to `pathlib.Path.is_dir()` and `is_file()` (GH-105794)
Brings `pathlib.Path.is_dir()` and `in line with `os.DirEntry.is_dir()`, which
will be important for implementing generic path walking and globbing.
Likewise `is_file()`.
2023-06-26 17:58:17 +01:00
Barney Gale 4a6c84fc1e
GH-104375: Use `versionchanged` to describe new arguments in pathlib docs (GH-104376) 2023-06-24 16:14:09 +01:00
Barney Gale a8006706f7
GH-89812: Add `pathlib.UnsupportedOperation` (GH-105926)
This new exception type is raised instead of `NotImplementedError` when
a path operation is not supported. It can be raised from `Path.readlink()`,
`symlink_to()`, `hardlink_to()`, `owner()` and `group()`. In a future
version of pathlib, it will be raised by `AbstractPath` for these methods
and others, such as `AbstractPath.mkdir()` and `unlink()`.
2023-06-22 14:35:51 +01:00
Barney Gale 24af45172f
GH-102613: Fast recursive globbing in `pathlib.Path.glob()` (GH-104512)
This commit introduces a 'walk-and-match' strategy for handling glob patterns that include a non-terminal `**` wildcard, such as `**/*.py`. For this example, the previous implementation recursively walked directories using `os.scandir()` when it expanded the `**` component, and then **scanned those same directories again** when expanded the `*.py` component. This is wasteful.

In the new implementation, any components following a `**` wildcard are used to build a `re.Pattern` object, which is used to filter the results of the recursive walk. A pattern like `**/*.py` uses half the number of `os.scandir()` calls; a pattern like `**/*/*.py` a third, etc.

This new algorithm does not apply if either:

1. The *follow_symlinks* argument is set to `None` (its default), or
2. The pattern contains `..` components.

In these cases we fall back to the old implementation.

This commit also replaces selector classes with selector functions. These generators directly yield results rather calling through to their successors. A new internal `Path._glob()` method takes care to chain these generators together, which simplifies the lazy algorithm and slightly improves performance. It should also be easier to understand and maintain.
2023-06-06 23:50:36 +01:00
Barney Gale 49f90ba1ea
GH-73435: Implement recursive wildcards in `pathlib.PurePath.match()` (#101398)
`PurePath.match()` now handles the `**` wildcard as in `Path.glob()`, i.e. it matches any number of path segments.

We now compile a `re.Pattern` object for the entire pattern. This is made more difficult by `fnmatch` not treating directory separators as special when evaluating wildcards (`*`, `?`, etc), and so we arrange the path parts onto separate *lines* in a string, and ensure we don't set `re.DOTALL`.

Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2023-05-30 20:18:09 +00:00
Barney Gale ace676e2c2
GH-77609: Add follow_symlinks argument to `pathlib.Path.glob()` (GH-102616)
Add a keyword-only *follow_symlinks* parameter to `pathlib.Path.glob()` and`rglob()`.

When *follow_symlinks* is `None` (the default), these methods follow symlinks except when evaluating "`**`" wildcards. When set to true or false, symlinks are always or never followed, respectively.
2023-05-29 16:59:52 +01:00
thirumurugan dcdc90d384
GH-104484: Add case_sensitive argument to `pathlib.PurePath.match()` (GH-104565)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
2023-05-18 18:59:31 +01:00
Hugo van Kemenade 13ac1766bc
gh-103960: Dark mode: invert image brightness (#103983) 2023-05-10 16:46:37 +03:00
Barney Gale d00d942149
GH-100479: Add `pathlib.PurePath.with_segments()` (GH-103975)
Add `pathlib.PurePath.with_segments()`, which creates a path object from arguments. This method is called whenever a derivative path is created, such as from `pathlib.PurePath.parent`. Subclasses may override this method to share information between path objects.

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2023-05-05 19:04:53 +00:00
Barney Gale 8100be5535
GH-81079: Add case_sensitive argument to `pathlib.Path.glob()` (GH-102710)
This argument allows case-sensitive matching to be enabled on Windows, and
case-insensitive matching to be enabled on Posix.

Co-authored-by: Steve Dower <steve.dower@microsoft.com>
2023-05-04 16:44:36 +00:00
andrei kulakov af886ffa06
GH-89769: `pathlib.Path.glob()`: do not follow symlinks when checking for precise match (GH-29655)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
2023-05-03 04:50:10 +01:00
Barney Gale 6716254e71
GH-101362: Optimise PurePath(PurePath(...)) (GH-101667)
The previous `_parse_args()` method pulled the `_parts` out of any supplied `PurePath` objects; these were subsequently joined in `_from_parts()` using `os.path.join()`. This is actually a slower form of joining than calling `fspath()` on the path object, because it doesn't take advantage of the fact that the contents of `_parts` is normalized!

This reduces the time taken to run `PurePath("foo", "bar")` by ~20%, and the time taken to run `PurePath(p, "cheese")`, where `p = PurePath("/foo", "bar", "baz")`, by ~40%.

Automerge-Triggered-By: GH:AlexWaygood
2023-03-05 15:50:21 -08:00
Furkan Onder 3690688149
GH-101898: Fix missing term references for hashable definition (#101899)
Fix missing term references for hashable definition
2023-02-14 14:20:11 +04:00
Barney Gale 01093b8203
gh-86610: Use attribute directive in docs for pathlib.PurePath (#101114) 2023-01-20 23:13:58 +01:00
Jürgen Gmach 61f338a005
GH-101112: Specify type of pattern for Path.rglob (#101132)
The documentation for `rglob` did not mention what `pattern` actually
is.

Mentioning and linking to `fnmatch` makes this explicit, as the
documentation for `fnmatch` both shows the syntax and some explanation.
2023-01-20 23:11:31 +01:00
Shantanu 2f2fa03ff3
gh-87691: clarify use of anchor in pathlib docs (#100782)
This is feedback from https://github.com/python/cpython/pull/100737#discussion_r1062968696

This matches the wording from the `os.path.join` docs better:
https://docs.python.org/3/library/os.path.html#os.path.join

In particular, the previous use of "anchor" was incorrect given the
pathlib definition of "anchor".

Co-authored-by: Barney Gale <barney.gale@gmail.com>
2023-01-05 17:49:33 -08:00
Shantanu 1ae619c911
gh-87691: add an absolute path pathlib example in / operator docs (GH-100737)
The behaviour is fully explained a couple paragraphs above, but it may be useful to have a brief example to cover the behaviour.

Automerge-Triggered-By: GH:hauntsaninja
2023-01-05 14:55:35 -08:00
Barney Gale 5a991da329
gh-78707: deprecate passing >1 argument to `PurePath.[is_]relative_to()` (GH-94469)
This brings `relative_to()` and `is_relative_to()` more in line with other pathlib methods like `rename()` and `symlink_to()`.

Resolves #78707.
2022-12-16 16:14:27 -08:00
Charles Machalow 1b2de89bce
gh-99547: Add isjunction methods for checking if a path is a junction (GH-99548) 2022-11-22 17:19:34 +00:00
Mateusz 0023f51deb
gh-98240: Updated Path.rename docs, when it is atomic (GH-98245) 2022-10-28 16:31:37 -07:00
domragusa e089f23bbb
gh-84538: add strict argument to pathlib.PurePath.relative_to (GH-19813)
By default, :meth:`pathlib.PurePath.relative_to` doesn't deal with paths that are not a direct prefix of the other, raising an exception in that instance. This change adds a *walk_up* parameter that can be set to allow for using ``..`` to calculate the relative path.

example:
```
>>> p = PurePosixPath('/etc/passwd')
>>> p.relative_to('/etc')
PurePosixPath('passwd')
>>> p.relative_to('/usr')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "pathlib.py", line 940, in relative_to
    raise ValueError(error_message.format(str(self), str(formatted)))
ValueError: '/etc/passwd' does not start with '/usr'
>>> p.relative_to('/usr', strict=False)
PurePosixPath('../etc/passwd')
```


https://bugs.python.org/issue40358

Automerge-Triggered-By: GH:brettcannon
2022-10-28 16:20:14 -07:00
Julien Palard 2eb503e4dd
Doc: Found some remaining default roles. (GH-98392) 2022-10-18 15:46:18 +02:00
Ansab Gillani b462f143ff
Fix documentation typo for pathlib.Path.walk (GH-96301) 2022-08-26 14:21:40 -07:00
Barney Gale 29650fea96
gh-86943: implement `pathlib.WindowsPath.is_mount()` (GH-31458)
Have `pathlib.WindowsPath.is_mount()` call `ntpath.ismount()`. Previously it raised `NotImplementedError` unconditionally.


https://bugs.python.org/issue42777
2022-08-05 15:37:44 -07:00
Stanislav Zmiev c1e929858a
gh-90385: Add `pathlib.Path.walk()` method (GH-92517)
Automerge-Triggered-By: GH:brettcannon
2022-07-22 16:55:46 -07:00
Ned Batchelder 6e2fbdab92
docs: use 'recursively' in the description of rglob, and mention globs in the os equivalences (GH-94954)
The r in `rglob` stands for "recursively", so use the word in the description. Also, glob and rglob can usefully be mentioned as the pathlib equivalent of os.walk.

Automerge-Triggered-By: GH:brettcannon
2022-07-20 14:47:43 -07:00
Chris Adams 3789c63577
Add additional pointers to pathlib's mapping to os.path functions (#94828)
* Add additional pointers to pathlib's mapping to os.path functions

os.path.splitext has a somewhat quirky signature since it mixes the path and filename components but I wanted the documentation to mention `PurePath.stem` as the natural counterpart to `PurePath.suffix` for the common use of `os.path.splitext` to turn "file.py" into "file" and "py".

Technically this could have some discussion of how to handle the parent directory hierarchy but that seems a bit out of keeping with the spirit of this table so I omitted mentioning `PurePath.parents` here.

* Update Doc/library/pathlib.rst

Co-authored-by: Ezio Melotti <ezio.melotti@gmail.com>

Co-authored-by: Ezio Melotti <ezio.melotti@gmail.com>
2022-07-16 00:09:27 +02:00
Oleg Iarygin f62ff97f31
gh-93851: Fix all broken links in Doc/ (GH-93853) 2022-06-21 20:55:18 +02:00
Oleg Iarygin 78f1a43694
gh-91317: Document that Path does not collapse initial `//` (GH-32193)
Documentation for `pathlib` says:

> Spurious slashes and single dots are collapsed, but double dots ('..') are not, since this would change the meaning of a path in the face of symbolic links:

However, it omits that initial double slashes also aren't collapsed.

Later, in documentation of `PurePath.drive`, `PurePath.root`, and `PurePath.name` it mentions UNC but:

- this abbreviation says nothing to a person who is unaware about existence of UNC (Wikipedia doesn't help either by [giving a disambiguation page](https://en.wikipedia.org/wiki/UNC))
- it shows up only if a person needs to use a specific property or decides to fully learn what the module provides.

For context, see the BPO entry.
2022-06-10 15:52:36 -07:00
jacksonriley 8ef7929baf
Fix `PurePath.relative_to` links in the pathlib documentation. (GH-93268)
These are currently broken as they refer to :meth:`Path.relative_to` rather than :meth:`PurePath.relative_to`, and `relative_to` is a method on `PurePath`.
2022-06-07 11:54:16 -07:00
Brett Cannon d59b2d0441
Fix a directive in the `pathlib` docs (GH-93030) 2022-05-20 15:52:38 -07:00
Stanley f51ed04c66
gh-72073: Add Windows case in pathlib.rename (GH-93002)
#72073

https://docs.python.org/3.12/library/pathlib.html#pathlib.Path.rename
2022-05-20 15:25:39 -07:00
Serhiy Storchaka b1c4368824
Revert "gh-92550 - Fix regression in `pathlib.Path.rglob()` (GH-92583)" (GH-92598)
This reverts commit dcdf250d2d.
2022-05-11 07:14:25 +03:00
Gregory P. Smith 07b34926d3
gh-84131: Remove the deprecated pathlib.Path.link_to method. (#92505)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
2022-05-10 12:31:41 -07:00
Barney Gale dcdf250d2d
gh-92550 - Fix regression in `pathlib.Path.rglob()` (GH-92583)
We could try to remedy this by taking a slice, but we then run into an issue where the empty string will match altsep on POSIX. That rabbit hole could keep getting deeper.

A proper fix for the original issue involves making pathlib's path normalisation more configurable - in this case we want to retain trailing slashes, but in other we might want to preserve `./` prefixes, or elide `../` segments when we're sure we won't encounter symlinks.

This reverts commit ea2f5bcda1.
2022-05-09 17:12:16 -07:00
Eisuke Kawashima ea2f5bcda1
bpo-22276: Change pathlib.Path.glob not to ignore trailing path separator (GH-10349)
Now pathlib.Path.glob() **only** matches directories when the pattern ends in a path separator.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2022-04-28 12:45:03 -07:00
slateny 161dff7e10
gh-84459: Make wording more specific for Path.replace (GH-91853)
#84459
2022-04-27 15:03:03 -07:00
Matt Williams 795b365e8a
Fix typo in Path.iterdir docs (GH-31822) 2022-03-22 19:51:41 -07:00
Barney Gale 18cb2ef46c
bpo-29688: document and test `pathlib.Path.absolute()` (GH-26153)
Co-authored-by: Brett Cannon <brett@python.org>
Co-authored-by: Brian Helba <brian.helba@kitware.com>
2022-01-28 15:40:55 -08:00
Barney Gale f24e2e5464
bpo-39950: add `pathlib.Path.hardlink_to()` method that supersedes `link_to()` (GH-18909)
The argument order of `link_to()` is reversed compared to what one may expect, so:

    a.link_to(b)

Might be expected to create *a* as a link to *b*, in fact it creates *b* as a link to *a*, making it function more like a "link from". This doesn't match `symlink_to()` nor the documentation and doesn't seem to be the original author's intent.

This PR deprecates `link_to()` and introduces `hardlink_to()`, which has the same argument order as `symlink_to()`.
2021-04-23 13:48:52 -07:00
Ned Batchelder b2b6cd00c6
docs: clarify what patterns Path.glob accepts (GH-25486)
Automerge-Triggered-By: GH:Yhg1s
2021-04-20 09:45:45 -07:00
Barney Gale 3f3d82b848
bpo-39899: os.path.expanduser(): don't guess other Windows users' home directories if the basename of the current user's home directory doesn't match their username. (GH-18841)
This makes `ntpath.expanduser()` match `pathlib.Path.expanduser()` in this regard, and is more in line with `posixpath.expanduser()`'s cautious approach.

Also remove the near-duplicate implementation of `expanduser()` in pathlib, and by doing so fix a bug where KeyError could be raised when expanding another user's home directory.
2021-04-07 23:50:13 +01:00
Barney Gale 8aac1bea2e
bpo-42999: Expand and clarify pathlib.Path.link_to() documentation. (GH-24294) 2021-04-07 16:56:32 +01:00
Barney Gale abf964942f
bpo-39906: Add follow_symlinks parameter to pathlib.Path.stat() and chmod() (GH-18864) 2021-04-07 16:53:39 +01:00
Hong Xu 1459fed92c
Doc: os.path.abspath and Path.resolve are also different (GH-23276) 2021-01-20 11:20:00 +01:00