141 Commits

Author SHA1 Message Date
Barney Gale 6150bb2412
GH-77609: Add recurse_symlinks argument to `pathlib.Path.glob()` (#117311)
Replace tri-state `follow_symlinks` with boolean `recurse_symlinks` argument. The new argument controls whether symlinks are followed when expanding recursive `**` wildcards. The possible argument values correspond as follows:

    follow_symlinks  recurse_symlinks
    ===============  ================
    False            N/A
    None             False
    True             True

We therefore drop support for not following symlinks when expanding non-recursive pattern parts; it wasn't requested in the original issue, and it's a feature not found in any shells.

This makes the API easier to grok by eliminating `None` as an option.

No news blurb as `follow_symlinks` was new in 3.13.
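
For code written against the earlier tri-state argument, the table above can be expressed as a small translation function. This is only a sketch: `to_recurse_symlinks` is a hypothetical helper name, not part of pathlib, and raising `ValueError` for the unsupported case is an assumption about what a caller would want.

```python
def to_recurse_symlinks(follow_symlinks):
    """Translate an old tri-state follow_symlinks value to recurse_symlinks.

    Hypothetical helper for illustration; it mirrors the table above.
    """
    if follow_symlinks is None:
        return False
    if follow_symlinks is True:
        return True
    # follow_symlinks=False (never follow symlinks) has no counterpart.
    raise ValueError("follow_symlinks=False has no recurse_symlinks equivalent")


print(to_recurse_symlinks(None))  # False
print(to_recurse_symlinks(True))  # True
```

The result would then be passed along as `path.glob(pattern, recurse_symlinks=...)` on an interpreter that has the new argument.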
2024-04-05 18:51:54 +00:00
Barney Gale 752e18389e
GH-114575: Rename `PurePath.pathmod` to `PurePath.parser` (#116513)
And rename the private base class from `PathModuleBase` to `ParserBase`.
2024-03-31 19:14:48 +01:00
Barney Gale 72eea512b8
GH-106747: Document another difference between `glob` and `pathlib`. (#116518)
Document that `path.glob()` might return *path*, whereas
`glob.glob(root_dir=path)` will never return an empty string corresponding
to *path*.
2024-03-22 19:14:09 +00:00
Barney Gale 1904f0a224
GH-113838: Add "Comparison to os.path" section to pathlib docs (#115926) 2024-03-15 00:11:49 +00:00
Mariusz Felisiak 3f1b6efee9
Docs: fix broken links (#116651) 2024-03-12 21:19:33 -07:00
Barney Gale e921f09c8a
GH-101112: Add "pattern language" section to pathlib docs (#114030)
Explain the `full_match()` / `glob()` / `rglob()` pattern language in its own section. Move `rglob()` documentation under `glob()` and reduce duplicated text.
2024-02-26 00:19:03 +00:00
Barney Gale fda7445ca5
GH-70303: Make `pathlib.Path.glob('**')` return both files and directories (#114684)
Return files and directories from `pathlib.Path.glob()` if the pattern ends
with `**`. This is more compatible with `PurePath.full_match()` and with
other glob implementations such as bash and `glob.glob()`. Users can add a
trailing slash to match only directories.

In my previous patch I added a `FutureWarning` with the intention of fixing
this in Python 3.15. Upon further reflection I think this was an
unnecessarily cautious remedy to a clear bug.
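
A sketch of the two pattern forms on a temporary tree (the file names are invented for the example). On an interpreter that includes this change, `glob("**")` also yields `pkg/mod.py`; older interpreters yield directories only. The trailing-slash form yields directories in either case.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pkg").mkdir()
    (root / "pkg" / "mod.py").write_text("")

    # Pattern ending in "**": files and directories after this change.
    everything = sorted(p.relative_to(root).as_posix() for p in root.glob("**"))

    # Pattern ending in "**/": directories only.
    dirs_only = sorted(p.relative_to(root).as_posix() for p in root.glob("**/"))
    all_dirs = all(p.is_dir() for p in root.glob("**/"))

print(everything)
print(dirs_only)
```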
2024-01-30 19:52:53 +00:00
Barney Gale 7e31d6dea2
gh-88569: add `ntpath.isreserved()` (#95486)
Add `ntpath.isreserved()`, which identifies reserved pathnames such as "NUL", "AUX" and "CON".

Deprecate `pathlib.PurePath.is_reserved()`.
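
A minimal sketch of code that prefers the new function and falls back to the deprecated method where `ntpath.isreserved()` is not available. The wrapper name `is_reserved_name` is hypothetical.

```python
import ntpath
from pathlib import PureWindowsPath


def is_reserved_name(name):
    """Return True if *name* is a reserved Windows pathname such as "NUL"."""
    checker = getattr(ntpath, "isreserved", None)
    if checker is not None:
        return checker(name)
    # Older interpreters: only the pathlib method exists.
    return PureWindowsPath(name).is_reserved()


print(is_reserved_name("NUL"))
print(is_reserved_name("readme.txt"))
```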

---------

Co-authored-by: Eryk Sun <eryksun@gmail.com>
Co-authored-by: Brett Cannon <brett@python.org>
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
2024-01-26 18:14:24 +00:00
Barney Gale b69548a0f5
GH-73435: Add `pathlib.PurePath.full_match()` (#114350)
In 49f90ba we added support for the recursive wildcard `**` in
`pathlib.PurePath.match()`. This should allow arbitrary prefix and suffix
matching, like `p.match('foo/**')` or `p.match('**/foo')`, but there's a
problem: for relative patterns only, `match()` implicitly inserts a `**`
token on the left hand side, causing all patterns to match from the right.
As a result, it's impossible to match relative patterns from the left:
`PurePath('foo/bar').match('bar/**')` is true!

This commit reverts the changes to `match()`, and instead adds a new
`full_match()` method that:

- Allows empty patterns
- Supports the recursive wildcard `**`
- Matches the *entire* path when given a relative pattern
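
The contrast between the two methods can be sketched as follows. The path `a/b/c.py` is made up for the example, and the `hasattr` guard is there because `full_match()` only exists from Python 3.13.

```python
from pathlib import PurePosixPath

p = PurePosixPath("a/b/c.py")

# match() anchors relative patterns on the right, so a suffix is enough...
matched_from_right = p.match("b/c.py")
# ...and a prefix on its own does not match.
matched_prefix = p.match("a/b")

if hasattr(p, "full_match"):
    # full_match() must account for every segment of the path.
    whole = p.full_match("a/b/*.py")
    suffix_only = p.full_match("b/c.py")
    print(whole, suffix_only)

print(matched_from_right, matched_prefix)
```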
2024-01-26 01:12:46 +00:00
Barney Gale b822b85ac1
GH-105900: Fix `pathlib.Path.symlink_to(target_is_directory=...)` docs (#114035)
Clarify that *target_is_directory* only matters if the target doesn't
exist.
2024-01-23 05:30:16 +00:00
Barney Gale 32c227470a
GH-82695: Clarify `pathlib.Path.mkdir()` documentation (#114032)
Remove a double negative in the documentation of `mkdir()`'s *exist_ok*
parameter.
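
For reference, a sketch of the behaviour that paragraph describes, using a temporary directory and invented names (`data`, `file.txt`):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data"
    target.mkdir()
    target.mkdir(exist_ok=True)  # already a directory: no error

    try:
        target.mkdir()  # exist_ok defaults to False
    except FileExistsError:
        raised_without_exist_ok = True
    else:
        raised_without_exist_ok = False

    # exist_ok=True does not help when the existing path is not a directory.
    blocker = Path(tmp) / "file.txt"
    blocker.write_text("")
    try:
        blocker.mkdir(exist_ok=True)
    except FileExistsError:
        raised_for_file = True
    else:
        raised_for_file = False

print(raised_without_exist_ok, raised_for_file)
```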

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
2024-01-23 02:31:09 +00:00
Barney Gale 3a61d24062
GH-99334: Explain that `PurePath.is_relative_to()` is purely lexical. (#114031) 2024-01-23 01:06:44 +00:00
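
A brief illustration of what "purely lexical" means here, assuming Python 3.9 or later (where `is_relative_to()` exists). No filesystem access takes place and `..` segments are not collapsed.

```python
from pathlib import PurePosixPath

# "/etc/../usr/bin" refers to /usr/bin, yet its segments start with "/etc".
lexical = PurePosixPath("/etc/../usr/bin").is_relative_to("/etc")

# The already-normalized spelling of the same location gives the other answer.
normalized = PurePosixPath("/usr/bin").is_relative_to("/etc")

print(lexical, normalized)  # True False
```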
Barney Gale 6313cdde58
GH-79634: Accept path-like objects as pathlib glob patterns. (#114017)
Allow `os.PathLike` objects to be passed as patterns to `pathlib.Path.glob()` and `rglob()`. (It's already possible to use them in `PurePath.match()`.)

While we're in the area:

- Allow empty glob patterns in `PathBase` (but not `Path`)
- Speed up globbing in `PathBase` by generating paths with trailing slashes only as a final step, rather than for every intermediate directory.
- Simplify and speed up handling of rare patterns involving both `**` and `..` segments.
2024-01-20 02:10:25 +00:00
Barney Gale 45e527dfb5
GH-110109: pathlib docs: bring `from_uri()` and `as_uri()` together. (#110312)
This is a very soft deprecation of `PurePath.as_uri()`. We instead document
it as a `Path` method, and add a couple of sentences mentioning that it's
also available in `PurePath`.

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
2024-01-16 22:51:57 +00:00
Barney Gale 7092b3f131
GH-78988: Document `pathlib.Path.glob()` exception propagation. (#114036)
We propagate the `OSError` from the `is_dir()` call on the top-level
directory, and suppress all others.
2024-01-16 22:28:54 +00:00
Taylor Packard ed8720ace4
gh-112758: Updated pathlib documentation for PurePath.match (#112814) 2023-12-08 18:13:17 +00:00
Alex Waygood 2c3906bc4b
gh-101100: Silence Sphinx warnings when `ntpath` or `posixpath` are referenced (#112833) 2023-12-07 20:57:30 +00:00
Kamil Turek a1551b48ee
gh-103363: Add follow_symlinks argument to `pathlib.Path.owner()` and `group()` (#107962) 2023-12-04 19:42:01 +00:00
Junya Okabe 9d70831cb7
gh-110745: add a newline argument to pathlib.Path.read_text (#110880)
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Barney Gale <barney.gale@gmail.com>
2023-11-21 22:32:38 +00:00
Barney Gale 15de493395
GH-107465: Add `pathlib.Path.from_uri()` classmethod. (#107640)
This method supports file URIs (including variants) as described in RFC 8089, such as URIs generated by `pathlib.Path.as_uri()` and `urllib.request.pathname2url()`.

The method is added to `Path` rather than `PurePath` because it uses `os.fsdecode()`, and so its results vary from system to system. I intend to deprecate `PurePath.as_uri()` and move it to `Path` for the same reason.
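
A round-trip sketch, assuming a POSIX system (hence the `os.name` check) and an interpreter new enough to provide `Path.from_uri()`. The path `/home/user/file.txt` is only an example and does not need to exist.

```python
import os
from pathlib import Path

if os.name == "posix":
    original = Path("/home/user/file.txt")
    uri = original.as_uri()
    print(uri)

    if hasattr(Path, "from_uri"):
        restored = Path.from_uri(uri)
        print(restored == original)
```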

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
2023-10-01 16:14:02 +01:00
Barney Gale ecd813f054
GH-109187: Improve symlink loop handling in `pathlib.Path.resolve()` (GH-109192)
Treat symlink loops like other errors: in strict mode, raise `OSError`, and
in non-strict mode, do not raise any exception.
2023-09-26 17:57:17 +01:00
Barney Gale ec0a0d2bd9
GH-70303: Emit FutureWarning when pathlib glob pattern ends with `**` (GH-105413)
In a future Python release, patterns with this ending will match both files
and directories. Users may add a trailing slash to remove the warning.
2023-08-04 23:12:12 +00:00
Barney Gale c6c5665ee0
GH-100502: Add `pathlib.PurePath.pathmod` attribute (GH-106533)
This instance attribute stores the implementation of `os.path` used for
low-level path operations: either `posixpath` or `ntpath`.
2023-07-19 18:59:55 +01:00
Ned Batchelder f014f1567c
docs: clarify Path.suffix (GH-106650) 2023-07-13 20:24:54 +01:00
Barney Gale 219effa876
GH-105793: Add follow_symlinks argument to `pathlib.Path.is_dir()` and `is_file()` (GH-105794)
Brings `pathlib.Path.is_dir()` in line with `os.DirEntry.is_dir()`, which
will be important for implementing generic path walking and globbing.
Likewise `is_file()`.
2023-06-26 17:58:17 +01:00
Barney Gale 4a6c84fc1e
GH-104375: Use `versionchanged` to describe new arguments in pathlib docs (GH-104376) 2023-06-24 16:14:09 +01:00
Barney Gale a8006706f7
GH-89812: Add `pathlib.UnsupportedOperation` (GH-105926)
This new exception type is raised instead of `NotImplementedError` when
a path operation is not supported. It can be raised from `Path.readlink()`,
`symlink_to()`, `hardlink_to()`, `owner()` and `group()`. In a future
version of pathlib, it will be raised by `AbstractPath` for these methods
and others, such as `AbstractPath.mkdir()` and `unlink()`.
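
A sketch of exception handling that works with and without the new type. `UnsupportedOperation` is a subclass of `NotImplementedError`, so falling back to the base class keeps one `except` clause valid on older interpreters. The helper name `read_link_target` is hypothetical.

```python
import pathlib

# Use the dedicated exception where it exists, otherwise its base class.
UnsupportedOperation = getattr(pathlib, "UnsupportedOperation", NotImplementedError)


def read_link_target(path):
    """Return the symlink target, or None where the operation is unsupported."""
    try:
        return path.readlink()
    except UnsupportedOperation:
        return None


print(issubclass(UnsupportedOperation, NotImplementedError))  # True
```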
2023-06-22 14:35:51 +01:00
Barney Gale 24af45172f
GH-102613: Fast recursive globbing in `pathlib.Path.glob()` (GH-104512)
This commit introduces a 'walk-and-match' strategy for handling glob patterns that include a non-terminal `**` wildcard, such as `**/*.py`. For this example, the previous implementation recursively walked directories using `os.scandir()` when it expanded the `**` component, and then **scanned those same directories again** when it expanded the `*.py` component. This is wasteful.

In the new implementation, any components following a `**` wildcard are used to build a `re.Pattern` object, which is used to filter the results of the recursive walk. A pattern like `**/*.py` uses half the number of `os.scandir()` calls; a pattern like `**/*/*.py` a third, etc.

This new algorithm does not apply if either:

1. The *follow_symlinks* argument is set to `None` (its default), or
2. The pattern contains `..` components.

In these cases we fall back to the old implementation.

This commit also replaces selector classes with selector functions. These generators directly yield results rather than calling through to their successors. A new internal `Path._glob()` method takes care to chain these generators together, which simplifies the lazy algorithm and slightly improves performance. It should also be easier to understand and maintain.
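
The walk-and-match idea can be illustrated outside pathlib. The following is a simplified sketch, not the code from this commit: it assumes a single trailing segment after `**`, uses `os.walk()` for the one recursive pass, and builds the filter with `fnmatch.translate()`. The function name `walk_and_match` is made up.

```python
import fnmatch
import os
import re
import tempfile
from pathlib import Path


def walk_and_match(root, tail):
    """Yield entries under root whose name matches the single segment `tail`.

    Each directory is listed once by os.walk(); the compiled pattern then
    filters the names, instead of scanning the directories a second time.
    """
    regex = re.compile(fnmatch.translate(tail))
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if regex.match(name):
                yield Path(dirpath, name)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pkg" / "sub").mkdir(parents=True)
    for rel in ("a.py", "pkg/b.py", "pkg/sub/c.py", "pkg/notes.txt"):
        (root / rel).write_text("")

    walked = sorted(walk_and_match(root, "*.py"))
    globbed = sorted(root.glob("**/*.py"))

print(walked == globbed)
```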
2023-06-06 23:50:36 +01:00
Barney Gale 49f90ba1ea
GH-73435: Implement recursive wildcards in `pathlib.PurePath.match()` (#101398)
`PurePath.match()` now handles the `**` wildcard as in `Path.glob()`, i.e. it matches any number of path segments.

We now compile a `re.Pattern` object for the entire pattern. This is made more difficult by `fnmatch` not treating directory separators as special when evaluating wildcards (`*`, `?`, etc), and so we arrange the path parts onto separate *lines* in a string, and ensure we don't set `re.DOTALL`.
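
The one-segment-per-line trick can be shown with a toy translator. This is a sketch under simplifying assumptions, not the pathlib implementation: it handles only `*`, `?` and `**`, splits on `/`, and matches the whole path rather than anchoring on the right as `match()` does. All function names are hypothetical.

```python
import re


def segment_to_regex(segment):
    """Translate one pattern segment; wildcards must not cross a line break."""
    out = []
    for ch in segment:
        if ch == "*":
            out.append(".*")  # '.' stops at '\n' because re.DOTALL is not set
        elif ch == "?":
            out.append(".")
        else:
            out.append(re.escape(ch))
    return "".join(out)


def compile_path_pattern(pattern):
    pieces = []
    for segment in pattern.split("/"):
        if segment == "**":
            pieces.append(r"(?:.+\n)*")  # zero or more whole lines (segments)
        else:
            pieces.append(segment_to_regex(segment) + r"\n")
    return re.compile("".join(pieces))


def path_matches(path, pattern):
    # Put every path segment on its own line before matching.
    text = "".join(part + "\n" for part in path.split("/"))
    return compile_path_pattern(pattern).fullmatch(text) is not None


print(path_matches("a/b/c.py", "a/**/*.py"))  # True
print(path_matches("a/b/c.py", "*.py"))       # False: '*' stays within one segment
```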

Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2023-05-30 20:18:09 +00:00
Barney Gale ace676e2c2
GH-77609: Add follow_symlinks argument to `pathlib.Path.glob()` (GH-102616)
Add a keyword-only *follow_symlinks* parameter to `pathlib.Path.glob()` and `rglob()`.

When *follow_symlinks* is `None` (the default), these methods follow symlinks except when evaluating "`**`" wildcards. When set to true or false, symlinks are always or never followed, respectively.
2023-05-29 16:59:52 +01:00
thirumurugan dcdc90d384
GH-104484: Add case_sensitive argument to `pathlib.PurePath.match()` (GH-104565)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
2023-05-18 18:59:31 +01:00
Hugo van Kemenade 13ac1766bc
gh-103960: Dark mode: invert image brightness (#103983) 2023-05-10 16:46:37 +03:00
Barney Gale d00d942149
GH-100479: Add `pathlib.PurePath.with_segments()` (GH-103975)
Add `pathlib.PurePath.with_segments()`, which creates a path object from arguments. This method is called whenever a derivative path is created, such as from `pathlib.PurePath.parent`. Subclasses may override this method to share information between path objects.

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2023-05-05 19:04:53 +00:00
Barney Gale 8100be5535
GH-81079: Add case_sensitive argument to `pathlib.Path.glob()` (GH-102710)
This argument allows case-sensitive matching to be enabled on Windows, and
case-insensitive matching to be enabled on Posix.

Co-authored-by: Steve Dower <steve.dower@microsoft.com>
2023-05-04 16:44:36 +00:00
andrei kulakov af886ffa06
GH-89769: `pathlib.Path.glob()`: do not follow symlinks when checking for precise match (GH-29655)
Co-authored-by: Barney Gale <barney.gale@gmail.com>
2023-05-03 04:50:10 +01:00
Barney Gale 6716254e71
GH-101362: Optimise PurePath(PurePath(...)) (GH-101667)
The previous `_parse_args()` method pulled the `_parts` out of any supplied `PurePath` objects; these were subsequently joined in `_from_parts()` using `os.path.join()`. This is actually a slower form of joining than calling `fspath()` on the path object, because it doesn't take advantage of the fact that the contents of `_parts` is normalized!

This reduces the time taken to run `PurePath("foo", "bar")` by ~20%, and the time taken to run `PurePath(p, "cheese")`, where `p = PurePath("/foo", "bar", "baz")`, by ~40%.

Automerge-Triggered-By: GH:AlexWaygood
2023-03-05 15:50:21 -08:00
Furkan Onder 3690688149
GH-101898: Fix missing term references for hashable definition (#101899)
2023-02-14 14:20:11 +04:00
Barney Gale 01093b8203
gh-86610: Use attribute directive in docs for pathlib.PurePath (#101114) 2023-01-20 23:13:58 +01:00
Jürgen Gmach 61f338a005
GH-101112: Specify type of pattern for Path.rglob (#101132)
The documentation for `rglob` did not mention what `pattern` actually
is.

Mentioning and linking to `fnmatch` makes this explicit, as the
documentation for `fnmatch` both shows the syntax and some explanation.
2023-01-20 23:11:31 +01:00
Shantanu 2f2fa03ff3
gh-87691: clarify use of anchor in pathlib docs (#100782)
This is feedback from https://github.com/python/cpython/pull/100737#discussion_r1062968696

This matches the wording from the `os.path.join` docs better:
https://docs.python.org/3/library/os.path.html#os.path.join

In particular, the previous use of "anchor" was incorrect given the
pathlib definition of "anchor".

Co-authored-by: Barney Gale <barney.gale@gmail.com>
2023-01-05 17:49:33 -08:00
Shantanu 1ae619c911
gh-87691: add an absolute path pathlib example in / operator docs (GH-100737)
The behaviour is fully explained a couple paragraphs above, but it may be useful to have a brief example to cover the behaviour.
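
An example of the kind this commit describes might look like the following (the exact wording added to the docs is not reproduced here; the paths are illustrative):

```python
from pathlib import PurePosixPath

# Relative segments are appended as usual.
joined = PurePosixPath("/etc") / "init.d" / "apache2"

# An absolute segment discards everything to its left.
replaced = PurePosixPath("/etc") / "/usr" / "bin"

print(joined)    # /etc/init.d/apache2
print(replaced)  # /usr/bin
```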

Automerge-Triggered-By: GH:hauntsaninja
2023-01-05 14:55:35 -08:00
Barney Gale 5a991da329
gh-78707: deprecate passing >1 argument to `PurePath.[is_]relative_to()` (GH-94469)
This brings `relative_to()` and `is_relative_to()` more in line with other pathlib methods like `rename()` and `symlink_to()`.

Resolves #78707.
2022-12-16 16:14:27 -08:00
Charles Machalow 1b2de89bce
gh-99547: Add isjunction methods for checking if a path is a junction (GH-99548) 2022-11-22 17:19:34 +00:00
Mateusz 0023f51deb
gh-98240: Updated Path.rename docs, when it is atomic (GH-98245) 2022-10-28 16:31:37 -07:00
domragusa e089f23bbb
gh-84538: add strict argument to pathlib.PurePath.relative_to (GH-19813)
By default, :meth:`pathlib.PurePath.relative_to` doesn't deal with paths that are not a direct prefix of the other, raising an exception in that instance. This change adds a *walk_up* parameter that can be set to allow for using ``..`` to calculate the relative path.

example:
```
>>> p = PurePosixPath('/etc/passwd')
>>> p.relative_to('/etc')
PurePosixPath('passwd')
>>> p.relative_to('/usr')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pathlib.py", line 940, in relative_to
    raise ValueError(error_message.format(str(self), str(formatted)))
ValueError: '/etc/passwd' does not start with '/usr'
>>> p.relative_to('/usr', walk_up=True)
PurePosixPath('../etc/passwd')
```
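
A version-tolerant sketch of the same computation, assuming the parameter is spelled *walk_up* as the description says and is available from Python 3.12. On older interpreters the sketch falls back to `posixpath.relpath()`, which gives the same lexical result for absolute paths.

```python
import posixpath
import sys
from pathlib import PurePosixPath

p = PurePosixPath("/etc/passwd")

if sys.version_info >= (3, 12):
    rel = p.relative_to("/usr", walk_up=True)
else:
    # Fallback for interpreters without walk_up.
    rel = PurePosixPath(posixpath.relpath(p, "/usr"))

print(rel)  # ../etc/passwd
```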


https://bugs.python.org/issue40358

Automerge-Triggered-By: GH:brettcannon
2022-10-28 16:20:14 -07:00
Julien Palard 2eb503e4dd
Doc: Found some remaining default roles. (GH-98392) 2022-10-18 15:46:18 +02:00
Ansab Gillani b462f143ff
Fix documentation typo for pathlib.Path.walk (GH-96301) 2022-08-26 14:21:40 -07:00
Barney Gale 29650fea96
gh-86943: implement `pathlib.WindowsPath.is_mount()` (GH-31458)
Have `pathlib.WindowsPath.is_mount()` call `ntpath.ismount()`. Previously it raised `NotImplementedError` unconditionally.


https://bugs.python.org/issue42777
2022-08-05 15:37:44 -07:00
Stanislav Zmiev c1e929858a
gh-90385: Add `pathlib.Path.walk()` method (GH-92517)
Automerge-Triggered-By: GH:brettcannon
2022-07-22 16:55:46 -07:00
Ned Batchelder 6e2fbdab92
docs: use 'recursively' in the description of rglob, and mention globs in the os equivalences (GH-94954)
The r in `rglob` stands for "recursively", so use the word in the description. Also, glob and rglob can usefully be mentioned as the pathlib equivalent of os.walk.
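
A small comparison of the two approaches on a temporary tree (the names are invented for the example). It assumes the tree contains no symlinks, which the two functions may treat differently.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs" / "img").mkdir(parents=True)
    (root / "docs" / "index.rst").write_text("")
    (root / "setup.py").write_text("")

    # os.walk(): collect every directory and file below root.
    via_walk = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            via_walk.add(Path(dirpath, name))

    # pathlib: rglob("*") recursively yields the same entries.
    via_rglob = set(root.rglob("*"))

print(via_walk == via_rglob)
```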

Automerge-Triggered-By: GH:brettcannon
2022-07-20 14:47:43 -07:00