Commit Graph

163 Commits

Author SHA1 Message Date
Barney Gale cbac8a3888
GH-121462: pathlib docs: improve table of corresponding os/os.path functions (#121465)
Re-order table of corresponding functions with the following priorities:

1. Pure functionality is at the top
2. `os.path` functions are shown before `os` functions
3. Similar functionality is kept together
4. Functionality follows docs order where possible

Add a few missed correspondences:

- `os.path.isjunction` and `Path.is_junction`
- `os.path.ismount` and `Path.is_mount`
- `os.lstat()` and `Path.lstat()`
- `os.lchmod()` and `Path.lchmod()`

Also add footnotes describing a few differences.
2024-07-27 18:03:18 +01:00
Ville Skyttä bc264eac3a
Docs: spelling and grammar fixes (#122084)
Corrected some grammar and spelling issues in documentation.

Co-authored-by: Russell Keith-Magee <russell@keith-magee.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2024-07-22 09:14:25 +08:00
Barney Gale c4c7097e64
GH-73991: Support preserving metadata in `pathlib.Path.copytree()` (#121438)
Add *preserve_metadata* keyword-only argument to `pathlib.Path.copytree()`,
defaulting to false. When set to true, we copy timestamps, permissions,
extended attributes and flags where available, like `shutil.copystat()`.
2024-07-20 23:32:52 +01:00
Barney Gale 094375b9b7
GH-73991: Add `pathlib.Path.rmtree()` (#119060)
Add a `Path.rmtree()` method that removes an entire directory tree, like
`shutil.rmtree()`. The signature of the optional *on_error* argument
matches the `Path.walk()` argument of the same name, but differs from the
*onexc* and *onerror* arguments to `shutil.rmtree()`. Consistency within
pathlib is probably more important.

In the private pathlib ABCs, we add an implementation based on `walk()`.
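
A minimal sketch of what a walk-based removal can look like. It is not the pathlib implementation: `os.walk()` stands in for the ABC's `walk()`, the name `remove_tree` is hypothetical, and raising when no handler is given is an assumption of this sketch. As described above, *on_error* receives an `OSError` instance.

```python
import os


def remove_tree(root, on_error=None):
    """Remove the directory tree at *root*, walking it bottom-up."""
    if on_error is None:
        # Assumption for this sketch: with no handler, errors propagate.
        def on_error(exc):
            raise exc

    # topdown=False yields children before parents, so every directory is
    # already empty by the time we try to remove it.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False, onerror=on_error):
        for name in filenames:
            try:
                os.unlink(os.path.join(dirpath, name))
            except OSError as exc:
                on_error(exc)
        for name in dirnames:
            path = os.path.join(dirpath, name)
            try:
                # Symlinks to directories are listed but never walked into.
                if os.path.islink(path):
                    os.unlink(path)
                else:
                    os.rmdir(path)
            except OSError as exc:
                on_error(exc)
    try:
        os.rmdir(root)
    except OSError as exc:
        on_error(exc)
```

Passing `on_error=errors.append` collects failures in a list instead of aborting on the first one.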

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2024-07-20 20:14:13 +00:00
Barney Gale 88fc0655d4
GH-73991: Support preserving metadata in `pathlib.Path.copy()` (#120806)
Add *preserve_metadata* keyword-only argument to `pathlib.Path.copy()`, defaulting to false. When set to true, we copy timestamps, permissions, extended attributes and flags where available, like `shutil.copystat()`. The argument has no effect on Windows, where metadata is always copied.

Internally (in the pathlib ABCs), path types gain `_readable_metadata` and `_writable_metadata` attributes. These sets of strings describe what kinds of metadata can be retrieved and stored. We take an intersection of `source._readable_metadata` and `target._writable_metadata` to minimise reads/writes. A new `_read_metadata()` method accepts a set of metadata keys and returns a dict with those keys, and a new `_write_metadata()` method accepts a dict of metadata. We *might* make these public in future, but it's hard to justify while the ABCs are still private.
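
The intersection step can be sketched with toy classes. Everything here is illustrative: `ToyPath`, `SourcePath`, `TargetPath`, `copy_metadata` and the key names (`"mode"`, `"times_ns"`, `"xattrs"`) are assumptions, not the private pathlib API.

```python
class ToyPath:
    """Hypothetical stand-in for a path type; not part of pathlib."""

    _readable_metadata = frozenset()
    _writable_metadata = frozenset()

    def __init__(self, metadata=None):
        self._metadata = dict(metadata or {})

    def _read_metadata(self, keys):
        # Return a dict limited to the requested keys.
        return {key: self._metadata[key] for key in keys if key in self._metadata}

    def _write_metadata(self, metadata):
        self._metadata.update(metadata)


class SourcePath(ToyPath):
    _readable_metadata = frozenset({"mode", "times_ns", "xattrs"})


class TargetPath(ToyPath):
    _writable_metadata = frozenset({"mode", "times_ns"})


def copy_metadata(source, target):
    # Only touch metadata that the source can read AND the target can store.
    keys = source._readable_metadata & target._writable_metadata
    if keys:
        target._write_metadata(source._read_metadata(keys))
    return keys
```

In this sketch the extended attributes are never read, because the target type declares that it cannot store them.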
2024-07-06 17:18:39 +01:00
Barney Gale f09d184821
GH-73991: Support copying directory symlinks on older Windows (#120807)
Check for `ERROR_INVALID_PARAMETER` when calling `_winapi.CopyFile2()` and
raise `UnsupportedOperation`. In `Path.copy()`, handle this exception and
fall back to the `PathBase.copy()` implementation.
2024-07-03 04:30:29 +01:00
Barney Gale 6b280a8498
GH-119054: Add alt text to pathlib inheritance diagram (#121158)
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
2024-06-29 17:46:53 +00:00
Barney Gale d6d8707ff2
GH-119054: Add "Expanding and resolving paths" section to pathlib docs. (#120970)
Add dedicated subsection for `home()`, `expanduser()`, `cwd()`,
`absolute()`, `resolve()` and `readlink()`. The position of this section
keeps all the `Path` constructors (`Path()`, `Path.from_uri()`,
`Path.home()` and `Path.cwd()`) near the top. Within the section, closely
related methods are kept adjacent. Specifically:

- `home()` and `expanduser()` (the former calls the latter)
- `cwd()` and `absolute()` (the former calls the latter)
- `absolute()` and `resolve()` (both make paths absolute)
- `resolve()` and `readlink()` (both read symlink targets)

The "Other methods" section is removed.
2024-06-29 16:09:47 +01:00
Barney Gale e4a97a7fb1
GH-119054: Add "Permissions and ownership" section to pathlib docs. (#120505)
Add dedicated subsection for `pathlib.Path.owner()`, `group()`, `chmod()` and
`lchmod()`.

Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
2024-06-24 19:05:24 +00:00
Barney Gale 35e998f560
GH-73991: Add `pathlib.Path.copytree()` (#120718)
Add `pathlib.Path.copytree()` method, which recursively copies one
directory to another.

This differs from `shutil.copytree()` in the following respects:

1. Our method has a *follow_symlinks* argument, whereas shutil's has a
   *symlinks* argument with an inverted meaning.
2. Our method lacks something like a *copy_function* argument. It always
   uses `Path.copy()` to copy files.
3. Our method lacks something like an *ignore_dangling_symlinks* argument.
   Instead, users can filter out dangling symlinks with *ignore*, or
   ignore exceptions with *on_error*.
4. Our *ignore* argument is a callable that accepts a single path object,
   whereas shutil's accepts a path and a list of child filenames.
5. We add an *on_error* argument, which is a callable that accepts
   an `OSError` instance. (`Path.walk()` also accepts such a callable).
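
To illustrate point 4, here is a small adapter that lets a single-path predicate be used with `shutil.copytree()`, whose *ignore* callable receives a directory and a list of child names. The helper names `adapt_ignore` and `is_backup` are hypothetical.

```python
import os
import shutil
from pathlib import Path


def adapt_ignore(predicate):
    """Wrap a one-argument predicate for use as shutil.copytree(ignore=...)."""
    def ignore(directory, names):
        # shutil passes the directory and its child names, and expects back
        # the subset of names that should be skipped.
        return [name for name in names if predicate(Path(directory, name))]
    return ignore


def is_backup(path):
    # Example predicate in the single-path style described above.
    return path.suffix == ".bak"
```

It would be used as `shutil.copytree(src, dst, ignore=adapt_ignore(is_backup))`.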

Co-authored-by: Nice Zombies <nineteendo19d0@gmail.com>
2024-06-23 22:01:12 +01:00
Barney Gale 20d5b84f57
GH-73991: Add follow_symlinks argument to `pathlib.Path.copy()` (#120519)
Add support for not following symlinks in `pathlib.Path.copy()`.

On Windows we add the `COPY_FILE_COPY_SYMLINK` flag if following symlinks is disabled. If the source is a symlink to a directory, this call will fail with `ERROR_ACCESS_DENIED`. In this case we add `COPY_FILE_DIRECTORY` to the flags and retry. This can fail on older versions of Windows, which we note in the docs.

No news as `copy()` was only just added.
2024-06-19 00:59:54 +00:00
Barney Gale 7c38097add
GH-73991: Add `pathlib.Path.copy()` (#119058)
Add a `Path.copy()` method that copies the content of one file to another.

This method is similar to `shutil.copyfile()` but differs in the following ways:

- Uses `fcntl.FICLONE` where available (see GH-81338)
- Uses `os.copy_file_range` where available (see GH-81340)
- Uses `_winapi.CopyFile2` where available, even though this copies more metadata than the other implementations. This makes `WindowsPath.copy()` more similar to `shutil.copy2()`.

The method is presently _less_ specified than the `shutil` functions to allow OS-specific optimizations that might copy more or less metadata.
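
A rough sketch of the "use a fast path where available" idea, limited to `os.copy_file_range` with a plain fallback. The function name `copy_contents` is hypothetical; the sketch does not attempt `fcntl.FICLONE` or `_winapi.CopyFile2` and is not the code added by this commit.

```python
import os
import shutil


def copy_contents(source, target):
    """Copy file data only (no metadata) from *source* to *target*."""
    with open(source, "rb") as fsrc, open(target, "wb") as fdst:
        if hasattr(os, "copy_file_range"):
            try:
                while True:
                    # Ask the kernel to copy up to 1 MiB per call; 0 means EOF.
                    if os.copy_file_range(fsrc.fileno(), fdst.fileno(), 1 << 20) == 0:
                        return
            except OSError:
                # Not supported for these files: rewind and start over.
                fsrc.seek(0)
                fdst.seek(0)
                fdst.truncate()
        # Portable fallback: read and write through Python buffers.
        shutil.copyfileobj(fsrc, fdst)
```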

Incorporates code from GH-81338 and GH-93152.

Co-authored-by: Eryk Sun <eryksun@gmail.com>
2024-06-14 17:15:49 +01:00
Barney Gale d88a1f2e15
GH-119054: Add "Renaming and deleting" section to pathlib docs. (#120465)
Add dedicated subsection for `pathlib.Path.rename()`, `replace()`,
`unlink()` and `rmdir()`.
2024-06-13 21:25:26 +01:00
Barney Gale c2d810b6d4
GH-119054: Add "Creating files and directories" section to pathlib docs. (#120186)
Add dedicated subsection for `pathlib.Path.touch()`, `mkdir()`,
`symlink_to()` and `hardlink_to()`. Also note that `open()`, `write_text()`
and `write_bytes()` are often used to create files.

Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
2024-06-13 17:58:46 +00:00
Barney Gale 14e1506a6d
GH-119054: Add "Reading directories" section to pathlib docs (#119956)
Add a dedicated subsection for `Path.iterdir()`-related methods,
specifically `iterdir()`, `glob()`, `rglob()` and `walk()`.

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
2024-06-06 23:27:39 +00:00
Barney Gale bd6d4ed645
GH-119054: Add "Reading and writing files" section to pathlib docs (#119524)
Add a dedicated subsection for `open()`, `read_text()`, `read_bytes()`,
`write_text()` and `write_bytes()`.
2024-06-02 19:39:19 +00:00
Barney Gale e418fc3a6e
GH-82805: Fix handling of single-dot file extensions in pathlib (#118952)
pathlib now treats "`.`" as a valid file extension (suffix). This brings
it in line with `os.path.splitext()`.
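
For reference, this is how `splitext()` treats a trailing dot. The example uses `posixpath` directly so that the result does not depend on the platform; the file names are arbitrary.

```python
import posixpath

print(posixpath.splitext("archive.tar.gz"))  # ('archive.tar', '.gz')
print(posixpath.splitext("notes."))          # ('notes', '.')
print(posixpath.splitext(".bashrc"))         # ('.bashrc', '')
```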

In the (private) pathlib ABCs, we add a new `ParserBase.splitext()` method
that splits a path into a `(root, ext)` pair, like `os.path.splitext()`.
This method is called by `PurePathBase.stem`, `suffix`, etc. In a future
version of pathlib, we might make these base classes public, and so users
will be able to define their own `splitext()` method to control file
extension splitting.

In `pathlib.PurePath` we add optimised `stem`, `suffix` and `suffixes`
properties that don't use `splitext()`, which avoids computing the path
base name twice.
2024-05-25 21:01:36 +01:00
Barney Gale 81d6336230
GH-119054: Add "Querying file type and status" section to pathlib docs (#119055)
Add a dedicated subsection for `Path.stat()`-related methods, specifically
`stat()`, `lstat()`, `exists()`, `is_*()`, and `samefile()`.
2024-05-24 19:35:13 +00:00
Barney Gale fbe6a0988f
GH-101357: Suppress `OSError` from `pathlib.Path.exists()` and `is_*()` (#118243)
Suppress all `OSError` exceptions from `pathlib.Path.exists()` and `is_*()`
rather than a selection of more common errors as we do presently. Also
adjust the implementations to call `os.path.exists()` etc, which are much
faster on Windows thanks to GH-101196.
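
The behaviour described above amounts to the following pattern, shown as a standalone sketch rather than the actual pathlib code. The function name `exists` is illustrative.

```python
import os


def exists(path):
    """Return True if *path* exists; treat any OSError as 'does not exist'."""
    try:
        os.stat(path)
    except OSError:
        # Covers missing files, permission problems, over-long names, etc.
        return False
    return True
```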
2024-05-14 17:53:15 +00:00
Serhiy Storchaka 05c2fe1acd
Format None, True, False and NotImplemented as literals (GH-118758)
2024-05-08 22:35:16 +03:00

Ned Batchelder bcb435ee8f
docs: module page titles should not start with a link to themselves (#117099)
2024-05-08 20:34:40 +01:00
Barney Gale a74f117dab
GH-115060: Speed up `pathlib.Path.glob()` by omitting initial `stat()` (#117831)
Since 6258844c, paths that might not exist can be fed into pathlib's
globbing implementation, which will call `os.scandir()` / `os.lstat()` only
when strictly necessary. This allows us to drop an initial `self.is_dir()`
call, which saves a `stat()`.

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
2024-04-14 00:08:03 +01:00
Barney Gale 6150bb2412
GH-77609: Add recurse_symlinks argument to `pathlib.Path.glob()` (#117311)
Replace tri-state `follow_symlinks` with boolean `recurse_symlinks` argument. The new argument controls whether symlinks are followed when expanding recursive `**` wildcards. The possible argument values correspond as follows:

    follow_symlinks  recurse_symlinks
    ===============  ================
    False            N/A
    None             False
    True             True

We therefore drop support for not following symlinks when expanding non-recursive pattern parts; it wasn't requested in the original issue, and it's a feature not found in any shells.

This makes the API easier to grok by eliminating `None` as an option.

No news blurb as `follow_symlinks` was new in 3.13.
2024-04-05 18:51:54 +00:00
Barney Gale 752e18389e
GH-114575: Rename `PurePath.pathmod` to `PurePath.parser` (#116513)
And rename the private base class from `PathModuleBase` to `ParserBase`.
2024-03-31 19:14:48 +01:00
Barney Gale 72eea512b8
GH-106747: Document another difference between `glob` and `pathlib`. (#116518)
Document that `path.glob()` might return *path*, whereas
`glob.glob(root_dir=path)` will never return an empty string corresponding
to *path*.
2024-03-22 19:14:09 +00:00
Barney Gale 1904f0a224
GH-113838: Add "Comparison to os.path" section to pathlib docs (#115926)
2024-03-15 00:11:49 +00:00
Mariusz Felisiak 3f1b6efee9
Docs: fix broken links (#116651)
2024-03-12 21:19:33 -07:00
Barney Gale e921f09c8a
GH-101112: Add "pattern language" section to pathlib docs (#114030)
Explain the `full_match()` / `glob()` / `rglob()` pattern language in its own section. Move `rglob()` documentation under `glob()` and reduce duplicated text.
2024-02-26 00:19:03 +00:00
Barney Gale fda7445ca5
GH-70303: Make `pathlib.Path.glob('**')` return both files and directories (#114684)
Return files and directories from `pathlib.Path.glob()` if the pattern ends
with `**`. This is more compatible with `PurePath.full_match()` and with
other glob implementations such as bash and `glob.glob()`. Users can add a
trailing slash to match only directories.
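
The `glob.glob()` behaviour referred to above can be checked with a short script. The helper `relative_matches` and the tiny directory layout are made up for illustration.

```python
import glob
import os
import tempfile


def relative_matches(root, pattern):
    """Return sorted matches of *pattern* under *root*, relative to *root*."""
    hits = glob.glob(os.path.join(root, pattern), recursive=True)
    return sorted(os.path.relpath(hit, root) for hit in hits)


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "pkg"))
    open(os.path.join(root, "pkg", "mod.py"), "w").close()
    # A trailing "**" matches files and directories alike.
    everything = relative_matches(root, "**")
    # Adding a trailing separator restricts the matches to directories.
    only_dirs = relative_matches(root, "**" + os.sep)
```

Here `everything` contains the root itself (as `.`), `pkg` and `pkg/mod.py`, while `only_dirs` contains only `.` and `pkg`.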

In my previous patch I added a `FutureWarning` with the intention of fixing
this in Python 3.15. Upon further reflection I think this was an
unnecessarily cautious remedy to a clear bug.
2024-01-30 19:52:53 +00:00
Barney Gale 7e31d6dea2
gh-88569: add `ntpath.isreserved()` (#95486)
Add `ntpath.isreserved()`, which identifies reserved pathnames such as "NUL", "AUX" and "CON".

Deprecate `pathlib.PurePath.is_reserved()`.

---------

Co-authored-by: Eryk Sun <eryksun@gmail.com>
Co-authored-by: Brett Cannon <brett@python.org>
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
2024-01-26 18:14:24 +00:00
Barney Gale b69548a0f5
GH-73435: Add `pathlib.PurePath.full_match()` (#114350)
In 49f90ba we added support for the recursive wildcard `**` in
`pathlib.PurePath.match()`. This should allow arbitrary prefix and suffix
matching, like `p.match('foo/**')` or `p.match('**/foo')`, but there's a
problem: for relative patterns only, `match()` implicitly inserts a `**`
token on the left hand side, causing all patterns to match from the right.
As a result, it's impossible to match relative patterns from the left:
`PurePath('foo/bar').match('bar/**')` is true!

This commit reverts the changes to `match()`, and instead adds a new
`full_match()` method that:

- Allows empty patterns
- Supports the recursive wildcard `**`
- Matches the *entire* path when given a relative pattern
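
A short illustration of the difference, using an arbitrary example path. The `full_match()` call is guarded because the method only exists in Python versions that include this commit.

```python
from pathlib import PurePosixPath

path = PurePosixPath("src/pkg/mod.py")

# match() anchors relative patterns at the right-hand end of the path.
right_anchored = path.match("pkg/*.py")  # True
# "src/*" is compared against the last two segments, "pkg/mod.py".
left_anchored = path.match("src/*")      # False

if hasattr(PurePosixPath, "full_match"):
    # full_match() must account for the entire path.
    whole_path = path.full_match("src/**/*.py")  # True
```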
2024-01-26 01:12:46 +00:00
Barney Gale b822b85ac1
GH-105900: Fix `pathlib.Path.symlink_to(target_is_directory=...)` docs (#114035)
Clarify that *target_is_directory* only matters if the target doesn't
exist.
2024-01-23 05:30:16 +00:00
Barney Gale 32c227470a
GH-82695: Clarify `pathlib.Path.mkdir()` documentation (#114032)
Remove a double negative in the documentation of `mkdir()`'s *exist_ok*
parameter.

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
2024-01-23 02:31:09 +00:00
Barney Gale 3a61d24062
GH-99334: Explain that `PurePath.is_relative_to()` is purely lexical. (#114031)
2024-01-23 01:06:44 +00:00
Barney Gale 6313cdde58
GH-79634: Accept path-like objects as pathlib glob patterns. (#114017)
Allow `os.PathLike` objects to be passed as patterns to `pathlib.Path.glob()` and `rglob()`. (It's already possible to use them in `PurePath.match()`.)

While we're in the area:

- Allow empty glob patterns in `PathBase` (but not `Path`)
- Speed up globbing in `PathBase` by generating paths with trailing slashes only as a final step, rather than for every intermediate directory.
- Simplify and speed up handling of rare patterns involving both `**` and `..` segments.
2024-01-20 02:10:25 +00:00
Barney Gale 45e527dfb5
GH-110109: pathlib docs: bring `from_uri()` and `as_uri()` together. (#110312)
This is a very soft deprecation of `PurePath.as_uri()`. We instead document
it as a `Path` method, and add a couple of sentences mentioning that it's
also available in `PurePath`.

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
2024-01-16 22:51:57 +00:00
Barney Gale 7092b3f131
GH-78988: Document `pathlib.Path.glob()` exception propagation. (#114036)
We propagate the `OSError` from the `is_dir()` call on the top-level
directory, and suppress all others.
2024-01-16 22:28:54 +00:00
Taylor Packard ed8720ace4
gh-112758: Updated pathlib documentation for PurePath.match (#112814)
2023-12-08 18:13:17 +00:00
Alex Waygood 2c3906bc4b
gh-101100: Silence Sphinx warnings when `ntpath` or `posixpath` are referenced (#112833)
2023-12-07 20:57:30 +00:00
Kamil Turek a1551b48ee
gh-103363: Add follow_symlinks argument to `pathlib.Path.owner()` and `group()` (#107962)
2023-12-04 19:42:01 +00:00
Junya Okabe 9d70831cb7
gh-110745: add a newline argument to pathlib.Path.read_text (#110880)
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Barney Gale <barney.gale@gmail.com>
2023-11-21 22:32:38 +00:00
Barney Gale 15de493395
GH-107465: Add `pathlib.Path.from_uri()` classmethod. (#107640)
This method supports file URIs (including variants) as described in RFC 8089, such as URIs generated by `pathlib.Path.as_uri()` and `urllib.request.pathname2url()`.

The method is added to `Path` rather than `PurePath` because it uses `os.fsdecode()`, and so its results vary from system to system. I intend to deprecate `PurePath.as_uri()` and move it to `Path` for the same reason.

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
2023-10-01 16:14:02 +01:00
Barney Gale ecd813f054
GH-109187: Improve symlink loop handling in `pathlib.Path.resolve()` (GH-109192)
Treat symlink loops like other errors: in strict mode, raise `OSError`, and
in non-strict mode, do not raise any exception.
2023-09-26 17:57:17 +01:00
Barney Gale ec0a0d2bd9
GH-70303: Emit FutureWarning when pathlib glob pattern ends with `**` (GH-105413)
In a future Python release, patterns with this ending will match both files
and directories. Users may add a trailing slash to remove the warning.
2023-08-04 23:12:12 +00:00
Barney Gale c6c5665ee0
GH-100502: Add `pathlib.PurePath.pathmod` attribute (GH-106533)
This instance attribute stores the implementation of `os.path` used for
low-level path operations: either `posixpath` or `ntpath`.
2023-07-19 18:59:55 +01:00
Ned Batchelder f014f1567c
docs: clarify Path.suffix (GH-106650)
2023-07-13 20:24:54 +01:00
Barney Gale 219effa876
GH-105793: Add follow_symlinks argument to `pathlib.Path.is_dir()` and `is_file()` (GH-105794)
Brings `pathlib.Path.is_dir()` in line with `os.DirEntry.is_dir()`, which
will be important for implementing generic path walking and globbing.
Likewise `is_file()`.
2023-06-26 17:58:17 +01:00
Barney Gale 4a6c84fc1e
GH-104375: Use `versionchanged` to describe new arguments in pathlib docs (GH-104376)
2023-06-24 16:14:09 +01:00
Barney Gale a8006706f7
GH-89812: Add `pathlib.UnsupportedOperation` (GH-105926)
This new exception type is raised instead of `NotImplementedError` when
a path operation is not supported. It can be raised from `Path.readlink()`,
`symlink_to()`, `hardlink_to()`, `owner()` and `group()`. In a future
version of pathlib, it will be raised by `AbstractPath` for these methods
and others, such as `AbstractPath.mkdir()` and `unlink()`.
2023-06-22 14:35:51 +01:00
Barney Gale 24af45172f
GH-102613: Fast recursive globbing in `pathlib.Path.glob()` (GH-104512)
This commit introduces a 'walk-and-match' strategy for handling glob patterns that include a non-terminal `**` wildcard, such as `**/*.py`. For this example, the previous implementation recursively walked directories using `os.scandir()` when it expanded the `**` component, and then **scanned those same directories again** when it expanded the `*.py` component. This is wasteful.

In the new implementation, any components following a `**` wildcard are used to build a `re.Pattern` object, which is used to filter the results of the recursive walk. A pattern like `**/*.py` uses half the number of `os.scandir()` calls; a pattern like `**/*/*.py` a third, etc.
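
The idea can be sketched in a few lines. This is a simplified illustration, not the code from this commit: it uses `os.walk()` and `fnmatch.translate()`, handles only a single-segment tail such as `*.py`, and the name `walk_and_match` is hypothetical.

```python
import fnmatch
import os
import re


def walk_and_match(root, tail):
    """Yield paths matching `**/<tail>` below *root* using a single walk.

    *tail* is one pattern segment such as "*.py"; multi-segment tails are
    out of scope for this sketch.
    """
    regex = re.compile(fnmatch.translate(tail))
    for dirpath, dirnames, filenames in os.walk(root):
        # Each directory is scanned once; the regex filters its entries.
        for name in dirnames + filenames:
            if regex.match(name):
                yield os.path.join(dirpath, name)
```

Each directory is listed exactly once, and the compiled pattern does the work that a second round of directory scans did previously.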

This new algorithm does not apply if either:

1. The *follow_symlinks* argument is set to `None` (its default), or
2. The pattern contains `..` components.

In these cases we fall back to the old implementation.

This commit also replaces selector classes with selector functions. These generators directly yield results rather than calling through to their successors. A new internal `Path._glob()` method takes care to chain these generators together, which simplifies the lazy algorithm and slightly improves performance. It should also be easier to understand and maintain.
2023-06-06 23:50:36 +01:00