Move tests for Path.walk() into a new PathWalkTest class, and apply a similar change in tests for the ABCs. This allows us to properly tear down the walk test hierarchy in tearDown(), rather than leaving it to os_helper.rmtree().
In `PathBase.resolve()`, raise `UnsupportedOperation` if a non-POSIX path
parser is used (our implementation uses `posixpath._realpath()`, which
produces incorrect results for non-POSIX path flavours.) Also tweak code to
call `self.absolute()` upfront rather than supplying an emulated `getcwd()`
function.
Adjust `PathBase.absolute()` to work somewhat like `resolve()`. If a POSIX
path parser is used, we treat the root directory as the current directory.
This is the simplest useful behaviour for concrete path types without a
current directory cursor.
`PurePath.__init__()` incorrectly uses the `_raw_paths` of a given
`PurePath` object with a different flavour, even though the procedure to
join path segments can differ between flavours.
This change makes the `_raw_paths`-enabled deferred joining apply _only_
when the path flavours match.
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Per feedback from Paul Moore on GH-123158, it's better to defer making
`Path.delete()` public than ship it with under-designed error handling
capabilities.
We leave a remnant `_delete()` method, which is used by `move()`. Any
functionality not needed by `move()` is deleted.
These two methods accept an *existing* directory path, onto which we join
the source path's base name to form the final target path.
A possible alternative implementation is to check for directories in
`copy()` and `move()` and adjust the target path, which is done in several
`shutil` functions. This behaviour is helpful in a shell context, but
less so in a stored program that explicitly specifies destinations. For
example, a user that calls `Path('foo.py').copy('bar.py')` might not
imagine that `bar.py/foo.py` would be created, but under the alternative
implementation this will happen if `bar.py` is an existing directory.
Add a `Path.move()` method that moves a file or directory tree, and returns a new `Path` instance pointing to the target.
This method is similar to `shutil.move()`, except that it doesn't accept a *copy_function* argument, and it doesn't check whether the destination is an existing directory.
Replace `umask(0)` with `umask(0o002)` so the created files are not
world-writable, and replace `umask(0o022)` with `umask(0o026)` to check
that permissions for 'others' can still be set.
Rename `pathlib.Path.copy()` to `_copy_file()` (i.e. make it private.)
Rename `pathlib.Path.copytree()` to `copy()`, and add support for copying
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `delete()` methods (which will also
accept any type of file.)
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Rename `pathlib.Path.rmtree()` to `delete()`, and add support for deleting
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `copy()` methods (which will also
accept any type of file.)
Add *preserve_metadata* keyword-only argument to `pathlib.Path.copytree()`,
defaulting to false. When set to true, we copy timestamps, permissions,
extended attributes and flags where available, like `shutil.copystat()`.
Add a `Path.rmtree()` method that removes an entire directory tree, like
`shutil.rmtree()`. The signature of the optional *on_error* argument
matches the `Path.walk()` argument of the same name, but differs from the
*onexc* and *onerror* arguments to `shutil.rmtree()`. Consistency within
pathlib is probably more important.
In the private pathlib ABCs, we add an implementation based on `walk()`.
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Add *preserve_metadata* keyword-only argument to `pathlib.Path.copy()`, defaulting to false. When set to true, we copy timestamps, permissions, extended attributes and flags where available, like `shutil.copystat()`. The argument has no effect on Windows, where metadata is always copied.
Internally (in the pathlib ABCs), path types gain `_readable_metadata` and `_writable_metadata` attributes. These sets of strings describe what kinds of metadata can be retrieved and stored. We take an intersection of `source._readable_metadata` and `target._writable_metadata` to minimise reads/writes. A new `_read_metadata()` method accepts a set of metadata keys and returns a dict with those keys, and a new `_write_metadata()` method accepts a dict of metadata. We *might* make these public in future, but it's hard to justify while the ABCs are still private.
Add `pathlib.Path.copytree()` method, which recursively copies one
directory to another.
This differs from `shutil.copytree()` in the following respects:
1. Our method has a *follow_symlinks* argument, whereas shutil's has a
*symlinks* argument with an inverted meaning.
2. Our method lacks something like a *copy_function* argument. It always
uses `Path.copy()` to copy files.
3. Our method lacks something like a *ignore_dangling_symlinks* argument.
Instead, users can filter out danging symlinks with *ignore*, or
ignore exceptions with *on_error*
4. Our *ignore* argument is a callable that accepts a single path object,
whereas shutil's accepts a path and a list of child filenames.
5. We add an *on_error* argument, which is a callable that accepts
an `OSError` instance. (`Path.walk()` also accepts such a callable).
Co-authored-by: Nice Zombies <nineteendo19d0@gmail.com>
In preparation for the addition of `PathBase.rmtree()`, implement
`DummyPath.unlink()` and `rmdir()`, and move corresponding tests into
`test_pathlib_abc` so they're run against `DummyPath`.
Remove support for supplying additional positional arguments to
`PurePath.relative_to()` and `is_relative_to()`. This has been deprecated
since Python 3.12.
Since 6258844c, paths that might not exist can be fed into pathlib's
globbing implementation, which will call `os.scandir()` / `os.lstat()` only
when strictly necessary. This allows us to drop an initial `self.is_dir()`
call, which saves a `stat()`.
Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
When expanding and filtering paths for a `**` wildcard segment, build an `re.Pattern` object from the subsequent pattern parts, rather than the entire pattern, and match against the `os.DirEntry` object prior to instantiating a path object. Also skip compiling a pattern when expanding a `*` wildcard segment.
Methods like `full_match()`, `glob()`, etc, are difficult to make work with
byte paths, and it's not worth the effort. This patch makes `PurePathBase`
raise `TypeError` when given non-`str` path segments.
Return files and directories from `pathlib.Path.glob()` if the pattern ends
with `**`. This is more compatible with `PurePath.full_match()` and with
other glob implementations such as bash and `glob.glob()`. Users can add a
trailing slash to match only directories.
In my previous patch I added a `FutureWarning` with the intention of fixing
this in Python 3.15. Upon further reflection I think this was an
unnecessarily cautious remedy to a clear bug.
Raise `ValueError` if `with_suffix('.ext')` is called on a path without a
stem. Paths may only have a non-empty suffix if they also have a non-empty
stem.
ABC-only bugfix; no effect on public classes.
Remove `self.joinpath('')` call that should have been removed in 6313cdde.
This makes `PathBase.glob('')` yield itself *without* adding a trailing slash. It's hard to say whether this is more or less correct, but at least everything else is faster, and there's no behaviour change in the public classes where empty glob patterns are disallowed.
Add `@needs_symlinks` decorator for tests that require symlink support in
the path class.
Also add `@needs_windows` and `@needs_posix` decorators for tests that
require a specific a specific path flavour. These aren't much used yet, but
will be later.
Add `ntpath.isreserved()`, which identifies reserved pathnames such as "NUL", "AUX" and "CON".
Deprecate `pathlib.PurePath.is_reserved()`.
---------
Co-authored-by: Eryk Sun <eryksun@gmail.com>
Co-authored-by: Brett Cannon <brett@python.org>
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
Allow `os.PathLike` objects to be passed as patterns to `pathlib.Path.glob()` and `rglob()`. (It's already possible to use them in `PurePath.match()`)
While we're in the area:
- Allow empty glob patterns in `PathBase` (but not `Path`)
- Speed up globbing in `PathBase` by generating paths with trailing slashes only as a final step, rather than for every intermediate directory.
- Simplify and speed up handling of rare patterns involving both `**` and `..` segments.
Path modules provide a subset of the `os.path` API, specifically those
functions needed to provide `PurePathBase` functionality. Each
`PurePathBase` subclass references its path module via a `pathmod` class
attribute.
This commit adds a new `PathModuleBase` class, which provides abstract
methods that unconditionally raise `UnsupportedOperation`. An instance of
this class is assigned to `PurePathBase.pathmod`, replacing `posixpath`.
As a result, `PurePathBase` is no longer POSIX-y by default, and
all its methods raise `UnsupportedOperation` courtesy of `pathmod`.
Users who subclass `PurePathBase` or `PathBase` should choose the path
syntax by setting `pathmod` to `posixpath`, `ntpath`, `os.path`, or their
own subclass of `PathModuleBase`, as circumstances demand.
On Windows, `os.path.isabs()` now returns `False` when given a path that
starts with exactly one (back)slash. This is more compatible with other
functions in `os.path`, and with Microsoft's own documentation.
Also adjust `pathlib.PureWindowsPath.is_absolute()` to call
`ntpath.isabs()`, which corrects its handling of partial UNC/device paths
like `//foo`.
Co-authored-by: Jon Foster <jon@jon-foster.co.uk>
Apply pathlib's normalization and performance tuning in `pathlib.PurePath`, but not `pathlib._abc.PurePathBase`.
With this change, the pathlib ABCs do not normalize away alternate path separators, empty segments, or dot segments. A single string given to the initialiser will round-trip by default, i.e. `str(PurePathBase(my_string)) == my_string`. Implementors can set their own path domain-specific normalization scheme by overriding `__str__()`
Eliminating path normalization makes maintaining and caching the path's parts and string representation both optional and not very useful, so this commit moves the `_drv`, `_root`, `_tail_cached` and `_str` slots from `PurePathBase` to `PurePath`. Only `_raw_paths` and `_resolving` slots remain in `PurePathBase`. This frees the ABCs from the burden of some of pathlib's hardest-to-understand code.
`PurePathBase` does not define `__eq__()`, and so we have no business checking path equality in `test_eq_common` and `test_equivalences`. The tests only pass at the moment because we define the test class's `__eq__()` for use elsewhere.
Also move `test_parse_path_common` into the main pathlib test suite. It exercises a private `_parse_path()` method that will be moved to `PurePath` soon.
Lastly move a couple more tests concerned with optimisations and path normalisation.
Raise auditing events in `pathlib.Path.glob()`, `rglob()` and `walk()`,
but not in `pathlib._abc.PathBase` methods. Also move generation of a
deprecation warning into `pathlib.Path` so it gets the right stack level.
Run expensive tests for walking and globbing from `test_pathlib` but not
`test_pathlib_abc`. The ABCs are not as tightly optimised as the classes
in top-level `pathlib`, and so these tests are taking rather a long time on
some buildbots. Coverage of the main `pathlib` classes should suffice.
Change the value of `pathlib._abc.PurePathBase.pathmod` from `os.path` to
`posixpath`.
User subclasses of `PurePathBase` and `PathBase` previously used the host
OS's path syntax, e.g. backslashes as separators on Windows. This is wrong
in most use cases, and likely to catch developers out unless they test on
both Windows and non-Windows machines.
In this patch we change the default to POSIX syntax, regardless of OS. This
is somewhat arguable (why not make all aspects of syntax abstract and
individually configurable?) but an improvement all the same.
This change has no effect on `PurePath`, `Path`, nor their subclasses. Only
private APIs are affected.
`PurePathBase.__repr__()` produces a string like `MyPath('/foo')`. This
repr is incorrect/misleading when a subclass's `__init__()` method is
customized, which I expect to be the very common.
This commit moves the `__repr__()` method to `PurePath`, leaving
`PurePathBase` with the default `object` repr.
No user-facing changes because the `pathlib._abc` module remains private.
Store the test base directory as a class attribute named `base` rather than
module constants named `BASE`.
The base directory is a local file path, and therefore not ideally suited
to the pathlib ABC tests. In a future commit we'll change its value in
`test_pathlib_abc.py` such that it points to a totally fictitious path, which
will help to ensure we're not touching the local filesystem.