These functions have long sown confusion among Python developers. The
existing documentation says they deal with URL path components, but that
doesn't fit the evidence on Windows:
>>> pathname2url(r'C:\foo')
'///C:/foo'
>>> pathname2url(r'\\server\share')
'////server/share' # or '//server/share' as of quite recently
If these were URL path components, they would imply complete URLs like
`file://///C:/foo` and `file://////server/share`. Clearly this isn't right.
Yet the implementation in `nturl2path` is deliberate, and the
`url2pathname()` function correctly inverts it.
On non-Windows platforms, the behaviour until quite recently is to simply
quote/unquote the path without adding or removing any leading slashes. This
behaviour is compatible with *both* interpretations -- 1) the value is a
URL path component (existing docs), and 2) the value is everything
following `file:` (this commit)
The conclusion I draw is that these functions operate on everything after
the `file:` prefix, which may include an authority section. This is the
only explanation that fits both the Windows and non-Windows behaviour.
It's also a better match for the function names.
Move tests for Path.walk() into a new PathWalkTest class, and apply a similar change in tests for the ABCs. This allows us to properly tear down the walk test hierarchy in tearDown(), rather than leaving it to os_helper.rmtree().
Stop converting Windows drive letters to uppercase in
`urllib.request.pathname2url()` and `url2pathname()`. This behaviour is
unnecessary and inconsistent with pathlib's file URI implementation.
Record success in `specialize`
This matches the existing behavior where we increment the success
stat for the generic opcode each time we successfully specialize
an instruction.
If Python fails to start newly created thread
due to failure of underlying PyThread_start_new_thread() call,
its state should be removed from interpreter' thread states list
to avoid its double cleanup.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
This gets rid of the immortal check in `PyStackRef_FromPyObjectSteal()`.
Overall, this improves performance about 2% in the free threading
build.
This also renames `PyStackRef_Is()` to `PyStackRef_IsExactly()` because
the macro requires that the tag bits of the arguments match, which is
only true in certain special cases.
This change enables custom GHA runners for Ubuntu-24.04 that run on Arm hardware. It also prepares for Windows runners on Arm hardware, but doesn't enable that just yet, because the Arm GHA runner images for Windows need to be updated.
Add free-threaded specialization for `UNPACK_SEQUENCE` opcode.
`UNPACK_SEQUENCE_TUPLE/UNPACK_SEQUENCE_TWO_TUPLE` are already thread safe since tuples are immutable.
`UNPACK_SEQUENCE_LIST` is not thread safe because of nature of lists (there is nothing preventing another thread from adding items to or removing them the list while the instruction is executing). To achieve thread safety we add a critical section to the implementation of `UNPACK_SEQUENCE_LIST`, especially around the parts where we check the size of the list and push items onto the stack.
---------
Co-authored-by: Matt Page <mpage@meta.com>
Co-authored-by: mpage <mpage@cs.stanford.edu>
Threads are gone after fork, so clear the queues too. Otherwise the
child process (here created via multiprocessing.Process) crashes on
interpreter exit.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
* Name without a PATHEXT extension is only searched if the mode does not
include X_OK.
* Support multi-component PATHEXT extensions (e.g. ".foo.bar").
* Support files without extensions in PATHEXT contains dot-only extension
(".", "..", etc).
* Support PATHEXT extensions that end with a dot (e.g. ".foo.").
The usage parameter of argparse.ArgumentParser no longer
affects the default value of the prog parameter in subparsers.
Previously the full custom usage of the main parser was used as
the prog prefix in subparsers.
`mmap`, `munmap`, and `mprotect` are used by CPython for memory
management, which may occur in the middle of the FileIO tests. The
system calls can also be used with files, so `strace` includes them
in its `%file` and `%desc` filters.
Filter out the `mmap` system calls related to memory allocation for the
file tests. Currently FileIO doesn't do `mmap` at all, so didn't add
code to track from `mmap` through `munmap` since it wouldn't be used.
For now if an `mmap` on a fd happens, the call will be included (which
may cause test to fail), and at that time support for tracking the
address throug `munmap` could be added.
The `methodcaller` C vectorcall implementation uses an arguments array
that is shared across calls. The first argument is modified on every
invocation. This isn't thread-safe in the free threading build. I think
it's also not safe in general, but for now just disable it in the free
threading build.
Decode a file URI like `file://///server/share` as a UNC path like
`\\server\share`. This form of file URI is created by software the simply
prepends `file:///` to any absolute Windows path.
Discard any 'localhost' authority from the beginning of a `file:` URI. As a
result, file URIs like `//localhost/etc/hosts` are correctly decoded as
`/etc/hosts`.
Adjust `pathname2url()` to encode embedded colon characters in Windows
paths, rather than bailing out with an `OSError`.
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
Enable specialization of LOAD_GLOBAL in free-threaded builds.
Thread-safety of specialization in free-threaded builds is provided by the following:
A critical section is held on both the globals and builtins objects during specialization. This ensures we get an atomic view of both builtins and globals during specialization.
Generation of new keys versions is made atomic in free-threaded builds.
Existing helpers are used to atomically modify the opcode.
Thread-safety of specialized instructions in free-threaded builds is provided by the following:
Relaxed atomics are used when loading and storing dict keys versions. This avoids potential data races as the dict keys versions are read without holding the dictionary's per-object lock in version guards.
Dicts keys objects are passed from keys version guards to the downstream uops. This ensures that we are loading from the correct offset in the keys object. Once a unicode key has been stored in a keys object for a combined dictionary in free-threaded builds, the offset that it is stored in will never be reused for a different key. Once the version guard passes, we know that we are reading from the correct offset.
The dictionary read fast-path is used to read values from the dictionary once we know the correct offset.
This is a precursor to the actual fix for gh-114940, where we will change these macros to use the new lock. This change is almost entirely mechanical; the exceptions are the loops in codeobject.c and ceval.c, which now hold the "head" lock. Note that almost all of the uses of _Py_FOR_EACH_TSTATE_UNLOCKED() here will change to _Py_FOR_EACH_TSTATE_BEGIN() once we add the new per-interpreter lock.