When I added _PyInterpreterState_IsRunningMain() and friends last year, I tried to accommodate applications that embed Python but don't call _PyInterpreterState_SetRunningMain() (not that they're expected to). That mostly worked fine until my recent changes in gh-117049, where the subtleties with the fallback code led to failures; the change ended up breaking test_tools.test_freeze, which exercises a basic embedding situation.
The simplest fix is to drop the fallback code I originally added to _PyInterpreterState_IsRunningMain() (and later to _PyThreadState_IsRunningMain()). I've kept the fallback in the _xxsubinterpreters module though. I've also updated Py_FrozenMain() to call _PyInterpreterState_SetRunningMain().
Split `_PyThreadState_DeleteExcept` into two functions:
- `_PyThreadState_RemoveExcept` removes all thread states other than one
passed as an argument. It returns the removed thread states as a
linked list.
- `_PyThreadState_DeleteList` deletes those dead thread states. It may
call destructors, so we want to "start the world" before calling
`_PyThreadState_DeleteList` to avoid potential deadlocks.
I added it quite a while ago as a strategy for managing interpreter lifetimes relative to the PEP 554 (now 734) implementation. Relatively recently I refactored that implementation to no longer rely on InterpreterID objects. Thus now I'm removing it.
Add Py_GetConstant() and Py_GetConstantBorrowed() functions.
In the limited C API version 3.13, getting Py_None, Py_False,
Py_True, Py_Ellipsis and Py_NotImplemented singletons is now
implemented as function calls at the stable ABI level to hide
implementation details. Getting these constants still return borrowed
references.
Add _testlimitedcapi/object.c and test_capi/test_object.py to test
Py_GetConstant() and Py_GetConstantBorrowed() functions.
Mostly we unify the two different implementations of the conversion code (from PyObject * to int64_t. We also drop the PyArg_ParseTuple()-style converter function, as well as rename and move PyInterpreterID_LookUp().
This changes the free-threaded build to perform a stop-the-world pause
before deleting other thread states when forking and during shutdown.
This fixes some crashes when using multiprocessing and during shutdown
when running with `PYTHON_GIL=0`.
This also changes `PyOS_BeforeFork` to acquire the runtime lock
(i.e., `HEAD_LOCK(&_PyRuntime)`) before forking to ensure that data
protected by the runtime lock (and not just the GIL or stop-the-world)
is in a consistent state before forking.
Before this change, ctypes classes used a custom dict subclass, `StgDict`,
as their `tp_dict`. This acts like a regular dict but also includes extra information
about the type.
This replaces stgdict by `StgInfo`, a C struct on the type, accessed by
`PyObject_GetTypeData()` (PEP-697).
All usage of `StgDict` (mainly variables named `stgdict`, `dict`, `edict` etc.) is
converted to `StgInfo` (named `stginfo`, `info`, `einfo`, etc.).
Where the dict is actually used for class attributes (as a regular PyDict), it's now
called `attrdict`.
This change -- not overriding `tp_dict` -- is made to make me comfortable with
the next part of this PR: moving the initialization logic from `tp_new` to `tp_init`.
The `StgInfo` is set up in `__init__` of each class, with a guard that prevents
calling `__init__` more than once. Note that abstract classes (like `Array` or
`Structure`) are created using `PyType_FromMetaclass` and do not have
`__init__` called.
Previously, this was done in `__new__`, which also wasn't called for abstract
classes.
Since `__init__` can be called from Python code or skipped, there is a tested
guard to ensure `StgInfo` is initialized exactly once before it's used.
Co-authored-by: neonene <53406459+neonene@users.noreply.github.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
Starting in Python 3.12, we prevented calling fork() and starting new threads
during interpreter finalization (shutdown). This has led to a number of
regressions and flaky tests. We should not prevent starting new threads
(or `fork()`) until all non-daemon threads exit and finalization starts in
earnest.
This changes the checks to use `_PyInterpreterState_GetFinalizing(interp)`,
which is set immediately before terminating non-daemon threads.
* Split long.c tests of _testcapi into two parts: limited C API tests
in _testlimitedcapi and non-limited C API tests in _testcapi.
* Move testcapi_long.h from Modules/_testcapi/ to
Modules/_testlimitedcapi/.
* Add MODULE__TESTLIMITEDCAPI_DEPS to Makefile.pre.in.
Split unicode.c tests of _testcapi into two parts: limited C API
tests in _testlimitedcapi and non-limited C API tests in _testcapi.
Update test_codecs.
Split abstract.c and float.c tests of _testcapi into two parts:
limited C API tests in _testlimitedcapi and non-limited C API tests
in _testcapi.
Update test_bytes and test_class.
This includes adding what should be a relatively temporary
`Modules/_decimal/windows/mpdecimal.h` shim to choose between `mpdecimal32vc.h`
or `mpdecimal64vc.h` based on which of `CONFIG_64` or `CONFIG_32` is defined.
Even though it has no internal references to Python objects it still
has a reference to its type by virtue of being a heap type. We need
to provide a traverse function that visits the type, but we do not
need to provide a clear function.
There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`
and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following execution
involving threads A, B, and C:
1. A starts.
2. B joins A, blocking on its `_tstate_lock`.
3. C joins A, blocking on its `_tstate_lock`.
4. A finishes and releases its `_tstate_lock`.
5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped
out before calling `_stop()`.
6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped
out before releasing it.
7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.
However, C holds it, so the assertion fails.
The race can be reproduced[^3] by inserting sleeps at the appropriate points in
the threading code. To do so, run the `repro_join_race.py` from the linked repo.
There are two main parts to this PR:
1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.
The event is set by the runtime prior to the thread being cleared (in the same
place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the
event to be set.
2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all
non-daemon threads to exit. To do so, an `is_daemon` predicate was added to
`PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`
now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on
`_tstate_lock`s.
[^1]: 441affc9e7/Lib/threading.py (L1201)
[^2]: 441affc9e7/Lib/threading.py (L1115)
[^3]: 8194653279
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Use the NtQueryInformationProcess system call to efficiently retrieve the parent process ID in a single step, rather than using the process snapshots API which retrieves large amounts of unnecessary information and is more prone to failure (since it makes heap allocations).
Includes a fallback to the original win32_getppid implementation in case the unstable API appears to return strange results.
The fildes converter of Argument Clinic now always call
PyObject_AsFileDescriptor(), not only for the limited C API.
The _PyLong_FileDescriptor_Converter() converter stays as a fallback
when PyObject_AsFileDescriptor() cannot be used.
Return 0 on success. Set an exception and return -1 on error.
Fix os.timerfd_settime(): properly report exceptions on
_PyTime_FromSecondsDouble() failure.
No longer export _PyTime_FromSecondsDouble().
Move the following files from Modules/_testcapi/ to
Modules/_testlimitedcapi/:
* bytearray.c
* bytes.c
* pyos.c
* sys.c
Changes:
* Replace PyBytes_AS_STRING() with PyBytes_AsString().
* Replace PyBytes_GET_SIZE() with PyBytes_Size().
* Update related test_capi tests.
* Copy Modules/_testcapi/util.h to Modules/_testlimitedcapi/util.h.
Add a new C extension "_testlimitedcapi" which is only built with the
limited C API.
Move heaptype_relative.c and vectorcall_limited.c from
Modules/_testcapi/ to Modules/_testlimitedcapi/.
* configure: add _testlimitedcapi test extension.
* Update generate_stdlib_module_names.py.
* Update make check-c-globals.
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
Previously, the `locked` field was set after releasing the lock. This reverses
the order so that the `locked` field is set while the lock is still held.
There is still one thread-safety issue where `locked` is checked prior to
releasing the lock, however, in practice that will only be an issue when
unlocking the lock is contended, which should be rare.
The problem manifested when the .py module got reloaded and the corresponding extension module didn't. The .py module registers types with the extension and the extension was not allowing that to happen more than once. The solution: let it happen more than once.
If awailable, enable -fstrict-overflow for libmpdec. Also
shut off false positive warnings (-Warray-bounds).
The later was backported from mpdecimal-4.0.0.
Make `_thread.ThreadHandle` thread-safe in free-threaded builds
We protect the mutable state of `ThreadHandle` using a `_PyOnceFlag`.
Concurrent operations (i.e. `join` or `detach`) on `ThreadHandle` block
until it is their turn to execute or an earlier operation succeeds.
Once an operation has been applied successfully all future operations
complete immediately.
The `join()` method is now idempotent. It may be called multiple times
but the underlying OS thread will only be joined once. After `join()`
succeeds, any future calls to `join()` will succeed immediately.
The internal thread handle `detach()` method has been removed.
This code decrefs `qidobj` twice in some paths. Since `qidobj` isn't used after
the first `Py_DECREF()`, remove the others, and replace the `Py_DECREF()` with
`Py_CLEAR()` to make it clear that the variable is dead.
With this fix, `python -mtest test_interpreters -R 3:3 -mtest_queues` no longer
fails with `_Py_NegativeRefcount: Assertion failed: object has negative ref
count`.
Allow controlling Expat >=2.6.0 reparse deferral (CVE-2023-52425) by adding five new methods:
- `xml.etree.ElementTree.XMLParser.flush`
- `xml.etree.ElementTree.XMLPullParser.flush`
- `xml.parsers.expat.xmlparser.GetReparseDeferralEnabled`
- `xml.parsers.expat.xmlparser.SetReparseDeferralEnabled`
- `xml.sax.expatreader.ExpatParser.flush`
Based on the "flush" idea from https://github.com/python/cpython/pull/115138#issuecomment-1932444270 .
### Notes
- Please treat as a security fix related to CVE-2023-52425.
Includes code suggested-by: Snild Dolkow <snild@sony.com>
and by core dev Serhiy Storchaka.
This change is part of the work on PEP-738: Adding Android as a
supported platform.
* Remove the "1.0" suffix from libpython's filename on Android, which
would prevent Gradle from packaging it into an app.
* Simplify the build command in the Makefile so that libpython always
gets given an SONAME with the `-Wl-h` argument, even if the SONAME is
identical to the actual filename.
* Disable a number of functions on Android which can be compiled and
linked against, but always fail at runtime. As a result, the native
_multiprocessing module is no longer built for Android.
* gh-115390 (bee7bb331) added some pre-determined results to the
configure script for things that can't be autodetected when
cross-compiling; this change adds Android to these where appropriate.
* Add a couple more pre-determined results for Android, and making them
cover iOS as well. This means the --enable-ipv6 configure option will
no longer be required on either platform.
This brings the code under test.support.interpreters, and the corresponding extension modules, in line with recent updates to PEP 734.
(Note: PEP 734 has not been accepted at this time. However, we are using an internal copy of the implementation in the test suite to exercise the existing subinterpreters feature.)
* Rename _Py_UOpsAbstractInterpContext to _Py_UOpsContext and _Py_UOpsSymType to _Py_UopsSymbol.
* #define shortened form of _Py_uop_... names for improved readability.
Part of the work on PEP 738: Adding Android as a supported platform.
* Rename the LIBPYTHON variable to MODULE_LDFLAGS, to more accurately
reflect its purpose.
* Edit makesetup to use MODULE_LDFLAGS when linking extension modules.
* Edit the Makefile so that extension modules depend on libpython on
Android and Cygwin.
* Restore `-fPIC` on Android. It was removed several years ago with a
note that the toolchain used it automatically, but this is no longer
the case. Omitting it causes all linker commands to fail with an error
like `relocation R_AARCH64_ADR_PREL_PG_HI21 cannot be used against
symbol '_Py_FalseStruct'; recompile with -fPIC`.
PyTime_t no longer uses an arbitrary unit, it's always a number of
nanoseconds (64-bit signed integer).
* Rename _PyTime_FromNanosecondsObject() to _PyTime_FromLong().
* Rename _PyTime_AsNanosecondsObject() to _PyTime_AsLong().
* Remove pytime_from_nanoseconds().
* Remove pytime_as_nanoseconds().
* Remove _PyTime_FromNanoseconds().
Remove references to the old names _PyTime_MIN
and _PyTime_MAX, now that PyTime_MIN and
PyTime_MAX are public.
Replace also _PyTime_MIN with PyTime_MIN.
* Rename `_testinternalcapi.get_{uop,counter}_optimizer` to `new_*_optimizer`
* Use `_PyUOpName()` instead of` _PyOpcode_uop_name[]`
* Add `target` to executor iterator items -- `list(ex)` now returns `(opcode, oparg, target, operand)` quadruples
* Add executor methods `get_opcode()` and `get_oparg()` to get `vmdata.opcode`, `vmdata.oparg`
* Define a helper for printing uops, and unify various places where they are printed
* Add a hack to summarize_stats.py to fix legacy uop names (e.g. `POP_TOP` -> `_POP_TOP`)
* Define helpers in `test_opt.py` for accessing the set or list of opnames of an executor
<pycore_time.h> include is no longer needed to get the PyTime_t type
in internal header files. This type is now provided by <Python.h>
include. Add <pycore_time.h> includes to C files instead.
Restore support of such combination, disabled in gh-113796.
csv.writer() now quotes empty fields if delimiter is a space and
skipinitialspace is true and raises exception if quoting is not possible.
This change adds an `eval_breaker` field to `PyThreadState`. The primary
motivation is for performance in free-threaded builds: with thread-local eval
breakers, we can stop a specific thread (e.g., for an async exception) without
interrupting other threads.
The source of truth for the global instrumentation version is stored in the
`instrumentation_version` field in PyInterpreterState. Threads usually read the
version from their local `eval_breaker`, where it continues to be colocated
with the eval breaker bits.
lseek() always returns 0 for character pseudo-devices like
`/dev/urandom` (for other non-regular files, e.g. `/dev/stdin`, it
always returns -1, to which CPython reacts by raising appropriate
exceptions). They are thus technically seekable despite not having seek
semantics.
When calling read() on e.g. an instance of `io.BufferedReader` that
wraps such a file, `BufferedReader` reads ahead, filling its buffer,
creating a discrepancy between the number of bytes read and the internal
`tell()` always returning 0, which previously resulted in e.g.
`BufferedReader.tell()` or `BufferedReader.seek()` being able to return
positions < 0 even though these are supposed to be always >= 0.
Invariably keep the return value non-negative by returning
max(former_return_value, 0) instead, and add some corresponding tests.
This adds a safe memory reclamation scheme based on FreeBSD's "GUS" and
quiescent state based reclamation (QSBR). The API provides a mechanism
for callers to detect when it is safe to free memory that may be
concurrently accessed by readers.
The ID of the owning thread (`rlock_owner`) may be accessed by
multiple threads without holding the underlying lock; relaxed
atomics are used in place of the previous loads/stores.
The number of times that the lock has been acquired (`rlock_count`)
is only ever accessed by the thread that holds the lock; we do not
need to use atomics to access it.