This adds a safe memory reclamation scheme based on FreeBSD's "GUS" and
quiescent state based reclamation (QSBR). The API provides a mechanism
for callers to detect when it is safe to free memory that may be
concurrently accessed by readers.
The GC keeps track of the number of allocations (less deallocations)
since the last GC. This buffers the count in thread-local state and uses
atomic operations to modify the per-interpreter count. The thread-local
buffering avoids contention on shared state.
A consequence is that the GC scheduling is not as precise, so
"test_sneaky_frame_object" is skipped because it requires that the GC be
run exactly after allocating a frame object.
Makes _PyType_Lookup thread safe, including:
Thread safety of the underlying cache.
Make mutation of mro and type members thread safe
Also _PyType_GetMRO and _PyType_GetBases are currently returning borrowed references which aren't safe.
This is a temporary fix to unblock embedders that do not call Py_Main().
_PyInterpreterState_IsRunningMain() will always return true for the main interpreter, even in corner cases where it technically should not. The (future) full solution will do the right thing in those corner cases.
For the most part, these changes make is substantially easier to backport subinterpreter-related code to 3.12, especially the related modules (e.g. _xxsubinterpreters). The main motivation is to support releasing a PyPI package with the 3.13 capabilities compiled for 3.12.
A lot of the changes here involve either hiding details behind macros/functions or splitting up some files.
Setters for members with an unsigned integer type now support
the same range of valid values for objects that has a __index__()
method as for int.
Previously, Py_T_UINT, Py_T_ULONG and Py_T_ULLONG did not support
objects that has a __index__() method larger than LONG_MAX.
Py_T_ULLONG did not support negative ints. Now it supports them and
emits a RuntimeWarning.
Biased reference counting maintains two refcount fields in each object:
`ob_ref_local` and `ob_ref_shared`. The true refcount is the sum of these two
fields. In some cases, when refcounting operations are split across threads,
the ob_ref_shared field can be negative (although the total refcount must be
at least zero). In this case, the thread that decremented the refcount
requests that the owning thread give up ownership and merge the refcount
fields.
This changes a number of internal usages of `PyDict_SetDefault` to use `PyDict_SetDefaultRef`.
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
This marks dead ThreadHandles as non-joinable earlier in
`PyOS_AfterFork_Child()` before we execute any Python code. The handles
are stored in a global linked list in `_PyRuntimeState` because `fork()`
affects the entire process.
Fix race between `_PyParkingLot_Park` and `_PyParkingLot_UnparkAll` when handling interrupts
There is a potential race when `_PyParkingLot_UnparkAll` is executing in
one thread and another thread is unblocked because of an interrupt in
`_PyParkingLot_Park`. Consider the following scenario:
1. Thread T0 is blocked[^1] in `_PyParkingLot_Park` on address `A`.
2. Thread T1 executes `_PyParkingLot_UnparkAll` on address `A`. It
finds the `wait_entry` for `T0` and unlinks[^2] its list node.
3. Immediately after (2), T0 is woken up due to an interrupt. It
then segfaults trying to unlink[^3] the node that was previously
unlinked in (2).
To fix this we mark each waiter as unparking before releasing the bucket
lock. `_PyParkingLot_Park` will wait to handle the coming wakeup, and not
attempt to unlink the node, when this field is set. `_PyParkingLot_Unpark`
does this already, presumably to handle this case.
* Fix a RuntimeWarning emitted when assign an integer-like value that
is not an instance of int to an attribute that corresponds to a C
struct member of type T_UINT and T_ULONG.
* Fix a double RuntimeWarning emitted when assign a negative integer value
to an attribute that corresponds to a C struct member of type T_UINT.
* gh-112529: Remove PyGC_Head from object pre-header in free-threaded build
This avoids allocating space for PyGC_Head in the free-threaded build.
The GC implementation for free-threaded CPython does not use the
PyGC_Head structure.
* The trashcan mechanism uses the `ob_tid` field instead of `_gc_prev`
in the free-threaded build.
* The GDB libpython.py file now determines the offset of the managed
dict field based on whether the running process is a free-threaded
build. Those are identified by the `ob_ref_local` field in PyObject.
* Fixes `_PySys_GetSizeOf()` which incorrectly incorrectly included the
size of `PyGC_Head` in the size of static `PyTypeObject`.
The free-threaded build's GC implementation is non-generational, but was
scheduled as if it were collecting a young generation leading to
quadratic behavior. This increases the minimum threshold and scales it
to the number of live objects as we do for the old generation in the
default build.
Note that the scheduling is still not thread-safe without the GIL. Those
changes will come in later PRs.
A few tests, like "test_sneaky_frame_object" rely on prompt scheduling
of the GC. For now, to keep that test passing, we disable the scaled
threshold after calls like `gc.set_threshold(1, 0, 0)`.
Add a configure define for HAVE_PTHREAD_COND_TIMEDWAIT_RELATIVE_NP and
replaces pthread_cond_timedwait() with pthread_cond_timedwait_relative_np()
for relative time when supported in semaphore waiting logic.