Commit Graph

484 Commits

Author SHA1 Message Date
Miss Islington (bot) 0becae366c
[3.13] gh-117657: Fix QSBR race condition (GH-118843) (#118905)
`_Py_qsbr_unregister` is called when the PyThreadState is already
detached, so the access to `tstate->qsbr` isn't safe without locking the
shared mutex. Grab the `struct _qsbr_shared` from the interpreter
instead.
(cherry picked from commit 33d20199af)

Co-authored-by: Alex Turner <alexturner@meta.com>
2024-05-10 15:13:17 +00:00
Miss Islington (bot) 4480dd86d9
[3.13] gh-117657: Fix data races reported by TSAN on `interp->threads.main` (GH-118865) (#118904)
Use relaxed loads/stores when reading/writing to this field.
(cherry picked from commit 22d5185308)

Co-authored-by: mpage <mpage@meta.com>
2024-05-10 14:40:06 +00:00
Brett Simmers 853163d3b5
gh-116322: Enable the GIL while loading C extension modules (#118560)
Add the ability to enable/disable the GIL at runtime, and use that in
the C module loading code.

We can't know before running a module init function if it supports
free-threading, so the GIL is temporarily enabled before doing so. If
the module declares support for running without the GIL, the GIL is
later disabled. Otherwise, the GIL is permanently enabled, and will
never be disabled again for the life of the current interpreter.
2024-05-06 23:07:23 -04:00
Dino Viehland ff6cbb2503
gh-112075: use per-thread dict version pool (#118676)
use thread state set of dict versions
2024-05-07 00:22:26 +00:00
Sam Gross 723d4d2fe8
gh-118527: Intern code consts in free-threaded build (#118667)
We already intern and immortalize most string constants. In the
free-threaded build, other constants can be a source of reference count
contention because they are shared by all threads running the same code
objects.
2024-05-06 20:12:39 -04:00
Brett Simmers f8290df63f
gh-116738: Make `_codecs` module thread-safe (#117530)
The module itself is a thin wrapper around calls to functions in
`Python/codecs.c`, so that's where the meaningful changes happened:

- Move codecs-related state that lives on `PyInterpreterState` to a
  struct declared in `pycore_codecs.h`.

- In free-threaded builds, add a mutex to `codecs_state` to synchronize
  operations on `search_path`. Because `search_path_mutex` is used as a
  normal mutex and not a critical section, we must be extremely careful
  with operations called while holding it.

- The codec registry is explicitly initialized as part of
  `_PyUnicode_InitEncodings` to simplify thread-safety.
2024-05-02 18:25:36 -04:00
Guido van Rossum 7d83f7bcc4
gh-118335: Configure Tier 2 interpreter at build time (#118339)
The code for Tier 2 is now only compiled when configured
with `--enable-experimental-jit[=yes|interpreter]`.

We drop support for `PYTHON_UOPS` and -`Xuops`,
but you can disable the interpreter or JIT
at runtime by setting `PYTHON_JIT=0`.
You can also build it without enabling it by default
using `--enable-experimental-jit=yes-off`;
enable with `PYTHON_JIT=1`.

On Windows, the `build.bat` script supports
`--experimental-jit`, `--experimental-jit-off`,
`--experimental-interpreter`.

In the C code, `_Py_JIT` is defined as before
when the JIT is enabled; the new variable
`_Py_TIER2` is defined when the JIT *or* the
interpreter is enabled. It is actually a bitmask:
1: JIT; 2: default-off; 4: interpreter.
2024-04-30 18:26:34 -07:00
Sam Gross b2c3b70c71
gh-118332: Fix deadlock involving stop the world (#118412)
Avoid detaching thread state when stopping the world. When re-attaching
the thread state, the thread would attempt to resume the top-most
critical section, which might now be held by a thread paused for our
stop-the-world request.
2024-04-30 15:01:28 -04:00
Sam Gross 7ccacb220d
gh-117783: Immortalize objects that use deferred reference counting (#118112)
Deferred reference counting is not fully implemented yet. As a temporary
measure, we immortalize objects that would use deferred reference
counting to avoid multi-threaded scaling bottlenecks.

This is only performed in the free-threaded build once the first
non-main thread is started. Additionally, some tests, including refleak
tests, suppress this behavior.
2024-04-29 14:36:02 -04:00
mpage 2e7771a03d
gh-117657: Quiet TSAN warnings about remaining non-atomic accesses of `tstate->state` (#118165)
Quiet TSAN warnings about remaining non-atomic accesses of `tstate->state`
2024-04-23 10:20:14 -07:00
Dino Viehland 07525c9a85
gh-116818: Make `sys.settrace`, `sys.setprofile`, and monitoring thread-safe (#116775)
Makes sys.settrace, sys.setprofile, and monitoring generally thread-safe.

Mostly uses a stop-the-world approach and synchronization around the code object's _co_instrumentation_version.  There may be a little bit of extra synchronization around the monitoring data that's required to be TSAN clean.
2024-04-19 14:47:42 -07:00
Mark Shannon 147cd0581e
GH-117760: Streamline the trashcan mechanism (GH-117763) 2024-04-17 11:08:05 +01:00
mpage 0d5238373d
gh-117657: Quiet more TSAN warnings due to incorrect modeling of compare/exchange (#117830) 2024-04-15 12:17:55 -04:00
mpage 6e0b327690
gh-117657: Quiet TSAN warning about a data race between `start_the_world()` and `tstate_try_attach()` (#117828)
TSAN erroneously reports a data race between the `_Py_atomic_compare_exchange_int`
on `tstate->state` in `tstate_try_attach()` and the non-atomic load of
`tstate->state` in `start_the_world`. The `_Py_atomic_compare_exchange_int` fails,
but TSAN erroneously treats it as a store.
2024-04-15 12:17:33 -04:00
Eric Snow fd259fdabe
gh-76785: Handle Legacy Interpreters Properly (gh-117490)
This is similar to the situation with threading._DummyThread.  The methods (incl. __del__()) of interpreters.Interpreter objects must be careful with interpreters not created by interpreters.create().  The simplest thing to start with is to disable any method that modifies or runs in the interpreter.  As part of this, the runtime keeps track of where an interpreter was created.  We also handle interpreter "refcounts" properly.
2024-04-11 23:23:25 +00:00
Eric Snow 993c3cca16
gh-76785: Add More Tests to test_interpreters.test_api (gh-117662)
In addition to the increase test coverage, this is a precursor to sorting out how we handle interpreters created directly via the C-API.
2024-04-10 18:37:01 -06:00
Sam Gross 1a6594f661
gh-117439: Make refleak checking thread-safe without the GIL (#117469)
This keeps track of the per-thread total reference count operations in
PyThreadState in the free-threaded builds. The count is merged into the
interpreter's total when the thread exits.
2024-04-08 12:11:36 -04:00
mpage df73179048
gh-111926: Make weakrefs thread-safe in free-threaded builds (#117168)
Most mutable data is protected by a striped lock that is keyed on the
referenced object's address. The weakref's hash is protected using the
weakref's per-object lock.
 
Note that this only affects free-threaded builds. Apart from some minor
refactoring, the added code is all either gated by `ifdef`s or is a no-op
(e.g. `Py_BEGIN_CRITICAL_SECTION`).
2024-04-08 10:58:38 -04:00
Eric Snow 976bcb2379
gh-76785: Raise InterpreterError, Not RuntimeError (gh-117489)
I had meant to switch everything to InterpreterError when I added it a while back.  At the time I missed a few key spots.

As part of this, I've added print-the-exception to _PyXI_InitTypes() and fixed an error case in `_PyStaticType_InitBuiltin().
2024-04-03 10:58:39 -06:00
Sam Gross bfc57d43d8
gh-117303: Don't detach in `PyThreadState_DeleteCurrent()` (#117304)
This fixes a crash in `test_threading.test_reinit_tls_after_fork()` when
running with the GIL disabled. We already properly handle the case where
the thread state is `_Py_THREAD_ATTACHED` in `tstate_delete_common()` --
we just need to remove an assertion.

Keeping the thread attached means that a stop-the-world pause, such as
for a `fork()`, won't commence until we remove our thread state from the
interpreter's linked list. This prevents a crash when the child process
tries to clean up the dead thread states.
2024-03-29 18:58:08 -04:00
Sam Gross 01bd74eadb
gh-117300: Use stop the world to make `sys._current_frames` and `sys._current_exceptions` thread-safe. (#117301)
This adds a stop the world pause to make the two functions thread-safe
when the GIL is disabled in the free-threaded build.

Additionally, the main test thread may call `sys._current_exceptions()` as
soon as `g_raised.set()` is called. The background thread may not yet reach
the `leave_g.wait()` line.
2024-03-29 15:33:06 -04:00
Sam Gross 8dbfdb2957
gh-110481: Fix biased reference counting queue initialization. (#117271)
The biased reference counting queue must be initialized from the bound
(active) thread because it uses `_Py_ThreadId()` as the key in a hash
table.
2024-03-28 09:28:39 -04:00
Eric Snow b3d25df8d3
gh-105716: Fix _PyInterpreterState_IsRunningMain() For Embedders (gh-117140)
When I added _PyInterpreterState_IsRunningMain() and friends last year, I tried to accommodate applications that embed Python but don't call _PyInterpreterState_SetRunningMain() (not that they're expected to).  That mostly worked fine until my recent changes in gh-117049, where the subtleties with the fallback code led to failures; the change ended up breaking test_tools.test_freeze, which exercises a basic embedding situation.

The simplest fix is to drop the fallback code I originally added to _PyInterpreterState_IsRunningMain() (and later to _PyThreadState_IsRunningMain()).  I've kept the fallback in the _xxsubinterpreters module though.  I've also updated Py_FrozenMain() to call _PyInterpreterState_SetRunningMain().
2024-03-21 18:20:20 -06:00
Sam Gross 1f72fb5447
gh-116522: Refactor `_PyThreadState_DeleteExcept` (#117131)
Split `_PyThreadState_DeleteExcept` into two functions:

- `_PyThreadState_RemoveExcept` removes all thread states other than one
  passed as an argument. It returns the removed thread states as a
  linked list.

- `_PyThreadState_DeleteList` deletes those dead thread states. It may
  call destructors, so we want to "start the world" before calling
  `_PyThreadState_DeleteList` to avoid potential deadlocks.
2024-03-21 11:21:02 -07:00
Eric Snow 617158e078
gh-76785: Drop PyInterpreterID_Type (gh-117101)
I added it quite a while ago as a strategy for managing interpreter lifetimes relative to the PEP 554 (now 734) implementation.  Relatively recently I refactored that implementation to no longer rely on InterpreterID objects.  Thus now I'm removing it.
2024-03-21 17:15:02 +00:00
Eric Snow 5a76d1be8e
gh-105716: Update interp->threads.main After Fork (gh-117049)
I missed this in gh-109921.

We also update Py_Exit() to call _PyInterpreterState_SetNotRunningMain(), if necessary.
2024-03-21 10:06:35 -06:00
Eric Snow bbee57fa8c
gh-76785: Clean Up Interpreter ID Conversions (gh-117048)
Mostly we unify the two different implementations of the conversion code (from PyObject * to int64_t.  We also drop the PyArg_ParseTuple()-style converter function, as well as rename and move PyInterpreterID_LookUp().
2024-03-21 09:56:12 -06:00
Sam Gross e728303532
gh-116522: Stop the world before fork() and during shutdown (#116607)
This changes the free-threaded build to perform a stop-the-world pause
before deleting other thread states when forking and during shutdown.
This fixes some crashes when using multiprocessing and during shutdown
when running with `PYTHON_GIL=0`.

This also changes `PyOS_BeforeFork` to acquire the runtime lock
(i.e., `HEAD_LOCK(&_PyRuntime)`) before forking to ensure that data
protected by the runtime lock (and not just the GIL or stop-the-world)
is in a consistent state before forking.
2024-03-21 10:01:16 -04:00
Guido van Rossum 7e1f38f2de
gh-116916: Remove separate next_func_version counter (#116918)
Somehow we ended up with two separate counter variables tracking "the next function version".
Most likely this was a historical accident where an old branch was updated incorrectly.
This PR merges the two counters into a single one: `interp->func_state.next_version`.
2024-03-18 11:11:10 -07:00
mpage 33da0e844c
gh-114271: Fix race in `Thread.join()` (#114839)
There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`
and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following execution
involving threads A, B, and C:

1. A starts.
2. B joins A, blocking on its `_tstate_lock`.
3. C joins A, blocking on its `_tstate_lock`.
4. A finishes and releases its `_tstate_lock`.
5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped
   out before calling `_stop()`.
6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped
   out before releasing it.
7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.
   However, C holds it, so the assertion fails.

The race can be reproduced[^3] by inserting sleeps at the appropriate points in
the threading code. To do so, run the `repro_join_race.py` from the linked repo.

There are two main parts to this PR:

1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.
   The event is set by the runtime prior to the thread being cleared (in the same
   place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the
   event to be set.
2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all
   non-daemon threads to exit. To do so, an `is_daemon` predicate was added to
   `PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`
   now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on
   `_tstate_lock`s.

[^1]: 441affc9e7/Lib/threading.py (L1201)
[^2]: 441affc9e7/Lib/threading.py (L1115)
[^3]: 8194653279

---------

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
2024-03-16 13:56:30 +01:00
Sam Gross 9f983e00ec
gh-116515: Clear thread-local state before tstate_delete_common() (#116517)
This moves `current_fast_clear()` up so that the current thread state is
`NULL` while running `tstate_delete_common()`.

This doesn't fix any bugs, but it means that we are more consistent that
`_PyThreadState_GET() != NULL` means that the thread is "attached".
2024-03-11 15:14:20 -04:00
Sam Gross 834bf57eb7
gh-116396: Pass "detached_state" argument to tstate_set_detached (#116398)
The stop-the-world code was incorrectly setting suspended threads'
states to _Py_THREAD_DETACHED instead of _Py_THREAD_SUSPENDED.
2024-03-07 13:37:43 -05:00
Sam Gross c012c8ab7b
gh-115103: Delay reuse of mimalloc pages that store PyObjects (#115435)
This implements the delayed reuse of mimalloc pages that contain Python
objects in the free-threaded build.

Allocations of the same size class are grouped in data structures called
pages. These are different from operating system pages. For thread-safety, we
want to ensure that memory used to store PyObjects remains valid as long as
there may be concurrent lock-free readers; we want to delay using it for
other size classes, in other heaps, or returning it to the operating system.

When a mimalloc page becomes empty, instead of immediately freeing it, we tag
it with a QSBR goal and insert it into a per-thread state linked list of
pages to be freed. When mimalloc needs a fresh page, we process the queue and
free any still empty pages that are now deemed safe to be freed. Pages
waiting to be freed are still available for allocations of the same size
class and allocating from a page prevent it from being freed. There is
additional logic to handle abandoned pages when threads exit.
2024-03-06 09:42:11 -05:00
Brett Simmers 0adfa8482d
gh-115832: Fix instrumentation version mismatch during interpreter shutdown (#115856)
A previous commit introduced a bug to `interpreter_clear()`: it set
`interp->ceval.instrumentation_version` to 0, without making the corresponding
change to `tstate->eval_breaker` (which holds a thread-local copy of the
version). After this happens, Python code can still run due to object finalizers
during a GC, and the version check in bytecodes.c will see a different result
than the one in instrumentation.c causing an infinite loop.

The fix itself is straightforward: clear `tstate->eval_breaker` when clearing
`interp->ceval.instrumentation_version`.
2024-03-04 11:29:39 -05:00
Steve Dower 9578288a3e
gh-116012: Preserve GetLastError() across calls to TlsGetValue on Windows (GH-116014) 2024-02-28 13:58:25 +00:00
Michael Droettboom b05afdd5ec
gh-115168: Add pystats counter for invalidated executors (GH-115169) 2024-02-26 17:51:47 +00:00
Sam Gross e3ad6ca56f
gh-115103: Implement delayed free mechanism for free-threaded builds (#115367)
This adds `_PyMem_FreeDelayed()` and supporting functions. The
`_PyMem_FreeDelayed()` function frees memory with the same allocator as
`PyMem_Free()`, but after some delay to ensure that concurrent lock-free
readers have finished.
2024-02-20 13:04:37 -05:00
Sam Gross cc82e33af9
gh-115491: Keep some fields valid across allocations (free-threading) (#115573)
This avoids filling the memory occupied by ob_tid, ob_ref_local, and
ob_ref_shared with debug bytes (e.g., 0xDD) in mimalloc in the
free-threaded build.
2024-02-20 10:36:40 -05:00
Victor Stinner 9af80ec83d
gh-110850: Replace _PyTime_t with PyTime_t (#115719)
Run command:

sed -i -e 's!\<_PyTime_t\>!PyTime_t!g' $(find -name "*.c" -o -name "*.h")
2024-02-20 15:02:27 +00:00
Brett Simmers 0749244d13
gh-112175: Add `eval_breaker` to `PyThreadState` (#115194)
This change adds an `eval_breaker` field to `PyThreadState`. The primary
motivation is for performance in free-threaded builds: with thread-local eval
breakers, we can stop a specific thread (e.g., for an async exception) without
interrupting other threads.

The source of truth for the global instrumentation version is stored in the
`instrumentation_version` field in PyInterpreterState. Threads usually read the
version from their local `eval_breaker`, where it continues to be colocated
with the eval breaker bits.
2024-02-20 09:57:48 -05:00
Mark Shannon 7b21403ccd
GH-112354: Initial implementation of warm up on exits and trace-stitching (GH-114142) 2024-02-20 09:39:55 +00:00
Sam Gross 5903190727
gh-115103: Implement delayed memory reclamation (QSBR) (#115180)
This adds a safe memory reclamation scheme based on FreeBSD's "GUS" and
quiescent state based reclamation (QSBR). The API provides a mechanism
for callers to detect when it is safe to free memory that may be
concurrently accessed by readers.
2024-02-16 15:25:19 -05:00
Dino Viehland 454d7963e3
gh-113743: Use per-interpreter locks for types (#115541)
Move type-lock to per-interpreter lock to avoid heavy contention in interpreters test
2024-02-15 16:28:31 -08:00
Dino Viehland ae460d450a
gh-113743: Make the MRO cache thread-safe in free-threaded builds (#113930)
Makes _PyType_Lookup thread safe, including:
    Thread safety of the underlying cache.
    Make mutation of mro and type members thread safe
    Also _PyType_GetMRO and _PyType_GetBases are currently returning borrowed references which aren't safe.
2024-02-15 10:54:57 -08:00
Eric Snow 468430189d
gh-115482: Assume the Main Interpreter is Always Running "main" (gh-115484)
This is a temporary fix to unblock embedders that do not call Py_Main().

_PyInterpreterState_IsRunningMain() will always return true for the main interpreter, even in corner cases where it technically should not. The (future) full solution will do the right thing in those corner cases.
2024-02-14 16:07:22 -07:00
Donghee Na f15795c9a0
gh-111968: Rename freelist related struct names to Eric's suggestion (gh-115329) 2024-02-14 00:32:51 +00:00
Mark Shannon f9f6156c5a
GH-113710: Backedge counter improvements. (GH-115166) 2024-02-13 14:16:37 +00:00
Donghee Na d4d5bae147
gh-111968: Refactor _PyXXX_Fini to integrate with _PyObject_ClearFreeLists (gh-114899) 2024-02-10 00:57:04 +00:00
Sam Gross a3af3cb4f4
gh-110481: Implement inter-thread queue for biased reference counting (#114824)
Biased reference counting maintains two refcount fields in each object:
`ob_ref_local` and `ob_ref_shared`. The true refcount is the sum of these two
fields. In some cases, when refcounting operations are split across threads,
the ob_ref_shared field can be negative (although the total refcount must be
at least zero). In this case, the thread that decremented the refcount
requests that the owning thread give up ownership and merge the refcount
fields.
2024-02-09 17:08:32 -05:00
Sam Gross b6228b521b
gh-115035: Mark ThreadHandles as non-joinable earlier after forking (#115042)
This marks dead ThreadHandles as non-joinable earlier in
`PyOS_AfterFork_Child()` before we execute any Python code. The handles
are stored in a global linked list in `_PyRuntimeState` because `fork()`
affects the entire process.
2024-02-06 14:45:04 -05:00