cpython

Commit Graph

Author	SHA1	Message	Date
Sam Gross	e728303532	gh-116522: Stop the world before fork() and during shutdown (#116607 ) This changes the free-threaded build to perform a stop-the-world pause before deleting other thread states when forking and during shutdown. This fixes some crashes when using multiprocessing and during shutdown when running with `PYTHON_GIL=0`. This also changes `PyOS_BeforeFork` to acquire the runtime lock (i.e., `HEAD_LOCK(&_PyRuntime)`) before forking to ensure that data protected by the runtime lock (and not just the GIL or stop-the-world) is in a consistent state before forking.	2024-03-21 10:01:16 -04:00
Guido van Rossum	7e1f38f2de	gh-116916: Remove separate next_func_version counter (#116918 ) Somehow we ended up with two separate counter variables tracking "the next function version". Most likely this was a historical accident where an old branch was updated incorrectly. This PR merges the two counters into a single one: `interp->func_state.next_version`.	2024-03-18 11:11:10 -07:00
mpage	33da0e844c	gh-114271: Fix race in `Thread.join()` (#114839 ) There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()` and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following execution involving threads A, B, and C: 1. A starts. 2. B joins A, blocking on its `_tstate_lock`. 3. C joins A, blocking on its `_tstate_lock`. 4. A finishes and releases its `_tstate_lock`. 5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped out before calling `_stop()`. 6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped out before releasing it. 7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held. However, C holds it, so the assertion fails. The race can be reproduced[^3] by inserting sleeps at the appropriate points in the threading code. To do so, run the `repro_join_race.py` from the linked repo. There are two main parts to this PR: 1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`. The event is set by the runtime prior to the thread being cleared (in the same place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the event to be set. 2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all non-daemon threads to exit. To do so, an `is_daemon` predicate was added to `PyThreadState`. This field is set each time a thread is created. `threading._shutdown()` now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on `_tstate_lock`s. [^1]: `441affc9e7/Lib/threading.py (L1201)` [^2]: `441affc9e7/Lib/threading.py (L1115)` [^3]: `8194653279` --------- Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> Co-authored-by: Antoine Pitrou <antoine@python.org>	2024-03-16 13:56:30 +01:00
Sam Gross	9f983e00ec	gh-116515: Clear thread-local state before tstate_delete_common() (#116517 ) This moves `current_fast_clear()` up so that the current thread state is `NULL` while running `tstate_delete_common()`. This doesn't fix any bugs, but it means that we are more consistent that `_PyThreadState_GET() != NULL` means that the thread is "attached".	2024-03-11 15:14:20 -04:00
Sam Gross	834bf57eb7	gh-116396: Pass "detached_state" argument to tstate_set_detached (#116398 ) The stop-the-world code was incorrectly setting suspended threads' states to _Py_THREAD_DETACHED instead of _Py_THREAD_SUSPENDED.	2024-03-07 13:37:43 -05:00
Sam Gross	c012c8ab7b	gh-115103: Delay reuse of mimalloc pages that store PyObjects (#115435 ) This implements the delayed reuse of mimalloc pages that contain Python objects in the free-threaded build. Allocations of the same size class are grouped in data structures called pages. These are different from operating system pages. For thread-safety, we want to ensure that memory used to store PyObjects remains valid as long as there may be concurrent lock-free readers; we want to delay using it for other size classes, in other heaps, or returning it to the operating system. When a mimalloc page becomes empty, instead of immediately freeing it, we tag it with a QSBR goal and insert it into a per-thread state linked list of pages to be freed. When mimalloc needs a fresh page, we process the queue and free any still empty pages that are now deemed safe to be freed. Pages waiting to be freed are still available for allocations of the same size class and allocating from a page prevents it from being freed. There is additional logic to handle abandoned pages when threads exit.	2024-03-06 09:42:11 -05:00
Brett Simmers	0adfa8482d	gh-115832: Fix instrumentation version mismatch during interpreter shutdown (#115856 ) A previous commit introduced a bug to `interpreter_clear()`: it set `interp->ceval.instrumentation_version` to 0, without making the corresponding change to `tstate->eval_breaker` (which holds a thread-local copy of the version). After this happens, Python code can still run due to object finalizers during a GC, and the version check in bytecodes.c will see a different result than the one in instrumentation.c causing an infinite loop. The fix itself is straightforward: clear `tstate->eval_breaker` when clearing `interp->ceval.instrumentation_version`.	2024-03-04 11:29:39 -05:00
Steve Dower	9578288a3e	gh-116012: Preserve GetLastError() across calls to TlsGetValue on Windows (GH-116014)	2024-02-28 13:58:25 +00:00
Michael Droettboom	b05afdd5ec	gh-115168: Add pystats counter for invalidated executors (GH-115169)	2024-02-26 17:51:47 +00:00
Sam Gross	e3ad6ca56f	gh-115103: Implement delayed free mechanism for free-threaded builds (#115367 ) This adds `_PyMem_FreeDelayed()` and supporting functions. The `_PyMem_FreeDelayed()` function frees memory with the same allocator as `PyMem_Free()`, but after some delay to ensure that concurrent lock-free readers have finished.	2024-02-20 13:04:37 -05:00
Sam Gross	cc82e33af9	gh-115491: Keep some fields valid across allocations (free-threading) (#115573 ) This avoids filling the memory occupied by ob_tid, ob_ref_local, and ob_ref_shared with debug bytes (e.g., 0xDD) in mimalloc in the free-threaded build.	2024-02-20 10:36:40 -05:00
Victor Stinner	9af80ec83d	gh-110850: Replace _PyTime_t with PyTime_t (#115719 ) Run command: sed -i -e 's!\<_PyTime_t\>!PyTime_t!g' $(find -name "*.c" -o -name "*.h")	2024-02-20 15:02:27 +00:00
Brett Simmers	0749244d13	gh-112175: Add `eval_breaker` to `PyThreadState` (#115194 ) This change adds an `eval_breaker` field to `PyThreadState`. The primary motivation is for performance in free-threaded builds: with thread-local eval breakers, we can stop a specific thread (e.g., for an async exception) without interrupting other threads. The source of truth for the global instrumentation version is stored in the `instrumentation_version` field in PyInterpreterState. Threads usually read the version from their local `eval_breaker`, where it continues to be colocated with the eval breaker bits.	2024-02-20 09:57:48 -05:00
Mark Shannon	7b21403ccd	GH-112354: Initial implementation of warm up on exits and trace-stitching (GH-114142)	2024-02-20 09:39:55 +00:00
Sam Gross	5903190727	gh-115103: Implement delayed memory reclamation (QSBR) (#115180 ) This adds a safe memory reclamation scheme based on FreeBSD's "GUS" and quiescent state based reclamation (QSBR). The API provides a mechanism for callers to detect when it is safe to free memory that may be concurrently accessed by readers.	2024-02-16 15:25:19 -05:00
Dino Viehland	454d7963e3	gh-113743: Use per-interpreter locks for types (#115541 ) Move type-lock to per-interpreter lock to avoid heavy contention in interpreters test	2024-02-15 16:28:31 -08:00
Dino Viehland	ae460d450a	gh-113743: Make the MRO cache thread-safe in free-threaded builds (#113930 ) Makes _PyType_Lookup thread safe, including: * Thread safety of the underlying cache. * Make mutation of mro and type members thread safe. Also _PyType_GetMRO and _PyType_GetBases are currently returning borrowed references which aren't safe.	2024-02-15 10:54:57 -08:00
Eric Snow	468430189d	gh-115482: Assume the Main Interpreter is Always Running "main" (gh-115484) This is a temporary fix to unblock embedders that do not call Py_Main(). _PyInterpreterState_IsRunningMain() will always return true for the main interpreter, even in corner cases where it technically should not. The (future) full solution will do the right thing in those corner cases.	2024-02-14 16:07:22 -07:00
Donghee Na	f15795c9a0	gh-111968: Rename freelist related struct names to Eric's suggestion (gh-115329)	2024-02-14 00:32:51 +00:00
Mark Shannon	f9f6156c5a	GH-113710: Backedge counter improvements. (GH-115166)	2024-02-13 14:16:37 +00:00
Donghee Na	d4d5bae147	gh-111968: Refactor _PyXXX_Fini to integrate with _PyObject_ClearFreeLists (gh-114899)	2024-02-10 00:57:04 +00:00
Sam Gross	a3af3cb4f4	gh-110481: Implement inter-thread queue for biased reference counting (#114824 ) Biased reference counting maintains two refcount fields in each object: `ob_ref_local` and `ob_ref_shared`. The true refcount is the sum of these two fields. In some cases, when refcounting operations are split across threads, the ob_ref_shared field can be negative (although the total refcount must be at least zero). In this case, the thread that decremented the refcount requests that the owning thread give up ownership and merge the refcount fields.	2024-02-09 17:08:32 -05:00
Sam Gross	b6228b521b	gh-115035: Mark ThreadHandles as non-joinable earlier after forking (#115042 ) This marks dead ThreadHandles as non-joinable earlier in `PyOS_AfterFork_Child()` before we execute any Python code. The handles are stored in a global linked list in `_PyRuntimeState` because `fork()` affects the entire process.	2024-02-06 14:45:04 -05:00
Andrew Rogers	b3f0b698da	gh-104530: Enable native Win32 condition variables by default (GH-104531)	2024-02-02 13:50:51 +00:00
Donghee Na	13907968d7	gh-111968: Use per-thread freelists for dict in free-threading (gh-114323)	2024-02-01 20:53:53 +00:00
Victor Stinner	58f883b91b	gh-103323: Remove current_fast_get() unused parameter (#114593 ) The current_fast_get() static inline function doesn't use its 'runtime' parameter, so just remove it.	2024-01-30 11:47:58 +01:00
Neil Schemenauer	7a7bce5a0a	gh-113055: Use pointer for interp->obmalloc state (gh-113412) For interpreters that share state with the main interpreter, this points to the same static memory structure. For interpreters with their own obmalloc state, it is heap allocated. Add free_obmalloc_arenas() which will free the obmalloc arenas and radix tree structures for interpreters with their own obmalloc state. Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>	2024-01-26 19:38:14 -08:00
Sam Gross	b52fc70d1a	gh-112529: Implement GC for free-threaded builds (#114262 ) * gh-112529: Implement GC for free-threaded builds This implements a mark and sweep GC for the free-threaded builds of CPython. The implementation relies on mimalloc to find GC tracked objects (i.e., "containers").	2024-01-25 10:27:36 -08:00
Michael Droettboom	ea3cd0498c	gh-114312: Collect stats for unlikely events (GH-114493)	2024-01-25 11:10:51 +00:00
Mark Shannon	384429d1c0	GH-113710: Add a tier 2 peephole optimization pass. (GH-114487) * Convert _LOAD_CONST to inline versions * Remove PEP 523 checks	2024-01-24 12:08:31 +00:00
Sam Gross	441affc9e7	gh-111964: Implement stop-the-world pauses (gh-112471) The `--disable-gil` builds occasionally need to pause all but one thread. Some examples include: * Cyclic garbage collection, where this is often called a "stop the world event" * Before calling `fork()`, to ensure a consistent state for internal data structures * During interpreter shutdown, to ensure that daemon threads aren't accessing Python objects This adds the following functions to implement global and per-interpreter pauses: * `_PyEval_StopTheWorldAll()` and `_PyEval_StartTheWorldAll()` (for the global runtime) * `_PyEval_StopTheWorld()` and `_PyEval_StartTheWorld()` (per-interpreter) (The function names may change.) These functions are no-ops outside of the `--disable-gil` build.	2024-01-23 11:08:23 -07:00
Donghee Na	7fa511ba57	gh-111968: Use per-thread freelists for generator in free-threading (gh-114189)	2024-01-18 18:15:00 +00:00
Donghee Na	867f59f234	gh-111968: Use per-thread freelists for PyContext in free-threading (gh-114122)	2024-01-16 16:14:56 +00:00
Donghee Na	3eae76554b	gh-111968: Use per-thread slice_cache in free-threading (gh-113972)	2024-01-16 00:38:57 +09:00
Donghee Na	2e7577b622	gh-111968: Use per-thread freelists for tuple in free-threading (gh-113921)	2024-01-12 03:46:28 +09:00
Donghee Na	f728f7242c	gh-111968: Use per-thread freelists for float in free-threading (gh-113886)	2024-01-10 15:47:13 +00:00
Donghee Na	57bdc6c30d	gh-111968: Introduce _PyFreeListState and _PyFreeListState_GET API (gh-113584)	2024-01-10 08:04:41 +09:00
Sam Gross	0b7476080b	gh-112532: Tag mimalloc heaps and pages (#113742 ) * gh-112532: Tag mimalloc heaps and pages Mimalloc pages are data structures that contain contiguous allocations of the same block size. Note that they are distinct from operating system pages. Mimalloc pages are contained in segments. When a thread exits, it abandons any segments and contained pages that have live allocations. These segments and pages may be later reclaimed by another thread. To support GC and certain thread-safety guarantees in free-threaded builds, we want pages to only be reclaimed by the corresponding heap in the claimant thread. For example, we want pages containing GC objects to only be claimed by GC heaps. This allows heaps and pages to be tagged with an integer tag that is used to ensure that abandoned pages are only claimed by heaps with the same tag. Heaps can be initialized with a tag (0-15); any page allocated by that heap copies the corresponding tag. * Fix conversion warning	2024-01-05 12:08:50 -08:00
Sam Gross	fcb3c2a444	gh-112532: Isolate abandoned segments by interpreter (#113717 ) * gh-112532: Isolate abandoned segments by interpreter Mimalloc segments are data structures that contain memory allocations along with metadata. Each segment is "owned" by a thread. When a thread exits, it abandons its segments to a global pool to be later reclaimed by other threads. This changes the pool to be per-interpreter instead of process-wide. This will be important for when we use mimalloc to find GC objects in the `--disable-gil` builds. We want heaps to only store Python objects from a single interpreter. Absent this change, the abandoning and reclaiming process could break this isolation. * Add missing '&_mi_abandoned_default' to 'tld_empty'	2024-01-04 22:21:40 +00:00
Sam Gross	acf3bcc886	gh-112532: Use separate mimalloc heaps for GC objects (gh-113263) * gh-112532: Use separate mimalloc heaps for GC objects In `--disable-gil` builds, we now use four separate heaps in anticipation of using mimalloc to find GC objects when the GIL is disabled. To support this, we also make a few changes to mimalloc: * `mi_heap_t` and `mi_tld_t` initialization is split from allocation. This allows us to have a `mi_tld_t` per-`PyThreadState`, which is important to keep interpreter isolation, since the same OS thread may run in multiple interpreters (using different PyThreadStates.) * Heap abandoning (mi_heap_collect_ex) can now be called from a different thread than the one that created the heap. This is necessary because we may clear and delete the containing PyThreadStates from a different thread during finalization and after fork(). * Use enum instead of defines and guard mimalloc includes. * The enum typedef will be convenient for future PRs that use the type. * Guarding the mimalloc includes allows us to unconditionally include pycore_mimalloc.h from other header files that rely on things like `struct _mimalloc_thread_state`. * Only define _mimalloc_thread_state in Py_GIL_DISABLED builds	2023-12-27 01:53:20 +09:00
Donghee Na	d00dbf5415	gh-112535: Implement fallback implementation of _Py_ThreadId() (gh-113185) --------- Co-authored-by: Sam Gross <colesbury@gmail.com>	2023-12-18 16:54:49 +00:00
Sam Gross	a3c031884d	gh-112723: Call `PyThreadState_Clear()` from the correct interpreter (#112776 ) The `PyThreadState_Clear()` function must only be called with the GIL held and must be called from the same interpreter as the passed in thread state. Otherwise, any Python objects on the thread state may be destroyed using the wrong interpreter, leading to memory corruption. This is also important for `Py_GIL_DISABLED` builds because free lists will be associated with PyThreadStates and cleared in `PyThreadState_Clear()`. This fixes two places that called `PyThreadState_Clear()` from the wrong interpreter and adds an assertion to `PyThreadState_Clear()`.	2023-12-12 17:20:21 -07:00
Eric Snow	86a77f4e1a	gh-76785: Fixes for test.support.interpreters (gh-112982) This involves a number of changes for PEP 734.	2023-12-12 08:24:31 -07:00
Sam Gross	cf6110ba13	gh-111924: Use PyMutex for Runtime-global Locks. (gh-112207) This replaces some usages of PyThread_type_lock with PyMutex, which does not require memory allocation to initialize. This simplifies some of the runtime initialization and is also one step towards avoiding changing the default raw memory allocator during initialize/finalization, which can be non-thread-safe in some circumstances.	2023-12-07 12:33:40 -07:00
Sam Gross	db460735af	gh-112538: Add internal-only _PyThreadStateImpl "wrapper" for PyThreadState (gh-112560) Every PyThreadState instance is now actually a _PyThreadStateImpl. It is safe to cast from `PyThreadState` to `_PyThreadStateImpl` and back. The _PyThreadStateImpl will contain fields that we do not want to expose in the public C API.	2023-12-07 12:11:45 -07:00
Hugo van Kemenade	3b3ec0d77f	gh-111863: Rename `Py_NOGIL` to `Py_GIL_DISABLED` (#111864 ) Rename Py_NOGIL to Py_GIL_DISABLED	2023-11-20 15:52:00 +02:00
Sam Gross	446f18a911	gh-111956: Add thread-safe one-time initialization. (gh-111960)	2023-11-16 12:19:54 -07:00
Sam Gross	31c90d5838	gh-111569: Implement Python critical section API (gh-111571) Critical sections are helpers to replace the global interpreter lock with finer grained locking. They provide similar guarantees to the GIL and avoid the deadlock risk that plain locking involves. Critical sections are implicitly ended whenever the GIL would be released. They are resumed when the GIL would be acquired. Nested critical sections behave as if the sections were interleaved.	2023-11-08 15:39:29 -07:00
Tian Gao	e0afed7e27	gh-103615: Use local events for opcode tracing (GH-109472) * Use local monitoring for opcode trace * Remove f_opcode_trace_set * Add test for setting f_trace_opcodes after settrace	2023-11-03 16:39:50 +00:00
Eric Snow	9322ce90ac	gh-76785: Crossinterp utils additions (gh-111530) This moves several general internal APIs out of _xxsubinterpretersmodule.c and into the new Python/crossinterp.c (and the corresponding internal headers). Specifically: * _Py_excinfo, etc.: the initial implementation for non-object exception snapshots (in pycore_pyerrors.h and Python/errors.c) * _PyXI_exception_info, etc.: helpers for passing an exception between interpreters (wraps _Py_excinfo) * _PyXI_namespace, etc.: helpers for copying a dict of attrs between interpreters * _PyXI_Enter(), _PyXI_Exit(): functions that abstract out the transitions between one interpreter and a second that will do some work temporarily Again, these were all abstracted out of _xxsubinterpretersmodule.c as generalizations. I plan on proposing these as public API at some point.	2023-11-01 17:36:40 -06:00
Eric Snow	c6fe0869ab	gh-76785: Move the Cross-Interpreter Code to Its Own File (gh-111502) This is partly to clear this stuff out of pystate.c, but also in preparation for moving some code out of _xxsubinterpretersmodule.c. This change also moves this stuff to the internal API (new: Include/internal/pycore_crossinterp.h). @vstinner did this previously and I undid it. Now I'm re-doing it. :/	2023-10-30 16:53:10 -06:00
Mark Shannon	52e902ccf0	GH-109369: Add machinery for deoptimizing tier2 executors, both individually and globally. (GH-110384)	2023-10-23 14:49:09 +01:00
Eric Snow	a77fa05124	gh-76785: Clean Up the Channels Module (gh-110568)	2023-10-17 23:51:52 +00:00
Tian Gao	1e3460d9fa	gh-110752: Reset `ceval.eval_breaker` to 0 in `interpreter_clear` (GH-110753)	2023-10-12 15:10:21 +01:00
Eric Snow	7bd560ce8d	gh-76785: Add SendChannel.send_buffer() (#110246 ) (This is still a test module.)	2023-10-09 07:39:51 -06:00
Brett Cannon	5fd8821cf8	GH-110455: Guard `assert(tstate->thread_id > 0)` with `#ifndef HAVE_PTHREAD_STUBS` (GH-110487)	2023-10-06 16:12:19 -07:00
Sam Gross	6e97a9647a	gh-109549: Add new states to PyThreadState to support PEP 703 (gh-109915) This adds a new field 'state' to PyThreadState that can take on one of three values: _Py_THREAD_ATTACHED, _Py_THREAD_DETACHED, or _Py_THREAD_GC. The "attached" and "detached" states correspond closely to acquiring and releasing the GIL. The "gc" state is currently unused, but will be used to implement stop-the-world GC for --disable-gil builds in the near future.	2023-10-05 09:46:33 -06:00
Eric Snow	80dc39e1dc	gh-110310: Add a Per-Interpreter XID Registry for Heap Types (gh-110311) We do the following: * add a per-interpreter XID registry (PyInterpreterState.xidregistry) * put heap types there (keep static types in _PyRuntimeState.xidregistry) * clear the registries during interpreter/runtime finalization * avoid duplicate entries in the registry (when _PyCrossInterpreterData_RegisterClass() is called more than once for a type) * use Py_TYPE() instead of PyObject_Type() in _PyCrossInterpreterData_Lookup() The per-interpreter registry helps preserve isolation between interpreters. This is important when heap types are registered, which is something we haven't been doing yet but I will likely do soon.	2023-10-04 16:35:27 -06:00
Victor Stinner	d73501602f	gh-108867: Add PyThreadState_GetUnchecked() function (#108870 ) Add PyThreadState_GetUnchecked() function: similar to PyThreadState_Get(), but don't issue a fatal error if it is NULL. The caller is responsible to check if the result is NULL. Previously, this function was private and known as _PyThreadState_UncheckedGet().	2023-10-03 16:53:51 +00:00
Eric Snow	f5198b09e1	gh-109860: Use a New Thread State When Switching Interpreters, When Necessary (gh-110245) In a few places we switch to another interpreter without knowing if it has a thread state associated with the current thread. For the main interpreter there wasn't much of a problem, but for subinterpreters we were mostly okay re-using the tstate created with the interpreter (located via PyInterpreterState_ThreadHead()). There was a good chance that tstate wasn't actually in use by another thread. However, there are no guarantees of that. Furthermore, re-using an already used tstate is currently fragile. To address this, now we create a new thread state in each of those places and use it. One consequence of this change is that PyInterpreterState_ThreadHead() may not return NULL (though that won't happen for the main interpreter).	2023-10-03 09:20:48 -06:00
Eric Snow	1dd9dee45d	gh-105716: Support Background Threads in Subinterpreters Consistently (gh-109921) The existence of background threads running on a subinterpreter was preventing interpreters from getting properly destroyed, as well as impacting the ability to run the interpreter again. It also affected how we wait for non-daemon threads to finish. We add PyInterpreterState.threads.main, with some internal C-API functions.	2023-10-02 20:12:12 +00:00
Victor Stinner	8b626a47ba	gh-110079: Remove extern "C" { ...} in C code (#110080 )	2023-09-29 10:56:49 +02:00
Eric Snow	32466c97c0	gh-109793: Allow Switching Interpreters During Finalization (gh-109794) Essentially, we should check the thread ID rather than the thread state pointer.	2023-09-27 13:41:06 -06:00
Eric Snow	fd7e08a6f3	gh-76785: Use Pending Calls When Releasing Cross-Interpreter Data (gh-109556) This fixes some crashes in the _xxinterpchannels module, due to a race between interpreters.	2023-09-19 15:01:34 -06:00
Sam Gross	0c89056fe5	gh-108724: Add PyMutex and _PyParkingLot APIs (gh-109344) PyMutex is a one byte lock with fast, inlineable lock and unlock functions for the common uncontended case. The design is based on WebKit's WTF::Lock. PyMutex is built using the _PyParkingLot APIs, which provides a cross-platform futex-like API (based on WebKit's WTF::ParkingLot). This internal API will be used for building other synchronization primitives used to implement PEP 703, such as one-time initialization and events. This also includes tests and a mini benchmark in Tools/lockbench/lockbench.py to compare with the existing PyThread_type_lock. Uncontended acquisition + release: * Linux (x86-64): PyMutex: 11 ns, PyThread_type_lock: 44 ns * macOS (arm64): PyMutex: 13 ns, PyThread_type_lock: 18 ns * Windows (x86-64): PyMutex: 13 ns, PyThread_type_lock: 38 ns PR Overview: The primary purpose of this PR is to implement PyMutex, but there are a number of support pieces (described below). * PyMutex: A 1-byte lock that doesn't require memory allocation to initialize and is generally faster than the existing PyThread_type_lock. The API is internal only for now. * _PyParking_Lot: A futex-like API based on the API of the same name in WebKit. Used to implement PyMutex. * _PyRawMutex: A word sized lock used to implement _PyParking_Lot. * PyEvent: A one time event. This was used a bunch in the "nogil" fork and is useful for testing the PyMutex implementation, so I've included it as part of the PR. * pycore_llist.h: Defines common operations on doubly-linked list. Not strictly necessary (could do the list operations manually), but they come up frequently in the "nogil" fork. ( Similar to https://man.freebsd.org/cgi/man.cgi?queue) --------- Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>	2023-09-19 09:54:29 -06:00
Hood Chatham	6b179adb8c	gh-106213: Make Emscripten trampolines work with JSPI (GH-106219) There is a WIP proposal to enable webassembly stack switching which has been implemented in v8: https://github.com/WebAssembly/js-promise-integration It is not possible to switch stacks that contain JS frames so the Emscripten JS trampolines that allow calling functions with the wrong number of arguments don't work in this case. However, the js-promise-integration proposal requires the [type reflection for Wasm/JS API](https://github.com/WebAssembly/js-types) proposal, which allows us to actually count the number of arguments a function expects. For better compatibility with stack switching, this PR checks if type reflection is available, and if so we use a switch block to decide the appropriate signature. If type reflection is unavailable, we should use the current EMJS trampoline. We cache the function argument counts since when I didn't cache them performance was negatively affected. Co-authored-by: T. Wouters <thomas@python.org> Co-authored-by: Brett Cannon <brett@python.org>	2023-09-15 15:04:21 -07:00
Victor Stinner	517cd82ea7	gh-108987: Fix _thread.start_new_thread() race condition (#109135 ) Fix _thread.start_new_thread() race condition. If a thread is created during Python finalization, the newly spawned thread now exits immediately instead of trying to access freed memory and crashing. thread_run() calls PyEval_AcquireThread() which checks if the thread must exit. The problem was that tstate was dereferenced earlier in _PyThreadState_Bind() which leads to a crash most of the time. Move _PyThreadState_CheckConsistency() from thread_run() to _PyThreadState_Bind().	2023-09-11 17:27:03 +02:00
Victor Stinner	f63d37877a	gh-104690: thread_run() checks for tstate dangling pointer (#109056 ) thread_run() of _threadmodule.c now calls _PyThreadState_CheckConsistency() to check if tstate is a dangling pointer when Python is built in debug mode. Rename ceval_gil.c is_tstate_valid() to _PyThreadState_CheckConsistency() to reuse it in _threadmodule.c.	2023-09-08 11:50:46 +02:00
Victor Stinner	b0edf3b98e	GH-91079: Rename C_RECURSION_LIMIT to Py_C_RECURSION_LIMIT (#108507 ) Symbols of the C API should be prefixed by "Py_" to avoid conflict with existing names in 3rd party C extensions on "#include <Python.h>". test.pythoninfo now logs Py_C_RECURSION_LIMIT constant and other _testcapi and _testinternalcapi constants.	2023-09-08 09:48:28 +00:00
Mark Shannon	15d4c9fabc	GH-108716: Turn off deep-freezing of code objects. (GH-108722)	2023-09-08 10:34:40 +01:00
Victor Stinner	b298b395e8	gh-108765: Cleanup #include in Python/*.c files (#108977 ) Mention one symbol imported by each #include.	2023-09-06 15:56:08 +02:00
Victor Stinner	b936cf4fe0	gh-108634: PyInterpreterState_New() no longer calls Py_FatalError() (#108748 ) pycore_create_interpreter() now returns a status, rather than calling Py_FatalError(). * PyInterpreterState_New() now calls Py_ExitStatusException() instead of calling Py_FatalError() directly. * Replace Py_FatalError() with PyStatus in init_interpreter() and _PyObject_InitState(). * _PyErr_SetFromPyStatus() now raises RuntimeError, instead of ValueError. It can now call PyErr_NoMemory(), raise MemoryError, if it detects _PyStatus_NO_MEMORY() error message.	2023-09-01 12:43:30 +02:00
Victor Stinner	13a00078b8	gh-108634: Py_TRACE_REFS uses a hash table (#108663 ) Python built with "configure --with-trace-refs" (tracing references) is now ABI compatible with Python release build and debug build. Moreover, it now also supports the Limited API. Change Py_TRACE_REFS build: * Remove _PyObject_EXTRA_INIT macro. * The PyObject structure no longer has two extra members (_ob_prev and _ob_next). * Use a hash table (_Py_hashtable_t) to trace references (all objects): PyInterpreterState.object_state.refchain. * Py_TRACE_REFS build is now ABI compatible with release build and debug build. * Limited C API extensions can now be built with Py_TRACE_REFS: xxlimited, xxlimited_35, _testclinic_limited. * No longer rename PyModule_Create2() and PyModule_FromDefAndSpec2() functions to PyModule_Create2TraceRefs() and PyModule_FromDefAndSpec2TraceRefs(). * _Py_PrintReferenceAddresses() is now called before finalize_interp_delete() which deletes the refchain hash table. * test_tracemalloc find_trace() now also filters by size to ignore the memory allocated by _PyRefchain_Trace(). Test changes for Py_TRACE_REFS: * Add test.support.Py_TRACE_REFS constant. * Add test_sys.test_getobjects() to test sys.getobjects() function. * test_exceptions skips test_recursion_normalizing_with_no_memory() and test_memory_error_in_PyErr_PrintEx() if Python is built with Py_TRACE_REFS. * test_repl skips test_no_memory(). * test_capi skips test_set_nomemory().	2023-08-31 18:33:34 +02:00
Mark Shannon	006e44f950	GH-108035: Remove the `_PyCFrame` struct as it is no longer needed for performance. (GH-108036)	2023-08-17 11:16:03 +01:00
Eric Snow	430632d6f7	gh-107630: Initialize Each Interpreter's refchain Properly (gh-107733) This finishes fixing the crashes in Py_TRACE_REFS builds. We missed this part in gh-107567.	2023-08-07 13:14:56 -06:00
Eric Snow	8ba4df91ae	gh-105699: Use a _Py_hashtable_t for the PyModuleDef Cache (gh-106974) This fixes a crasher due to a race condition, triggered infrequently when two isolated (own GIL) subinterpreters simultaneously initialize their sys or builtins modules. The crash happened due the combination of the "detached" thread state we were using and the "last holder" logic we use for the GIL. It turns out it's tricky to use the same thread state for different threads. Who could have guessed? We solve the problem by eliminating the one object we were still sharing between interpreters. We replace it with a low-level hashtable, using the "raw" allocator to avoid tying it to the main interpreter. We also remove the accommodations for "detached" thread states, which were a dubious idea to start with.	2023-07-28 14:39:08 -06:00
Eric Snow	8bdae1424b	gh-101524: Only Use Public C-API in the _xxsubinterpreters Module (gh-107359) The _xxsubinterpreters module should not rely on internal API. Some of the functions it uses were recently moved there however. Here we move them back (and expose them properly).	2023-07-27 15:30:16 -06:00
Victor Stinner	0927a2b25c	GH-103082: Rename PY_MONITORING_EVENTS to _PY_MONITORING_EVENTS (#107069 ) Rename private C API constants: * Rename PY_MONITORING_UNGROUPED_EVENTS to _PY_MONITORING_UNGROUPED_EVENTS * Rename PY_MONITORING_EVENTS to _PY_MONITORING_EVENTS	2023-07-22 21:35:27 +00:00
Victor Stinner	bc7eb17084	gh-106320: Use _PyInterpreterState_GET() (#106336 ) Replace PyInterpreterState_Get() with inlined _PyInterpreterState_GET().	2023-07-02 16:37:37 +00:00
Victor Stinner	8571b271e7	gh-106320: Remove private _PyInterpreterState functions (#106325 ) Remove private _PyThreadState and _PyInterpreterState C API functions: move them to the internal C API (pycore_pystate.h and pycore_interp.h). Don't export most of these functions anymore, but still export functions used by tests. Remove _PyThreadState_Prealloc() and _PyThreadState_Init() from the C API, but keep them in the stable API.	2023-07-02 01:39:38 +00:00
Victor Stinner	46a3190fcf	gh-105927: Avoid calling PyWeakref_GET_OBJECT() (#105997 ) * Replace PyWeakref_GET_OBJECT() with _PyWeakref_GET_REF(). * _sqlite/blob.c now holds a strong reference to the blob object while calling close_blob(). * _xidregistry_find_type() now holds a strong reference to registered while using it.	2023-06-22 22:31:31 +02:00
Mark Shannon	7199584ac8	GH-100987: Allow objects other than code objects as the "executable" of an internal frame. (GH-105727) * Add table describing possible executable classes for out-of-process debuggers. * Remove shim code object creation code as it is no longer needed. * Make lltrace a bit more robust w.r.t. non-standard frames.	2023-06-14 13:46:37 +01:00
Eric Snow	757b402ea1	gh-104812: Run Pending Calls in any Thread (gh-104813) For a while now, pending calls only run in the main thread (in the main interpreter). This PR changes things to allow any thread to run a pending call, unless the pending call was explicitly added for the main thread to run.	2023-06-13 15:02:19 -06:00
Eric Snow	68dfa49627	gh-100227: Lock Around Modification of the Global Allocators State (gh-105516) The risk of a race with this state is relatively low, but we play it safe anyway. We do avoid using the lock in performance-sensitive cases where the risk of a race is very, very low.	2023-06-08 14:06:54 -06:00
Eric Snow	e822a676f1	gh-100227: Lock Around Adding Global Audit Hooks (gh-105515) The risk of a race with this state is relatively low, but we play it safe anyway.	2023-06-08 18:38:15 +00:00
Eric Snow	7799c8e678	gh-100227: Lock Around Use of the Global "atexit" State (gh-105514) The risk of a race with this state is relatively low, but we play it safe anyway.	2023-06-08 18:08:28 +00:00
Mark Shannon	4bfa01b9d9	GH-104584: Plugin optimizer API (GH-105100)	2023-06-02 11:46:18 +01:00
Eric Snow	3698fda06e	gh-104341: Call _PyEval_ReleaseLock() with NULL When Finalizing the Current Thread (gh-105109) This avoids the problematic race in drop_gil() by skipping the FORCE_SWITCHING code there for finalizing threads. (The idea for this approach came out of discussions with @markshannon.)	2023-06-01 16:24:10 -06:00
Eric Snow	26baa747c2	gh-104341: Adjust tstate_must_exit() to Respect Interpreter Finalization (gh-104437) With the move to a per-interpreter GIL, this check slipped through the cracks.	2023-05-15 13:59:26 -06:00
Eric Snow	5c9ee498c6	gh-99113: A Per-Interpreter GIL! (gh-104210) This is the culmination of PEP 684 (and of my 8-year long multi-core Python project)! Each subinterpreter may now be created with its own GIL (via Py_NewInterpreterFromConfig()). If not so configured then the interpreter will share with the main interpreter--the status quo since subinterpreters were added decades ago. The main interpreter always has its own GIL and subinterpreters from Py_NewInterpreter() will always share with the main interpreter.	2023-05-08 13:15:09 -06:00
Eric Snow	92d8bfffbf	gh-99113: Make Sure the GIL is Acquired at the Right Places (gh-104208) This is a pre-requisite for a per-interpreter GIL. Without it this change isn't strictly necessary. However, there is no real downside otherwise.	2023-05-06 15:59:30 -06:00
Eric Snow	55671fe047	gh-99113: Share the GIL via PyInterpreterState.ceval.gil (gh-104203) In preparation for a per-interpreter GIL, we add PyInterpreterState.ceval.gil, set it to the shared GIL for each interpreter, and use that rather than using _PyRuntime.ceval.gil directly. Note that _PyRuntime.ceval.gil is still the actual GIL.	2023-05-05 13:23:00 -06:00
Victor Stinner	45398ad512	gh-103323: Remove PyRuntimeState_GetThreadState() (#104171 ) This function no longer makes sense, since its runtime parameter is no longer used. Use directly _PyThreadState_GET() and _PyInterpreterState_GET() instead.	2023-05-04 16:21:01 +02:00
Mark Shannon	738c226786	GH-103082: Code cleanup in instrumentation code (#103474 )	2023-04-29 04:51:55 +00:00
Eric Snow	d8627999d8	gh-100227: Add a Granular Lock for _PyRuntime.imports.extensions.dict (gh-103460) The lock is unnecessary as long as there's a GIL, but completely necessary with a per-interpreter GIL.	2023-04-24 21:09:35 -06:00
Eric Snow	df3173d28e	gh-101659: Isolate "obmalloc" State to Each Interpreter (gh-101660) This is strictly about moving the "obmalloc" runtime state from `_PyRuntimeState` to `PyInterpreterState`. Doing so improves isolation between interpreters, specifically most of the memory (incl. objects) allocated for each interpreter's use. This is important for a per-interpreter GIL, but such isolation is valuable even without it. FWIW, a per-interpreter obmalloc is the proverbial canary-in-the-coalmine when it comes to the isolation of objects between interpreters. Any object that leaks (unintentionally) to another interpreter is highly likely to cause a crash (on debug builds at least). That's a useful thing to know, relative to interpreter isolation.	2023-04-24 17:23:57 -06:00
Eric Snow	f8abfa3314	gh-103323: Get the "Current" Thread State from a Thread-Local Variable (gh-103324) We replace _PyRuntime.tstate_current with a thread-local variable. As part of this change, we add a _Py_thread_local macro in pyport.h (only for the core runtime) to smooth out the compiler differences. The main motivation here is in support of a per-interpreter GIL, but this change also provides some performance improvement opportunities. Note that we do not provide a fallback to the thread-local, either falling back to the old tstate_current or to thread-specific storage (PyThread_tss_*()). If that proves problematic then we can circle back. I consider it unlikely, but will run the buildbots to double-check. Also note that this does not change any of the code related to the GILState API, where it uses a thread state stored in thread-specific storage. I suspect we can combine that with _Py_tss_tstate (from here). However, that can be addressed separately and is not urgent (nor critical). (While this change was mostly done independently, I did take some inspiration from earlier (~2020) work by @markshannon (main...markshannon:threadstate_in_tls) and @vstinner (#23976).)	2023-04-24 11:17:02 -06:00
Mark Shannon	411b169281	GH-103082: Implementation of PEP 669: Low Impact Monitoring for CPython (GH-103083) * The majority of the monitoring code is in instrumentation.c * The new instrumentation bytecodes are in bytecodes.c * legacy_tracing.c adapts the new API to the old sys.settrace and sys.setprofile APIs	2023-04-12 12:04:55 +01:00
Irit Katriel	78b763f630	gh-103176: sys._current_exceptions() returns mapping to exception instances instead of exc_info tuples (#103177 )	2023-04-11 09:38:37 +01:00
Eric Snow	52e9b389a8	gh-100227: Use an Array for _PyRuntime's Set of Locks During Init (gh-103315) This cleans things up a bit and simplifies adding new granular global locks.	2023-04-06 12:00:49 -06:00

507 Commits