Add the ability to enable/disable the GIL at runtime, and use that in
the C module loading code.
We can't know before running a module init function if it supports
free-threading, so the GIL is temporarily enabled before doing so. If
the module declares support for running without the GIL, the GIL is
later disabled. Otherwise, the GIL is permanently enabled, and will
never be disabled again for the life of the current interpreter.
We already intern and immortalize most string constants. In the
free-threaded build, other constants can be a source of reference count
contention because they are shared by all threads running the same code
objects.
Add _PyType_LookupRef and use incref before setting attribute on type
Makes setting an attribute on a class and signaling type modified atomic
Avoid adding re-entrancy exposing the type cache in an inconsistent state by decrefing after type is updated
The designated initializer syntax in static inline functions in pycore_backoff.h
causes problems for C++ or MSVC users who aren't yet using C++20.
While internal, pycore_backoff.h is included (indirectly, via pycore_code.h)
by some key 3rd party software that does so for speed.
Use the new public Raw functions:
* _PyTime_PerfCounterUnchecked() with PyTime_PerfCounterRaw()
* _PyTime_TimeUnchecked() with PyTime_TimeRaw()
* _PyTime_MonotonicUnchecked() with PyTime_MonotonicRaw()
Remove internal functions:
* _PyTime_PerfCounterUnchecked()
* _PyTime_TimeUnchecked()
* _PyTime_MonotonicUnchecked()
We have only been tracking each module's PyModuleDef. However, there are some problems with that. For example, in some cases we load single-phase init extension modules from def->m_base.m_init or def->m_base.m_copy, but if multiple modules share a def then we can end up with unexpected behavior.
With this change, we track the following:
* PyModuleDef (same as before)
* for some modules, its init function or a copy of its __dict__, but specific to that module
* whether it is a builtin/core module or a "dynamic" extension
* the interpreter (ID) that owns the cached __dict__ (only if cached)
This also makes it easier to remember the module's kind (e.g. single-phase init) and if loading it previously failed, which I'm doing separately.
* Add CALL_PY_GENERAL, CALL_BOUND_METHOD_GENERAL and call CALL_NON_PY_GENERAL specializations.
* Remove CALL_PY_WITH_DEFAULTS specialization
* Use CALL_NON_PY_GENERAL in more cases when otherwise failing to specialize
This PR adds the ability to enable the GIL if it was disabled at
interpreter startup, and modifies the multi-phase module initialization
path to enable the GIL when loading a module, unless that module's spec
includes a slot indicating it can run safely without the GIL.
PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went
with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148.
A warning will be issued up to once per interpreter for the first
GIL-using module that is loaded. If `-v` is given, a shorter message
will be printed to stderr every time a GIL-using module is loaded
(including the first one that issues a warning).
The module itself is a thin wrapper around calls to functions in
`Python/codecs.c`, so that's where the meaningful changes happened:
- Move codecs-related state that lives on `PyInterpreterState` to a
struct declared in `pycore_codecs.h`.
- In free-threaded builds, add a mutex to `codecs_state` to synchronize
operations on `search_path`. Because `search_path_mutex` is used as a
normal mutex and not a critical section, we must be extremely careful
with operations called while holding it.
- The codec registry is explicitly initialized as part of
`_PyUnicode_InitEncodings` to simplify thread-safety.
Add "Raw" variant of PyTime functions:
* PyTime_MonotonicRaw()
* PyTime_PerfCounterRaw()
* PyTime_TimeRaw()
Changes:
* Add documentation and tests. Tests release the GIL while calling
raw clock functions.
* py_get_system_clock() and py_get_monotonic_clock() now check that
the GIL is hold by the caller if raise_exc is non-zero.
* Reimplement "Unchecked" functions with raw clock functions.
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Account for `add_stopiteration_handler` pushing a block for `async with`.
To allow generator functions that previously almost hit the `CO_MAXBLOCKS`
limit by nesting non-async blocks, the limit is increased by 1.
This increase allows one more block in non-generator functions.
The code for Tier 2 is now only compiled when configured
with `--enable-experimental-jit[=yes|interpreter]`.
We drop support for `PYTHON_UOPS` and -`Xuops`,
but you can disable the interpreter or JIT
at runtime by setting `PYTHON_JIT=0`.
You can also build it without enabling it by default
using `--enable-experimental-jit=yes-off`;
enable with `PYTHON_JIT=1`.
On Windows, the `build.bat` script supports
`--experimental-jit`, `--experimental-jit-off`,
`--experimental-interpreter`.
In the C code, `_Py_JIT` is defined as before
when the JIT is enabled; the new variable
`_Py_TIER2` is defined when the JIT *or* the
interpreter is enabled. It is actually a bitmask:
1: JIT; 2: default-off; 4: interpreter.
Avoid detaching thread state when stopping the world. When re-attaching
the thread state, the thread would attempt to resume the top-most
critical section, which might now be held by a thread paused for our
stop-the-world request.
Deferred reference counting is not fully implemented yet. As a temporary
measure, we immortalize objects that would use deferred reference
counting to avoid multi-threaded scaling bottlenecks.
This is only performed in the free-threaded build once the first
non-main thread is started. Additionally, some tests, including refleak
tests, suppress this behavior.
Basically, I've turned most of _PyImport_LoadDynamicModuleWithSpec() into two new functions (_PyImport_GetModInitFunc() and _PyImport_RunModInitFunc()) and moved the rest of it out into _imp_create_dynamic_impl(). There shouldn't be any changes in behavior.
This change makes some future changes simpler. This is particularly relevant to potentially calling each module init function in the main interpreter first. Thus the critical part of the PR is the addition of _PyImport_RunModInitFunc(), which is strictly focused on running the init func and validating the result. A later PR will take it a step farther by capturing error information rather than raising exceptions.
FWIW, this change also helps readers by clarifying a bit more about what happens when an extension/builtin module is imported.
This is an improvement over the status quo, reducing the likelihood of completely filling the pending calls queue. However, the problem won't go away completely unless we move to an unbounded linked list or add a mechanism for waiting until the queue isn't full.