Only call `gc_restore_tid()` from stop-the-world contexts.
`worklist_pop()` can be called while other threads are running, so use a
relaxed atomic to modify `ob_tid`.
The Full Grammar specification in the docs omits rule actions, so grammar rules that raise a syntax error looked like valid syntax.
This was solved in ef940de by hiding those rules in the custom syntax highlighter.
This moves all syntax-error alternatives to invalid rules, adds a validator that ensures that actions containing RAISE_SYNTAX_ERROR are in invalid rules, and reverts the syntax highlighter hack.
When the _Py_SINGLETON() is used, Argument Clinic now adds an
explicit "pycore_runtime.h" include to get the macro. Previously, the
macro may or may not be included indirectly by another include.
ensurepip forks a subprocess to run pip itself, but that subprocess only inherits a -I isolated mode flag (see _run_pip() in Lib/ensurepip/__init__.py), not the "-E -s" flags that the installer has been using. This means that parts of ensurepip don't actually run in an isolated environment and can make incorrect decisions based on packages installed in the user site-packages.
_PyArg_Parser holds static global data generated for modules by Argument Clinic. The _PyArg_Parser.kwtuple field is a tuple object, even though it's stored within a static global. In some cases the tuple is statically allocated and thus it's okay that it gets shared by multiple interpreters. However, in other cases the tuple is set lazily, allocated from the heap using the active interprepreter at the point the tuple is needed.
This is a problem once that interpreter is destroyed since _PyArg_Parser.kwtuple becomes at dangling pointer, leading to crashes. It isn't a problem if the tuple is allocated under the main interpreter, since its lifetime is bound to the lifetime of the runtime. The solution here is to temporarily switch to the main interpreter. The alternative would be to always statically allocate the tuple.
This change also fixes a bug where only the most recent parser was added to the global linked list.
This ensures we don't lose races that occur in subprocesses or
interleave races from workers running in parallel.
Log files are collected and packaged into a zipfile that can be
downloaded from the "Artifacts" section of the workflow run.
`_Py_qsbr_unregister` is called when the PyThreadState is already
detached, so the access to `tstate->qsbr` isn't safe without locking the
shared mutex. Grab the `struct _qsbr_shared` from the interpreter
instead.
Using `race:` filters out warnings if the function appears anywhere in
the stack trace. This can hide a lot of unrelated warnings, especially
for a function like `_PyEval_EvalFrameDefault`, which is somewhere on
the stack more often than not.
Change all free-threaded suppressions to `race_top:`, which only matches
the top frame, and add any new suppressions this exposes.
Use relaxed atomics when reading / writing to the field. There are still a
few places in the GC where we do not use atomics. Those should be safe as
the world is stopped.
The module itself is a thin wrapper around calls to functions in
`Python/codecs.c`, so that's where the meaningful changes happened:
- Move codecs-related state that lives on `PyInterpreterState` to a
struct declared in `pycore_codecs.h`.
- In free-threaded builds, add a mutex to `codecs_state` to synchronize
operations on `search_path`. Because `search_path_mutex` is used as a
normal mutex and not a critical section, we must be extremely careful
with operations called while holding it.
- The codec registry is explicitly initialized as part of
`_PyUnicode_InitEncodings` to simplify thread-safety.
The code for Tier 2 is now only compiled when configured
with `--enable-experimental-jit[=yes|interpreter]`.
We drop support for `PYTHON_UOPS` and -`Xuops`,
but you can disable the interpreter or JIT
at runtime by setting `PYTHON_JIT=0`.
You can also build it without enabling it by default
using `--enable-experimental-jit=yes-off`;
enable with `PYTHON_JIT=1`.
On Windows, the `build.bat` script supports
`--experimental-jit`, `--experimental-jit-off`,
`--experimental-interpreter`.
In the C code, `_Py_JIT` is defined as before
when the JIT is enabled; the new variable
`_Py_TIER2` is defined when the JIT *or* the
interpreter is enabled. It is actually a bitmask:
1: JIT; 2: default-off; 4: interpreter.
Makes sys.settrace, sys.setprofile, and monitoring generally thread-safe.
Mostly uses a stop-the-world approach and synchronization around the code object's _co_instrumentation_version. There may be a little bit of extra synchronization around the monitoring data that's required to be TSAN clean.
Quiet erroneous TSAN reports of data races in `_PySeqLock`
TSAN reports a couple of data races between the compare/exchange in
`_PySeqLock_LockWrite` and the non-atomic loads in `_PySeqLock_{Abandon,Unlock}Write`.
This is another instance of TSAN incorrectly modeling failed compare/exchange
as a write instead of a load.
Fix data races in the method cache in free-threaded builds
These are technically data races, but I think they're benign (to
the extent that that is actually possible). We update cache entries
non-atomically but read them atomically from another thread, and there's
nothing that establishes a happens-before relationship between the
reads and writes that I can see.
Additionally, reduce the iterations for a few weakref tests that would
otherwise take a prohibitively long amount of time (> 1 hour) when TSAN
is enabled and the GIL is disabled.
* Move ifndef_symbols, includes and add_include() from Clinic to
Codegen. Add a 'codegen' (Codegen) attribute to Clinic.
* Remove libclinic.crenderdata module: move code to libclinic.codegen.
* BlockPrinter.print_block(): remove unused 'limited_capi' argument.
Remove also 'core_includes' parameter.
* Add get_includes() methods.
* Make Codegen.ifndef_symbols private.
* Make Codegen.includes private.
* Make CConverter.includes private.
Introduce a unified 16-bit backoff counter type (``_Py_BackoffCounter``),
shared between the Tier 1 adaptive specializer and the Tier 2 optimizer. The
API used for adaptive specialization counters is changed but the behavior is
(supposed to be) identical.
The behavior of the Tier 2 counters is changed:
- There are no longer dynamic thresholds (we never varied these).
- All counters now use the same exponential backoff.
- The counter for ``JUMP_BACKWARD`` starts counting down from 16.
- The ``temperature`` in side exits starts counting down from 64.
Add libclinic.clanguage module and move the following classes and
functions there:
* CLanguage
* declare_parser()
Add libclinic.codegen and move the following classes there:
* BlockPrinter
* BufferSeries
* Destination
Move the following functions to libclinic.function:
* permute_left_option_groups()
* permute_optional_groups()
* permute_right_option_groups()
* Add a new create_parser_namespace() function for
PythonParser to pass objects to executed code.
* In run_clinic(), list converters using 'converters' and
'return_converters' dictionarties.
* test_clinic: add 'object()' return converter.
* Use also create_parser_namespace() in eval_ast_expr().
Co-authored-by: Erlend E. Aasland <erlend@python.org>
I added it quite a while ago as a strategy for managing interpreter lifetimes relative to the PEP 554 (now 734) implementation. Relatively recently I refactored that implementation to no longer rely on InterpreterID objects. Thus now I'm removing it.
Add Py_GetConstant() and Py_GetConstantBorrowed() functions.
In the limited C API version 3.13, getting Py_None, Py_False,
Py_True, Py_Ellipsis and Py_NotImplemented singletons is now
implemented as function calls at the stable ABI level to hide
implementation details. Getting these constants still return borrowed
references.
Add _testlimitedcapi/object.c and test_capi/test_object.py to test
Py_GetConstant() and Py_GetConstantBorrowed() functions.
Keep Tools/build/deepfreeze.py around (we may repurpose it for deepfreezing non-code objects),
and keep basic "clean" targets that remove the output of former deep-freeze activities,
to keep the build directories of current devs clean.
* Move Block and BlockParser classes to a new libclinic.block_parser
module.
* Move Language and PythonLanguage classes to a new
libclinic.language module.
The fildes converter of Argument Clinic now always call
PyObject_AsFileDescriptor(), not only for the limited C API.
The _PyLong_FileDescriptor_Converter() converter stays as a fallback
when PyObject_AsFileDescriptor() cannot be used.
Add a new C extension "_testlimitedcapi" which is only built with the
limited C API.
Move heaptype_relative.c and vectorcall_limited.c from
Modules/_testcapi/ to Modules/_testlimitedcapi/.
* configure: add _testlimitedcapi test extension.
* Update generate_stdlib_module_names.py.
* Update make check-c-globals.
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
* Move param guard to param state machine
* Override return converter during parsing
* Don't use a custom type slot return converter; instead
special case type slot functions during generation.
This changes the `sym_set_...()` functions to return a `bool` which is `false`
when the symbol is `bottom` after the operation.
All calls to such functions now check this result and go to `hit_bottom`,
a special error label that prints a different message and then reports
that it wasn't able to optimize the trace. No executor will be produced
in this case.
* Rename _Py_UOpsAbstractInterpContext to _Py_UOpsContext and _Py_UOpsSymType to _Py_UopsSymbol.
* #define shortened form of _Py_uop_... names for improved readability.
* Rename `_testinternalcapi.get_{uop,counter}_optimizer` to `new_*_optimizer`
* Use `_PyUOpName()` instead of` _PyOpcode_uop_name[]`
* Add `target` to executor iterator items -- `list(ex)` now returns `(opcode, oparg, target, operand)` quadruples
* Add executor methods `get_opcode()` and `get_oparg()` to get `vmdata.opcode`, `vmdata.oparg`
* Define a helper for printing uops, and unify various places where they are printed
* Add a hack to summarize_stats.py to fix legacy uop names (e.g. `POP_TOP` -> `_POP_TOP`)
* Define helpers in `test_opt.py` for accessing the set or list of opnames of an executor
This change essentially replaces usage of `%1` with `%~1`, which removes
quotes, if any. Without this change, the if statements fail due to
the quotes mangling the syntax.
Additionally, this change works around comma being treated as a parameter
delimiter in test.bat by escaping commas at time of parsing. Tested
combinations of rt and regrtest arguments, all seems to work as before
but now you can specify commas in arguments like "-uall,extralargefile".
Refactor state_modulename_name() of the parsing state machine, by
adding helpers for the sections that deal with ...:
1. parsing the function name
2. normalizing "function kind"
3. dealing with cloned functions
4. resolving return converters
5. adding the function to the DSL parser
Add PythonFinalizationError exception. This exception derived from
RuntimeError is raised when an operation is blocked during the Python
finalization.
The following functions now raise PythonFinalizationError, instead of
RuntimeError:
* _thread.start_new_thread()
* subprocess.Popen
* os.fork()
* os.fork1()
* os.forkpty()
Morever, _winapi.Overlapped finalizer now logs an unraisable
PythonFinalizationError, instead of an unraisable RuntimeError.
For the most part, these changes make is substantially easier to backport subinterpreter-related code to 3.12, especially the related modules (e.g. _xxsubinterpreters). The main motivation is to support releasing a PyPI package with the 3.13 capabilities compiled for 3.12.
A lot of the changes here involve either hiding details behind macros/functions or splitting up some files.
* gh-112529: Remove PyGC_Head from object pre-header in free-threaded build
This avoids allocating space for PyGC_Head in the free-threaded build.
The GC implementation for free-threaded CPython does not use the
PyGC_Head structure.
* The trashcan mechanism uses the `ob_tid` field instead of `_gc_prev`
in the free-threaded build.
* The GDB libpython.py file now determines the offset of the managed
dict field based on whether the running process is a free-threaded
build. Those are identified by the `ob_ref_local` field in PyObject.
* Fixes `_PySys_GetSizeOf()` which incorrectly incorrectly included the
size of `PyGC_Head` in the size of static `PyTypeObject`.
Add an option (--enable-experimental-jit for configure-based builds
or --experimental-jit for PCbuild-based ones) to build an
*experimental* just-in-time compiler, based on copy-and-patch (https://fredrikbk.com/publications/copy-and-patch.pdf).
See Tools/jit/README.md for more information on how to install the required build-time tooling.
For interpreters that share state with the main interpreter, this points
to the same static memory structure. For interpreters with their own
obmalloc state, it is heap allocated. Add free_obmalloc_arenas() which
will free the obmalloc arenas and radix tree structures for interpreters
with their own obmalloc state.
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
The test_peg_generator test tried to link the python313_d.lib library,
which failed because the library is now named python313t_d.lib. The
underlying problem is that the "compiler" attribute was not set when
we call get_libraries() from distutils.
Make it possible for a converter to have multiple includes, by collecting
them in a list on the converter instance. This implies converter includes
are added during template generation, so we have to add them to the
clinic instance at the end of the template generation instead of in the
beginning.