Stop the world when invalidating function versions
The tier1 interpreter specializes `CALL` instructions based on the values
of certain function attributes (e.g. `__code__`, `__defaults__`). The tier1
interpreter uses function versions to verify that the attributes of a function
during execution of a specialization match those seen during specialization.
A function's version is initialized in `MAKE_FUNCTION` and is invalidated when
any of the critical function attributes are changed. The tier1 interpreter stores
the function version in the inline cache during specialization. A guard is used by
the specialized instruction to verify that the version of the function on the operand
stack matches the cached version (and therefore has all of the expected attributes).
It is assumed that once the guard passes, all attributes will remain unchanged
while executing the rest of the specialized instruction.
Stopping the world when invalidating function versions ensures that all critical
function attributes will remain unchanged after the function version guard passes
in free-threaded builds. It's important to note that this is only true if the remainder
of the specialized instruction does not enter and exit a stop-the-world point.
We will stop the world the first time any of the following function attributes
are mutated:
- defaults
- vectorcall
- kwdefaults
- closure
- code
This should happen rarely and only happens once per function, so the performance
impact on majority of code should be minimal.
Additionally, refactor the API for manipulating function versions to more clearly
match the stated semantics.
Changes to the function version cache:
- In addition to the function object, also store the code object,
and allow the latter to be retrieved even if the function has been evicted.
- Stop assigning new function versions after a critical attribute (e.g. `__code__`)
has been modified; the version is permanently reset to zero in this case.
- Changes to `__annotations__` are no longer considered critical. (This fixes gh-109998.)
Changes to the Tier 2 optimization machinery:
- If we cannot map a function version to a function, but it is still mapped to a code object,
we continue projecting the trace.
The operand of the `_PUSH_FRAME` and `_POP_FRAME` opcodes can be either NULL,
a function object, or a code object with the lowest bit set.
This allows us to trace through code that calls an ephemeral function,
i.e., a function that may not be alive when we are constructing the executor,
e.g. a generator expression or certain nested functions.
We will lose globals removal inside such functions,
but we can still do other peephole operations
(and even possibly [call inlining](https://github.com/python/cpython/pull/116290),
if we decide to do it), which only need the code object.
As before, if we cannot retrieve the code object from the cache, we stop projecting.
This finishes the work begun in gh-107760. When, while projecting a superblock, we encounter a call to a short, simple function, the superblock will now enter the function using `_PUSH_FRAME`, continue through it, and leave it using `_POP_FRAME`, and then continue through the original code. Multiple frame pushes and pops are even possible. It is also possible to stop appending to the superblock in the middle of a called function, when running out of space or encountering an unsupported bytecode.
This implements PEP 695, Type Parameter Syntax. It adds support for:
- Generic functions (def func[T](): ...)
- Generic classes (class X[T](): ...)
- Type aliases (type X = ...)
- New scoping when the new syntax is used within a class body
- Compiler and interpreter changes to support the new syntax and scoping rules
Co-authored-by: Marc Mueller <30130371+cdce8p@users.noreply.github.com>
Co-authored-by: Eric Traut <eric@traut.com>
Co-authored-by: Larry Hastings <larry@hastings.org>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
* Make internal APIs that take PyFrameConstructor take a PyFunctionObject instead.
* Add reference to function to frame, borrow references to builtins and globals.
* Add COPY_FREE_VARS instruction to allow specialization of calls to inner functions.