* Add PRECALL_FUNCTION opcode.
* Move 'call shape' varaibles into struct.
* Replace CALL_NO_KW and CALL_KW with KW_NAMES and CALL instructions.
* Specialize for builtin methods taking using the METH_FASTCALL | METH_KEYWORDS protocol.
* Allow kwnames for specialized calls to builtin types.
* Specialize calls to tuple(arg) and str(arg).
* Add RETURN_GENERATOR and JUMP_NO_INTERRUPT opcodes.
* Trim frame and generator by word each.
* Minor refactor of frame.c
* Update test.test_sys to account for smaller frames.
* Treat generator functions as normal functions when evaluating and specializing.
This change is strictly renames and moving code around. It helps in the following ways:
* ensures type-related init functions focus strictly on one of the three aspects (state, objects, types)
* passes in PyInterpreterState * to all those functions, simplifying work on moving types/objects/state to the interpreter
* consistent naming conventions help make what's going on more clear
* keeping API related to a type in the corresponding header file makes it more obvious where to look for it
https://bugs.python.org/issue46008
* Make generator, coroutine and async gen structs all the same size.
* Store interpreter frame in generator (and coroutine). Reduces the number of allocations neeeded for a generator from two to one.
* Split exit paths into exceptional and non-exceptional.
* Move exit tracing code to individual bytecodes.
* Wrap all trace entry and exit events in macros to make them clearer and easier to enhance.
* Move return sequence into RETURN_VALUE, YIELD_VALUE and YIELD_FROM. Distinguish between normal trace events and dtrace events.
* Make internal APIs that take PyFrameConstructor take a PyFunctionObject instead.
* Add reference to function to frame, borrow references to builtins and globals.
* Add COPY_FREE_VARS instruction to allow specialization of calls to inner functions.
Places the locals between the specials and stack. This is the more "natural" layout for a C struct, makes the code simpler and gives a slight speedup (~1%)
* Convert "specials" array to InterpreterFrame struct, adding f_lasti, f_state and other non-debug FrameObject fields to it.
* Refactor, calls pushing the call to the interpreter upward toward _PyEval_Vector.
* Compute f_back when on thread stack, only filling in value when frame object outlives stack invocation.
* Move ownership of InterpreterFrame in generator from frame object to generator object.
* Do not create frame objects for Python calls.
* Do not create frame objects for generators.
Currently, if an arg value escapes (into the closure for an inner function) we end up allocating two indices in the fast locals even though only one gets used. Additionally, using the lower index would be better in some cases, such as with no-arg `super()`. To address this, we update the compiler to fix the offsets so each variable only gets one "fast local". As a consequence, now some cell offsets are interspersed with the locals (only when an arg escapes to an inner function).
https://bugs.python.org/issue43693
This was reverted in GH-26596 (commit 6d518bb) due to some bad memory accesses.
* Add the MAKE_CELL opcode. (gh-26396)
The memory accesses have been fixed.
https://bugs.python.org/issue43693
This moves logic out of the frame initialization code and into the compiler and eval loop. Doing so simplifies the runtime code and allows us to optimize it better.
https://bugs.python.org/issue43693
* Remove 'zombie' frames. We won't need them once we are allocating fixed-size frames.
* Add co_nlocalplus field to code object to avoid recomputing size of locals + frees + cells.
* Move locals, cells and freevars out of frame object into separate memory buffer.
* Use per-threadstate allocated memory chunks for local variables.
* Move globals and builtins from frame object to per-thread stack.
* Move (slow) locals frame object to per-thread stack.
* Move internal frame functions to internal header.