We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.
The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
The core of the change is in:
* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers
I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config.
The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
The following are not changed (yet):
* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init
https://bugs.python.org/issue46541
* Add RETURN_GENERATOR and JUMP_NO_INTERRUPT opcodes.
* Trim frame and generator by word each.
* Minor refactor of frame.c
* Update test.test_sys to account for smaller frames.
* Treat generator functions as normal functions when evaluating and specializing.
* Make internal APIs that take PyFrameConstructor take a PyFunctionObject instead.
* Add reference to function to frame, borrow references to builtins and globals.
* Add COPY_FREE_VARS instruction to allow specialization of calls to inner functions.
Freelists for object structs can now be disabled. A new ``configure``
option ``--without-freelists`` can be used to disable all freelists
except empty tuple singleton. Internal Py*_MAXFREELIST macros can now
be defined as 0 without causing compiler warnings and segfaults.
Signed-off-by: Christian Heimes <christian@python.org>
Places the locals between the specials and stack. This is the more "natural" layout for a C struct, makes the code simpler and gives a slight speedup (~1%)
* Convert "specials" array to InterpreterFrame struct, adding f_lasti, f_state and other non-debug FrameObject fields to it.
* Refactor, calls pushing the call to the interpreter upward toward _PyEval_Vector.
* Compute f_back when on thread stack, only filling in value when frame object outlives stack invocation.
* Move ownership of InterpreterFrame in generator from frame object to generator object.
* Do not create frame objects for Python calls.
* Do not create frame objects for generators.
Currently, if an arg value escapes (into the closure for an inner function) we end up allocating two indices in the fast locals even though only one gets used. Additionally, using the lower index would be better in some cases, such as with no-arg `super()`. To address this, we update the compiler to fix the offsets so each variable only gets one "fast local". As a consequence, now some cell offsets are interspersed with the locals (only when an arg escapes to an inner function).
https://bugs.python.org/issue43693
This is the same fix as for PyFrame_LocalsToFast() in gh-26609, but applied to PyFrame_FastToLocalsWithError(). (It should have been in that PR.)
https://bugs.python.org/issue43693
This was reverted in GH-26596 (commit 6d518bb) due to some bad memory accesses.
* Add the MAKE_CELL opcode. (gh-26396)
The memory accesses have been fixed.
https://bugs.python.org/issue43693
This moves logic out of the frame initialization code and into the compiler and eval loop. Doing so simplifies the runtime code and allows us to optimize it better.
https://bugs.python.org/issue43693
These were reverted in gh-26530 (commit 17c4edc) due to refleaks.
* 2c1e258 - Compute deref offsets in compiler (gh-25152)
* b2bf2bc - Add new internal code objects fields: co_fastlocalnames and co_fastlocalkinds. (gh-26388)
This change fixes the refleaks.
https://bugs.python.org/issue43693
* Revert "bpo-43693: Compute deref offsets in compiler (gh-25152)"
This reverts commit b2bf2bc1ec.
* Revert "bpo-43693: Add new internal code objects fields: co_fastlocalnames and co_fastlocalkinds. (gh-26388)"
This reverts commit 2c1e2583fd.
These two commits are breaking the refleak buildbots.
A number of places in the code base (notably ceval.c and frameobject.c) rely on mapping variable names to indices in the frame "locals plus" array (AKA fast locals), and thus opargs. Currently the compiler indirectly encodes that information on the code object as the tuples co_varnames, co_cellvars, and co_freevars. At runtime the dependent code must calculate the proper mapping from those, which isn't ideal and impacts performance-sensitive sections. This is something we can easily address in the compiler instead.
This change addresses the situation by replacing internal use of co_varnames, etc. with a single combined tuple of names in locals-plus order, along with a minimal array mapping each to its kind (local vs. cell vs. free). These two new PyCodeObject fields, co_fastlocalnames and co_fastllocalkinds, are not exposed to Python code for now, but co_varnames, etc. are still available with the same values as before (though computed lazily).
Aside from the (mild) performance impact, there are a number of other benefits:
* there's now a clear, direct relationship between locals-plus and variables
* code that relies on the locals-plus-to-name mapping is simpler
* marshaled code objects are smaller and serialize/de-serialize faster
Also note that we can take this approach further by expanding the possible values in co_fastlocalkinds to include specific argument types (e.g. positional-only, kwargs). Doing so would allow further speed-ups in _PyEval_MakeFrameVector(), which is where args get unpacked into the locals-plus array. It would also allow us to shrink marshaled code objects even further.
https://bugs.python.org/issue43693
* Move up the comment about fields using in hashing/comparision.
* Group the fields more clearly.
* Add co_ncellvars and co_nfreevars.
* Raise ValueError if nlocals != len(varnames), rather than aborting.
* Remove 'zombie' frames. We won't need them once we are allocating fixed-size frames.
* Add co_nlocalplus field to code object to avoid recomputing size of locals + frees + cells.
* Move locals, cells and freevars out of frame object into separate memory buffer.
* Use per-threadstate allocated memory chunks for local variables.
* Move globals and builtins from frame object to per-thread stack.
* Move (slow) locals frame object to per-thread stack.
* Move internal frame functions to internal header.
"Zero cost" exception handling.
* Uses a lookup table to determine how to handle exceptions.
* Removes SETUP_FINALLY and POP_TOP block instructions, eliminating (most of) the runtime overhead of try statements.
* Reduces the size of the frame object by about 60%.
Accessing the following attributes will now fire PEP 578 style audit hooks as ("object.__getattr__", obj, name):
* PyTracebackObject: tb_frame
* PyFrameObject: f_code
* PyGenObject: gi_code, gi_frame
* PyCoroObject: cr_code, cr_frame
* PyAsyncGenObject: ag_code, ag_frame
Add an AUDIT_READ attribute flag aliased to READ_RESTRICTED.
Update obsolete flag documentation.
Add pycore_moduleobject.h internal header file with static inline
functions to access module members:
* _PyModule_GetDict()
* _PyModule_GetDef()
* _PyModule_GetState()
These functions don't check at runtime if their argument has a valid
type and can be inlined even if Python is not built with LTO.
_PyType_GetModuleByDef() uses _PyModule_GetDef().
Replace PyModule_GetState() with _PyModule_GetState() in the
extension modules, considered as performance sensitive:
* _abc
* _functools
* _operator
* _pickle
* _queue
* _random
* _sre
* _struct
* _thread
* _winapi
* array
* posix
The following extensions are now built with the Py_BUILD_CORE_MODULE
macro defined, to be able to use the internal pycore_moduleobject.h
header: _abc, array, _operator, _queue, _sre, _struct.
* Use instruction offset, rather than bytecode offset. Streamlines interpreter dispatch a bit, and removes most EXTENDED_ARGs for jumps.
* Change some uses of PyCode_Addr2Line to PyFrame_GetLineNumber
* Remove an assertion which required CO_NEWLOCALS and CO_OPTIMIZED
code flags. It is ok to call this function on a code with these
flags set.
* Fix reference counting on builtins: remove Py_DECREF().
Fix regression introduced in the
commit 46496f9d12.
Add also a comment to document that _PyEval_BuiltinsFromGlobals()
returns a borrowed reference.
The types.FunctionType constructor now inherits the current builtins
if the globals dictionary has no "__builtins__" key, rather than
using {"None": None} as builtins: same behavior as eval() and exec()
functions.
Defining a function with "def function(...): ..." in Python is not
affected, globals cannot be overriden with this syntax: it also
inherits the current builtins.
PyFrame_New(), PyEval_EvalCode(), PyEval_EvalCodeEx(),
PyFunction_New() and PyFunction_NewWithQualName() now inherits the
current builtins namespace if the globals dictionary has no
"__builtins__" key.
* Add _PyEval_GetBuiltins() function.
* _PyEval_BuiltinsFromGlobals() now uses _PyEval_GetBuiltins() if
builtins cannot be found in globals.
* Add tstate parameter to _PyEval_BuiltinsFromGlobals().
Pass the current interpreter (interp) rather than the current Python
thread state (tstate) to internal functions which only use the
interpreter.
Modified functions:
* _PyXXX_Fini() and _PyXXX_ClearFreeList() functions
* _PyEval_SignalAsyncExc(), make_pending_calls()
* _PySys_GetObject(), sys_set_object(), sys_set_object_id(), sys_set_object_str()
* should_audit(), set_flags_from_config(), make_flags()
* _PyAtExit_Call()
* init_stdio_encoding()
* etc.
* Refactor _PyFrame_New_NoTrack() and PyFunction_NewWithQualName()
code.
* PyFrame_New() checks for _PyEval_BuiltinsFromGlobals() failure.
* Fix a ref leak in _PyEval_BuiltinsFromGlobals() error path.
* Complete PyFunction_GetModule() documentation: it returns a
borrowed reference and it can return NULL.
* Move _PyEval_BuiltinsFromGlobals() definition to the internal C
API.
* PyFunction_NewWithQualName() uses _Py_IDENTIFIER() API for the
"__name__" string to make it compatible with subinterpreters.
* Further refactoring of PyEval_EvalCode and friends. Break into make-frame, and eval-frame parts.
* Simplify function vector call using new _PyEval_Vector.
* Remove unused internal functions: _PyEval_EvalCodeWithName and _PyEval_EvalCode.
* Don't use legacy function PyEval_EvalCodeEx.
* Add test for frame.f_lineno with/without tracing.
* Make sure that frame.f_lineno is correct regardless of whether frame.f_trace is set.
* Update importlib
* Add NEWS
* Merge gen and frame state variables into one.
* Replace stack pointer with depth in PyFrameObject. Makes code easier to read and saves a word of memory.