1. Remove conditions already checked by assert()
2. Remove object_init() call that effectively creates an empty tuple and
checks that this tuple is empty
Currently, if an arg value escapes (into the closure for an inner function) we end up allocating two indices in the fast locals even though only one gets used. Additionally, using the lower index would be better in some cases, such as with no-arg `super()`. To address this, we update the compiler to fix the offsets so each variable only gets one "fast local". As a consequence, now some cell offsets are interspersed with the locals (only when an arg escapes to an inner function).
https://bugs.python.org/issue43693
This was reverted in GH-26596 (commit 6d518bb) due to some bad memory accesses.
* Add the MAKE_CELL opcode. (gh-26396)
The memory accesses have been fixed.
https://bugs.python.org/issue43693
This moves logic out of the frame initialization code and into the compiler and eval loop. Doing so simplifies the runtime code and allows us to optimize it better.
https://bugs.python.org/issue43693
These were reverted in gh-26530 (commit 17c4edc) due to refleaks.
* 2c1e258 - Compute deref offsets in compiler (gh-25152)
* b2bf2bc - Add new internal code objects fields: co_fastlocalnames and co_fastlocalkinds. (gh-26388)
This change fixes the refleaks.
https://bugs.python.org/issue43693
* Revert "bpo-43693: Compute deref offsets in compiler (gh-25152)"
This reverts commit b2bf2bc1ec.
* Revert "bpo-43693: Add new internal code objects fields: co_fastlocalnames and co_fastlocalkinds. (gh-26388)"
This reverts commit 2c1e2583fd.
These two commits are breaking the refleak buildbots.
A number of places in the code base (notably ceval.c and frameobject.c) rely on mapping variable names to indices in the frame "locals plus" array (AKA fast locals), and thus opargs. Currently the compiler indirectly encodes that information on the code object as the tuples co_varnames, co_cellvars, and co_freevars. At runtime the dependent code must calculate the proper mapping from those, which isn't ideal and impacts performance-sensitive sections. This is something we can easily address in the compiler instead.
This change addresses the situation by replacing internal use of co_varnames, etc. with a single combined tuple of names in locals-plus order, along with a minimal array mapping each to its kind (local vs. cell vs. free). These two new PyCodeObject fields, co_fastlocalnames and co_fastllocalkinds, are not exposed to Python code for now, but co_varnames, etc. are still available with the same values as before (though computed lazily).
Aside from the (mild) performance impact, there are a number of other benefits:
* there's now a clear, direct relationship between locals-plus and variables
* code that relies on the locals-plus-to-name mapping is simpler
* marshaled code objects are smaller and serialize/de-serialize faster
Also note that we can take this approach further by expanding the possible values in co_fastlocalkinds to include specific argument types (e.g. positional-only, kwargs). Doing so would allow further speed-ups in _PyEval_MakeFrameVector(), which is where args get unpacked into the locals-plus array. It would also allow us to shrink marshaled code objects even further.
https://bugs.python.org/issue43693
The PyType_Ready() function now raises an error if a type is defined
with the Py_TPFLAGS_HAVE_GC flag set but has no traverse function
(PyTypeObject.tp_traverse).
* Move up the comment about fields using in hashing/comparision.
* Group the fields more clearly.
* Add co_ncellvars and co_nfreevars.
* Raise ValueError if nlocals != len(varnames), rather than aborting.
Fix a regression in type() when a metaclass raises an exception. The
C function type_new() must properly report the exception when a
metaclass constructor raises an exception and the winner class is not
the metaclass.
Fix a crash at Python exit when a deallocator function removes the
last strong reference to a heap type.
Don't read type memory after calling basedealloc() since
basedealloc() can deallocate the type and free its memory.
_PyMem_IsPtrFreed() argument is now constant.
* Remove 'zombie' frames. We won't need them once we are allocating fixed-size frames.
* Add co_nlocalplus field to code object to avoid recomputing size of locals + frees + cells.
* Move locals, cells and freevars out of frame object into separate memory buffer.
* Use per-threadstate allocated memory chunks for local variables.
* Move globals and builtins from frame object to per-thread stack.
* Move (slow) locals frame object to per-thread stack.
* Move internal frame functions to internal header.
check_set_special_type_attr() and type_set_annotations()
now check for immutable flag (Py_TPFLAGS_IMMUTABLETYPE).
Co-authored-by: Victor Stinner <vstinner@python.org>
Add a new Py_TPFLAGS_DISALLOW_INSTANTIATION type flag to disallow
creating type instances: set tp_new to NULL and don't create the
"__new__" key in the type dictionary.
The flag is set automatically on static types if tp_base is NULL or
&PyBaseObject_Type and tp_new is NULL.
Use the flag on the following types:
* _curses.ncurses_version type
* _curses_panel.panel
* _tkinter.Tcl_Obj
* _tkinter.tkapp
* _tkinter.tktimertoken
* _xxsubinterpretersmodule.ChannelID
* sys.flags type
* sys.getwindowsversion() type
* sys.version_info type
Update MyStr example in the C API documentation to use
Py_TPFLAGS_DISALLOW_INSTANTIATION.
Add _PyStructSequence_InitType() function to create a structseq type
with the Py_TPFLAGS_DISALLOW_INSTANTIATION flag set.
type_new() calls _PyType_CheckConsistency() at exit.
* Add Py_TPFLAGS_SEQUENCE and Py_TPFLAGS_MAPPING, add to all relevant standard builtin classes.
* Set relevant flags on collections.abc.Sequence and Mapping.
* Use flags in MATCH_SEQUENCE and MATCH_MAPPING opcodes.
* Inherit Py_TPFLAGS_SEQUENCE and Py_TPFLAGS_MAPPING.
* Add NEWS
* Remove interpreter-state map_abc and seq_abc fields.
Change class and module objects to lazy-create empty annotations dicts on demand. The annotations dicts are stored in the object's `__dict__` for backwards compatibility.
Introduce Py_TPFLAGS_IMMUTABLETYPE flag for immutable type objects, and
modify PyType_Ready() to set it for static types.
Co-authored-by: Victor Stinner <vstinner@python.org>
Add pycore_moduleobject.h internal header file with static inline
functions to access module members:
* _PyModule_GetDict()
* _PyModule_GetDef()
* _PyModule_GetState()
These functions don't check at runtime if their argument has a valid
type and can be inlined even if Python is not built with LTO.
_PyType_GetModuleByDef() uses _PyModule_GetDef().
Replace PyModule_GetState() with _PyModule_GetState() in the
extension modules, considered as performance sensitive:
* _abc
* _functools
* _operator
* _pickle
* _queue
* _random
* _sre
* _struct
* _thread
* _winapi
* array
* posix
The following extensions are now built with the Py_BUILD_CORE_MODULE
macro defined, to be able to use the internal pycore_moduleobject.h
header: _abc, array, _operator, _queue, _sre, _struct.
PyType_Ready() now ensures that a type MRO cannot be empty.
_PyType_GetModuleByDef() no longer checks "i < PyTuple_GET_SIZE(mro)"
at the first loop iteration to optimize the most common case, when
the argument is the defining class.
_PyType_GetModuleByDef() no longer checks if types are heap types.
_PyType_GetModuleByDef() must only be called on a heap type created
by PyType_FromModuleAndSpec() or on its subclasses.
type_ready_mro() ensures that a static type cannot inherit from a
heap type.
* Rename functions
* Only pass type parameter to "add_xxx" functions.
* Clarify the role of the type_ready_inherit_as_structs() function.
* Move type_dict_set_doc() code to call it in type_ready_fill_dict().
* Split PyType_Ready() into sub-functions.
* type_ready_mro() now checks if bases are static types earlier.
* Check tp_name earlier, in type_ready_checks().
* Add _PyType_IsReady() macro to check if a type is ready.
* Split type_new() into into many small functions.
* Add type_new_ctx structure to pass variables between subfunctions.
* Initialize some PyTypeObject and PyHeapTypeObject members earlier
in type_new_alloc().
* Rename variables to more specific names.
* Add "__weakref__" identifier for type_new_visit_slots().
* Factorize code to convert a method to a classmethod
(__init_subclass__ and __class_getitem__).
* Add braces to respect PEP 7.
* Move variable declarations where the variables are initialized.
Remove the compiler functions using "struct _mod" type, because the
public AST C API was removed:
* PyAST_Compile()
* PyAST_CompileEx()
* PyAST_CompileObject()
* PyFuture_FromAST()
* PyFuture_FromASTObject()
These functions were undocumented and excluded from the limited C API.
Rename functions:
* PyAST_CompileObject() => _PyAST_Compile()
* PyFuture_FromASTObject() => _PyFuture_FromAST()
Moreover, _PyFuture_FromAST() is no longer exported (replace
PyAPI_FUNC() with extern). _PyAST_Compile() remains exported for
test_peg_generator.
Remove also compatibility functions:
* PyAST_Compile()
* PyAST_CompileEx()
* PyFuture_FromAST()
The common case going through _PyType_Lookup is to have a cache hit. There are some small tweaks that can make this a little cheaper:
* The name field identity is used for a cache hit and is kept alive by the cache. So there's no need to read the hash code o the name - instead, the address can be used as the hash.
* There's no need to check if the name is cachable on the lookup either, it probably is, and if it is, it'll be in the cache.
* If we clear the version tag when invalidating a type then we don't actually need to check for a valid version tag bit.
Pass the current interpreter (interp) rather than the current Python
thread state (tstate) to internal functions which only use the
interpreter.
Modified functions:
* _PyXXX_Fini() and _PyXXX_ClearFreeList() functions
* _PyEval_SignalAsyncExc(), make_pending_calls()
* _PySys_GetObject(), sys_set_object(), sys_set_object_id(), sys_set_object_str()
* should_audit(), set_flags_from_config(), make_flags()
* _PyAtExit_Call()
* init_stdio_encoding()
* etc.
Make the Unicode dictionary of interned strings compatible with
subinterpreters.
Remove the INTERN_NAME_STRINGS macro in typeobject.c: names are
always now interned (even if EXPERIMENTAL_ISOLATED_SUBINTERPRETERS
macro is defined).
_PyUnicode_ClearInterned() now uses PyDict_Next() to no longer
allocate memory, to ensure that the interned dictionary is cleared.
Make the type attribute lookup cache per-interpreter.
Add private _PyType_InitCache() function, called by PyInterpreterState_New().
Continue to share next_version_tag between interpreters, since static
types are still shared by interpreters.
Remove MCACHE macro: the cache is no longer disabled if the
EXPERIMENTAL_ISOLATED_SUBINTERPRETERS macro is defined.
No longer use deprecated aliases to functions:
* Replace PyObject_MALLOC() with PyObject_Malloc()
* Replace PyObject_REALLOC() with PyObject_Realloc()
* Replace PyObject_FREE() with PyObject_Free()
* Replace PyObject_Del() with PyObject_Free()
* Replace PyObject_DEL() with PyObject_Free()
No longer use deprecated aliases to functions:
* Replace PyMem_MALLOC() with PyMem_Malloc()
* Replace PyMem_REALLOC() with PyMem_Realloc()
* Replace PyMem_FREE() with PyMem_Free()
* Replace PyMem_Del() with PyMem_Free()
* Replace PyMem_DEL() with PyMem_Free()
Modify also the PyMem_DEL() macro to use directly PyMem_Free().
* There were leaks if Py_tp_bases is used more than once or if some call is
failed before setting tp_bases.
* There was a crash if the bases argument or the Py_tp_bases slot is not a tuple.
* The documentation was not accurate.
bpo-1635741, bpo-40170: When called on a static type with NULL
tp_base, PyType_Ready() no longer increments the reference count of
the PyBaseObject_Type ("object). PyTypeObject.tp_base is a strong
reference on a heap type, but it is borrowed reference on a static
type.
Fix 99 reference leaks at Python exit (showrefcount 18623 => 18524).
If PyDict_GetItemWithError is only used to check whether the key is in dict,
it is better to use PyDict_Contains instead.
And if it is used in combination with PyDict_SetItem, PyDict_SetDefault can
replace the combination.
These functions are considered not safe because they suppress all internal errors
and can return wrong result. PyDict_GetItemString and _PyDict_GetItemId can
also silence current exception in rare cases.
Remove no longer used _PyDict_GetItemId.
Add _PyDict_ContainsId and rename _PyDict_Contains into
_PyDict_Contains_KnownHash.