When changing PyType_FromMetaclass recently (GH-93012, GH-93466, GH-28748)
I found a bunch of opportunities to improve the code. Here they are.
Fixes: #89546
Automerge-Triggered-By: GH:encukou
Classes ReferenceType, ProxyType and CallableProxyType have now correct
atrtributes __module__, __name__ and __qualname__.
It makes them (types, not instances) pickleable.
It combines PyImport_ImportModule() and PyObject_GetAttrString()
and saves 4-6 lines of code on every use.
Add also _PyImport_GetModuleAttr() which takes Python strings as arguments.
This checks the bases of of a type created using the FromSpec
API to inherit the bases metaclasses. The metaclass's alloc
function will be called as is done in `tp_new` for classes
created in Python.
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@protonmail.com>
This was added for bpo-40514 (gh-84694) to test out a per-interpreter GIL. However, it has since proven unnecessary to keep the experiment in the repo. (It can be done as a branch in a fork like normal.) So here we are removing:
* the configure option
* the macro
* the code enabled by the macro
Added a new stable API function ``PyType_FromMetaclass``, which mirrors
the behavior of ``PyType_FromModuleAndSpec`` except that it takes an
additional metaclass argument. This is, e.g., useful for language
binding tools that need to store additional information in the type
object.
Avoid _PyCodec_Lookup() and PyCodec_LookupError() for most common
built-in encodings and error handlers to avoid creating a temporary
Unicode string object, whereas these encodings and error handlers are
known to be valid.
Python now always use the ``%zu`` and ``%zd`` printf formats to
format a size_t or Py_ssize_t number. Building Python 3.12 requires a
C11 compiler, so these printf formats are now always supported.
* PyObject_Print() and _PyObject_Dump() now use the printf %zd format
to display an object reference count.
* Update PY_FORMAT_SIZE_T comment.
* Remove outdated notes about the %zd format in PyBytes_FromFormat()
and PyUnicode_FromFormat() documentations.
* configure no longer checks for the %zd format and no longer defines
PY_FORMAT_SIZE_T macro in pyconfig.h.
* pymacconfig.h no longer undefines PY_FORMAT_SIZE_T: macOS 10.4 is
no longer supported. Python 3.12 now requires macOS 10.6 (Snow
Leopard) or newer.
Generally comparable perf for the "good" case where memchr doesn't
return any collisions (false matches on lower byte) but clearly faster
with collisions.
Remove the PyUnicode_InternImmortal() function and the
SSTATE_INTERNED_IMMORTAL macro.
The PyUnicode_InternImmortal() function is still exported in the
stable ABI. The function is removed from the API.
PyASCIIObject.state.interned size is now a single bit, rather than 2
bits.
Keep SSTATE_NOT_INTERNED and SSTATE_INTERNED_MORTAL macros for
backward compatibility, but no longer use them internally since the
interned member is now a single bit and so can only have two values
(interned or not interned).
Update stats of _PyUnicode_ClearInterned().
In the limited C API version 3.12, PyUnicode_KIND() is now
implemented as a static inline function. Keep the macro for the
regular C API and for the limited C API version 3.11 and older to
prevent introducing new compiler warnings.
Update _decimal.c and stringlib/eq.h for PyUnicode_KIND().
Convert the following macros to static inline functions:
* PyCell_GET()
* PyCell_SET()
Limited C API version 3.12 no longer casts arguments.
Fix also usage of PyCell_SET(): only delete the old value after
setting the new value.
Convert the following macros to static inline functions:
* _Py_AS_GC()
* _PyGCHead_FINALIZED(), _PyGCHead_SET_FINALIZED()
* _PyGCHead_NEXT(), _PyGCHead_SET_NEXT()
* _PyGCHead_PREV(), _PyGCHead_SET_PREV()
* _PyGC_FINALIZED(), _PyGC_SET_FINALIZED()
* _PyObject_GC_IS_TRACKED()
* _PyObject_GC_MAY_BE_TRACKED()
Add a macro wrapping the _PyObject_GC_IS_TRACKED() function to cast
the argument to PyObject*.
Currently, calling Py_EnterRecursiveCall() and
Py_LeaveRecursiveCall() may use a function call or a static inline
function call, depending if the internal pycore_ceval.h header file
is included or not. Use a different name for the static inline
function to ensure that the static inline function is always used in
Python internals for best performance. Similar approach than
PyThreadState_GET() (function call) and _PyThreadState_GET() (static
inline function).
* Rename _Py_EnterRecursiveCall() to _Py_EnterRecursiveCallTstate()
* Rename _Py_LeaveRecursiveCall() to _Py_LeaveRecursiveCallTstate()
* pycore_ceval.h: Rename Py_EnterRecursiveCall() to
_Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() and
_Py_LeaveRecursiveCall()
Replace "(PyCFunction)(void(*)(void))func" cast with
_PyCFunction_CAST(func).
Change generated by the command:
sed -i -e \
's!(PyCFunction)(void(\*)(void)) *\([A-Za-z0-9_]\+\)!_PyCFunction_CAST(\1)!g' \
$(find -name "*.c")
If the error handler returns position less or equal than the starting
position of non-encodable characters, most of built-in encoders didn't
properly re-size the output buffer. This led to out-of-bounds writes,
and segfaults.
Reduce the complexity from O((M+N)^2) to O(M*N), where M and N are the length
of __args__ for both operands (1 for operand which is not a UnionType).
As a consequence, the complexity of parameter substitution in UnionType has
been reduced from O(N^3) to O(N^2).
Co-authored-by: Yurii Karabas <1998uriyyo@gmail.com>
Move the following API from Include/opcode.h (public C API) to a new
Include/internal/pycore_opcode.h header file (internal C API):
* EXTRA_CASES
* _PyOpcode_Caches
* _PyOpcode_Deopt
* _PyOpcode_Jump
* _PyOpcode_OpName
* _PyOpcode_RelativeJump
* Stores all location info in linetable to conform to PEP 626.
* Remove column table from code objects.
* Remove end-line table from code objects.
* Document new location table format
Python 3.11 now uses C11 standard which adds static_assert()
to <assert.h>.
* In pytime.c, replace Py_BUILD_ASSERT() with preprocessor checks on
SIZEOF_TIME_T with #error.
* On macOS, py_mach_timebase_info() now accepts timebase members with
the same size than _PyTime_t.
* py_get_monotonic_clock() now saturates GetTickCount64() to
_PyTime_MAX: GetTickCount64() is unsigned, whereas _PyTime_t is
signed.
The left-hand side expression of the if-check can be converted to a
constant by the compiler, but the addition on the right-hand side is
performed during runtime.
Move the addition from the right-hand side to the left-hand side by
turning it into a subtraction there. Since the values are known to
be large enough to not turn negative, this is a safe operation.
Prevents a very unlikely integer overflow on 32 bit systems.
Fixes GH-91421.
Remove the Include/code.h header file. C extensions should only
include the main <Python.h> header file.
Python.h includes directly Include/cpython/code.h instead.
Copying and pickling instances of subclasses of builtin types
bytearray, set, frozenset, collections.OrderedDict, collections.deque,
weakref.WeakSet, and datetime.tzinfo now copies and pickles instance attributes
implemented as slots.
Add macros to cast objects to PyASCIIObject*, PyCompactUnicodeObject*
and PyUnicodeObject*: _PyASCIIObject_CAST(),
_PyCompactUnicodeObject_CAST() and _PyUnicodeObject_CAST(). Using
these new macros make the code more readable and check their argument
with: assert(PyUnicode_Check(op)).
Remove redundant assert(PyUnicode_Check(op)) in macros using directly
or indirectly these new CAST macros.
Replacing existing casts with these macros.
* `PyFrame_FastToLocalsWithError` and `PyFrame_LocalsToFast` are no longer called during profile and tracing.
(Contributed by Fabio Zadrozny)
* Make accesses to a frame's `f_locals` safe from C code, not relying on calls to `PyFrame_FastToLocals` or `PyFrame_LocalsToFast`.
* Document new `PyFrame_GetLocals` C-API function.
* Moves the bytecode to the end of the corresponding PyCodeObject, and quickens it in-place.
* Removes the almost-always-unused co_varnames, co_freevars, and co_cellvars member caches
* _PyOpcode_Deopt is a new mapping from all opcodes to their un-quickened forms.
* _PyOpcode_InlineCacheEntries is renamed to _PyOpcode_Caches
* _Py_IncrementCountAndMaybeQuicken is renamed to _PyCode_Warmup
* _Py_Quicken is renamed to _PyCode_Quicken
* _co_quickened is renamed to _co_code_adaptive (and is now a read-only memoryview).
* Do not emit unused nonzero opargs anymore in the compiler.
Specifically, prepare for starring of tuples via a new genericalias iter type. GenericAlias also partially supports the iterator protocol after this change.
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Add new functions to pack and unpack C double (serialize and
deserialize):
* PyFloat_Pack2(), PyFloat_Pack4(), PyFloat_Pack8()
* PyFloat_Unpack2(), PyFloat_Unpack4(), PyFloat_Unpack8()
Document these functions and add unit tests.
Rename private functions and move them from the internal C API
to the public C API:
* _PyFloat_Pack2() => PyFloat_Pack2()
* _PyFloat_Pack4() => PyFloat_Pack4()
* _PyFloat_Pack8() => PyFloat_Pack8()
* _PyFloat_Unpack2() => PyFloat_Unpack2()
* _PyFloat_Unpack4() => PyFloat_Unpack4()
* _PyFloat_Unpack8() => PyFloat_Unpack8()
Replace the "unsigned char*" type with "char*" which is more common
and easy to use.
Add methods __typing_subst__() in TypeVar and ParamSpec.
Simplify code by using more object-oriented approach, especially
the C code for types.GenericAlias and the Python code for
collections.abc.Callable.
When an exception is created in a nested call to PyObject_GetAttr, any
external calls will override the context information of the
AttributeError that we have already placed in the most internal call.
This will cause the suggestions we create to nor work properly as the
attribute name and object that we will be using are the incorrect ones.
To avoid this, we need to check first if these attributes are already
set and bail out if that's the case.
Remove the undocumented private float.__set_format__() method,
previously known as float.__set_format__() in Python 3.7. Its
docstring said: "You probably don't want to use this function. It
exists mainly to be used in Python's test suite."
Rename private functions (no exported), add an underscore prefix:
* PyLineTable_InitAddressRange() => _PyLineTable_InitAddressRange()
* PyLineTable_NextAddressRange() => _PyLineTable_NextAddressRange()
* PyLineTable_PreviousAddressRange() => _PyLineTable_PreviousAddressRange()
Move private functions to the internal C API:
* _PyCode_Addr2EndLine()
* _PyCode_Addr2EndOffset()
* _PyCode_Addr2Offset()
* _PyCode_InitAddressRange()
* _PyCode_InitEndAddressRange(
* _PyLineTable_InitAddressRange()
* _PyLineTable_NextAddressRange()
* _PyLineTable_PreviousAddressRange()
No longer export the following internal functions:
* _PyCode_GetVarnames()
* _PyCode_GetCellvars()
* _PyCode_GetFreevars()
* _Py_GetSpecializationStats()
Add "extern" to pycore_code.h functions to identify them more easiliy
(they are still not exported).
Rename the private undocumented float.__set_format__() method to
float.__setformat__() to fix a typo introduced in Python 3.7. The
method is only used by test_float.
The change enables again test_float tests on the float format which
were previously skipped because of the typo.
The typo was introduced in Python 3.7 by bpo-20185
in commit b5c51d3dd9.
Remove the HAVE_PY_SET_53BIT_PRECISION macro (moved to the internal
C API).
* Move HAVE_PY_SET_53BIT_PRECISION macro to pycore_pymath.h.
* Replace PY_NO_SHORT_FLOAT_REPR macro with _PY_SHORT_FLOAT_REPR
macro which is always defined. gcc -Wundef emits a warning when
using _PY_SHORT_FLOAT_REPR but the macro is not defined, if
pycore_pymath.h include was forgotten.
Instead of manually enumerating the global strings in generate_global_objects.py, we extrapolate the list from usage of _Py_ID() and _Py_STR() in the source files.
This is partly inspired by gh-31261.
https://bugs.python.org/issue46541
Ensure strong references are acquired whenever using `set_next()`. Added randomized test cases for `__eq__` methods that sometimes mutate sets when called.
We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.
The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
The core of the change is in:
* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers
I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config.
The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
The following are not changed (yet):
* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init
https://bugs.python.org/issue46541
bytesobject.c, bytearrayobject.c and unicodeobject.c now define all
macros used by stringlib, to avoid using undefined macros.
Fix "gcc -Wundef" warnings.
Added new internal functions to compute mod without also computing the quotient.
The loops can be leaner then, which leads to modestly but reliably faster execution in contexts that know they don't need the quotient.
Code by Jeremiah Vivian (Pascual).
Remove the PyHeapType_GET_MEMBERS() macro. It was exposed in the
public C API by mistake, it must only be used by Python internally.
Use the PyTypeObject.tp_members member instead.
Rename PyHeapType_GET_MEMBERS() to _PyHeapType_GET_MEMBERS() and move
it to the internal C API.
Move _Py_GetAllocatedBlocks() and _PyObject_DebugMallocStats()
declarations to pycore_pymem.h. These functions are related to memory
allocators, not to the PyObject structure.
Convert the PyType_SUPPORTS_WEAKREFS() macro to a regular function.
It no longer access the PyTypeObject.tp_weaklistoffset member
directly.
Add _PyType_SUPPORTS_WEAKREFS() static inline functions, used
internally by Python for best performance.
* bpo-46504: faster code for trial quotient in x_divrem()
This brings x_divrem() back into synch with x_divrem1(), which was changed
in bpo-46406 to generate faster code to find machine-word division
quotients and remainders. Modern processors compute both with a single
machine instruction, but convincing C to exploit that requires writing
_less_ "clever" C code.
Add _PyUnicode_FiniTypes() function, called by
finalize_interp_types(). It clears these static types:
* EncodingMapType
* PyFieldNameIter_Type
* PyFormatterIter_Type
_PyStaticType_Dealloc() now does nothing if tp_subclasses
is not NULL.
Add 'static_exceptions' list to factorize code between
_PyExc_InitTypes() and _PyBuiltins_AddExceptions().
_PyExc_InitTypes() does nothing if it's not the main interpreter.
Sort exceptions in Lib/test/exception_hierarchy.txt.
* Add comment to recurse_down_subclasses() explaining why it's safe
to use a borrowed reference to tp_subclasses.
* remove_all_subclasses() no longer accept NULL cases
* type_set_bases() now relies on the fact that new_bases is not NULL.
* type_dealloc_common() avoids PyErr_Fetch/PyErr_Restore if tp_bases
is NULL.
* remove_all_subclasses() makes sure that no exception is raised.
* Don't test at runtime if tp_mro only contains types: rely on
_PyType_CAST() assertion for that.
* _PyStaticType_Dealloc() no longer clears tp_subclasses which is
already NULL.
* mro_hierarchy() avoids calling _PyType_GetSubclasses() if
tp_subclasses is NULL.
Coding style:
* Use Py_NewRef().
* Add braces and move variable declarations to the first variable
assignement.
* Rename a few variables and parameters to use better names.
* Move PyContext static types into object.c static_types list.
* Rename PyContextTokenMissing_Type to _PyContextTokenMissing_Type
and declare it in pycore_context.h.
* _PyHamtItems types are no long exported: replace PyAPI_DATA() with
extern.
The remove_subclass() function now deletes the dictionary when
removing the last subclass (if the dictionary becomes empty) to save
memory: set PyTypeObject.tp_subclasses to NULL. remove_subclass() is
called when a type is deallocated.
_PyType_GetSubclasses() no longer holds a reference to tp_subclasses:
its loop cannot modify tp_subclasses.
Fix a race condition on setting a type __bases__ attribute: the
internal function add_subclass() now gets the
PyTypeObject.tp_subclasses member after calling PyWeakref_NewRef()
which can trigger a garbage collection which can indirectly modify
PyTypeObject.tp_subclasses.
Add a new _PyType_GetSubclasses() function to get type's subclasses.
_PyType_GetSubclasses(type) returns a list which holds strong
refererences to subclasses. It is safer than iterating on
type->tp_subclasses which yields weak references and can be modified
in the loop.
_PyType_GetSubclasses(type) now holds a reference to the tp_subclasses
dict while creating the list of subclasses.
set_collection_flag_recursive() of _abc.c now uses
_PyType_GetSubclasses().
Add types removed by mistake by the commit adding
_PyTypes_FiniTypes().
Move also PyBool_Type at the end, since it depends on PyLong_Type.
PyBytes_Type and PyUnicode_Type no longer depend explicitly on
PyBaseObject_Type: it's the default of PyType_Ready().
Add _PyTypes_FiniTypes() best-effort function to clear static types:
don't deallocate a type if it still has subclasses.
remove_subclass() now sets tp_subclasses to NULL when removing the
last subclass.
The _curses module now creates its ncurses_version type as a heap
type using PyStructSequence_NewType(), rather than using a static
type.
* Move _PyStructSequence_FiniType() definition to pycore_structseq.h.
* test.pythoninfo: log curses.ncurses_version.
Add _PyStructSequence_FiniType() and _PyStaticType_Dealloc()
functions to finalize a structseq static type in Py_Finalize().
Currrently, these functions do nothing if Python is built in release
mode.
Clear static types:
* AsyncGenHooksType: sys.set_asyncgen_hooks()
* FlagsType: sys.flags
* FloatInfoType: sys.float_info
* Hash_InfoType: sys.hash_info
* Int_InfoType: sys.int_info
* ThreadInfoType: sys.thread_info
* UnraisableHookArgsType: sys.unraisablehook
* VersionInfoType: sys.version
* WindowsVersionType: sys.getwindowsversion()
* Add RETURN_GENERATOR and JUMP_NO_INTERRUPT opcodes.
* Trim frame and generator by word each.
* Minor refactor of frame.c
* Update test.test_sys to account for smaller frames.
* Treat generator functions as normal functions when evaluating and specializing.
The docstrings for MappingProxyType's keys(), values(), and items()
methods were never updated to reflect the changes that Python 3 brought
to these APIs, namely returning views rather than lists.
The empty bytes object (b'') and the 256 one-character bytes objects were allocated at runtime init. Now we statically allocate and initialize them.
https://bugs.python.org/issue45953
When multiplying lists and tuples by `n`, increment each element's refcount, by `n`, just once.
Saves `n-1` increments per element, and allows for a leaner & faster copying loop.
Code by sweeneyde (Dennis Sweeney).
This reverts commit ea251806b8.
Keep "assert(interned == NULL);" in _PyUnicode_Fini(), but only for
the main interpreter.
Keep _PyUnicode_ClearInterned() changes avoiding the creation of a
temporary Python list object.
x_mul()'s squaring code can do some redundant and/or useless
work at the end of each digit pass. A more careful analysis
of worst-case carries at various digit positions allows
making that code leaner.
* bpo-46218: Change long_pow() to sliding window algorithm
The primary motivation is to eliminate long_pow's reliance on that the number of bits in a long "digit" is a multiple of 5. Now it no longer cares how many bits are in a digit.
But the sliding window approach also allows cutting the precomputed table of small powers in half, which reduces initialization overhead enough that the approach pays off for smaller exponents too. Depending on exponent bit patterns, a sliding window may also be able to save some bigint multiplies (sometimes when at least 5 consecutive exponent bits are 0, regardless of their starting bit position modulo 5).
Note: boosting the window width to 6 didn't work well overall. It give marginal speed improvements for huge exponents, but the increased overhead (the small-power table needs twice as many entries) made it a loss for smaller exponents.
Co-authored-by: Oleg Iarygin <dralife@yandex.ru>
* Do not PUSH/POP traceback or type to the stack as part of exc_info
* Remove exc_traceback and exc_type from _PyErr_StackItem
* Add to what's new, because this change breaks things like Cython
The array of small PyLong objects has been statically declared. Here I also statically initialize them. Consequently they are no longer initialized dynamically during runtime init.
I've also moved them under a new sub-struct in _PyRuntimeState, in preparation for static allocation and initialization of other global objects.
https://bugs.python.org/issue45953
This change is strictly renames and moving code around. It helps in the following ways:
* ensures type-related init functions focus strictly on one of the three aspects (state, objects, types)
* passes in PyInterpreterState * to all those functions, simplifying work on moving types/objects/state to the interpreter
* consistent naming conventions help make what's going on more clear
* keeping API related to a type in the corresponding header file makes it more obvious where to look for it
https://bugs.python.org/issue46008
* Make generator, coroutine and async gen structs all the same size.
* Store interpreter frame in generator (and coroutine). Reduces the number of allocations neeeded for a generator from two to one.
Rename PyConfig.no_debug_ranges to PyConfig.code_debug_ranges and
invert the value.
Document -X no_debug_ranges and PYTHONNODEBUGRANGES env var in
PyConfig.code_debug_ranges documentation.