Not using `__class_getitem__()` fallback if there is a non-subcriptable metaclass was caused by a certain asymmetry between how `PySequenceMethods` and `PyMappingMethods` are used in `PyObject_GetItem`. This PR removes this asymmetry. No tests failed, so I assume it was not intentional.
Fix error messages for PySequence_Size(), PySequence_GetItem(),
PySequence_SetItem() and PySequence_DelItem() called with a mapping
and PyMapping_Size() called with a sequence.
During development of the limited API support for PySide,
we saw an error in a macro that accessed a type field.
This patch fixes the 7 errors in the Python headers.
Macros which were not written as capitals were implemented
as function.
To do the necessary analysis again, a script was included that
parses all headers and looks for "->tp_" in serctions which can
be reached with active limited API.
It is easily possible to call this script as a test.
Error listing:
../../Include/objimpl.h:243
#define PyObject_IS_GC(o) (PyType_IS_GC(Py_TYPE(o)) && \
(Py_TYPE(o)->tp_is_gc == NULL || Py_TYPE(o)->tp_is_gc(o)))
Action: commented only
../../Include/objimpl.h:362
#define PyType_SUPPORTS_WEAKREFS(t) ((t)->tp_weaklistoffset > 0)
Action: commented only
../../Include/objimpl.h:364
#define PyObject_GET_WEAKREFS_LISTPTR(o) \
((PyObject **) (((char *) (o)) + Py_TYPE(o)->tp_weaklistoffset))
Action: commented only
../../Include/pyerrors.h:143
#define PyExceptionClass_Name(x) \
((char *)(((PyTypeObject*)(x))->tp_name))
Action: implemented function
../../Include/abstract.h:593
#define PyIter_Check(obj) \
((obj)->ob_type->tp_iternext != NULL && \
(obj)->ob_type->tp_iternext != &_PyObject_NextNotImplemented)
Action: implemented function
../../Include/abstract.h:713
#define PyIndex_Check(obj) \
((obj)->ob_type->tp_as_number != NULL && \
(obj)->ob_type->tp_as_number->nb_index != NULL)
Action: implemented function
../../Include/abstract.h:924
#define PySequence_ITEM(o, i)\
( Py_TYPE(o)->tp_as_sequence->sq_item(o, i) )
Action: commented only
* Add Py_UNREACHABLE() as an alias to abort().
* Use Py_UNREACHABLE() instead of assert(0)
* Convert more unreachable code to use Py_UNREACHABLE()
* Document Py_UNREACHABLE() and a few other macros.
* group the (stateful) runtime globals into various topical structs
* consolidate the topical structs under a single top-level _PyRuntimeState struct
* add a check-c-globals.py script that helps identify runtime globals
Other globals are excluded (see globals.txt and check-c-globals.py).
* Move all functions to call objects in a new Objects/call.c file.
* Rename fast_function() to _PyFunction_FastCallKeywords().
* Copy null_error() from Objects/abstract.c
* Inline type_error() in call.c to not have to copy it, it was only
called once.
* Export _PyEval_EvalCodeWithName() since it is now called
from call.c.
* Move all functions to call objects in a new Objects/call.c file.
* Rename fast_function() to _PyFunction_FastCallKeywords().
* Copy null_error() from Objects/abstract.c
* Inline type_error() in call.c to not have to copy it, it was only
called once.
* Export _PyEval_EvalCodeWithName() since it is now called
from call.c.
Issue #29507: Optimize slots calling Python methods. For Python methods, get
the unbound Python function and prepend arguments with self, rather than
calling the descriptor which creates a temporary PyMethodObject.
Add a new _PyObject_FastCall_Prepend() function used to call the unbound Python
method with self. It avoids the creation of a temporary tuple to pass
positional arguments.
Avoiding temporary PyMethodObject and avoiding temporary tuple makes Python
slots up to 1.46x faster. Microbenchmark on a __getitem__() method implemented
in Python:
Median +- std dev: 121 ns +- 5 ns -> 82.8 ns +- 1.0 ns: 1.46x faster (-31%)
Co-Authored-by: INADA Naoki <songofacandy@gmail.com>
* *PyCFunction_*Call*() functions now call Py_EnterRecursiveCall().
* PyObject_Call() now calls directly _PyFunction_FastCallDict() and
PyCFunction_Call() to avoid calling Py_EnterRecursiveCall() twice per
function call
Issue #29234: Inlining _PyStack_AsTuple() into callers increases their stack
consumption, Disable inlining to optimize the stack consumption.
Add _Py_NO_INLINE: use __attribute__((noinline)) of GCC and Clang.
It reduces the stack consumption, bytes per call, before => after:
test_python_call: 1040 => 976 (-64 B)
test_python_getitem: 976 => 912 (-64 B)
test_python_iterator: 1120 => 1056 (-64 B)
=> total: 3136 => 2944 (- 192 B)
Issue #29233: Replace the inefficient _PyObject_VaCallFunctionObjArgs() with
_PyObject_FastCall() in call_method() and call_maybe().
Only a few functions call call_method() and call it with a fixed number of
arguments. Avoid the complex and expensive _PyObject_VaCallFunctionObjArgs()
function, replace it with an array allocated on the stack with the exact number
of argumlents.
It reduces the stack consumption, bytes per call, before => after:
test_python_call: 1168 => 1152 (-16 B)
test_python_getitem: 1344 => 1008 (-336 B)
test_python_iterator: 1568 => 1232 (-336 B)
Remove the _PyObject_VaCallFunctionObjArgs() function which became useless.
Rename it to object_vacall() and make it private.
Issue #28870: Add a new _PY_FASTCALL_SMALL_STACK constant, size of "small
stacks" allocated on the C stack to pass positional arguments to
_PyObject_FastCall().
_PyObject_Call_Prepend() now uses a small stack of 5 arguments (40 bytes)
instead of 8 (64 bytes), since it is modified to use _PY_FASTCALL_SMALL_STACK.
Issue #28915: Replace PyObject_CallFunction() with
PyObject_CallFunctionObjArgs() when the format string was only made of "O"
formats, PyObject* arguments.
PyObject_CallFunctionObjArgs() avoids the creation of a temporary tuple and
doesn't have to parse a format string.
Issue #28915: Use _Py_VaBuildStack() to build a C array of PyObject* and then
use _PyObject_FastCall().
The function has a special case if the stack only contains one parameter and
the parameter is a tuple: "unpack" the tuple of arguments in this case.