The bulk of this patch was generated automatically with:
for name in \
PyObject_Vectorcall \
Py_TPFLAGS_HAVE_VECTORCALL \
PyObject_VectorcallMethod \
PyVectorcall_Function \
PyObject_CallOneArg \
PyObject_CallMethodNoArgs \
PyObject_CallMethodOneArg \
;
do
echo $name
git grep -lwz _$name | xargs -0 sed -i "s/\b_$name\b/$name/g"
done
old=_PyObject_FastCallDict
new=PyObject_VectorcallDict
git grep -lwz $old | xargs -0 sed -i "s/\b$old\b/$new/g"
and then cleaned up:
- Revert changes to in docs & news
- Revert changes to backcompat defines in headers
- Nudge misaligned comments
Restore PyObject_IsInstance() comment explaining why only tuples of
types are accepted, but not general sequence. Comment written by
Guido van Rossum in commit 03290ecbf1
which implements isinstance(x, (A, B, ...)). The comment was lost in
a PyObject_IsInstance() optimization:
commit ec569b7947.
Cleanup also the code. recursive_isinstance() is no longer recursive,
so rename it to object_isinstance(), whereas object_isinstance() is
recursive and so rename it to object_recursive_isinstance().
* Add _Py_EnterRecursiveCall() and _Py_LeaveRecursiveCall() which
require a tstate argument.
* Pass tstate to _Py_MakeRecCheck() and _Py_CheckRecursiveCall().
* Convert Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() macros
to static inline functions.
_PyThreadState_GET() is the most efficient way to get the tstate, and
so using it with _Py_EnterRecursiveCall() and
_Py_LeaveRecursiveCall() should be a little bit more efficient than
using Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() which use
the "slower" PyThreadState_GET().
Not using `__class_getitem__()` fallback if there is a non-subcriptable metaclass was caused by a certain asymmetry between how `PySequenceMethods` and `PyMappingMethods` are used in `PyObject_GetItem`. This PR removes this asymmetry. No tests failed, so I assume it was not intentional.
Fix error messages for PySequence_Size(), PySequence_GetItem(),
PySequence_SetItem() and PySequence_DelItem() called with a mapping
and PyMapping_Size() called with a sequence.
During development of the limited API support for PySide,
we saw an error in a macro that accessed a type field.
This patch fixes the 7 errors in the Python headers.
Macros which were not written as capitals were implemented
as function.
To do the necessary analysis again, a script was included that
parses all headers and looks for "->tp_" in serctions which can
be reached with active limited API.
It is easily possible to call this script as a test.
Error listing:
../../Include/objimpl.h:243
#define PyObject_IS_GC(o) (PyType_IS_GC(Py_TYPE(o)) && \
(Py_TYPE(o)->tp_is_gc == NULL || Py_TYPE(o)->tp_is_gc(o)))
Action: commented only
../../Include/objimpl.h:362
#define PyType_SUPPORTS_WEAKREFS(t) ((t)->tp_weaklistoffset > 0)
Action: commented only
../../Include/objimpl.h:364
#define PyObject_GET_WEAKREFS_LISTPTR(o) \
((PyObject **) (((char *) (o)) + Py_TYPE(o)->tp_weaklistoffset))
Action: commented only
../../Include/pyerrors.h:143
#define PyExceptionClass_Name(x) \
((char *)(((PyTypeObject*)(x))->tp_name))
Action: implemented function
../../Include/abstract.h:593
#define PyIter_Check(obj) \
((obj)->ob_type->tp_iternext != NULL && \
(obj)->ob_type->tp_iternext != &_PyObject_NextNotImplemented)
Action: implemented function
../../Include/abstract.h:713
#define PyIndex_Check(obj) \
((obj)->ob_type->tp_as_number != NULL && \
(obj)->ob_type->tp_as_number->nb_index != NULL)
Action: implemented function
../../Include/abstract.h:924
#define PySequence_ITEM(o, i)\
( Py_TYPE(o)->tp_as_sequence->sq_item(o, i) )
Action: commented only
* Add Py_UNREACHABLE() as an alias to abort().
* Use Py_UNREACHABLE() instead of assert(0)
* Convert more unreachable code to use Py_UNREACHABLE()
* Document Py_UNREACHABLE() and a few other macros.
* group the (stateful) runtime globals into various topical structs
* consolidate the topical structs under a single top-level _PyRuntimeState struct
* add a check-c-globals.py script that helps identify runtime globals
Other globals are excluded (see globals.txt and check-c-globals.py).
* Move all functions to call objects in a new Objects/call.c file.
* Rename fast_function() to _PyFunction_FastCallKeywords().
* Copy null_error() from Objects/abstract.c
* Inline type_error() in call.c to not have to copy it, it was only
called once.
* Export _PyEval_EvalCodeWithName() since it is now called
from call.c.
* Move all functions to call objects in a new Objects/call.c file.
* Rename fast_function() to _PyFunction_FastCallKeywords().
* Copy null_error() from Objects/abstract.c
* Inline type_error() in call.c to not have to copy it, it was only
called once.
* Export _PyEval_EvalCodeWithName() since it is now called
from call.c.
Issue #29507: Optimize slots calling Python methods. For Python methods, get
the unbound Python function and prepend arguments with self, rather than
calling the descriptor which creates a temporary PyMethodObject.
Add a new _PyObject_FastCall_Prepend() function used to call the unbound Python
method with self. It avoids the creation of a temporary tuple to pass
positional arguments.
Avoiding temporary PyMethodObject and avoiding temporary tuple makes Python
slots up to 1.46x faster. Microbenchmark on a __getitem__() method implemented
in Python:
Median +- std dev: 121 ns +- 5 ns -> 82.8 ns +- 1.0 ns: 1.46x faster (-31%)
Co-Authored-by: INADA Naoki <songofacandy@gmail.com>
* *PyCFunction_*Call*() functions now call Py_EnterRecursiveCall().
* PyObject_Call() now calls directly _PyFunction_FastCallDict() and
PyCFunction_Call() to avoid calling Py_EnterRecursiveCall() twice per
function call
Issue #29234: Inlining _PyStack_AsTuple() into callers increases their stack
consumption, Disable inlining to optimize the stack consumption.
Add _Py_NO_INLINE: use __attribute__((noinline)) of GCC and Clang.
It reduces the stack consumption, bytes per call, before => after:
test_python_call: 1040 => 976 (-64 B)
test_python_getitem: 976 => 912 (-64 B)
test_python_iterator: 1120 => 1056 (-64 B)
=> total: 3136 => 2944 (- 192 B)
Issue #29233: Replace the inefficient _PyObject_VaCallFunctionObjArgs() with
_PyObject_FastCall() in call_method() and call_maybe().
Only a few functions call call_method() and call it with a fixed number of
arguments. Avoid the complex and expensive _PyObject_VaCallFunctionObjArgs()
function, replace it with an array allocated on the stack with the exact number
of argumlents.
It reduces the stack consumption, bytes per call, before => after:
test_python_call: 1168 => 1152 (-16 B)
test_python_getitem: 1344 => 1008 (-336 B)
test_python_iterator: 1568 => 1232 (-336 B)
Remove the _PyObject_VaCallFunctionObjArgs() function which became useless.
Rename it to object_vacall() and make it private.
Issue #28870: Add a new _PY_FASTCALL_SMALL_STACK constant, size of "small
stacks" allocated on the C stack to pass positional arguments to
_PyObject_FastCall().
_PyObject_Call_Prepend() now uses a small stack of 5 arguments (40 bytes)
instead of 8 (64 bytes), since it is modified to use _PY_FASTCALL_SMALL_STACK.
Issue #28915: Replace PyObject_CallFunction() with
PyObject_CallFunctionObjArgs() when the format string was only made of "O"
formats, PyObject* arguments.
PyObject_CallFunctionObjArgs() avoids the creation of a temporary tuple and
doesn't have to parse a format string.
Issue #28915: Use _Py_VaBuildStack() to build a C array of PyObject* and then
use _PyObject_FastCall().
The function has a special case if the stack only contains one parameter and
the parameter is a tuple: "unpack" the tuple of arguments in this case.
Issue #28838: Rename parameters of the "calls" functions of the Python C API.
* Rename 'callable_object' and 'func' to 'callable': any Python callable object
is accepted, not only Python functions
* Rename 'method' and 'nameid' to 'name' (method name)
* Rename 'o' to 'obj'
* Move, fix and update documentation of PyObject_CallXXX() functions
in abstract.h
* Update also the documentaton of the C API (update parameter names)
Replace
_PyObject_CallArg1(func, arg)
with
PyObject_CallFunctionObjArgs(func, arg, NULL)
Using the _PyObject_CallArg1() macro increases the usage of the C stack, which
was unexpected and unwanted. PyObject_CallFunctionObjArgs() doesn't have this
issue.
Issue #28858: The change b9c9691c72c5 introduced a regression. It seems like
_PyObject_CallArg1() uses more stack memory than
PyObject_CallFunctionObjArgs().
* PyObject_CallFunctionObjArgs(func, NULL) => _PyObject_CallNoArg(func)
* PyObject_CallFunctionObjArgs(func, arg, NULL) => _PyObject_CallArg1(func, arg)
PyObject_CallFunctionObjArgs() allocates 40 bytes on the C stack and requires
extra work to "parse" C arguments to build a C array of PyObject*.
_PyObject_CallNoArg() and _PyObject_CallArg1() are simpler and don't allocate
memory on the C stack.
This change is part of the fastcall project. The change on listsort() is
related to the issue #23507.
* Callable object: callable, o, callable_object => func
* Object for method calls: o => obj
* Method name: name or nameid => method
Cleanup also the C code:
* Don't initialize variables to NULL if they are not used before their first
assignement
* Add braces for readability
new exception with setting current exception as __cause__.
_PyErr_FormatFromCause(exception, format, args...) is equivalent to Python
raise exception(format % args) from sys.exc_info()[1]
* BUILD_TUPLE_UNPACK and BUILD_MAP_UNPACK_WITH_CALL no longer generated with
single tuple or dict.
* Restored more informative error messages for incorrect var-positional and
var-keyword arguments.
* Removed code duplications in _PyEval_EvalCodeWithName().
* Removed redundant runtime checks and parameters in _PyStack_AsDict().
* Added a workaround and enabled previously disabled test in test_traceback.
* Removed dead code from the dis module.
Issue #27810: Add a new calling convention for C functions:
PyObject* func(PyObject *self, PyObject **args,
Py_ssize_t nargs, PyObject *kwnames);
Where args is a C array of positional arguments followed by values of keyword
arguments. nargs is the number of positional arguments, kwnames are keys of
keyword arguments. kwnames can be NULL.
Issue #27830: Add _PyObject_FastCallKeywords(): avoid the creation of a
temporary dictionary for keyword arguments.
Other changes:
* Cleanup call_function() and fast_function() (ex: rename nk to nkwargs)
* Remove now useless do_call(), replaced with _PyObject_FastCallKeywords()
Issue #27841: Add _PyObject_Call_Prepend() helper function to prepend an
argument to existing arguments to call a function. This helper uses fast calls.
Modify method_call() and slot_tp_new() to use _PyObject_Call_Prepend().
Issue #27830: Similar to _PyObject_FastCallDict(), but keyword arguments are
also passed in the same C array than positional arguments, rather than being
passed as a Python dict.
Issue #27809:
* PyObject_CallMethodObjArgs(), _PyObject_CallMethodIdObjArgs() and
PyObject_CallFunctionObjArgs() now use fast call to avoid the creation of a
temporary tuple
* Rename objargs_mktuple() to objargs_mkstack()
* objargs_mkstack() now stores objects in a C array using borrowed references,
instead of storing arguments into a tuple
objargs_mkstack() uses a small buffer allocated on the C stack for 5 arguments
or less, or allocates a buffer in the heap memory.
Note: this change is different than the change 0e4f26083bbb, I fixed the test
to decide if the small stack can be used or not. sizeof(PyObject**) was also
replaced with sizeof(stack[0]) since the sizeof() was wrong (but gave the same
result).
Issue #27809:
* PyObject_CallMethodObjArgs(), _PyObject_CallMethodIdObjArgs() and
PyObject_CallFunctionObjArgs() now use fast call to avoid the creation of a
temporary tuple
* Rename objargs_mktuple() to objargs_mkstack()
* objargs_mkstack() now stores objects in a C array using borrowed references,
instead of storing arguments into a tuple
objargs_mkstack() uses a small buffer allocated on the C stack for 5 arguments
or less, or allocates a buffer in the heap memory.
Issue #27128, PyObject_CallFunction(), _PyObject_FastCall() and callmethod():
if the format string of parameters is empty, avoid the creation of an empty
tuple: call _PyObject_FastCall() without parameters.