Issue #29227: Inline call_function() into _PyEval_EvalFrameDefault() using
Py_LOCAL_INLINE to reduce the stack consumption.
It reduces the stack consumption, bytes per call, before => after:
test_python_call: 1152 => 1040 (-112 B)
test_python_getitem: 1008 => 976 (-32 B)
test_python_iterator: 1232 => 1120 (-112 B)
=> total: 3392 => 3136 (- 256 B)
Copy and then adapt Python/random.c from default branch. Difference between 3.5
and default branches:
* Python 3.5 only uses getrandom() in non-blocking mode: flags=GRND_NONBLOCK
* If getrandom() fails with EAGAIN: py_getrandom() immediately fails and
remembers that getrandom() doesn't work.
* Python 3.5 has no _PyOS_URandomNonblock() function: _PyOS_URandom()
works in non-blocking mode on Python 3.5
* dev_urandom() now calls py_getentropy(). Prepare the fallback to support
getentropy() failure and falls back on reading from /dev/urandom.
* Simplify dev_urandom(). pyurandom() is now responsible to call getentropy()
or getrandom(). Enhance also dev_urandom() and pyurandom() documentation.
* getrandom() is now preferred over getentropy(). The glibc 2.24 now implements
getentropy() on Linux using the getrandom() syscall. But getentropy()
doesn't support non-blocking mode. Since getrandom() is tried first, it's not
more needed to explicitly exclude getentropy() on Solaris. Replace:
"if defined(HAVE_GETENTROPY) && !defined(sun)"
with "if defined(HAVE_GETENTROPY)"
* Enhance py_getrandom() documentation. py_getentropy() now supports ENOSYS,
EPERM & EINTR
The glibc now implements getentropy() on Linux using the getrandom() syscall.
But getentropy() doesn't support non-blocking mode.
Since getrandom() is tried first, it's not more needed to explicitly exclude
getentropy() on Solaris. Replace:
if defined(HAVE_GETENTROPY) && !defined(sun)
with
if defined(HAVE_GETENTROPY)
Issue #28870: Add a new _PY_FASTCALL_SMALL_STACK constant, size of "small
stacks" allocated on the C stack to pass positional arguments to
_PyObject_FastCall().
_PyObject_Call_Prepend() now uses a small stack of 5 arguments (40 bytes)
instead of 8 (64 bytes), since it is modified to use _PY_FASTCALL_SMALL_STACK.
Special thanks to INADA Naoki for pushing the patch through
the last mile, Serhiy Storchaka for reviewing the code, and to
Victor Stinner for suggesting the idea (originally implemented
in the PyPy project).
The PEP 523 modified PyEval_EvalFrameEx(): it's now an indirection to
interp->eval_frame().
Inline the call in performance critical code. Leave PyEval_EvalFrame()
unchanged, this function is only kept for backward compatibility.
Issue #28915: Replace _PyObject_CallMethodId() with
_PyObject_CallMethodIdObjArgs() in various modules when the format string was
only made of "O" formats, PyObject* arguments.
_PyObject_CallMethodIdObjArgs() avoids the creation of a temporary tuple and
doesn't have to parse a format string.
Issue #28915: Replace _PyObject_CallMethodId() with
_PyObject_CallMethodIdObjArgs() when the format string only use the format 'O'
for objects, like "(O)".
_PyObject_CallMethodIdObjArgs() avoids the code to parse a format string and
avoids the creation of a temporary tuple.
Issue #28915: Py_ssize_t type is better for indexes. The compiler might emit
more efficient code for i++. Py_ssize_t is the type of a PyTuple index for
example.
Replace also "int endchar" with "char endchar".
Issue #28838: Rename parameters of the "calls" functions of the Python C API.
* Rename 'callable_object' and 'func' to 'callable': any Python callable object
is accepted, not only Python functions
* Rename 'method' and 'nameid' to 'name' (method name)
* Rename 'o' to 'obj'
* Move, fix and update documentation of PyObject_CallXXX() functions
in abstract.h
* Update also the documentaton of the C API (update parameter names)
Replace
_PyObject_CallArg1(func, arg)
with
PyObject_CallFunctionObjArgs(func, arg, NULL)
Using the _PyObject_CallArg1() macro increases the usage of the C stack, which
was unexpected and unwanted. PyObject_CallFunctionObjArgs() doesn't have this
issue.
Handling zero-argument super() in __init_subclass__ and
__set_name__ involved moving __class__ initialisation to
type.__new__. This requires cooperation from custom
metaclasses to ensure that the new __classcell__ entry
is passed along appropriately.
The initial implementation of that change resulted in abruptly
broken zero-argument super() support in metaclasses that didn't
adhere to the new requirements (such as Django's metaclass for
Model definitions).
The updated approach adopted here instead emits a deprecation
warning for those cases, and makes them work the same way they
did in Python 3.5.
This patch also improves the related class machinery documentation
to cover these details and to include more reader-friendly
cross-references and index entries.
Issue #28858: The change b9c9691c72c5 introduced a regression. It seems like
_PyObject_CallArg1() uses more stack memory than
PyObject_CallFunctionObjArgs().
Replace
PyObject_CallFunction(func, "O", arg)
and
PyObject_CallFunction(func, "O", arg, NULL)
with
_PyObject_CallArg1(func, arg)
Replace
PyObject_CallFunction(func, NULL)
with
_PyObject_CallNoArg(func)
_PyObject_CallNoArg() and _PyObject_CallArg1() are simpler and don't allocate
memory on the C stack.
* PyObject_CallFunctionObjArgs(func, NULL) => _PyObject_CallNoArg(func)
* PyObject_CallFunctionObjArgs(func, arg, NULL) => _PyObject_CallArg1(func, arg)
PyObject_CallFunctionObjArgs() allocates 40 bytes on the C stack and requires
extra work to "parse" C arguments to build a C array of PyObject*.
_PyObject_CallNoArg() and _PyObject_CallArg1() are simpler and don't allocate
memory on the C stack.
This change is part of the fastcall project. The change on listsort() is
related to the issue #23507.
Issue #28799:
* Remove the PyEval_GetCallStats() function.
* Deprecate the untested and undocumented sys.callstats() function.
* Remove the CALL_PROFILE special build
Use the sys.setprofile() function, cProfile or profile module to profile
function calls.
Issue #28782: Fix a bug in the implementation ``yield from`` when checking
if the next instruction is YIELD_FROM. Regression introduced by WORDCODE
(issue #26647).
Reviewed by Serhiy Storchaka and Yury Selivanov.
Issue #28691: Fix warn_invalid_escape_sequence(): handle correctly
DeprecationWarning raised as an exception. First clear the current exception to
replace the DeprecationWarning exception with a SyntaxError exception.
Unit test written by Serhiy Storchaka.
When Python is not compiled with PGO, the performance of Python on call_simple
and call_method microbenchmarks depend highly on the code placement. In the
worst case, the performance slowdown can be up to 70%.
The GCC __attribute__((hot)) attribute helps to keep hot code close to reduce
the risk of such major slowdown. This attribute is ignored when Python is
compiled with PGO.
The following functions are considered as hot according to statistics collected
by perf record/perf report:
* _PyEval_EvalFrameDefault()
* call_function()
* _PyFunction_FastCall()
* PyFrame_New()
* frame_dealloc()
* PyErr_Occurred()
new exception with setting current exception as __cause__.
_PyErr_FormatFromCause(exception, format, args...) is equivalent to Python
raise exception(format % args) from sys.exc_info()[1]
new exception with setting current exception as __cause__.
_PyErr_FormatFromCause(exception, format, args...) is equivalent to Python
raise exception(format % args) from sys.exc_info()[1]
* BUILD_TUPLE_UNPACK and BUILD_MAP_UNPACK_WITH_CALL no longer generated with
single tuple or dict.
* Restored more informative error messages for incorrect var-positional and
var-keyword arguments.
* Removed code duplications in _PyEval_EvalCodeWithName().
* Removed redundant runtime checks and parameters in _PyStack_AsDict().
* Added a workaround and enabled previously disabled test in test_traceback.
* Removed dead code from the dis module.
The __class__ cell used by zero-argument super() is now initialized
from type.__new__ rather than __build_class__, so class methods
relying on that will now work correctly when called from metaclass
methods during class creation.
Patch by Martin Teichmann.
Issue #27810:
* Modify vgetargskeywordsfast() to work on a C array of PyObject* rather than
working on a tuple directly.
* Add _PyArg_ParseStack()
* Argument Clinic now emits code using the new METH_FASTCALL calling convention
Issue #27810: Add a new calling convention for C functions:
PyObject* func(PyObject *self, PyObject **args,
Py_ssize_t nargs, PyObject *kwnames);
Where args is a C array of positional arguments followed by values of keyword
arguments. nargs is the number of positional arguments, kwnames are keys of
keyword arguments. kwnames can be NULL.
Tested on macOS 10.11 dtrace, Ubuntu 16.04 SystemTap, and libbcc.
Largely based by an initial patch by Jesús Cea Avión, with some
influence from Dave Malcolm's SystemTap patch and Nikhil Benesch's
unification patch.
Things deliberately left out for simplicity:
- ustack helpers, I have no way of testing them at this point since
they are Solaris-specific
- PyFrameObject * in function__entry/function__return, this is
SystemTap-specific
- SPARC support
- dynamic tracing
- sys module dtrace facility introspection
All of those might be added later.
Issue #27830: Add _PyObject_FastCallKeywords(): avoid the creation of a
temporary dictionary for keyword arguments.
Other changes:
* Cleanup call_function() and fast_function() (ex: rename nk to nkwargs)
* Remove now useless do_call(), replaced with _PyObject_FastCallKeywords()
Issue #27213: Rework CALL_FUNCTION* opcodes to produce shorter and more
efficient bytecode:
* CALL_FUNCTION now only accepts position arguments
* CALL_FUNCTION_KW accepts position arguments and keyword arguments, but keys
of keyword arguments are packed into a constant tuple.
* CALL_FUNCTION_EX is the most generic, it expects a tuple and a dict for
positional and keyword arguments.
CALL_FUNCTION_VAR and CALL_FUNCTION_VAR_KW opcodes have been removed.
2 tests of test_traceback are currently broken: skip test, the issue #28050 was
created to track the issue.
Patch by Demur Rumed, design by Serhiy Storchaka, reviewed by Serhiy Storchaka
and Victor Stinner.
symtable_analyze() calls analyze_block() with bound=NULL. Theoretically
that NULL can be passed down to update_symbols(). update_symbols() may
deference NULL and pass it to PySet_Contains()
Issue #27776: The os.urandom() function does now block on Linux 3.17 and newer
until the system urandom entropy pool is initialized to increase the security.
This change is part of the PEP 524.
Don't pass "()" format to PyObject_CallXXX() to call a function without
argument: pass NULL as the format string instead. It avoids to have to parse a
string to produce 0 argument.
Issue #27830: Similar to _PyObject_FastCallDict(), but keyword arguments are
also passed in the same C array than positional arguments, rather than being
passed as a Python dict.
Issue #27809: PyEval_CallObjectWithKeywords() doesn't increment temporary the
reference counter of the args tuple (positional arguments). The caller already
holds a strong reference to it.