gcc -Wcast-qual turns up a number of instances of casting away constness of pointers. Some of these can be safely modified, by either:
Adding the const to the type cast, as in:
- return _PyUnicode_FromUCS1((unsigned char*)s, size);
+ return _PyUnicode_FromUCS1((const unsigned char*)s, size);
or, Removing the cast entirely, because it's not necessary (but probably was at one time), as in:
- PyDTrace_FUNCTION_ENTRY((char *)filename, (char *)funcname, lineno);
+ PyDTrace_FUNCTION_ENTRY(filename, funcname, lineno);
These changes will not change code, but they will make it much easier to check for errors in consts
The bulk of this patch was generated automatically with:
for name in \
PyObject_Vectorcall \
Py_TPFLAGS_HAVE_VECTORCALL \
PyObject_VectorcallMethod \
PyVectorcall_Function \
PyObject_CallOneArg \
PyObject_CallMethodNoArgs \
PyObject_CallMethodOneArg \
;
do
echo $name
git grep -lwz _$name | xargs -0 sed -i "s/\b_$name\b/$name/g"
done
old=_PyObject_FastCallDict
new=PyObject_VectorcallDict
git grep -lwz $old | xargs -0 sed -i "s/\b$old\b/$new/g"
and then cleaned up:
- Revert changes to in docs & news
- Revert changes to backcompat defines in headers
- Nudge misaligned comments
Replace a few Py_FatalError() calls if tstate is NULL with
assert(tstate != NULL) in ceval.c.
PyEval_AcquireThread(), PyEval_ReleaseThread() and
PyEval_RestoreThread() must never be called with a NULL tstate.
* Add DICT_UPDATE and DICT_MERGE bytecodes. Use them for ** unpacking.
* Remove BUILD_MAP_UNPACK and BUILD_MAP_UNPACK_WITH_CALL, as they are now unused.
* Update magic number for ** unpacking opcodes.
* Update dis.rst to incorporate new bytecodes.
* Add blurb entry.
* Add three new bytecodes: LIST_TO_TUPLE, LIST_EXTEND, SET_UPDATE. Use them to implement star unpacking expressions.
* Remove four bytecodes BUILD_LIST_UNPACK, BUILD_TUPLE_UNPACK, BUILD_SET_UNPACK and BUILD_TUPLE_UNPACK_WITH_CALL opcodes as they are now unused.
* Update magic number and dis.rst for new bytecodes.
* Reorder the __aenter__ and __aexit__ checks for async with
* Add assertions for async with body being skipped
* Swap __aexit__ and __aenter__ loading in the documentation
Break up COMPARE_OP into four logically distinct opcodes:
* COMPARE_OP for rich comparisons
* IS_OP for 'is' and 'is not' tests
* CONTAINS_OP for 'in' and 'is not' tests
* JUMP_IF_NOT_EXC_MATCH for checking exceptions in 'try-except' statements.
When producing the bytecode of exception handlers with name binding (like `except Exception as e`) we need to produce a try-finally block to make sure that the name is deleted after the handler is executed to prevent cycles in the stack frame objects. The bytecode associated with this try-finally block does not have source lines associated and it was causing problems when the tracing functionality was running over it.
Remove BEGIN_FINALLY, END_FINALLY, CALL_FINALLY and POP_FINALLY bytecodes. Implement finally blocks by code duplication.
Reimplement frame.lineno setter using line numbers rather than bytecode offsets.
Add PyInterpreterState.runtime field: reference to the _PyRuntime
global variable. This field exists to not have to pass runtime in
addition to tstate to a function. Get runtime from tstate:
tstate->interp->runtime.
Remove "_PyRuntimeState *runtime" parameter from functions already
taking a "PyThreadState *tstate" parameter.
_PyGC_Init() first parameter becomes "PyThreadState *tstate".
Additional note: the `method_check_args` function in `Objects/descrobject.c` is written in such a way that it applies to all kinds of descriptors. In particular, a future re-implementation of `wrapper_descriptor` could use that code.
CC @vstinner @encukou
https://bugs.python.org/issue37645
Automerge-Triggered-By: @encukou
* Add _Py_EnterRecursiveCall() and _Py_LeaveRecursiveCall() which
require a tstate argument.
* Pass tstate to _Py_MakeRecCheck() and _Py_CheckRecursiveCall().
* Convert Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() macros
to static inline functions.
_PyThreadState_GET() is the most efficient way to get the tstate, and
so using it with _Py_EnterRecursiveCall() and
_Py_LeaveRecursiveCall() should be a little bit more efficient than
using Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() which use
the "slower" PyThreadState_GET().
Provide Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() as
regular functions for the limited API. Previously, there were defined
as macros, but these macros didn't work with the limited API which
cannot access PyThreadState.recursion_depth field.
Remove _Py_CheckRecursionLimit from the stable ABI.
Add Include/cpython/ceval.h header file.
bpo-37151: remove special case for PyCFunction from PyObject_Call
Alse, make the undocumented function PyCFunction_Call an alias
of PyObject_Call and deprecate it.
The fact that keyword names are strings is now part of the vectorcall and `METH_FASTCALL` protocols. The biggest concrete change is that `_PyStack_UnpackDict` now checks that and raises `TypeError` if not.
CC @markshannon @vstinner
https://bugs.python.org/issue37540
* Replace global var Py_VerboseFlag with interp->config.verbose.
* Add _PyErr_NoMemory(tstate) function.
* Add tstate parameter to _PyEval_SetCoroutineOriginTrackingDepth()
and move the function to the internal API.
* Replace _PySys_InitMain(runtime, interp)
with _PySys_InitMain(runtime, tstate).
It has been documented as deprecated and to be removed in 3.8;
From a comment on another thread – which I can't find ; leave get_coro_wrapper() for now, but always return `None`.
https://bugs.python.org/issue36933
Remove the PyEval_ReInitThreads() function from the Python C API.
It should not be called explicitly: use PyOS_AfterFork_Child()
instead.
Rename PyEval_ReInitThreads() to _PyEval_ReInitThreads() and add a
'runtime' parameter.
Add "struct _ceval_runtime_state *ceval = &_PyRuntime.ceval;" local
variables to function to better highlight the dependency on the
global variable _PyRuntime and to point directly to _PyRuntime.ceval
field rather than on the larger _PyRuntime.
Changes:
* Add _PyRuntimeState_GetThreadState(runtime) macro.
* Add _PyEval_AddPendingCall(ceval, ...) and
_PyThreadState_Swap(gilstate, ...) functions.
* _PyThreadState_GET() macro now calls
_PyRuntimeState_GetThreadState() using &_PyRuntime.
* Add 'ceval' parameter to COMPUTE_EVAL_BREAKER(),
SIGNAL_PENDING_SIGNALS(), _PyEval_SignalAsyncExc(),
_PyEval_SignalReceived() and _PyEval_FiniThreads() macros and
functions.
* Add 'tstate' parameter to call_function(), do_call_core() and
do_raise().
* Add 'runtime' parameter to _Py_CURRENTLY_FINALIZING(),
_Py_FinishPendingCalls() and _PyThreadState_DeleteExcept()
macros and functions.
* Declare 'runtime', 'tstate', 'ceval' and 'eval_breaker' variables
as constant.
If a "=" is specified a the end of an f-string expression, the f-string will evaluate to the text of the expression, followed by '=', followed by the repr of the value of the expression.
This commit contains the implementation of PEP570: Python positional-only parameters.
* Update Grammar/Grammar with new typedarglist and varargslist
* Regenerate grammar files
* Update and regenerate AST related files
* Update code object
* Update marshal.c
* Update compiler and symtable
* Regenerate importlib files
* Update callable objects
* Implement positional-only args logic in ceval.c
* Regenerate frozen data
* Update standard library to account for positional-only args
* Add test file for positional-only args
* Update other test files to account for positional-only args
* Add News entry
* Update inspect module and related tests
* Add _PyEval_FiniThreads2(). _PyEval_FiniThreads() now only clears
the pending lock, whereas _PyEval_FiniThreads2() destroys the GIL.
* pymain_free() now calls _PyEval_FiniThreads2().
* Py_FinalizeEx() now calls _PyEval_FiniThreads().
PyEval_AcquireLock() and PyEval_AcquireThread() now
terminate the current thread if called while the interpreter is
finalizing, making them consistent with PyEval_RestoreThread(),
Py_END_ALLOW_THREADS, and PyGILState_Ensure().
Change PyAPI_FUNC(type), PyAPI_DATA(type) and PyMODINIT_FUNC macros
of pyport.h when Py_BUILD_CORE_MODULE is defined.
The Py_BUILD_CORE_MODULE define must be now be used to build a C
extension as a dynamic library accessing Python internals: export the
PyInit_xxx() function in DLL exports on Windows.
Changes:
* Py_BUILD_CORE_BUILTIN and Py_BUILD_CORE_MODULE now imply
Py_BUILD_CORE directy in pyport.h.
* ceval.c compilation now fails with an error if Py_BUILD_CORE is not
defined, just to ensure that Python is build with the correct
defines.
* setup.py now compiles _pickle.c with Py_BUILD_CORE_MODULE define.
* setup.py compiles _json.c with Py_BUILD_CORE_MODULE define, rather
than Py_BUILD_CORE_BUILTIN define
* PCbuild/pythoncore.vcxproj: Add Py_BUILD_CORE_BUILTIN define.
This is effectively an un-revert of #11617 and #12024 (reverted in #12159). Portions of those were merged in other PRs (with lower risk) and this represents the remainder. Note that I found 3 different bugs in the original PRs and have fixed them here.
The function has no return value.
Fix the following warning on Windows:
python\ceval.c(180): warning C4098: 'PyEval_InitThreads':
'void' function returning a value
* Revert "bpo-36097: Use only public C-API in the_xxsubinterpreters module (adding as necessary). (#12003)"
This reverts commit bcfa450f21.
* Revert "bpo-33608: Simplify ceval's DISPATCH by hoisting eval_breaker ahead of time. (gh-12062)"
This reverts commit bda918bf65.
* Revert "bpo-33608: Use _Py_AddPendingCall() in _PyCrossInterpreterData_Release(). (gh-12024)"
This reverts commit b05b711a2c.
* Revert "bpo-33608: Factor out a private, per-interpreter _Py_AddPendingCall(). (GH-11617)"
This reverts commit ef4ac967e2.
Ensure that the main interpreter is active (in the main thread) for signal-handling operations. This is increasingly relevant as people use subinterpreters more.
https://bugs.python.org/issue35724
This change separates the signal handling trigger in the eval loop from the "pending calls" machinery. There is no semantic change and the difference in performance is insignificant.
The change makes both components less confusing. It also eliminates the risk of changes to the pending calls affecting signal handling. This is particularly relevant for some upcoming pending calls changes I have in the works.
In _localemodule.c and selectmodule.c, remove dead code that would
cause double decrefs if run.
In addition, replace PyList_SetItem() with PyList_SET_ITEM() in cases
where a new list is populated and there is no possibility of an error.
In addition, check if the list changed size in the loop in array_array_fromlist().
* _PyTuple_ITEMS() gives access to the tuple->ob_item field and cast the
first argument to PyTupleObject*. This internal macro is only usable if
Py_BUILD_CORE is defined.
* Replace &PyTuple_GET_ITEM(ob, 0) with _PyTuple_ITEMS(ob).
* Replace PyTuple_GET_ITEM(op, 1) with &_PyTuple_ITEMS(ob)[1].
If Py_BUILD_CORE is defined, the PyThreadState_GET() macro access
_PyRuntime which comes from the internal pycore_state.h header.
Public headers must not require internal headers.
Move PyThreadState_GET() and _PyInterpreterState_GET_UNSAFE() from
Include/pystate.h to Include/internal/pycore_state.h, and rename
PyThreadState_GET() to _PyThreadState_GET() there.
The PyThreadState_GET() macro of pystate.h is now redefined when
pycore_state.h is included, to use the fast _PyThreadState_GET().
Changes:
* Add _PyThreadState_GET() macro
* Replace "PyThreadState_GET()->interp" with
_PyInterpreterState_GET_UNSAFE()
* Replace PyThreadState_GET() with _PyThreadState_GET() in internal C
files (compiled with Py_BUILD_CORE defined), but keep
PyThreadState_GET() in the public header files.
* _testcapimodule.c: replace PyThreadState_GET() with
PyThreadState_Get(); the module is not compiled with Py_BUILD_CORE
defined.
* pycore_state.h now requires Py_BUILD_CORE to be defined.
* Remove _PyThreadState_Current
* Replace GET_TSTATE() with PyThreadState_GET()
* Replace GET_INTERP_STATE() with _PyInterpreterState_GET_UNSAFE()
* Replace direct access to _PyThreadState_Current with
PyThreadState_GET()
* Replace _PyThreadState_Current with
_PyRuntime.gilstate.tstate_current
* Rename SET_TSTATE() to _PyThreadState_SET(), name more
consistent with _PyThreadState_GET()
* Update outdated comments
`list.append([], None)` was profiled but `list.append([], None, **{})` was not profiled.
Enable profiling for later case.
https://bugs.python.org/issue34125
This will prevent emitting a resource warning when the execution was
interrupted by Ctrl-C between calling open() and entering a 'with' block
in "with open()".
* Added new opcode END_ASYNC_FOR.
* Setting global StopAsyncIteration no longer breaks "async for" loops.
* Jumping into an "async for" loop is now disabled.
* Jumping out of an "async for" loop no longer corrupts the stack.
* Simplify the compiler.
* Add coro.cr_origin and sys.set_coroutine_origin_tracking_depth
* Use coroutine origin information in the unawaited coroutine warning
* Stop using set_coroutine_wrapper in asyncio debug mode
* In BaseEventLoop.set_debug, enable debugging in the correct thread
This kludge is from 1992. Any C99 compiler is going to be able to handle the
ceval dispatch switch.
Anyway, we have much bigger switches than the ceval dispatch one around. (See,
e.g., Objects/unicodetype_db.h.)
The concrete PyDict_* API is used to interact with PyInterpreterState.modules in a number of places. This isn't compatible with all dict subclasses, nor with other Mapping implementations. This patch switches the concrete API usage to the corresponding abstract API calls.
We also add a PyImport_GetModule() function (and some other helpers) to reduce a bunch of code duplication.
* Add Py_UNREACHABLE() as an alias to abort().
* Use Py_UNREACHABLE() instead of assert(0)
* Convert more unreachable code to use Py_UNREACHABLE()
* Document Py_UNREACHABLE() and a few other macros.
PR #1638, for bpo-28411, causes problems in some (very) edge cases. Until that gets sorted out, we're reverting the merge. PR #3506, a fix on top of #1638, is also getting reverted.
* group the (stateful) runtime globals into various topical structs
* consolidate the topical structs under a single top-level _PyRuntimeState struct
* add a check-c-globals.py script that helps identify runtime globals
Other globals are excluded (see globals.txt and check-c-globals.py).
f_trace_lines: enable/disable line trace events
f_trace_opcodes: enable/disable opcode trace events
These are intended primarily for testing of the interpreter
itself, as they make it much easier to emulate signals
arriving at unfortunate times.
* group the (stateful) runtime globals into various topical structs
* consolidate the topical structs under a single top-level _PyRuntimeState struct
* add a check-c-globals.py script that helps identify runtime globals
Other globals are excluded (see globals.txt and check-c-globals.py).
* Improve signal delivery
Avoid using Py_AddPendingCall from signal handler, to avoid calling signal-unsafe functions.
* Remove unused function
* Improve comments
* Add stress test
* Adapt for --without-threads
* Add second stress test
* Add NEWS blurb
* Address comments @haypo
If we have a chain of generators/coroutines that are 'yield from'ing
each other, then resuming the stack works like:
- call send() on the outermost generator
- this enters _PyEval_EvalFrameDefault, which re-executes the
YIELD_FROM opcode
- which calls send() on the next generator
- which enters _PyEval_EvalFrameDefault, which re-executes the
YIELD_FROM opcode
- ...etc.
However, every time we enter _PyEval_EvalFrameDefault, the first thing
we do is to check for pending signals, and if there are any then we
run the signal handler. And if it raises an exception, then we
immediately propagate that exception *instead* of starting to execute
bytecode. This means that e.g. a SIGINT at the wrong moment can "break
the chain" – it can be raised in the middle of our yield from chain,
with the bottom part of the stack abandoned for the garbage collector.
The fix is pretty simple: there's already a special case in
_PyEval_EvalFrameEx where it skips running signal handlers if the next
opcode is SETUP_FINALLY. (I don't see how this accomplishes anything
useful, but that's another story.) If we extend this check to also
skip running signal handlers when the next opcode is YIELD_FROM, then
that closes the hole – now the exception can only be raised at the
innermost stack frame.
This shouldn't have any performance implications, because the opcode
check happens inside the "slow path" after we've already determined
that there's a pending signal or something similar for us to process;
the vast majority of the time this isn't true and the new check
doesn't run at all.
* bpo-6532: Make the thread id an unsigned integer.
From C API side the type of results of PyThread_start_new_thread() and
PyThread_get_thread_ident(), the id parameter of
PyThreadState_SetAsyncExc(), and the thread_id field of PyThreadState
changed from "long" to "unsigned long".
* Restore a check in thread_get_ident().
When LOAD_METHOD is used for calling C mehtod, PyMethodDescrObject
was passed to profilefunc from 5566bbb.
But lsprof traces only PyCFunctionObject. Additionally, there can be
some third party extension which assumes passed arg is
PyCFunctionObject without calling PyCFunction_Check().
So make PyCFunctionObject from PyMethodDescrObject when
tstate->c_profilefunc is set.
When you use `'%s' % SubClassOfStr()`, where `SubClassOfStr.__rmod__` exists, the reverse operation is ignored as normally such string formatting operations use the `PyUnicode_Format()` fast path. This patch tests for subclasses of `str` first and picks the slow path in that case.
Patch by Martijn Pieters.
* Move all functions to call objects in a new Objects/call.c file.
* Rename fast_function() to _PyFunction_FastCallKeywords().
* Copy null_error() from Objects/abstract.c
* Inline type_error() in call.c to not have to copy it, it was only
called once.
* Export _PyEval_EvalCodeWithName() since it is now called
from call.c.
* Move all functions to call objects in a new Objects/call.c file.
* Rename fast_function() to _PyFunction_FastCallKeywords().
* Copy null_error() from Objects/abstract.c
* Inline type_error() in call.c to not have to copy it, it was only
called once.
* Export _PyEval_EvalCodeWithName() since it is now called
from call.c.
Issue #29227: Inline call_function() into _PyEval_EvalFrameDefault() using
Py_LOCAL_INLINE to reduce the stack consumption.
It reduces the stack consumption, bytes per call, before => after:
test_python_call: 1152 => 1040 (-112 B)
test_python_getitem: 1008 => 976 (-32 B)
test_python_iterator: 1232 => 1120 (-112 B)
=> total: 3392 => 3136 (- 256 B)
Special thanks to INADA Naoki for pushing the patch through
the last mile, Serhiy Storchaka for reviewing the code, and to
Victor Stinner for suggesting the idea (originally implemented
in the PyPy project).
The PEP 523 modified PyEval_EvalFrameEx(): it's now an indirection to
interp->eval_frame().
Inline the call in performance critical code. Leave PyEval_EvalFrame()
unchanged, this function is only kept for backward compatibility.
Issue #28838: Rename parameters of the "calls" functions of the Python C API.
* Rename 'callable_object' and 'func' to 'callable': any Python callable object
is accepted, not only Python functions
* Rename 'method' and 'nameid' to 'name' (method name)
* Rename 'o' to 'obj'
* Move, fix and update documentation of PyObject_CallXXX() functions
in abstract.h
* Update also the documentaton of the C API (update parameter names)
Issue #28858: The change b9c9691c72c5 introduced a regression. It seems like
_PyObject_CallArg1() uses more stack memory than
PyObject_CallFunctionObjArgs().
* PyObject_CallFunctionObjArgs(func, NULL) => _PyObject_CallNoArg(func)
* PyObject_CallFunctionObjArgs(func, arg, NULL) => _PyObject_CallArg1(func, arg)
PyObject_CallFunctionObjArgs() allocates 40 bytes on the C stack and requires
extra work to "parse" C arguments to build a C array of PyObject*.
_PyObject_CallNoArg() and _PyObject_CallArg1() are simpler and don't allocate
memory on the C stack.
This change is part of the fastcall project. The change on listsort() is
related to the issue #23507.
Issue #28799:
* Remove the PyEval_GetCallStats() function.
* Deprecate the untested and undocumented sys.callstats() function.
* Remove the CALL_PROFILE special build
Use the sys.setprofile() function, cProfile or profile module to profile
function calls.
Issue #28782: Fix a bug in the implementation ``yield from`` when checking
if the next instruction is YIELD_FROM. Regression introduced by WORDCODE
(issue #26647).
Reviewed by Serhiy Storchaka and Yury Selivanov.
When Python is not compiled with PGO, the performance of Python on call_simple
and call_method microbenchmarks depend highly on the code placement. In the
worst case, the performance slowdown can be up to 70%.
The GCC __attribute__((hot)) attribute helps to keep hot code close to reduce
the risk of such major slowdown. This attribute is ignored when Python is
compiled with PGO.
The following functions are considered as hot according to statistics collected
by perf record/perf report:
* _PyEval_EvalFrameDefault()
* call_function()
* _PyFunction_FastCall()
* PyFrame_New()
* frame_dealloc()
* PyErr_Occurred()