If Py_BUILD_CORE is defined, the PyThreadState_GET() macro access
_PyRuntime which comes from the internal pycore_state.h header.
Public headers must not require internal headers.
Move PyThreadState_GET() and _PyInterpreterState_GET_UNSAFE() from
Include/pystate.h to Include/internal/pycore_state.h, and rename
PyThreadState_GET() to _PyThreadState_GET() there.
The PyThreadState_GET() macro of pystate.h is now redefined when
pycore_state.h is included, to use the fast _PyThreadState_GET().
Changes:
* Add _PyThreadState_GET() macro
* Replace "PyThreadState_GET()->interp" with
_PyInterpreterState_GET_UNSAFE()
* Replace PyThreadState_GET() with _PyThreadState_GET() in internal C
files (compiled with Py_BUILD_CORE defined), but keep
PyThreadState_GET() in the public header files.
* _testcapimodule.c: replace PyThreadState_GET() with
PyThreadState_Get(); the module is not compiled with Py_BUILD_CORE
defined.
* pycore_state.h now requires Py_BUILD_CORE to be defined.
* group the (stateful) runtime globals into various topical structs
* consolidate the topical structs under a single top-level _PyRuntimeState struct
* add a check-c-globals.py script that helps identify runtime globals
Other globals are excluded (see globals.txt and check-c-globals.py).
According to the comment, there was previously a call to PyObject_IsSubclass() involved which could fail, but since it was replaced with a call to PyType_IsSubtype(), it can no longer fail.
Replace
_PyObject_CallArg1(func, arg)
with
PyObject_CallFunctionObjArgs(func, arg, NULL)
Using the _PyObject_CallArg1() macro increases the usage of the C stack, which
was unexpected and unwanted. PyObject_CallFunctionObjArgs() doesn't have this
issue.
When Python is not compiled with PGO, the performance of Python on call_simple
and call_method microbenchmarks depend highly on the code placement. In the
worst case, the performance slowdown can be up to 70%.
The GCC __attribute__((hot)) attribute helps to keep hot code close to reduce
the risk of such major slowdown. This attribute is ignored when Python is
compiled with PGO.
The following functions are considered as hot according to statistics collected
by perf record/perf report:
* _PyEval_EvalFrameDefault()
* call_function()
* _PyFunction_FastCall()
* PyFrame_New()
* frame_dealloc()
* PyErr_Occurred()
new exception with setting current exception as __cause__.
_PyErr_FormatFromCause(exception, format, args...) is equivalent to Python
raise exception(format % args) from sys.exc_info()[1]
Issue #26154: Add a new private _PyThreadState_UncheckedGet() function which
gets the current thread state, but don't call Py_FatalError() if it is NULL.
Python 3.5.1 removed the _PyThreadState_Current symbol from the Python C API to
no more expose complex and private atomic types. Atomic types depends on the
compiler or can even depend on compiler options. The new function
_PyThreadState_UncheckedGet() allows to get the variable value without having
to care of the exact implementation of atomic types.
Changes:
* Replace direct usage of the _PyThreadState_Current variable with a call to
_PyThreadState_UncheckedGet().
* In pystate.c, replace direct usage of the _PyThreadState_Current variable
with the PyThreadState_GET() macro for readability.
* Document also PyThreadState_Get() in pystate.h
now register both filenames in the exception on failure.
This required adding new C API functions allowing OSError exceptions
to reference two filenames instead of one.
_PyUnicode_CompareWithId() is faster than PyUnicode_CompareWithASCIIString()
when both strings are equal and interned.
Add also _PyId_builtins identifier for "builtins" common string.
instead of creating temporary Unicode string objects
Add also more identifiers in pythonrun.c to avoid temporary Unicode string
objets for the interactive interpreter.