We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.
The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
The core of the change is in:
* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers
I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config.
The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
The following are not changed (yet):
* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init
https://bugs.python.org/issue46541
Cast PY_TIMEOUT_MAX to double, not to _PyTime_t.
Fix the clang warning:
Modules/_threadmodule.c:1648:26: warning: implicit conversion from
'_PyTime_t' (aka 'long') to 'double' changes value from
9223372036854775 to 9223372036854776
[-Wimplicit-const-int-float-conversion]
double timeout_max = (_PyTime_t)PY_TIMEOUT_MAX * 1e-6;
^~~~~~~~~~~~~~~~~~~~~~~~~ ~
This simplifies new_threadstate(). We also rename _PyThreadState_Init() to _PyThreadState_SetCurrent() to reflect what it actually does.
https://bugs.python.org/issue46008
This parallels _PyRuntimeState.interpreters. Doing this helps make it more clear what part of PyInterpreterState relates to its threads.
https://bugs.python.org/issue46008
Redefining the PyThreadState_GET() macro in pycore_pystate.h is
useless since it doesn't affect files not including it. Either use
_PyThreadState_GET() directly, or don't use pycore_pystate.h internal
C API. For example, the _testcapi extension don't use the internal C
API, but use the public PyThreadState_Get() function instead.
Replace PyThreadState_Get() with _PyThreadState_GET(). The
_PyThreadState_GET() macro is more efficient than PyThreadState_Get()
and PyThreadState_GET() function calls which call fail with a fatal
Python error.
posixmodule.c and _ctypes extension now include <windows.h> before
pycore header files (like pycore_call.h).
_PyTraceback_Add() now uses _PyErr_Fetch()/_PyErr_Restore() instead
of PyErr_Fetch()/PyErr_Restore().
The _decimal and _xxsubinterpreters extensions are now built with the
Py_BUILD_CORE_MODULE macro defined to get access to the internal C
API.
Add a private C API for deadlines: add _PyDeadline_Init() and
_PyDeadline_Get() functions.
* Add _PyTime_Add() and _PyTime_Mul() functions which compute t1+t2
and t1*t2 and clamp the result on overflow.
* _PyTime_MulDiv() now uses _PyTime_Add() and _PyTime_Mul().
PyThread_acquire_lock_timed() now clamps the timeout into the
[_PyTime_MIN; _PyTime_MAX] range (_PyTime_t type) if it is too large,
rather than calling Py_FatalError() which aborts the process.
PyThread_acquire_lock_timed() no longer uses
MICROSECONDS_TO_TIMESPEC() to compute sem_timedwait() argument, but
_PyTime_GetSystemClock() and _PyTime_AsTimespec_truncate().
Fix _thread.TIMEOUT_MAX value on Windows: the maximum timeout is
0x7FFFFFFF milliseconds (around 24.9 days), not 0xFFFFFFFF
milliseconds (around 49.7 days).
Set PY_TIMEOUT_MAX to 0x7FFFFFFF milliseconds, rather than 0xFFFFFFFF
milliseconds.
Fix PY_TIMEOUT_MAX overflow test: replace (us >= PY_TIMEOUT_MAX) with
(us > PY_TIMEOUT_MAX).
_thread.start_new_thread() no longer calls PyThread_exit_thread()
explicitly at the thread exit, the call was redundant.
On Linux with the glibc, pthread_cancel() loads dynamically the
libgcc_s.so.1 library. dlopen() can fail if there is no more
available file descriptor to open the file. In this case, the process
aborts with the error message:
"libgcc_s.so.1 must be installed for pthread_cancel to work"
pthread_cancel() unwinds back to the thread's wrapping function that
calls the thread entry point.
The unwind function is dynamically loaded from the libgcc_s library
since it is tightly coupled to the C compiler (GCC). The unwinder
depends on DWARF, the compiler generates DWARF, so the unwinder
belongs to the compiler.
Thanks Florian Weimer and Carlos O'Donell for their help on
investigating this issue.
* Make functools types immutable
* Multibyte codec types are now immutable
* pyexpat.xmlparser is now immutable
* array.arrayiterator is now immutable
* _thread types are now immutable
* _csv types are now immutable
* _queue.SimpleQueue is now immutable
* mmap.mmap is now immutable
* unicodedata.UCD is now immutable
* sqlite3 types are now immutable
* _lsprof.Profiler is now immutable
* _overlapped.Overlapped is now immutable
* _operator types are now immutable
* winapi__overlapped.Overlapped is now immutable
* _lzma types are now immutable
* _bz2 types are now immutable
* _dbm.dbm and _gdbm.gdbm are now immutable
Add pycore_moduleobject.h internal header file with static inline
functions to access module members:
* _PyModule_GetDict()
* _PyModule_GetDef()
* _PyModule_GetState()
These functions don't check at runtime if their argument has a valid
type and can be inlined even if Python is not built with LTO.
_PyType_GetModuleByDef() uses _PyModule_GetDef().
Replace PyModule_GetState() with _PyModule_GetState() in the
extension modules, considered as performance sensitive:
* _abc
* _functools
* _operator
* _pickle
* _queue
* _random
* _sre
* _struct
* _thread
* _winapi
* array
* posix
The following extensions are now built with the Py_BUILD_CORE_MODULE
macro defined, to be able to use the internal pycore_moduleobject.h
header: _abc, array, _operator, _queue, _sre, _struct.
Port the _thread extension module to the multiphase initialization
API (PEP 489) and convert its static types to heap types.
Add a traverse function to the lock type, so the garbage collector
can break reference cycles.
* Fix ExceptHookArgsType name: "_thread.ExceptHookArgs", instead of
"_thread._ExceptHookArgs".
* PyInit__thread() no longer intializes interp->num_threads to 0:
it is already done in PyInterpreterState_New().
* Use PyModule_AddType(), Py_NewRef() and Py_XNewRef().
* Replace str_dict variable with _Py_IDENTIFIER(__dict__).
* Remove assert(Py_IS_TYPE(obj, &Locktype)) from release_sentinel()
to avoid having to retrive the type from this callback.
* Add thread_bootstate_free()
* Rename t_bootstrap() to thread_run()
* bootstate structure: rename keyw member to kwargs
No longer use deprecated aliases to functions:
* Replace PyObject_MALLOC() with PyObject_Malloc()
* Replace PyObject_REALLOC() with PyObject_Realloc()
* Replace PyObject_FREE() with PyObject_Free()
* Replace PyObject_Del() with PyObject_Free()
* Replace PyObject_DEL() with PyObject_Free()
No longer use deprecated aliases to functions:
* Replace PyMem_MALLOC() with PyMem_Malloc()
* Replace PyMem_REALLOC() with PyMem_Realloc()
* Replace PyMem_FREE() with PyMem_Free()
* Replace PyMem_Del() with PyMem_Free()
* Replace PyMem_DEL() with PyMem_Free()
Modify also the PyMem_DEL() macro to use directly PyMem_Free().
These functions are considered not safe because they suppress all internal errors
and can return wrong result. PyDict_GetItemString and _PyDict_GetItemId can
also silence current exception in rare cases.
Remove no longer used _PyDict_GetItemId.
Add _PyDict_ContainsId and rename _PyDict_Contains into
_PyDict_Contains_KnownHash.
An isolated subinterpreter cannot spawn threads, spawn a child
process or call os.fork().
* Add private _Py_NewInterpreter(isolated_subinterpreter) function.
* Add isolated=True keyword-only parameter to
_xxsubinterpreters.create().
* Allow again os.fork() in "non-isolated" subinterpreters.
Rename _PyInterpreterState_GET_UNSAFE() to _PyInterpreterState_GET()
for consistency with _PyThreadState_GET() and to have a shorter name
(help to fit into 80 columns).
Add also "assert(tstate != NULL);" to the function.
Add a private _at_fork_reinit() method to _thread.Lock,
_thread.RLock, threading.RLock and threading.Condition classes:
reinitialize the lock after fork in the child process; reset the lock
to the unlocked state.
Rename also the private _reset_internal_locks() method of
threading.Event to _at_fork_reinit().
* Add _PyThread_at_fork_reinit() private function. It is excluded
from the limited C API.
* threading.Thread._reset_internal_locks() now calls
_at_fork_reinit() on self._tstate_lock rather than creating a new
Python lock object.
* _PyThreadState_DeleteCurrent() now takes tstate rather than
runtime.
* Add ensure_tstate_not_null() helper to pystate.c.
* Add _PyEval_ReleaseLock() function.
* _PyThreadState_DeleteCurrent() now calls
_PyEval_ReleaseLock(tstate) and frees PyThreadState memory after
this call, not before.
* PyGILState_Release(): rename "tcur" variable to "tstate".
Replace _PyInterpreterState_Get() function call with
_PyInterpreterState_GET_UNSAFE() macro which is more efficient but
don't check if tstate or interp is NULL.
_Py_GetConfigsAsDict() now uses _PyThreadState_GET().
Add PyInterpreterState.runtime field: reference to the _PyRuntime
global variable. This field exists to not have to pass runtime in
addition to tstate to a function. Get runtime from tstate:
tstate->interp->runtime.
Remove "_PyRuntimeState *runtime" parameter from functions already
taking a "PyThreadState *tstate" parameter.
_PyGC_Init() first parameter becomes "PyThreadState *tstate".
* Rename PyThreadState_DeleteCurrent()
to _PyThreadState_DeleteCurrent()
* Move it to the internal C API
Co-Authored-By: Carol Willing <carolcode@willingconsulting.com>
In a subinterpreter, spawning a daemon thread now raises an
exception. Daemon threads were never supported in subinterpreters.
Previously, the subinterpreter finalization crashed with a Pyton
fatal error if a daemon thread was still running.
* Add _thread._is_main_interpreter()
* threading.Thread.start() now raises RuntimeError if the thread is a
daemon thread and the method is called from a subinterpreter.
* The _thread module now uses Argument Clinic for the new function.
* Use textwrap.dedent() in test_threading.SubinterpThreadingTests
_thread.start_new_thread() now logs uncaught exception raised by the
function using sys.unraisablehook(), rather than sys.excepthook(), so
the hook gets access to the function which raised the exception.
Add a new threading.excepthook() function which handles uncaught
Thread.run() exception. It can be overridden to control how uncaught
exceptions are handled.
threading.ExceptHookArgs is not documented on purpose: it should not
be used directly.
* threading.excepthook() and threading.ExceptHookArgs.
* Add _PyErr_Display(): similar to PyErr_Display(), but accept a
'file' parameter.
* Add _thread._excepthook(): C implementation of the exception hook
calling _PyErr_Display().
* Add _thread._ExceptHookArgs: structseq type.
* Add threading._invoke_excepthook_wrapper() which handles the gory
details to ensure that everything remains alive during Python
shutdown.
* Add unit tests.