We actually don't move PyImport_Inittab. Instead, we make a copy that we keep on _PyRuntimeState and use only that after Py_Initialize(). We also prevent folks from modifying PyImport_Inittab (the best we can) after that point.
https://github.com/python/cpython/issues/81057
The global allocators were stored in 3 static global variables: _PyMem_Raw, _PyMem, and _PyObject. State for the "small block" allocator was stored in another 13. That makes a total of 16 global variables. We are moving all 16 to the _PyRuntimeState struct as part of the work for gh-81057. (If PEP 684 is accepted then we will follow up by moving them all to PyInterpreterState.)
https://github.com/python/cpython/issues/81057
As we consolidate global variables, we find some objects that are almost suitable to add to _PyRuntimeState.global_objects, but have some small/sneaky bit of per-interpreter state (e.g. a weakref list). We're adding PyInterpreterState.static_objects so we can move such objects there. (We'll removed the _not_used field once we've added others.)
https://github.com/python/cpython/issues/81057
* Adds EXIT_INTERPRETER instruction to exit PyEval_EvalDefault()
* Simplifies RETURN_VALUE, YIELD_VALUE and RETURN_GENERATOR instructions as they no longer need to check for entry frames.
We do the following:
* move the generated _PyUnicode_InitStaticStrings() to its own file
* move the generated _PyStaticObjects_CheckRefcnt() to its own file
* include pycore_global_objects.h in extension modules instead of pycore_runtime_init.h
These changes help us avoid including things that aren't needed.
https://github.com/python/cpython/issues/90868
Previously, the optional restrictions on subinterpreters were: disallow fork, subprocess, and threads. By default, we were disallowing all three for "isolated" interpreters. We always allowed all three for the main interpreter and those created through the legacy `Py_NewInterpreter()` API.
Those settings were a bit conservative, so here we've adjusted the optional restrictions to: fork, exec, threads, and daemon threads. The default for "isolated" interpreters disables fork, exec, and daemon threads. Regular threads are allowed by default. We continue always allowing everything For the main interpreter and the legacy API.
In the code, we add `_PyInterpreterConfig.allow_exec` and `_PyInterpreterConfig.allow_daemon_threads`. We also add `Py_RTFLAGS_DAEMON_THREADS` and `Py_RTFLAGS_EXEC`.
* As most of `test_embed` now uses `Py_InitializeFromConfig`, add
a specific test case to cover `Py_Initialize` (and `Py_InitializeEx`)
* Rename `_testembed` init helper to clarify the API used
* Add a `PyConfig_Clear` call in `Py_InitializeEx` to make
the code more obviously correct (it already didn't leak as
none of the dynamically allocated config fields were being
populated, but it's clearer if the wrappers follow the
documented API usage guidelines)
(see https://github.com/python/cpython/issues/98608)
This change does the following:
1. change the argument to a new `_PyInterpreterConfig` struct
2. rename the function to `_Py_NewInterpreterFromConfig()`, inspired by `Py_InitializeFromConfig()` (takes a `_PyInterpreterConfig` instead of `isolated_subinterpreter`)
3. split up the boolean `isolated_subinterpreter` into the corresponding multiple granular settings
* allow_fork
* allow_subprocess
* allow_threads
4. add `PyInterpreterState.feature_flags` to store those settings
5. add a function for checking if a feature is enabled on an opaque `PyInterpreterState *`
6. drop `PyConfig._isolated_interpreter`
The existing default (see `Py_NewInterpeter()` and `Py_Initialize*()`) allows fork, subprocess, and threads and the optional "isolated" interpreter (see the `_xxsubinterpreters` module) disables all three. None of that changes here; the defaults are preserved.
Note that the given `_PyInterpreterConfig` will not be used outside `_Py_NewInterpreterFromConfig()`, nor preserved. This contrasts with how `PyConfig` is currently preserved, used, and even modified outside `Py_InitializeFromConfig()`. I'd rather just avoid that mess from the start for `_PyInterpreterConfig`. We can preserve it later if we find an actual need.
This change allows us to follow up with a number of improvements (e.g. stop disallowing subprocess and support disallowing exec instead).
(Note that this PR adds "private" symbols. We'll probably make them public, and add docs, in a separate change.)
⚠️⚠️ Note for reviewers, hackers and fellow systems/low-level/compiler engineers ⚠️⚠️
If you have a lot of experience with this kind of shenanigans and want to improve the **first** version, **please make a PR against my branch** or **reach out by email** or **suggest code changes directly on GitHub**.
If you have any **refinements or optimizations** please, wait until the first version is merged before starting hacking or proposing those so we can keep this PR productive.
Static builtin types are finalized by calling _PyStaticType_Dealloc(). Before this change, we were skipping finalizing such a type if it still had subtypes (i.e. its tp_subclasses hadn't been cleared yet). The problem is that types hold several heap objects, which leak if we skip the type's finalization. This change addresses that.
For context, there's an old comment (from e9e3eab0b8) that says the following:
// If a type still has subtypes, it cannot be deallocated.
// A subtype can inherit attributes and methods of its parent type,
// and a type must no longer be used once it's deallocated.
However, it isn't clear that is actually still true. Clearing tp_dict should mean it isn't a problem.
Furthermore, the only subtypes that might still be around come from extension modules that didn't clean them up when unloaded (i.e. extensions that do not implement multi-phase initialization, AKA PEP 489). Those objects are already leaking, so this change doesn't change anything in that regard. Instead, this change means more objects gets cleaned up that before.
Seems in the past the copy of builtins was not made in some scenarios,
and the error was silenced. Write it now to stderr, so we have a chance
to see it.
It combines PyImport_ImportModule() and PyObject_GetAttrString()
and saves 4-6 lines of code on every use.
Add also _PyImport_GetModuleAttr() which takes Python strings as arguments.
This was added for bpo-40514 (gh-84694) to test out a per-interpreter GIL. However, it has since proven unnecessary to keep the experiment in the repo. (It can be done as a branch in a fork like normal.) So here we are removing:
* the configure option
* the macro
* the code enabled by the macro
We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.
The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
The core of the change is in:
* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers
I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config.
The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
The following are not changed (yet):
* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init
https://bugs.python.org/issue46541
Move _Py_GetAllocatedBlocks() and _PyObject_DebugMallocStats()
declarations to pycore_pymem.h. These functions are related to memory
allocators, not to the PyObject structure.
Add _PyUnicode_FiniTypes() function, called by
finalize_interp_types(). It clears these static types:
* EncodingMapType
* PyFieldNameIter_Type
* PyFormatterIter_Type
_PyStaticType_Dealloc() now does nothing if tp_subclasses
is not NULL.
Add 'static_exceptions' list to factorize code between
_PyExc_InitTypes() and _PyBuiltins_AddExceptions().
_PyExc_InitTypes() does nothing if it's not the main interpreter.
Sort exceptions in Lib/test/exception_hierarchy.txt.
* Move PyContext static types into object.c static_types list.
* Rename PyContextTokenMissing_Type to _PyContextTokenMissing_Type
and declare it in pycore_context.h.
* _PyHamtItems types are no long exported: replace PyAPI_DATA() with
extern.
Add _PyTypes_FiniTypes() best-effort function to clear static types:
don't deallocate a type if it still has subclasses.
remove_subclass() now sets tp_subclasses to NULL when removing the
last subclass.
"python -X showrefcount" now shows the total reference count after
clearing and destroyed the main Python interpreter. Previously, it
was shown before.
Py_FinalizeEx() now calls _PyDebug_PrintTotalRefs() after
finalize_interp_delete().
Add _PyStructSequence_FiniType() and _PyStaticType_Dealloc()
functions to finalize a structseq static type in Py_Finalize().
Currrently, these functions do nothing if Python is built in release
mode.
Clear static types:
* AsyncGenHooksType: sys.set_asyncgen_hooks()
* FlagsType: sys.flags
* FloatInfoType: sys.float_info
* Hash_InfoType: sys.hash_info
* Int_InfoType: sys.int_info
* ThreadInfoType: sys.thread_info
* UnraisableHookArgsType: sys.unraisablehook
* VersionInfoType: sys.version
* WindowsVersionType: sys.getwindowsversion()
The empty bytes object (b'') and the 256 one-character bytes objects were allocated at runtime init. Now we statically allocate and initialize them.
https://bugs.python.org/issue45953
The array of small PyLong objects has been statically declared. Here I also statically initialize them. Consequently they are no longer initialized dynamically during runtime init.
I've also moved them under a new sub-struct in _PyRuntimeState, in preparation for static allocation and initialization of other global objects.
https://bugs.python.org/issue45953
This change is strictly renames and moving code around. It helps in the following ways:
* ensures type-related init functions focus strictly on one of the three aspects (state, objects, types)
* passes in PyInterpreterState * to all those functions, simplifying work on moving types/objects/state to the interpreter
* consistent naming conventions help make what's going on more clear
* keeping API related to a type in the corresponding header file makes it more obvious where to look for it
https://bugs.python.org/issue46008
PyInterpreterState_Main() is a plain function exposed in the public C-API. For internal usage we can take the more efficient approach in this PR.
https://bugs.python.org/issue46008
This parallels _PyRuntimeState.interpreters. Doing this helps make it more clear what part of PyInterpreterState relates to its threads.
https://bugs.python.org/issue46008
The getpath.py file is frozen at build time and executed as code over a namespace. It is never imported, nor is it meant to be importable or reusable. However, it should be easier to read, modify, and patch than the previous code.
This commit attempts to preserve every previously tested quirk, but these may be changed in the future to better align platforms.
Redefining the PyThreadState_GET() macro in pycore_pystate.h is
useless since it doesn't affect files not including it. Either use
_PyThreadState_GET() directly, or don't use pycore_pystate.h internal
C API. For example, the _testcapi extension don't use the internal C
API, but use the public PyThreadState_Get() function instead.
Replace PyThreadState_Get() with _PyThreadState_GET(). The
_PyThreadState_GET() macro is more efficient than PyThreadState_Get()
and PyThreadState_GET() function calls which call fail with a fatal
Python error.
posixmodule.c and _ctypes extension now include <windows.h> before
pycore header files (like pycore_call.h).
_PyTraceback_Add() now uses _PyErr_Fetch()/_PyErr_Restore() instead
of PyErr_Fetch()/PyErr_Restore().
The _decimal and _xxsubinterpreters extensions are now built with the
Py_BUILD_CORE_MODULE macro defined to get access to the internal C
API.
Currently we freeze several modules into the runtime. For each of these modules it is essential to bootstrapping the runtime that they be frozen. Any other stdlib module that we later freeze into the runtime is not essential. We can just as well import from the .py file. This PR lets users explicitly choose which should be used, with the new "-X frozen_modules=[on|off]" CLI flag. The default is "off" for now.
https://bugs.python.org/issue45020
The threading debug (PYTHONTHREADDEBUG environment variable) is
deprecated in Python 3.10 and will be removed in Python 3.12. This
feature requires a debug build of Python.
This commit stores the _PyRuntime structure in a section of the same name. This allows a debugging or crash reporting tool to quickly locate this structure at runtime without requiring the symbol table.
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
The Python _pyio.open() function becomes a static method to behave as
io.open() built-in function: don't become a bound method when stored
as a class variable. It becomes possible since static methods are now
callable in Python 3.10. Moreover, _pyio.OpenWrapper becomes a simple
alias to _pyio.open.
init_set_builtins_open() now sets builtins.open to io.open, rather
than setting it to io.OpenWrapper, since OpenWrapper is now an alias
to open in the io and _pyio modules.
Reorganize pycore_interp_init() to initialize singletons before the
the first PyType_Ready() call. Fix an issue when Python is configured
using --without-doc-strings.
These functions were undocumented and excluded from the limited C
API.
Most names defined by these header files were not prefixed by "Py"
and so could create names conflicts. For example, Python-ast.h
defined a "Yield" macro which was conflict with the "Yield" name used
by the Windows <winbase.h> header.
Use the Python ast module instead.
* Move Include/asdl.h to Include/internal/pycore_asdl.h.
* Move Include/Python-ast.h to Include/internal/pycore_ast.h.
* Remove ast.h header file.
* pycore_symtable.h no longer includes Python-ast.h.
At Python startup, call _PyGILState_Init() before
PyInterpreterState_New() which calls _PyThreadState_GET(). When
Python is built using --with-experimental-isolated-subinterpreters,
_PyThreadState_GET() uses autoTSSkey.
This is friendlier to other in-process code that an extension module or
embedding use could pull in such as CGo where tiny stacks are the norm
and sigaltstack() has been used to provide for signal handlers.
Without this, signals received by a process using tiny stacks may lead
to stack overflow crashes.
Pass the current interpreter (interp) rather than the current Python
thread state (tstate) to internal functions which only use the
interpreter.
Modified functions:
* _PyXXX_Fini() and _PyXXX_ClearFreeList() functions
* _PyEval_SignalAsyncExc(), make_pending_calls()
* _PySys_GetObject(), sys_set_object(), sys_set_object_id(), sys_set_object_str()
* should_audit(), set_flags_from_config(), make_flags()
* _PyAtExit_Call()
* init_stdio_encoding()
* etc.
The Py_FatalError() function and the faulthandler module now dump the
list of extension modules on a fatal error.
Add _Py_DumpExtensionModules() and _PyModule_IsExtension() internal
functions.
Make the Unicode dictionary of interned strings compatible with
subinterpreters.
Remove the INTERN_NAME_STRINGS macro in typeobject.c: names are
always now interned (even if EXPERIMENTAL_ISOLATED_SUBINTERPRETERS
macro is defined).
_PyUnicode_ClearInterned() now uses PyDict_Next() to no longer
allocate memory, to ensure that the interned dictionary is cleared.
Make the type attribute lookup cache per-interpreter.
Add private _PyType_InitCache() function, called by PyInterpreterState_New().
Continue to share next_version_tag between interpreters, since static
types are still shared by interpreters.
Remove MCACHE macro: the cache is no longer disabled if the
EXPERIMENTAL_ISOLATED_SUBINTERPRETERS macro is defined.
* Add _PyAtExit_Call() function and remove pyexitfunc and
pyexitmodule members of PyInterpreterState. The function
logs atexit callback errors using _PyErr_WriteUnraisableMsg().
* Add _PyAtExit_Init() and _PyAtExit_Fini() functions.
* Remove traverse, clear and free functions of the atexit module.
Co-authored-by: Dong-hee Na <donghee.na@python.org>
At Python exit, if a callback registered with atexit.register()
fails, its exception is now logged. Previously, only some exceptions
were logged, and the last exception was always silently ignored.
Add _PyAtExit_Call() function and remove
PyInterpreterState.atexit_func member. call_py_exitfuncs() now calls
directly _PyAtExit_Call().
The atexit module must now always be built as a built-in module.
pymain_run_file() no longer encodes the filename: pass the filename
as an object to the new _PyRun_AnyFileObject() function.
Add new private functions:
* _PyRun_AnyFileObject()
* _PyRun_InteractiveLoopObject()
* _Py_FdIsInteractive()
Convert the _imp extension module to the multi-phase initialization
API (PEP 489).
* Add _PyImport_BootstrapImp() which fix a bootstrap issue: import
the _imp module before importlib is initialized.
* Add create_builtin() sub-function, used by _imp_create_builtin().
* Initialize PyInterpreterState.import_func earlier, in
pycore_init_builtins().
* Remove references to _PyImport_Cleanup(). This function has been
renamed to finalize_modules() and moved to pylifecycle.c.
Remove the undocumented PyOS_InitInterrupts() C function.
* Rename PyOS_InitInterrupts() to _PySignal_Init(). It now installs
other signal handlers, not only SIGINT.
* Rename PyOS_FiniInterrupts() to _PySignal_Fini()
time.time(), time.perf_counter() and time.monotonic() functions can
no longer fail with a Python fatal error, instead raise a regular
Python exception on failure.
Remove _PyTime_Init(): don't check system, monotonic and perf counter
clocks at startup anymore.
On error, _PyTime_GetSystemClock(), _PyTime_GetMonotonicClock() and
_PyTime_GetPerfCounter() now silently ignore the error and return 0.
They cannot fail with a Python fatal error anymore.
Add py_mach_timebase_info() and win_perf_counter_frequency()
sub-functions.
* Call _PyTime_Init() and _PyWarnings_InitState() earlier during the
Python initialization.
* Inline _PyImportHooks_Init() into _PySys_InitCore().
* The _warnings initialization function no longer call
_PyWarnings_InitState() to prevent resetting filters_version to 0.
* _PyWarnings_InitState() now returns an int and no longer clear the
state in case of error (it's done anyway at Python exit).
* Rework init_importlib(), fix refleaks on errors.
Fix _PyConfig_Read() if compute_path_config=0: use values set by
Py_SetPath(), Py_SetPythonHome() and Py_SetProgramName(). Add
compute_path_config parameter to _PyConfig_InitPathConfig().
The following functions now return NULL if called before
Py_Initialize():
* Py_GetExecPrefix()
* Py_GetPath()
* Py_GetPrefix()
* Py_GetProgramFullPath()
* Py_GetProgramName()
* Py_GetPythonHome()
These functions no longer automatically computes the Python Path
Configuration. Moreover, Py_SetPath() no longer computes
program_full_path.
The path configuration is now computed in the "main" initialization.
The core initialization no longer computes it.
* Add _PyConfig_Read() function to read the configuration without
computing the path configuration.
* pyinit_core() no longer computes the path configuration: it is now
computed by init_interp_main().
* The path configuration output members of PyConfig are now optional:
* executable
* base_executable
* prefix
* base_prefix
* exec_prefix
* base_exec_prefix
* _PySys_UpdateConfig() now skips NULL strings in PyConfig.
* _testembed: Rename test_set_config() to test_init_set_config() for
consistency with other tests.
* Inline _PyInterpreterState_SetConfig(): replace it with
_PyConfig_Copy().
* Add _PyErr_SetFromPyStatus()
* Add _PyInterpreterState_GetConfigCopy()
* Add a new _PyInterpreterState_SetConfig() function.
* Add an unit which gets, modifies, and sets the config.
When Py_Initialize() is called twice, the second call now updates
more sys attributes for the configuration, rather than only sys.argv.
* Rename _PySys_InitMain() to _PySys_UpdateConfig().
* _PySys_UpdateConfig() now modifies sys.flags in-place, instead of
creating a new flags object.
* Remove old commented sys.flags flags (unbuffered and skip_first).
* Add private _PySys_GetObject() function.
* When Py_Initialize(), Py_InitializeFromConfig() and
Call _PyAST_Fini() on all interpreters, not only on the main
interpreter. Also, call it ealier to fix a reference leak.
Python types contain a reference to themselves in in their
PyTypeObject.tp_mro member. _PyAST_Fini() must called before the last
GC collection to destroy AST types.
_PyInterpreterState_Clear() now calls _PyAST_Fini(). It now also
calls _PyWarnings_Fini() on subinterpeters, not only on the main
interpreter.
Add an assertion in AST init_types() to ensure that the _ast module
is no longer used after _PyAST_Fini() has been called.
The last GC collection is now done before clearing builtins and sys
dictionaries. Add also assertions to ensure that gc.collect() is no
longer called after _PyGC_Fini().
Pass also the tstate to PyInterpreterState_Clear() to pass the
correct tstate to _PyGC_CollectNoFail() and _PyGC_Fini().
Move private _PyGC_CollectNoFail() to the internal C API.
Remove the private _PyGC_CollectIfEnabled() which was just an alias
to the public PyGC_Collect() function since Python 3.8.
Rename functions:
* collect() => gc_collect_main()
* collect_with_callback() => gc_collect_with_callback()
* collect_generations() => gc_collect_generations()
These functions are considered not safe because they suppress all internal errors
and can return wrong result. PyDict_GetItemString and _PyDict_GetItemId can
also silence current exception in rare cases.
Remove no longer used _PyDict_GetItemId.
Add _PyDict_ContainsId and rename _PyDict_Contains into
_PyDict_Contains_KnownHash.
Partially revert commit ac46eb4ad6662cf6d771b20d8963658b2186c48c:
"bpo-38113: Update the Python-ast.c generator to PEP384 (gh-15957)".
Using a module state per module instance is causing subtle practical
problems.
For example, the Mercurial project replaces the __import__() function
to implement lazy import, whereas Python expected that "import _ast"
always return a fully initialized _ast module.
Add _PyAST_Fini() to clear the state at exit.
The _ast module has no state (set _astmodule.m_size to 0). Remove
astmodule_traverse(), astmodule_clear() and astmodule_free()
functions.