Commit Graph

70 Commits

Author SHA1 Message Date
Victor Stinner 13a00078b8
gh-108634: Py_TRACE_REFS uses a hash table (#108663)
Python built with "configure --with-trace-refs" (tracing references)
is now ABI compatible with Python release build and debug build.
Moreover, it now also supports the Limited API.

Change Py_TRACE_REFS build:

* Remove _PyObject_EXTRA_INIT macro.
* The PyObject structure no longer has two extra members (_ob_prev
  and _ob_next).
* Use a hash table (_Py_hashtable_t) to trace references (all
  objects): PyInterpreterState.object_state.refchain.
* Py_TRACE_REFS build is now ABI compatible with release build and
  debug build.
* Limited C API extensions can now be built with Py_TRACE_REFS:
  xxlimited, xxlimited_35, _testclinic_limited.
* No longer rename PyModule_Create2() and PyModule_FromDefAndSpec2()
  functions to PyModule_Create2TraceRefs() and
  PyModule_FromDefAndSpec2TraceRefs().
* _Py_PrintReferenceAddresses() is now called before
  finalize_interp_delete() which deletes the refchain hash table.
* test_tracemalloc find_trace() now also filters by size to ignore
  the memory allocated by _PyRefchain_Trace().

Test changes for Py_TRACE_REFS:

* Add test.support.Py_TRACE_REFS constant.
* Add test_sys.test_getobjects() to test sys.getobjects() function.
* test_exceptions skips test_recursion_normalizing_with_no_memory()
  and test_memory_error_in_PyErr_PrintEx() if Python is built with
  Py_TRACE_REFS.
* test_repl skips test_no_memory().
* test_capi skisp test_set_nomemory().
2023-08-31 18:33:34 +02:00
Victor Stinner 0dd3fc2a64
gh-108216: Cleanup #include in internal header files (#108228)
* Add missing includes.
* Remove unused includes.
* Update old include/symbol names to newer names.
* Mention at least one included symbol.
* Sort includes.
* Update Tools/cases_generator/generate_cases.py used to generated
  pycore_opcode_metadata.h.
* Update Parser/asdl_c.py used to generate pycore_ast.h.
* Cleanup also includes in _testcapimodule.c and _testinternalcapi.c.
2023-08-21 18:05:59 +00:00
Mark Shannon 006e44f950
GH-108035: Remove the `_PyCFrame` struct as it is no longer needed for performance. (GH-108036) 2023-08-17 11:16:03 +01:00
Eric Snow 58ef741867
gh-107080: Fix Py_TRACE_REFS Crashes Under Isolated Subinterpreters (gh-107567)
The linked list of objects was a global variable, which broke isolation between interpreters, causing crashes. To solve this, we've moved the linked list to each interpreter.
2023-08-03 19:51:08 +00:00
Eric Snow 8ba4df91ae
gh-105699: Use a _Py_hashtable_t for the PyModuleDef Cache (gh-106974)
This fixes a crasher due to a race condition, triggered infrequently when two isolated (own GIL) subinterpreters simultaneously initialize their sys or builtins modules.  The crash happened due the combination of the "detached" thread state we were using and the "last holder" logic we use for the GIL.  It turns out it's tricky to use the same thread state for different threads.  Who could have guessed?

We solve the problem by eliminating the one object we were still sharing between interpreters.  We replace it with a low-level hashtable, using the "raw" allocator to avoid tying it to the main interpreter.

We also remove the accommodations for "detached" thread states, which were a dubious idea to start with.
2023-07-28 14:39:08 -06:00
Eric Snow b72947a8d2
gh-106931: Intern Statically Allocated Strings Globally (gh-107272)
We tried this before with a dict and for all interned strings.  That ran into problems due to interpreter isolation.  However, exclusively using a per-interpreter cache caused some inconsistency that can eliminate the benefit of interning.  Here we circle back to using a global cache, but only for statically allocated strings.  We also use a more-basic _Py_hashtable_t for that global cache instead of a dict.

Ideally we would only have the global cache, but the optional isolation of each interpreter's allocator means that a non-static string object must not outlive its interpreter.  Thus we would have to store a copy of each such interned string in the global cache, tied to the main interpreter.
2023-07-27 13:56:59 -06:00
Pablo Galindo Salgado b444bfb0a3
gh-106597: Add debugging struct with offsets for out-of-process tools (#106598) 2023-07-11 20:35:41 +01:00
Eric Snow 68dfa49627
gh-100227: Lock Around Modification of the Global Allocators State (gh-105516)
The risk of a race with this state is relatively low, but we play it safe anyway. We do avoid using the lock in performance-sensitive cases where the risk of a race is very, very low.
2023-06-08 14:06:54 -06:00
Eric Snow b8f7ab5783
gh-104252: Immortalize Py_EMPTY_KEYS (gh-104253)
This was missed in gh-19474.  It matters for with a per-interpreter GIL since PyDictKeysObject.dk_refcnt breaks isolation and leads to races.
2023-05-10 07:28:40 -06:00
Eric Snow df3173d28e
gh-101659: Isolate "obmalloc" State to Each Interpreter (gh-101660)
This is strictly about moving the "obmalloc" runtime state from
`_PyRuntimeState` to `PyInterpreterState`.  Doing so improves isolation
between interpreters, specifically most of the memory (incl. objects)
allocated for each interpreter's use.  This is important for a
per-interpreter GIL, but such isolation is valuable even without it.

FWIW, a per-interpreter obmalloc is the proverbial
canary-in-the-coalmine when it comes to the isolation of objects between
interpreters.  Any object that leaks (unintentionally) to another
interpreter is highly likely to cause a crash (on debug builds at
least).  That's a useful thing to know, relative to interpreter
isolation.
2023-04-24 17:23:57 -06:00
Eric Snow 209a0a7655
gh-95795: Move types.next_version_tag to PyInterpreterState (gh-102343)
Core static types will continue to use the global value.  All other types
will use the per-interpreter value.  They all share the same range, where
the global types use values < 2^16 and each interpreter uses values
higher than that.
2023-04-24 22:30:13 +00:00
Eddie Elizondo ea2c001650
gh-84436: Implement Immortal Objects (gh-19474)
This is the implementation of PEP683

Motivation:

The PR introduces the ability to immortalize instances in CPython which bypasses reference counting. Tagging objects as immortal allows up to skip certain operations when we know that the object will be around for the entire execution of the runtime.

Note that this by itself will bring a performance regression to the runtime due to the extra reference count checks. However, this brings the ability of having truly immutable objects that are useful in other contexts such as immutable data sharing between sub-interpreters.
2023-04-22 13:39:37 -06:00
Eric Snow dcd6f226d6
gh-100227: Make the Global PyModuleDef Cache Safe for Isolated Interpreters (gh-103084)
Sharing mutable (or non-immortal) objects between interpreters is generally not safe.  We can work around that but not easily. 
 There are two restrictions that are critical for objects that break interpreter isolation.

The first is that the object's state be guarded by a global lock.  For now the GIL meets this requirement, but a granular global lock is needed once we have a per-interpreter GIL.

The second restriction is that the object (and, for a container, its items) be deallocated/resized only when the interpreter in which it was allocated is the current one.  This is because every interpreter has (or will have, see gh-101660) its own object allocator.  Deallocating an object with a different allocator can cause crashes.

The dict for the cache of module defs is completely internal, which simplifies what we have to do to meet those requirements.  To do so, we do the following:

* add a mechanism for re-using a temporary thread state tied to the main interpreter in an arbitrary thread
   * add _PyRuntime.imports.extensions.main_tstate` 
   * add _PyThreadState_InitDetached() and _PyThreadState_ClearDetached() (pystate.c)
   * add _PyThreadState_BindDetached() and _PyThreadState_UnbindDetached() (pystate.c)
* make sure the cache dict (_PyRuntime.imports.extensions.dict) and its items are all owned by the main interpreter)
* add a placeholder using for a granular global lock

Note that the cache is only used for legacy extension modules and not for multi-phase init modules.

https://github.com/python/cpython/issues/100227
2023-03-29 17:15:43 -06:00
Eric Snow 89e67ada69
gh-100227: Revert gh-102925 "gh-100227: Make the Global Interned Dict Safe for Isolated Interpreters" (gh-103063)
This reverts commit 87be8d9.

This approach to keeping the interned strings safe is turning out to be too complex for my taste (due to obmalloc isolation). For now I'm going with the simpler solution, making the dict per-interpreter. We can revisit that later if we want a sharing solution.
2023-03-27 16:53:05 -06:00
Eric Snow 87be8d9522
gh-100227: Make the Global Interned Dict Safe for Isolated Interpreters (gh-102925)
This is effectively two changes.  The first (the bulk of the change) is where we add _Py_AddToGlobalDict() (and _PyRuntime.cached_objects.main_tstate, etc.).  The second (much smaller) change is where we update PyUnicode_InternInPlace() to use _Py_AddToGlobalDict() instead of calling PyDict_SetDefault() directly.

Basically, _Py_AddToGlobalDict() is a wrapper around PyDict_SetDefault() that should be used whenever we need to add a value to a runtime-global dict object (in the few cases where we are leaving the container global rather than moving it to PyInterpreterState, e.g. the interned strings dict).  _Py_AddToGlobalDict() does all the necessary work to make sure the target global dict is shared safely between isolated interpreters.  This is especially important as we move the obmalloc state to each interpreter (gh-101660), as well as, potentially, the GIL (PEP 684).

https://github.com/python/cpython/issues/100227
2023-03-22 18:30:04 -06:00
Mark Shannon 7559f5fda9
GH-101291: Rearrange the size bits in PyLongObject (GH-102464)
* Eliminate all remaining uses of Py_SIZE and Py_SET_SIZE on PyLongObject, adding asserts.

* Change layout of size/sign bits in longobject to support future addition of immortal ints and tagged medium ints.

* Add functions to hide some internals of long object, and for setting sign and digit count.

* Replace uses of IS_MEDIUM_VALUE macro with _PyLong_IsCompact().
2023-03-22 14:49:51 +00:00
Eric Snow cf6e7c5e55
gh-100227: Isolate the Import State to Each Interpreter (gh-101941)
Specific changes:

* move the import lock to PyInterpreterState
* move the "find_and_load" diagnostic state to PyInterpreterState

Note that the import lock exists to keep multiple imports of the same module in the same interpreter (but in different threads) from stomping on each other.  Independently, we use a distinct global lock to protect globally shared import state, especially related to loaded extension modules.  For now we can rely on the GIL as that lock but with a per-interpreter GIL we'll need a new global lock.

The remaining state in _PyRuntimeState.imports will (probably) continue being global.

https://github.com/python/cpython/issues/100227
2023-03-09 09:46:21 -07:00
Eric Snow 5e5acd291f
gh-100227: Move next_keys_version to PyInterpreterState (gh-102335)
https://github.com/python/cpython/issues/100227
2023-03-08 18:04:16 -07:00
Eric Snow 66ff374d4f
gh-100227: Move func_state.next_version to PyInterpreterState (gh-102334)
https://github.com/python/cpython/issues/100227
2023-03-08 15:56:36 -07:00
Eric Snow 8606697f49
gh-90110: Fix the c-analyzer Tool (#102483)
Some incompatible changes had gone in, and the "ignore" lists weren't properly undated. This change fixes that. It's necessary prior to enabling test_check_c_globals, which I hope to do soon.

Note that this does include moving last_resort_memory_error to PyInterpreterState.

https://github.com/python/cpython/issues/90110
2023-03-06 19:40:09 -07:00
Eric Snow f300a1fa4c
gh-100227: Move the dtoa State to PyInterpreterState (gh-102331)
https://github.com/python/cpython/issues/100227
2023-02-28 13:14:40 -07:00
Eric Snow b2fc549278
gh-101758: Clean Up Uses of Import State (gh-101919)
This change is almost entirely moving code around and hiding import state behind internal API.  We introduce no changes to behavior, nor to non-internal API.  (Since there was already going to be a lot of churn, I took this as an opportunity to re-organize import.c into topically-grouped sections of code.)  The motivation is to simplify a number of upcoming changes.

Specific changes:

* move existing import-related code to import.c, wherever possible
* add internal API for interacting with import state (both global and per-interpreter)
* use only API outside of import.c (to limit churn there when changing the location, etc.)
* consolidate the import-related state of PyInterpreterState into a single struct field (this changes layout slightly)
* add macros for import state in import.c (to simplify changing the location)
* group code in import.c into sections
*remove _PyState_AddModule()

https://github.com/python/cpython/issues/101758
2023-02-15 15:32:31 -07:00
Mark Shannon c1b1f51cd1
GH-101291: Refactor the `PyLongObject` struct into object header and PyLongValue struct. (GH-101292) 2023-01-30 10:03:04 +00:00
Eric Snow 6036c3e856
gh-59956: Clarify GILState-related Code (gh-101161)
The objective of this change is to help make the GILState-related code easier to understand.  This mostly involves moving code around and some semantically equivalent refactors.  However, there are a also a small number of slight changes in structure and behavior:

* tstate_current is moved out of _PyRuntimeState.gilstate
* autoTSSkey is moved out of _PyRuntimeState.gilstate
* autoTSSkey is initialized earlier
* autoTSSkey is re-initialized (after fork) earlier

https://github.com/python/cpython/issues/59956
2023-01-19 16:04:14 -07:00
Eric Snow 0415cf895f
gh-81057: Move the Cached Parser Dummy Name to _PyRuntimeState (#100277) 2022-12-16 13:48:03 +00:00
Eric Snow aa8591e9ca
gh-90111: Minor Cleanup for Runtime-Global Objects (gh-100254)
* move _PyRuntime.global_objects.interned to _PyRuntime.cached_objects.interned_strings (and use _Py_CACHED_OBJECT())
* rename _PyRuntime.global_objects to _PyRuntime.static_objects

(This also relates to gh-96075.)

https://github.com/python/cpython/issues/90111
2022-12-14 11:53:57 -07:00
Eric Snow 5eb28bca9f
gh-81057: Move Signal-Related Globals to _PyRuntimeState (gh-100085)
https://github.com/python/cpython/issues/81057
2022-12-12 16:50:19 -07:00
Eric Snow 53d9cd95cd
gh-81057: Move faulthandler Globals to _PyRuntimeState (gh-100152)
https://github.com/python/cpython/issues/81057
2022-12-12 09:58:46 -07:00
Eric Snow 8790d4d31f
gh-81057: Move tracemalloc Globals to _PyRuntimeState (gh-100151)
https://github.com/python/cpython/issues/81057
2022-12-12 08:44:23 -07:00
Eric Snow bc8cdf8c3d
gh-81057: Move Ceval Trampoline Globals to _PyRuntimeState (gh-100083)
https://github.com/python/cpython/issues/81057
2022-12-08 17:17:20 -07:00
Eric Snow 8a3f06c54b
gh-81057: Move time Globals to _PyRuntimeState (gh-100122)
https://github.com/python/cpython/issues/81057
2022-12-08 16:46:09 -07:00
Eric Snow cda9f0236f
gh-81057: Move OS-Related Globals to _PyRuntimeState (gh-100082)
https://github.com/python/cpython/issues/81057
2022-12-08 15:38:06 -07:00
Eric Snow 9db1e17c80
gh-81057: Move the global Dict-Related Versions to _PyRuntimeState (gh-99497)
We also move the global func version.

https://github.com/python/cpython/issues/81057
2022-11-16 10:37:29 -07:00
Eric Snow 01fa907aa8
gh-81057: Move contextvars-related Globals to _PyRuntimeState (gh-99400)
This is part of the effort to consolidate global variables, to make them easier to manage (and make it easier to later move some of them to PyInterpreterState).

https://github.com/python/cpython/issues/81057
2022-11-16 09:54:28 -07:00
Eric Snow 5f55067e23
gh-81057: Move More Globals in Core Code to _PyRuntimeState (gh-99516)
https://github.com/python/cpython/issues/81057
2022-11-16 09:37:14 -07:00
Eric Snow 3c57971a2d
gh-81057: Move Globals in Core Code to _PyRuntimeState (gh-99496)
This is the first of several changes to consolidate non-object globals in core code.

https://github.com/python/cpython/issues/81057
2022-11-15 09:45:11 -07:00
Kumar Aditya dc3e4350a5
GH-99205: remove `_static` field from `PyThreadState` and `PyInterpreterState` (GH-99385) 2022-11-14 16:35:37 -08:00
Eric Snow e874c2f198
gh-81057: Move the Remaining Import State Globals to _PyRuntimeState (gh-99488)
https://github.com/python/cpython/issues/81057
2022-11-14 15:56:16 -07:00
Eric Snow a088290f9d
gh-81057: Move Global Variables Holding Objects to _PyRuntimeState. (gh-99487)
This moves nearly all remaining object-holding globals in core code (other than static types).

https://github.com/python/cpython/issues/81057
2022-11-14 13:50:56 -07:00
Eric Snow 67807cfc87
gh-81057: Move the Allocators to _PyRuntimeState (gh-99217)
The global allocators were stored in 3 static global variables: _PyMem_Raw, _PyMem, and _PyObject.  State for the "small block" allocator was stored in another 13.  That makes a total of 16 global variables. We are moving all 16 to the _PyRuntimeState struct as part of the work for gh-81057.  (If PEP 684 is accepted then we will follow up by moving them all to PyInterpreterState.)

https://github.com/python/cpython/issues/81057
2022-11-11 16:30:46 -07:00
Eric Snow f531b6879b
gh-81057: Add PyInterpreterState.static_objects (gh-99397)
As we consolidate global variables, we find some objects that are almost suitable to add to _PyRuntimeState.global_objects, but have some small/sneaky bit of per-interpreter state (e.g. a weakref list). We're adding PyInterpreterState.static_objects so we can move such objects there. (We'll removed the _not_used field once we've added others.)

https://github.com/python/cpython/issues/81057
2022-11-11 14:24:18 -07:00
Eric Snow fe55ff3f68
gh-81057: Generate a Separate Initializer For Each Part of the Global Objects Initializer (gh-99389)
Up until now we had a single generated initializer macro for all the statically declared global objects in _PyRuntimeState, including several one-offs (e.g. the empty tuple). The one-offs don't need to be generated, but were because we had one big initializer. Having separate initializers for set of generated global objects allows us to generate only the ones we need to.  This allows us to add initializers for one-off global objects without having to generate them.

https://github.com/python/cpython/issues/81057
2022-11-11 13:23:41 -07:00
Kumar Aditya be0d5008b3
GH-90699: Remove remaining `_Py_IDENTIFIER` stdlib usage (GH-99067) 2022-11-07 12:06:23 -08:00
Mark Shannon 76449350b3
GH-91079: Decouple C stack overflow checks from Python recursion checks. (GH-96510) 2022-10-05 01:34:03 +01:00
Kumar Aditya 6dab8c95bd
GH-96458: Statically initialize utf8 representation of static strings (#96481) 2022-09-02 23:43:08 -07:00
Kumar Aditya 71697664d7
GH-90699: Move generated static initializer to pycore_runtime_generated.h (GH-94051) 2022-07-07 13:04:05 -07:00
Victor Stinner 7ad6f74fcf
gh-87347: Add parenthesis around macro arguments (#93915)
Add unit test on Py_MEMBER_SIZE() and some other macros.
2022-06-20 16:04:52 +02:00
Serhiy Storchaka 6fd4c8ec77
gh-93741: Add private C API _PyImport_GetModuleAttrString() (GH-93742)
It combines PyImport_ImportModule() and PyObject_GetAttrString()
and saves 4-6 lines of code on every use.

Add also _PyImport_GetModuleAttr() which takes Python strings as arguments.
2022-06-14 07:15:26 +03:00
Kumar Aditya 9331087966
GH-90699: use statically allocated strings in typeobject.c (gh-93751) 2022-06-13 01:38:18 +09:00
Serhiy Storchaka 3473817106
gh-91162: Support splitting of unpacked arbitrary-length tuple over TypeVar and TypeVarTuple parameters (alt) (GH-93412)
For example:

  A[T, *Ts][*tuple[int, ...]] -> A[int, *tuple[int, ...]]
  A[*Ts, T][*tuple[int, ...]] -> A[*tuple[int, ...], int]
2022-06-12 16:22:01 +03:00