From 13a00078b81776b23b0b6add69b848382240d1f2 Mon Sep 17 00:00:00 2001
From: Victor Stinner
Date: Thu, 31 Aug 2023 18:33:34 +0200
Subject: [PATCH] gh-108634: Py_TRACE_REFS uses a hash table (#108663)

Python built with "configure --with-trace-refs" (tracing references)
is now ABI compatible with Python release build and debug build.
Moreover, it now also supports the Limited API.

Change Py_TRACE_REFS build:

* Remove _PyObject_EXTRA_INIT macro.
* The PyObject structure no longer has two extra members (_ob_prev
  and _ob_next).
* Use a hash table (_Py_hashtable_t) to trace references (all
  objects): PyInterpreterState.object_state.refchain.
* Py_TRACE_REFS build is now ABI compatible with release build and
  debug build.
* Limited C API extensions can now be built with Py_TRACE_REFS:
  xxlimited, xxlimited_35, _testclinic_limited.
* No longer rename PyModule_Create2() and PyModule_FromDefAndSpec2()
  functions to PyModule_Create2TraceRefs() and
  PyModule_FromDefAndSpec2TraceRefs().
* _Py_PrintReferenceAddresses() is now called before
  finalize_interp_delete() which deletes the refchain hash table.
* test_tracemalloc find_trace() now also filters by size to ignore
  the memory allocated by _PyRefchain_Trace().

Test changes for Py_TRACE_REFS:

* Add test.support.Py_TRACE_REFS constant.
* Add test_sys.test_getobjects() to test sys.getobjects() function.
* test_exceptions skips test_recursion_normalizing_with_no_memory()
  and test_memory_error_in_PyErr_PrintEx() if Python is built with
  Py_TRACE_REFS.
* test_repl skips test_no_memory().
* test_capi skips test_set_nomemory().
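Illustration only, not part of the change: a minimal sketch of how a
script might detect a tracing-references build at runtime and query
the traced objects. It assumes the same hasattr(sys, "getobjects")
check that the patch uses to define test.support.Py_TRACE_REFS. The
names PY_TRACE_REFS, count_live_instances and Marker are hypothetical
and do not appear in the patch.

```python
import sys

# Same detection the patch uses for test.support.Py_TRACE_REFS:
# sys.getobjects() only exists when Python was configured with
# --with-trace-refs.
PY_TRACE_REFS = hasattr(sys, "getobjects")


def count_live_instances(cls, limit=0):
    """Return how many traced objects have exactly the type `cls`.

    Returns None on builds without Py_TRACE_REFS. A limit of 0 means
    "no limit", matching sys.getobjects(0, type) in the new test.
    (Hypothetical helper, for illustration only.)
    """
    if not PY_TRACE_REFS:
        return None
    return len(sys.getobjects(limit, cls))


class Marker:
    pass


# Keep the instances referenced so they are still alive when counted.
markers = [Marker() for _ in range(5)]
result = count_live_instances(Marker)
print("Py_TRACE_REFS build:", PY_TRACE_REFS, "- live Marker objects:", result)
```

On a regular release or debug build the helper returns None, so the
same script can run unchanged on any build.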
--- Doc/c-api/typeobj.rst | 22 -- Doc/using/configure.rst | 13 +- Doc/whatsnew/3.13.rst | 9 + Include/internal/pycore_object.h | 5 +- Include/internal/pycore_object_state.h | 11 +- Include/internal/pycore_runtime_init.h | 2 +- Include/modsupport.h | 8 - Include/object.h | 21 -- Include/pyport.h | 6 - Lib/test/support/__init__.py | 5 +- Lib/test/test_capi/test_mem.py | 3 + Lib/test/test_exceptions.py | 6 + Lib/test/test_import/__init__.py | 4 +- Lib/test/test_repl.py | 4 + Lib/test/test_sys.py | 21 ++ Lib/test/test_tracemalloc.py | 20 +- ...-08-30-02-52-52.gh-issue-108634.3dpBvf.rst | 3 + ...-08-30-02-54-06.gh-issue-108634.oV3Xzk.rst | 3 + Misc/SpecialBuilds.txt | 10 +- Modules/_testcapi/parts.h | 21 -- Objects/object.c | 272 +++++++++++------- Objects/setobject.c | 1 - Objects/sliceobject.c | 1 - Objects/structseq.c | 5 +- Objects/typeobject.c | 2 +- Python/bltinmodule.c | 2 +- Python/hashtable.c | 1 - Python/pylifecycle.c | 29 +- Python/pystate.c | 4 + configure | 11 +- configure.ac | 12 +- 31 files changed, 293 insertions(+), 244 deletions(-) create mode 100644 Misc/NEWS.d/next/Build/2023-08-30-02-52-52.gh-issue-108634.3dpBvf.rst create mode 100644 Misc/NEWS.d/next/C API/2023-08-30-02-54-06.gh-issue-108634.oV3Xzk.rst diff --git a/Doc/c-api/typeobj.rst b/Doc/c-api/typeobj.rst index f417c68fd1e..1fa3f2a6f53 100644 --- a/Doc/c-api/typeobj.rst +++ b/Doc/c-api/typeobj.rst @@ -528,28 +528,6 @@ type objects) *must* have the :c:member:`~PyVarObject.ob_size` field. This field is inherited by subtypes. -.. c:member:: PyObject* PyObject._ob_next - PyObject* PyObject._ob_prev - - These fields are only present when the macro ``Py_TRACE_REFS`` is defined - (see the :option:`configure --with-trace-refs option <--with-trace-refs>`). - - Their initialization to ``NULL`` is taken care of by the - ``PyObject_HEAD_INIT`` macro. For :ref:`statically allocated objects - `, these fields always remain ``NULL``. 
For :ref:`dynamically - allocated objects `, these two fields are used to link the - object into a doubly linked list of *all* live objects on the heap. - - This could be used for various debugging purposes; currently the only uses - are the :func:`sys.getobjects` function and to print the objects that are - still alive at the end of a run when the environment variable - :envvar:`PYTHONDUMPREFS` is set. - - **Inheritance:** - - These fields are not inherited by subtypes. - - PyVarObject Slots ----------------- diff --git a/Doc/using/configure.rst b/Doc/using/configure.rst index a16a4afffb1..fe35372603f 100644 --- a/Doc/using/configure.rst +++ b/Doc/using/configure.rst @@ -425,8 +425,7 @@ See also the :ref:`Python Development Mode ` and the .. versionchanged:: 3.8 Release builds and debug builds are now ABI compatible: defining the ``Py_DEBUG`` macro no longer implies the ``Py_TRACE_REFS`` macro (see the - :option:`--with-trace-refs` option), which introduces the only ABI - incompatibility. + :option:`--with-trace-refs` option). Debug options @@ -447,8 +446,14 @@ Debug options * Add :func:`sys.getobjects` function. * Add :envvar:`PYTHONDUMPREFS` environment variable. - This build is not ABI compatible with release build (default build) or debug - build (``Py_DEBUG`` and ``Py_REF_DEBUG`` macros). + The :envvar:`PYTHONDUMPREFS` environment variable can be used to dump + objects and reference counts still alive at Python exit. + + :ref:`Statically allocated objects ` are not traced. + + .. versionchanged:: 3.13 + This build is now ABI compatible with release build and :ref:`debug build + `. .. versionadded:: 3.8 diff --git a/Doc/whatsnew/3.13.rst b/Doc/whatsnew/3.13.rst index 298d5fb5677..fd2f2c3fff8 100644 --- a/Doc/whatsnew/3.13.rst +++ b/Doc/whatsnew/3.13.rst @@ -828,6 +828,11 @@ Build Changes * SQLite 3.15.2 or newer is required to build the :mod:`sqlite3` extension module. (Contributed by Erlend Aasland in :gh:`105875`.) 
+* Python built with :file:`configure` :option:`--with-trace-refs` (tracing + references) is now ABI compatible with Python release build and + :ref:`debug build `. + (Contributed by Victor Stinner in :gh:`108634`.) + C API Changes ============= @@ -900,6 +905,10 @@ New Features (with an underscore prefix). (Contributed by Victor Stinner in :gh:`108014`.) +* Python built with :file:`configure` :option:`--with-trace-refs` (tracing + references) now supports the :ref:`Limited API `. + (Contributed by Victor Stinner in :gh:`108634`.) + Porting to Python 3.13 ---------------------- diff --git a/Include/internal/pycore_object.h b/Include/internal/pycore_object.h index 7c142b384d1..d842816e673 100644 --- a/Include/internal/pycore_object.h +++ b/Include/internal/pycore_object.h @@ -55,7 +55,6 @@ PyAPI_FUNC(int) _PyObject_IsFreed(PyObject *); backwards compatible solution */ #define _PyObject_HEAD_INIT(type) \ { \ - _PyObject_EXTRA_INIT \ .ob_refcnt = _Py_IMMORTAL_REFCNT, \ .ob_type = (type) \ }, @@ -184,6 +183,8 @@ _PyType_HasFeature(PyTypeObject *type, unsigned long feature) { extern void _PyType_InitCache(PyInterpreterState *interp); extern void _PyObject_InitState(PyInterpreterState *interp); +extern void _PyObject_FiniState(PyInterpreterState *interp); +extern bool _PyRefchain_IsTraced(PyInterpreterState *interp, PyObject *obj); /* Inline functions trading binary compatibility for speed: _PyObject_Init() is the fast version of PyObject_Init(), and @@ -302,7 +303,7 @@ extern void _PyDebug_PrintTotalRefs(void); #endif #ifdef Py_TRACE_REFS -extern void _Py_AddToAllObjects(PyObject *op, int force); +extern void _Py_AddToAllObjects(PyObject *op); extern void _Py_PrintReferences(PyInterpreterState *, FILE *); extern void _Py_PrintReferenceAddresses(PyInterpreterState *, FILE *); #endif diff --git a/Include/internal/pycore_object_state.h b/Include/internal/pycore_object_state.h index 65feb5af969..9eac27b1a9a 100644 --- a/Include/internal/pycore_object_state.h +++ 
b/Include/internal/pycore_object_state.h @@ -8,6 +8,8 @@ extern "C" { # error "this header requires Py_BUILD_CORE define" #endif +#include "pycore_hashtable.h" // _Py_hashtable_t + struct _py_object_runtime_state { #ifdef Py_REF_DEBUG Py_ssize_t interpreter_leaks; @@ -20,11 +22,10 @@ struct _py_object_state { Py_ssize_t reftotal; #endif #ifdef Py_TRACE_REFS - /* Head of circular doubly-linked list of all objects. These are linked - * together via the _ob_prev and _ob_next members of a PyObject, which - * exist only in a Py_TRACE_REFS build. - */ - PyObject refchain; + // Hash table storing all objects. The key is the object pointer + // (PyObject*) and the value is always the number 1 (as uintptr_t). + // See _PyRefchain_IsTraced() and _PyRefchain_Trace() functions. + _Py_hashtable_t *refchain; #endif int _not_used; }; diff --git a/Include/internal/pycore_runtime_init.h b/Include/internal/pycore_runtime_init.h index c775a8a7e7e..2deba02a89f 100644 --- a/Include/internal/pycore_runtime_init.h +++ b/Include/internal/pycore_runtime_init.h @@ -192,7 +192,7 @@ extern PyTypeObject _PyExc_MemoryError; #ifdef Py_TRACE_REFS # define _py_object_state_INIT(INTERP) \ { \ - .refchain = {&INTERP.object_state.refchain, &INTERP.object_state.refchain}, \ + .refchain = NULL, \ } #else # define _py_object_state_INIT(INTERP) \ diff --git a/Include/modsupport.h b/Include/modsupport.h index 88577e027b5..7c15ab50c32 100644 --- a/Include/modsupport.h +++ b/Include/modsupport.h @@ -111,14 +111,6 @@ PyAPI_FUNC(int) PyModule_ExecDef(PyObject *module, PyModuleDef *def); #define PYTHON_ABI_VERSION 3 #define PYTHON_ABI_STRING "3" -#ifdef Py_TRACE_REFS - /* When we are tracing reference counts, rename module creation functions so - modules compiled with incompatible settings will generate a - link-time error. 
*/ - #define PyModule_Create2 PyModule_Create2TraceRefs - #define PyModule_FromDefAndSpec2 PyModule_FromDefAndSpec2TraceRefs -#endif - PyAPI_FUNC(PyObject *) PyModule_Create2(PyModuleDef*, int apiver); #ifdef Py_LIMITED_API diff --git a/Include/object.h b/Include/object.h index d82eb613874..de2a1ce0f3c 100644 --- a/Include/object.h +++ b/Include/object.h @@ -58,23 +58,6 @@ whose size is determined when the object is allocated. # define Py_REF_DEBUG #endif -#if defined(Py_LIMITED_API) && defined(Py_TRACE_REFS) -# error Py_LIMITED_API is incompatible with Py_TRACE_REFS -#endif - -#ifdef Py_TRACE_REFS -/* Define pointers to support a doubly-linked list of all live heap objects. */ -#define _PyObject_HEAD_EXTRA \ - PyObject *_ob_next; \ - PyObject *_ob_prev; - -#define _PyObject_EXTRA_INIT _Py_NULL, _Py_NULL, - -#else -# define _PyObject_HEAD_EXTRA -# define _PyObject_EXTRA_INIT -#endif - /* PyObject_HEAD defines the initial segment of every PyObject. */ #define PyObject_HEAD PyObject ob_base; @@ -130,14 +113,12 @@ check by comparing the reference count field to the immortality reference count. #ifdef Py_BUILD_CORE #define PyObject_HEAD_INIT(type) \ { \ - _PyObject_EXTRA_INIT \ { _Py_IMMORTAL_REFCNT }, \ (type) \ }, #else #define PyObject_HEAD_INIT(type) \ { \ - _PyObject_EXTRA_INIT \ { 1 }, \ (type) \ }, @@ -164,8 +145,6 @@ check by comparing the reference count field to the immortality reference count. * in addition, be cast to PyVarObject*. 
*/ struct _object { - _PyObject_HEAD_EXTRA - #if (defined(__GNUC__) || defined(__clang__)) \ && !(defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112L) // On C99 and older, anonymous union is a GCC and clang extension diff --git a/Include/pyport.h b/Include/pyport.h index 2dc24138924..115b54fd969 100644 --- a/Include/pyport.h +++ b/Include/pyport.h @@ -684,12 +684,6 @@ extern char * _getpty(int *, int, mode_t, int); # endif #endif -/* Check that ALT_SOABI is consistent with Py_TRACE_REFS: - ./configure --with-trace-refs should must be used to define Py_TRACE_REFS */ -#if defined(ALT_SOABI) && defined(Py_TRACE_REFS) -# error "Py_TRACE_REFS ABI is not compatible with release and debug ABI" -#endif - #if defined(__ANDROID__) || defined(__VXWORKS__) // Use UTF-8 as the locale encoding, ignore the LC_CTYPE locale. // See _Py_GetLocaleEncoding(), PyUnicode_DecodeLocale() diff --git a/Lib/test/support/__init__.py b/Lib/test/support/__init__.py index c3f8527bd69..16a5056a33a 100644 --- a/Lib/test/support/__init__.py +++ b/Lib/test/support/__init__.py @@ -779,9 +779,6 @@ def python_is_optimized(): _header = 'nP' _align = '0n' -if hasattr(sys, "getobjects"): - _header = '2P' + _header - _align = '0P' _vheader = _header + 'n' def calcobjsize(fmt): @@ -2469,3 +2466,5 @@ C_RECURSION_LIMIT = 1500 #Windows doesn't have os.uname() but it doesn't support s390x. skip_on_s390x = unittest.skipIf(hasattr(os, 'uname') and os.uname().machine == 's390x', 'skipped on s390x') + +Py_TRACE_REFS = hasattr(sys, 'getobjects') diff --git a/Lib/test/test_capi/test_mem.py b/Lib/test/test_capi/test_mem.py index 527000875b7..72f23b1a340 100644 --- a/Lib/test/test_capi/test_mem.py +++ b/Lib/test/test_capi/test_mem.py @@ -112,6 +112,9 @@ class PyMemDebugTests(unittest.TestCase): def test_pyobject_freed_is_freed(self): self.check_pyobject_is_freed('check_pyobject_freed_is_freed') + # Python built with Py_TRACE_REFS fail with a fatal error in + # _PyRefchain_Trace() on memory allocation error. 
+ @unittest.skipIf(support.Py_TRACE_REFS, 'cannot test Py_TRACE_REFS build') def test_set_nomemory(self): code = """if 1: import _testcapi diff --git a/Lib/test/test_exceptions.py b/Lib/test/test_exceptions.py index 764122ed4ef..c766f4d4331 100644 --- a/Lib/test/test_exceptions.py +++ b/Lib/test/test_exceptions.py @@ -1484,6 +1484,9 @@ class ExceptionTests(unittest.TestCase): @cpython_only + # Python built with Py_TRACE_REFS fail with a fatal error in + # _PyRefchain_Trace() on memory allocation error. + @unittest.skipIf(support.Py_TRACE_REFS, 'cannot test Py_TRACE_REFS build') def test_recursion_normalizing_with_no_memory(self): # Issue #30697. Test that in the abort that occurs when there is no # memory left and the size of the Python frames stack is greater than @@ -1652,6 +1655,9 @@ class ExceptionTests(unittest.TestCase): self.assertTrue(report.endswith("\n")) @cpython_only + # Python built with Py_TRACE_REFS fail with a fatal error in + # _PyRefchain_Trace() on memory allocation error. 
+ @unittest.skipIf(support.Py_TRACE_REFS, 'cannot test Py_TRACE_REFS build') def test_memory_error_in_PyErr_PrintEx(self): code = """if 1: import _testcapi diff --git a/Lib/test/test_import/__init__.py b/Lib/test/test_import/__init__.py index 740ce7d5ef2..33bce779f6c 100644 --- a/Lib/test/test_import/__init__.py +++ b/Lib/test/test_import/__init__.py @@ -28,7 +28,7 @@ import _imp from test.support import os_helper from test.support import ( STDLIB_DIR, swap_attr, swap_item, cpython_only, is_emscripten, - is_wasi, run_in_subinterp, run_in_subinterp_with_config) + is_wasi, run_in_subinterp, run_in_subinterp_with_config, Py_TRACE_REFS) from test.support.import_helper import ( forget, make_legacy_pyc, unlink, unload, DirsOnSysPath, CleanImport) from test.support.os_helper import ( @@ -2555,7 +2555,7 @@ class SinglephaseInitTests(unittest.TestCase): def test_basic_multiple_interpreters_deleted_no_reset(self): # without resetting; already loaded in a deleted interpreter - if hasattr(sys, 'getobjects'): + if Py_TRACE_REFS: # It's a Py_TRACE_REFS build. # This test breaks interpreter isolation a little, # which causes problems on Py_TRACE_REF builds. diff --git a/Lib/test/test_repl.py b/Lib/test/test_repl.py index ddb4aa68048..58392f2384a 100644 --- a/Lib/test/test_repl.py +++ b/Lib/test/test_repl.py @@ -5,6 +5,7 @@ import os import unittest import subprocess from textwrap import dedent +from test import support from test.support import cpython_only, has_subprocess_support, SuppressCrashReport from test.support.script_helper import kill_python @@ -59,6 +60,9 @@ def run_on_interactive_mode(source): class TestInteractiveInterpreter(unittest.TestCase): @cpython_only + # Python built with Py_TRACE_REFS fail with a fatal error in + # _PyRefchain_Trace() on memory allocation error. + @unittest.skipIf(support.Py_TRACE_REFS, 'cannot test Py_TRACE_REFS build') def test_no_memory(self): # Issue #30696: Fix the interactive interpreter looping endlessly when # no memory. 
Check also that the fix does not break the interactive diff --git a/Lib/test/test_sys.py b/Lib/test/test_sys.py index f3608ce142f..d8b684c8a00 100644 --- a/Lib/test/test_sys.py +++ b/Lib/test/test_sys.py @@ -1174,6 +1174,27 @@ class SysModuleTest(unittest.TestCase): self.assertEqual(os.path.normpath(sys._stdlib_dir), os.path.normpath(expected)) + @unittest.skipUnless(hasattr(sys, 'getobjects'), 'need sys.getobjects()') + def test_getobjects(self): + # sys.getobjects(0) + all_objects = sys.getobjects(0) + self.assertIsInstance(all_objects, list) + self.assertGreater(len(all_objects), 0) + + # sys.getobjects(0, MyType) + class MyType: + pass + size = 100 + my_objects = [MyType() for _ in range(size)] + get_objects = sys.getobjects(0, MyType) + self.assertEqual(len(get_objects), size) + for obj in get_objects: + self.assertIsInstance(obj, MyType) + + # sys.getobjects(3, MyType) + get_objects = sys.getobjects(3, MyType) + self.assertEqual(len(get_objects), 3) + @test.support.cpython_only class UnraisableHookTest(unittest.TestCase): diff --git a/Lib/test/test_tracemalloc.py b/Lib/test/test_tracemalloc.py index 4af4ca3b977..bea12452103 100644 --- a/Lib/test/test_tracemalloc.py +++ b/Lib/test/test_tracemalloc.py @@ -173,9 +173,11 @@ class TestTracemallocEnabled(unittest.TestCase): self.assertEqual(len(traceback), 1) self.assertEqual(traceback, obj_traceback) - def find_trace(self, traces, traceback): + def find_trace(self, traces, traceback, size): + # filter also by size to ignore the memory allocated by + # _PyRefchain_Trace() if Python is built with Py_TRACE_REFS. 
for trace in traces: - if trace[2] == traceback._frames: + if trace[2] == traceback._frames and trace[1] == size: return trace self.fail("trace not found") @@ -186,11 +188,10 @@ class TestTracemallocEnabled(unittest.TestCase): obj, obj_traceback = allocate_bytes(obj_size) traces = tracemalloc._get_traces() - trace = self.find_trace(traces, obj_traceback) + trace = self.find_trace(traces, obj_traceback, obj_size) self.assertIsInstance(trace, tuple) domain, size, traceback, length = trace - self.assertEqual(size, obj_size) self.assertEqual(traceback, obj_traceback._frames) tracemalloc.stop() @@ -208,17 +209,18 @@ class TestTracemallocEnabled(unittest.TestCase): # Ensure that two identical tracebacks are not duplicated tracemalloc.stop() tracemalloc.start(4) - obj_size = 123 - obj1, obj1_traceback = allocate_bytes4(obj_size) - obj2, obj2_traceback = allocate_bytes4(obj_size) + obj1_size = 123 + obj2_size = 125 + obj1, obj1_traceback = allocate_bytes4(obj1_size) + obj2, obj2_traceback = allocate_bytes4(obj2_size) traces = tracemalloc._get_traces() obj1_traceback._frames = tuple(reversed(obj1_traceback._frames)) obj2_traceback._frames = tuple(reversed(obj2_traceback._frames)) - trace1 = self.find_trace(traces, obj1_traceback) - trace2 = self.find_trace(traces, obj2_traceback) + trace1 = self.find_trace(traces, obj1_traceback, obj1_size) + trace2 = self.find_trace(traces, obj2_traceback, obj2_size) domain1, size1, traceback1, length1 = trace1 domain2, size2, traceback2, length2 = trace2 self.assertIs(traceback2, traceback1) diff --git a/Misc/NEWS.d/next/Build/2023-08-30-02-52-52.gh-issue-108634.3dpBvf.rst b/Misc/NEWS.d/next/Build/2023-08-30-02-52-52.gh-issue-108634.3dpBvf.rst new file mode 100644 index 00000000000..d1530787067 --- /dev/null +++ b/Misc/NEWS.d/next/Build/2023-08-30-02-52-52.gh-issue-108634.3dpBvf.rst @@ -0,0 +1,3 @@ +Python built with :file:`configure` :option:`--with-trace-refs` (tracing +references) is now ABI compatible with Python release build and 
:ref:`debug +build `. Patch by Victor Stinner. diff --git a/Misc/NEWS.d/next/C API/2023-08-30-02-54-06.gh-issue-108634.oV3Xzk.rst b/Misc/NEWS.d/next/C API/2023-08-30-02-54-06.gh-issue-108634.oV3Xzk.rst new file mode 100644 index 00000000000..0427644ad37 --- /dev/null +++ b/Misc/NEWS.d/next/C API/2023-08-30-02-54-06.gh-issue-108634.oV3Xzk.rst @@ -0,0 +1,3 @@ +Python built with :file:`configure` :option:`--with-trace-refs` (tracing +references) now supports the :ref:`Limited API `. Patch by +Victor Stinner. diff --git a/Misc/SpecialBuilds.txt b/Misc/SpecialBuilds.txt index 5609928284d..78201bfbd67 100644 --- a/Misc/SpecialBuilds.txt +++ b/Misc/SpecialBuilds.txt @@ -43,13 +43,9 @@ Py_TRACE_REFS Build option: ``./configure --with-trace-refs``. -Turn on heavy reference debugging. This is major surgery. Every PyObject grows -two more pointers, to maintain a doubly-linked list of all live heap-allocated -objects. Most built-in type objects are not in this list, as they're statically -allocated. - -Note that because the fundamental PyObject layout changes, Python modules -compiled with Py_TRACE_REFS are incompatible with modules compiled without it. +Turn on heavy reference debugging. This is major surgery. All live +heap-allocated objects are traced in a hash table. Most built-in type objects +are not in this list, as they're statically allocated. Special gimmicks: diff --git a/Modules/_testcapi/parts.h b/Modules/_testcapi/parts.h index 65ebf80bcd1..9c6d6157141 100644 --- a/Modules/_testcapi/parts.h +++ b/Modules/_testcapi/parts.h @@ -1,27 +1,9 @@ #ifndef Py_TESTCAPI_PARTS_H #define Py_TESTCAPI_PARTS_H -#include "pyconfig.h" // for Py_TRACE_REFS - -// Figure out if Limited API is available for this build. If it isn't we won't -// build tests for it. -// Currently, only Py_TRACE_REFS disables Limited API. 
-#ifdef Py_TRACE_REFS -#undef LIMITED_API_AVAILABLE -#else -#define LIMITED_API_AVAILABLE 1 -#endif - // Always enable assertions #undef NDEBUG -#if !defined(LIMITED_API_AVAILABLE) && defined(Py_LIMITED_API) -// Limited API being unavailable means that with Py_LIMITED_API defined -// we can't even include Python.h. -// Do nothing; the .c file that defined Py_LIMITED_API should also do nothing. - -#else - #include "Python.h" int _PyTestCapi_Init_Vectorcall(PyObject *module); @@ -44,10 +26,7 @@ int _PyTestCapi_Init_PyOS(PyObject *module); int _PyTestCapi_Init_Immortal(PyObject *module); int _PyTestCapi_Init_GC(PyObject *mod); -#ifdef LIMITED_API_AVAILABLE int _PyTestCapi_Init_VectorcallLimited(PyObject *module); int _PyTestCapi_Init_HeaptypeRelative(PyObject *module); -#endif // LIMITED_API_AVAILABLE -#endif #endif // Py_TESTCAPI_PARTS_H diff --git a/Objects/object.c b/Objects/object.c index 0d88421bf0f..a4d7111a686 100644 --- a/Objects/object.c +++ b/Objects/object.c @@ -9,6 +9,7 @@ #include "pycore_dict.h" // _PyObject_MakeDictFromInstanceAttributes() #include "pycore_floatobject.h" // _PyFloat_DebugMallocStats() #include "pycore_initconfig.h" // _PyStatus_EXCEPTION() +#include "pycore_hashtable.h" // _Py_hashtable_new() #include "pycore_memoryobject.h" // _PyManagedBuffer_Type #include "pycore_namespace.h" // _PyNamespace_Type #include "pycore_object.h" // PyAPI_DATA() _Py_SwappedOp definition @@ -162,44 +163,51 @@ _PyDebug_PrintTotalRefs(void) { #ifdef Py_TRACE_REFS -#define REFCHAIN(interp) &interp->object_state.refchain +#define REFCHAIN(interp) interp->object_state.refchain +#define REFCHAIN_VALUE ((void*)(uintptr_t)1) -static inline void -init_refchain(PyInterpreterState *interp) +bool +_PyRefchain_IsTraced(PyInterpreterState *interp, PyObject *obj) { - PyObject *refchain = REFCHAIN(interp); - refchain->_ob_prev = refchain; - refchain->_ob_next = refchain; + return (_Py_hashtable_get(REFCHAIN(interp), obj) == REFCHAIN_VALUE); } -/* Insert op at the front of 
the list of all objects. If force is true, - * op is added even if _ob_prev and _ob_next are non-NULL already. If - * force is false amd _ob_prev or _ob_next are non-NULL, do nothing. - * force should be true if and only if op points to freshly allocated, - * uninitialized memory, or you've unlinked op from the list and are - * relinking it into the front. - * Note that objects are normally added to the list via _Py_NewReference, - * which is called by PyObject_Init. Not all objects are initialized that - * way, though; exceptions include statically allocated type objects, and - * statically allocated singletons (like Py_True and Py_None). - */ -void -_Py_AddToAllObjects(PyObject *op, int force) + +static void +_PyRefchain_Trace(PyInterpreterState *interp, PyObject *obj) { -#ifdef Py_DEBUG - if (!force) { - /* If it's initialized memory, op must be in or out of - * the list unambiguously. - */ - _PyObject_ASSERT(op, (op->_ob_prev == NULL) == (op->_ob_next == NULL)); + if (_Py_hashtable_set(REFCHAIN(interp), obj, REFCHAIN_VALUE) < 0) { + // Use a fatal error because _Py_NewReference() cannot report + // the error to the caller. + Py_FatalError("_Py_hashtable_set() memory allocation failed"); } +} + + +static void +_PyRefchain_Remove(PyInterpreterState *interp, PyObject *obj) +{ + void *value = _Py_hashtable_steal(REFCHAIN(interp), obj); +#ifndef NDEBUG + assert(value == REFCHAIN_VALUE); +#else + (void)value; #endif - if (force || op->_ob_prev == NULL) { - PyObject *refchain = REFCHAIN(_PyInterpreterState_GET()); - op->_ob_next = refchain->_ob_next; - op->_ob_prev = refchain; - refchain->_ob_next->_ob_prev = op; - refchain->_ob_next = op; +} + + +/* Add an object to the refchain hash table. + * + * Note that objects are normally added to the list by PyObject_Init() + * indirectly. Not all objects are initialized that way, though; exceptions + * include statically allocated type objects, and statically allocated + * singletons (like Py_True and Py_None). 
*/ +void +_Py_AddToAllObjects(PyObject *op) +{ + PyInterpreterState *interp = _PyInterpreterState_GET(); + if (!_PyRefchain_IsTraced(interp, op)) { + _PyRefchain_Trace(interp, op); } } #endif /* Py_TRACE_REFS */ @@ -471,16 +479,6 @@ _PyObject_IsFreed(PyObject *op) if (_PyMem_IsPtrFreed(op) || _PyMem_IsPtrFreed(Py_TYPE(op))) { return 1; } - /* ignore op->ob_ref: its value can have be modified - by Py_INCREF() and Py_DECREF(). */ -#ifdef Py_TRACE_REFS - if (op->_ob_next != NULL && _PyMem_IsPtrFreed(op->_ob_next)) { - return 1; - } - if (op->_ob_prev != NULL && _PyMem_IsPtrFreed(op->_ob_prev)) { - return 1; - } -#endif return 0; } @@ -1929,7 +1927,6 @@ PyTypeObject _PyNone_Type = { }; PyObject _Py_NoneStruct = { - _PyObject_EXTRA_INIT { _Py_IMMORTAL_REFCNT }, &_PyNone_Type }; @@ -2032,7 +2029,6 @@ PyTypeObject _PyNotImplemented_Type = { }; PyObject _Py_NotImplementedStruct = { - _PyObject_EXTRA_INIT { _Py_IMMORTAL_REFCNT }, &_PyNotImplemented_Type }; @@ -2042,12 +2038,30 @@ void _PyObject_InitState(PyInterpreterState *interp) { #ifdef Py_TRACE_REFS - if (!_Py_IsMainInterpreter(interp)) { - init_refchain(interp); + _Py_hashtable_allocator_t alloc = { + // Don't use default PyMem_Malloc() and PyMem_Free() which + // require the caller to hold the GIL. 
+ .malloc = PyMem_RawMalloc, + .free = PyMem_RawFree, + }; + REFCHAIN(interp) = _Py_hashtable_new_full( + _Py_hashtable_hash_ptr, _Py_hashtable_compare_direct, + NULL, NULL, &alloc); + if (REFCHAIN(interp) == NULL) { + Py_FatalError("_PyObject_InitState() memory allocation failure"); } #endif } +void +_PyObject_FiniState(PyInterpreterState *interp) +{ +#ifdef Py_TRACE_REFS + _Py_hashtable_destroy(REFCHAIN(interp)); + REFCHAIN(interp) = NULL; +#endif +} + extern PyTypeObject _PyAnextAwaitable_Type; extern PyTypeObject _PyLegacyEventHandler_Type; @@ -2230,7 +2244,7 @@ new_reference(PyObject *op) // Skip the immortal object check in Py_SET_REFCNT; always set refcnt to 1 op->ob_refcnt = 1; #ifdef Py_TRACE_REFS - _Py_AddToAllObjects(op, 1); + _Py_AddToAllObjects(op); #endif } @@ -2258,53 +2272,62 @@ _Py_ForgetReference(PyObject *op) _PyObject_ASSERT_FAILED_MSG(op, "negative refcnt"); } - PyObject *refchain = REFCHAIN(_PyInterpreterState_GET()); - if (op == refchain || - op->_ob_prev->_ob_next != op || op->_ob_next->_ob_prev != op) - { - _PyObject_ASSERT_FAILED_MSG(op, "invalid object chain"); - } + PyInterpreterState *interp = _PyInterpreterState_GET(); #ifdef SLOW_UNREF_CHECK - PyObject *p; - for (p = refchain->_ob_next; p != refchain; p = p->_ob_next) { - if (p == op) { - break; - } - } - if (p == refchain) { + if (!_PyRefchain_Get(interp, op)) { /* Not found */ _PyObject_ASSERT_FAILED_MSG(op, "object not found in the objects list"); } #endif - op->_ob_next->_ob_prev = op->_ob_prev; - op->_ob_prev->_ob_next = op->_ob_next; - op->_ob_next = op->_ob_prev = NULL; + _PyRefchain_Remove(interp, op); } +static int +_Py_PrintReference(_Py_hashtable_t *ht, + const void *key, const void *value, + void *user_data) +{ + PyObject *op = (PyObject*)key; + FILE *fp = (FILE *)user_data; + fprintf(fp, "%p [%zd] ", (void *)op, Py_REFCNT(op)); + if (PyObject_Print(op, fp, 0) != 0) { + PyErr_Clear(); + } + putc('\n', fp); + return 0; +} + + /* Print all live objects. 
Because PyObject_Print is called, the * interpreter must be in a healthy state. */ void _Py_PrintReferences(PyInterpreterState *interp, FILE *fp) { - PyObject *op; if (interp == NULL) { interp = _PyInterpreterState_Main(); } fprintf(fp, "Remaining objects:\n"); - PyObject *refchain = REFCHAIN(interp); - for (op = refchain->_ob_next; op != refchain; op = op->_ob_next) { - fprintf(fp, "%p [%zd] ", (void *)op, Py_REFCNT(op)); - if (PyObject_Print(op, fp, 0) != 0) { - PyErr_Clear(); - } - putc('\n', fp); - } + _Py_hashtable_foreach(REFCHAIN(interp), _Py_PrintReference, fp); } + +static int +_Py_PrintReferenceAddress(_Py_hashtable_t *ht, + const void *key, const void *value, + void *user_data) +{ + PyObject *op = (PyObject*)key; + FILE *fp = (FILE *)user_data; + fprintf(fp, "%p [%zd] %s\n", + (void *)op, Py_REFCNT(op), Py_TYPE(op)->tp_name); + return 0; +} + + /* Print the addresses of all live objects. Unlike _Py_PrintReferences, this * doesn't make any calls to the Python C API, so is always safe to call. 
  */
@@ -2315,47 +2338,96 @@ _Py_PrintReferences(PyInterpreterState *interp, FILE *fp)
 void
 _Py_PrintReferenceAddresses(PyInterpreterState *interp, FILE *fp)
 {
-    PyObject *op;
-    PyObject *refchain = REFCHAIN(interp);
     fprintf(fp, "Remaining object addresses:\n");
-    for (op = refchain->_ob_next; op != refchain; op = op->_ob_next)
-        fprintf(fp, "%p [%zd] %s\n", (void *)op,
-            Py_REFCNT(op), Py_TYPE(op)->tp_name);
+    _Py_hashtable_foreach(REFCHAIN(interp), _Py_PrintReferenceAddress, fp);
 }
 
+
+typedef struct {
+    PyObject *self;
+    PyObject *args;
+    PyObject *list;
+    PyObject *type;
+    Py_ssize_t limit;
+} _Py_GetObjectsData;
+
+enum {
+    _PY_GETOBJECTS_IGNORE = 0,
+    _PY_GETOBJECTS_ERROR = 1,
+    _PY_GETOBJECTS_STOP = 2,
+};
+
+static int
+_Py_GetObject(_Py_hashtable_t *ht,
+              const void *key, const void *value,
+              void *user_data)
+{
+    PyObject *op = (PyObject *)key;
+    _Py_GetObjectsData *data = user_data;
+    if (data->limit > 0) {
+        if (PyList_GET_SIZE(data->list) >= data->limit) {
+            return _PY_GETOBJECTS_STOP;
+        }
+    }
+
+    if (op == data->self) {
+        return _PY_GETOBJECTS_IGNORE;
+    }
+    if (op == data->args) {
+        return _PY_GETOBJECTS_IGNORE;
+    }
+    if (op == data->list) {
+        return _PY_GETOBJECTS_IGNORE;
+    }
+    if (data->type != NULL) {
+        if (op == data->type) {
+            return _PY_GETOBJECTS_IGNORE;
+        }
+        if (!Py_IS_TYPE(op, (PyTypeObject *)data->type)) {
+            return _PY_GETOBJECTS_IGNORE;
+        }
+    }
+
+    if (PyList_Append(data->list, op) < 0) {
+        return _PY_GETOBJECTS_ERROR;
+    }
+    return 0;
+}
+
+
 /* The implementation of sys.getobjects(). */
 PyObject *
 _Py_GetObjects(PyObject *self, PyObject *args)
 {
-    int i, n;
-    PyObject *t = NULL;
-    PyObject *res, *op;
-    PyInterpreterState *interp = _PyInterpreterState_GET();
-
-    if (!PyArg_ParseTuple(args, "i|O", &n, &t))
+    Py_ssize_t limit;
+    PyObject *type = NULL;
+    if (!PyArg_ParseTuple(args, "n|O", &limit, &type)) {
         return NULL;
-    PyObject *refchain = REFCHAIN(interp);
-    op = refchain->_ob_next;
-    res = PyList_New(0);
-    if (res == NULL)
-        return NULL;
-    for (i = 0; (n == 0 || i < n) && op != refchain; i++) {
-        while (op == self || op == args || op == res || op == t ||
-               (t != NULL && !Py_IS_TYPE(op, (PyTypeObject *) t))) {
-            op = op->_ob_next;
-            if (op == refchain)
-                return res;
-        }
-        if (PyList_Append(res, op) < 0) {
-            Py_DECREF(res);
-            return NULL;
-        }
-        op = op->_ob_next;
     }
-    return res;
+
+    PyObject *list = PyList_New(0);
+    if (list == NULL) {
+        return NULL;
+    }
+
+    _Py_GetObjectsData data = {
+        .self = self,
+        .args = args,
+        .list = list,
+        .type = type,
+        .limit = limit,
+    };
+    PyInterpreterState *interp = _PyInterpreterState_GET();
+    int res = _Py_hashtable_foreach(REFCHAIN(interp), _Py_GetObject, &data);
+    if (res == _PY_GETOBJECTS_ERROR) {
+        Py_DECREF(list);
+        return NULL;
+    }
+    return list;
 }
 
 #undef REFCHAIN
+#undef REFCHAIN_VALUE
 
 #endif /* Py_TRACE_REFS */
 
diff --git a/Objects/setobject.c b/Objects/setobject.c
index 6051e57731c..ae3f0b8d5e5 100644
--- a/Objects/setobject.c
+++ b/Objects/setobject.c
@@ -2548,7 +2548,6 @@ static PyTypeObject _PySetDummy_Type = {
 };
 
 static PyObject _dummy_struct = {
-    _PyObject_EXTRA_INIT
     { _Py_IMMORTAL_REFCNT },
     &_PySetDummy_Type
 };
diff --git a/Objects/sliceobject.c b/Objects/sliceobject.c
index 8cf654fb6f8..5ffc52ae674 100644
--- a/Objects/sliceobject.c
+++ b/Objects/sliceobject.c
@@ -98,7 +98,6 @@ PyTypeObject PyEllipsis_Type = {
 };
 
 PyObject _Py_EllipsisObject = {
-    _PyObject_EXTRA_INIT
     { _Py_IMMORTAL_REFCNT },
     &PyEllipsis_Type
 };
diff --git a/Objects/structseq.c b/Objects/structseq.c
index 6c07e636629..0ca622edc2b 100644
--- a/Objects/structseq.c
+++ b/Objects/structseq.c
@@ -573,9 +573,10 @@ PyStructSequence_InitType2(PyTypeObject *type, PyStructSequence_Desc *desc)
     Py_ssize_t n_members, n_unnamed_members;
 
 #ifdef Py_TRACE_REFS
-    /* if the type object was chained, unchain it first
+    /* if the type object was traced, remove it first
        before overwriting its storage */
-    if (type->ob_base.ob_base._ob_next) {
+    PyInterpreterState *interp = _PyInterpreterState_GET();
+    if (_PyRefchain_IsTraced(interp, (PyObject *)type)) {
         _Py_ForgetReference((PyObject *)type);
     }
 #endif
diff --git a/Objects/typeobject.c b/Objects/typeobject.c
index 7ce3de4d58d..67e059c3f74 100644
--- a/Objects/typeobject.c
+++ b/Objects/typeobject.c
@@ -7508,7 +7508,7 @@ type_ready(PyTypeObject *type, int rerunbuiltin)
      * to get type objects into the doubly-linked list of all objects.
      * Still, not all type objects go through PyType_Ready.
      */
-    _Py_AddToAllObjects((PyObject *)type, 0);
+    _Py_AddToAllObjects((PyObject *)type);
 #endif
 
     /* Initialize tp_dict: _PyType_IsReady() tests if tp_dict != NULL */
diff --git a/Python/bltinmodule.c b/Python/bltinmodule.c
index 30dd717b0fe..971067e2d4f 100644
--- a/Python/bltinmodule.c
+++ b/Python/bltinmodule.c
@@ -3110,7 +3110,7 @@ _PyBuiltin_Init(PyInterpreterState *interp)
      * result, programs leaking references to None and False (etc)
      * couldn't be diagnosed by examining sys.getobjects(0).
      */
-#define ADD_TO_ALL(OBJECT) _Py_AddToAllObjects((PyObject *)(OBJECT), 0)
+#define ADD_TO_ALL(OBJECT) _Py_AddToAllObjects((PyObject *)(OBJECT))
 #else
 #define ADD_TO_ALL(OBJECT) (void)0
 #endif
diff --git a/Python/hashtable.c b/Python/hashtable.c
index 4e22a1a5509..8f5e8168ba1 100644
--- a/Python/hashtable.c
+++ b/Python/hashtable.c
@@ -226,7 +226,6 @@ _Py_hashtable_set(_Py_hashtable_t *ht, const void *key, void *value)
     assert(entry == NULL);
 #endif
 
-
     entry = ht->alloc.malloc(sizeof(_Py_hashtable_entry_t));
     if (entry == NULL) {
         /* memory allocation failed */
diff --git a/Python/pylifecycle.c b/Python/pylifecycle.c
index 7d362af32cb..ee5d4981da5 100644
--- a/Python/pylifecycle.c
+++ b/Python/pylifecycle.c
@@ -1956,6 +1956,20 @@ Py_FinalizeEx(void)
     // XXX Ensure finalizer errors are handled properly.
 
     finalize_interp_clear(tstate);
+
+#ifdef Py_TRACE_REFS
+    /* Display addresses (& refcnts) of all objects still alive.
+     * An address can be used to find the repr of the object, printed
+     * above by _Py_PrintReferences. */
+    if (dump_refs) {
+        _Py_PrintReferenceAddresses(tstate->interp, stderr);
+    }
+    if (dump_refs_fp != NULL) {
+        _Py_PrintReferenceAddresses(tstate->interp, dump_refs_fp);
+        fclose(dump_refs_fp);
+    }
+#endif /* Py_TRACE_REFS */
+
     finalize_interp_delete(tstate->interp);
 
 #ifdef Py_REF_DEBUG
@@ -1966,21 +1980,6 @@ Py_FinalizeEx(void)
 #endif
     _Py_FinalizeAllocatedBlocks(runtime);
 
-#ifdef Py_TRACE_REFS
-    /* Display addresses (& refcnts) of all objects still alive.
-     * An address can be used to find the repr of the object, printed
-     * above by _Py_PrintReferences.
-     */
-
-    if (dump_refs) {
-        _Py_PrintReferenceAddresses(tstate->interp, stderr);
-    }
-
-    if (dump_refs_fp != NULL) {
-        _Py_PrintReferenceAddresses(tstate->interp, dump_refs_fp);
-        fclose(dump_refs_fp);
-    }
-#endif /* Py_TRACE_REFS */
 #ifdef WITH_PYMALLOC
     if (malloc_stats) {
         _PyObject_DebugMallocStats(stderr);
diff --git a/Python/pystate.c b/Python/pystate.c
index 01651d79f9a..4a8808f700e 100644
--- a/Python/pystate.c
+++ b/Python/pystate.c
@@ -674,6 +674,7 @@ init_interpreter(PyInterpreterState *interp,
                 _obmalloc_pools_INIT(interp->obmalloc.pools);
         memcpy(&interp->obmalloc.pools.used, temp, sizeof(temp));
     }
+    _PyObject_InitState(interp);
 
     _PyEval_InitState(interp, pending_lock);
 
@@ -1001,6 +1002,9 @@ PyInterpreterState_Delete(PyInterpreterState *interp)
     if (interp->id_mutex != NULL) {
         PyThread_free_lock(interp->id_mutex);
     }
+
+    _PyObject_FiniState(interp);
+
     free_interpreter(interp);
 }
 
diff --git a/configure b/configure
index 57e3307266c..7fe4aead29a 100755
--- a/configure
+++ b/configure
@@ -23571,8 +23571,9 @@ SOABI='cpython-'`echo $VERSION | tr -d .`${ABIFLAGS}${PLATFORM_TRIPLET:+-$PLATFO
 { printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $SOABI" >&5
 printf "%s\n" "$SOABI" >&6; }
 
-# Release and debug (Py_DEBUG) ABI are compatible, but not Py_TRACE_REFS ABI
-if test "$Py_DEBUG" = 'true' -a "$with_trace_refs" != "yes"; then
+# Release build, debug build (Py_DEBUG), and trace refs build (Py_TRACE_REFS)
+# are ABI compatible
+if test "$Py_DEBUG" = 'true'; then
   # Similar to SOABI but remove "d" flag from ABIFLAGS
   ALT_SOABI='cpython-'`echo $VERSION | tr -d .``echo $ABIFLAGS | tr -d d`${PLATFORM_TRIPLET:+-$PLATFORM_TRIPLET}
 
@@ -29962,7 +29963,7 @@ printf %s "checking for stdlib extension module _testclinic_limited... " >&6; }
         if test "$py_cv_module__testclinic_limited" != "n/a"
 then :
 
-    if test "$TEST_MODULES" = yes -a "$with_trace_refs" = "no"
+    if test "$TEST_MODULES" = yes
 then :
   if true
 then :
@@ -30267,7 +30268,7 @@ printf %s "checking for stdlib extension module xxlimited... " >&6; }
         if test "$py_cv_module_xxlimited" != "n/a"
 then :
 
-    if test "$with_trace_refs" = "no"
+    if true
 then :
   if test "$ac_cv_func_dlopen" = yes
 then :
@@ -30305,7 +30306,7 @@ printf %s "checking for stdlib extension module xxlimited_35... " >&6; }
         if test "$py_cv_module_xxlimited_35" != "n/a"
 then :
 
-    if test "$with_trace_refs" = "no"
+    if true
 then :
   if test "$ac_cv_func_dlopen" = yes
 then :
diff --git a/configure.ac b/configure.ac
index 6fb6e110647..5673b374353 100644
--- a/configure.ac
+++ b/configure.ac
@@ -5684,8 +5684,9 @@ AC_MSG_CHECKING([SOABI])
 SOABI='cpython-'`echo $VERSION | tr -d .`${ABIFLAGS}${PLATFORM_TRIPLET:+-$PLATFORM_TRIPLET}
 AC_MSG_RESULT([$SOABI])
 
-# Release and debug (Py_DEBUG) ABI are compatible, but not Py_TRACE_REFS ABI
-if test "$Py_DEBUG" = 'true' -a "$with_trace_refs" != "yes"; then
+# Release build, debug build (Py_DEBUG), and trace refs build (Py_TRACE_REFS)
+# are ABI compatible
+if test "$Py_DEBUG" = 'true'; then
   # Similar to SOABI but remove "d" flag from ABIFLAGS
   AC_SUBST([ALT_SOABI])
   ALT_SOABI='cpython-'`echo $VERSION | tr -d .``echo $ABIFLAGS | tr -d d`${PLATFORM_TRIPLET:+-$PLATFORM_TRIPLET}
@@ -7229,7 +7230,7 @@ PY_STDLIB_MOD([_hashlib], [], [test "$ac_cv_working_openssl_hashlib" = yes],
 dnl test modules
 PY_STDLIB_MOD([_testcapi], [test "$TEST_MODULES" = yes])
 PY_STDLIB_MOD([_testclinic], [test "$TEST_MODULES" = yes])
-PY_STDLIB_MOD([_testclinic_limited], [test "$TEST_MODULES" = yes -a "$with_trace_refs" = "no"])
+PY_STDLIB_MOD([_testclinic_limited], [test "$TEST_MODULES" = yes])
 PY_STDLIB_MOD([_testinternalcapi], [test "$TEST_MODULES" = yes])
 PY_STDLIB_MOD([_testbuffer], [test "$TEST_MODULES" = yes])
 PY_STDLIB_MOD([_testimportmultiple], [test "$TEST_MODULES" = yes], [test "$ac_cv_func_dlopen" = yes])
@@ -7241,10 +7242,9 @@ PY_STDLIB_MOD([_ctypes_test],
   [], [$LIBM])
 
 dnl Limited API template modules.
-dnl The limited C API is not compatible with the Py_TRACE_REFS macro.
 dnl Emscripten does not support shared libraries yet.
-PY_STDLIB_MOD([xxlimited], [test "$with_trace_refs" = "no"], [test "$ac_cv_func_dlopen" = yes])
-PY_STDLIB_MOD([xxlimited_35], [test "$with_trace_refs" = "no"], [test "$ac_cv_func_dlopen" = yes])
+PY_STDLIB_MOD([xxlimited], [], [test "$ac_cv_func_dlopen" = yes])
+PY_STDLIB_MOD([xxlimited_35], [], [test "$ac_cv_func_dlopen" = yes])
 
 # substitute multiline block, must come after last PY_STDLIB_MOD()
 AC_SUBST([MODULE_BLOCK])