Commit Graph

15 Commits

Author SHA1 Message Date
Victor Stinner 13a00078b8
gh-108634: Py_TRACE_REFS uses a hash table (#108663)
Python built with "configure --with-trace-refs" (tracing references)
is now ABI compatible with Python release build and debug build.
Moreover, it now also supports the Limited API.

Change Py_TRACE_REFS build:

* Remove _PyObject_EXTRA_INIT macro.
* The PyObject structure no longer has two extra members (_ob_prev
  and _ob_next).
* Use a hash table (_Py_hashtable_t) to trace references (all
  objects): PyInterpreterState.object_state.refchain.
* Py_TRACE_REFS build is now ABI compatible with release build and
  debug build.
* Limited C API extensions can now be built with Py_TRACE_REFS:
  xxlimited, xxlimited_35, _testclinic_limited.
* No longer rename PyModule_Create2() and PyModule_FromDefAndSpec2()
  functions to PyModule_Create2TraceRefs() and
  PyModule_FromDefAndSpec2TraceRefs().
* _Py_PrintReferenceAddresses() is now called before
  finalize_interp_delete() which deletes the refchain hash table.
* test_tracemalloc find_trace() now also filters by size to ignore
  the memory allocated by _PyRefchain_Trace().

Test changes for Py_TRACE_REFS:

* Add test.support.Py_TRACE_REFS constant.
* Add test_sys.test_getobjects() to test sys.getobjects() function.
* test_exceptions skips test_recursion_normalizing_with_no_memory()
  and test_memory_error_in_PyErr_PrintEx() if Python is built with
  Py_TRACE_REFS.
* test_repl skips test_no_memory().
* test_capi skisp test_set_nomemory().
2023-08-31 18:33:34 +02:00
Eric Snow b72947a8d2
gh-106931: Intern Statically Allocated Strings Globally (gh-107272)
We tried this before with a dict and for all interned strings.  That ran into problems due to interpreter isolation.  However, exclusively using a per-interpreter cache caused some inconsistency that can eliminate the benefit of interning.  Here we circle back to using a global cache, but only for statically allocated strings.  We also use a more-basic _Py_hashtable_t for that global cache instead of a dict.

Ideally we would only have the global cache, but the optional isolation of each interpreter's allocator means that a non-static string object must not outlive its interpreter.  Thus we would have to store a copy of each such interned string in the global cache, tied to the main interpreter.
2023-07-27 13:56:59 -06:00
Victor Stinner 89f9875448
gh-106320: Move private _PyHash API to the internal C API (#107026)
* No longer export most private _PyHash symbols, only export the ones
  which are needed by shared extensions.
* Modules/_xxtestfuzz/fuzzer.c now uses the internal C API.
2023-07-22 13:49:37 +00:00
Christian Heimes 4901ea9526
bpo-41061: Fix incorrect expressions in hashtable (GH-21028)
Signed-off-by: Christian Heimes <christian@python.org>
2020-06-22 00:41:48 -07:00
Victor Stinner d2dc827d16
bpo-40602: _Py_hashtable_set() reports rehash failure (GH-20077)
If _Py_hashtable_set() fails to grow the hash table (rehash), it now
fails rather than ignoring the error.
2020-05-14 22:44:32 +02:00
Victor Stinner a482dc500b
bpo-40602: Write unit tests for _Py_hashtable_t (GH-20091)
Cleanup also hashtable.c.
Rename _Py_hashtable_t members:

* Rename entries to nentries
* Rename num_buckets to nbuckets
2020-05-14 21:55:47 +02:00
Victor Stinner 42bae3a3d9
bpo-40602: Optimize _Py_hashtable_get_ptr() (GH-20066)
_Py_hashtable_get_entry_ptr() avoids comparing the entry hash:
compare directly keys.

Move _Py_hashtable_get_entry_ptr() just after
_Py_hashtable_get_entry_generic().
2020-05-13 05:36:23 +02:00
Victor Stinner 5b0a30354d
bpo-40609: _Py_hashtable_t values become void* (GH-20065)
_Py_hashtable_t values become regular "void *" pointers.

* Add _Py_hashtable_entry_t.data member
* Remove _Py_hashtable_t.data_size member
* Remove _Py_hashtable_t.get_func member. It is no longer needed
  to specialize _Py_hashtable_get() for a specific value size, since
  all entries now have the same size (void*).
* Remove the following macros:

  * _Py_HASHTABLE_GET()
  * _Py_HASHTABLE_SET()
  * _Py_HASHTABLE_SET_NODATA()
  * _Py_HASHTABLE_POP()

* Rename _Py_hashtable_pop() to _Py_hashtable_steal()
* _Py_hashtable_foreach() callback now gets key and value rather than
  entry.
* Remove _Py_hashtable_value_destroy_func type. value_destroy_func
  callback now only has a single parameter: data (void*).
2020-05-13 04:40:30 +02:00
Victor Stinner d95bd4214c
bpo-40609: _tracemalloc allocates traces (GH-20064)
Rewrite _tracemalloc to store "trace_t*" rather than directly
"trace_t" in traces hash tables. Traces are now allocated on the heap
memory, outside the hash table.

Add tracemalloc_copy_traces() and tracemalloc_copy_domains() helper
functions.

Remove _Py_hashtable_copy() function since there is no API to copy a
key or a value.

Remove also _Py_hashtable_delete() function which was commented.
2020-05-13 03:52:11 +02:00
Victor Stinner 2d0a3d682f
bpo-40609: Add destroy functions to _Py_hashtable (GH-20062)
Add key_destroy_func and value_destroy_func parameters to
_Py_hashtable_new_full().

marshal.c and _tracemalloc.c use these destroy functions.
2020-05-13 02:50:18 +02:00
Victor Stinner f9b3b582b8
bpo-40609: Remove _Py_hashtable_t.key_size (GH-20060)
Rewrite _Py_hashtable_t type to always store the key as
a "const void *" pointer. Add an explicit "key" member to
_Py_hashtable_entry_t.

Remove _Py_hashtable_t.key_size member.

hash and compare functions drop their hash table parameter, and their
'key' parameter type becomes "const void *".
2020-05-13 02:26:02 +02:00
Victor Stinner f453221c8b
bpo-40602: Add _Py_HashPointerRaw() function (GH-20056)
Add a new _Py_HashPointerRaw() function which avoids replacing -1
with -2 to micro-optimize hash table using pointer keys: using
_Py_hashtable_hash_ptr() hash function.
2020-05-12 18:46:20 +02:00
Victor Stinner 7c6e970775
bpo-40602: Optimize _Py_hashtable for pointer keys (GH-20051)
Optimize _Py_hashtable_get() and _Py_hashtable_get_entry() for
pointer keys:

* key_size == sizeof(void*)
* hash_func == _Py_hashtable_hash_ptr
* compare_func == _Py_hashtable_compare_direct

Changes:

* Add get_func and get_entry_func members to _Py_hashtable_t
* Convert _Py_hashtable_get() and _Py_hashtable_get_entry() functions
  to static nline functions.
* Add specialized get and get entry for pointer keys.
2020-05-12 13:31:59 +02:00
Victor Stinner d0919f0d6b
bpo-40602: _Py_hashtable_new() uses PyMem_Malloc() (GH-20046)
_Py_hashtable_new() now uses PyMem_Malloc/PyMem_Free allocator by
default, rather than PyMem_RawMalloc/PyMem_RawFree.

PyMem_Malloc is faster than PyMem_RawMalloc for memory blocks smaller
than or equal to 512 bytes.
2020-05-12 03:07:40 +02:00
Victor Stinner b617993b7c
bpo-40602: Rename hashtable.h to pycore_hashtable.h (GH-20044)
* Move Modules/hashtable.h to Include/internal/pycore_hashtable.h
* Move Modules/hashtable.c to Python/hashtable.c
* Python is now linked to hashtable.c. _tracemalloc is no longer
  linked to hashtable.c. Previously, marshal.c got hashtable.c via
  _tracemalloc.c which is built as a builtin module.
2020-05-12 02:42:19 +02:00