MAP_BOT_LENGTH was incorrectly used to compute MAP_TOP_MASK instead of
MAP_TOP_LENGTH. On 64-bit machines, the error causes the tree to hold
46-bits of virtual addresses, rather than the intended 48-bits.
* Remove 'zombie' frames. We won't need them once we are allocating fixed-size frames.
* Add co_nlocalplus field to code object to avoid recomputing size of locals + frees + cells.
* Move locals, cells and freevars out of frame object into separate memory buffer.
* Use per-threadstate allocated memory chunks for local variables.
* Move globals and builtins from frame object to per-thread stack.
* Move (slow) locals frame object to per-thread stack.
* Move internal frame functions to internal header.
When printing stats, move radix tree info to its own section.
Restore that the breakdown of bytes in arenas exactly accounts for the total of arena bytes allocated.
Add an assert so that invariant doesn't break again.
The radix tree approach is a relatively simple and memory sanitary
alternative to the old (slightly) unsanitary address_in_range().
To disable the radix tree map, set a preprocessor flag as follows:
-DWITH_PYMALLOC_RADIX_TREE=0.
Co-authored-by: Tim Peters <tim.peters@gmail.com>
The PEP 353, written in 2005, introduced PY_FORMAT_SIZE_T. Python no
longer supports macOS 10.4 and Visual Studio 2010, but requires more
recent macOS and Visual Studio versions. In 2020 with Python 3.10, it
is now safe to use directly "%zu" to format size_t and "%zi" to
format Py_ssize_t.
The Py_FatalError() function is replaced with a macro which logs
automatically the name of the current function, unless the
Py_LIMITED_API macro is defined.
Changes:
* Add _Py_FatalErrorFunc() function.
* Remove the function name from the message of Py_FatalError() calls
which included the function name.
* Update tests.
bpo-3605, bpo-38733: Optimize _PyErr_Occurred(): remove "tstate ==
NULL" test.
Py_FatalError() no longer calls PyErr_Occurred() if called without
holding the GIL. So PyErr_Occurred() no longer has to support
tstate==NULL case.
_Py_CheckFunctionResult(): use directly _PyErr_Occurred() to avoid
explicit "!= NULL" test.
pymalloc_alloc() now returns directly the pointer, return NULL on
memory allocation error.
allocate_from_new_pool() already uses NULL as marker for "allocation
failed".
PyObject_Malloc() and PyObject_Free() inlines pymalloc_alloc and
pymalloc_free partially.
But when PGO is not used, compiler don't know where is the hot part
in pymalloc_alloc and pymalloc_free.
Keeping an account of allocated blocks slows down _PyObject_Malloc()
and _PyObject_Free() by a measureable amount. Have
_Py_GetAllocatedBlocks() iterate over the arenas to sum up the
allocated blocks for pymalloc.
GH-14039: allow (no more than) one wholly empty arena on the usable_arenas list.
This prevents thrashing in some easily-provoked simple cases that could end up creating and destroying an arena on each loop iteration in client code. Intuitively, if the only arena on the list becomes empty, it makes scant sense to give it back to the system unless we know we'll never need another free pool again before another arena frees a pool. If the latter obtains, then - yes - this will "waste" an arena.
This adds a vector of "search fingers" so that usable_arenas can be kept in sorted order (by number of free pools) via constant-time operations instead of linear search.
This should reduce worst-case time for reclaiming a great many objects from O(A**2) to O(A), where A is the number of arenas. See bpo-37029.
* Add PyMemAllocatorName enum
* _PyPreConfig.allocator type becomes PyMemAllocatorName, instead of
char*
* Remove _PyPreConfig_Clear()
* Add _PyMem_GetAllocatorName()
* Rename _PyMem_GetAllocatorsName() to
_PyMem_GetCurrentAllocatorName()
* Remove _PyPreConfig_SetAllocator(): just call
_PyMem_SetupAllocators() directly, we don't have do reallocate the
configuration with the new allocator anymore!
* _PyPreConfig_Write() parameter becomes const, as it should be in
the first place!
Omit serialno field from debug hooks on Python memory allocators to
reduce the memory footprint by 5%.
Enable tracemalloc to get the traceback where a memory block has been
allocated when a fatal memory error is logged to decide where to put
a breakpoint.
Compile Python with PYMEM_DEBUG_SERIALNO defined to get back the
field.
Modify CLEANBYTE, DEADDYTE and FORBIDDENBYTE constants: use 0xCD,
0xDD and 0xFD, rather than 0xCB, 0xBB and 0xFB, to use the same byte
patterns than Windows CRT debug malloc() and free().
Replace _PyMem_IsFreed() function with _PyMem_IsPtrFreed() inline
function. The function is now way more efficient, it became a simple
comparison on integers, rather than a short loop. It detects also
uninitialized bytes and "forbidden bytes" filled by debug hooks
on memory allocators.
Add unit tests on _PyObject_IsFreed().
* _PyPreConfig_Write() now reallocates the pre-configuration with the
new memory allocator.
* It is no longer needed to force the "default raw memory allocator"
to clear pre-configuration and core configuration. Simplify the
code.
* _PyPreConfig_Write() now does nothing if called after
Py_Initialize(): no longer check if the allocator is the same.
* Remove _PyMem_GetDebugAllocatorsName(): dev mode sets again
allocator to "debug".
The development mode now uses the effective name of the debug memory
allocator ("pymalloc_debug" or "malloc_debug"). So the name doesn't
change after setting the memory allocator.
This function may access memory which is mapped but is considered
free by libc allocator. It behaves so by design, therefore we
need to suppress sanitizer reports.
GCC doesn't support MSan, so disable only TSan for it.
tracemalloc now tries to update the traceback when an object is
reused from a "free list" (optimization for faster object creation,
used by the builtin list type for example).
Changes:
* Add _PyTraceMalloc_NewReference() function which tries to update
the Python traceback of a Python object.
* _Py_NewReference() now calls _PyTraceMalloc_NewReference().
* Add an unit test.
_PyObject_Dump() now uses an heuristic to check if the object memory
has been freed: log "<freed object>" in that case.
The heuristic rely on the debug hooks on Python memory allocators
which fills the memory with DEADBYTE (0xDB) when memory is
deallocated. Use PYTHONMALLOC=debug to always enable these debug
hooks.
* Py_Main() now starts by reading Py_xxx configuration variables to
only work on its own private structure, and then later writes back
the configuration into these variables.
* Replace Py_GETENV() with pymain_get_env_var() which ignores empty
variables.
* Add _PyCoreConfig.dump_refs
* Add _PyCoreConfig.malloc_stats
* _PyObject_DebugMallocStats() is now responsible to check if debug
hooks are installed. The function returns 1 if stats were written,
or 0 if the hooks are disabled. Mark _PyMem_PymallocEnabled() as
static.
* Rename PyPathConfig structure to _PyPathConfig and move it to
Include/internal/pystate.h
* Rename path_config to _Py_path_config
* _PyPathConfig: Rename program_name field to program_full_path
* Add assert(str != NULL); to _PyMem_RawWcsdup(), _PyMem_RawStrdup()
and _PyMem_Strdup().
* Rename calculate_path() to pathconfig_global_init(). The function
now does nothing if it's already initiallized.
* Fix _PyMem_SetupAllocators("debug"): always restore allocators to
the defaults, rather than only caling _PyMem_SetupDebugHooks().
* Add _PyMem_SetDefaultAllocator() helper to set the "default"
allocator.
* Add _PyMem_GetAllocatorsName(): get the name of the allocators
* main() now uses debug hooks on memory allocators if Py_DEBUG is
defined, rather than calling directly malloc()
* Document default memory allocators in C API documentation
* _Py_InitializeCore() now fails with a fatal user error if
PYTHONMALLOC value is an unknown memory allocator, instead of
failing with a fatal internal error.
* Add new tests on the PYTHONMALLOC environment variable
* Add support.with_pymalloc()
* Add the _testcapi.WITH_PYMALLOC constant and expose it as
support.with_pymalloc().
* sysconfig.get_config_var('WITH_PYMALLOC') doesn't work on Windows, so
replace it with support.with_pymalloc().
* pythoninfo: add _testcapi collector for pymem
Py_GetPath() and Py_Main() now call
_PyMainInterpreterConfig_ReadEnv() to share the same code to get
environment variables.
Changes:
* Add _PyMainInterpreterConfig_ReadEnv()
* Add _PyMainInterpreterConfig_Clear()
* Add _PyMem_RawWcsdup()
* _PyMainInterpreterConfig: rename pythonhome to home
* Rename _Py_ReadMainInterpreterConfig() to
_PyMainInterpreterConfig_Read()
* Use _Py_INIT_USER_ERR(), instead of _Py_INIT_ERR(), for decoding
errors: the user is able to fix the issue, it's not a bug in
Python. Same change was made in _Py_INIT_NO_MEMORY().
* Remove _Py_GetPythonHomeWithConfig()
bpo-32096, bpo-30860: Partially revert the commit
2ebc5ce42a8a9e047e790aefbf9a94811569b2b6:
* Move structures back from Include/internal/mem.h to
Objects/obmalloc.c
* Remove _PyObject_Initialize() and _PyMem_Initialize()
* Remove Include/internal/pymalloc.h
* Add test_capi.test_pre_initialization_api():
Make sure that it's possible to call Py_DecodeLocale(), and then call
Py_SetProgramName() with the decoded string, before Py_Initialize().
PyMem_RawMalloc() and Py_DecodeLocale() can be called again before
_PyRuntimeState_Init().
Co-Authored-By: Eric Snow <ericsnowcurrently@gmail.com>
Add a new "developer mode": new "-X dev" command line option to
enable debug checks at runtime.
Changes:
* Add unit tests for -X dev
* test_cmd_line: replace test.support with support.
* Fix _PyRuntimeState_Fini(): Use the same memory allocator
than _PyRuntimeState_Init().
* Fix _PyMem_GetDefaultRawAllocator()
* Don't use "Python runtime" anymore to parse command line options or
to get environment variables: pymain_init() is now a strict
separation.
* Use an error message rather than "crashing" directly with
Py_FatalError(). Limit the number of calls to Py_FatalError(). It
prepares the code to handle errors more nicely later.
* Warnings options (-W, PYTHONWARNINGS) and "XOptions" (-X) are now
only added to the sys module once Python core is properly
initialized.
* _PyMain is now the well identified owner of some important strings
like: warnings options, XOptions, and the "program name". The
program name string is now properly freed at exit.
pymain_free() is now responsible to free the "command" string.
* Rename most methods in Modules/main.c to use a "pymain_" prefix to
avoid conflits and ease debug.
* Replace _Py_CommandLineDetails_INIT with memset(0)
* Reorder a lot of code to fix the initialization ordering. For
example, initializing standard streams now comes before parsing
PYTHONWARNINGS.
* Py_Main() now handles errors when adding warnings options and
XOptions.
* Add _PyMem_GetDefaultRawAllocator() private function.
* Cleanup _PyMem_Initialize(): remove useless global constants: move
them into _PyMem_Initialize().
* Call _PyRuntime_Initialize() as soon as possible:
_PyRuntime_Initialize() now returns an error message on failure.
* Add _PyInitError structure and following macros:
* _Py_INIT_OK()
* _Py_INIT_ERR(msg)
* _Py_INIT_USER_ERR(msg): "user" error, don't abort() in that case
* _Py_INIT_FAILED(err)
Few bytes at the begin and at the end of the reallocated blocks, as well
as the header and the trailer, now are erased before calling realloc()
in debug build. This will help to detect using or double freeing the
reallocated block.
Cleanup pymalloc:
* Rename _PyObject_Alloc() to pymalloc_alloc()
* Rename _PyObject_FreeImpl() to pymalloc_free()
* Rename _PyObject_Realloc() to pymalloc_realloc()
* pymalloc_alloc() and pymalloc_realloc() don't fallback on the raw
allocator anymore, it now must be done by the caller
* Add "success" and "failed" labels to pymalloc_alloc() and
pymalloc_free()
* pymalloc_alloc() and pymalloc_free() don't update
num_allocated_blocks anymore: it should be done in the caller
* _PyObject_Calloc() is now responsible to fill the memory block
allocated by pymalloc with zeros
* Simplify pymalloc_alloc() prototype
* _PyObject_Realloc() now calls _PyObject_Malloc() rather than
calling directly pymalloc_alloc()
_PyMem_DebugRawAlloc() and _PyMem_DebugRawRealloc():
* document the layout of a memory block
* don't increase the serial number if the allocation failed
* check for integer overflow before computing the total size
* add a 'data' variable to make the code easiler to follow
test_setallocators() of _testcapimodule.c now test also the context.