cpython

Commit Graph

Author	SHA1	Message	Date
Irit Katriel	6f7acaab50	gh-120686: remove unused internal c api functions (#120687 )	2024-06-27 11:09:30 +01:00
Petr Viktorin	6f1d448bc1	gh-113993: Allow interned strings to be mortal, and fix related issues (GH-120520) * Add an InternalDocs file describing how interning should work and how to use it. * Add internal functions to explicitly request what kind of interning is done: - `_PyUnicode_InternMortal` - `_PyUnicode_InternImmortal` - `_PyUnicode_InternStatic` * Switch uses of `PyUnicode_InternInPlace` to those. * Disallow using `_Py_SetImmortal` on strings directly. You should use `_PyUnicode_InternImmortal` instead: - Strings should be interned before immortalization, otherwise you're possibly interning a immortalizing copy. - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in backports, as they are now part of public API and version-specific ABI. * Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery. * Make sure the statically allocated string singletons are unique. This means these sets are now disjoint: - `_Py_ID` - `_Py_STR` (including the empty string) - one-character latin-1 singletons Now, when you intern a singleton, that exact singleton will be interned. * Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic). * Intern `_Py_STR` singletons at startup. * For free-threaded builds, intern `_Py_LATIN1_CHR` singletons at startup. * Beef up the tests. Cover internal details (marked with `@cpython_only`). * Add lots of assertions Co-Authored-By: Eric Snow <ericsnowcurrently@gmail.com>	2024-06-21 17:19:31 +02:00
Xie Yanbo	6a97929a5a	Fix typos in comments (#120188 )	2024-06-07 10:19:41 +02:00
Brandt Bucher	f0df35eeca	GH-115802: JIT "small" code for Windows (GH-115964)	2024-02-29 08:11:28 -08:00
Sam Gross	cf6110ba13	gh-111924: Use PyMutex for Runtime-global Locks. (gh-112207) This replaces some usages of PyThread_type_lock with PyMutex, which does not require memory allocation to initialize. This simplifies some of the runtime initialization and is also one step towards avoiding changing the default raw memory allocator during initialize/finalization, which can be non-thread-safe in some circumstances.	2023-12-07 12:33:40 -07:00
Victor Stinner	58469244be	gh-112026: Restore removed private C API (#112115 ) Restore removed private C API functions, macros and structures which have no simple replacement for now: * _PyDict_GetItem_KnownHash() * _PyDict_NewPresized() * _PyHASH_BITS * _PyHASH_IMAG * _PyHASH_INF * _PyHASH_MODULUS * _PyHASH_MULTIPLIER * _PyLong_Copy() * _PyLong_FromDigits() * _PyLong_New() * _PyLong_Sign() * _PyObject_CallMethodId() * _PyObject_CallMethodNoArgs() * _PyObject_CallMethodOneArg() * _PyObject_CallOneArg() * _PyObject_EXTRA_INIT * _PyObject_FastCallDict() * _PyObject_GetAttrId() * _PyObject_Vectorcall() * _PyObject_VectorcallMethod() * _PyStack_AsDict() * _PyThread_CurrentFrames() * _PyUnicodeWriter structure * _PyUnicodeWriter_Dealloc() * _PyUnicodeWriter_Finish() * _PyUnicodeWriter_Init() * _PyUnicodeWriter_Prepare() * _PyUnicodeWriter_PrepareKind() * _PyUnicodeWriter_WriteASCIIString() * _PyUnicodeWriter_WriteChar() * _PyUnicodeWriter_WriteLatin1String() * _PyUnicodeWriter_WriteStr() * _PyUnicodeWriter_WriteSubstring() * _PyUnicode_AsString() * _PyUnicode_FromId() * _PyVectorcall_Function() * _Py_HashDouble() * _Py_HashPointer() * _Py_IDENTIFIER() * _Py_c_abs() * _Py_c_diff() * _Py_c_neg() * _Py_c_pow() * _Py_c_prod() * _Py_c_quot() * _Py_c_sum() * _Py_static_string() * _Py_static_string_init()	2023-11-15 16:38:31 +00:00
Serhiy Storchaka	771bd3c94a	Add private _PyUnicode_AsUTF8NoNUL() function (GH-111957) Like PyUnicode_AsUTF8(), but check for embedded null characters.	2023-11-10 21:31:36 +02:00
Victor Stinner	4fb96a11db	gh-106320: Remove private _Py_Identifier API (#108593 ) Remove the private _Py_Identifier type and related private functions from the public C API: * _PyObject_GetAttrId() * _PyObject_LookupSpecialId() * _PyObject_SetAttrId() * _PyType_LookupId() * _Py_IDENTIFIER() * _Py_static_string() * _Py_static_string_init() Move them to the internal C API: add a new pycore_identifier.h header file. No longer export these functions.	2023-08-29 02:29:46 +02:00
Victor Stinner	f1ae706ca5	gh-107211: No longer export internal functions (7) (#108425 ) No longer export _PyUnicode_FromId() internal C API function. Change comment style to "// comment" and add comment explaining why other functions have to be exported. Update Tools/build/generate_token.py to update Include/internal/pycore_token.h comments.	2023-08-24 17:40:56 +02:00
Victor Stinner	0d0520af83	gh-107211: No longer export internal functions (3) (#107215 ) No longer export these 14 internal C API functions: * _PySys_Audit() * _PySys_SetAttr() * _PyTraceBack_FromFrame() * _PyTraceBack_Print_Indented() * _PyUnicode_FormatAdvancedWriter() * _PyUnicode_ScanIdentifier() * _PyWarnings_Init() * _Py_DumpASCII() * _Py_DumpDecimal() * _Py_DumpHexadecimal() * _Py_DumpTraceback() * _Py_DumpTracebackThreads() * _Py_WriteIndent() * _Py_WriteIndentedMargin()	2023-07-25 02:25:45 +00:00
Victor Stinner	d27eb1e406	gh-106320: Remove private _PyUnicode C API (#107185 ) Move private _PyUnicode functions to the internal C API (pycore_unicodeobject.h): * _PyUnicode_IsCaseIgnorable() * _PyUnicode_IsCased() * _PyUnicode_IsXidContinue() * _PyUnicode_IsXidStart() * _PyUnicode_ToFoldedFull() * _PyUnicode_ToLowerFull() * _PyUnicode_ToTitleFull() * _PyUnicode_ToUpperFull() No longer export these functions.	2023-07-24 18:26:29 +00:00
Victor Stinner	8a73b57b9b	gh-106320: Remove _PyUnicode_TransformDecimalAndSpaceToASCII() (#106398 ) Remove private _PyUnicode_TransformDecimalAndSpaceToASCII() and other private _PyUnicode C API functions: move them to the internal C API (pycore_unicodeobject.h). No longer most of these functions. Replace _testcapi.unicode_transformdecimalandspacetoascii() with _testinternal._PyUnicode_TransformDecimalAndSpaceToASCII().	2023-07-04 08:59:09 +00:00
Victor Stinner	d8c5d76da2	gh-106320: Remove private _PyUnicode codecs C API functions (#106385 ) Remove private _PyUnicode codecs C API functions: move them to the internal C API (pycore_unicodeobject.h). No longer export most of these functions.	2023-07-04 07:29:52 +00:00
Victor Stinner	506cfdf141	gh-106320: Remove more private _PyUnicode C API functions (#106382 ) Remove more private _PyUnicode C API functions: move them to the internal C API (pycore_unicodeobject.h). No longer export most pycore_unicodeobject.h functions.	2023-07-03 22:35:46 +00:00
Victor Stinner	5ccbbe5bb9	gh-106320: Move _PyUnicodeWriter to the internal C API (#106342 ) Move also _PyUnicode_FormatAdvancedWriter(). CJK codecs and multibytecodec.c now define the Py_BUILD_CORE_MODULE macro.	2023-07-03 10:23:43 +02:00
Eddie Elizondo	ea2c001650	gh-84436: Implement Immortal Objects (gh-19474) This is the implementation of PEP683 Motivation: The PR introduces the ability to immortalize instances in CPython which bypasses reference counting. Tagging objects as immortal allows up to skip certain operations when we know that the object will be around for the entire execution of the runtime. Note that this by itself will bring a performance regression to the runtime due to the extra reference count checks. However, this brings the ability of having truly immutable objects that are useful in other contexts such as immutable data sharing between sub-interpreters.	2023-04-22 13:39:37 -06:00
Eric Snow	ba65a065cf	gh-100227: Move the Dict of Interned Strings to PyInterpreterState (gh-102339) We can revisit the options for keeping it global later, if desired. For now the approach seems quite complex, so we've gone with the simpler isolation solution in the meantime. https://github.com/python/cpython/issues/100227	2023-03-28 12:52:28 -06:00
Eric Snow	89e67ada69	gh-100227: Revert gh-102925 "gh-100227: Make the Global Interned Dict Safe for Isolated Interpreters" (gh-103063) This reverts commit `87be8d9`. This approach to keeping the interned strings safe is turning out to be too complex for my taste (due to obmalloc isolation). For now I'm going with the simpler solution, making the dict per-interpreter. We can revisit that later if we want a sharing solution.	2023-03-27 16:53:05 -06:00
Eric Snow	87be8d9522	gh-100227: Make the Global Interned Dict Safe for Isolated Interpreters (gh-102925) This is effectively two changes. The first (the bulk of the change) is where we add _Py_AddToGlobalDict() (and _PyRuntime.cached_objects.main_tstate, etc.). The second (much smaller) change is where we update PyUnicode_InternInPlace() to use _Py_AddToGlobalDict() instead of calling PyDict_SetDefault() directly. Basically, _Py_AddToGlobalDict() is a wrapper around PyDict_SetDefault() that should be used whenever we need to add a value to a runtime-global dict object (in the few cases where we are leaving the container global rather than moving it to PyInterpreterState, e.g. the interned strings dict). _Py_AddToGlobalDict() does all the necessary work to make sure the target global dict is shared safely between isolated interpreters. This is especially important as we move the obmalloc state to each interpreter (gh-101660), as well as, potentially, the GIL (PEP 684). https://github.com/python/cpython/issues/100227	2023-03-22 18:30:04 -06:00
Eric Snow	91a8e002c2	gh-81057: Move More Globals to _PyRuntimeState (gh-100092) https://github.com/python/cpython/issues/81057	2022-12-07 15:56:31 -07:00
Eric Snow	5f55067e23	gh-81057: Move More Globals in Core Code to _PyRuntimeState (gh-99516) https://github.com/python/cpython/issues/81057	2022-11-16 09:37:14 -07:00
Kumar Aditya	6dab8c95bd	GH-96458: Statically initialize utf8 representation of static strings (#96481 )	2022-09-02 23:43:08 -07:00
Dennis Sweeney	da6c78584b	gh-90667: Add specializations of Py_DECREF when types are known (GH-30872)	2022-04-19 19:02:19 +01:00
Kumar Aditya	8c54c3dacc	gh-91576: Speed up iteration of strings (#91574 )	2022-04-18 07:18:27 -07:00
Jeremy Kloth	88872a29f1	bpo-47084: Clear Unicode cached representations on finalization (GH-32032)	2022-03-22 13:53:51 +01:00
Kumar Aditya	8714b6fa27	bpo-46881: Statically allocate and initialize the latin1 characters. (GH-31616)	2022-03-09 15:02:00 -08:00
Eric Snow	81c72044a1	bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized global objects. (gh-30928) We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules. The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings). https://bugs.python.org/issue46541#msg411799 explains the rationale for this change. The core of the change is in: * (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros * Include/internal/pycore_runtime_init.h - added the static initializers for the global strings * Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState * Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config. The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _PyId functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _PyId(), replacing the _Py_Identifier * parameter with PyObject . The following are not changed (yet): stop using _Py_IDENTIFIER() in the stdlib modules * (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API * (maybe) intern the strings during runtime init https://bugs.python.org/issue46541	2022-02-08 13:39:07 -07:00
Victor Stinner	1626bf4ac7	bpo-46417: Clear Unicode static types at exit (GH-30806) Add _PyUnicode_FiniTypes() function, called by finalize_interp_types(). It clears these static types: * EncodingMapType * PyFieldNameIter_Type * PyFormatterIter_Type _PyStaticType_Dealloc() now does nothing if tp_subclasses is not NULL.	2022-01-22 22:55:39 +01:00
Victor Stinner	ea1a54506b	bpo-46303: Move fileutils.h private functions to internal C API (GH-30484) Move almost all private functions of Include/cpython/fileutils.h to the internal C API Include/internal/pycore_fileutils.h. Only keep _Py_fopen_obj() in Include/cpython/fileutils.h, since it's used by _testcapi which must not use the internal C API. Move EncodeLocaleEx() and DecodeLocaleEx() functions from _testcapi to _testinternalcapi, since the C API moved to the internal C API.	2022-01-11 11:56:16 +01:00
Victor Stinner	35d6540c90	bpo-46006: Revert "bpo-40521: Per-interpreter interned strings (GH-20085)" (GH-30422) This reverts commit `ea251806b8`. Keep "assert(interned == NULL);" in _PyUnicode_Fini(), but only for the main interpreter. Keep _PyUnicode_ClearInterned() changes avoiding the creation of a temporary Python list object.	2022-01-06 08:53:44 +01:00
Eric Snow	c8749b5783	bpo-46008: Make runtime-global object/type lifecycle functions and state consistent. (gh-29998) This change is strictly renames and moving code around. It helps in the following ways: * ensures type-related init functions focus strictly on one of the three aspects (state, objects, types) * passes in PyInterpreterState * to all those functions, simplifying work on moving types/objects/state to the interpreter * consistent naming conventions help make what's going on more clear * keeping API related to a type in the corresponding header file makes it more obvious where to look for it https://bugs.python.org/issue46008	2021-12-09 12:59:26 -07:00

31 Commits