We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.
The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
The core of the change is in:
* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers
I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config.
The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
The following are not changed (yet):
* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init
https://bugs.python.org/issue46541
bytesobject.c, bytearrayobject.c and unicodeobject.c now define all
macros used by stringlib, to avoid using undefined macros.
Fix "gcc -Wundef" warnings.
Move Include/pystrhex.h to Include/internal/pycore_strhex.h.
The header file only contains private functions.
The following C extensions are now built with Py_BUILD_CORE_MODULE
macro defined to get access to the internal C API:
* _blake2
* _hashopenssl
* _md5
* _sha1
* _sha3
* _ssl
* binascii
When Python is built in debug mode (with C assertions), calling a
type slot like sq_length (__len__() in Python) now fails with a fatal
error if the slot succeeded with an exception set, or failed with no
exception set. The error message contains the slot, the type name,
and the current exception (if an exception is set).
* Check the result of all slots using _Py_CheckSlotResult().
* No longer pass op_name to ternary_op() in release mode.
* Replace operator with dunder Python method name in error messages.
For example, replace "*" with "__mul__".
* Fix compiler_exit_scope() when an exception is set.
* Fix bytearray.extend() when an exception is set: don't call
bytearray_setslice() with an exception set.
Before, using the * operator to repeat a bytearray would copy data from the start of
the internal buffer (ob_bytes) and not from the start of the actual data (ob_start).
* Speed up comparison of bytes objects with non-bytes objects when
option -b is specified.
* Speed up comparison of bytarray objects with non-buffer object.
Added str.removeprefix and str.removesuffix methods and corresponding
bytes, bytearray, and collections.UserString methods to remove affixes
from a string if present. See PEP 616 for a full description.
Don't access PyInterpreterState.config member directly anymore, but
use new functions:
* _PyInterpreterState_GetConfig()
* _PyInterpreterState_SetConfig()
* _Py_GetConfig()
Add _PyIndex_Check() function to the internal C API: fast inlined
verson of PyIndex_Check().
Add Include/internal/pycore_abstract.h header file.
Replace PyIndex_Check() with _PyIndex_Check() in C files of Objects
and Python subdirectories.
Move the bytes_methods.h header file to the internal C API as
pycore_bytes_methods.h: it only contains private symbols (prefixed by
"_Py"), except of the PyDoc_STRVAR_shared() macro.
The bulk of this patch was generated automatically with:
for name in \
PyObject_Vectorcall \
Py_TPFLAGS_HAVE_VECTORCALL \
PyObject_VectorcallMethod \
PyVectorcall_Function \
PyObject_CallOneArg \
PyObject_CallMethodNoArgs \
PyObject_CallMethodOneArg \
;
do
echo $name
git grep -lwz _$name | xargs -0 sed -i "s/\b_$name\b/$name/g"
done
old=_PyObject_FastCallDict
new=PyObject_VectorcallDict
git grep -lwz $old | xargs -0 sed -i "s/\b$old\b/$new/g"
and then cleaned up:
- Revert changes to in docs & news
- Revert changes to backcompat defines in headers
- Nudge misaligned comments
* bpo-22385: Support output separators in hex methods.
Also in binascii.hexlify aka b2a_hex.
The underlying implementation behind all hex generation in CPython uses the
same pystrhex.c implementation. This adds support to bytes, bytearray,
and memoryview objects.
The binascii module functions exist rather than being slated for deprecation
because they return bytes rather than requiring an intermediate step through a
str object.
This change was inspired by MicroPython which supports sep in its binascii
implementation (and does not yet support the .hex methods).
https://bugs.python.org/issue22385
The final addition (cur += step) may overflow, so use size_t for "cur".
"cur" is always positive (even for negative steps), so it is safe to use
size_t here.
Co-Authored-By: Martin Panter <vadmium+py@gmail.com>
* The PyByteArray_Init() and PyByteArray_Fini() functions have been
removed. They did nothing since Python 2.7.4 and Python 3.2.0, were
excluded from the limited API (stable ABI), and were not
documented.
* Move "_PyXXX_Init()" and "_PyXXX_Fini()" declarations from
Include/cpython/pylifecycle.h to
Include/internal/pycore_pylifecycle.h. Replace
"PyAPI_FUNC(TYPE)" with "extern TYPE".
* _PyExc_Init() now returns an error on failure rather than calling
Py_FatalError(). Move macros inside _PyExc_Init() and undefine them
when done. Rewrite macros to make them look more like statement:
add ";" when using them, add "do { ... } while (0)".
* _PyUnicode_Init() now returns a _PyInitError error rather than call
Py_FatalError().
* Move stdin check from _PySys_BeginInit() to init_sys_streams().
* _Py_ReadyTypes() now returns a _PyInitError error rather than
calling Py_FatalError().
Add more fields to _PyCoreConfig:
* _check_hash_pycs_mode
* bytes_warning
* debug
* inspect
* interactive
* legacy_windows_fs_encoding
* legacy_windows_stdio
* optimization_level
* quiet
* unbuffered_stdio
* user_site_directory
* verbose
* write_bytecode
Changes:
* Remove pymain_get_global_config() and pymain_set_global_config()
which became useless. These functions have been replaced by
_PyCoreConfig_GetGlobalConfig() and
_PyCoreConfig_SetGlobalConfig().
* sys.flags.dont_write_bytecode value is now restricted to 1 even if
-B option is specified multiple times on the command line.
* PyThreadState_Clear() now uses the config from the current
interpreter rather than using global Py_VerboseFlag
METH_NOARGS functions need only a single argument but they are cast
into a PyCFunction, which takes two arguments. This triggers an
invalid function cast warning in gcc8 due to the argument mismatch.
Fix this by adding a dummy unused argument.
* group the (stateful) runtime globals into various topical structs
* consolidate the topical structs under a single top-level _PyRuntimeState struct
* add a check-c-globals.py script that helps identify runtime globals
Other globals are excluded (see globals.txt and check-c-globals.py).