We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.
The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
The core of the change is in:
* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers
I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config.
The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
The following are not changed (yet):
* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init
https://bugs.python.org/issue46541
The bulk of this patch was generated automatically with:
for name in \
PyObject_Vectorcall \
Py_TPFLAGS_HAVE_VECTORCALL \
PyObject_VectorcallMethod \
PyVectorcall_Function \
PyObject_CallOneArg \
PyObject_CallMethodNoArgs \
PyObject_CallMethodOneArg \
;
do
echo $name
git grep -lwz _$name | xargs -0 sed -i "s/\b_$name\b/$name/g"
done
old=_PyObject_FastCallDict
new=PyObject_VectorcallDict
git grep -lwz $old | xargs -0 sed -i "s/\b$old\b/$new/g"
and then cleaned up:
- Revert changes to in docs & news
- Revert changes to backcompat defines in headers
- Nudge misaligned comments
In debug mode, PyObject_GC_Track() now calls tp_traverse() of the
object type to ensure that the object is valid: test that objects
visited by tp_traverse() are valid.
Fix pyexpat.c: only track the parser in the GC once the parser is
fully initialized.
In ArgumentClinic, value "NULL" should now be used only for unrepresentable default values
(like in the optional third parameter of getattr). "None" should be used if None is accepted
as argument and passing None has the same effect as not passing the argument at all.
* bpo-37399: Correctly attach tail text to the last element/comment/pi, even when comments or pis are discarded.
Also fixes the insertion of PIs when "insert_pis=True" is configured for a TreeBuilder.
Add a new public PyObject_CallNoArgs() function to the C API: call a
callable Python object without any arguments.
It is the most efficient way to call a callback without any argument.
On x86-64, for example, PyObject_CallFunctionObjArgs(func, NULL)
allocates 960 bytes on the stack per call, whereas
PyObject_CallNoArgs(func) only allocates 624 bytes per call.
It is excluded from stable ABI 3.8.
Replace private _PyObject_CallNoArg() with public
PyObject_CallNoArgs() in C extensions: _asyncio, _datetime,
_elementtree, _pickle, _tkinter and readline.
The final addition (cur += step) may overflow, so use size_t for "cur".
"cur" is always positive (even for negative steps), so it is safe to use
size_t here.
Co-Authored-By: Martin Panter <vadmium+py@gmail.com>
Add new trashcan macros to deal with a double deallocation that could occur when the `tp_dealloc` of a subclass calls the `tp_dealloc` of a base class and that base class uses the trashcan mechanism.
Patch by Jeroen Demeyer.
* bpo-36673: Implement comment/PI parsing support for the TreeBuilder in ElementTree.
* bpo-36673: Rewrite the comment/PI factory handling for the TreeBuilder in "_elementtree" to make it use the same factories as the ElementTree module, and to make it explicit when the comments/PIs are inserted into the tree and when they are not (which is the default).
Fix invalid function cast warnings with gcc 8
for method conventions different from METH_NOARGS, METH_O and
METH_VARARGS excluding Argument Clinic generated code.
It is now guarantied that children of xml.etree.ElementTree.Element
are Elements (at least in C implementation). Previously methods
__setitem__(), __setstate__() and __deepcopy__() could be used for
adding non-Element children.
The C accelerated _elementtree module now initializes hash randomization
salt from _Py_HashSecret instead of libexpat's default CPRNG.
Signed-off-by: Christian Heimes <christian@python.org>
https://bugs.python.org/issue34623
* bpo-31499, xml.etree: Fix xmlparser_gc_clear() crash
xml.etree: xmlparser_gc_clear() now sets self.parser to NULL to prevent a
crash in xmlparser_dealloc() if xmlparser_gc_clear() was called previously
by the garbage collector, because the parser was part of a reference cycle.
Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>
* Avoid calling "PyObject_GetAttrString()" (and potentially executing user code) with a live exception set.
* Ignore only AttributeError on attribute lookups in ElementTree.XMLParser() and propagate all other exceptions.