* Use instruction offset, rather than bytecode offset. Streamlines interpreter dispatch a bit, and removes most EXTENDED_ARGs for jumps.
* Change some uses of PyCode_Addr2Line to PyFrame_GetLineNumber
When printing stats, move radix tree info to its own section.
Restore that the breakdown of bytes in arenas exactly accounts for the total of arena bytes allocated.
Add an assert so that invariant doesn't break again.
* Remove m68k-specific hack from ascii_decode
On m68k, alignments of primitives is more relaxed, with 4-byte and
8-byte types only requiring 2-byte alignment, thus using sizeof(size_t)
does not work. Instead, use the portable alternative.
Note that this is a minimal fix that only relaxes the assertion and the
condition for when to use the optimised version remains overly strict.
Such issues will be fixed tree-wide in the next commit.
NB: In C11 we could use _Alignof(size_t) instead, but for compatibility
we use autoconf.
* Optimise string routines for architectures with non-natural alignment
C only requires that sizeof(x) is a multiple of alignof(x), not that the
two are equal. Thus anywhere where we optimise based on alignment we
should be using alignof(x) not sizeof(x).
This is more annoying than it would be in C11 where we could just use
_Alignof(x) (and alignof(x) in C++11), but since we still require only
C99 we must plumb the information all the way from autoconf through the
various typedefs and defines.
The radix tree approach is a relatively simple and memory sanitary
alternative to the old (slightly) unsanitary address_in_range().
To disable the radix tree map, set a preprocessor flag as follows:
-DWITH_PYMALLOC_RADIX_TREE=0.
Co-authored-by: Tim Peters <tim.peters@gmail.com>
See [PEP 597](https://www.python.org/dev/peps/pep-0597/).
* Add `-X warn_default_encoding` and `PYTHONWARNDEFAULTENCODING`.
* Add EncodingWarning
* Add io.text_encoding()
* open(), TextIOWrapper() emits EncodingWarning when encoding is omitted and warn_default_encoding is enabled.
* _pyio.TextIOWrapper() uses UTF-8 as fallback default encoding used when failed to import locale module. (used during building Python)
* bz2, configparser, gzip, lzma, pathlib, tempfile modules use io.text_encoding().
* What's new entry
Remove the compiler functions using "struct _mod" type, because the
public AST C API was removed:
* PyAST_Compile()
* PyAST_CompileEx()
* PyAST_CompileObject()
* PyFuture_FromAST()
* PyFuture_FromASTObject()
These functions were undocumented and excluded from the limited C API.
Rename functions:
* PyAST_CompileObject() => _PyAST_Compile()
* PyFuture_FromASTObject() => _PyFuture_FromAST()
Moreover, _PyFuture_FromAST() is no longer exported (replace
PyAPI_FUNC() with extern). _PyAST_Compile() remains exported for
test_peg_generator.
Remove also compatibility functions:
* PyAST_Compile()
* PyAST_CompileEx()
* PyFuture_FromAST()
The common case going through _PyType_Lookup is to have a cache hit. There are some small tweaks that can make this a little cheaper:
* The name field identity is used for a cache hit and is kept alive by the cache. So there's no need to read the hash code o the name - instead, the address can be used as the hash.
* There's no need to check if the name is cachable on the lookup either, it probably is, and if it is, it'll be in the cache.
* If we clear the version tag when invalidating a type then we don't actually need to check for a valid version tag bit.
* Remove an assertion which required CO_NEWLOCALS and CO_OPTIMIZED
code flags. It is ok to call this function on a code with these
flags set.
* Fix reference counting on builtins: remove Py_DECREF().
Fix regression introduced in the
commit 46496f9d12.
Add also a comment to document that _PyEval_BuiltinsFromGlobals()
returns a borrowed reference.
Python no longer fails at startup with a fatal error if a command
line argument contains an invalid Unicode character.
The Py_DecodeLocale() function now escapes byte sequences which would
be decoded as Unicode characters outside the [U+0000; U+10ffff]
range.
Use MAX_UNICODE constant in unicodeobject.c.
Implement an enhanced variant of Crochemore and Perrin's Two-Way string searching algorithm, which reduces worst-case time from quadratic (the product of the string and pattern lengths) to linear. This applies to forward searches (like``find``, ``index``, ``replace``); the algorithm for reverse searches (like ``rfind``) is not changed.
Co-authored-by: Tim Peters <tim.peters@gmail.com>
* No longer save/restore the current exception. It is no longer used
with an exception raised.
* No longer clear the current exception on error: it's now up to the
caller.
For some mysterious reason we have PySet_Check, PyFrozenSet_Check, PyAnySet_Check, PyAnySet_CheckExact and PyFrozenSet_CheckExact but no PySet_CheckExact.
The types.FunctionType constructor now inherits the current builtins
if the globals dictionary has no "__builtins__" key, rather than
using {"None": None} as builtins: same behavior as eval() and exec()
functions.
Defining a function with "def function(...): ..." in Python is not
affected, globals cannot be overriden with this syntax: it also
inherits the current builtins.
PyFrame_New(), PyEval_EvalCode(), PyEval_EvalCodeEx(),
PyFunction_New() and PyFunction_NewWithQualName() now inherits the
current builtins namespace if the globals dictionary has no
"__builtins__" key.
* Add _PyEval_GetBuiltins() function.
* _PyEval_BuiltinsFromGlobals() now uses _PyEval_GetBuiltins() if
builtins cannot be found in globals.
* Add tstate parameter to _PyEval_BuiltinsFromGlobals().
Pass the current interpreter (interp) rather than the current Python
thread state (tstate) to internal functions which only use the
interpreter.
Modified functions:
* _PyXXX_Fini() and _PyXXX_ClearFreeList() functions
* _PyEval_SignalAsyncExc(), make_pending_calls()
* _PySys_GetObject(), sys_set_object(), sys_set_object_id(), sys_set_object_str()
* should_audit(), set_flags_from_config(), make_flags()
* _PyAtExit_Call()
* init_stdio_encoding()
* etc.
* Refactor _PyFrame_New_NoTrack() and PyFunction_NewWithQualName()
code.
* PyFrame_New() checks for _PyEval_BuiltinsFromGlobals() failure.
* Fix a ref leak in _PyEval_BuiltinsFromGlobals() error path.
* Complete PyFunction_GetModule() documentation: it returns a
borrowed reference and it can return NULL.
* Move _PyEval_BuiltinsFromGlobals() definition to the internal C
API.
* PyFunction_NewWithQualName() uses _Py_IDENTIFIER() API for the
"__name__" string to make it compatible with subinterpreters.
Expose the new PyFunctionObject.func_builtins member in Python as a
new __builtins__ attribute on functions.
Document also the behavior change in What's New in Python 3.10.
* Further refactoring of PyEval_EvalCode and friends. Break into make-frame, and eval-frame parts.
* Simplify function vector call using new _PyEval_Vector.
* Remove unused internal functions: _PyEval_EvalCodeWithName and _PyEval_EvalCode.
* Don't use legacy function PyEval_EvalCodeEx.
When Python is built in debug mode (with C assertions), calling a
type slot like sq_length (__len__() in Python) now fails with a fatal
error if the slot succeeded with an exception set, or failed with no
exception set. The error message contains the slot, the type name,
and the current exception (if an exception is set).
* Check the result of all slots using _Py_CheckSlotResult().
* No longer pass op_name to ternary_op() in release mode.
* Replace operator with dunder Python method name in error messages.
For example, replace "*" with "__mul__".
* Fix compiler_exit_scope() when an exception is set.
* Fix bytearray.extend() when an exception is set: don't call
bytearray_setslice() with an exception set.
* bpo-42979: Enhance abstract.c assertions checking slot result
Add _Py_CheckSlotResult() function which fails with a fatal error if
a slot function succeeded with an exception set or failed with no
exception set: write the slot name, the type name and the current
exception (if an exception is set).
The Py_FatalError() function and the faulthandler module now dump the
list of extension modules on a fatal error.
Add _Py_DumpExtensionModules() and _PyModule_IsExtension() internal
functions.
Before, using the * operator to repeat a bytearray would copy data from the start of
the internal buffer (ob_bytes) and not from the start of the actual data (ob_start).
* Add test for frame.f_lineno with/without tracing.
* Make sure that frame.f_lineno is correct regardless of whether frame.f_trace is set.
* Update importlib
* Add NEWS
In is_typing_name(), va_end() is not always called before the
function returns. It is undefined behavior to call va_start()
without also calling va_end().
Previously this didn't raise an error. Now it will:
```python
from collections.abc import Callable
isinstance(int, list | Callable[..., str])
```
Also added tests in Union since there were previously none for stuff like ``isinstance(list, list | list[int])`` either.
Backport to 3.9 not required.
Automerge-Triggered-By: GH:gvanrossum
Make the Unicode dictionary of interned strings compatible with
subinterpreters.
Remove the INTERN_NAME_STRINGS macro in typeobject.c: names are
always now interned (even if EXPERIMENTAL_ISOLATED_SUBINTERPRETERS
macro is defined).
_PyUnicode_ClearInterned() now uses PyDict_Next() to no longer
allocate memory, to ensure that the interned dictionary is cleared.
Make the type attribute lookup cache per-interpreter.
Add private _PyType_InitCache() function, called by PyInterpreterState_New().
Continue to share next_version_tag between interpreters, since static
types are still shared by interpreters.
Remove MCACHE macro: the cache is no longer disabled if the
EXPERIMENTAL_ISOLATED_SUBINTERPRETERS macro is defined.
Make _PyUnicode_FromId() function compatible with subinterpreters.
Each interpreter now has an array of identifier objects (interned
strings decoded from UTF-8).
* Add PyInterpreterState.unicode.identifiers: array of identifiers
objects.
* Add _PyRuntimeState.unicode_ids used to allocate unique indexes
to _Py_Identifier.
* Rewrite the _Py_Identifier structure.
Microbenchmark on _PyUnicode_FromId(&PyId_a) with _Py_IDENTIFIER(a):
[ref] 2.42 ns +- 0.00 ns -> [atomic] 3.39 ns +- 0.00 ns: 1.40x slower
This change adds 1 ns per _PyUnicode_FromId() call in average.