The following sqlite3 features were deprecated in 3.10, scheduled for
removal in 3.12:
- sqlite3.OptimizedUnicode (gh-23163)
- sqlite3.enable_shared_cache (gh-24008)
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Signed-off-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
In the limited C API version 3.12, PyUnicode_KIND() is now
implemented as a static inline function. Keep the macro for the
regular C API and for the limited C API version 3.11 and older to
prevent introducing new compiler warnings.
Update _decimal.c and stringlib/eq.h for PyUnicode_KIND().
The `utc_to_seconds` call can fail, here's a minimal reproducer on
Linux:
TZ=UTC python -c "from datetime import *; datetime.fromtimestamp(253402300799 + 1)"
The old behavior still raised an error in a similar way, but only
because subsequent calculations happened to fail as well. Better to fail
fast.
This also refactors the tests to split out the `fromtimestamp` and
`utcfromtimestamp` tests, and to get us closer to the actual desired
limits of the functions. As part of this, we also changed the way we
detect platforms where the same limits don't necessarily apply (e.g.
Windows).
As part of refactoring the tests to hit this condition explicitly (even
though the user-facing behvior doesn't change in any way we plan to
guarantee), I noticed that there was a difference in the places that
`datetime.utcfromtimestamp` fails in the C and pure Python versions, which
was fixed by skipping the "probe for fold" logic for UTC specifically —
since UTC doesn't have any folds or gaps, we were never going to find a
fold value anyway. This should prevent some failures in the pure python
`utcfromtimestamp` method on timestamps close to 0001-01-01.
There are two separate news entries for this because one is a
potentially user-facing change, the other is an internal code
correctness change that, if anything, changes some error messages. The
two happen to be coupled because of the test refactoring, but they are
probably best thought of as independent changes.
Fixes GH-91581
38f331d introduced a delayed initialization routine to set up
ctypes formattable (`_ctypes_init_fielddesc`), but inadvertently
removed setting the `initialization` flag to 1 to avoid initting
each time.
Add the -P command line option and the PYTHONSAFEPATH environment
variable to not prepend a potentially unsafe path to sys.path.
* Add sys.flags.safe_path flag.
* Add PyConfig.safe_path member.
* Programs/_bootstrap_python.c uses config.safe_path=0.
* Update subprocess._optim_args_from_interpreter_flags() to handle
the -P command line option.
* Modules/getpath.py sets safe_path to 1 if a "._pth" file is
present.
One more thing that can help prevent people from using `preexec_fn`.
Also adds conditional skips to two tests exposing ASAN flakiness on the Ubuntu 20.04 Address Sanitizer Github CI system. When that build is run on more modern systems the "problem" does not show up. It seems ASAN implementation related.
Co-authored-by: Zackery Spytz <zspytz@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
``pymain_run_python()`` now imports ``readline`` and ``rlcompleter``
before sys.path is extended to include the current working directory of
an interactive interpreter. Non-interactive interpreters are not
affected.
Also move imports of ``re`` and ``keyword`` module to top level so they
are materialized early, too. The ``keyword`` module is trivial and the
``re`` is already imported via ``inspect`` -> ``linecache``.
#92301: subprocess: Prefer `close_range()` to procfs-based fd closing.
`close_range()` is much faster for large number of file descriptors, e.g.
4 times faster for 1000 descriptors in a Linux 5.16-based environment.
We prefer close_range() only if it's known to be async-signal-safe.
* Map SQLITE_MISUSE to sqlite3.InterfaceError
SQLITE_MISUSE implies misuse of the SQLite C API, which, if it happens,
is _not_ a user error; it is an sqlite3 extension module error.
* Raise better errors when binding parameters fail.
Instead of always raising InterfaceError, guessing what went wrong,
raise accurate exceptions with more accurate error messages.
Currently, calling Py_EnterRecursiveCall() and
Py_LeaveRecursiveCall() may use a function call or a static inline
function call, depending if the internal pycore_ceval.h header file
is included or not. Use a different name for the static inline
function to ensure that the static inline function is always used in
Python internals for best performance. Similar approach than
PyThreadState_GET() (function call) and _PyThreadState_GET() (static
inline function).
* Rename _Py_EnterRecursiveCall() to _Py_EnterRecursiveCallTstate()
* Rename _Py_LeaveRecursiveCall() to _Py_LeaveRecursiveCallTstate()
* pycore_ceval.h: Rename Py_EnterRecursiveCall() to
_Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() and
_Py_LeaveRecursiveCall()
Fix a crash in subinterpreters related to the garbage collector. When
a subinterpreter is deleted, untrack all objects tracked by its GC.
To prevent a crash in deallocator functions expecting objects to be
tracked by the GC, leak a strong reference to these objects on
purpose, so they are never deleted and their deallocator functions
are not called.
Replace "(PyCFunction)(void(*)(void))func" cast with
_PyCFunction_CAST(func).
Change generated by the command:
sed -i -e \
's!(PyCFunction)(void(\*)(void)) *\([A-Za-z0-9_]\+\)!_PyCFunction_CAST(\1)!g' \
$(find -name "*.c")
This makes macOS gdbm provided by Homebrew not segfault through correct
selection of the linked library (-lgdbm_compat) *AND* the correct ndbm-style
header (gdbm-ndbm.h instead of the invalid ndbm.h).
It was initially added to support atomic groups, but that
support was never fully implemented, and CALL was only left
in the compiler, but not interpreter and parser.
ATOMIC_GROUP is now used to support atomic groups.
Just in case there is ever an issue with _posixsubprocess's use of
vfork() due to the complexity of using it properly and potential
directions that Linux platforms where it defaults to on could take, this
adds a failsafe so that users can disable its use entirely by setting
a global flag.
No known reason to disable it exists. But it'd be a shame to encounter
one and not be able to use CPython without patching and rebuilding it.
See the linked issue for some discussion on reasoning.
Also documents the existing way to disable posix_spawn.
Fix signal.NSIG value on FreeBSD to accept signal numbers greater
than 32, like signal.SIGRTMIN and signal.SIGRTMAX.
* Add Py_NSIG constant.
* Add pycore_signal.h internal header file.
* _Py_Sigset_Converter() now includes the range of valid signals in
the error message.
In the limited C API version 3.11 and newer, the following functions
no longer cast their object pointer argument with _PyObject_CAST() or
_PyObject_CAST_CONST():
* Py_REFCNT(), Py_TYPE(), Py_SIZE()
* Py_SET_REFCNT(), Py_SET_TYPE(), Py_SET_SIZE()
* Py_IS_TYPE()
* Py_INCREF(), Py_DECREF()
* Py_XINCREF(), Py_XDECREF()
* Py_NewRef(), Py_XNewRef()
* PyObject_TypeCheck()
* PyType_Check()
* PyType_CheckExact()
Split Py_DECREF() implementation in 3 versions to make the code more
readable.
Update the xxlimited.c extension, which uses the limited C API
version 3.11, to pass PyObject* to these functions.
Python 3.11 now uses C11 standard which adds static_assert()
to <assert.h>.
* In pytime.c, replace Py_BUILD_ASSERT() with preprocessor checks on
SIZEOF_TIME_T with #error.
* On macOS, py_mach_timebase_info() now accepts timebase members with
the same size than _PyTime_t.
* py_get_monotonic_clock() now saturates GetTickCount64() to
_PyTime_MAX: GetTickCount64() is unsigned, whereas _PyTime_t is
signed.
`TextIOWrapper.__init__()` called `os.device_encoding(file.fileno())` if fileno is 0-2 and encoding=None.
But it is very rarely works, and never documented behavior.
Limit the maximum capturing group to 2**30-1 on 64-bit platforms
(it was 2**31-1). No change on 32-bit platforms (2**28-1).
It allows to reduce the size of SRE(match_context):
- On 32 bit platform: 36 bytes, no change. (msvc2022)
- On 64 bit platform: 72 bytes -> 56 bytes. (msvc2022/gcc9.4)
which leads to increasing the depth of backtracking.
Unless sqlite3_blob_open() returns SQLITE_MISUSE, the error code and
message are available on the connection object. This means we have to
handle SQLITE_MISUSE error messages explicitly.
* Move the code for generating Modules/_sre/sre_constants.h from
Lib/re/_constants.py into a separate script
Tools/scripts/generate_sre_constants.py.
* Add target `regen-sre` in the makefile.
* Make target `regen-all` depending on `regen-sre`.
Copying and pickling instances of subclasses of builtin types
bytearray, set, frozenset, collections.OrderedDict, collections.deque,
weakref.WeakSet, and datetime.tzinfo now copies and pickles instance attributes
implemented as slots.
The side effect of this bug was that venv environments directly
used the main interpreter instead of the intermediate stub executable,
which can cause problems when a script uses system APIs that
require the use of an application bundle.
The function PyInit_imp_dummy is declared as void f(PyObject* spec)
but called as void f(void). On wasm targets without the call
trampolines this causes a fatal error.
Automerge-Triggered-By: GH:tiran
bpo-47151: Fallback to fork when vfork fails in subprocess. An OS kernel can specifically decide to disallow vfork() in a process. No need for that to prevent us from launching subprocesses.
- Remove ``--with-tclk-*`` options from `configure`
- Use pkg-config to detect `_tkinter` dependencies (Tcl/Tk, X11)
- Manual override via environment variables `TCLTK_CFLAGS` and `TCLTK_LIBS`
Add macros to cast objects to PyASCIIObject*, PyCompactUnicodeObject*
and PyUnicodeObject*: _PyASCIIObject_CAST(),
_PyCompactUnicodeObject_CAST() and _PyUnicodeObject_CAST(). Using
these new macros make the code more readable and check their argument
with: assert(PyUnicode_Check(op)).
Remove redundant assert(PyUnicode_Check(op)) in macros using directly
or indirectly these new CAST macros.
Replacing existing casts with these macros.
In rare cases, capturing group could get wrong result.
Regular expression engines in Perl and Java have similar bugs.
The new behavior now matches the behavior of more modern
RE engines: in the regex module and in PHP, Ruby and Node.js.
After a long deliberation we ended up feeling that the message argument for Future.cancel(), added in 3.9, was a bad idea, so we're deprecating it in 3.11 and plan to remove it in 3.13.
When compiled with `USE_ZLIB_CRC32` defined (`configure` sets this on POSIX systems), `binascii.crc32(...)` failed to compute the correct value when the input data was >= 4GiB. Because the zlib crc32 API is limited to a 32-bit length.
This lines it up with the `zlib.crc32(...)` implementation that doesn't have that flaw.
**Performance:** This also adopts the same GIL releasing for larger inputs logic that `zlib.crc32` has, and causes the Windows build to always use zlib's crc32 instead of our slow C code as zlib is a required build dependency on Windows.
Clarifies a versionchanged note on crc32 & adler32 docs that the workaround is only needed for Python 2 and earlier.
Also cleans up an unnecessary intermediate variable in the implementation.
Authored-By: Ma Lin / animalize
Co-authored-by: Gregory P. Smith <greg@krypto.org>
* Improve exception compliance with PEP 249
* Raise InterfaceError instead of ProgrammingError for SQLITE_MISUSE.
If SQLITE_MISUSE is raised, it is a sqlite3 module bug. Users of the
sqlite3 module are not responsible for using the SQLite C API correctly.
* Don't overwrite BufferError with ValueError when conversion to BLOB fails.
* Raise ProgrammingError instead of Warning if user tries to execute() more
than one SQL statement.
* Raise ProgrammingError instead of ValueError if an SQL query contains null characters.
* Make sure `_pysqlite_set_result` raises an exception if it returns -1.
wasm32-emscripten does not support exceptfds and requires NULL. Python
now passes NULL instead of a fdset pointer when the input list is empty.
This works fine on all platforms and might even be a tiny bit faster.
Add new functions to pack and unpack C double (serialize and
deserialize):
* PyFloat_Pack2(), PyFloat_Pack4(), PyFloat_Pack8()
* PyFloat_Unpack2(), PyFloat_Unpack4(), PyFloat_Unpack8()
Document these functions and add unit tests.
Rename private functions and move them from the internal C API
to the public C API:
* _PyFloat_Pack2() => PyFloat_Pack2()
* _PyFloat_Pack4() => PyFloat_Pack4()
* _PyFloat_Pack8() => PyFloat_Pack8()
* _PyFloat_Unpack2() => PyFloat_Unpack2()
* _PyFloat_Unpack4() => PyFloat_Unpack4()
* _PyFloat_Unpack8() => PyFloat_Unpack8()
Replace the "unsigned char*" type with "char*" which is more common
and easy to use.
In Linux kernel 5.14 one can dynamically request size of altstacksize
based on hardware capabilities with getauxval(AT_MINSIGSTKSZ).
This changes allows for Python extension's request to Linux kernel
to use AMX_TILE instruction set on Sapphire Rapids Xeon processor
to succeed, unblocking use of the ISA in frameworks.
Introduced HAVE_LINUX_AUXVEC_H in configure.ac and pyconfig.h.in
Used cpython_autoconf:269 docker container to generate configure.
Also from the _asyncio C accelerator module,
and adjust one test that the change caused to fail.
For more discussion see the discussion starting here:
https://github.com/python/cpython/pull/31394#issuecomment-1053545331
(Basically, @asvetlov proposed to return False from cancel()
when there is already a pending cancellation, and I went along,
even though it wasn't necessary for the task group implementation,
and @agronholm has come up with a counterexample that fails
because of this change. So now I'm changing it back to the old
semantics (but still bumping the counter) until we can have a
proper discussion about this.)
<stdbool.h> is the standard/modern way to define embedd/extends Python free to define bool, true and false, but there are existing applications that use slightly different redefinitions, which fail if the header is included.
It's OK to use stdbool outside the public headers, though.
https://bugs.python.org/issue46748
This changes cancelling() and uncancel() to return the count of pending cancellations.
This can be used to avoid bugs in certain edge cases (e.g. two timeouts going off at the same time).
Remove the HAVE_PY_SET_53BIT_PRECISION macro (moved to the internal
C API).
* Move HAVE_PY_SET_53BIT_PRECISION macro to pycore_pymath.h.
* Replace PY_NO_SHORT_FLOAT_REPR macro with _PY_SHORT_FLOAT_REPR
macro which is always defined. gcc -Wundef emits a warning when
using _PY_SHORT_FLOAT_REPR but the macro is not defined, if
pycore_pymath.h include was forgotten.
The libexpat 2.4.1 upgrade from introduced the following new exported symbols:
* `testingAccountingGetCountBytesDirect`
* `testingAccountingGetCountBytesIndirect`
* `unsignedCharToPrintable`
* `XML_SetBillionLaughsAttackProtectionActivationThreshold`
* `XML_SetBillionLaughsAttackProtectionMaximumAmplification`
We need to adjust [Modules/expat/pyexpatns.h](https://github.com/python/cpython/blob/master/Modules/expat/pyexpatns.h)
(The newer libexpat upgrade has no new symbols).
Automerge-Triggered-By: GH:gpshead
asyncio/taskgroups.py is an adaptation of taskgroup.py from EdgeDb, with the following key changes:
- Allow creating new tasks as long as the last task hasn't finished
- Raise [Base]ExceptionGroup (directly) rather than TaskGroupError deriving from MultiError
- Instead of monkey-patching the parent task's cancel() method,
add a new public API to Task
The Task class has a new internal flag, `_cancel_requested`, which is set when `.cancel()` is called successfully. The `.cancelling()` method returns the value of this flag. Further `.cancel()` calls while this flag is set return False. To reset this flag, call `.uncancel()`.
Thus, a Task that catches and ignores `CancelledError` should call `.uncancel()` if it wants to be cancellable again; until it does so, it is deemed to be busy with uninterruptible cleanup.
This new Task API helps solve the problem where TaskGroup needs to distinguish between whether the parent task being cancelled "from the outside" vs. "from inside".
Co-authored-by: Yury Selivanov <yury@edgedb.com>
Co-authored-by: Andrew Svetlov <andrew.svetlov@gmail.com>
We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.
The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
The core of the change is in:
* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers
I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config.
The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
The following are not changed (yet):
* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init
https://bugs.python.org/issue46541
Cast PY_TIMEOUT_MAX to double, not to _PyTime_t.
Fix the clang warning:
Modules/_threadmodule.c:1648:26: warning: implicit conversion from
'_PyTime_t' (aka 'long') to 'double' changes value from
9223372036854775 to 9223372036854776
[-Wimplicit-const-int-float-conversion]
double timeout_max = (_PyTime_t)PY_TIMEOUT_MAX * 1e-6;
^~~~~~~~~~~~~~~~~~~~~~~~~ ~
ctypes.CFUNCTYPE() and ctypes.WINFUNCTYPE() now fail to create the
type if its "_argtypes_" member contains too many arguments.
Previously, the error was only raised when calling a function.
Change also how CFUNCTYPE() and WINFUNCTYPE() handle KeyError to
prevent creating a chain of exceptions if ctypes.CFuncPtr raises an
error.
Convert the PyType_SUPPORTS_WEAKREFS() macro to a regular function.
It no longer access the PyTypeObject.tp_weaklistoffset member
directly.
Add _PyType_SUPPORTS_WEAKREFS() static inline functions, used
internally by Python for best performance.
Add a new _PyType_GetSubclasses() function to get type's subclasses.
_PyType_GetSubclasses(type) returns a list which holds strong
refererences to subclasses. It is safer than iterating on
type->tp_subclasses which yields weak references and can be modified
in the loop.
_PyType_GetSubclasses(type) now holds a reference to the tp_subclasses
dict while creating the list of subclasses.
set_collection_flag_recursive() of _abc.c now uses
_PyType_GetSubclasses().
The signal module now creates its struct_siginfo type as a heap type
using PyStructSequence_NewType(), rather than using a static type.
Add 'siginfo_type' member to the global signal_state_t structure.
The _curses module now creates its ncurses_version type as a heap
type using PyStructSequence_NewType(), rather than using a static
type.
* Move _PyStructSequence_FiniType() definition to pycore_structseq.h.
* test.pythoninfo: log curses.ncurses_version.
The time module now creates its struct_time type as a heap
type using PyStructSequence_NewType(), rather than using a static
type.
* Add a module state to the time module: add traverse, clear and free
functions.
* Use PyModule_AddType().
* Remove the 'initialized' variable.
Py_EndInterpreter() now explicitly untracks all objects currently
tracked by the GC. Previously, if an object was used later by another
interpreter, calling PyObject_GC_UnTrack() on the object crashed if
the previous or the next object of the PyGC_Head structure became a
dangling pointer.
Previously, the main interpreter was allocated on the heap during runtime initialization. Here we instead embed it into _PyRuntimeState, which means it is statically allocated as part of the _PyRuntime global. The same goes for the initial thread state (of each interpreter, including the main one). Consequently there are fewer allocations during runtime/interpreter init, fewer possible failures, and better memory locality.
FYI, this also helps efforts to consolidate globals, which in turns helps work on subinterpreter isolation.
https://bugs.python.org/issue45953
Move almost all private functions of Include/cpython/fileutils.h to
the internal C API Include/internal/pycore_fileutils.h.
Only keep _Py_fopen_obj() in Include/cpython/fileutils.h, since it's
used by _testcapi which must not use the internal C API.
Move EncodeLocaleEx() and DecodeLocaleEx() functions from _testcapi
to _testinternalcapi, since the C API moved to the internal C API.
The idea is to ensure that module `xml.parsers.expat.errors`
contains all known error codes and messages,
even when CPython is compiled or run with an outdated version of libexpat.
https://bugs.python.org/issue45321
* Do not PUSH/POP traceback or type to the stack as part of exc_info
* Remove exc_traceback and exc_type from _PyErr_StackItem
* Add to what's new, because this change breaks things like Cython