Improves the docstring on signal.strsignal to make it explain when it returns a message, None, or when it raises ValueError.
Closes#98930
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Introduce the autocommit attribute to Connection and the autocommit
parameter to connect() for PEP 249-compliant transaction handling.
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Géry Ogam <gery.ogam@gmail.com>
Check to see if `base_executable` exists. If it does not, attempt
to use known alternative names of the python binary to find an
executable in the path specified by `home`.
If no alternative is found, previous behavior is preserved.
Signed-off-by: Vincent Fazio <vfazio@gmail.com>
Signed-off-by: Vincent Fazio <vfazio@gmail.com>
The Py_CLEAR(), Py_SETREF() and Py_XSETREF() macros now only evaluate
their argument once. If an argument has side effects, these side
effects are no longer duplicated.
Add test_py_clear() and test_py_setref() unit tests to _testcapi.
We do the following:
* move the generated _PyUnicode_InitStaticStrings() to its own file
* move the generated _PyStaticObjects_CheckRefcnt() to its own file
* include pycore_global_objects.h in extension modules instead of pycore_runtime_init.h
These changes help us avoid including things that aren't needed.
https://github.com/python/cpython/issues/90868
This makes it more clear that a given test is definitely testing against a single-phase init (legacy) extension module. The new module is a companion to _testmultiphase.
https://github.com/python/cpython/issues/98627
Add PyFrame_GetVar() and PyFrame_GetVarString() functions to get a
frame variable by its name.
Move PyFrameObject C API tests from test_capi to test_frame.
Now that our int<->str conversions are size limited and we have the
_pylong module handling larger integers, we don't need to limit
everything just to avoid wasting time in the quadratic time DoS-like
case while fuzzing.
We can tweak these further after seeing how this goes.
In very rare circumstances the JUMP opcode could be confused with the
argument of the opcode in the "then" part which doesn't end with the
JUMP opcode. This led to incorrect detection of the final JUMP opcode
and incorrect calculation of the size of the subexpression.
NOTE: Changed return value of functions _validate_inner() and
_validate_charset() in Modules/_sre/sre.c. Now they return 0 on success,
-1 on failure, and 1 if the last op is JUMP (which usually is a failure).
Previously they returned 1 on success and 0 on failure.
Previously, the optional restrictions on subinterpreters were: disallow fork, subprocess, and threads. By default, we were disallowing all three for "isolated" interpreters. We always allowed all three for the main interpreter and those created through the legacy `Py_NewInterpreter()` API.
Those settings were a bit conservative, so here we've adjusted the optional restrictions to: fork, exec, threads, and daemon threads. The default for "isolated" interpreters disables fork, exec, and daemon threads. Regular threads are allowed by default. We continue always allowing everything For the main interpreter and the legacy API.
In the code, we add `_PyInterpreterConfig.allow_exec` and `_PyInterpreterConfig.allow_daemon_threads`. We also add `Py_RTFLAGS_DAEMON_THREADS` and `Py_RTFLAGS_EXEC`.
(see https://github.com/python/cpython/issues/98608)
This change does the following:
1. change the argument to a new `_PyInterpreterConfig` struct
2. rename the function to `_Py_NewInterpreterFromConfig()`, inspired by `Py_InitializeFromConfig()` (takes a `_PyInterpreterConfig` instead of `isolated_subinterpreter`)
3. split up the boolean `isolated_subinterpreter` into the corresponding multiple granular settings
* allow_fork
* allow_subprocess
* allow_threads
4. add `PyInterpreterState.feature_flags` to store those settings
5. add a function for checking if a feature is enabled on an opaque `PyInterpreterState *`
6. drop `PyConfig._isolated_interpreter`
The existing default (see `Py_NewInterpeter()` and `Py_Initialize*()`) allows fork, subprocess, and threads and the optional "isolated" interpreter (see the `_xxsubinterpreters` module) disables all three. None of that changes here; the defaults are preserved.
Note that the given `_PyInterpreterConfig` will not be used outside `_Py_NewInterpreterFromConfig()`, nor preserved. This contrasts with how `PyConfig` is currently preserved, used, and even modified outside `Py_InitializeFromConfig()`. I'd rather just avoid that mess from the start for `_PyInterpreterConfig`. We can preserve it later if we find an actual need.
This change allows us to follow up with a number of improvements (e.g. stop disallowing subprocess and support disallowing exec instead).
(Note that this PR adds "private" symbols. We'll probably make them public, and add docs, in a separate change.)
Functions re.sub() and re.subn() and corresponding re.Pattern methods
are now 2-3 times faster for replacement strings containing group references.
Closes#91524
Primarily authored by serhiy-storchaka Serhiy Storchaka
Minor-cleanups-by: Gregory P. Smith [Google] <greg@krypto.org>
Added os.setns and os.unshare to easily switch between namespaces
on Linux.
Co-authored-by: Christian Heimes <christian@python.org>
Co-authored-by: CAM Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Victor Stinner <vstinner@python.org>
The os module and the PyUnicode_FSDecoder() function no longer accept
bytes-like paths, like bytearray and memoryview types: only the exact
bytes type is accepted for bytes strings.
Change summary:
+ There is now a `gzip.READ_BUFFER_SIZE` constant that is 128KB. Other programs that read in 128KB chunks: pigz and cat. So this seems best practice among good programs. Also it is faster than 8 kb chunks.
+ a zlib._ZlibDecompressor was added. This is the _bz2.BZ2Decompressor ported to zlib. Since the zlib.Decompress object is better for in-memory decompression, the _ZlibDecompressor is hidden. It only makes sense in file decompression, and that is already implemented now in the gzip library. No need to bother the users with this.
+ The ZlibDecompressor uses the older Cpython arrange_output_buffer functions, as those are faster and more appropriate for the use case.
+ GzipFile.read has been optimized. There is no longer a `unconsumed_tail` member to write back to padded file. This is instead handled by the ZlibDecompressor itself, which has an internal buffer. `_add_read_data` has been inlined, as it was just two calls.
EDIT: While I am adding improvements anyway, I figured I could add another one-liner optimization now to the python -m gzip application. That read chunks in io.DEFAULT_BUFFER_SIZE previously, but has been updated now to use READ_BUFFER_SIZE chunks.
On macOS, fix a crash in syslog.syslog() in multi-threaded
applications. On macOS, the libc syslog() function is not
thread-safe, so syslog.syslog() no longer releases the GIL to call
it.
Signal wakeup fd errors are now logged with
_PyErr_WriteUnraisableMsg(), rather than PySys_WriteStderr() and
PyErr_WriteUnraisable(), to pass the error message to
sys.unraisablehook. By default, it's still written into stderr (unless
sys.unraisablehook is overriden).
This seems pretty straightforward. The issue mentions other calls in mmapmodule that we could release the GIL on, but those are in methods where we'd need to be careful to ensure that something sensible happens if those are called concurrently. In prior art, note that #12073 released the GIL for munmap. In a toy benchmark, I see the speedup you'd expect from doing this.
Automerge-Triggered-By: GH:gvanrossum
* gh-96821: Fix undefined behaviour in `audioop.c`
Left-shifting negative numbers is undefined behaviour.
Fortunately, multiplication works just as well, is defined behaviour,
and gets compiled to the same machine code as before by optimizing
compilers.
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
These were intentionally skipped when operator was updated to use the argument clinic:
https://github.com/python/cpython/issues/64385#issuecomment-1093641466
However, by not using the argument clinic, they missed out on getting signatures.
This is a narrow PR to update the docstrings so that `__text_signature__` can be
extracted from them. Updating to use the argument clinic is beyond scope.
`methodcaller` uses `*args, **kwargs` to match variadic names used elsewhere,
including in `operator.call`.
The macOS 13 SDK includes support for the `mkfifoat` and `mknodat` system calls.
Using the `dir_fd` option with either `os.mkfifo` or `os.mknod` could result in a
segfault if cpython is built with the macOS 13 SDK but run on an earlier
version of macOS. Prevent this by adding runtime support for detection of
these system calls ("weaklinking") as is done for other newer syscalls on
macOS.
Fix the Python path configuration used to initialized sys.path at
Python startup. Paths are no longer encoded to UTF-8/strict to avoid
encoding errors if it contains surrogate characters (bytes paths are
decoded with the surrogateescape error handler).
getpath_basename() and getpath_dirname() functions no longer encode
the path to UTF-8/strict, but work directly on Unicode strings. These
functions now use PyUnicode_FindChar() and PyUnicode_Substring() on
the Unicode path, rather than strrchr() on the encoded bytes string.
This PR fixes undefined behaviour in the struct module unpacking support functions `bu_longlong`, `lu_longlong`, `bu_int` and `lu_int`; thanks to @kumaraditya303 for finding these.
The fix is to accumulate the bytes in an unsigned integer type instead of a signed integer type, then to convert to the appropriate signed type. In cases where the width matches, that conversion will typically be compiled away to a no-op.
(Evidence from Godbolt: https://godbolt.org/z/5zvxodj64 .)
To make the conversions efficient, I've specialised the relevant functions for their output size: for `bu_longlong` and `lu_longlong`, this only entails checking that the output size is indeed `8`. But `bu_int` and `lu_int` were used for format sizes `2` and `4` - I've split those into two separate functions each.
No tests, because all of the affected cases are already exercised by the test suite.
Fix the faulthandler implementation of faulthandler.register(signal,
chain=True) if the sigaction() function is not available: don't call
the previous signal handler if it's NULL.
⚠️⚠️ Note for reviewers, hackers and fellow systems/low-level/compiler engineers ⚠️⚠️
If you have a lot of experience with this kind of shenanigans and want to improve the **first** version, **please make a PR against my branch** or **reach out by email** or **suggest code changes directly on GitHub**.
If you have any **refinements or optimizations** please, wait until the first version is merged before starting hacking or proposing those so we can keep this PR productive.
datetime.isoformat generates the tzoffset with colons, but there
was no format code to make strftime output the same format.
for simplicity and consistency the %:z formatting behaves mostly
as %z, with the exception of adding colons. this includes the
dynamic behaviour of adding seconds and microseconds only when
needed (when not 0).
this fixes the still open "generate" part of this issue:
https://github.com/python/cpython/issues/69142
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
- Limited API needs to be enabled per source file
- Some builds don't support Limited API, so Limited API tests must be skipped on those builds
(currently this is `Py_TRACE_REFS`, but that may change.)
- `Py_LIMITED_API` must be defined before `<Python.h>` is included.
This puts the hoop-jumping in `testcapi/parts.h`, so individual
test files can be relatively simple. (Currently that's only
`vectorcall_limited.c`, imagine more.)
- On WASI `ENOTCAPABLE` is now mapped to `PermissionError`.
- The `errno` modules exposes the new error number.
- `getpath.py` now ignores `PermissionError` when it cannot open landmark
files `pybuilddir.txt` and `pyenv.cfg`.
* Make sure that tp_dictoffset is correct with Py_TPFLAGS_MANAGED_DICT is set.
* Avoid traversing managed dict twice when subclassing class with Py_TPFLAGS_MANAGED_DICT set.
We only statically initialize for core code and builtin modules. Extension modules still create
the tuple at runtime. We'll solve that part of interpreter isolation separately.
This change includes generated code. The non-generated changes are in:
* Tools/clinic/clinic.py
* Python/getargs.c
* Include/cpython/modsupport.h
* Makefile.pre.in (re-generate global strings after running clinic)
* very minor tweaks to Modules/_codecsmodule.c and Python/Python-tokenize.c
All other changes are generated code (clinic, global strings).
- Move PyUnicode tests to a separate file
- Add some more tests for PyUnicode_FromFormat
Co-authored-by: philg314 <110174000+philg314@users.noreply.github.com>
* Add test for inheriting explicit __dict__ and weakref.
* Restore 3.10 behavior for multiple inheritance of C extension classes that store their dictionary at the end of the struct.
- check for ``dup()`` libc function
- handle missing ``F_DUPFD`` in ``dup2()`` replacement function
- add workaround for WASI libc bug in MSG_TRUNC
- ESHUTDOWN is missing, use EPIPE instead
- POLLPRI is missing, define as 0 (no-op)
* syslog_get_argv() swallows exceptions, but not in all cases.
* if ident is non UTF-8 encodable, syslog.openlog() fails after setting the
global reference to ident. Now the C string saved internally in the previous
call to openlog() points to the freed memory.
* PySys_Audit() can crash if ident is NULL.
* There may be a race condition with syslog.syslog(), because the global
reference to ident is decrefed before setting the new value.
* Possible use of freed memory if syslog.openlog() is called while
the GIL is released in syslog.syslog().
This PR partially reverts gh-24421 (PR) and fixes the remaining concerns
given in gh-93044 (issue):
- keyword arguments are passed as positional arguments to factory()
- if an argument is not passed to sqlite3.connect(), its default value
is passed to factory()
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
The wrapper macros are more readable and match the form recommended in
the OpenSSL documentation. They also slightly less error-prone, as the
mapping of arguments to SSL_CTX_ctrl is not always clear. (Though in
this case it's straightforward.)
https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_get_max_proto_version.html
When binding a unix socket to an empty address on Linux, the socket is
automatically bound to an available address in the abstract namespace.
>>> s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
>>> s.bind("")
>>> s.getsockname()
b'\x0075499'
Since python 3.9, the socket is bound to the one address:
>>> s.getsockname()
b'\x00'
And trying to bind multiple sockets will fail with:
Traceback (most recent call last):
File "/home/nsoffer/src/cpython/Lib/test/test_socket.py", line 5553, in testAutobind
s2.bind("")
OSError: [Errno 98] Address already in use
Added 2 tests:
- Auto binding empty address on Linux
- Failing to bind an empty address on other platforms
Fixes f6b3a07b7d (bpo-44493: Add missing terminated NUL in sockaddr_un's length (GH-26866)
The `_testcapimodule.c` file is getting too large to work with effectively.
This PR lays out a general structure of how tests can be split up, with more splitting to come later if the structure is OK.
Vectorcall tests aren't the biggest issue -- it's just an area I want to work on next, so I'm starting here.
An issue specific to vectorcall tests is that it wasn't clear that e.g. `MethodDescriptor2` is related to testing vectorcall: the `/* Test PEP 590 */` section had an ambiguous end. Separate file should make things like this much clearer.
OTOH, for some pieces it might not be clear where they should be -- I left `meth_fastcall` with tests of the other calling conventions. IMO, even with the ambiguity it's still worth it to split the huge file up.
I'm not sure about the buildsystem changes, hopefully CI will tell me what's wrong.
@vstinner, @markshannon: Do you think this is a good idea?
Automerge-Triggered-By: GH:encukou
Adds `ctypes.c_time_t` to represent the C `time_t` type accurately as its size varies.
Primarily-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org> [Google]
Remove dead code related to ssl.PROTOCOL_SSLv2. ssl.PROTOCOL_SSLv2
was already removed in Python 3.10.
In test_ssl, @requires_tls_version('SSLv2') always returned False.
Extract of the removed code: "OpenSSL has removed support for SSLv2".
Make _struct.Struct a GC type
This fixes a memory leak in the _struct module, where as soon
as a Struct object is stored in the cache, there's a cycle from
the _struct module to the cache to Struct objects to the Struct
type back to the module. If _struct.Struct is not gc-tracked, that
cycle is never collected.
This PR makes _struct.Struct GC-tracked, and adds a regression test.
- c_longlong and c_longdouble need experimental WASM bigint.
- Skip tests that need threading
- Define ``CTYPES_MAX_ARGCOUNT`` for Emscripten. libffi-emscripten 2022-06-23 supports up to 1000 args.
Move the follow functions and type from frameobject.h to pyframe.h,
so the standard <Python.h> provide frame getter functions:
* PyFrame_Check()
* PyFrame_GetBack()
* PyFrame_GetBuiltins()
* PyFrame_GetGenerator()
* PyFrame_GetGlobals()
* PyFrame_GetLasti()
* PyFrame_GetLocals()
* PyFrame_Type
Remove #include "frameobject.h" from many C files. It's no longer
needed.
Deprecate global configuration variable like
Py_IgnoreEnvironmentFlag: the Py_InitializeFromConfig() API should be
instead.
Fix declaration of Py_GETENV(): use PyAPI_FUNC(), not PyAPI_DATA().
Revert "bpo-23689: re module, fix memory leak when a match is terminated by a signal or memory allocation failure (GH-32283)"
This reverts commit 6e3eee5c11.
Manual fixups to increase the MAGIC number and to handle conflicts with
a couple of changes that landed after that.
Thanks for reviews by Ma Lin and Serhiy Storchaka.
The fix involves using pysqlite_check_remaining_sql(), not only to check
for multiple statements, but now also to strip leading comments and
whitespace from SQL statements, so we can improve DML query detection.
pysqlite_check_remaining_sql() is renamed lstrip_sql(), to more
accurately reflect its function, and hardened to handle more SQL comment
corner cases.
When changing PyType_FromMetaclass recently (GH-93012, GH-93466, GH-28748)
I found a bunch of opportunities to improve the code. Here they are.
Fixes: #89546
Automerge-Triggered-By: GH:encukou
It combines PyImport_ImportModule() and PyObject_GetAttrString()
and saves 4-6 lines of code on every use.
Add also _PyImport_GetModuleAttr() which takes Python strings as arguments.
This checks the bases of of a type created using the FromSpec
API to inherit the bases metaclasses. The metaclass's alloc
function will be called as is done in `tp_new` for classes
created in Python.
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@protonmail.com>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
This was added for bpo-40514 (gh-84694) to test out a per-interpreter GIL. However, it has since proven unnecessary to keep the experiment in the repo. (It can be done as a branch in a fork like normal.) So here we are removing:
* the configure option
* the macro
* the code enabled by the macro
Added a new stable API function ``PyType_FromMetaclass``, which mirrors
the behavior of ``PyType_FromModuleAndSpec`` except that it takes an
additional metaclass argument. This is, e.g., useful for language
binding tools that need to store additional information in the type
object.
Python now always use the ``%zu`` and ``%zd`` printf formats to
format a size_t or Py_ssize_t number. Building Python 3.12 requires a
C11 compiler, so these printf formats are now always supported.
* PyObject_Print() and _PyObject_Dump() now use the printf %zd format
to display an object reference count.
* Update PY_FORMAT_SIZE_T comment.
* Remove outdated notes about the %zd format in PyBytes_FromFormat()
and PyUnicode_FromFormat() documentations.
* configure no longer checks for the %zd format and no longer defines
PY_FORMAT_SIZE_T macro in pyconfig.h.
* pymacconfig.h no longer undefines PY_FORMAT_SIZE_T: macOS 10.4 is
no longer supported. Python 3.12 now requires macOS 10.6 (Snow
Leopard) or newer.
The following sqlite3 features were deprecated in 3.10, scheduled for
removal in 3.12:
- sqlite3.OptimizedUnicode (gh-23163)
- sqlite3.enable_shared_cache (gh-24008)
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Signed-off-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
In the limited C API version 3.12, PyUnicode_KIND() is now
implemented as a static inline function. Keep the macro for the
regular C API and for the limited C API version 3.11 and older to
prevent introducing new compiler warnings.
Update _decimal.c and stringlib/eq.h for PyUnicode_KIND().
The `utc_to_seconds` call can fail, here's a minimal reproducer on
Linux:
TZ=UTC python -c "from datetime import *; datetime.fromtimestamp(253402300799 + 1)"
The old behavior still raised an error in a similar way, but only
because subsequent calculations happened to fail as well. Better to fail
fast.
This also refactors the tests to split out the `fromtimestamp` and
`utcfromtimestamp` tests, and to get us closer to the actual desired
limits of the functions. As part of this, we also changed the way we
detect platforms where the same limits don't necessarily apply (e.g.
Windows).
As part of refactoring the tests to hit this condition explicitly (even
though the user-facing behvior doesn't change in any way we plan to
guarantee), I noticed that there was a difference in the places that
`datetime.utcfromtimestamp` fails in the C and pure Python versions, which
was fixed by skipping the "probe for fold" logic for UTC specifically —
since UTC doesn't have any folds or gaps, we were never going to find a
fold value anyway. This should prevent some failures in the pure python
`utcfromtimestamp` method on timestamps close to 0001-01-01.
There are two separate news entries for this because one is a
potentially user-facing change, the other is an internal code
correctness change that, if anything, changes some error messages. The
two happen to be coupled because of the test refactoring, but they are
probably best thought of as independent changes.
Fixes GH-91581
38f331d introduced a delayed initialization routine to set up
ctypes formattable (`_ctypes_init_fielddesc`), but inadvertently
removed setting the `initialization` flag to 1 to avoid initting
each time.
Add the -P command line option and the PYTHONSAFEPATH environment
variable to not prepend a potentially unsafe path to sys.path.
* Add sys.flags.safe_path flag.
* Add PyConfig.safe_path member.
* Programs/_bootstrap_python.c uses config.safe_path=0.
* Update subprocess._optim_args_from_interpreter_flags() to handle
the -P command line option.
* Modules/getpath.py sets safe_path to 1 if a "._pth" file is
present.
One more thing that can help prevent people from using `preexec_fn`.
Also adds conditional skips to two tests exposing ASAN flakiness on the Ubuntu 20.04 Address Sanitizer Github CI system. When that build is run on more modern systems the "problem" does not show up. It seems ASAN implementation related.
Co-authored-by: Zackery Spytz <zspytz@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
``pymain_run_python()`` now imports ``readline`` and ``rlcompleter``
before sys.path is extended to include the current working directory of
an interactive interpreter. Non-interactive interpreters are not
affected.
Also move imports of ``re`` and ``keyword`` module to top level so they
are materialized early, too. The ``keyword`` module is trivial and the
``re`` is already imported via ``inspect`` -> ``linecache``.
#92301: subprocess: Prefer `close_range()` to procfs-based fd closing.
`close_range()` is much faster for large number of file descriptors, e.g.
4 times faster for 1000 descriptors in a Linux 5.16-based environment.
We prefer close_range() only if it's known to be async-signal-safe.
* Map SQLITE_MISUSE to sqlite3.InterfaceError
SQLITE_MISUSE implies misuse of the SQLite C API, which, if it happens,
is _not_ a user error; it is an sqlite3 extension module error.
* Raise better errors when binding parameters fail.
Instead of always raising InterfaceError, guessing what went wrong,
raise accurate exceptions with more accurate error messages.
Currently, calling Py_EnterRecursiveCall() and
Py_LeaveRecursiveCall() may use a function call or a static inline
function call, depending if the internal pycore_ceval.h header file
is included or not. Use a different name for the static inline
function to ensure that the static inline function is always used in
Python internals for best performance. Similar approach than
PyThreadState_GET() (function call) and _PyThreadState_GET() (static
inline function).
* Rename _Py_EnterRecursiveCall() to _Py_EnterRecursiveCallTstate()
* Rename _Py_LeaveRecursiveCall() to _Py_LeaveRecursiveCallTstate()
* pycore_ceval.h: Rename Py_EnterRecursiveCall() to
_Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() and
_Py_LeaveRecursiveCall()
Fix a crash in subinterpreters related to the garbage collector. When
a subinterpreter is deleted, untrack all objects tracked by its GC.
To prevent a crash in deallocator functions expecting objects to be
tracked by the GC, leak a strong reference to these objects on
purpose, so they are never deleted and their deallocator functions
are not called.
Replace "(PyCFunction)(void(*)(void))func" cast with
_PyCFunction_CAST(func).
Change generated by the command:
sed -i -e \
's!(PyCFunction)(void(\*)(void)) *\([A-Za-z0-9_]\+\)!_PyCFunction_CAST(\1)!g' \
$(find -name "*.c")
This makes macOS gdbm provided by Homebrew not segfault through correct
selection of the linked library (-lgdbm_compat) *AND* the correct ndbm-style
header (gdbm-ndbm.h instead of the invalid ndbm.h).
It was initially added to support atomic groups, but that
support was never fully implemented, and CALL was only left
in the compiler, but not interpreter and parser.
ATOMIC_GROUP is now used to support atomic groups.
Just in case there is ever an issue with _posixsubprocess's use of
vfork() due to the complexity of using it properly and potential
directions that Linux platforms where it defaults to on could take, this
adds a failsafe so that users can disable its use entirely by setting
a global flag.
No known reason to disable it exists. But it'd be a shame to encounter
one and not be able to use CPython without patching and rebuilding it.
See the linked issue for some discussion on reasoning.
Also documents the existing way to disable posix_spawn.
Fix signal.NSIG value on FreeBSD to accept signal numbers greater
than 32, like signal.SIGRTMIN and signal.SIGRTMAX.
* Add Py_NSIG constant.
* Add pycore_signal.h internal header file.
* _Py_Sigset_Converter() now includes the range of valid signals in
the error message.
In the limited C API version 3.11 and newer, the following functions
no longer cast their object pointer argument with _PyObject_CAST() or
_PyObject_CAST_CONST():
* Py_REFCNT(), Py_TYPE(), Py_SIZE()
* Py_SET_REFCNT(), Py_SET_TYPE(), Py_SET_SIZE()
* Py_IS_TYPE()
* Py_INCREF(), Py_DECREF()
* Py_XINCREF(), Py_XDECREF()
* Py_NewRef(), Py_XNewRef()
* PyObject_TypeCheck()
* PyType_Check()
* PyType_CheckExact()
Split Py_DECREF() implementation in 3 versions to make the code more
readable.
Update the xxlimited.c extension, which uses the limited C API
version 3.11, to pass PyObject* to these functions.
Python 3.11 now uses C11 standard which adds static_assert()
to <assert.h>.
* In pytime.c, replace Py_BUILD_ASSERT() with preprocessor checks on
SIZEOF_TIME_T with #error.
* On macOS, py_mach_timebase_info() now accepts timebase members with
the same size than _PyTime_t.
* py_get_monotonic_clock() now saturates GetTickCount64() to
_PyTime_MAX: GetTickCount64() is unsigned, whereas _PyTime_t is
signed.
`TextIOWrapper.__init__()` called `os.device_encoding(file.fileno())` if fileno is 0-2 and encoding=None.
But it is very rarely works, and never documented behavior.
Limit the maximum capturing group to 2**30-1 on 64-bit platforms
(it was 2**31-1). No change on 32-bit platforms (2**28-1).
It allows to reduce the size of SRE(match_context):
- On 32 bit platform: 36 bytes, no change. (msvc2022)
- On 64 bit platform: 72 bytes -> 56 bytes. (msvc2022/gcc9.4)
which leads to increasing the depth of backtracking.
Unless sqlite3_blob_open() returns SQLITE_MISUSE, the error code and
message are available on the connection object. This means we have to
handle SQLITE_MISUSE error messages explicitly.
* Move the code for generating Modules/_sre/sre_constants.h from
Lib/re/_constants.py into a separate script
Tools/scripts/generate_sre_constants.py.
* Add target `regen-sre` in the makefile.
* Make target `regen-all` depending on `regen-sre`.
Copying and pickling instances of subclasses of builtin types
bytearray, set, frozenset, collections.OrderedDict, collections.deque,
weakref.WeakSet, and datetime.tzinfo now copies and pickles instance attributes
implemented as slots.
The side effect of this bug was that venv environments directly
used the main interpreter instead of the intermediate stub executable,
which can cause problems when a script uses system APIs that
require the use of an application bundle.
The function PyInit_imp_dummy is declared as void f(PyObject* spec)
but called as void f(void). On wasm targets without the call
trampolines this causes a fatal error.
Automerge-Triggered-By: GH:tiran
bpo-47151: Fallback to fork when vfork fails in subprocess. An OS kernel can specifically decide to disallow vfork() in a process. No need for that to prevent us from launching subprocesses.
- Remove ``--with-tclk-*`` options from `configure`
- Use pkg-config to detect `_tkinter` dependencies (Tcl/Tk, X11)
- Manual override via environment variables `TCLTK_CFLAGS` and `TCLTK_LIBS`
Add macros to cast objects to PyASCIIObject*, PyCompactUnicodeObject*
and PyUnicodeObject*: _PyASCIIObject_CAST(),
_PyCompactUnicodeObject_CAST() and _PyUnicodeObject_CAST(). Using
these new macros make the code more readable and check their argument
with: assert(PyUnicode_Check(op)).
Remove redundant assert(PyUnicode_Check(op)) in macros using directly
or indirectly these new CAST macros.
Replacing existing casts with these macros.
In rare cases, capturing group could get wrong result.
Regular expression engines in Perl and Java have similar bugs.
The new behavior now matches the behavior of more modern
RE engines: in the regex module and in PHP, Ruby and Node.js.
After a long deliberation we ended up feeling that the message argument for Future.cancel(), added in 3.9, was a bad idea, so we're deprecating it in 3.11 and plan to remove it in 3.13.
When compiled with `USE_ZLIB_CRC32` defined (`configure` sets this on POSIX systems), `binascii.crc32(...)` failed to compute the correct value when the input data was >= 4GiB. Because the zlib crc32 API is limited to a 32-bit length.
This lines it up with the `zlib.crc32(...)` implementation that doesn't have that flaw.
**Performance:** This also adopts the same GIL releasing for larger inputs logic that `zlib.crc32` has, and causes the Windows build to always use zlib's crc32 instead of our slow C code as zlib is a required build dependency on Windows.
Clarifies a versionchanged note on crc32 & adler32 docs that the workaround is only needed for Python 2 and earlier.
Also cleans up an unnecessary intermediate variable in the implementation.
Authored-By: Ma Lin / animalize
Co-authored-by: Gregory P. Smith <greg@krypto.org>
* Improve exception compliance with PEP 249
* Raise InterfaceError instead of ProgrammingError for SQLITE_MISUSE.
If SQLITE_MISUSE is raised, it is a sqlite3 module bug. Users of the
sqlite3 module are not responsible for using the SQLite C API correctly.
* Don't overwrite BufferError with ValueError when conversion to BLOB fails.
* Raise ProgrammingError instead of Warning if user tries to execute() more
than one SQL statement.
* Raise ProgrammingError instead of ValueError if an SQL query contains null characters.
* Make sure `_pysqlite_set_result` raises an exception if it returns -1.
wasm32-emscripten does not support exceptfds and requires NULL. Python
now passes NULL instead of a fdset pointer when the input list is empty.
This works fine on all platforms and might even be a tiny bit faster.
Add new functions to pack and unpack C double (serialize and
deserialize):
* PyFloat_Pack2(), PyFloat_Pack4(), PyFloat_Pack8()
* PyFloat_Unpack2(), PyFloat_Unpack4(), PyFloat_Unpack8()
Document these functions and add unit tests.
Rename private functions and move them from the internal C API
to the public C API:
* _PyFloat_Pack2() => PyFloat_Pack2()
* _PyFloat_Pack4() => PyFloat_Pack4()
* _PyFloat_Pack8() => PyFloat_Pack8()
* _PyFloat_Unpack2() => PyFloat_Unpack2()
* _PyFloat_Unpack4() => PyFloat_Unpack4()
* _PyFloat_Unpack8() => PyFloat_Unpack8()
Replace the "unsigned char*" type with "char*" which is more common
and easy to use.
In Linux kernel 5.14 one can dynamically request size of altstacksize
based on hardware capabilities with getauxval(AT_MINSIGSTKSZ).
This changes allows for Python extension's request to Linux kernel
to use AMX_TILE instruction set on Sapphire Rapids Xeon processor
to succeed, unblocking use of the ISA in frameworks.
Introduced HAVE_LINUX_AUXVEC_H in configure.ac and pyconfig.h.in
Used cpython_autoconf:269 docker container to generate configure.
Also from the _asyncio C accelerator module,
and adjust one test that the change caused to fail.
For more discussion see the discussion starting here:
https://github.com/python/cpython/pull/31394#issuecomment-1053545331
(Basically, @asvetlov proposed to return False from cancel()
when there is already a pending cancellation, and I went along,
even though it wasn't necessary for the task group implementation,
and @agronholm has come up with a counterexample that fails
because of this change. So now I'm changing it back to the old
semantics (but still bumping the counter) until we can have a
proper discussion about this.)
<stdbool.h> is the standard/modern way to define embedd/extends Python free to define bool, true and false, but there are existing applications that use slightly different redefinitions, which fail if the header is included.
It's OK to use stdbool outside the public headers, though.
https://bugs.python.org/issue46748
This changes cancelling() and uncancel() to return the count of pending cancellations.
This can be used to avoid bugs in certain edge cases (e.g. two timeouts going off at the same time).
Remove the HAVE_PY_SET_53BIT_PRECISION macro (moved to the internal
C API).
* Move HAVE_PY_SET_53BIT_PRECISION macro to pycore_pymath.h.
* Replace PY_NO_SHORT_FLOAT_REPR macro with _PY_SHORT_FLOAT_REPR
macro which is always defined. gcc -Wundef emits a warning when
using _PY_SHORT_FLOAT_REPR but the macro is not defined, if
pycore_pymath.h include was forgotten.
The libexpat 2.4.1 upgrade from introduced the following new exported symbols:
* `testingAccountingGetCountBytesDirect`
* `testingAccountingGetCountBytesIndirect`
* `unsignedCharToPrintable`
* `XML_SetBillionLaughsAttackProtectionActivationThreshold`
* `XML_SetBillionLaughsAttackProtectionMaximumAmplification`
We need to adjust [Modules/expat/pyexpatns.h](https://github.com/python/cpython/blob/master/Modules/expat/pyexpatns.h)
(The newer libexpat upgrade has no new symbols).
Automerge-Triggered-By: GH:gpshead
asyncio/taskgroups.py is an adaptation of taskgroup.py from EdgeDb, with the following key changes:
- Allow creating new tasks as long as the last task hasn't finished
- Raise [Base]ExceptionGroup (directly) rather than TaskGroupError deriving from MultiError
- Instead of monkey-patching the parent task's cancel() method,
add a new public API to Task
The Task class has a new internal flag, `_cancel_requested`, which is set when `.cancel()` is called successfully. The `.cancelling()` method returns the value of this flag. Further `.cancel()` calls while this flag is set return False. To reset this flag, call `.uncancel()`.
Thus, a Task that catches and ignores `CancelledError` should call `.uncancel()` if it wants to be cancellable again; until it does so, it is deemed to be busy with uninterruptible cleanup.
This new Task API helps solve the problem where TaskGroup needs to distinguish between whether the parent task being cancelled "from the outside" vs. "from inside".
Co-authored-by: Yury Selivanov <yury@edgedb.com>
Co-authored-by: Andrew Svetlov <andrew.svetlov@gmail.com>
We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.
The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
The core of the change is in:
* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers
I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config.
The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
The following are not changed (yet):
* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init
https://bugs.python.org/issue46541
Cast PY_TIMEOUT_MAX to double, not to _PyTime_t.
Fix the clang warning:
Modules/_threadmodule.c:1648:26: warning: implicit conversion from
'_PyTime_t' (aka 'long') to 'double' changes value from
9223372036854775 to 9223372036854776
[-Wimplicit-const-int-float-conversion]
double timeout_max = (_PyTime_t)PY_TIMEOUT_MAX * 1e-6;
^~~~~~~~~~~~~~~~~~~~~~~~~ ~
ctypes.CFUNCTYPE() and ctypes.WINFUNCTYPE() now fail to create the
type if its "_argtypes_" member contains too many arguments.
Previously, the error was only raised when calling a function.
Change also how CFUNCTYPE() and WINFUNCTYPE() handle KeyError to
prevent creating a chain of exceptions if ctypes.CFuncPtr raises an
error.
Convert the PyType_SUPPORTS_WEAKREFS() macro to a regular function.
It no longer access the PyTypeObject.tp_weaklistoffset member
directly.
Add _PyType_SUPPORTS_WEAKREFS() static inline functions, used
internally by Python for best performance.
Add a new _PyType_GetSubclasses() function to get type's subclasses.
_PyType_GetSubclasses(type) returns a list which holds strong
refererences to subclasses. It is safer than iterating on
type->tp_subclasses which yields weak references and can be modified
in the loop.
_PyType_GetSubclasses(type) now holds a reference to the tp_subclasses
dict while creating the list of subclasses.
set_collection_flag_recursive() of _abc.c now uses
_PyType_GetSubclasses().
The signal module now creates its struct_siginfo type as a heap type
using PyStructSequence_NewType(), rather than using a static type.
Add 'siginfo_type' member to the global signal_state_t structure.
The _curses module now creates its ncurses_version type as a heap
type using PyStructSequence_NewType(), rather than using a static
type.
* Move _PyStructSequence_FiniType() definition to pycore_structseq.h.
* test.pythoninfo: log curses.ncurses_version.
The time module now creates its struct_time type as a heap
type using PyStructSequence_NewType(), rather than using a static
type.
* Add a module state to the time module: add traverse, clear and free
functions.
* Use PyModule_AddType().
* Remove the 'initialized' variable.
Py_EndInterpreter() now explicitly untracks all objects currently
tracked by the GC. Previously, if an object was used later by another
interpreter, calling PyObject_GC_UnTrack() on the object crashed if
the previous or the next object of the PyGC_Head structure became a
dangling pointer.
Previously, the main interpreter was allocated on the heap during runtime initialization. Here we instead embed it into _PyRuntimeState, which means it is statically allocated as part of the _PyRuntime global. The same goes for the initial thread state (of each interpreter, including the main one). Consequently there are fewer allocations during runtime/interpreter init, fewer possible failures, and better memory locality.
FYI, this also helps efforts to consolidate globals, which in turns helps work on subinterpreter isolation.
https://bugs.python.org/issue45953
Move almost all private functions of Include/cpython/fileutils.h to
the internal C API Include/internal/pycore_fileutils.h.
Only keep _Py_fopen_obj() in Include/cpython/fileutils.h, since it's
used by _testcapi which must not use the internal C API.
Move EncodeLocaleEx() and DecodeLocaleEx() functions from _testcapi
to _testinternalcapi, since the C API moved to the internal C API.
The idea is to ensure that module `xml.parsers.expat.errors`
contains all known error codes and messages,
even when CPython is compiled or run with an outdated version of libexpat.
https://bugs.python.org/issue45321
* Do not PUSH/POP traceback or type to the stack as part of exc_info
* Remove exc_traceback and exc_type from _PyErr_StackItem
* Add to what's new, because this change breaks things like Cython
This defines VPATH differently in PGO instrumentation builds, to account for a different default output directory. It also adds sys._vpath on Windows to make the value available to sysconfig so that it can be used in tests.
When Python is embedded in other applications, it is not easy to determine which version of Python is being used. This change exposes the Python version as part of the API data. Tools like Austin (https://github.com/P403n1x87/austin) can benefit from this data when targeting applications like uWSGI, as the Python version can then be inferred systematically by looking at the exported symbols rather than relying on unreliable pattern matching or other hacks (like remote code execution etc...).
Automerge-Triggered-By: GH:pablogsal