Suppress all `OSError` exceptions from `pathlib.Path.exists()` and `is_*()`
rather than a selection of more common errors as we do presently. Also
adjust the implementations to call `os.path.exists()` etc, which are much
faster on Windows thanks to GH-101196.
Follow-up of gh-101693. The previous DeprecationWarning is replaced with
raising sqlite3.ProgrammingError.
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Remove support for supplying additional positional arguments to
`PurePath.relative_to()` and `is_relative_to()`. This has been deprecated
since Python 3.12.
_PyWeakref_ClearRef was previously exposed in the public C-API, although
it begins with an underscore and is not documented. It's used by a few
C-API extensions. There is currently no alternative public API that can
replace its use.
_PyWeakref_ClearWeakRefsExceptCallbacks is the only thread-safe way to
use _PyWeakref_ClearRef in the free-threaded build. This exposes the C
symbol, but does not make the API public.
Some embedders and extensions include parts of the internal API. The
pycore_mimalloc.h file is transitively include by a number of other
internal headers. This avoids include errors for code that was
already including those headers.
The `list_preallocate_exact` function did not zero initialize array
contents. In the free-threaded build, this could expose uninitialized
memory to concurrent readers between the call to
`list_preallocate_exact` and the filling of the array contents with
items.
Callbacks registered in the tkinter module now take arguments as
various Python objects (int, float, bytes, tuple), not just str.
To restore the previous behavior set tkinter module global wantobject to 1
before creating the Tk object or call the wantobject() method of the Tk object
with argument 1.
Calling it with argument 2 restores the current default behavior.
Fix an edge case in `binascii.a2b_base64` strict mode, where
excessive padding was not detected when no padding is necessary.
Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
Rationale
=========
argparse performs a complex formatting of the usage for argument grouping
and for line wrapping to fit the terminal width. This formatting has been
a constant source of bugs for at least 10 years (see linked issues below)
where defensive assertion errors are triggered or brackets and paranthesis
are not properly handeled.
Problem
=======
The current implementation of argparse usage formatting relies on regular
expressions to group arguments usage only to separate them again later
with another set of regular expressions. This is a complex and error prone
approach that caused all the issues linked below. Special casing certain
argument formats has not solved the problem. The following are some of
the most common issues:
- empty `metavar`
- mutually exclusive groups with `SUPPRESS`ed arguments
- metavars with whitespace
- metavars with brackets or paranthesis
Solution
========
The following two comments summarize the solution:
- https://github.com/python/cpython/issues/82091#issuecomment-1093832187
- https://github.com/python/cpython/issues/77048#issuecomment-1093776995
Mainly, the solution is to rewrite the usage formatting to avoid the
group-then-separate approach. Instead, the usage parts are kept separate
and only joined together at the end. This allows for a much simpler
implementation that is easier to understand and maintain. It avoids the
regular expressions approach and fixes the corresponding issues.
This closes the following GitHub issues:
- #62090
- #62549
- #77048
- #82091
- #89743
- #96310
- #98666
These PRs become obsolete:
- #15372
- #96311
Add missing import to code that handles too large files and offsets.
Use list, not tuple, for a mutable sequence.
Add tests to prevent similar mistakes.
---------
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
This change makes sure all extension/builtin modules have their init function run first by the main interpreter before proceeding with import in the original interpreter (main or otherwise). This means when the import of a single-phase init module fails in an isolated subinterpreter, it won't tie any global state/callbacks to the subinterpreter.
Add the ability to enable/disable the GIL at runtime, and use that in
the C module loading code.
We can't know before running a module init function if it supports
free-threading, so the GIL is temporarily enabled before doing so. If
the module declares support for running without the GIL, the GIL is
later disabled. Otherwise, the GIL is permanently enabled, and will
never be disabled again for the life of the current interpreter.
Now, such classes will no longer require changes in Python 3.13 in the normal case.
The test suite for robotframework passes with no DeprecationWarnings under this PR.
I also added a new DeprecationWarning for the case where `_field_types` exists
but is incomplete, since that seems likely to indicate a user mistake.
Add _PyType_LookupRef and use incref before setting attribute on type
Makes setting an attribute on a class and signaling type modified atomic
Avoid adding re-entrancy exposing the type cache in an inconsistent state by decrefing after type is updated
* Add PhotoImage.read() to read an image from a file.
* Add PhotoImage.data() to get the image data.
* Add background and grayscale parameters to PhotoImage.write().
* Add the PhotoImage method copy_replace() to copy a region
from one image to other image, possibly with pixel zooming and/or
subsampling.
* Add from_coords parameter to PhotoImage methods copy(), zoom() and subsample().
* Add zoom and subsample parameters to PhotoImage method copy().
The designated initializer syntax in static inline functions in pycore_backoff.h
causes problems for C++ or MSVC users who aren't yet using C++20.
While internal, pycore_backoff.h is included (indirectly, via pycore_code.h)
by some key 3rd party software that does so for speed.
This is *not* sufficient for the final 3.13 release, but it will do for beta 1:
- What's new entry
- Updated changelog entry (news blurb)
- Mention the proxy for f_globals in the datamodel and Python frame object docs
This doesn't have any C API details (what's new refers to the PEP).
For converting large ints to strings, CPython invokes a function in _pylong.py,
which uses the decimal module to implement an asymptotically waaaaay
sub-quadratic algorithm. But if the C decimal module isn't available, CPython
uses _pydecimal.py instead. Which in turn frequently does str(int). If the int
is very large, _pylong ends up doing the work, which in turn asks decimal to do
"big" arithmetic, which in turn calls str(big_int), which in turn ... it can
become infinite mutual recursion.
This change introduces a different int->str function that doesn't use decimal.
It's asymptotically worse, "Karatsuba time" instead of quadratic time, so
still a huge improvement. _pylong switches to that when the C decimal isn't
available. It is also used for not too large integers (less than 450_000 bits),
where it is faster (up to 2 times for 30_000 bits) than the asymptotically
better implementation that uses the C decimal.
Co-authored-by: Tim Peters <tim.peters@gmail.com>
* Initial stab.
* Test the tentative fix. Hangs "forever" without this change.
* Move the new test to a better spot.
* New comment to explain why _convert_to_str allows any poewr of 10.
* Fixed a comment, and fleshed out an existing test that appeared unfinished.
* Added temporary asserts. Or maybe permanent ;-)
* Update Lib/_pydecimal.py
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
* Remove the new _convert_to_str().
Serhiy and I independently concluded that exact powers of 10
aren't possible in these contexts, so just checking the
string length is sufficient.
* At least for now, add the asserts to the other block too.
* 📜🤖 Added by blurb_it.
---------
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
This PR adds the ability to enable the GIL if it was disabled at
interpreter startup, and modifies the multi-phase module initialization
path to enable the GIL when loading a module, unless that module's spec
includes a slot indicating it can run safely without the GIL.
PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went
with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148.
A warning will be issued up to once per interpreter for the first
GIL-using module that is loaded. If `-v` is given, a shorter message
will be printed to stderr every time a GIL-using module is loaded
(including the first one that issues a warning).
The function returns `True` or `False` depending on whether the GIL is
currently enabled. In the default build, it always returns `True`
because the GIL is always enabled.
Now inspect.signature() supports references to the module globals in
parameter defaults on methods in extension modules. Previously it was
only supported in functions. The workaround was to specify the fully
qualified name, including the module name.
Add "Raw" variant of PyTime functions:
* PyTime_MonotonicRaw()
* PyTime_PerfCounterRaw()
* PyTime_TimeRaw()
Changes:
* Add documentation and tests. Tests release the GIL while calling
raw clock functions.
* py_get_system_clock() and py_get_monotonic_clock() now check that
the GIL is hold by the caller if raise_exc is non-zero.
* Reimplement "Unchecked" functions with raw clock functions.
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Account for `add_stopiteration_handler` pushing a block for `async with`.
To allow generator functions that previously almost hit the `CO_MAXBLOCKS`
limit by nesting non-async blocks, the limit is increased by 1.
This increase allows one more block in non-generator functions.
Add code and config for a minimal Android app, and instructions to build and run it.
Improve Android build instructions in general.
Add a tool subcommand to download the Gradle wrapper (with its binary blob). Android
studio must be downloaded manually (due to the license).
The code for Tier 2 is now only compiled when configured
with `--enable-experimental-jit[=yes|interpreter]`.
We drop support for `PYTHON_UOPS` and -`Xuops`,
but you can disable the interpreter or JIT
at runtime by setting `PYTHON_JIT=0`.
You can also build it without enabling it by default
using `--enable-experimental-jit=yes-off`;
enable with `PYTHON_JIT=1`.
On Windows, the `build.bat` script supports
`--experimental-jit`, `--experimental-jit-off`,
`--experimental-interpreter`.
In the C code, `_Py_JIT` is defined as before
when the JIT is enabled; the new variable
`_Py_TIER2` is defined when the JIT *or* the
interpreter is enabled. It is actually a bitmask:
1: JIT; 2: default-off; 4: interpreter.
* Allow to specify the signature of custom callable instances of extension
type by the __text_signature__ attribute.
* Specify signatures of operator.attrgetter, operator.itemgetter, and
operator.methodcaller instances.
While properties like IPv6Address.is_private account for IPv4-mapped
IPv6 addresses, such as for example:
>>> ipaddress.ip_address("192.168.0.1").is_private
True
>>> ipaddress.ip_address("::ffff:192.168.0.1").is_private
True
...the same doesn't currently apply to the is_loopback property:
>>> ipaddress.ip_address("127.0.0.1").is_loopback
True
>>> ipaddress.ip_address("::ffff:127.0.0.1").is_loopback
False
At minimum, this inconsistency between different properties is
counter-intuitive. Moreover, ::ffff:127.0.0.0/104 is for all intents and
purposes a loopback address, and should be treated as such.
sqlite3.iterdump() depends on the row factory returning resulting rows
as tuples; it will fail with custom row factories like for example a
dict factory.
With this commit, we explicitly reset the row factory of the cursor used
by iterdump(), so we always get predictable results. This does not
affect the row factory of the parent connection.
Co-authored-by: Mariusz Felisiak <felisiak.mariusz@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
* Add name and mode attributes for compressed and archived file-like objects
in modules bz2, lzma, tarfile and zipfile.
* Change the value of the mode attribute of GzipFile from integer (1 or 2)
to string ('rb' or 'wb').
* Change the value of the mode attribute of ZipExtFile from 'r' to 'rb'.
The second item in the tuple returned from `__reduce__()` is a tuple of arguments to supply to path constructor. Previously we returned the `parts` tuple here, which entailed joining, parsing and normalising the path object, and produced a compact pickle representation.
With this patch, we instead return a tuple of paths that were originally given to the path constructor. This makes pickling much faster (at the expense of compactness).
It's worth noting that, in the olden times, pathlib performed this parsing/normalization up-front in every case, and so using `parts` for pickling was almost free. Nowadays pathlib only parses/normalises paths when it's necessary or advantageous to do so (e.g. computing a path parent, or iterating over a directory, respectively).
Tarfile.addfile now throws an ValueError when the user passes
in a non-zero size tarinfo but does not provide a fileobj,
instead of writing an incomplete entry.
Only treat '\n', '\r' and '\r\n' as line separators in re-folding the email
messages. Preserve control characters '\v', '\f', '\x1c', '\x1d' and '\x1e'
and Unicode line separators '\x85', '\u2028' and '\u2029' as is.
Older libedit versions (like Apple's) use a different type signature
for rl_startup_hook and rl_pre_input_hook. Add a configure check to
determine which signature is accepted by introducing the
Py_RL_STARTUP_HOOK_TAKES_ARGS macro in pyconfig.h.
Fix mimalloc allocator for huge memory allocation (around
8,589,934,592 GiB) on s390x.
Abort allocation early in mimalloc if the number of slices doesn't
fit into uint32_t, to prevent a integer overflow (cast 64-bit
size_t to uint32_t).
Since 6258844c, paths that might not exist can be fed into pathlib's
globbing implementation, which will call `os.scandir()` / `os.lstat()` only
when strictly necessary. This allows us to drop an initial `self.is_dir()`
call, which saves a `stat()`.
Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
rfc9110 obsoletes the earlier rfc 7231. This document also includes some
status codes that were previously only used for WebDAV and assigns more
generic names to these status codes.
ref: https://www.rfc-editor.org/rfc/rfc9110.html#name-changes-from-rfc-7231
- http.HTTPStatus.CONTENT_TOO_LARGE (413, previously
REQUEST_ENTITY_TOO_LARGE)
- http.HTTPStatus.URI_TOO_LONG (414, previously REQUEST_URI_TOO_LONG)
- http.HTTPStatus.RANGE_NOT_SATISFYABLE (416, previously
REQUEST_RANGE_NOT_SATISFYABLE)
- http.HTTPStatus.UNPROCESSABLE_CONTENT (422, previously
UNPROCESSABLE_ENTITY)
The new constants are added to http.HTTPStatus and the old constant names are
preserved for backwards compatibility.
References in documentation to the obsoleted rfc 7231 are updated
Replace use of `os.listdir()` with `os.scandir()`. Forgo setting `_drv`,
`_root` and `_tail_cached`, as these usually aren't needed. Use
`os.DirEntry.path` to set `_str`.
Don't bother calling `os.scandir()` to scan for literal pattern segments,
like `foo` in `foo/*.py`. Instead, append the segment(s) as-is and call
through to the next selector with `exists=False`, which signals that the
path might not exist. Subsequent selectors will call `os.scandir()` or
`os.lstat()` to filter out missing paths as needed.
This change gives a significant speedup, as the METH_FASTCALL calling
convention is now used. The following bytes and bytearray methods are adapted:
- count()
- find()
- index()
- rfind()
- rindex()
Co-authored-by: Inada Naoki <songofacandy@gmail.com>
Detect libcrypto BLAKE2, Shake, SHA3, and Truncated-SHA512 support at hashlib build time
## BLAKE2
While OpenSSL supports both "b" and "s" variants of the BLAKE2 hash
function, other cryptographic libraries may lack support for one or both
of the variants. This commit modifies `hashlib`'s C code to detect
whether or not the linked libcrypto supports each BLAKE2 variant, and
elides references to each variant's NID accordingly. In cases where the
underlying libcrypto doesn't fully support BLAKE2, CPython's
`./configure` script can be given the following flag to use CPython's
interned BLAKE2 implementation: `--with-builtin-hashlib-hashes=blake2`.
## SHA3, Shake, & truncated SHA512.
Detect BLAKE2, SHA3, Shake, & truncated SHA512 support in the
OpenSSL-ish libcrypto library at build time. This helps allow hashlib's
`_hashopenssl` to be used with libraries that do not to support every
algorithm that upstream OpenSSL does. Such as AWS-LC & BoringSSL.
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
gh-16429 introduced support for an iterable of separators in
Stream.readuntil. Since bytes-like types are themselves iterable, this
can introduce ambiguities in deciding whether the argument is an
iterator of separators or a singleton separator. In gh-16429, only 'bytes'
was considered a singleton, but this will break code that passes other
buffer object types.
Fix it by only supporting tuples rather than arbitrary iterables.
Closes gh-117722.
Fall back to tp_call() for cases when arguments are passed by name.
Co-authored-by: Donghee Na <donghee.na@python.org>
Co-authored-by: Victor Stinner <vstinner@python.org>
Move `pathlib.Path.walk()` implementation into `glob._Globber`. The new
`glob._Globber.walk()` classmethod works with strings internally, which is
a little faster than generating `Path` objects and keeping them normalized.
The `pathlib.Path.walk()` method converts the strings back to path objects.
In the private pathlib ABCs, our existing subclass of `_Globber` ensures
that `PathBase` instances are used throughout.
Follow-up to #117589.
Move pathlib globbing implementation into a new private class: `glob._Globber`. This class implements fast string-based globbing. It's called by `pathlib.Path.glob()`, which then converts strings back to path objects.
In the private pathlib ABCs, add a `pathlib._abc.Globber` subclass that works with `PathBase` objects rather than strings, and calls user-defined path methods like `PathBase.stat()` rather than `os.stat()`.
This sets the stage for two more improvements:
- GH-115060: Query non-wildcard segments with `lstat()`
- GH-116380: Unify `pathlib` and `glob` implementations of globbing.
No change to the implementations of `glob.glob()` and `glob.iglob()`.
Moves the validation for invalid years in the C implementation of the `datetime` module into a common location between `fromisoformat` and `fromisocalendar`, which improves the error message and fixes a failed assertion when parsing invalid ISO 8601 years using one of the "ISO weeks" formats.
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
This prevents external cancellations of a task group's parent task to
be dropped when an internal cancellation happens at the same time.
Also strengthen the semantics of uncancel() to clear self._must_cancel
when the cancellation count reaches zero.
Co-Authored-By: Tin Tvrtković <tinchester@gmail.com>
Co-Authored-By: Arthur Tacca
Apply the following optimizations to `posixpath.realpath()`:
- Remove use of recursion
- Construct child paths directly rather than using `join()`
- Use `os.getcwd[b]()` rather than `abspath()`
- Use `startswith(sep)` rather than `isabs()`
- Use slicing rather than `split()`
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Introduce a unified 16-bit backoff counter type (``_Py_BackoffCounter``),
shared between the Tier 1 adaptive specializer and the Tier 2 optimizer. The
API used for adaptive specialization counters is changed but the behavior is
(supposed to be) identical.
The behavior of the Tier 2 counters is changed:
- There are no longer dynamic thresholds (we never varied these).
- All counters now use the same exponential backoff.
- The counter for ``JUMP_BACKWARD`` starts counting down from 16.
- The ``temperature`` in side exits starts counting down from 64.
This change gives a significant speedup, as the METH_FASTCALL calling
convention is now used. The following methods are adapted:
- str.count
- str.find
- str.index
- str.rfind
- str.rindex
On Linux >= 2.6.36 with glibc < 2.27, `getcwd()` can return a relative
pathname starting with '(unreachable)'. We detect this and fail with
ENOENT, matching new glibc behaviour.
Co-authored-by: Petr Viktorin <encukou@gmail.com>
* as_completed returns object that is both iterator and async iterator
* Existing tests adjusted to test both the old and new style
* New test to ensure iterator can be resumed
* New test to ensure async iterator yields any passed-in Futures as-is
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>
* Extract method for _read_inner, reducing complexity and indentation by 1.
* Extract method for _raise_all and yield ParseErrors from _read_inner.
Reduces complexity by 1 and reduces touch points for handling errors in _read_inner.
* Prefer iterators to splat expansion and literal indexing.
* Extract method for _strip_comments. Reduces complexity by 7.
* Model the file lines in a class to encapsulate the comment status and cleaned value.
* Encapsulate the read state as a dataclass
* Extract _handle_continuation_line and _handle_rest methods. Reduces complexity by 8.
* Reindent
* At least for now, collect errors in the ReadState
* Check for missing section header separately.
* Extract methods for _handle_header and _handle_option. Reduces complexity by 6.
* Remove unreachable code. Reduces complexity by 4.
* Remove unreachable branch
* Handle error condition early. Reduces complexity by 1.
* Add blurb
* Move _raise_all to ParsingError, as its behavior is most closely related to the exception class and not the reader.
* Split _strip* into separate methods.
* Refactor _strip_full to compute the strip just once and use 'not any' to determine the factor.
* Replace use of 'sys.maxsize' with direct computation of the stripped value.
* Extract has_comments as a dynamic property.
* Implement clean as a cached property.
* Model comment prefixes in the RawConfigParser within a prefixes namespace.
* Use a regular expression to search for the first match.
Avoids mutating variables and tricky logic and over-computing all of the starts when only the first is relevant.
Remove extra self DECREF on ssl "no ciphers" error path.
This doesn't come up in practice because nobody links against a broken
OpenSSL library that provides nothing.
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Jacob Coffee <jacob@z7x.org>
Co-authored-by: Malcolm Smith <smith@chaquo.com>
Co-authored-by: Ned Deily <nad@python.org>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Malcolm Smith <smith@chaquo.com>
Co-authored-by: Ned Deily <nad@python.org>
* Reads zip64 files as produced by the zipfile module
* Include tests (somewhat slow, however, because of the need to create "large" zips)
* About the same amount of strictness reading invalid zip files as zipfile has
* Still works on files with prepended data (like pex)
There are a lot more test cases at https://github.com/thatch/zipimport64/ that give me confidence that this works for real-world files.
Fixes#89739 and #77140.
---------
Co-authored-by: Itamar Ostricher <itamarost@gmail.com>
Reviewed-by: Gregory P. Smith <greg@krypto.org>
Explicitly handle the case where stdout=STDOUT
as otherwise the existing error handling gets
confused and reports hard to understand errors.
Signed-off-by: Paulo Neves <ptsneves@gmail.com>
Fix parsing of the following corner cases:
* URLs with only a host name
* URLs containing a fragment
* URLs containing a query
* filenames with only a UNC sharepoint on Windows
Co-authored-by: Dong-hee Na <donghee.na92@gmail.com>
Change old space bit of young objects from 0 to gcstate->visited_space.
This ensures that any object created *and* collected during cycle GC has the bit set correctly.
Python 3.10 changed from using SSL_write() and SSL_read() to SSL_write_ex() and
SSL_read_ex(), but did not update handling of the return value.
Change error handling so that the return value is not examined.
OSError (not EOF) is now returned when retval is 0.
According to *recent* man pages of all functions for which we call
PySSL_SetError, (in OpenSSL 3.0 and 1.1.1), their return value should
be used to determine whether an error happened (i.e. if PySSL_SetError
should be called), but not what kind of error happened (so,
PySSL_SetError shouldn't need retval). To get the error,
we need to use SSL_get_error.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
This fixes XML unittest fallout from the https://github.com/python/cpython/issues/115398 security fix. When configured using `--with-system-expat` on systems with older pre 2.6.0 versions of libexpat, our unittests were failing.
* sax|etree: Simplify Expat version guard where simplifiable
Idea by Matěj Cepl
* sax|etree: Fix reparse deferral tests for vanilla Expat <2.6.0
This *does not fix* the case of distros with an older version of libexpat with the 2.6.0 feature backported as a security fix. (Ubuntu is a known example of this with its libexpat1 2.5.0-2ubunutu0.1 package)
Instead of calling `exec()` once for each function added to a dataclass, only call `exec()` once per dataclass. This can lead to speed improvements of up to 20%.
* GH-113171: Fix "private" (really non-global) IP address ranges
The _private_networks variables, used by various is_private
implementations, were missing some ranges and at the same time had
overly strict ranges (where there are more specific ranges considered
globally reachable by the IANA registries).
This patch updates the ranges with what was missing or otherwise
incorrect.
I left 100.64.0.0/10 alone, for now, as it's been made special in [1]
and I'm not sure if we want to undo that as I don't quite understand the
motivation behind it.
The _address_exclude_many() call returns 8 networks for IPv4, 121
networks for IPv6.
[1] https://github.com/python/cpython/issues/61602
Add Py_GetConstant() and Py_GetConstantBorrowed() functions.
In the limited C API version 3.13, getting Py_None, Py_False,
Py_True, Py_Ellipsis and Py_NotImplemented singletons is now
implemented as function calls at the stable ABI level to hide
implementation details. Getting these constants still return borrowed
references.
Add _testlimitedcapi/object.c and test_capi/test_object.py to test
Py_GetConstant() and Py_GetConstantBorrowed() functions.
* Ensure importlib.metadata tests do not leak references in sys.modules.
* Move importlib.metadata tests to their own package for easier syncing with importlib_metadata.
* Update owners and makefile for new directories.
* Add blurb
Before this change, ctypes classes used a custom dict subclass, `StgDict`,
as their `tp_dict`. This acts like a regular dict but also includes extra information
about the type.
This replaces stgdict by `StgInfo`, a C struct on the type, accessed by
`PyObject_GetTypeData()` (PEP-697).
All usage of `StgDict` (mainly variables named `stgdict`, `dict`, `edict` etc.) is
converted to `StgInfo` (named `stginfo`, `info`, `einfo`, etc.).
Where the dict is actually used for class attributes (as a regular PyDict), it's now
called `attrdict`.
This change -- not overriding `tp_dict` -- is made to make me comfortable with
the next part of this PR: moving the initialization logic from `tp_new` to `tp_init`.
The `StgInfo` is set up in `__init__` of each class, with a guard that prevents
calling `__init__` more than once. Note that abstract classes (like `Array` or
`Structure`) are created using `PyType_FromMetaclass` and do not have
`__init__` called.
Previously, this was done in `__new__`, which also wasn't called for abstract
classes.
Since `__init__` can be called from Python code or skipped, there is a tested
guard to ensure `StgInfo` is initialized exactly once before it's used.
Co-authored-by: neonene <53406459+neonene@users.noreply.github.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
Starting in Python 3.12, we prevented calling fork() and starting new threads
during interpreter finalization (shutdown). This has led to a number of
regressions and flaky tests. We should not prevent starting new threads
(or `fork()`) until all non-daemon threads exit and finalization starts in
earnest.
This changes the checks to use `_PyInterpreterState_GetFinalizing(interp)`,
which is set immediately before terminating non-daemon threads.
If you catch DuplicateOptionError / DuplicateSectionError when reading a
config file (the intention is to skip invalid config files) and then
attempt to use the ConfigParser instance, any values it *had* read
successfully so far, were stored as a list instead of string! Later
`get` calls would raise "AttributeError: 'list' object has no attribute
'find'" from somewhere deep in the interpolation code.
These give applications the option of more forcefully terminating client
connections for asyncio servers. Useful when terminating a service and
there is limited time to wait for clients to finish up their work.
This is a do-over with a test fix for gh-114432, which was reverted.
This includes adding what should be a relatively temporary
`Modules/_decimal/windows/mpdecimal.h` shim to choose between `mpdecimal32vc.h`
or `mpdecimal64vc.h` based on which of `CONFIG_64` or `CONFIG_32` is defined.
* bpo-27578: Fix inspect.getsource() on empty file
For modules from empty files, `inspect.getsource()` now
returns an empty string, and `inspect.getsourcelines()` returns
a list of one empty string, fixing the expected invariant.
As indicated by `exec('')`, empty strings are valid Python
source code.
Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`
and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following execution
involving threads A, B, and C:
1. A starts.
2. B joins A, blocking on its `_tstate_lock`.
3. C joins A, blocking on its `_tstate_lock`.
4. A finishes and releases its `_tstate_lock`.
5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped
out before calling `_stop()`.
6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped
out before releasing it.
7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.
However, C holds it, so the assertion fails.
The race can be reproduced[^3] by inserting sleeps at the appropriate points in
the threading code. To do so, run the `repro_join_race.py` from the linked repo.
There are two main parts to this PR:
1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.
The event is set by the runtime prior to the thread being cleared (in the same
place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the
event to be set.
2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all
non-daemon threads to exit. To do so, an `is_daemon` predicate was added to
`PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`
now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on
`_tstate_lock`s.
[^1]: 441affc9e7/Lib/threading.py (L1201)
[^2]: 441affc9e7/Lib/threading.py (L1115)
[^3]: 8194653279
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Change automatically generated tkinter.Checkbutton widget names to
avoid collisions with automatically generated tkinter.ttk.Checkbutton
widget names within the same parent widget.
* Restore support of None and other false values.
* Raise TypeError for non-zero integers and non-empty sequences.
The regressions were introduced in gh-74668
(bdba8ef42b).
Any capitalization of "xn--" should be acceptable for the ACE prefix
(see https://tools.ietf.org/html/rfc3490#section-5).
Co-authored-by: Pepijn de Vos <pepijndevos@gmail.com>
Co-authored-by: Erlend E. Aasland <erlend@python.org>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Use the NtQueryInformationProcess system call to efficiently retrieve the parent process ID in a single step, rather than using the process snapshots API which retrieves large amounts of unnecessary information and is more prone to failure (since it makes heap allocations).
Includes a fallback to the original win32_getppid implementation in case the unstable API appears to return strange results.
On Windows, time.monotonic() now uses the QueryPerformanceCounter()
clock to have a resolution better than 1 us, instead of the
gGetTickCount64() clock which has a resolution of 15.6 ms.
* GH-116554: Relax list.sort()'s notion of "descending" run
Rewrote `count_run()` so that sub-runs of equal elements no longer end a descending run. Both ascending and descending runs can have arbitrarily many sub-runs of arbitrarily many equal elements now. This is tricky, because we only use ``<`` comparisons, so checking for equality doesn't come "for free". Surprisingly, it turned out there's a very cheap (one comparison) way to determine whether an ascending run consisted of all-equal elements. That sealed the deal.
In addition, after a descending run is reversed in-place, we now go on to see whether it can be extended by an ascending run that just happens to be adjacent. This succeeds in finding at least one additional element to append about half the time, and so appears to more than repay its cost (the savings come from getting to skip a binary search, when a short run is artificially forced to length MIINRUN later, for each new element `count_run()` can add to the initial run).
While these have been in the back of my mind for years, a question on StackOverflow pushed it to action:
https://stackoverflow.com/questions/78108792/
They were wondering why it took about 4x longer to sort a list like:
[999_999, 999_999, ..., 2, 2, 1, 1, 0, 0]
than "similar" lists. Of course that runs very much faster after this patch.
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
gh-116307: Create a new import helper 'isolated modules' and use that instead of 'Clean Import' to ensure that tests from importlib_resources don't leave modules in sys.modules.
* and fix global flag repr
* Update Misc/NEWS.d/next/Library/2024-03-11-12-11-10.gh-issue-116600.FcNBy_.rst
Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
These give applications the option of more forcefully terminating client
connections for asyncio servers. Useful when terminating a service and
there is limited time to wait for clients to finish up their work.
In free-threaded builds, running with `PYTHON_GIL=0` will now disable the
GIL. Follow-up issues track work to re-enable the GIL when loading an
incompatible extension, and to disable the GIL by default.
In order to support re-enabling the GIL at runtime, all GIL-related data
structures are initialized as usual, and disabling the GIL simply sets a flag
that causes `take_gil()` and `drop_gil()` to return early.
* set default return value of functional types as _mock_return_value
* added test of wrapping child attributes
* added backward compatibility with explicit return
* added docs on the order of precedence
* added test to check default return_value
This follows in the footsteps of #21586
This speeds up import uuid by about 6ms on Linux.
Before:
```
λ hyperfine -w 4 "./python -c 'import uuid'"
Benchmark 1: ./python -c 'import uuid'
Time (mean ± σ): 20.4 ms ± 0.4 ms [User: 16.7 ms, System: 3.8 ms]
Range (min … max): 19.6 ms … 21.8 ms 136 runs
```
After:
```
λ hyperfine -w 4 "./python -c 'import uuid'"
Benchmark 1: ./python -c 'import uuid'
Time (mean ± σ): 14.5 ms ± 0.3 ms [User: 11.5 ms, System: 3.2 ms]
Range (min … max): 13.9 ms … 16.0 ms 175 runs
```
This adds `VERIFY_X509_STRICT` to make the default
SSL context perform stricter (per RFC 5280) validation, as well
as `VERIFY_X509_PARTIAL_CHAIN` to enforce more standards-compliant
path-building behavior.
As part of this changeset, I had to tweak `make_ssl_certs.py`
slightly to emit 5280-conforming CA certs. This changeset includes
the regenerated certificates after that change.
Signed-off-by: William Woodruff <william@yossarian.net>
Co-authored-by: Victor Stinner <vstinner@python.org>
* Increment PyExpat_CAPI_MAGIC due to SetReparseDeferralEnabled addition.
This is a followup to git commit
6a95676bb5 from Github PR #115623.
* RESTify news API list.
Improve algorithm for computing which rolled-over log files to delete
in logging.TimedRotatingFileHandler. It is now reliable for handlers
without namer and with arbitrary deterministic namer that leaves
the datetime part in the file name unmodified.
If awailable, enable -fstrict-overflow for libmpdec. Also
shut off false positive warnings (-Warray-bounds).
The later was backported from mpdecimal-4.0.0.
This makes the asyncio REPL (`python -m asyncio`) more usable
and similar to the regular REPL.
This exposes register_readline() as a top-level function in site.py,
but it's intentionally undocumented.
Co-authored-by: Carol Willing <carolcode@willingconsulting.com>
Co-authored-by: Itamar Oren <itamarost@gmail.com>
* Do not overwrite already rolled over files. It happened at midnight or
during the DST change and caused the loss of data.
* computeRollover() now always return the timestamp larger than the
specified time.
* Fix computation of the rollover time during the DST change.
Support callables with the __call__() method and types with
__new__() and __init__() methods set to class methods, static
methods, bound methods, partial functions, and other types of
methods and descriptors.
Add tests for numerous types of callables and descriptors.
Allow controlling Expat >=2.6.0 reparse deferral (CVE-2023-52425) by adding five new methods:
- `xml.etree.ElementTree.XMLParser.flush`
- `xml.etree.ElementTree.XMLPullParser.flush`
- `xml.parsers.expat.xmlparser.GetReparseDeferralEnabled`
- `xml.parsers.expat.xmlparser.SetReparseDeferralEnabled`
- `xml.sax.expatreader.ExpatParser.flush`
Based on the "flush" idea from https://github.com/python/cpython/pull/115138#issuecomment-1932444270 .
### Notes
- Please treat as a security fix related to CVE-2023-52425.
Includes code suggested-by: Snild Dolkow <snild@sony.com>
and by core dev Serhiy Storchaka.
This change is part of the work on PEP-738: Adding Android as a
supported platform.
* Remove the "1.0" suffix from libpython's filename on Android, which
would prevent Gradle from packaging it into an app.
* Simplify the build command in the Makefile so that libpython always
gets given an SONAME with the `-Wl-h` argument, even if the SONAME is
identical to the actual filename.
* Disable a number of functions on Android which can be compiled and
linked against, but always fail at runtime. As a result, the native
_multiprocessing module is no longer built for Android.
* gh-115390 (bee7bb331) added some pre-determined results to the
configure script for things that can't be autodetected when
cross-compiling; this change adds Android to these where appropriate.
* Add a couple more pre-determined results for Android, and making them
cover iOS as well. This means the --enable-ipv6 configure option will
no longer be required on either platform.
Use of a proxy is intended to defer DNS for the hosts to the proxy itself, rather than a potential for information leak of the host doing DNS resolution itself for any reason. Proxy bypass lists are strictly name based. Most implementations of proxy support agree.
Nothing else in Python generally logs the contents of variables, so this
can be very unexpected for developers and could leak sensitive
information in to terminals and log files.
In some cases we might cause a StreamWriter to stay alive even when the
application has dropped all references to it. This prevents us from
doing automatical cleanup, and complaining that the StreamWriter wasn't
properly closed.
Fortunately, the extra reference was never actually used for anything so
we can just drop it.
Instead of showing a dot for each iteration, show:
- '.' for zero (on negative) leaks
- number of leaks for 1-9
- 'X' if there are more leaks
This allows more rapid iteration: when bisecting, I don't need
to wait for the final report to see if the test still leaks.
Also, show the full result if there are any non-zero entries.
This shows negative entries, for the unfortunate cases where
a reference is created and cleaned up in different runs.
Test *failure* is still determined by the existing heuristic.
Setting the __class__ attribute of a lazy-loading module to ModuleType enables other threads to attempt to access attributes before the loading is complete. Now that is protected by a lock.
The platform standard on macOS is to show a proxy icon for open
files in the titlebar of Windows. Make sure IDLE matches this
behaviour.
Don't use both the long and short names in the window title.
The behaviour of other editors (such as Text Editor) is to show
only the short name with the proxy icon.
Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
Part of the work on PEP 738: Adding Android as a supported platform.
* Rename the LIBPYTHON variable to MODULE_LDFLAGS, to more accurately
reflect its purpose.
* Edit makesetup to use MODULE_LDFLAGS when linking extension modules.
* Edit the Makefile so that extension modules depend on libpython on
Android and Cygwin.
* Restore `-fPIC` on Android. It was removed several years ago with a
note that the toolchain used it automatically, but this is no longer
the case. Omitting it causes all linker commands to fail with an error
like `relocation R_AARCH64_ADR_PREL_PG_HI21 cannot be used against
symbol '_Py_FalseStruct'; recompile with -fPIC`.
Reproducer depends on terminal size - the traceback occurs when there's
an option long enough so the usage line doesn't fit the terminal width.
Option order is also important for reproducibility.
Excluding empty groups (with all options suppressed) from inserts
fixes the problem.
This builds on https://github.com/python/cpython/pull/106807, which adds
a return code to ResourceTracker, to make future debugging easier.
Testing this “in situ” proved difficult, since the global ResourceTracker is
involved in test infrastructure. So, the tests here create a new instance and
feed it fake data.
---------
Co-authored-by: Yonatan Bitton <yonatan.bitton@perception-point.io>
Co-authored-by: Yonatan Bitton <bityob@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Restore support of such combination, disabled in gh-113796.
csv.writer() now quotes empty fields if delimiter is a space and
skipinitialspace is true and raises exception if quoting is not possible.
This change adds an `eval_breaker` field to `PyThreadState`. The primary
motivation is for performance in free-threaded builds: with thread-local eval
breakers, we can stop a specific thread (e.g., for an async exception) without
interrupting other threads.
The source of truth for the global instrumentation version is stored in the
`instrumentation_version` field in PyInterpreterState. Threads usually read the
version from their local `eval_breaker`, where it continues to be colocated
with the eval breaker bits.
Keep the old private _PyCFunctionFastWithKeywords name (Python 3.7)
as an alias to the new public name PyCFunctionFastWithKeywords
(Python 3.13a4).
_PyCFunctionWithKeywords doesn't exist in Python 3.13a3, whereas
_PyCFunctionFastWithKeywords was removed in Python 3.13a4.
The charset name "Windows-31J" is registered in the IANA Charset Registry[1]
and is implemented in Python as the cp932 codec.
[1] https://www.iana.org/assignments/charset-reg/windows-31J
Signed-off-by: Masayuki Moriyama <masayuki.moriyama@miraclelinux.com>
* test.bisect_cmd now exit with code 0 on success, and code 1 on
failure. Before, it was the opposite.
* test.bisect_cmd now runs the test worker process with
-X faulthandler.
* regrtest RunTests: Add create_python_cmd() and bisect_cmd()
methods.
Fix the exceptions raised by posixpath.commonpath
Raise ValueError, not IndexError when passed an empty iterable. Raise
TypeError, not ValueError when passed None.
lseek() always returns 0 for character pseudo-devices like
`/dev/urandom` (for other non-regular files, e.g. `/dev/stdin`, it
always returns -1, to which CPython reacts by raising appropriate
exceptions). They are thus technically seekable despite not having seek
semantics.
When calling read() on e.g. an instance of `io.BufferedReader` that
wraps such a file, `BufferedReader` reads ahead, filling its buffer,
creating a discrepancy between the number of bytes read and the internal
`tell()` always returning 0, which previously resulted in e.g.
`BufferedReader.tell()` or `BufferedReader.seek()` being able to return
positions < 0 even though these are supposed to be always >= 0.
Invariably keep the return value non-negative by returning
max(former_return_value, 0) instead, and add some corresponding tests.
This change essentially replaces usage of `%1` with `%~1`, which removes
quotes, if any. Without this change, the if statements fail due to
the quotes mangling the syntax.
Additionally, this change works around comma being treated as a parameter
delimiter in test.bat by escaping commas at time of parsing. Tested
combinations of rt and regrtest arguments, all seems to work as before
but now you can specify commas in arguments like "-uall,extralargefile".
* gh-114572: Fix locking in cert_store_stats and get_ca_certs
cert_store_stats and get_ca_certs query the SSLContext's X509_STORE with
X509_STORE_get0_objects, but reading the result requires a lock. See
https://github.com/openssl/openssl/pull/23224 for details.
Instead, use X509_STORE_get1_objects, newly added in that PR.
X509_STORE_get1_objects does not exist in current OpenSSLs, but we can
polyfill it with X509_STORE_lock and X509_STORE_unlock.
* Work around const-correctness problem
* Add missing X509_STORE_get1_objects failure check
* Add blurb
* bpo-38364: unwrap partialmethods just like we unwrap partials
The inspect.isgeneratorfunction, inspect.iscoroutinefunction and inspect.isasyncgenfunction already unwrap functools.partial objects, this patch adds support for partialmethod objects as well.
Also: Rename _partialmethod to __partialmethod__.
Since we're checking this attribute on arbitrary function-like objects,
we should use the namespace reserved for core Python.
---------
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Use critical sections to make deque methods that operate on mutable
state thread-safe when the GIL is disabled. This is mostly accomplished
by using the @critical_section Argument Clinic directive, though there
are a few places where this was not possible and critical sections had
to be manually acquired/released.
Add PythonFinalizationError exception. This exception derived from
RuntimeError is raised when an operation is blocked during the Python
finalization.
The following functions now raise PythonFinalizationError, instead of
RuntimeError:
* _thread.start_new_thread()
* subprocess.Popen
* os.fork()
* os.fork1()
* os.forkpty()
Morever, _winapi.Overlapped finalizer now logs an unraisable
PythonFinalizationError, instead of an unraisable RuntimeError.
We add _winapi.BatchedWaitForMultipleObjects to wait for larger numbers of handles.
This is an internal module, hence undocumented, and should be used with caution.
Check the docstring for info before using BatchedWaitForMultipleObjects.
Part of the PEP 730 work to add iOS support.
This change lays the groundwork for introducing iOS/tvOS/watchOS
frameworks; it includes the structural refactoring needed so that iOS
branches can be added into in a subsequent PR.
Summary of changes:
* Updates config.sub to the 2024-01-01 release. This is the "as
released" version of config.sub.
* Adds a RESSRCDIR variable to allow sharing of macOS and iOS Makefile
steps.
* Adds an INSTALLTARGETS variable so platforms can customise which
targets are actually installed. This will be used to exclude certain
targets (e.g., binaries, manfiles) from iOS framework installs.
* Adds a PYTHONFRAMEWORKINSTALLNAMEPREFIX variable; this is used as
the install name for the library. This is needed to allow for iOS
frameworks to specify an @rpath-based install name.
* Evaluates MACHDEP earlier in the configure process so that
ac_sys_system is available.
* Modifies _PYTHON_HOST_PLATFORM evaluation for cross-platform builds
so that the CPU architecture is differentiated from the host
identifier. This will be used to generate a _PYTHON_HOST_PLATFORM
definition that includes ABI information, not just CPU architecture.
* Differentiates between SOABI_PLATFORM and PLATFORM_TRIPLET.
SOABI_PLATFORM is used in binary module names, and includes the ABI,
but not the OS or CPU architecture (e.g.,
math.cpython-313-iphonesimulator.dylib). PLATFORM_TRIPLET is used
as the sys._multiarch value, and on iOS will contains the ABI and
architecture (e.g., iphoneos-arm64). This differentiation hasn't
historically been needed because while macOS is a multiarch platform,
it uses a bare darwin as PLATFORM_TRIPLE.
* Removes the use of the deprecated -Wl,-single_module flag when
compiling macOS frameworks.
* Some whitespace normalisation where there was a mix of spaces and tabs
in a single block.
When replace() method is called on a subclass of datetime, date or time,
properly call derived constructor. Previously, only the base class's
constructor was called.
Also, make sure to pass non-zero fold values when creating subclasses in
various methods. Previously, fold was silently ignored.
Immediate merits:
* eliminate complex workarounds for 'z' format support
(NOTE: mpdecimal recently added 'z' support, so this becomes
efficient in the long term.)
* fix 'z' format memory leak
* fix 'z' format applied to 'F'
* fix missing '#' format support
Suggested and prototyped by Stefan Krah.
Fixes gh-114563, gh-91060
Co-authored-by: Stefan Krah <skrah@bytereef.org>
* Class methods no longer have "method of builtins.type instance" note.
* Corresponding notes are now added for class and unbound methods.
* Method and function aliases now have references to the module or the
class where the origin was defined if it differs from the current.
* Bound methods are now listed in the static methods section.
* Methods of builtin classes are now supported as well as methods of
Python classes.
Now the special comparison methods like `__eq__` and `__lt__` return
NotImplemented if one of comparands is date and other is datetime
instead of ignoring the time part and the time zone or forcefully
return "not equal" or raise TypeError.
It makes comparison of date and datetime subclasses more symmetric
and allows to change the default behavior by overriding
the special comparison methods in subclasses.
It is now the same as if date and datetime was independent classes.
Setters for members with an unsigned integer type now support
the same range of valid values for objects that has a __index__()
method as for int.
Previously, Py_T_UINT, Py_T_ULONG and Py_T_ULLONG did not support
objects that has a __index__() method larger than LONG_MAX.
Py_T_ULLONG did not support negative ints. Now it supports them and
emits a RuntimeWarning.
By default, it preserves an inconsistent behavior of older Python
versions: packs the count into a 1-tuple if only one or none
options are specified (including 'update'), returns None instead of 0.
Except that setting wantobjects to 0 no longer affects the result.
Add a new parameter return_ints: specifying return_ints=True makes
Text.count() always returning the single count as an integer
instead of a 1-tuple or None.
Avoid race conditions in the creation of directories during concurrent
extraction in tarfile and zipfile.
Co-authored-by: Samantha Hughes <shughes-uk@users.noreply.github.com>
Co-authored-by: Peder Bergebakken Sundt <pbsds@hotmail.com>
When expanding and filtering paths for a `**` wildcard segment, build an `re.Pattern` object from the subsequent pattern parts, rather than the entire pattern, and match against the `os.DirEntry` object prior to instantiating a path object. Also skip compiling a pattern when expanding a `*` wildcard segment.