* Add declaration of Tcl_AppInit(), missing in Tcl 9.0.
* Use Tcl_Size instead of int where needed.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
We make use of the same mechanism that we use for the static builtin types. This required a few tweaks.
The relevant code could use some cleanup but I opted to avoid the significant churn in this change. I'll tackle that separately.
This change is the final piece needed to make _datetime support multiple interpreters. I've updated the module slot accordingly.
Remove the delegation of `int` to the `__trunc__` special method: `int` will now only delegate to `__int__` and `__index__` (in that order). `__trunc__` continues to exist, but its sole purpose is to support `math.trunc`.
---------
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Implement `shutil._rmtree_safe_fd()` using a list as a stack to avoid emitting recursion errors on deeply nested trees.
`shutil._rmtree_unsafe()` was fixed in a150679f90.
Support non-dict globals in LOAD_FROM_DICT_OR_GLOBALS
The implementation basically copies LOAD_GLOBAL. Possibly it could be deduplicated,
but that seems like it may get hairy since the two operations have different operands.
This is important to fix in 3.14 for PEP 649, but it's a bug in earlier versions too,
and we should backport to 3.13 and 3.12 if possible.
Release the GIL before calling `_Py_qsbr_unregister`.
The deadlock could occur when the GIL was enabled at runtime. The
`_Py_qsbr_unregister` call might block while holding the GIL because the
thread state was not active, but the GIL was still held.
Make sure that `gilstate_counter` is not zero in when calling
`PyThreadState_Clear()`. A destructor called from `PyThreadState_Clear()` may
call back into `PyGILState_Ensure()` and `PyGILState_Release()`. If
`gilstate_counter` is zero, it will try to create a new thread state before
the current active thread state is destroyed, leading to an assertion failure
or crash.
When using the ** operator or pow() with Fraction as the base
and an exponent that is not rational, a float, or a complex, the
fraction is no longer converted to a float.
Some of standard Tcl types were renamed, removed, or no longer
registered in Tcl 8.7/9.0. This change fixes automatic conversion of Tcl
values to Python values to avoid returning a Tcl_Obj where the primary
Python types (int, bool, str, bytes) were returned in older Tcl.
* Passing a string as the "real" keyword argument is now an error;
it should only be passed as a single positional argument.
* Passing a complex number as the "real" or "imag" argument is now deprecated;
it should only be passed as a single positional argument.
Make `shutil._rmtree_unsafe()` call `os.walk()`, which is implemented
without recursion.
`shutil._rmtree_safe_fd()` is not affected and can still raise a recursion
error.
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
The deadlock only affected the free-threaded build and only occurred
when the GIL was enabled at runtime. The `Py_DECREF(old_name)` call
might temporarily release the GIL while holding the type seqlock.
Another thread may spin trying to acquire the seqlock while holding the
GIL.
The deadlock occurred roughly 1 in ~1,000 runs of `pool_in_threads.py`
from `test_multiprocessing_pool_circular_import`.
If one calls pow(fractions.Fraction, x, module) with modulo not None, the error message now says that the types are incompatible rather than saying pow only takes 2 arguments. Implemented by having fractions.Fraction __pow__ accept optional modulo argument and return NotImplemented if not None. pow() then raises with appropriate message.
---------
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
* gh-118673: Remove shebang and executable bits from stdlib modules.
* Removed shebangs and exe bits on turtledemo scripts.
The setting was inappropriate for '__main__' and inconsistent across the other modules. The scripts can still be executed directly by invoking with the desired interpreter.
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
Structure layout, and especially bitfields, sometimes resulted in clearly
wrong behaviour like overlapping fields. This fixes
Co-authored-by: Gregory P. Smith <gps@python.org>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
* gh-119118: Fix performance regression in tokenize module
- Cache line object to avoid creating a Unicode object
for all of the tokens in the same line.
- Speed up byte offset to column offset conversion by using the
smallest buffer possible to measure the difference.
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
pathlib now treats "`.`" as a valid file extension (suffix). This brings
it in line with `os.path.splitext()`.
In the (private) pathlib ABCs, we add a new `ParserBase.splitext()` method
that splits a path into a `(root, ext)` pair, like `os.path.splitext()`.
This method is called by `PurePathBase.stem`, `suffix`, etc. In a future
version of pathlib, we might make these base classes public, and so users
will be able to define their own `splitext()` method to control file
extension splitting.
In `pathlib.PurePath` we add optimised `stem`, `suffix` and `suffixes`
properties that don't use `splitext()`, which avoids computing the path
base name twice.
The assertion was added in gh-118532 but was based on the invalid assumption that PyState_FindModule() would only be called with an already-initialized module def. I've added a test to make sure we don't make that assumption again.
``_fancy_replace()`` is no longer recursive. and a single call does a worst-case linear number of ratio() computations instead of quadratic. This renders toothless a universe of pathological cases. Some inputs may produce different output, but that's rare, and I didn't find a case where the final diff appeared to be of materially worse quality. To the contrary, by refusing to even consider synching on lines "far apart", there was more easy-to-digest locality in the output.
Add socket.VMADDR_CID_LOCAL constant.
Fix ThreadedVSOCKSocketStreamTest: if get_cid() returns the host
address or the "any" address, use the local communication address
(loopback): VMADDR_CID_LOCAL.
On Linux 6.9, apparently, the /dev/vsock device is now available but
get_cid() returns VMADDR_CID_ANY (-1).
ensurepip forks a subprocess to run pip itself, but that subprocess only inherits a -I isolated mode flag (see _run_pip() in Lib/ensurepip/__init__.py), not the "-E -s" flags that the installer has been using. This means that parts of ensurepip don't actually run in an isolated environment and can make incorrect decisions based on packages installed in the user site-packages.
Add `Py_BEGIN_CRITICAL_SECTION_SEQUENCE_FAST` and
`Py_END_CRITICAL_SECTION_SEQUENCE_FAST` macros and update `str.join` to use
them. Also add a regression test that would crash reliably without this
patch.
_PyArg_Parser holds static global data generated for modules by Argument Clinic. The _PyArg_Parser.kwtuple field is a tuple object, even though it's stored within a static global. In some cases the tuple is statically allocated and thus it's okay that it gets shared by multiple interpreters. However, in other cases the tuple is set lazily, allocated from the heap using the active interprepreter at the point the tuple is needed.
This is a problem once that interpreter is destroyed since _PyArg_Parser.kwtuple becomes at dangling pointer, leading to crashes. It isn't a problem if the tuple is allocated under the main interpreter, since its lifetime is bound to the lifetime of the runtime. The solution here is to temporarily switch to the main interpreter. The alternative would be to always statically allocate the tuple.
This change also fixes a bug where only the most recent parser was added to the global linked list.
Fix regression introduced in gh-100884: AttributeError when re-fold a long
address list.
Also fix more cases of incorrect encoding of the address separator in the
address list missed in gh-100884.
* bpo-15987: Implement ast.compare
Add a compare() function that compares two ASTs for structural equality. There are two set of attributes on AST node objects, fields and attributes. The fields are always compared, since they represent the actual structure of the code. The attributes can be optionally be included in the comparison. Attributes capture things like line numbers of column offsets, so comparing them involves test whether the layout of the program text is the same. Since whitespace seems inessential for comparing ASTs, the default is to compare fields but not attributes.
ASTs are just Python objects that can be modified in arbitrary ways. The API for ASTs is under-specified in the presence of user modifications to objects. The comparison respects modifications to fields and attributes, and to _fields and _attributes attributes. A user could create obviously malformed objects, and the code will probably fail with an AttributeError when that happens. (For example, adding "spam" to _fields but not adding a "spam" attribute to the object.)
Co-authored-by: Jeremy Hylton <jeremy@alum.mit.edu>
The PEP 649 implementation will require a way to load NotImplementedError
from the bytecode. @markshannon suggested implementing this by converting
LOAD_ASSERTION_ERROR into a more general mechanism for loading constants.
This PR adds this new opcode. I will work on the rest of the implementation
of the PEP separately.
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
regrtest test runner: Add XML support to the refleak checker
(-R option).
* run_unittest() now stores XML elements as string, rather than
objects, in support.junit_xml_list.
* runtest_refleak() now saves/restores XML strings before/after
checking for reference leaks. Save XML into a temporary file.
* Fix for email.generator.Generator with whitespace between encoded words.
email.generator.Generator currently does not handle whitespace between
encoded words correctly when the encoded words span multiple lines. The
current generator will create an encoded word for each line. If the end
of the line happens to correspond with the end real word in the
plaintext, the generator will place an unencoded space at the start of
the subsequent lines to represent the whitespace between the plaintext
words.
A compliant decoder will strip all the whitespace from between two
encoded words which leads to missing spaces in the round-tripped
output.
The fix for this is to make sure that whitespace between two encoded
words ends up inside of one or the other of the encoded words. This
fix places the space inside of the second encoded word.
A second problem happens with continuation lines. A continuation line that
starts with whitespace and is followed by a non-encoded word is fine because
the newline between such continuation lines is defined as condensing to
a single space character. When the continuation line starts with whitespace
followed by an encoded word, however, the RFCs specify that the word is run
together with the encoded word on the previous line. This is because normal
words are filded on syntactic breaks by encoded words are not.
The solution to this is to add the whitespace to the start of the encoded word
on the continuation line.
Test cases are from #92081
* Rename a variable so it's not confused with the final variable.
Code from https://github.com/pulkin, in PR
https://github.com/python/cpython/pull/119131
Greatly speeds `Differ` when there are many identically scoring pairs, by splitting the recursion near the inputs' midpoints instead of degenerating (as now) into just peeling off the first two lines.
Co-authored-by: Tim Peters <tim.peters@gmail.com>
Asymptotically faster (O(n log n)) str->int for very large strings, leveraging the faster multiplication scheme in the C-coded `_decimal` when available. This is used instead of the current Karatsuba-limited method starting at 2 million digits.
Lots of opportunity remains for fine-tuning. Good targets include changing BYTELIM, and possibly changing the internal output base (from 256 to a higher number of bytes).
Doing this was substantial work, and many of the new lines are actually comments giving correctness proofs. The obvious approaches sticking to integers were too slow to be useful, so this is doing variable-precision decimal floating-point arithmetic. Much faster, but worst-possible rounding errors have to be wholly accounted for, using as little precision as possible.
Special thanks to Serhiy Storchaka for asking many good questions in his code reviews!
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: sstandre <43125375+sstandre@users.noreply.github.com>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
Co-authored-by: Nice Zombies <nineteendo19d0@gmail.com>
Suppress all `OSError` exceptions from `pathlib.Path.exists()` and `is_*()`
rather than a selection of more common errors as we do presently. Also
adjust the implementations to call `os.path.exists()` etc, which are much
faster on Windows thanks to GH-101196.
Follow-up of gh-101693. The previous DeprecationWarning is replaced with
raising sqlite3.ProgrammingError.
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Remove support for supplying additional positional arguments to
`PurePath.relative_to()` and `is_relative_to()`. This has been deprecated
since Python 3.12.
_PyWeakref_ClearRef was previously exposed in the public C-API, although
it begins with an underscore and is not documented. It's used by a few
C-API extensions. There is currently no alternative public API that can
replace its use.
_PyWeakref_ClearWeakRefsExceptCallbacks is the only thread-safe way to
use _PyWeakref_ClearRef in the free-threaded build. This exposes the C
symbol, but does not make the API public.
Some embedders and extensions include parts of the internal API. The
pycore_mimalloc.h file is transitively include by a number of other
internal headers. This avoids include errors for code that was
already including those headers.
The `list_preallocate_exact` function did not zero initialize array
contents. In the free-threaded build, this could expose uninitialized
memory to concurrent readers between the call to
`list_preallocate_exact` and the filling of the array contents with
items.
Callbacks registered in the tkinter module now take arguments as
various Python objects (int, float, bytes, tuple), not just str.
To restore the previous behavior set tkinter module global wantobject to 1
before creating the Tk object or call the wantobject() method of the Tk object
with argument 1.
Calling it with argument 2 restores the current default behavior.
Fix an edge case in `binascii.a2b_base64` strict mode, where
excessive padding was not detected when no padding is necessary.
Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
Rationale
=========
argparse performs a complex formatting of the usage for argument grouping
and for line wrapping to fit the terminal width. This formatting has been
a constant source of bugs for at least 10 years (see linked issues below)
where defensive assertion errors are triggered or brackets and paranthesis
are not properly handeled.
Problem
=======
The current implementation of argparse usage formatting relies on regular
expressions to group arguments usage only to separate them again later
with another set of regular expressions. This is a complex and error prone
approach that caused all the issues linked below. Special casing certain
argument formats has not solved the problem. The following are some of
the most common issues:
- empty `metavar`
- mutually exclusive groups with `SUPPRESS`ed arguments
- metavars with whitespace
- metavars with brackets or paranthesis
Solution
========
The following two comments summarize the solution:
- https://github.com/python/cpython/issues/82091#issuecomment-1093832187
- https://github.com/python/cpython/issues/77048#issuecomment-1093776995
Mainly, the solution is to rewrite the usage formatting to avoid the
group-then-separate approach. Instead, the usage parts are kept separate
and only joined together at the end. This allows for a much simpler
implementation that is easier to understand and maintain. It avoids the
regular expressions approach and fixes the corresponding issues.
This closes the following GitHub issues:
- #62090
- #62549
- #77048
- #82091
- #89743
- #96310
- #98666
These PRs become obsolete:
- #15372
- #96311
Add missing import to code that handles too large files and offsets.
Use list, not tuple, for a mutable sequence.
Add tests to prevent similar mistakes.
---------
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
This change makes sure all extension/builtin modules have their init function run first by the main interpreter before proceeding with import in the original interpreter (main or otherwise). This means when the import of a single-phase init module fails in an isolated subinterpreter, it won't tie any global state/callbacks to the subinterpreter.
Add the ability to enable/disable the GIL at runtime, and use that in
the C module loading code.
We can't know before running a module init function if it supports
free-threading, so the GIL is temporarily enabled before doing so. If
the module declares support for running without the GIL, the GIL is
later disabled. Otherwise, the GIL is permanently enabled, and will
never be disabled again for the life of the current interpreter.
Now, such classes will no longer require changes in Python 3.13 in the normal case.
The test suite for robotframework passes with no DeprecationWarnings under this PR.
I also added a new DeprecationWarning for the case where `_field_types` exists
but is incomplete, since that seems likely to indicate a user mistake.
Add _PyType_LookupRef and use incref before setting attribute on type
Makes setting an attribute on a class and signaling type modified atomic
Avoid adding re-entrancy exposing the type cache in an inconsistent state by decrefing after type is updated
* Add PhotoImage.read() to read an image from a file.
* Add PhotoImage.data() to get the image data.
* Add background and grayscale parameters to PhotoImage.write().
* Add the PhotoImage method copy_replace() to copy a region
from one image to other image, possibly with pixel zooming and/or
subsampling.
* Add from_coords parameter to PhotoImage methods copy(), zoom() and subsample().
* Add zoom and subsample parameters to PhotoImage method copy().
The designated initializer syntax in static inline functions in pycore_backoff.h
causes problems for C++ or MSVC users who aren't yet using C++20.
While internal, pycore_backoff.h is included (indirectly, via pycore_code.h)
by some key 3rd party software that does so for speed.
This is *not* sufficient for the final 3.13 release, but it will do for beta 1:
- What's new entry
- Updated changelog entry (news blurb)
- Mention the proxy for f_globals in the datamodel and Python frame object docs
This doesn't have any C API details (what's new refers to the PEP).
For converting large ints to strings, CPython invokes a function in _pylong.py,
which uses the decimal module to implement an asymptotically waaaaay
sub-quadratic algorithm. But if the C decimal module isn't available, CPython
uses _pydecimal.py instead. Which in turn frequently does str(int). If the int
is very large, _pylong ends up doing the work, which in turn asks decimal to do
"big" arithmetic, which in turn calls str(big_int), which in turn ... it can
become infinite mutual recursion.
This change introduces a different int->str function that doesn't use decimal.
It's asymptotically worse, "Karatsuba time" instead of quadratic time, so
still a huge improvement. _pylong switches to that when the C decimal isn't
available. It is also used for not too large integers (less than 450_000 bits),
where it is faster (up to 2 times for 30_000 bits) than the asymptotically
better implementation that uses the C decimal.
Co-authored-by: Tim Peters <tim.peters@gmail.com>
* Initial stab.
* Test the tentative fix. Hangs "forever" without this change.
* Move the new test to a better spot.
* New comment to explain why _convert_to_str allows any poewr of 10.
* Fixed a comment, and fleshed out an existing test that appeared unfinished.
* Added temporary asserts. Or maybe permanent ;-)
* Update Lib/_pydecimal.py
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
* Remove the new _convert_to_str().
Serhiy and I independently concluded that exact powers of 10
aren't possible in these contexts, so just checking the
string length is sufficient.
* At least for now, add the asserts to the other block too.
* 📜🤖 Added by blurb_it.
---------
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
This PR adds the ability to enable the GIL if it was disabled at
interpreter startup, and modifies the multi-phase module initialization
path to enable the GIL when loading a module, unless that module's spec
includes a slot indicating it can run safely without the GIL.
PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went
with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148.
A warning will be issued up to once per interpreter for the first
GIL-using module that is loaded. If `-v` is given, a shorter message
will be printed to stderr every time a GIL-using module is loaded
(including the first one that issues a warning).
The function returns `True` or `False` depending on whether the GIL is
currently enabled. In the default build, it always returns `True`
because the GIL is always enabled.
Now inspect.signature() supports references to the module globals in
parameter defaults on methods in extension modules. Previously it was
only supported in functions. The workaround was to specify the fully
qualified name, including the module name.
Add "Raw" variant of PyTime functions:
* PyTime_MonotonicRaw()
* PyTime_PerfCounterRaw()
* PyTime_TimeRaw()
Changes:
* Add documentation and tests. Tests release the GIL while calling
raw clock functions.
* py_get_system_clock() and py_get_monotonic_clock() now check that
the GIL is hold by the caller if raise_exc is non-zero.
* Reimplement "Unchecked" functions with raw clock functions.
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Account for `add_stopiteration_handler` pushing a block for `async with`.
To allow generator functions that previously almost hit the `CO_MAXBLOCKS`
limit by nesting non-async blocks, the limit is increased by 1.
This increase allows one more block in non-generator functions.
Add code and config for a minimal Android app, and instructions to build and run it.
Improve Android build instructions in general.
Add a tool subcommand to download the Gradle wrapper (with its binary blob). Android
studio must be downloaded manually (due to the license).
The code for Tier 2 is now only compiled when configured
with `--enable-experimental-jit[=yes|interpreter]`.
We drop support for `PYTHON_UOPS` and -`Xuops`,
but you can disable the interpreter or JIT
at runtime by setting `PYTHON_JIT=0`.
You can also build it without enabling it by default
using `--enable-experimental-jit=yes-off`;
enable with `PYTHON_JIT=1`.
On Windows, the `build.bat` script supports
`--experimental-jit`, `--experimental-jit-off`,
`--experimental-interpreter`.
In the C code, `_Py_JIT` is defined as before
when the JIT is enabled; the new variable
`_Py_TIER2` is defined when the JIT *or* the
interpreter is enabled. It is actually a bitmask:
1: JIT; 2: default-off; 4: interpreter.
* Allow to specify the signature of custom callable instances of extension
type by the __text_signature__ attribute.
* Specify signatures of operator.attrgetter, operator.itemgetter, and
operator.methodcaller instances.
While properties like IPv6Address.is_private account for IPv4-mapped
IPv6 addresses, such as for example:
>>> ipaddress.ip_address("192.168.0.1").is_private
True
>>> ipaddress.ip_address("::ffff:192.168.0.1").is_private
True
...the same doesn't currently apply to the is_loopback property:
>>> ipaddress.ip_address("127.0.0.1").is_loopback
True
>>> ipaddress.ip_address("::ffff:127.0.0.1").is_loopback
False
At minimum, this inconsistency between different properties is
counter-intuitive. Moreover, ::ffff:127.0.0.0/104 is for all intents and
purposes a loopback address, and should be treated as such.
sqlite3.iterdump() depends on the row factory returning resulting rows
as tuples; it will fail with custom row factories like for example a
dict factory.
With this commit, we explicitly reset the row factory of the cursor used
by iterdump(), so we always get predictable results. This does not
affect the row factory of the parent connection.
Co-authored-by: Mariusz Felisiak <felisiak.mariusz@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
* Add name and mode attributes for compressed and archived file-like objects
in modules bz2, lzma, tarfile and zipfile.
* Change the value of the mode attribute of GzipFile from integer (1 or 2)
to string ('rb' or 'wb').
* Change the value of the mode attribute of ZipExtFile from 'r' to 'rb'.
The second item in the tuple returned from `__reduce__()` is a tuple of arguments to supply to path constructor. Previously we returned the `parts` tuple here, which entailed joining, parsing and normalising the path object, and produced a compact pickle representation.
With this patch, we instead return a tuple of paths that were originally given to the path constructor. This makes pickling much faster (at the expense of compactness).
It's worth noting that, in the olden times, pathlib performed this parsing/normalization up-front in every case, and so using `parts` for pickling was almost free. Nowadays pathlib only parses/normalises paths when it's necessary or advantageous to do so (e.g. computing a path parent, or iterating over a directory, respectively).
Tarfile.addfile now throws an ValueError when the user passes
in a non-zero size tarinfo but does not provide a fileobj,
instead of writing an incomplete entry.
Only treat '\n', '\r' and '\r\n' as line separators in re-folding the email
messages. Preserve control characters '\v', '\f', '\x1c', '\x1d' and '\x1e'
and Unicode line separators '\x85', '\u2028' and '\u2029' as is.
Older libedit versions (like Apple's) use a different type signature
for rl_startup_hook and rl_pre_input_hook. Add a configure check to
determine which signature is accepted by introducing the
Py_RL_STARTUP_HOOK_TAKES_ARGS macro in pyconfig.h.
Fix mimalloc allocator for huge memory allocation (around
8,589,934,592 GiB) on s390x.
Abort allocation early in mimalloc if the number of slices doesn't
fit into uint32_t, to prevent a integer overflow (cast 64-bit
size_t to uint32_t).
Since 6258844c, paths that might not exist can be fed into pathlib's
globbing implementation, which will call `os.scandir()` / `os.lstat()` only
when strictly necessary. This allows us to drop an initial `self.is_dir()`
call, which saves a `stat()`.
Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
rfc9110 obsoletes the earlier rfc 7231. This document also includes some
status codes that were previously only used for WebDAV and assigns more
generic names to these status codes.
ref: https://www.rfc-editor.org/rfc/rfc9110.html#name-changes-from-rfc-7231
- http.HTTPStatus.CONTENT_TOO_LARGE (413, previously
REQUEST_ENTITY_TOO_LARGE)
- http.HTTPStatus.URI_TOO_LONG (414, previously REQUEST_URI_TOO_LONG)
- http.HTTPStatus.RANGE_NOT_SATISFYABLE (416, previously
REQUEST_RANGE_NOT_SATISFYABLE)
- http.HTTPStatus.UNPROCESSABLE_CONTENT (422, previously
UNPROCESSABLE_ENTITY)
The new constants are added to http.HTTPStatus and the old constant names are
preserved for backwards compatibility.
References in documentation to the obsoleted rfc 7231 are updated
Replace use of `os.listdir()` with `os.scandir()`. Forgo setting `_drv`,
`_root` and `_tail_cached`, as these usually aren't needed. Use
`os.DirEntry.path` to set `_str`.
Don't bother calling `os.scandir()` to scan for literal pattern segments,
like `foo` in `foo/*.py`. Instead, append the segment(s) as-is and call
through to the next selector with `exists=False`, which signals that the
path might not exist. Subsequent selectors will call `os.scandir()` or
`os.lstat()` to filter out missing paths as needed.
This change gives a significant speedup, as the METH_FASTCALL calling
convention is now used. The following bytes and bytearray methods are adapted:
- count()
- find()
- index()
- rfind()
- rindex()
Co-authored-by: Inada Naoki <songofacandy@gmail.com>
Detect libcrypto BLAKE2, Shake, SHA3, and Truncated-SHA512 support at hashlib build time
## BLAKE2
While OpenSSL supports both "b" and "s" variants of the BLAKE2 hash
function, other cryptographic libraries may lack support for one or both
of the variants. This commit modifies `hashlib`'s C code to detect
whether or not the linked libcrypto supports each BLAKE2 variant, and
elides references to each variant's NID accordingly. In cases where the
underlying libcrypto doesn't fully support BLAKE2, CPython's
`./configure` script can be given the following flag to use CPython's
interned BLAKE2 implementation: `--with-builtin-hashlib-hashes=blake2`.
## SHA3, Shake, & truncated SHA512.
Detect BLAKE2, SHA3, Shake, & truncated SHA512 support in the
OpenSSL-ish libcrypto library at build time. This helps allow hashlib's
`_hashopenssl` to be used with libraries that do not to support every
algorithm that upstream OpenSSL does. Such as AWS-LC & BoringSSL.
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
gh-16429 introduced support for an iterable of separators in
Stream.readuntil. Since bytes-like types are themselves iterable, this
can introduce ambiguities in deciding whether the argument is an
iterator of separators or a singleton separator. In gh-16429, only 'bytes'
was considered a singleton, but this will break code that passes other
buffer object types.
Fix it by only supporting tuples rather than arbitrary iterables.
Closes gh-117722.
Fall back to tp_call() for cases when arguments are passed by name.
Co-authored-by: Donghee Na <donghee.na@python.org>
Co-authored-by: Victor Stinner <vstinner@python.org>
Move `pathlib.Path.walk()` implementation into `glob._Globber`. The new
`glob._Globber.walk()` classmethod works with strings internally, which is
a little faster than generating `Path` objects and keeping them normalized.
The `pathlib.Path.walk()` method converts the strings back to path objects.
In the private pathlib ABCs, our existing subclass of `_Globber` ensures
that `PathBase` instances are used throughout.
Follow-up to #117589.
Move pathlib globbing implementation into a new private class: `glob._Globber`. This class implements fast string-based globbing. It's called by `pathlib.Path.glob()`, which then converts strings back to path objects.
In the private pathlib ABCs, add a `pathlib._abc.Globber` subclass that works with `PathBase` objects rather than strings, and calls user-defined path methods like `PathBase.stat()` rather than `os.stat()`.
This sets the stage for two more improvements:
- GH-115060: Query non-wildcard segments with `lstat()`
- GH-116380: Unify `pathlib` and `glob` implementations of globbing.
No change to the implementations of `glob.glob()` and `glob.iglob()`.
Moves the validation for invalid years in the C implementation of the `datetime` module into a common location between `fromisoformat` and `fromisocalendar`, which improves the error message and fixes a failed assertion when parsing invalid ISO 8601 years using one of the "ISO weeks" formats.
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
This prevents external cancellations of a task group's parent task to
be dropped when an internal cancellation happens at the same time.
Also strengthen the semantics of uncancel() to clear self._must_cancel
when the cancellation count reaches zero.
Co-Authored-By: Tin Tvrtković <tinchester@gmail.com>
Co-Authored-By: Arthur Tacca
Apply the following optimizations to `posixpath.realpath()`:
- Remove use of recursion
- Construct child paths directly rather than using `join()`
- Use `os.getcwd[b]()` rather than `abspath()`
- Use `startswith(sep)` rather than `isabs()`
- Use slicing rather than `split()`
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Introduce a unified 16-bit backoff counter type (``_Py_BackoffCounter``),
shared between the Tier 1 adaptive specializer and the Tier 2 optimizer. The
API used for adaptive specialization counters is changed but the behavior is
(supposed to be) identical.
The behavior of the Tier 2 counters is changed:
- There are no longer dynamic thresholds (we never varied these).
- All counters now use the same exponential backoff.
- The counter for ``JUMP_BACKWARD`` starts counting down from 16.
- The ``temperature`` in side exits starts counting down from 64.
This change gives a significant speedup, as the METH_FASTCALL calling
convention is now used. The following methods are adapted:
- str.count
- str.find
- str.index
- str.rfind
- str.rindex
On Linux >= 2.6.36 with glibc < 2.27, `getcwd()` can return a relative
pathname starting with '(unreachable)'. We detect this and fail with
ENOENT, matching new glibc behaviour.
Co-authored-by: Petr Viktorin <encukou@gmail.com>
* as_completed returns object that is both iterator and async iterator
* Existing tests adjusted to test both the old and new style
* New test to ensure iterator can be resumed
* New test to ensure async iterator yields any passed-in Futures as-is
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>
* Extract method for _read_inner, reducing complexity and indentation by 1.
* Extract method for _raise_all and yield ParseErrors from _read_inner.
Reduces complexity by 1 and reduces touch points for handling errors in _read_inner.
* Prefer iterators to splat expansion and literal indexing.
* Extract method for _strip_comments. Reduces complexity by 7.
* Model the file lines in a class to encapsulate the comment status and cleaned value.
* Encapsulate the read state as a dataclass
* Extract _handle_continuation_line and _handle_rest methods. Reduces complexity by 8.
* Reindent
* At least for now, collect errors in the ReadState
* Check for missing section header separately.
* Extract methods for _handle_header and _handle_option. Reduces complexity by 6.
* Remove unreachable code. Reduces complexity by 4.
* Remove unreachable branch
* Handle error condition early. Reduces complexity by 1.
* Add blurb
* Move _raise_all to ParsingError, as its behavior is most closely related to the exception class and not the reader.
* Split _strip* into separate methods.
* Refactor _strip_full to compute the strip just once and use 'not any' to determine the factor.
* Replace use of 'sys.maxsize' with direct computation of the stripped value.
* Extract has_comments as a dynamic property.
* Implement clean as a cached property.
* Model comment prefixes in the RawConfigParser within a prefixes namespace.
* Use a regular expression to search for the first match.
Avoids mutating variables and tricky logic and over-computing all of the starts when only the first is relevant.
Remove extra self DECREF on ssl "no ciphers" error path.
This doesn't come up in practice because nobody links against a broken
OpenSSL library that provides nothing.
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Jacob Coffee <jacob@z7x.org>
Co-authored-by: Malcolm Smith <smith@chaquo.com>
Co-authored-by: Ned Deily <nad@python.org>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Malcolm Smith <smith@chaquo.com>
Co-authored-by: Ned Deily <nad@python.org>
* Reads zip64 files as produced by the zipfile module
* Include tests (somewhat slow, however, because of the need to create "large" zips)
* About the same amount of strictness reading invalid zip files as zipfile has
* Still works on files with prepended data (like pex)
There are a lot more test cases at https://github.com/thatch/zipimport64/ that give me confidence that this works for real-world files.
Fixes#89739 and #77140.
---------
Co-authored-by: Itamar Ostricher <itamarost@gmail.com>
Reviewed-by: Gregory P. Smith <greg@krypto.org>
Explicitly handle the case where stdout=STDOUT
as otherwise the existing error handling gets
confused and reports hard to understand errors.
Signed-off-by: Paulo Neves <ptsneves@gmail.com>
Fix parsing of the following corner cases:
* URLs with only a host name
* URLs containing a fragment
* URLs containing a query
* filenames with only a UNC sharepoint on Windows
Co-authored-by: Dong-hee Na <donghee.na92@gmail.com>
Change old space bit of young objects from 0 to gcstate->visited_space.
This ensures that any object created *and* collected during cycle GC has the bit set correctly.
Python 3.10 changed from using SSL_write() and SSL_read() to SSL_write_ex() and
SSL_read_ex(), but did not update handling of the return value.
Change error handling so that the return value is not examined.
OSError (not EOF) is now returned when retval is 0.
According to *recent* man pages of all functions for which we call
PySSL_SetError, (in OpenSSL 3.0 and 1.1.1), their return value should
be used to determine whether an error happened (i.e. if PySSL_SetError
should be called), but not what kind of error happened (so,
PySSL_SetError shouldn't need retval). To get the error,
we need to use SSL_get_error.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
This fixes XML unittest fallout from the https://github.com/python/cpython/issues/115398 security fix. When configured using `--with-system-expat` on systems with older pre 2.6.0 versions of libexpat, our unittests were failing.
* sax|etree: Simplify Expat version guard where simplifiable
Idea by Matěj Cepl
* sax|etree: Fix reparse deferral tests for vanilla Expat <2.6.0
This *does not fix* the case of distros with an older version of libexpat with the 2.6.0 feature backported as a security fix. (Ubuntu is a known example of this with its libexpat1 2.5.0-2ubunutu0.1 package)
Instead of calling `exec()` once for each function added to a dataclass, only call `exec()` once per dataclass. This can lead to speed improvements of up to 20%.
* GH-113171: Fix "private" (really non-global) IP address ranges
The _private_networks variables, used by various is_private
implementations, were missing some ranges and at the same time had
overly strict ranges (where there are more specific ranges considered
globally reachable by the IANA registries).
This patch updates the ranges with what was missing or otherwise
incorrect.
I left 100.64.0.0/10 alone, for now, as it's been made special in [1]
and I'm not sure if we want to undo that as I don't quite understand the
motivation behind it.
The _address_exclude_many() call returns 8 networks for IPv4, 121
networks for IPv6.
[1] https://github.com/python/cpython/issues/61602
Add Py_GetConstant() and Py_GetConstantBorrowed() functions.
In the limited C API version 3.13, getting Py_None, Py_False,
Py_True, Py_Ellipsis and Py_NotImplemented singletons is now
implemented as function calls at the stable ABI level to hide
implementation details. Getting these constants still return borrowed
references.
Add _testlimitedcapi/object.c and test_capi/test_object.py to test
Py_GetConstant() and Py_GetConstantBorrowed() functions.
* Ensure importlib.metadata tests do not leak references in sys.modules.
* Move importlib.metadata tests to their own package for easier syncing with importlib_metadata.
* Update owners and makefile for new directories.
* Add blurb
Before this change, ctypes classes used a custom dict subclass, `StgDict`,
as their `tp_dict`. This acts like a regular dict but also includes extra information
about the type.
This replaces stgdict by `StgInfo`, a C struct on the type, accessed by
`PyObject_GetTypeData()` (PEP-697).
All usage of `StgDict` (mainly variables named `stgdict`, `dict`, `edict` etc.) is
converted to `StgInfo` (named `stginfo`, `info`, `einfo`, etc.).
Where the dict is actually used for class attributes (as a regular PyDict), it's now
called `attrdict`.
This change -- not overriding `tp_dict` -- is made to make me comfortable with
the next part of this PR: moving the initialization logic from `tp_new` to `tp_init`.
The `StgInfo` is set up in `__init__` of each class, with a guard that prevents
calling `__init__` more than once. Note that abstract classes (like `Array` or
`Structure`) are created using `PyType_FromMetaclass` and do not have
`__init__` called.
Previously, this was done in `__new__`, which also wasn't called for abstract
classes.
Since `__init__` can be called from Python code or skipped, there is a tested
guard to ensure `StgInfo` is initialized exactly once before it's used.
Co-authored-by: neonene <53406459+neonene@users.noreply.github.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
Starting in Python 3.12, we prevented calling fork() and starting new threads
during interpreter finalization (shutdown). This has led to a number of
regressions and flaky tests. We should not prevent starting new threads
(or `fork()`) until all non-daemon threads exit and finalization starts in
earnest.
This changes the checks to use `_PyInterpreterState_GetFinalizing(interp)`,
which is set immediately before terminating non-daemon threads.
If you catch DuplicateOptionError / DuplicateSectionError when reading a
config file (the intention is to skip invalid config files) and then
attempt to use the ConfigParser instance, any values it *had* read
successfully so far, were stored as a list instead of string! Later
`get` calls would raise "AttributeError: 'list' object has no attribute
'find'" from somewhere deep in the interpolation code.
These give applications the option of more forcefully terminating client
connections for asyncio servers. Useful when terminating a service and
there is limited time to wait for clients to finish up their work.
This is a do-over with a test fix for gh-114432, which was reverted.