Commit Graph

8669 Commits

Author SHA1 Message Date
Barney Gale a74f117dab
GH-115060: Speed up `pathlib.Path.glob()` by omitting initial `stat()` (#117831)
Since 6258844c, paths that might not exist can be fed into pathlib's
globbing implementation, which will call `os.scandir()` / `os.lstat()` only
when strictly necessary. This allows us to drop an initial `self.is_dir()`
call, which saves a `stat()`.

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
2024-04-14 00:08:03 +01:00
Hugo van Kemenade 56ed979d04
gh-68583: webbrowser: replace `getopt` with `argparse`, add long options (#117047) 2024-04-13 08:56:56 -06:00
Michiel W. Beijen 022ba6d161
gh-102247: http: support rfc9110 status codes (GH-117611)
rfc9110 obsoletes the earlier rfc 7231. This document also includes some
status codes that were previously only used for WebDAV and assigns more
generic names to these status codes.

ref: https://www.rfc-editor.org/rfc/rfc9110.html#name-changes-from-rfc-7231

- http.HTTPStatus.CONTENT_TOO_LARGE (413, previously
  REQUEST_ENTITY_TOO_LARGE)
- http.HTTPStatus.URI_TOO_LONG (414, previously REQUEST_URI_TOO_LONG)
- http.HTTPStatus.RANGE_NOT_SATISFIABLE (416, previously
  REQUESTED_RANGE_NOT_SATISFIABLE)
- http.HTTPStatus.UNPROCESSABLE_CONTENT (422, previously
  UNPROCESSABLE_ENTITY)

The new constants are added to http.HTTPStatus and the old constant names are
preserved for backwards compatibility.

References in documentation to the obsoleted rfc 7231 are updated
2024-04-13 07:33:20 -07:00
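The renaming described in 022ba6d161 can be checked with a small sketch. It assumes only that the old name stays available, as the message states; the new name is looked up defensively because it exists only on Python versions that include this change.

```python
from http import HTTPStatus

# Old name: the commit message says it is preserved for backwards
# compatibility, so it should resolve on old and new versions alike.
old = HTTPStatus.REQUEST_ENTITY_TOO_LARGE

# New name: only present where the change has landed, so avoid a hard
# attribute access and fall back to None.
new = getattr(HTTPStatus, "CONTENT_TOO_LARGE", None)

old_code = int(old)
new_code = None if new is None else int(new)
print(old_code, new_code)
```

Code that has to run on several Python versions can keep using the old names, or use a `getattr` lookup like the one above.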
Barney Gale 30f0643e36
GH-117727: Speed up `pathlib.Path.iterdir()` by using `os.scandir()` (#117728)
Replace use of `os.listdir()` with `os.scandir()`. Forgo setting `_drv`,
`_root` and `_tail_cached`, as these usually aren't needed. Use
`os.DirEntry.path` to set `_str`.
2024-04-12 22:02:39 +00:00
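The idea behind 30f0643e36 can be sketched with public APIs only. `iterdir_sketch` is a hypothetical helper, not the pathlib implementation: the real change sets private attributes such as `_str` directly, while this version goes through the normal `Path` constructor.

```python
import os
import tempfile
from pathlib import Path


def iterdir_sketch(directory):
    """Yield Path objects for the children of `directory` via os.scandir()."""
    # Collect the entry paths first so the directory handle is closed
    # before any path is handed to the caller.
    with os.scandir(directory) as entries:
        child_paths = [entry.path for entry in entries]
    for child in child_paths:
        # entry.path already contains the joined string, so no extra
        # join of directory and name is needed here.
        yield Path(child)


with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt"):
        Path(tmp, name).touch()
    names = sorted(p.name for p in iterdir_sketch(tmp))

print(names)
```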
Barney Gale 0eb52f5f26
GH-115060: Speed up `pathlib.Path.glob()` by not scanning literal parts (#117732)
Don't bother calling `os.scandir()` to scan for literal pattern segments,
like `foo` in `foo/*.py`. Instead, append the segment(s) as-is and call
through to the next selector with `exists=False`, which signals that the
path might not exist. Subsequent selectors will call `os.scandir()` or
`os.lstat()` to filter out missing paths as needed.
2024-04-12 22:19:21 +01:00
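The selector chain described in 0eb52f5f26 might look roughly like the following simplified sketch. The names `select`, `scan` and `is_literal` are hypothetical and not taken from CPython; hidden files, `**` and case sensitivity are ignored, and `os.path.lexists()` stands in for the `os.lstat()` check.

```python
import fnmatch
import os
import tempfile


def is_literal(segment):
    """Return True if the segment has no glob wildcard characters."""
    return not any(ch in segment for ch in "*?[")


def scan(paths, pattern):
    """Wildcard selector: list each directory and filter names."""
    for path, _exists in paths:
        try:
            with os.scandir(path) as entries:
                names = [entry.name for entry in entries]
        except OSError:
            # A missing or non-directory path (for example one built from
            # an unverified literal segment) is dropped here.
            continue
        for name in names:
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(path, name), True


def select(paths, segments):
    """Expand `segments` against (path, exists) pairs."""
    if not segments:
        for path, exists in paths:
            # Only paths that were never verified need a final check.
            if exists or os.path.lexists(path):
                yield path
        return
    segment, rest = segments[0], segments[1:]
    if is_literal(segment):
        # Literal segment: append as-is, no directory scan, and flag
        # the result as "might not exist".
        next_paths = ((os.path.join(p, segment), False) for p, _ in paths)
    else:
        next_paths = scan(paths, segment)
    yield from select(next_paths, rest)


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "foo"))
    for name in ("a.py", "b.txt"):
        open(os.path.join(root, "foo", name), "w").close()
    start = [(root, True)]
    matches = [os.path.relpath(p, root) for p in select(start, ["foo", "*.py"])]
    missing_dir = list(select(start, ["nope", "*.py"]))
    missing_file = list(select(start, ["foo", "missing.py"]))

print(matches, missing_dir, missing_file)
```

In this sketch the pattern `foo/*.py` triggers a single `os.scandir()` call (on `foo`), and a purely literal pattern triggers only the final existence check.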
Erlend E. Aasland deb921f851
gh-117431: Adapt bytes and bytearray .find() and friends to Argument Clinic (#117502)
This change gives a significant speedup, as the METH_FASTCALL calling
convention is now used. The following bytes and bytearray methods are adapted:

- count()
- find()
- index()
- rfind()
- rindex()

Co-authored-by: Inada Naoki <songofacandy@gmail.com>
2024-04-12 07:40:55 +00:00
Will Childs-Klein b8eaad3009
gh-117233: Detect support for several hashes at hashlib build time (GH-117234)
Detect libcrypto BLAKE2, Shake, SHA3, and Truncated-SHA512 support at hashlib build time

## BLAKE2

While OpenSSL supports both "b" and "s" variants of the BLAKE2 hash
function, other cryptographic libraries may lack support for one or both
of the variants. This commit modifies `hashlib`'s C code to detect
whether or not the linked libcrypto supports each BLAKE2 variant, and
elides references to each variant's NID accordingly. In cases where the
underlying libcrypto doesn't fully support BLAKE2, CPython's
`./configure` script can be given the following flag to use CPython's
built-in BLAKE2 implementation: `--with-builtin-hashlib-hashes=blake2`.

## SHA3, Shake, & truncated SHA512.

Detect BLAKE2, SHA3, Shake, & truncated SHA512 support in the
OpenSSL-ish libcrypto library at build time.  This helps allow hashlib's
`_hashopenssl` to be used with libraries that do not support every
algorithm that upstream OpenSSL does, such as AWS-LC & BoringSSL.

Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
2024-04-11 16:49:41 +02:00
Bruce Merry 01a51f9494
gh-117722: Fix Stream.readuntil with non-bytes buffer objects (#117723)
gh-16429 introduced support for an iterable of separators in
Stream.readuntil. Since bytes-like types are themselves iterable, this
can introduce ambiguities in deciding whether the argument is an
iterable of separators or a singleton separator. In gh-16429, only 'bytes'
was considered a singleton, but this will break code that passes other
buffer object types.

Fix it by only supporting tuples rather than arbitrary iterables.

Closes gh-117722.
2024-04-11 07:41:55 -07:00
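The disambiguation rule from 01a51f9494 can be illustrated with a small helper. `normalize_separators` is a hypothetical name used only for this sketch; it is not the asyncio implementation, and the error message is an assumption.

```python
def normalize_separators(separator):
    """Return a list of bytes separators from one buffer or a tuple of them."""
    if isinstance(separator, tuple):
        # Only a tuple is treated as "several separators".
        separators = list(separator)
    else:
        # bytes, bytearray, memoryview and similar objects are iterable
        # too, but each one counts as a single separator.
        separators = [separator]
    if not separators or any(len(sep) == 0 for sep in separators):
        raise ValueError("separator must contain at least one byte")
    return [bytes(sep) for sep in separators]


single = normalize_separators(bytearray(b"\r\n"))
several = normalize_separators((b"\n", memoryview(b"\r\n")))
print(single, several)
```

Checking for `tuple` rather than "anything iterable" is what keeps a `bytearray` or `memoryview` argument from being split into individual bytes.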
Erlend E. Aasland 044dc496e0
gh-117709: Add vectorcall support for str() with positional-only arguments (#117746)
Fall back to tp_call() for cases when arguments are passed by name.

Co-authored-by: Donghee Na <donghee.na@python.org>
Co-authored-by: Victor Stinner <vstinner@python.org>
2024-04-11 13:55:37 +00:00
Barney Gale 0cc71bde00
GH-117586: Speed up `pathlib.Path.walk()` by working with strings (#117726)
Move `pathlib.Path.walk()` implementation into `glob._Globber`. The new
`glob._Globber.walk()` classmethod works with strings internally, which is
a little faster than generating `Path` objects and keeping them normalized.
The `pathlib.Path.walk()` method converts the strings back to path objects.

In the private pathlib ABCs, our existing subclass of `_Globber` ensures
that `PathBase` instances are used throughout.

Follow-up to #117589.
2024-04-11 01:26:53 +01:00
Barney Gale 6258844c27
GH-117586: Speed up `pathlib.Path.glob()` by working with strings (#117589)
Move pathlib globbing implementation into a new private class: `glob._Globber`. This class implements fast string-based globbing. It's called by `pathlib.Path.glob()`, which then converts strings back to path objects.

In the private pathlib ABCs, add a `pathlib._abc.Globber` subclass that works with `PathBase` objects rather than strings, and calls user-defined path methods like `PathBase.stat()` rather than `os.stat()`.

This sets the stage for two more improvements:

- GH-115060: Query non-wildcard segments with `lstat()`
- GH-116380: Unify `pathlib` and `glob` implementations of globbing.

No change to the implementations of `glob.glob()` and `glob.iglob()`.
2024-04-10 20:43:07 +01:00
Barney Gale 630df37116
GH-117546: Fix symlink resolution in `os.path.realpath('loop/../link')` (#117568)
Continue resolving symlink targets after encountering a symlink loop, which
matches coreutils `realpath` behaviour.
2024-04-10 18:17:18 +01:00
neonene ef4118222b
gh-117142: Port _ctypes to multi-phase init (GH-117181) 2024-04-10 11:00:01 +00:00
Nikita Sobolev 4bb7d121bc
gh-117692: Fix `AttributeError` in `DocTestFinder` on wrapped `builtin_or_method` (#117699)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-04-10 10:52:47 +01:00
Thomas Wouters d0f93d132f Merge branch 'main' of https://github.com/python/cpython 2024-04-09 20:42:07 +02:00
Ethan Furman e5521bcca9
gh-117663: [Enum] fix _simple_enum's detection of aliases (GH-117664) 2024-04-09 11:31:07 -07:00
Vlad4896 d5f1139c79
gh-117534: Add checking for input parameter in iso_to_ymd (#117543)
Moves the validation for invalid years in the C implementation of the `datetime` module into a common location between `fromisoformat` and `fromisocalendar`, which improves the error message and fixes a failed assertion when parsing invalid ISO 8601 years using one of the "ISO weeks" formats.

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2024-04-09 13:53:00 -04:00
Guido van Rossum fa58e75a86
gh-116720: Fix corner cases of taskgroups (#117407)
This prevents external cancellations of a task group's parent task from
being dropped when an internal cancellation happens at the same time.
Also strengthen the semantics of uncancel() to clear self._must_cancel
when the cancellation count reaches zero.

Co-Authored-By: Tin Tvrtković <tinchester@gmail.com>
Co-Authored-By: Arthur Tacca
2024-04-09 08:17:28 -07:00
Jelle Zijlstra f2132fcd2a
gh-117516: Implement typing.TypeIs (#117517)
See PEP 742.

Co-authored-by: Carl Meyer <carl@oddbird.net>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2024-04-09 10:50:37 +00:00
Thomas Wouters 57aee2a02c Python 3.13.0a6 2024-04-09 11:56:22 +02:00
Nice Zombies 99852d9e65
gh-117648: Improve performance of os.path.join (#117654)
Replace map() with a method call in the loop body.

Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
2024-04-09 10:27:14 +02:00
Chris Markiewicz 19a2202067
gh-117182: Allow lazily loaded modules to modify their own __class__ 2024-04-09 04:08:48 +01:00
Bruce Merry 775912a51d
gh-81322: support multiple separators in StreamReader.readuntil (#16429) 2024-04-08 09:58:02 -07:00
Serhiy Storchaka 24a2bd0481
gh-117642: Fix PEP 737 implementation (GH-117643)
* Fix implementation of %#T and %#N (they were implemented as %T# and
  %N#).
* Restore tests removed in gh-116417.
2024-04-08 16:27:25 +00:00
Nice Zombies 733e56ef96
gh-117584: Raise TypeError for non-paths in posixpath.relpath() (GH-117585) 2024-04-07 12:00:08 +03:00
Laurie O df4d84c3cd
gh-96471: Add asyncio queue shutdown (#104228)
Co-authored-by: Duprat <yduprat@gmail.com>
2024-04-06 07:27:13 -07:00
Steve Dower 687616877b
gh-111140: PyLong_From/AsNativeBytes: Take *flags* rather than just *endianness* (GH-116053) 2024-04-05 16:21:16 +02:00
Barney Gale abfa16b44b
GH-114847: Speed up `posixpath.realpath()` (#114848)
Apply the following optimizations to `posixpath.realpath()`:

- Remove use of recursion
- Construct child paths directly rather than using `join()`
- Use `os.getcwd[b]()` rather than `abspath()`
- Use `startswith(sep)` rather than `isabs()`
- Use slicing rather than `split()`

Co-authored-by: Petr Viktorin <encukou@gmail.com>
2024-04-05 12:35:01 +00:00
Petr Viktorin 9ceaee74db
gh-116608: importlib.resources: Un-deprecate functional API & add subdirectory support (GH-116609) 2024-04-05 13:55:59 +02:00
Irit Katriel 04697bcfaf
gh-117494: extract the Instruction Sequence data structure into a separate file (#117496) 2024-04-04 15:47:26 +00:00
Guido van Rossum 060a96f1a9
gh-116968: Reimplement Tier 2 counters (#117144)
Introduce a unified 16-bit backoff counter type (``_Py_BackoffCounter``),
shared between the Tier 1 adaptive specializer and the Tier 2 optimizer. The
API used for adaptive specialization counters is changed but the behavior is
(supposed to be) identical.

The behavior of the Tier 2 counters is changed:
- There are no longer dynamic thresholds (we never varied these).
- All counters now use the same exponential backoff.
- The counter for ``JUMP_BACKWARD`` starts counting down from 16.
- The ``temperature`` in side exits starts counting down from 64.
2024-04-04 15:03:27 +00:00
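The exponential backoff mentioned in 060a96f1a9 can be pictured with a toy model. This is not `_Py_BackoffCounter`: the class name, the field layout, the exact trigger condition and the `MAX_BACKOFF` cap are all assumptions made for illustration. The only detail taken from the message is a countdown that starts at 16.

```python
class BackoffCounter:
    """Toy countdown counter whose reload value doubles after each trigger."""

    MAX_BACKOFF = 12  # hypothetical cap, chosen only for this sketch

    def __init__(self, value, backoff):
        self.value = value      # events left before the next trigger
        self.backoff = backoff  # exponent used for the next reload

    def tick(self):
        """Count one event; return True if the counter triggers."""
        self.value -= 1
        if self.value > 0:
            return False
        # Trigger: back off exponentially by doubling the next countdown.
        self.backoff = min(self.backoff + 1, self.MAX_BACKOFF)
        self.value = 1 << self.backoff
        return True


counter = BackoffCounter(value=16, backoff=4)
trigger_ticks = [n for n in range(1, 121) if counter.tick()]
print(trigger_ticks)
```

In this model the gaps between triggers grow as 16, 32, 64 events, so repeated unsuccessful attempts become progressively rarer.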
Tony Mountifield 3f5bcc86d0
gh-117467: Add preserving of mailbox owner on flush (GH-117510)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2024-04-04 13:32:53 +03:00
rsp4jack 85843348c5
gh-117459: Keep the traceback in _convert_future_exc (#117460) 2024-04-03 20:13:32 -07:00
Shantanu b4fe02f595
gh-117205: Increase chunksize when compiling pyc in parallel (#117206) 2024-04-03 15:24:24 -07:00
Steve Dower 985917dc8d
gh-117267: Ensure DirEntry.stat().st_ctime still contains creation time during deprecation period (GH-117354) 2024-04-03 23:14:55 +01:00
Erlend E. Aasland 7ecd55d604
gh-117431: Adapt str.find and friends to Argument Clinic (#117468)
This change gives a significant speedup, as the METH_FASTCALL calling
convention is now used. The following methods are adapted:

- str.count
- str.find
- str.index
- str.rfind
- str.rindex
2024-04-03 17:59:18 +02:00
Barney Gale 345194de8c
GH-114847: Raise FileNotFoundError when getcwd() returns '(unreachable)' (#117481)
On Linux >= 2.6.36 with glibc < 2.27, `getcwd()` can return a relative
pathname starting with '(unreachable)'. We detect this and fail with
ENOENT, matching new glibc behaviour.

Co-authored-by: Petr Viktorin <encukou@gmail.com>
2024-04-03 16:39:40 +01:00
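The check described in 345194de8c could be sketched as follows. `checked_cwd` is a hypothetical helper; the `getcwd` parameter exists only so the behaviour can be demonstrated without an actually unreachable working directory.

```python
import errno
import os


def checked_cwd(getcwd=os.getcwd):
    """Return the current directory, failing with ENOENT if unreachable."""
    cwd = getcwd()
    if cwd.startswith("(unreachable)"):
        # Mirror the newer glibc behaviour described in the commit message.
        raise FileNotFoundError(errno.ENOENT, os.strerror(errno.ENOENT), cwd)
    return cwd


normal = checked_cwd(lambda: "/home/user")
try:
    checked_cwd(lambda: "(unreachable)/home/user")
    caught_errno = None
except FileNotFoundError as exc:
    caught_errno = exc.errno

print(normal, caught_errno == errno.ENOENT)
```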
Zackery Spytz fc5f68e58e
gh-59215: unittest: restore _top_level_dir at end of discovery (GH-15242) 2024-04-03 16:17:13 +02:00
Nice Zombies 2ec6bb4111
gh-117381: Improve error messages for ntpath.commonpath() (GH-117382) 2024-04-03 16:10:09 +03:00
Gregory P. Smith 33ee5cb3e9
GH-70647: Deprecate strptime day of month parsing without a year present to avoid leap-year bugs (GH-117107) 2024-04-03 14:19:49 +02:00
Erlend E. Aasland 595bb496b0
gh-117431: Adapt bytes and bytearray .startswith() and .endswith() to Argument Clinic (#117495)
This change gives a significant speedup, as the METH_FASTCALL calling
convention is now used.
2024-04-03 13:11:14 +02:00
Erlend E. Aasland 444156ede4
gh-117431: Adapt str.startswith and str.endswith to Argument Clinic (#117466)
This change gives a significant speedup, as the METH_FASTCALL calling
convention is now used.
2024-04-03 09:11:39 +02:00
Nice Zombies cae4cdd07d
gh-117349: Micro-optimize a few `os.path` functions (#117350)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Barney Gale <barney.gale@gmail.com>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
2024-04-02 21:32:35 +01:00
Mark Shannon c32dc47aca
GH-115776: Embed the values array into the object, for "normal" Python objects. (GH-116115) 2024-04-02 11:59:21 +01:00
Grigoriev Semyon c97d3af239
gh-109120: Fix syntax error in handling of incorrect star expressions (#117444) 2024-04-02 11:42:58 +01:00
Irit Katriel 1d5479b236
gh-117411: move PyFutureFeatures to pycore_symtable.h and make it private (#117412) 2024-04-02 10:34:49 +00:00
Barney Gale fc8007ee36
GH-117337: Deprecate `glob.glob0()` and `glob.glob1()`. (#117371)
These undocumented functions are no longer used by `msilib`, so there's no
reason to keep them around.
2024-04-01 19:37:41 +00:00
Justin Turner Arthur c741ad3537
gh-77714: Provide an async iterator version of as_completed (GH-22491)
* as_completed returns object that is both iterator and async iterator
* Existing tests adjusted to test both the old and new style
* New test to ensure iterator can be resumed
* New test to ensure async iterator yields any passed-in Futures as-is

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>
2024-04-01 20:07:29 +03:00
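Both iteration styles from c741ad3537 are shown in the sketch below. It assumes the async-iterator form is available from Python 3.13 onwards, so that branch is guarded by a version check; `work` and `make_tasks` are hypothetical helpers.

```python
import asyncio
import sys


async def work(delay, value):
    await asyncio.sleep(delay)
    return value


def make_tasks():
    # Must be called while an event loop is running.
    return [
        asyncio.create_task(work(0.05, "slow")),
        asyncio.create_task(work(0.0, "fast")),
    ]


async def plain_iterator_style():
    """Old style: a regular for loop that yields awaitables."""
    results = []
    for next_done in asyncio.as_completed(make_tasks()):
        results.append(await next_done)
    return results


async def async_iterator_style():
    """New style: async for yields the finished tasks themselves."""
    results = []
    async for finished in asyncio.as_completed(make_tasks()):
        # Awaiting an already finished task just returns its result.
        results.append(await finished)
    return results


old_results = asyncio.run(plain_iterator_style())
if sys.version_info >= (3, 13):
    new_results = asyncio.run(async_iterator_style())
else:
    # The async-iterator protocol is not available; reuse the old result.
    new_results = old_results

print(old_results, new_results)
```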
Steve (Gadget) Barnes 3de09cadde
gh-91565: Replace bugs.python.org links with Devguide/GitHub ones (GH-91568)
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Ezio Melotti <ezio.melotti@gmail.com>
2024-04-01 13:02:07 +00:00
Jason R. Coombs 019143fecb
gh-117348: Refactored RawConfigParser._read for simplicity and comprehensibility (#117372)
* Extract method for _read_inner, reducing complexity and indentation by 1.

* Extract method for _raise_all and yield ParseErrors from _read_inner.

Reduces complexity by 1 and reduces touch points for handling errors in _read_inner.

* Prefer iterators to splat expansion and literal indexing.

* Extract method for _strip_comments. Reduces complexity by 7.

* Model the file lines in a class to encapsulate the comment status and cleaned value.

* Encapsulate the read state as a dataclass

* Extract _handle_continuation_line and _handle_rest methods. Reduces complexity by 8.

* Reindent

* At least for now, collect errors in the ReadState

* Check for missing section header separately.

* Extract methods for _handle_header and _handle_option. Reduces complexity by 6.

* Remove unreachable code. Reduces complexity by 4.

* Remove unreachable branch

* Handle error condition early. Reduces complexity by 1.

* Add blurb

* Move _raise_all to ParsingError, as its behavior is most closely related to the exception class and not the reader.

* Split _strip* into separate methods.

* Refactor _strip_full to compute the strip just once and use 'not any' to determine the factor.

* Replace use of 'sys.maxsize' with direct computation of the stripped value.

* Extract has_comments as a dynamic property.

* Implement clean as a cached property.

* Model comment prefixes in the RawConfigParser within a prefixes namespace.

* Use a regular expression to search for the first match.

Avoids mutating variables and tricky logic and over-computing all of the starts when only the first is relevant.
2024-03-29 16:06:09 -04:00