* Extract method for _read_inner, reducing complexity and indentation by 1.
* Extract method for _raise_all and yield ParseErrors from _read_inner.
Reduces complexity by 1 and reduces touch points for handling errors in _read_inner.
* Prefer iterators to splat expansion and literal indexing.
* Extract method for _strip_comments. Reduces complexity by 7.
* Model the file lines in a class to encapsulate the comment status and cleaned value.
* Encapsulate the read state as a dataclass
* Extract _handle_continuation_line and _handle_rest methods. Reduces complexity by 8.
* Reindent
* At least for now, collect errors in the ReadState
* Check for missing section header separately.
* Extract methods for _handle_header and _handle_option. Reduces complexity by 6.
* Remove unreachable code. Reduces complexity by 4.
* Remove unreachable branch
* Handle error condition early. Reduces complexity by 1.
* Add blurb
* Move _raise_all to ParsingError, as its behavior is most closely related to the exception class and not the reader.
* Split _strip* into separate methods.
* Refactor _strip_full to compute the strip just once and use 'not any' to determine the factor.
* Replace use of 'sys.maxsize' with direct computation of the stripped value.
* Extract has_comments as a dynamic property.
* Implement clean as a cached property.
* Model comment prefixes in the RawConfigParser within a prefixes namespace.
* Use a regular expression to search for the first match.
Avoids mutating variables and tricky logic and over-computing all of the starts when only the first is relevant.
Remove extra self DECREF on ssl "no ciphers" error path.
This doesn't come up in practice because nobody links against a broken
OpenSSL library that provides nothing.
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Jacob Coffee <jacob@z7x.org>
Co-authored-by: Malcolm Smith <smith@chaquo.com>
Co-authored-by: Ned Deily <nad@python.org>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Malcolm Smith <smith@chaquo.com>
Co-authored-by: Ned Deily <nad@python.org>
* Reads zip64 files as produced by the zipfile module
* Include tests (somewhat slow, however, because of the need to create "large" zips)
* About the same amount of strictness reading invalid zip files as zipfile has
* Still works on files with prepended data (like pex)
There are a lot more test cases at https://github.com/thatch/zipimport64/ that give me confidence that this works for real-world files.
Fixes#89739 and #77140.
---------
Co-authored-by: Itamar Ostricher <itamarost@gmail.com>
Reviewed-by: Gregory P. Smith <greg@krypto.org>
Explicitly handle the case where stdout=STDOUT
as otherwise the existing error handling gets
confused and reports hard to understand errors.
Signed-off-by: Paulo Neves <ptsneves@gmail.com>
Fix parsing of the following corner cases:
* URLs with only a host name
* URLs containing a fragment
* URLs containing a query
* filenames with only a UNC sharepoint on Windows
Co-authored-by: Dong-hee Na <donghee.na92@gmail.com>
Change old space bit of young objects from 0 to gcstate->visited_space.
This ensures that any object created *and* collected during cycle GC has the bit set correctly.
Python 3.10 changed from using SSL_write() and SSL_read() to SSL_write_ex() and
SSL_read_ex(), but did not update handling of the return value.
Change error handling so that the return value is not examined.
OSError (not EOF) is now returned when retval is 0.
According to *recent* man pages of all functions for which we call
PySSL_SetError, (in OpenSSL 3.0 and 1.1.1), their return value should
be used to determine whether an error happened (i.e. if PySSL_SetError
should be called), but not what kind of error happened (so,
PySSL_SetError shouldn't need retval). To get the error,
we need to use SSL_get_error.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
This fixes XML unittest fallout from the https://github.com/python/cpython/issues/115398 security fix. When configured using `--with-system-expat` on systems with older pre 2.6.0 versions of libexpat, our unittests were failing.
* sax|etree: Simplify Expat version guard where simplifiable
Idea by Matěj Cepl
* sax|etree: Fix reparse deferral tests for vanilla Expat <2.6.0
This *does not fix* the case of distros with an older version of libexpat with the 2.6.0 feature backported as a security fix. (Ubuntu is a known example of this with its libexpat1 2.5.0-2ubunutu0.1 package)
Instead of calling `exec()` once for each function added to a dataclass, only call `exec()` once per dataclass. This can lead to speed improvements of up to 20%.
* GH-113171: Fix "private" (really non-global) IP address ranges
The _private_networks variables, used by various is_private
implementations, were missing some ranges and at the same time had
overly strict ranges (where there are more specific ranges considered
globally reachable by the IANA registries).
This patch updates the ranges with what was missing or otherwise
incorrect.
I left 100.64.0.0/10 alone, for now, as it's been made special in [1]
and I'm not sure if we want to undo that as I don't quite understand the
motivation behind it.
The _address_exclude_many() call returns 8 networks for IPv4, 121
networks for IPv6.
[1] https://github.com/python/cpython/issues/61602
Add Py_GetConstant() and Py_GetConstantBorrowed() functions.
In the limited C API version 3.13, getting Py_None, Py_False,
Py_True, Py_Ellipsis and Py_NotImplemented singletons is now
implemented as function calls at the stable ABI level to hide
implementation details. Getting these constants still return borrowed
references.
Add _testlimitedcapi/object.c and test_capi/test_object.py to test
Py_GetConstant() and Py_GetConstantBorrowed() functions.
* Ensure importlib.metadata tests do not leak references in sys.modules.
* Move importlib.metadata tests to their own package for easier syncing with importlib_metadata.
* Update owners and makefile for new directories.
* Add blurb
Before this change, ctypes classes used a custom dict subclass, `StgDict`,
as their `tp_dict`. This acts like a regular dict but also includes extra information
about the type.
This replaces stgdict by `StgInfo`, a C struct on the type, accessed by
`PyObject_GetTypeData()` (PEP-697).
All usage of `StgDict` (mainly variables named `stgdict`, `dict`, `edict` etc.) is
converted to `StgInfo` (named `stginfo`, `info`, `einfo`, etc.).
Where the dict is actually used for class attributes (as a regular PyDict), it's now
called `attrdict`.
This change -- not overriding `tp_dict` -- is made to make me comfortable with
the next part of this PR: moving the initialization logic from `tp_new` to `tp_init`.
The `StgInfo` is set up in `__init__` of each class, with a guard that prevents
calling `__init__` more than once. Note that abstract classes (like `Array` or
`Structure`) are created using `PyType_FromMetaclass` and do not have
`__init__` called.
Previously, this was done in `__new__`, which also wasn't called for abstract
classes.
Since `__init__` can be called from Python code or skipped, there is a tested
guard to ensure `StgInfo` is initialized exactly once before it's used.
Co-authored-by: neonene <53406459+neonene@users.noreply.github.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
Starting in Python 3.12, we prevented calling fork() and starting new threads
during interpreter finalization (shutdown). This has led to a number of
regressions and flaky tests. We should not prevent starting new threads
(or `fork()`) until all non-daemon threads exit and finalization starts in
earnest.
This changes the checks to use `_PyInterpreterState_GetFinalizing(interp)`,
which is set immediately before terminating non-daemon threads.
If you catch DuplicateOptionError / DuplicateSectionError when reading a
config file (the intention is to skip invalid config files) and then
attempt to use the ConfigParser instance, any values it *had* read
successfully so far, were stored as a list instead of string! Later
`get` calls would raise "AttributeError: 'list' object has no attribute
'find'" from somewhere deep in the interpolation code.
These give applications the option of more forcefully terminating client
connections for asyncio servers. Useful when terminating a service and
there is limited time to wait for clients to finish up their work.
This is a do-over with a test fix for gh-114432, which was reverted.
This includes adding what should be a relatively temporary
`Modules/_decimal/windows/mpdecimal.h` shim to choose between `mpdecimal32vc.h`
or `mpdecimal64vc.h` based on which of `CONFIG_64` or `CONFIG_32` is defined.
* bpo-27578: Fix inspect.getsource() on empty file
For modules from empty files, `inspect.getsource()` now
returns an empty string, and `inspect.getsourcelines()` returns
a list of one empty string, fixing the expected invariant.
As indicated by `exec('')`, empty strings are valid Python
source code.
Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`
and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following execution
involving threads A, B, and C:
1. A starts.
2. B joins A, blocking on its `_tstate_lock`.
3. C joins A, blocking on its `_tstate_lock`.
4. A finishes and releases its `_tstate_lock`.
5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped
out before calling `_stop()`.
6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped
out before releasing it.
7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.
However, C holds it, so the assertion fails.
The race can be reproduced[^3] by inserting sleeps at the appropriate points in
the threading code. To do so, run the `repro_join_race.py` from the linked repo.
There are two main parts to this PR:
1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.
The event is set by the runtime prior to the thread being cleared (in the same
place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the
event to be set.
2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all
non-daemon threads to exit. To do so, an `is_daemon` predicate was added to
`PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`
now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on
`_tstate_lock`s.
[^1]: 441affc9e7/Lib/threading.py (L1201)
[^2]: 441affc9e7/Lib/threading.py (L1115)
[^3]: 8194653279
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Change automatically generated tkinter.Checkbutton widget names to
avoid collisions with automatically generated tkinter.ttk.Checkbutton
widget names within the same parent widget.
* Restore support of None and other false values.
* Raise TypeError for non-zero integers and non-empty sequences.
The regressions were introduced in gh-74668
(bdba8ef42b).
Any capitalization of "xn--" should be acceptable for the ACE prefix
(see https://tools.ietf.org/html/rfc3490#section-5).
Co-authored-by: Pepijn de Vos <pepijndevos@gmail.com>
Co-authored-by: Erlend E. Aasland <erlend@python.org>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Use the NtQueryInformationProcess system call to efficiently retrieve the parent process ID in a single step, rather than using the process snapshots API which retrieves large amounts of unnecessary information and is more prone to failure (since it makes heap allocations).
Includes a fallback to the original win32_getppid implementation in case the unstable API appears to return strange results.
On Windows, time.monotonic() now uses the QueryPerformanceCounter()
clock to have a resolution better than 1 us, instead of the
gGetTickCount64() clock which has a resolution of 15.6 ms.
* GH-116554: Relax list.sort()'s notion of "descending" run
Rewrote `count_run()` so that sub-runs of equal elements no longer end a descending run. Both ascending and descending runs can have arbitrarily many sub-runs of arbitrarily many equal elements now. This is tricky, because we only use ``<`` comparisons, so checking for equality doesn't come "for free". Surprisingly, it turned out there's a very cheap (one comparison) way to determine whether an ascending run consisted of all-equal elements. That sealed the deal.
In addition, after a descending run is reversed in-place, we now go on to see whether it can be extended by an ascending run that just happens to be adjacent. This succeeds in finding at least one additional element to append about half the time, and so appears to more than repay its cost (the savings come from getting to skip a binary search, when a short run is artificially forced to length MIINRUN later, for each new element `count_run()` can add to the initial run).
While these have been in the back of my mind for years, a question on StackOverflow pushed it to action:
https://stackoverflow.com/questions/78108792/
They were wondering why it took about 4x longer to sort a list like:
[999_999, 999_999, ..., 2, 2, 1, 1, 0, 0]
than "similar" lists. Of course that runs very much faster after this patch.
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
gh-116307: Create a new import helper 'isolated modules' and use that instead of 'Clean Import' to ensure that tests from importlib_resources don't leave modules in sys.modules.