Add socket.VMADDR_CID_LOCAL constant.
Fix ThreadedVSOCKSocketStreamTest: if get_cid() returns the host
address or the "any" address, use the local communication address
(loopback): VMADDR_CID_LOCAL.
On Linux 6.9, apparently, the /dev/vsock device is now available but
get_cid() returns VMADDR_CID_ANY (-1).
`drop_gil()` assumes that its caller is attached, which means that the current
thread holds the GIL if and only if the GIL is enabled, and the enabled-state
of the GIL won't change. This isn't true, though, because `detach_thread()`
calls `_PyEval_ReleaseLock()` after detaching and
`_PyThreadState_DeleteCurrent()` calls it after removing the current thread
from consideration for stop-the-world requests (effectively detaching it).
Fix this by remembering whether or not a thread acquired the GIL when it last
attached, in `PyThreadState._status.holds_gil`, and check this in `drop_gil()`
instead of `gil->enabled`.
This fixes a crash in `test_multiprocessing_pool_circular_import()`, so I've
reenabled it.
Track all pairs achieving the best ratio in Differ(). This repairs the "very deep recursion and cubic time" bad cases in a way that preserves previous output.
Add `Py_BEGIN_CRITICAL_SECTION_SEQUENCE_FAST` and
`Py_END_CRITICAL_SECTION_SEQUENCE_FAST` macros and update `str.join` to use
them. Also add a regression test that would crash reliably without this
patch.
As reported in #117847 and #115366, an unpaired backtick in a docstring
tends to confuse e.g. Sphinx running on subclasses of standard library
objects, and the typographic style of using a backtick as an opening
quote is no longer in favor. Convert almost all uses of the form
The variable `foo' should do xyz
to
The variable 'foo' should do xyz
and also fix up miscellaneous other unpaired backticks (extraneous /
missing characters).
No functional change is intended here other than in human-readable
docstrings.
_PyArg_Parser holds static global data generated for modules by Argument Clinic. The _PyArg_Parser.kwtuple field is a tuple object, even though it's stored within a static global. In some cases the tuple is statically allocated and thus it's okay that it gets shared by multiple interpreters. However, in other cases the tuple is set lazily, allocated from the heap using the active interprepreter at the point the tuple is needed.
This is a problem once that interpreter is destroyed since _PyArg_Parser.kwtuple becomes at dangling pointer, leading to crashes. It isn't a problem if the tuple is allocated under the main interpreter, since its lifetime is bound to the lifetime of the runtime. The solution here is to temporarily switch to the main interpreter. The alternative would be to always statically allocate the tuple.
This change also fixes a bug where only the most recent parser was added to the global linked list.
Fix regression introduced in gh-100884: AttributeError when re-fold a long
address list.
Also fix more cases of incorrect encoding of the address separator in the
address list missed in gh-100884.
* bpo-15987: Implement ast.compare
Add a compare() function that compares two ASTs for structural equality. There are two set of attributes on AST node objects, fields and attributes. The fields are always compared, since they represent the actual structure of the code. The attributes can be optionally be included in the comparison. Attributes capture things like line numbers of column offsets, so comparing them involves test whether the layout of the program text is the same. Since whitespace seems inessential for comparing ASTs, the default is to compare fields but not attributes.
ASTs are just Python objects that can be modified in arbitrary ways. The API for ASTs is under-specified in the presence of user modifications to objects. The comparison respects modifications to fields and attributes, and to _fields and _attributes attributes. A user could create obviously malformed objects, and the code will probably fail with an AttributeError when that happens. (For example, adding "spam" to _fields but not adding a "spam" attribute to the object.)
Co-authored-by: Jeremy Hylton <jeremy@alum.mit.edu>
The PEP 649 implementation will require a way to load NotImplementedError
from the bytecode. @markshannon suggested implementing this by converting
LOAD_ASSERTION_ERROR into a more general mechanism for loading constants.
This PR adds this new opcode. I will work on the rest of the implementation
of the PEP separately.
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
* expand on What's New entry for PEP 667 (including porting notes)
* define 'optimized scope' as a glossary term
* cover comprehensions and generator expressions in locals() docs
* review all mentions of "locals" in documentation (updating if needed)
* review all mentions of "f_locals" in documentation (updating if needed)
regrtest test runner: Add XML support to the refleak checker
(-R option).
* run_unittest() now stores XML elements as string, rather than
objects, in support.junit_xml_list.
* runtest_refleak() now saves/restores XML strings before/after
checking for reference leaks. Save XML into a temporary file.
* Fix for email.generator.Generator with whitespace between encoded words.
email.generator.Generator currently does not handle whitespace between
encoded words correctly when the encoded words span multiple lines. The
current generator will create an encoded word for each line. If the end
of the line happens to correspond with the end real word in the
plaintext, the generator will place an unencoded space at the start of
the subsequent lines to represent the whitespace between the plaintext
words.
A compliant decoder will strip all the whitespace from between two
encoded words which leads to missing spaces in the round-tripped
output.
The fix for this is to make sure that whitespace between two encoded
words ends up inside of one or the other of the encoded words. This
fix places the space inside of the second encoded word.
A second problem happens with continuation lines. A continuation line that
starts with whitespace and is followed by a non-encoded word is fine because
the newline between such continuation lines is defined as condensing to
a single space character. When the continuation line starts with whitespace
followed by an encoded word, however, the RFCs specify that the word is run
together with the encoded word on the previous line. This is because normal
words are filded on syntactic breaks by encoded words are not.
The solution to this is to add the whitespace to the start of the encoded word
on the continuation line.
Test cases are from #92081
* Rename a variable so it's not confused with the final variable.
Code from https://github.com/pulkin, in PR
https://github.com/python/cpython/pull/119131
Greatly speeds `Differ` when there are many identically scoring pairs, by splitting the recursion near the inputs' midpoints instead of degenerating (as now) into just peeling off the first two lines.
Co-authored-by: Tim Peters <tim.peters@gmail.com>
Various test bots (outside the ones GH normally runs) are timing out during test_int after ecd8664 (asymptotically faster str->int). Best guess is that they don't build the C _decimal module. So require that module in the most likely tests to time out then. Flying mostly blind, though!
Asymptotically faster (O(n log n)) str->int for very large strings, leveraging the faster multiplication scheme in the C-coded `_decimal` when available. This is used instead of the current Karatsuba-limited method starting at 2 million digits.
Lots of opportunity remains for fine-tuning. Good targets include changing BYTELIM, and possibly changing the internal output base (from 256 to a higher number of bytes).
Doing this was substantial work, and many of the new lines are actually comments giving correctness proofs. The obvious approaches sticking to integers were too slow to be useful, so this is doing variable-precision decimal floating-point arithmetic. Much faster, but worst-possible rounding errors have to be wholly accounted for, using as little precision as possible.
Special thanks to Serhiy Storchaka for asking many good questions in his code reviews!
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: sstandre <43125375+sstandre@users.noreply.github.com>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
Co-authored-by: Nice Zombies <nineteendo19d0@gmail.com>
Suppress all `OSError` exceptions from `pathlib.Path.exists()` and `is_*()`
rather than a selection of more common errors as we do presently. Also
adjust the implementations to call `os.path.exists()` etc, which are much
faster on Windows thanks to GH-101196.
Follow-up of gh-101693. The previous DeprecationWarning is replaced with
raising sqlite3.ProgrammingError.
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
* Fix `test_strptime` raises a DeprecationWarning
* Ignore deprecation warnings where appropriate.
* Update Lib/test/datetimetester.py
This is follow on work to silence unnecessary warnings from the test suite that changes for https://github.com/python/cpython/issues/70647 added.
The free-threaded build currently immortalizes some objects once the
first thread is started. This can lead to test failures depending on the
order in which tests are run. This PR addresses those failures by
suppressing immortalization or skipping the affected tests.
* BaseException_vectorcall() now creates a tuple from 'args' array.
* Creation an exception using BaseException_vectorcall() is now a
single function call, rather than having to call
BaseException_new() and then BaseException_init().
Calling BaseException_init() is inefficient since it overrides
the 'args' attribute.
* _PyErr_SetKeyError() now uses PyObject_CallOneArg() to create the
KeyError instance to use BaseException_vectorcall().
Unfortunately, released versions of typing_extensions
monkeypatch this function without the extra parameter, which makes
it so things break badly if current main is used with typing_extensions.
Fortunately, the monkeypatching is not needed on Python 3.13, because CPython
now implements PEP 696. By renaming the function, we prevent the monkeypatch
from breaking typing.py internals.
We keep the old name (raising a DeprecationWarning) to help other external users who call it.
Remove support for supplying additional positional arguments to
`PurePath.relative_to()` and `is_relative_to()`. This has been deprecated
since Python 3.12.
A new `compute_powers()` function computes all and only the powers of the base the various base-conversion functions need, as efficiently as reasonably possible (turns out that invoking `**`is needed at most once). This typically gives a few % speedup, but the primary point is to simplify the base-conversion functions, which no longer need their own, ad hoc, and less efficient power-caching schemes.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
The `pool_in_threads.py` test file may crash in free-threaded builds,
which can lead to the Tsan test hanging. Skip it for now until we fix
the underlying issue.
Callbacks registered in the tkinter module now take arguments as
various Python objects (int, float, bytes, tuple), not just str.
To restore the previous behavior set tkinter module global wantobject to 1
before creating the Tk object or call the wantobject() method of the Tk object
with argument 1.
Calling it with argument 2 restores the current default behavior.
Fix an edge case in `binascii.a2b_base64` strict mode, where
excessive padding was not detected when no padding is necessary.
Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
Rationale
=========
argparse performs a complex formatting of the usage for argument grouping
and for line wrapping to fit the terminal width. This formatting has been
a constant source of bugs for at least 10 years (see linked issues below)
where defensive assertion errors are triggered or brackets and paranthesis
are not properly handeled.
Problem
=======
The current implementation of argparse usage formatting relies on regular
expressions to group arguments usage only to separate them again later
with another set of regular expressions. This is a complex and error prone
approach that caused all the issues linked below. Special casing certain
argument formats has not solved the problem. The following are some of
the most common issues:
- empty `metavar`
- mutually exclusive groups with `SUPPRESS`ed arguments
- metavars with whitespace
- metavars with brackets or paranthesis
Solution
========
The following two comments summarize the solution:
- https://github.com/python/cpython/issues/82091#issuecomment-1093832187
- https://github.com/python/cpython/issues/77048#issuecomment-1093776995
Mainly, the solution is to rewrite the usage formatting to avoid the
group-then-separate approach. Instead, the usage parts are kept separate
and only joined together at the end. This allows for a much simpler
implementation that is easier to understand and maintain. It avoids the
regular expressions approach and fixes the corresponding issues.
This closes the following GitHub issues:
- #62090
- #62549
- #77048
- #82091
- #89743
- #96310
- #98666
These PRs become obsolete:
- #15372
- #96311
Add missing import to code that handles too large files and offsets.
Use list, not tuple, for a mutable sequence.
Add tests to prevent similar mistakes.
---------
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
This change makes sure all extension/builtin modules have their init function run first by the main interpreter before proceeding with import in the original interpreter (main or otherwise). This means when the import of a single-phase init module fails in an isolated subinterpreter, it won't tie any global state/callbacks to the subinterpreter.
We already intern and immortalize most string constants. In the
free-threaded build, other constants can be a source of reference count
contention because they are shared by all threads running the same code
objects.
Now, such classes will no longer require changes in Python 3.13 in the normal case.
The test suite for robotframework passes with no DeprecationWarnings under this PR.
I also added a new DeprecationWarning for the case where `_field_types` exists
but is incomplete, since that seems likely to indicate a user mistake.
Add _PyType_LookupRef and use incref before setting attribute on type
Makes setting an attribute on a class and signaling type modified atomic
Avoid adding re-entrancy exposing the type cache in an inconsistent state by decrefing after type is updated
This is an experimental feature, for internal use.
Setting tkinter._debug = True before creating the root window enables
printing every executed Tcl command (or a Tcl command equivalent to the
used Tcl C API).
This will help to convert a Tkinter example into Tcl script to check
whether the issue is caused by Tkinter or exists in the underlying Tcl/Tk
library.
* Add PhotoImage.read() to read an image from a file.
* Add PhotoImage.data() to get the image data.
* Add background and grayscale parameters to PhotoImage.write().
* Add the PhotoImage method copy_replace() to copy a region
from one image to other image, possibly with pixel zooming and/or
subsampling.
* Add from_coords parameter to PhotoImage methods copy(), zoom() and subsample().
* Add zoom and subsample parameters to PhotoImage method copy().
* Fix mangle_from_ default value in email.policy.Policy.__doc__
The docstring says it defaults to True, but it actually defaults
to False. Only the Compat32 subclass overrides that.
---------
Co-authored-by: Nikita Sobolev <mail@sobolevn.me>
For converting large ints to strings, CPython invokes a function in _pylong.py,
which uses the decimal module to implement an asymptotically waaaaay
sub-quadratic algorithm. But if the C decimal module isn't available, CPython
uses _pydecimal.py instead. Which in turn frequently does str(int). If the int
is very large, _pylong ends up doing the work, which in turn asks decimal to do
"big" arithmetic, which in turn calls str(big_int), which in turn ... it can
become infinite mutual recursion.
This change introduces a different int->str function that doesn't use decimal.
It's asymptotically worse, "Karatsuba time" instead of quadratic time, so
still a huge improvement. _pylong switches to that when the C decimal isn't
available. It is also used for not too large integers (less than 450_000 bits),
where it is faster (up to 2 times for 30_000 bits) than the asymptotically
better implementation that uses the C decimal.
Co-authored-by: Tim Peters <tim.peters@gmail.com>
* Initial stab.
* Test the tentative fix. Hangs "forever" without this change.
* Move the new test to a better spot.
* New comment to explain why _convert_to_str allows any poewr of 10.
* Fixed a comment, and fleshed out an existing test that appeared unfinished.
* Added temporary asserts. Or maybe permanent ;-)
* Update Lib/_pydecimal.py
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
* Remove the new _convert_to_str().
Serhiy and I independently concluded that exact powers of 10
aren't possible in these contexts, so just checking the
string length is sufficient.
* At least for now, add the asserts to the other block too.
* 📜🤖 Added by blurb_it.
---------
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
* Add CALL_PY_GENERAL, CALL_BOUND_METHOD_GENERAL and call CALL_NON_PY_GENERAL specializations.
* Remove CALL_PY_WITH_DEFAULTS specialization
* Use CALL_NON_PY_GENERAL in more cases when otherwise failing to specialize
Moving this code under the `pathlib` package makes it quite a lot easier
to backport in the `pathlib-abc` PyPI package. It was a bit foolish of me
to add it to `glob` in the first place.
Also add `translate()` to `__all__` in `glob`. This function is new in
3.13, so there's no NEWS needed.
This PR adds the ability to enable the GIL if it was disabled at
interpreter startup, and modifies the multi-phase module initialization
path to enable the GIL when loading a module, unless that module's spec
includes a slot indicating it can run safely without the GIL.
PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went
with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148.
A warning will be issued up to once per interpreter for the first
GIL-using module that is loaded. If `-v` is given, a shorter message
will be printed to stderr every time a GIL-using module is loaded
(including the first one that issues a warning).
This is unsupported. Note that `skip_unless_reliable_fork()` checks for
the conditions used by the decorators that were removed, along with checking
for TSAN.
The function returns `True` or `False` depending on whether the GIL is
currently enabled. In the default build, it always returns `True`
because the GIL is always enabled.
The `time.sleep()` call should happen before the GC to give the worker
threads time to clean-up their remaining references to objs.
Additionally, use `support.gc_collect()` instead of `gc.collect()`
just in case the extra GC calls matter.
Now inspect.signature() supports references to the module globals in
parameter defaults on methods in extension modules. Previously it was
only supported in functions. The workaround was to specify the fully
qualified name, including the module name.
* docs: tiny grammar change: "pointed by" -> "pointed to by"
This commit uses "file pointed to by" to replace "file pointed by" in
- doc for shutil.copytree
- docstring for shutil.copytree
- docstring _abc.PathBase.open
- docstring for pathlib.Path.open
- doc for os.copy_file_range
- doc for os.splice
The docs use "file pointed to by" more frequently than
"file pointed by". So, this commit replaces the uses of
"file pointed by" in order to make the uses consistent
through the docs.
```bash
$ grep -ri 'pointed to by' cpython/
```
yields more results than
```bash
$ grep -ri 'pointed by' cpython/
```
Separately:
There are two occurrences of "tree pointed by":
- cpython/Doc/library/xml.etree.elementtree.rst for
`xml.etree.ElementInclude.include`
- cpython/Lib/xml/etree/ElementInclude.py for `include`
For those uses of "tree pointed by", I expect "tree pointed to by"
instead. However, I found enough uses online of (a) "tree pointed by"
rather than (b) "tree pointed to by" to convince me that (a) is in
common use.
So, this commit does not replace those occurrences of "tree pointed by"
to "tree pointed to by". But I will replace them if a reviewer
believes it is correct to replace them.
* docs: typo: "exists and executable" -> "exists and is executable"
---------
Co-authored-by: Andrew-Zipperer <atzipperer@gmail.com>
Free-threaded builds can intermittently tickle a longstanding bug (24 years!)
in the implementation of `threading.Condition`, leading to flakiness in the
test suite. Fixing the underlying issue will require more discussion, and will
likely apply to most of the concurrency primitives in the `threading` module
that are written in Python. See gh-118433 for more details.
Add "Raw" variant of PyTime functions:
* PyTime_MonotonicRaw()
* PyTime_PerfCounterRaw()
* PyTime_TimeRaw()
Changes:
* Add documentation and tests. Tests release the GIL while calling
raw clock functions.
* py_get_system_clock() and py_get_monotonic_clock() now check that
the GIL is hold by the caller if raise_exc is non-zero.
* Reimplement "Unchecked" functions with raw clock functions.
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Account for `add_stopiteration_handler` pushing a block for `async with`.
To allow generator functions that previously almost hit the `CO_MAXBLOCKS`
limit by nesting non-async blocks, the limit is increased by 1.
This increase allows one more block in non-generator functions.
Adds a test that length-1 tuple-style sequence patterns must end in a comma, since there isn't currently one.
Spotted while reviewing Cython's proposed implementation of the pattern matching syntax (https://github.com/cython/cython/pull/4897#discussion_r1489177169) where there was a bug my the reimplementation that wasn't caught against the CPython tests here.
The code for Tier 2 is now only compiled when configured
with `--enable-experimental-jit[=yes|interpreter]`.
We drop support for `PYTHON_UOPS` and -`Xuops`,
but you can disable the interpreter or JIT
at runtime by setting `PYTHON_JIT=0`.
You can also build it without enabling it by default
using `--enable-experimental-jit=yes-off`;
enable with `PYTHON_JIT=1`.
On Windows, the `build.bat` script supports
`--experimental-jit`, `--experimental-jit-off`,
`--experimental-interpreter`.
In the C code, `_Py_JIT` is defined as before
when the JIT is enabled; the new variable
`_Py_TIER2` is defined when the JIT *or* the
interpreter is enabled. It is actually a bitmask:
1: JIT; 2: default-off; 4: interpreter.
Add tests for "import", pkgutil.resolve_name() and unittest.mock.path()
for cases when "import a.b as x" and "from a import b as x" give
different results.
Deferred reference counting is not fully implemented yet. As a temporary
measure, we immortalize objects that would use deferred reference
counting to avoid multi-threaded scaling bottlenecks.
This is only performed in the free-threaded build once the first
non-main thread is started. Additionally, some tests, including refleak
tests, suppress this behavior.
It's not safe to raise an exception in `PyObject_ClearWeakRefs()` if one
is not already set, since it may be called by `_Py_Dealloc()`, which
requires that the active exception does not change.
Additionally, make sure we clear the weakrefs even when tuple allocation
fails.
* Allow to specify the signature of custom callable instances of extension
type by the __text_signature__ attribute.
* Specify signatures of operator.attrgetter, operator.itemgetter, and
operator.methodcaller instances.
This is an improvement over the status quo, reducing the likelihood of completely filling the pending calls queue. However, the problem won't go away completely unless we move to an unbounded linked list or add a mechanism for waiting until the queue isn't full.
While properties like IPv6Address.is_private account for IPv4-mapped
IPv6 addresses, such as for example:
>>> ipaddress.ip_address("192.168.0.1").is_private
True
>>> ipaddress.ip_address("::ffff:192.168.0.1").is_private
True
...the same doesn't currently apply to the is_loopback property:
>>> ipaddress.ip_address("127.0.0.1").is_loopback
True
>>> ipaddress.ip_address("::ffff:127.0.0.1").is_loopback
False
At minimum, this inconsistency between different properties is
counter-intuitive. Moreover, ::ffff:127.0.0.0/104 is for all intents and
purposes a loopback address, and should be treated as such.
sqlite3.iterdump() depends on the row factory returning resulting rows
as tuples; it will fail with custom row factories like for example a
dict factory.
With this commit, we explicitly reset the row factory of the cursor used
by iterdump(), so we always get predictable results. This does not
affect the row factory of the parent connection.
Co-authored-by: Mariusz Felisiak <felisiak.mariusz@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
* Add name and mode attributes for compressed and archived file-like objects
in modules bz2, lzma, tarfile and zipfile.
* Change the value of the mode attribute of GzipFile from integer (1 or 2)
to string ('rb' or 'wb').
* Change the value of the mode attribute of ZipExtFile from 'r' to 'rb'.
The second item in the tuple returned from `__reduce__()` is a tuple of arguments to supply to path constructor. Previously we returned the `parts` tuple here, which entailed joining, parsing and normalising the path object, and produced a compact pickle representation.
With this patch, we instead return a tuple of paths that were originally given to the path constructor. This makes pickling much faster (at the expense of compactness).
It's worth noting that, in the olden times, pathlib performed this parsing/normalization up-front in every case, and so using `parts` for pickling was almost free. Nowadays pathlib only parses/normalises paths when it's necessary or advantageous to do so (e.g. computing a path parent, or iterating over a directory, respectively).
Makes sys.settrace, sys.setprofile, and monitoring generally thread-safe.
Mostly uses a stop-the-world approach and synchronization around the code object's _co_instrumentation_version. There may be a little bit of extra synchronization around the monitoring data that's required to be TSAN clean.
Before this PR tests decorated with a `requires_singlephase_init` helper
did not run because of an incorrect call to the `requires_gil_enabled`
helper.
Tarfile.addfile now throws an ValueError when the user passes
in a non-zero size tarinfo but does not provide a fileobj,
instead of writing an incomplete entry.
The implementation uses 'ptr' for the name of the first parameter of
ctypes.string_at() and ctypes.wstring_at(). Align docs and docstrings
with the naming used in the implementation.
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
The smtp test server can be set via CPYTHON_TEST_SMTP_SERVER environment variable.
If not set, it uses the default value smtp.gmail.com
This is needed because the network I'm on filters access to
smtp.gmail.com resulting in a failing test.
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
Only treat '\n', '\r' and '\r\n' as line separators in re-folding the email
messages. Preserve control characters '\v', '\f', '\x1c', '\x1d' and '\x1e'
and Unicode line separators '\x85', '\u2028' and '\u2029' as is.
Remove unreliable tests on huge memory allocations:
* Remove test_maxcontext_exact_arith() of test_decimal.
Stefan Krah, test author, agreed on removing the test:
https://github.com/python/cpython/issues/114331#issuecomment-1925731273
* Remove test_constructor() tests of test_io.
Sam Gross suggests remove them:
https://github.com/python/cpython/pull/117809#pullrequestreview-2003889558
On Linux, depending how overcommit is configured, especially on Linux
s390x, a huge memory allocation (half or more of the full address
space) can succeed, but then the process will eat the full system
swap and make the system slower and slower until the whole system
becomes unusable.
Moreover, these tests had to be skipped when Python is built with
sanitizers.
We want code objects to use deferred reference counting in the
free-threaded build. This requires them to be tracked by the GC, so we
set `Py_TPFLAGS_HAVE_GC` in the free-threaded build, but not the default
build.
gh-117662 introduced some refleaks, or, rather, exposed some existing refleaks. The leaks are coming when test.support.os_helper is imported in a "legacy" interpreter. I've updated test.test_interpreters.utils to avoid importing os_helper, which fixes the leaks. I'll address the root cause separately.
Check `my_object_collected.wait()` in a loop to give the main thread a
chance to merge the reference count fields. Additionally, call
`my_object_collected.set()` in a background thread to avoid deadlocking
when the destructor is called asynchronously via the eval breaker
within the body of of `my_object_collected.wait()`.
Additionally, reduce the iterations for a few weakref tests that would
otherwise take a prohibitively long amount of time (> 1 hour) when TSAN
is enabled and the GIL is disabled.
Since 6258844c, paths that might not exist can be fed into pathlib's
globbing implementation, which will call `os.scandir()` / `os.lstat()` only
when strictly necessary. This allows us to drop an initial `self.is_dir()`
call, which saves a `stat()`.
Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
rfc9110 obsoletes the earlier rfc 7231. This document also includes some
status codes that were previously only used for WebDAV and assigns more
generic names to these status codes.
ref: https://www.rfc-editor.org/rfc/rfc9110.html#name-changes-from-rfc-7231
- http.HTTPStatus.CONTENT_TOO_LARGE (413, previously
REQUEST_ENTITY_TOO_LARGE)
- http.HTTPStatus.URI_TOO_LONG (414, previously REQUEST_URI_TOO_LONG)
- http.HTTPStatus.RANGE_NOT_SATISFYABLE (416, previously
REQUEST_RANGE_NOT_SATISFYABLE)
- http.HTTPStatus.UNPROCESSABLE_CONTENT (422, previously
UNPROCESSABLE_ENTITY)
The new constants are added to http.HTTPStatus and the old constant names are
preserved for backwards compatibility.
References in documentation to the obsoleted rfc 7231 are updated
Replace use of `os.listdir()` with `os.scandir()`. Forgo setting `_drv`,
`_root` and `_tail_cached`, as these usually aren't needed. Use
`os.DirEntry.path` to set `_str`.
Don't bother calling `os.scandir()` to scan for literal pattern segments,
like `foo` in `foo/*.py`. Instead, append the segment(s) as-is and call
through to the next selector with `exists=False`, which signals that the
path might not exist. Subsequent selectors will call `os.scandir()` or
`os.lstat()` to filter out missing paths as needed.
Test signatures of all public builtins and methods of builtin classes
in modules builtins, types, sys, and several other modules (either
included in the list of standard builtin modules sys.builtin_module_names,
or providing a public interface for such modules).
Most builtins should have supported signatures, with few known exceptions.
When more builtins will be converted to Argument Clinic or support of
new signatures be implemented, they will be removed from the exception
lists.
This is similar to the situation with threading._DummyThread. The methods (incl. __del__()) of interpreters.Interpreter objects must be careful with interpreters not created by interpreters.create(). The simplest thing to start with is to disable any method that modifies or runs in the interpreter. As part of this, the runtime keeps track of where an interpreter was created. We also handle interpreter "refcounts" properly.
The free-threaded build does not currently support the combination of
single-phase init modules and non-isolated subinterpreters. Ensure that
`check_multi_interp_extensions` is always `True` for subinterpreters in
the free-threaded build so that importing these modules raises an
`ImportError`.
gh-16429 introduced support for an iterable of separators in
Stream.readuntil. Since bytes-like types are themselves iterable, this
can introduce ambiguities in deciding whether the argument is an
iterator of separators or a singleton separator. In gh-16429, only 'bytes'
was considered a singleton, but this will break code that passes other
buffer object types.
Fix it by only supporting tuples rather than arbitrary iterables.
Closes gh-117722.
Fall back to tp_call() for cases when arguments are passed by name.
Co-authored-by: Donghee Na <donghee.na@python.org>
Co-authored-by: Victor Stinner <vstinner@python.org>
* Move ifndef_symbols, includes and add_include() from Clinic to
Codegen. Add a 'codegen' (Codegen) attribute to Clinic.
* Remove libclinic.crenderdata module: move code to libclinic.codegen.
* BlockPrinter.print_block(): remove unused 'limited_capi' argument.
Remove also 'core_includes' parameter.
* Add get_includes() methods.
* Make Codegen.ifndef_symbols private.
* Make Codegen.includes private.
* Make CConverter.includes private.
It's possible to build Python with option `--with-wheel-pkg-dir`
pointing to a custom wheel directory. Don't include the directory in the test
set if the wheels are used from a different location.
Co-authored-by: Miro Hrončok <miro@hroncok.cz>
Move `pathlib.Path.walk()` implementation into `glob._Globber`. The new
`glob._Globber.walk()` classmethod works with strings internally, which is
a little faster than generating `Path` objects and keeping them normalized.
The `pathlib.Path.walk()` method converts the strings back to path objects.
In the private pathlib ABCs, our existing subclass of `_Globber` ensures
that `PathBase` instances are used throughout.
Follow-up to #117589.
Move pathlib globbing implementation into a new private class: `glob._Globber`. This class implements fast string-based globbing. It's called by `pathlib.Path.glob()`, which then converts strings back to path objects.
In the private pathlib ABCs, add a `pathlib._abc.Globber` subclass that works with `PathBase` objects rather than strings, and calls user-defined path methods like `PathBase.stat()` rather than `os.stat()`.
This sets the stage for two more improvements:
- GH-115060: Query non-wildcard segments with `lstat()`
- GH-116380: Unify `pathlib` and `glob` implementations of globbing.
No change to the implementations of `glob.glob()` and `glob.iglob()`.
Moves the validation for invalid years in the C implementation of the `datetime` module into a common location between `fromisoformat` and `fromisocalendar`, which improves the error message and fixes a failed assertion when parsing invalid ISO 8601 years using one of the "ISO weeks" formats.
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
This prevents external cancellations of a task group's parent task to
be dropped when an internal cancellation happens at the same time.
Also strengthen the semantics of uncancel() to clear self._must_cancel
when the cancellation count reaches zero.
Co-Authored-By: Tin Tvrtković <tinchester@gmail.com>
Co-Authored-By: Arthur Tacca
Most mutable data is protected by a striped lock that is keyed on the
referenced object's address. The weakref's hash is protected using the
weakref's per-object lock.
Note that this only affects free-threaded builds. Apart from some minor
refactoring, the added code is all either gated by `ifdef`s or is a no-op
(e.g. `Py_BEGIN_CRITICAL_SECTION`).
The worker thread may still be alive after it enqueues it's last result,
which can lead to a delay of 30 seconds after the test finishes. This
happens much more frequently in the free-threaded build with the GIL
disabled.
This changes run_workers.py to track of live workers by enqueueing a
`WorkerExited()` instance before the worker exits.
The test suite fetches the C recursion limit from the _testcapi
extension module. Test extension modules can be disabled using the
--disable-test-modules configure option.
- re-enable test_fcntl_64_bit on Linux aarch64, but disable it on all
Android ABIs
- use support.setswitchinterval in all relevant tests
- skip test_fma_zero_result on Android x86_64
- accept EACCES when calling os.get_terminal_size on Android
Replace tri-state `follow_symlinks` with boolean `recurse_symlinks` argument. The new argument controls whether symlinks are followed when expanding recursive `**` wildcards. The possible argument values correspond as follows:
follow_symlinks recurse_symlinks
=============== ================
False N/A
None False
True True
We therefore drop support for not following symlinks when expanding non-recursive pattern parts; it wasn't requested in the original issue, and it's a feature not found in any shells.
This makes the API a easier to grok by eliminating `None` as an option.
No news blurb as `follow_symlinks` was new in 3.13.
gh-116609: Ignore UTF-16 BOM in importlib.resources._functional tests
To test the `errors` argument, we read a UTF-16 file as UTF-8
with "backslashreplace" error handling. However, the utf-16
codec adds an endian-specific byte-order mark, so on big-endian
machines the expectation doesn't match the test file (which was
saved on a little-endian machine).
Use endswith to ignore the BOM.
Apply the following optimizations to `posixpath.realpath()`:
- Remove use of recursion
- Construct child paths directly rather than using `join()`
- Use `os.getcwd[b]()` rather than `abspath()`
- Use `startswith(sep)` rather than `isabs()`
- Use slicing rather than `split()`
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Introduce a unified 16-bit backoff counter type (``_Py_BackoffCounter``),
shared between the Tier 1 adaptive specializer and the Tier 2 optimizer. The
API used for adaptive specialization counters is changed but the behavior is
(supposed to be) identical.
The behavior of the Tier 2 counters is changed:
- There are no longer dynamic thresholds (we never varied these).
- All counters now use the same exponential backoff.
- The counter for ``JUMP_BACKWARD`` starts counting down from 16.
- The ``temperature`` in side exits starts counting down from 64.
Add libclinic.clanguage module and move the following classes and
functions there:
* CLanguage
* declare_parser()
Add libclinic.codegen and move the following classes there:
* BlockPrinter
* BufferSeries
* Destination
Move the following functions to libclinic.function:
* permute_left_option_groups()
* permute_optional_groups()
* permute_right_option_groups()
This merges all `_CHECK_STACK_SPACE` uops in a trace into a single `_CHECK_STACK_SPACE_OPERAND` uop that checks whether there is enough stack space for all calls included in the entire trace.
I had meant to switch everything to InterpreterError when I added it a while back. At the time I missed a few key spots.
As part of this, I've added print-the-exception to _PyXI_InitTypes() and fixed an error case in `_PyStaticType_InitBuiltin().
This change gives a significant speedup, as the METH_FASTCALL calling
convention is now used. The following methods are adapted:
- str.count
- str.find
- str.index
- str.rfind
- str.rindex
Use the fully qualified type name in repr() of weakref.ref and
weakref.proxy types.
Fix a crash in proxy_repr() when the reference is dead.
Add also test_ref_repr() and test_proxy_repr().
These helpers make it easier to customize and inspect the config used to initialize interpreters. This is especially valuable in our tests. I found inspiration from the PyConfig API for the PyInterpreterConfig dict conversion stuff. As part of this PR I've also added a bunch of tests.
* as_completed returns object that is both iterator and async iterator
* Existing tests adjusted to test both the old and new style
* New test to ensure iterator can be resumed
* New test to ensure async iterator yields any passed-in Futures as-is
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>
This just documents the parameter that already exists.
---------
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
* Extract method for _read_inner, reducing complexity and indentation by 1.
* Extract method for _raise_all and yield ParseErrors from _read_inner.
Reduces complexity by 1 and reduces touch points for handling errors in _read_inner.
* Prefer iterators to splat expansion and literal indexing.
* Extract method for _strip_comments. Reduces complexity by 7.
* Model the file lines in a class to encapsulate the comment status and cleaned value.
* Encapsulate the read state as a dataclass
* Extract _handle_continuation_line and _handle_rest methods. Reduces complexity by 8.
* Reindent
* At least for now, collect errors in the ReadState
* Check for missing section header separately.
* Extract methods for _handle_header and _handle_option. Reduces complexity by 6.
* Remove unreachable code. Reduces complexity by 4.
* Remove unreachable branch
* Handle error condition early. Reduces complexity by 1.
* Add blurb
* Move _raise_all to ParsingError, as its behavior is most closely related to the exception class and not the reader.
* Split _strip* into separate methods.
* Refactor _strip_full to compute the strip just once and use 'not any' to determine the factor.
* Replace use of 'sys.maxsize' with direct computation of the stripped value.
* Extract has_comments as a dynamic property.
* Implement clean as a cached property.
* Model comment prefixes in the RawConfigParser within a prefixes namespace.
* Use a regular expression to search for the first match.
Avoids mutating variables and tricky logic and over-computing all of the starts when only the first is relevant.
This adds a stop the world pause to make the two functions thread-safe
when the GIL is disabled in the free-threaded build.
Additionally, the main test thread may call `sys._current_exceptions()` as
soon as `g_raised.set()` is called. The background thread may not yet reach
the `leave_g.wait()` line.
The tests are not reliable with the GIL disabled. In theory, they can
fail with the GIL enabled too, but the failures are much more likely
with the GIL disabled.
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Malcolm Smith <smith@chaquo.com>
Co-authored-by: Ned Deily <nad@python.org>
* Reads zip64 files as produced by the zipfile module
* Include tests (somewhat slow, however, because of the need to create "large" zips)
* About the same amount of strictness reading invalid zip files as zipfile has
* Still works on files with prepended data (like pex)
There are a lot more test cases at https://github.com/thatch/zipimport64/ that give me confidence that this works for real-world files.
Fixes#89739 and #77140.
---------
Co-authored-by: Itamar Ostricher <itamarost@gmail.com>
Reviewed-by: Gregory P. Smith <greg@krypto.org>
* Add a new create_parser_namespace() function for
PythonParser to pass objects to executed code.
* In run_clinic(), list converters using 'converters' and
'return_converters' dictionarties.
* test_clinic: add 'object()' return converter.
* Use also create_parser_namespace() in eval_ast_expr().
Co-authored-by: Erlend E. Aasland <erlend@python.org>
Explicitly handle the case where stdout=STDOUT
as otherwise the existing error handling gets
confused and reports hard to understand errors.
Signed-off-by: Paulo Neves <ptsneves@gmail.com>
Fix parsing of the following corner cases:
* URLs with only a host name
* URLs containing a fragment
* URLs containing a query
* filenames with only a UNC sharepoint on Windows
Co-authored-by: Dong-hee Na <donghee.na92@gmail.com>
Change old space bit of young objects from 0 to gcstate->visited_space.
This ensures that any object created *and* collected during cycle GC has the bit set correctly.
Python 3.10 changed from using SSL_write() and SSL_read() to SSL_write_ex() and
SSL_read_ex(), but did not update handling of the return value.
Change error handling so that the return value is not examined.
OSError (not EOF) is now returned when retval is 0.
According to *recent* man pages of all functions for which we call
PySSL_SetError, (in OpenSSL 3.0 and 1.1.1), their return value should
be used to determine whether an error happened (i.e. if PySSL_SetError
should be called), but not what kind of error happened (so,
PySSL_SetError shouldn't need retval). To get the error,
we need to use SSL_get_error.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
This fixes XML unittest fallout from the https://github.com/python/cpython/issues/115398 security fix. When configured using `--with-system-expat` on systems with older pre 2.6.0 versions of libexpat, our unittests were failing.
* sax|etree: Simplify Expat version guard where simplifiable
Idea by Matěj Cepl
* sax|etree: Fix reparse deferral tests for vanilla Expat <2.6.0
This *does not fix* the case of distros with an older version of libexpat with the 2.6.0 feature backported as a security fix. (Ubuntu is a known example of this with its libexpat1 2.5.0-2ubunutu0.1 package)
Instead of calling `exec()` once for each function added to a dataclass, only call `exec()` once per dataclass. This can lead to speed improvements of up to 20%.
pythongh-112571: allow using fish venv activation script on windows
The fish shell can be used on windows under cygwin or msys2.
This change moves the script to the common folder
so the venv module will install it on both posix and nt systems (like the bash script).