A regression would still absolutely fail and even a flaky pass isn't
harmful as it'd fail most of the time across our N system test runs.
Windows has a low resolution timer and CI systems are prone to odd
timing so this just gives more leeway to avoid flakiness.
This makes tokenizer.c:valid_utf8 match stringlib/codecs.h:decode_utf8.
It also fixes an off-by-one error introduced in 3.10 for the line number when the tokenizer reports bad UTF8.
This removes the unused `name` variable in the block where `ArgumentTypeError` is handled.
`ArgumentTypeError` errors are handled by showing just the string of the exception; unlike `ValueError`, the name (`__name__`) of the function is not included in the error message.
Fixes#96548
This doesn't happen naturally, but is allowed by the ASDL and compiler.
We don't want to change ASDL for backward compatibility reasons
(#57645, #92987)
Converting a large enough `int` to a decimal string raises `ValueError` as expected. However, the raise comes _after_ the quadratic-time base-conversion algorithm has run to completion. For effective DOS prevention, we need some kind of check before entering the quadratic-time loop. Oops! =)
The quick fix: essentially we catch _most_ values that exceed the threshold up front. Those that slip through will still be on the small side (read: sufficiently fast), and will get caught by the existing check so that the limit remains exact.
The justification for the current check. The C code check is:
```c
max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10
```
In GitHub markdown math-speak, writing $M$ for `max_str_digits`, $L$ for `PyLong_SHIFT` and $s$ for `size_a`, that check is:
$$\left\lfloor\frac{M}{3L}\right\rfloor \le \left\lfloor\frac{s - 11}{10}\right\rfloor$$
From this it follows that
$$\frac{M}{3L} < \frac{s-1}{10}$$
hence that
$$\frac{L(s-1)}{M} > \frac{10}{3} > \log_2(10).$$
So
$$2^{L(s-1)} > 10^M.$$
But our input integer $a$ satisfies $|a| \ge 2^{L(s-1)}$, so $|a|$ is larger than $10^M$. This shows that we don't accidentally capture anything _below_ the intended limit in the check.
<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
* gh-68163: Correct conversion of Rational instances to float
Also document that numerator/denominator properties are instances of Integral.
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
Link to #93884
* Test with some large negative and positive values(out of range of a longlong,i.e.[-2\*\*63, 2\*\*63-1])
* Test with objects of non-int type
Automerge-Triggered-By: GH:mdickinson
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: Martin Panter <vadmium@users.noreply.github.com>
Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
Integer to and from text conversions via CPython's bignum `int` type is not safe against denial of service attacks due to malicious input. Very large input strings with hundred thousands of digits can consume several CPU seconds.
This PR comes fresh from a pile of work done in our private PSRT security response team repo.
Signed-off-by: Christian Heimes [Red Hat] <christian@python.org>
Tons-of-polishing-up-by: Gregory P. Smith [Google] <greg@krypto.org>
Reviews via the private PSRT repo via many others (see the NEWS entry in the PR).
<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->
I wrote up [a one pager for the release managers](https://docs.google.com/document/d/1KjuF_aXlzPUxTK4BMgezGJ2Pn7uevfX7g0_mvgHlL7Y/edit#). Much of that text wound up in the Issue. Backports PRs already exist. See the issue for links.
* gh-96132: Add some comments and minor fixes missed in the original PR
* Update Doc/using/cmdline.rst
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
⚠️⚠️ Note for reviewers, hackers and fellow systems/low-level/compiler engineers ⚠️⚠️
If you have a lot of experience with this kind of shenanigans and want to improve the **first** version, **please make a PR against my branch** or **reach out by email** or **suggest code changes directly on GitHub**.
If you have any **refinements or optimizations** please, wait until the first version is merged before starting hacking or proposing those so we can keep this PR productive.
datetime.isoformat generates the tzoffset with colons, but there
was no format code to make strftime output the same format.
for simplicity and consistency the %:z formatting behaves mostly
as %z, with the exception of adding colons. this includes the
dynamic behaviour of adding seconds and microseconds only when
needed (when not 0).
this fixes the still open "generate" part of this issue:
https://github.com/python/cpython/issues/69142
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
* inspect.getsource: avoid stat on file in linecache
The check for os.path.exists() on source file is postponed in
inspect.getsourcefile() until needed avoiding an expensive filesystem
stat call and PEP 302 module loader check is moved last for performance
since it is an uncommon case.
Editors don't agree that `test_source_encoding.py` was valid koi8-r, making it
hard to edit that file without the editor breaking it in some way (see gh-96272).
Only one test actually relied on the koi8-r encoding and it was a duplicate of a
test from the deprecated `imp` module's `test_imp`, so here we replace
`test_pep263` with `test_import_encoded_module` stolen from `test_imp` and
set `test_source_encoding.py`'s encoding to utf-8 to make editing it easier
going forward.
find_unused_port() has an inherent race condition, but we can't use
bind_port() as that uses .getsockname() which this test is exercising.
Try binding to unused ports a few times before failing.
Signed-off-by: Ross Burton <ross.burton@arm.com>
* gh-93503: Add APIs to set profiling and tracing functions in all threads in the C-API
* Use a separate API
* Fix NEWS entry
* Add locks around the loop
* Document ignoring exceptions
* Use the new APIs in the sys module
* Update docs
Tests for IsolatedAsyncioTestCase.debug() rely on the runner be closed
in __del__. It makes tests depending on the GC an unreliable on other
implementations. It is better to close the runner explicitly even if
currently there is no a public API for this.
The clearing of the temporary directory is not working on some platforms and
leaving behind files.
This has been updated to use the pattern in test_cmd_line.py [1] using the
special TESTFN rather than a test directory.
[1] https://github.com/python/cpython/blob/main/Lib/test/test_cmd_line.py#L559
* Treat tp_weakref and tp_dictoffset like other opaque slots for multiple inheritance.
* Document Py_TPFLAGS_MANAGED_DICT and Py_TPFLAGS_MANAGED_WEAKREF in what's new.
- Limited API needs to be enabled per source file
- Some builds don't support Limited API, so Limited API tests must be skipped on those builds
(currently this is `Py_TRACE_REFS`, but that may change.)
- `Py_LIMITED_API` must be defined before `<Python.h>` is included.
This puts the hoop-jumping in `testcapi/parts.h`, so individual
test files can be relatively simple. (Currently that's only
`vectorcall_limited.c`, imagine more.)
When a task catches CancelledError and raises some other error,
the other error should not silently be suppressed.
Any scenario where a task crashes in cleanup upon cancellation
will now result in an ExceptionGroup wrapping the crash(es)
instead of propagating CancelledError and ignoring the side errors.
NOTE: This represents a change in behavior (hence the need to
change several tests). But it is only an edge case.
Co-authored-by: Thomas Grainger <tagrain@gmail.com>
When loading a source file from disk, there is a separate UTF-8 validator
distinct from the one in `unicode_decode_utf8`. This exercises that code path
with the same set of invalid inputs as we use for testing the "other" UTF-8
decoder.
If kernel fips is enabled, we get permission error upon doing
`import crypt`. So, if kernel fips is enabled, disable the
unallowed hashing methods.
Python 3.9.1 (default, May 10 2022, 11:36:26)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import crypt
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.9/crypt.py", line 117, in <module>
_add_method('MD5', '1', 8, 34)
File "/usr/lib/python3.9/crypt.py", line 94, in _add_method
result = crypt('', salt)
File "/usr/lib/python3.9/crypt.py", line 82, in crypt
return _crypt.crypt(word, salt)
PermissionError: [Errno 1] Operation not permitted
Signed-off-by: Shreenidhi Shedi <sshedi@vmware.com>
* Make sure that tp_dictoffset is correct with Py_TPFLAGS_MANAGED_DICT is set.
* Avoid traversing managed dict twice when subclassing class with Py_TPFLAGS_MANAGED_DICT set.
Automate WASM build with a new Python script. The script provides
several build profiles with configure flags for Emscripten flavors
and WASI. The script can detect and use Emscripten SDK and WASI SDK from
default locations or env vars.
``configure`` now detects Node arguments and creates HOSTRUNNER
arguments for Node 16. It also sets correct arguments for
``wasm64-emscripten``.
Co-authored-by: Brett Cannon <brett@python.org>
Have pathlib use `os.path.join()` to join arguments to the `PurePath` initialiser, which fixes a minor bug when handling relative paths with drives.
Previously:
```python
>>> from pathlib import PureWindowsPath
>>> a = 'C:/a/b'
>>> b = 'C:x/y'
>>> PureWindowsPath(a, b)
PureWindowsPath('C:x/y')
```
Now:
```python
>>> PureWindowsPath(a, b)
PureWindowsPath('C:/a/b/x/y')
```
We only statically initialize for core code and builtin modules. Extension modules still create
the tuple at runtime. We'll solve that part of interpreter isolation separately.
This change includes generated code. The non-generated changes are in:
* Tools/clinic/clinic.py
* Python/getargs.c
* Include/cpython/modsupport.h
* Makefile.pre.in (re-generate global strings after running clinic)
* very minor tweaks to Modules/_codecsmodule.c and Python/Python-tokenize.c
All other changes are generated code (clinic, global strings).
#91242 replaced the Windows chm help file with a copy
of the html docs. This PR replaces the IDLE code that
fetches the Windows local help url passed to os.startfile.
Co-authored-by: Steve Dower
Under certain build conditions, test_check_c_globals fails. This fix takes the same approach as we took for gh-84236 (via gh-20095). We'll be removing use of distutils in the c-analyzer at some point. Until then we'll hide the warning filter.
An unrecognized format character in PyUnicode_FromFormat() and
PyUnicode_FromFormatV() now sets a SystemError.
In previous versions it caused all the rest of the format string to be
copied as-is to the result string, and any extra arguments discarded.
gh-93243
This PR is required to reduce diffs of the following porting (no need to either maintain documentation and tests consistent with each porting step, or try to port everything and remove smtpd in a single PR).
Automerge-Triggered-By: GH:warsaw
Have `pathlib.WindowsPath.is_mount()` call `ntpath.ismount()`. Previously it raised `NotImplementedError` unconditionally.
https://bugs.python.org/issue42777
Adds a regression test for an re slowdown observed by rjsmin.
Uses multiprocessing to kill the test after SHORT_TIMEOUT.
Co-authored-by: Oleg Iarygin <dralife@yandex.ru>
Co-authored-by: Christian Heimes <christian@python.org>
* Add test for inheriting explicit __dict__ and weakref.
* Restore 3.10 behavior for multiple inheritance of C extension classes that store their dictionary at the end of the struct.
The API documentation for [findtext](https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element.findtext) states that this function gives back an empty string on "no text content." With the previous implementation, this would give back a empty string even on text content values such as 0 or False. This patch attempts to resolve that by only giving back an empty string if the text attribute is set to `None`. Resolves#91447.
Automerge-Triggered-By: GH:gvanrossum
If one selects whole lines, as the sidebar makes easy, do not
add an extra line. Only move the end of a selection to the
beginning of the next line when not already at the beginning
of a line. (Also improve the surrounding code.)
Move `Select All` above `Cut` as it is used with `Cut` and `Copy` but not `Paste`. Add a separator between `Replace` and `Go to Line` to separate items that belong to the 'Edit-find' (above) and 'Edit-show' (below) IDLE github project topics.
When keyword argument name is an instance of a str subclass with
overloaded methods __eq__ and __hash__, the former code could not find
the name of an extraneous keyword argument to report an error, and
_PyArg_UnpackKeywords() returned success without setting the
corresponding cell in the linearized arguments array. But since the number
of expected initialized cells is determined as the total number of passed
arguments, this lead to reading NULL as a keyword parameter value, that
caused SystemError or crash or other undesired behavior.
This is the last precursor to storing tp_subclasses (and tp_weaklist) on the interpreter state for static builtin types.
Here we add per-type storage on PyInterpreterState, but only for the static builtin types. This involves the following:
* add PyInterpreterState.types
* move PyInterpreterState.type_cache to it
* add a "num_builtins_initialized" field
* add a "builtins" field (a static array big enough for all the static builtin types)
* add _PyStaticType_GetState() to look up a static builtin type's state
* (temporarily) add PyTypeObject.tp_static_builtin_index (to hold the type's index into PyInterpreterState.types.builtins)
We will be eliminating tp_static_builtin_index in a later change.
Previously, when using `functools.wrap` around them (and inherit their docstrings), sphinx renders the docstrings badly and raises warnings about wrong indent.
* gh-95218: Move tests for importlib.resources into test_importlib.resources.
* Also update makefile
* Include test_importlib/resources in code ownership rule.
This PR partially reverts gh-24421 (PR) and fixes the remaining concerns
given in gh-93044 (issue):
- keyword arguments are passed as positional arguments to factory()
- if an argument is not passed to sqlite3.connect(), its default value
is passed to factory()
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
This appears to be a typo. It causes try_protocol_combo to try to turn
on SSL 3.0 when testing PROTOCOL_SSLv23 (aka PROTOCOL_TLS), which
doesn't make any sense. Fix it to be PROTOCOL_SSLv3.
Without this, try_protocol_combo is actually setting
context.minimum_version to SSLv3 when called as
try_protocol_combo(ssl.PROTOCOL_TLS, ssl.PROTOCOL_TLS, True)
One would think this causes a no-ssl3 OpenSSL build to fail, but OpenSSL
forgot to make SSL_CTX_set_min_proto_version(SSL3_VERSION) does not
notice no-ssl3, so this typo has gone undetected. But we should still
fix the typo because, presumably, a future version of OpenSSL will
remove SSL 3.0 and do so more thoroughly, at which point this will
break.
* gh-94949: Disallow parsing parenthesised ctx manager with old feature_version
* 📜🤖 Added by blurb_it.
* Allow it with feature_version=(3, 9) as well
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Support for bytes broke sometime between Python 3.2 and 3.6 and has been broken ever since. Trying to bring back supports is surprisingly difficult in the face of -b and checking for keys in sys.path_importer_cache. Since the support was broken for so long, trying to overcome the difficulty of bringing back the support has been deemed not worth it.
Co-authored-by: Eryk Sun <eryksun@gmail.com>
Co-authored-by: Brett Cannon <brett@python.org>
The case where there are more than (1 << 15) lines was not covered.
I don't know if increasing test coverage requires a blurb -- let me know if it does.
Automerge-Triggered-By: GH:brandtbucher
When binding a unix socket to an empty address on Linux, the socket is
automatically bound to an available address in the abstract namespace.
>>> s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
>>> s.bind("")
>>> s.getsockname()
b'\x0075499'
Since python 3.9, the socket is bound to the one address:
>>> s.getsockname()
b'\x00'
And trying to bind multiple sockets will fail with:
Traceback (most recent call last):
File "/home/nsoffer/src/cpython/Lib/test/test_socket.py", line 5553, in testAutobind
s2.bind("")
OSError: [Errno 98] Address already in use
Added 2 tests:
- Auto binding empty address on Linux
- Failing to bind an empty address on other platforms
Fixes f6b3a07b7d (bpo-44493: Add missing terminated NUL in sockaddr_un's length (GH-26866)
This makes calls to co_lnotab to exercise this code, as well
as generating synthetically large code to exercise the corner
cases where line numbers need multiple bytes.
Automerge-Triggered-By: GH:brandtbucher
The current status quo is that private attribute names are not
mangled when a class is matched. I've added a test to
document/legimize this behaviour.
Co-authored-by: Brandt Bucher <brandtbucher@gmail.com>
This is a quick-and-dirty way to run the C++ tests.
It can definitely be improved in the future, but it should fail when things go wrong.
- Run test functions on import (yes, this can definitely be improved)
- Fudge setuptools metadata (name & version) to make the extension installable
- Install and import the extension in test_cppext
* gh-93883: elide traceback indicators when possible
Elide traceback column indicators when the entire line of the
frame is implicated. This reduces traceback length and draws
even more attention to the remaining (very relevant) indicators.
Example:
```
Traceback (most recent call last):
File "query.py", line 99, in <module>
bar()
File "query.py", line 66, in bar
foo()
File "query.py", line 37, in foo
magic_arithmetic('foo')
File "query.py", line 18, in magic_arithmetic
return add_counts(x) / 25
^^^^^^^^^^^^^
File "query.py", line 24, in add_counts
return 25 + query_user(user1) + query_user(user2)
^^^^^^^^^^^^^^^^^
File "query.py", line 32, in query_user
return 1 + query_count(db, response['a']['b']['c']['user'], retry=True)
~~~~~~~~~~~~~~~~~~^^^^^
TypeError: 'NoneType' object is not subscriptable
```
Rather than going out of our way to provide indicator coverage
in every traceback test suite, the indicator test suite should
be responible for sufficient coverage (e.g. by adding a basic
exception group test to ensure that margin strings are covered).
Remove the ssl.wrap_socket() function, deprecated in Python 3.7:
instead, create a ssl.SSLContext object and call its
sl.SSLContext.wrap_socket() method. Any package that still uses
ssl.wrap_socket() is broken and insecure. The function neither sends
a SNI TLS extension nor validates server hostname. Code is subject to
CWE-295 : Improper Certificate Validation.
This removes the performance regression in 3.11, **at the expense of not fixing
the "bug" that allows accessing values from values** (e.g. `Color.RED.BLUE`).
Using the benchmark @markshannon [presented](https://github.com/python/cpython/issues/93910#issuecomment-1165503032), the results are:
| Version | Enum | Fast enum | Normal class |
| --- | --- | --- | --- |
| 3.10 | 2.04 | 0.59 | 0.56 |
| 3.11 | 2.78 | 0.31 | 0.15 |
| This PR | 1.30 | 0.32 | 0.16 |
I share this mostly as information about the source of the regression, as this may be useful. It may be that the lower-risk approach for the beta is just to revert to a previously-known working state.
Inlining of code that corresponds to source code lines, can make it hard to distinguish later between code which is only reachable from except handlers, and that which is reachable in normal control flow. This caused problems with the debugger's jump feature.
This PR turns off the inlining optimisation for code which has line numbers. We still inline things like the implicit "return None".
zipimport: Remove find_loader() and find_module() methods, deprecated
in Python 3.10: use the find_spec() method instead. See PEP 451 for
the rationale.
xml.etree: Remove the ElementTree.Element.copy() method of the pure
Python implementation, deprecated in Python 3.10, use the copy.copy()
function instead. The C implementation of xml.etree has no copy()
method, only a __copy__() method.
Adds `ctypes.c_time_t` to represent the C `time_t` type accurately as its size varies.
Primarily-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org> [Google]
Once the task group is shutting down, it should not be possible to create a new task.
Here "shutting down" means `self._aborting` is set, indicating that at least one task
has failed and we have cancelled all others.
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
The inspect version was not working with unittest.mock.AsyncMock.
The fix introduces special-casing of AsyncMock in
`inspect.iscoroutinefunction` equivalent to the one
performed in `asyncio.iscoroutinefunction`.
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
Remove dead code related to ssl.PROTOCOL_SSLv2. ssl.PROTOCOL_SSLv2
was already removed in Python 3.10.
In test_ssl, @requires_tls_version('SSLv2') always returned False.
Extract of the removed code: "OpenSSL has removed support for SSLv2".
Remove the pure Python implementation of hashlib.pbkdf2_hmac(),
deprecated in Python 3.10. Python 3.10 and newer requires OpenSSL
1.1.1 or newer (PEP 644), this OpenSSL version provides a C
implementation of pbkdf2_hmac() which is faster.
``os.geteuid() == 0`` is not a reliable check whether the current user
has the capability to bypass permission checks. Tests now probe for DAC
override.
Remove the locale.format() function, deprecated in Python
3.7: use locale.format_string() instead.
Remove TestFormatPatternArg test case: it is irrelevant for
locale.format_string() which accepts complex formats.
Make _struct.Struct a GC type
This fixes a memory leak in the _struct module, where as soon
as a Struct object is stored in the cache, there's a cycle from
the _struct module to the cache to Struct objects to the Struct
type back to the module. If _struct.Struct is not gc-tracked, that
cycle is never collected.
This PR makes _struct.Struct GC-tracked, and adds a regression test.
`tarfile` already accepts a compressionlevel argument for creating
files. This patch adds the same for stream-based tarfile usage.
The default is 9, the value that was previously hard-coded.
The existing test covering this case passed only incidentally. We
explicitly disallow doing this and add a proper error message.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
The urllib.request no longer uses the deprecated check_hostname
parameter of the http.client module.
Add private http.client._create_https_context() helper to http.client,
used by urllib.request.
Remove the now redundant check on check_hostname and verify_mode in
http.client: the SSLContext.check_hostname setter already implements
the check.
- c_longlong and c_longdouble need experimental WASM bigint.
- Skip tests that need threading
- Define ``CTYPES_MAX_ARGCOUNT`` for Emscripten. libffi-emscripten 2022-06-23 supports up to 1000 args.
gzip: Remove the filename attribute of gzip.GzipFile,
deprecated since Python 2.6, use the name attribute instead. In write
mode, the filename attribute added '.gz' file extension if it was not
present.
Remove io.OpenWrapper and _pyio.OpenWrapper, deprecated in Python
3.10: just use :func:`open` instead. The open() (io.open()) function
is a built-in function. Since Python 3.10, _pyio.open() is also a
static method.
* Add an index_pages default list to SimpleHTTPRequestHandler and an
optional constructor parameter that allows the default indexes pages
list to be overridden. This makes it easy to set a new index page name
without having to override send_head.
When used with plain Enum, auto() returns the last numeric value assigned, skipping any incompatible member values (such as strings); starting in 3.13 the default auto() for plain Enums will require all the values to be of compatible types, and will return a new value that is 1 higher than any existing value.
Co-authored-by: Ethan Furman <ethan@stoneleaf.us>
* Move Lib/tkinter/test/test_tkinter/ to Lib/test/test_tkinter/.
* Move Lib/tkinter/test/test_ttk/ to Lib/test/test_ttk/.
* Add Lib/test/test_ttk/__init__.py based on test_ttk_guionly.py.
* Add Lib/test/test_tkinter/__init__.py
* Remove old Lib/test/test_tk.py.
* Remove old Lib/test/test_ttk_guionly.py.
* Add __main__ sub-modules.
* Update imports and update references to rename files.
It is no longer changed when create a zip or tar archive.
It is still changed for custom archivers registered with shutil.register_archive_format()
if root_dir is not None.
Co-authored-by: Éric <merwok@netwok.org>
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
Fix an open redirection vulnerability in the `http.server` module when
an URI path starts with `//` that could produce a 301 Location header
with a misleading target. Vulnerability discovered, and logic fix
proposed, by Hamza Avvan (@hamzaavvan).
Test and comments authored by Gregory P. Smith [Google].
When we construct the upper and lower candidates in limit_denominator,
the numerator and denominator are already relatively prime (and the
denominator positive) by construction, so there's no need to go through
the usual normalisation in the constructor. This saves a couple of
potentially expensive gcd calls.
Suggested by Michael Scott Asato Cuthbert in GH-93477.