Previously codeop.compile_command() emitted compiler warnings (SyntaxWarning or
DeprecationWarning) and raised a SyntaxError for incomplete input containing
a potentially incorrect code. Now it always returns None for incomplete input
without emitting any warnings.
* fix: annotate `pstats.FunctionProfile.ncalls` as `str`
This change aligns the type annotation of `pstats.FunctionProfile.ncalls` with its runtime type.
The test file, a modified version of Lib/test/audiodata/pluck-pcm24.wav, was provided by Andrea Celletti on the bug tracker.
Co-authored-by: Zackery Spytz <zspytz@gmail.com>
Fix the faulthandler implementation of faulthandler.register(signal,
chain=True) if the sigaction() function is not available: don't call
the previous signal handler if it's NULL.
This makes tokenizer.c:valid_utf8 match stringlib/codecs.h:decode_utf8.
It also fixes an off-by-one error introduced in 3.10 for the line number when the tokenizer reports bad UTF8.
This doesn't happen naturally, but is allowed by the ASDL and compiler.
We don't want to change ASDL for backward compatibility reasons
(#57645, #92987)
Converting a large enough `int` to a decimal string raises `ValueError` as expected. However, the raise comes _after_ the quadratic-time base-conversion algorithm has run to completion. For effective DOS prevention, we need some kind of check before entering the quadratic-time loop. Oops! =)
The quick fix: essentially we catch _most_ values that exceed the threshold up front. Those that slip through will still be on the small side (read: sufficiently fast), and will get caught by the existing check so that the limit remains exact.
The justification for the current check. The C code check is:
```c
max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10
```
In GitHub markdown math-speak, writing $M$ for `max_str_digits`, $L$ for `PyLong_SHIFT` and $s$ for `size_a`, that check is:
$$\left\lfloor\frac{M}{3L}\right\rfloor \le \left\lfloor\frac{s - 11}{10}\right\rfloor$$
From this it follows that
$$\frac{M}{3L} < \frac{s-1}{10}$$
hence that
$$\frac{L(s-1)}{M} > \frac{10}{3} > \log_2(10).$$
So
$$2^{L(s-1)} > 10^M.$$
But our input integer $a$ satisfies $|a| \ge 2^{L(s-1)}$, so $|a|$ is larger than $10^M$. This shows that we don't accidentally capture anything _below_ the intended limit in the check.
<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
* gh-68163: Correct conversion of Rational instances to float
Also document that numerator/denominator properties are instances of Integral.
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
Link to #93884
* Test with some large negative and positive values(out of range of a longlong,i.e.[-2\*\*63, 2\*\*63-1])
* Test with objects of non-int type
Automerge-Triggered-By: GH:mdickinson
Integer to and from text conversions via CPython's bignum `int` type is not safe against denial of service attacks due to malicious input. Very large input strings with hundred thousands of digits can consume several CPU seconds.
This PR comes fresh from a pile of work done in our private PSRT security response team repo.
Signed-off-by: Christian Heimes [Red Hat] <christian@python.org>
Tons-of-polishing-up-by: Gregory P. Smith [Google] <greg@krypto.org>
Reviews via the private PSRT repo via many others (see the NEWS entry in the PR).
<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->
I wrote up [a one pager for the release managers](https://docs.google.com/document/d/1KjuF_aXlzPUxTK4BMgezGJ2Pn7uevfX7g0_mvgHlL7Y/edit#). Much of that text wound up in the Issue. Backports PRs already exist. See the issue for links.
⚠️⚠️ Note for reviewers, hackers and fellow systems/low-level/compiler engineers ⚠️⚠️
If you have a lot of experience with this kind of shenanigans and want to improve the **first** version, **please make a PR against my branch** or **reach out by email** or **suggest code changes directly on GitHub**.
If you have any **refinements or optimizations** please, wait until the first version is merged before starting hacking or proposing those so we can keep this PR productive.
- pre-build Emscripten ports and system libraries
- check for broken EMSDK versions
- use EMSDK's node for wasm32-emscripten
- warn when PKG_CONFIG_PATH is set
- add support level information
The previous wording of this entry suggests that CPython
won't work if optional compiler features are enabled.
That's not the case. The change is that we require C11 rather
than C89.
Note that PEP 7 does say "Python 3.11 and newer versions use C11
without optional features." It is correct there: that's
not a guide for users who compile Python, but for CPython devs
who must avoid the features.
datetime.isoformat generates the tzoffset with colons, but there
was no format code to make strftime output the same format.
for simplicity and consistency the %:z formatting behaves mostly
as %z, with the exception of adding colons. this includes the
dynamic behaviour of adding seconds and microseconds only when
needed (when not 0).
this fixes the still open "generate" part of this issue:
https://github.com/python/cpython/issues/69142
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
find_unused_port() has an inherent race condition, but we can't use
bind_port() as that uses .getsockname() which this test is exercising.
Try binding to unused ports a few times before failing.
Signed-off-by: Ross Burton <ross.burton@arm.com>
* gh-93503: Add APIs to set profiling and tracing functions in all threads in the C-API
* Use a separate API
* Fix NEWS entry
* Add locks around the loop
* Document ignoring exceptions
* Use the new APIs in the sys module
* Update docs
* Add support for the BOLT post-link binary optimizer
Using [bolt](https://github.com/llvm/llvm-project/tree/main/bolt)
provides a fairly large speedup without any code or functionality
changes. It provides roughly a 1% speedup on pyperformance, and a
4% improvement on the Pyston web macrobenchmarks.
It is gated behind an `--enable-bolt` configure arg because not all
toolchains and environments are supported. It has been tested on a
Linux x86_64 toolchain, using llvm-bolt built from the LLVM 14.0.6
sources (their binary distribution of this version did not include bolt).
Compared to [a previous attempt](https://github.com/faster-cpython/ideas/issues/224),
this commit uses bolt's preferred "instrumentation" approach, as well as adds some non-PIE
flags which enable much better optimizations from bolt.
The effects of this change are a bit more dependent on CPU microarchitecture
than other changes, since it optimizes i-cache behavior which seems
to be a bit more variable between architectures. The 1%/4% numbers
were collected on an Intel Skylake CPU, and on an AMD Zen 3 CPU I
got a slightly larger speedup (2%/4%), and on a c6i.xlarge EC2 instance
I got a slightly lower speedup (1%/3%).
The low speedup on pyperformance is not entirely unexpected, because
BOLT improves i-cache behavior, and the benchmarks in the pyperformance
suite are small and tend to fit in i-cache.
This change uses the existing pgo profiling task (`python -m test --pgo`),
though I was able to measure about a 1% macrobenchmark improvement by
using the macrobenchmarks as the training task. I personally think that
both the PGO and BOLT tasks should be updated to use macrobenchmarks,
but for the sake of splitting up the work this PR uses the existing pgo task.
* Simplify the build flags
* Add a NEWS entry
* Update Makefile.pre.in
Co-authored-by: Dong-hee Na <donghee.na92@gmail.com>
* Update configure.ac
Co-authored-by: Dong-hee Na <donghee.na92@gmail.com>
* Add myself to ACKS
* Add docs
* Other review comments
* fix tab/space issue
* Make it more clear that --enable-bolt is experimental
* Add link to bolt's github page
Co-authored-by: Dong-hee Na <donghee.na92@gmail.com>
* Treat tp_weakref and tp_dictoffset like other opaque slots for multiple inheritance.
* Document Py_TPFLAGS_MANAGED_DICT and Py_TPFLAGS_MANAGED_WEAKREF in what's new.
When a task catches CancelledError and raises some other error,
the other error should not silently be suppressed.
Any scenario where a task crashes in cleanup upon cancellation
will now result in an ExceptionGroup wrapping the crash(es)
instead of propagating CancelledError and ignoring the side errors.
NOTE: This represents a change in behavior (hence the need to
change several tests). But it is only an edge case.
Co-authored-by: Thomas Grainger <tagrain@gmail.com>
- On WASI `ENOTCAPABLE` is now mapped to `PermissionError`.
- The `errno` modules exposes the new error number.
- `getpath.py` now ignores `PermissionError` when it cannot open landmark
files `pybuilddir.txt` and `pyenv.cfg`.
If kernel fips is enabled, we get permission error upon doing
`import crypt`. So, if kernel fips is enabled, disable the
unallowed hashing methods.
Python 3.9.1 (default, May 10 2022, 11:36:26)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import crypt
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.9/crypt.py", line 117, in <module>
_add_method('MD5', '1', 8, 34)
File "/usr/lib/python3.9/crypt.py", line 94, in _add_method
result = crypt('', salt)
File "/usr/lib/python3.9/crypt.py", line 82, in crypt
return _crypt.crypt(word, salt)
PermissionError: [Errno 1] Operation not permitted
Signed-off-by: Shreenidhi Shedi <sshedi@vmware.com>
* Make sure that tp_dictoffset is correct with Py_TPFLAGS_MANAGED_DICT is set.
* Avoid traversing managed dict twice when subclassing class with Py_TPFLAGS_MANAGED_DICT set.
Automate WASM build with a new Python script. The script provides
several build profiles with configure flags for Emscripten flavors
and WASI. The script can detect and use Emscripten SDK and WASI SDK from
default locations or env vars.
``configure`` now detects Node arguments and creates HOSTRUNNER
arguments for Node 16. It also sets correct arguments for
``wasm64-emscripten``.
Co-authored-by: Brett Cannon <brett@python.org>
Have pathlib use `os.path.join()` to join arguments to the `PurePath` initialiser, which fixes a minor bug when handling relative paths with drives.
Previously:
```python
>>> from pathlib import PureWindowsPath
>>> a = 'C:/a/b'
>>> b = 'C:x/y'
>>> PureWindowsPath(a, b)
PureWindowsPath('C:x/y')
```
Now:
```python
>>> PureWindowsPath(a, b)
PureWindowsPath('C:/a/b/x/y')
```
- Move PyUnicode tests to a separate file
- Add some more tests for PyUnicode_FromFormat
Co-authored-by: philg314 <110174000+philg314@users.noreply.github.com>
An unrecognized format character in PyUnicode_FromFormat() and
PyUnicode_FromFormatV() now sets a SystemError.
In previous versions it caused all the rest of the format string to be
copied as-is to the result string, and any extra arguments discarded.
gh-93243
This PR is required to reduce diffs of the following porting (no need to either maintain documentation and tests consistent with each porting step, or try to port everything and remove smtpd in a single PR).
Automerge-Triggered-By: GH:warsaw
Have `pathlib.WindowsPath.is_mount()` call `ntpath.ismount()`. Previously it raised `NotImplementedError` unconditionally.
https://bugs.python.org/issue42777
Remove the "configure --with-cxx-main" build option: it didn't work
for many years. Remove the MAINCC variable from configure and
Makefile.
The MAINCC variable was added by the issue gh-42471: commit
0f48d98b74. Previously, --with-cxx-main
was named --with-cxx.
Keep CXX and LDCXXSHARED variables, even if they are no longer used
by Python build system.
Adds a regression test for an re slowdown observed by rjsmin.
Uses multiprocessing to kill the test after SHORT_TIMEOUT.
Co-authored-by: Oleg Iarygin <dralife@yandex.ru>
Co-authored-by: Christian Heimes <christian@python.org>
* Add test for inheriting explicit __dict__ and weakref.
* Restore 3.10 behavior for multiple inheritance of C extension classes that store their dictionary at the end of the struct.
The API documentation for [findtext](https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element.findtext) states that this function gives back an empty string on "no text content." With the previous implementation, this would give back a empty string even on text content values such as 0 or False. This patch attempts to resolve that by only giving back an empty string if the text attribute is set to `None`. Resolves#91447.
Automerge-Triggered-By: GH:gvanrossum
If one selects whole lines, as the sidebar makes easy, do not
add an extra line. Only move the end of a selection to the
beginning of the next line when not already at the beginning
of a line. (Also improve the surrounding code.)
Move `Select All` above `Cut` as it is used with `Cut` and `Copy` but not `Paste`. Add a separator between `Replace` and `Go to Line` to separate items that belong to the 'Edit-find' (above) and 'Edit-show' (below) IDLE github project topics.
When keyword argument name is an instance of a str subclass with
overloaded methods __eq__ and __hash__, the former code could not find
the name of an extraneous keyword argument to report an error, and
_PyArg_UnpackKeywords() returned success without setting the
corresponding cell in the linearized arguments array. But since the number
of expected initialized cells is determined as the total number of passed
arguments, this lead to reading NULL as a keyword parameter value, that
caused SystemError or crash or other undesired behavior.
- check for ``dup()`` libc function
- handle missing ``F_DUPFD`` in ``dup2()`` replacement function
- add workaround for WASI libc bug in MSG_TRUNC
- ESHUTDOWN is missing, use EPIPE instead
- POLLPRI is missing, define as 0 (no-op)
* Add _Py_memory_repeat function to pycore_list
* Add _Py_RefcntAdd function to pycore_object
* Use the new functions in tuplerepeat, list_repeat, and list_inplace_repeat
* gh-95218: Move tests for importlib.resources into test_importlib.resources.
* Also update makefile
* Include test_importlib/resources in code ownership rule.
This PR partially reverts gh-24421 (PR) and fixes the remaining concerns
given in gh-93044 (issue):
- keyword arguments are passed as positional arguments to factory()
- if an argument is not passed to sqlite3.connect(), its default value
is passed to factory()
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
* gh-94949: Disallow parsing parenthesised ctx manager with old feature_version
* 📜🤖 Added by blurb_it.
* Allow it with feature_version=(3, 9) as well
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Support for bytes broke sometime between Python 3.2 and 3.6 and has been broken ever since. Trying to bring back supports is surprisingly difficult in the face of -b and checking for keys in sys.path_importer_cache. Since the support was broken for so long, trying to overcome the difficulty of bringing back the support has been deemed not worth it.
Co-authored-by: Eryk Sun <eryksun@gmail.com>
Co-authored-by: Brett Cannon <brett@python.org>
When binding a unix socket to an empty address on Linux, the socket is
automatically bound to an available address in the abstract namespace.
>>> s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
>>> s.bind("")
>>> s.getsockname()
b'\x0075499'
Since python 3.9, the socket is bound to the one address:
>>> s.getsockname()
b'\x00'
And trying to bind multiple sockets will fail with:
Traceback (most recent call last):
File "/home/nsoffer/src/cpython/Lib/test/test_socket.py", line 5553, in testAutobind
s2.bind("")
OSError: [Errno 98] Address already in use
Added 2 tests:
- Auto binding empty address on Linux
- Failing to bind an empty address on other platforms
Fixes f6b3a07b7d (bpo-44493: Add missing terminated NUL in sockaddr_un's length (GH-26866)
* gh-93883: elide traceback indicators when possible
Elide traceback column indicators when the entire line of the
frame is implicated. This reduces traceback length and draws
even more attention to the remaining (very relevant) indicators.
Example:
```
Traceback (most recent call last):
File "query.py", line 99, in <module>
bar()
File "query.py", line 66, in bar
foo()
File "query.py", line 37, in foo
magic_arithmetic('foo')
File "query.py", line 18, in magic_arithmetic
return add_counts(x) / 25
^^^^^^^^^^^^^
File "query.py", line 24, in add_counts
return 25 + query_user(user1) + query_user(user2)
^^^^^^^^^^^^^^^^^
File "query.py", line 32, in query_user
return 1 + query_count(db, response['a']['b']['c']['user'], retry=True)
~~~~~~~~~~~~~~~~~~^^^^^
TypeError: 'NoneType' object is not subscriptable
```
Rather than going out of our way to provide indicator coverage
in every traceback test suite, the indicator test suite should
be responible for sufficient coverage (e.g. by adding a basic
exception group test to ensure that margin strings are covered).
Deprecate typing.Hashable/Sized. Use the collections.abc counterparts directly instead.
To be consistent with PEP 585, deprecated aliases will not raise any DeprecationWarning.
Remove the ssl.wrap_socket() function, deprecated in Python 3.7:
instead, create a ssl.SSLContext object and call its
sl.SSLContext.wrap_socket() method. Any package that still uses
ssl.wrap_socket() is broken and insecure. The function neither sends
a SNI TLS extension nor validates server hostname. Code is subject to
CWE-295 : Improper Certificate Validation.
This removes the performance regression in 3.11, **at the expense of not fixing
the "bug" that allows accessing values from values** (e.g. `Color.RED.BLUE`).
Using the benchmark @markshannon [presented](https://github.com/python/cpython/issues/93910#issuecomment-1165503032), the results are:
| Version | Enum | Fast enum | Normal class |
| --- | --- | --- | --- |
| 3.10 | 2.04 | 0.59 | 0.56 |
| 3.11 | 2.78 | 0.31 | 0.15 |
| This PR | 1.30 | 0.32 | 0.16 |
I share this mostly as information about the source of the regression, as this may be useful. It may be that the lower-risk approach for the beta is just to revert to a previously-known working state.
Inlining of code that corresponds to source code lines, can make it hard to distinguish later between code which is only reachable from except handlers, and that which is reachable in normal control flow. This caused problems with the debugger's jump feature.
This PR turns off the inlining optimisation for code which has line numbers. We still inline things like the implicit "return None".
zipimport: Remove find_loader() and find_module() methods, deprecated
in Python 3.10: use the find_spec() method instead. See PEP 451 for
the rationale.
Add script ``Tools/scripts/check_modules.py`` to check and validate builtin
and shared extension modules. The script also handles ``Modules/Setup`` and
will eventually replace ``setup.py``.
Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@protonmail.com>
xml.etree: Remove the ElementTree.Element.copy() method of the pure
Python implementation, deprecated in Python 3.10, use the copy.copy()
function instead. The C implementation of xml.etree has no copy()
method, only a __copy__() method.
Adds `ctypes.c_time_t` to represent the C `time_t` type accurately as its size varies.
Primarily-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org> [Google]
Once the task group is shutting down, it should not be possible to create a new task.
Here "shutting down" means `self._aborting` is set, indicating that at least one task
has failed and we have cancelled all others.
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
The inspect version was not working with unittest.mock.AsyncMock.
The fix introduces special-casing of AsyncMock in
`inspect.iscoroutinefunction` equivalent to the one
performed in `asyncio.iscoroutinefunction`.
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
Remove the pure Python implementation of hashlib.pbkdf2_hmac(),
deprecated in Python 3.10. Python 3.10 and newer requires OpenSSL
1.1.1 or newer (PEP 644), this OpenSSL version provides a C
implementation of pbkdf2_hmac() which is faster.
``os.geteuid() == 0`` is not a reliable check whether the current user
has the capability to bypass permission checks. Tests now probe for DAC
override.
Remove the locale.format() function, deprecated in Python
3.7: use locale.format_string() instead.
Remove TestFormatPatternArg test case: it is irrelevant for
locale.format_string() which accepts complex formats.
Make _struct.Struct a GC type
This fixes a memory leak in the _struct module, where as soon
as a Struct object is stored in the cache, there's a cycle from
the _struct module to the cache to Struct objects to the Struct
type back to the module. If _struct.Struct is not gc-tracked, that
cycle is never collected.
This PR makes _struct.Struct GC-tracked, and adds a regression test.
`tarfile` already accepts a compressionlevel argument for creating
files. This patch adds the same for stream-based tarfile usage.
The default is 9, the value that was previously hard-coded.
The existing test covering this case passed only incidentally. We
explicitly disallow doing this and add a proper error message.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
gzip: Remove the filename attribute of gzip.GzipFile,
deprecated since Python 2.6, use the name attribute instead. In write
mode, the filename attribute added '.gz' file extension if it was not
present.
Remove io.OpenWrapper and _pyio.OpenWrapper, deprecated in Python
3.10: just use :func:`open` instead. The open() (io.open()) function
is a built-in function. Since Python 3.10, _pyio.open() is also a
static method.
* Add an index_pages default list to SimpleHTTPRequestHandler and an
optional constructor parameter that allows the default indexes pages
list to be overridden. This makes it easy to set a new index page name
without having to override send_head.