There is no reason for this watcher to be attached to any particular loop.
This should make it safe to use regardless of the lifetime of the event loop running in the main thread
(relative to other loops).
Co-authored-by: Yury Selivanov <yury@edgedb.com>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
In `_warnings.c`, in the C equivalent of `warnings.warn_explicit()`, if the module globals are given (and not None), the warning will attempt to get the source line for the issued warning. To do this, it needs the module's loader.
Previously, it would only look up `__loader__` in the module globals. In https://github.com/python/cpython/issues/86298 we want to defer to the `__spec__.loader` if available.
The first step on this journey is to check that `loader == __spec__.loader` and issue another warning if it is not. This commit does that.
Since this is a PoC, only manual testing for now.
```python
# /tmp/foo.py
import warnings
import bar
warnings.warn_explicit(
'warning!',
RuntimeWarning,
'bar.py', 2,
module='bar knee',
module_globals=bar.__dict__,
)
```
```python
# /tmp/bar.py
import sys
import os
import pathlib
# __loader__ = pathlib.Path()
```
Then running this: `./python.exe -Wdefault /tmp/foo.py`
Produces:
```
bar.py:2: RuntimeWarning: warning!
import os
```
Uncomment the `__loader__ = ` line in `bar.py` and try it again:
```
sys:1: ImportWarning: Module bar; __loader__ != __spec__.loader (<_frozen_importlib_external.SourceFileLoader object at 0x109f7dfa0> != PosixPath('.'))
bar.py:2: RuntimeWarning: warning!
import os
```
Automerge-Triggered-By: GH:warsaw
This is a small performance improvement, especially for one or two hot
places such as _handle_fromlist() that are called a lot and the
.format() method was being used just to join two strings with a dot.
Otherwise it is merely a readability improvement.
We keep `_ERR_MSG` and `_ERR_MSG_PREFIX` as those may be used elsewhere for canonical looking error messages.
Right now, the tokenizer only returns type and two pointers to the start and end of the token.
This PR modifies the tokenizer to return the type and set all of the necessary information,
so that the parser does not have to this.
The macOS 13 SDK includes support for the `mkfifoat` and `mknodat` system calls.
Using the `dir_fd` option with either `os.mkfifo` or `os.mknod` could result in a
segfault if cpython is built with the macOS 13 SDK but run on an earlier
version of macOS. Prevent this by adding runtime support for detection of
these system calls ("weaklinking") as is done for other newer syscalls on
macOS.
* improve performance of get_proxies_environment when there are many environment variables
* 📜🤖 Added by blurb_it.
* fix case of short env name
* fix formatting
* fix whitespace
* whitespace
* Update Lib/urllib/request.py
Co-authored-by: Carl Meyer <carl@oddbird.net>
* Update Lib/urllib/request.py
Co-authored-by: Carl Meyer <carl@oddbird.net>
* Update Lib/urllib/request.py
Co-authored-by: Carl Meyer <carl@oddbird.net>
* Update Lib/urllib/request.py
Co-authored-by: Carl Meyer <carl@oddbird.net>
* whitespace
* Update Misc/NEWS.d/next/Library/2022-04-15-11-29-38.gh-issue-91539.7WgVuA.rst
Co-authored-by: Carl Meyer <carl@oddbird.net>
* Update Lib/urllib/request.py
Co-authored-by: Carl Meyer <carl@oddbird.net>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Carl Meyer <carl@oddbird.net>
Relevant tests moved from test_exceptions to test_traceback to be able to
compare both implementations.
Co-authored-by: Carl Friedrich Bolz-Tereick <cfbolz@gmx.de>
Remove the sys.getdxp() function and the Tools/scripts/analyze_dxp.py
script. DXP stands for "dynamic execution pairs". They were related
to DYNAMIC_EXECUTION_PROFILE and DXPAIRS macros which have been
removed in Python 3.11. Python can now be built with "./configure
--enable-pystats" to gather statistics on Python opcodes.
dataclass used to get the annotations on a class object using
cls.__dict__.get('__annotations__'). Now that it always imports
inspect, it can use inspect.get_annotations, which is modern
best practice for coping with annotations.
It had to live as a global outside of PyConfig for stable ABI reasons in
the pre-3.12 backports.
This removes the `_Py_global_config_int_max_str_digits` and gets rid of
the equivalent field in the internal `struct _is PyInterpreterState` as
code can just use the existing nested config struct within that.
Adds tests to verify unique settings and configs in subinterpreters.
Remove the Tools/demo/ directory which contained old demo scripts. A
copy can be found in the old-demos project:
https://github.com/gvanrossum/old-demos
Remove the following old demo scripts:
* beer.py
* eiffel.py
* hanoi.py
* life.py
* markov.py
* mcast.py
* queens.py
* redemo.py
* rpython.py
* rpythond.py
* sortvisu.py
* spreadsheet.py
* vector.py
Changes:
* Remove a reference to the redemo.py script in the regex howto
documentation.
* Remove a reference to the removed Tools/demo/ directory in the
curses documentation.
* Update PC/layout/ to remove the reference to Tools/demo/ directory.
* gh-97740: Fix bang in Sphinx C domain ref target syntax
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
* Add NEWS entry for C domain bang fix
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Fix the Python path configuration used to initialized sys.path at
Python startup. Paths are no longer encoded to UTF-8/strict to avoid
encoding errors if it contains surrogate characters (bytes paths are
decoded with the surrogateescape error handler).
getpath_basename() and getpath_dirname() functions no longer encode
the path to UTF-8/strict, but work directly on Unicode strings. These
functions now use PyUnicode_FindChar() and PyUnicode_Substring() on
the Unicode path, rather than strrchr() on the encoded bytes string.
Use output from a 3.10+ REPL, showing invalid range, for the
SyntaxError examples in the tutorial introduction page.
Automerge-Triggered-By: GH:iritkatriel
Fix a shell code injection vulnerability in the
get-remote-certificate.py example script. The script no longer uses a
shell to run "openssl" commands. Issue reported and initial fix by
Caleb Shortt.
Remove the Windows code path to send "quit" on stdin to the "openssl
s_client" command: use DEVNULL on all platforms instead.
Co-authored-by: Caleb Shortt <caleb@rgauge.com>
Fix multiplying a list by an integer (list *= int): detect the
integer overflow when the new allocated length is close to the
maximum size. Issue reported by Jordan Limor.
list_resize() now checks for integer overflow before multiplying the
new allocated length by the list item size (sizeof(PyObject*)).
Previously, checkbuttons in different parent widgets could have the same
short name and share the same state if arguments "name" and "variable" are
not specified. Now they are globally unique.
Fix command line parsing: reject "-X int_max_str_digits" option with
no value (invalid) when the PYTHONINTMAXSTRDIGITS environment
variable is set to a valid limit.
This PR fixes undefined behaviour in the struct module unpacking support functions `bu_longlong`, `lu_longlong`, `bu_int` and `lu_int`; thanks to @kumaraditya303 for finding these.
The fix is to accumulate the bytes in an unsigned integer type instead of a signed integer type, then to convert to the appropriate signed type. In cases where the width matches, that conversion will typically be compiled away to a no-op.
(Evidence from Godbolt: https://godbolt.org/z/5zvxodj64 .)
To make the conversions efficient, I've specialised the relevant functions for their output size: for `bu_longlong` and `lu_longlong`, this only entails checking that the output size is indeed `8`. But `bu_int` and `lu_int` were used for format sizes `2` and `4` - I've split those into two separate functions each.
No tests, because all of the affected cases are already exercised by the test suite.
This is a preliminary PR to refactor `PyLong_FromString` which is currently quite messy and has spaghetti like code that mixes up different concerns as well as duplicating logic.
In particular:
- `PyLong_FromString` now only handles sign, base and prefix detection and calls a new function `long_from_string_base` to parse the main body of the string.
- The `long_from_string_base` function handles all string validation and then calls `long_from_binary_base` or a new function `long_from_non_binary_base` to construct the actual `PyLong`.
- The existing `long_from_binary_base` function is simplified by factoring duplicated logic to `long_from_string_base`.
- The new function `long_from_non_binary_base` factors out much of the code from `PyLong_FromString` including in particular the quadratic algorithm reffered to in gh-95778 so that this can be seen separately from unrelated concerns such as string validation.
The main problem was that an unluckily timed task cancellation could cause
the semaphore to be stuck. There were also doubts about strict FIFO ordering
of tasks allowed to pass.
The Semaphore implementation was rewritten to be more similar to Lock.
Many tests for edge cases (including cancellation) were added.
At Python exit, sometimes a thread holding the GIL can wait forever
for a thread (usually a daemon thread) which requested to drop the
GIL, whereas the thread already exited. To fix the race condition,
the thread which requested the GIL drop now resets its request before
exiting.
take_gil() now calls RESET_GIL_DROP_REQUEST() before
PyThread_exit_thread() if it called SET_GIL_DROP_REQUEST to fix a
race condition with drop_gil().
Issue discovered and analyzed by Mingliang ZHAO.
This reverts commit 0587810698.
Reason: This broke buildbots (some warnings added by that commit are turned to errors in the SSL buildbot).
Repro: ./python Lib/test/ssltests.py
- Improve error message when parameter without a default follows one with a default
- Show same error message when positional-only params precede the default/non-default sequence
Warn on loop initialization, when setting the wakeup fd disturbs a previously set wakeup fd, and on loop closing, when upon resetting the wakeup fd, we find it has been changed by someone else.
Previously codeop.compile_command() emitted compiler warnings (SyntaxWarning or
DeprecationWarning) and raised a SyntaxError for incomplete input containing
a potentially incorrect code. Now it always returns None for incomplete input
without emitting any warnings.
* fix: annotate `pstats.FunctionProfile.ncalls` as `str`
This change aligns the type annotation of `pstats.FunctionProfile.ncalls` with its runtime type.
The test file, a modified version of Lib/test/audiodata/pluck-pcm24.wav, was provided by Andrea Celletti on the bug tracker.
Co-authored-by: Zackery Spytz <zspytz@gmail.com>
Fix the faulthandler implementation of faulthandler.register(signal,
chain=True) if the sigaction() function is not available: don't call
the previous signal handler if it's NULL.
This makes tokenizer.c:valid_utf8 match stringlib/codecs.h:decode_utf8.
It also fixes an off-by-one error introduced in 3.10 for the line number when the tokenizer reports bad UTF8.
This doesn't happen naturally, but is allowed by the ASDL and compiler.
We don't want to change ASDL for backward compatibility reasons
(#57645, #92987)
Converting a large enough `int` to a decimal string raises `ValueError` as expected. However, the raise comes _after_ the quadratic-time base-conversion algorithm has run to completion. For effective DOS prevention, we need some kind of check before entering the quadratic-time loop. Oops! =)
The quick fix: essentially we catch _most_ values that exceed the threshold up front. Those that slip through will still be on the small side (read: sufficiently fast), and will get caught by the existing check so that the limit remains exact.
The justification for the current check. The C code check is:
```c
max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10
```
In GitHub markdown math-speak, writing $M$ for `max_str_digits`, $L$ for `PyLong_SHIFT` and $s$ for `size_a`, that check is:
$$\left\lfloor\frac{M}{3L}\right\rfloor \le \left\lfloor\frac{s - 11}{10}\right\rfloor$$
From this it follows that
$$\frac{M}{3L} < \frac{s-1}{10}$$
hence that
$$\frac{L(s-1)}{M} > \frac{10}{3} > \log_2(10).$$
So
$$2^{L(s-1)} > 10^M.$$
But our input integer $a$ satisfies $|a| \ge 2^{L(s-1)}$, so $|a|$ is larger than $10^M$. This shows that we don't accidentally capture anything _below_ the intended limit in the check.
<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
* gh-68163: Correct conversion of Rational instances to float
Also document that numerator/denominator properties are instances of Integral.
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
Link to #93884
* Test with some large negative and positive values(out of range of a longlong,i.e.[-2\*\*63, 2\*\*63-1])
* Test with objects of non-int type
Automerge-Triggered-By: GH:mdickinson
Integer to and from text conversions via CPython's bignum `int` type is not safe against denial of service attacks due to malicious input. Very large input strings with hundred thousands of digits can consume several CPU seconds.
This PR comes fresh from a pile of work done in our private PSRT security response team repo.
Signed-off-by: Christian Heimes [Red Hat] <christian@python.org>
Tons-of-polishing-up-by: Gregory P. Smith [Google] <greg@krypto.org>
Reviews via the private PSRT repo via many others (see the NEWS entry in the PR).
<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->
I wrote up [a one pager for the release managers](https://docs.google.com/document/d/1KjuF_aXlzPUxTK4BMgezGJ2Pn7uevfX7g0_mvgHlL7Y/edit#). Much of that text wound up in the Issue. Backports PRs already exist. See the issue for links.
⚠️⚠️ Note for reviewers, hackers and fellow systems/low-level/compiler engineers ⚠️⚠️
If you have a lot of experience with this kind of shenanigans and want to improve the **first** version, **please make a PR against my branch** or **reach out by email** or **suggest code changes directly on GitHub**.
If you have any **refinements or optimizations** please, wait until the first version is merged before starting hacking or proposing those so we can keep this PR productive.
- pre-build Emscripten ports and system libraries
- check for broken EMSDK versions
- use EMSDK's node for wasm32-emscripten
- warn when PKG_CONFIG_PATH is set
- add support level information
datetime.isoformat generates the tzoffset with colons, but there
was no format code to make strftime output the same format.
for simplicity and consistency the %:z formatting behaves mostly
as %z, with the exception of adding colons. this includes the
dynamic behaviour of adding seconds and microseconds only when
needed (when not 0).
this fixes the still open "generate" part of this issue:
https://github.com/python/cpython/issues/69142
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
find_unused_port() has an inherent race condition, but we can't use
bind_port() as that uses .getsockname() which this test is exercising.
Try binding to unused ports a few times before failing.
Signed-off-by: Ross Burton <ross.burton@arm.com>
* gh-93503: Add APIs to set profiling and tracing functions in all threads in the C-API
* Use a separate API
* Fix NEWS entry
* Add locks around the loop
* Document ignoring exceptions
* Use the new APIs in the sys module
* Update docs
* Add support for the BOLT post-link binary optimizer
Using [bolt](https://github.com/llvm/llvm-project/tree/main/bolt)
provides a fairly large speedup without any code or functionality
changes. It provides roughly a 1% speedup on pyperformance, and a
4% improvement on the Pyston web macrobenchmarks.
It is gated behind an `--enable-bolt` configure arg because not all
toolchains and environments are supported. It has been tested on a
Linux x86_64 toolchain, using llvm-bolt built from the LLVM 14.0.6
sources (their binary distribution of this version did not include bolt).
Compared to [a previous attempt](https://github.com/faster-cpython/ideas/issues/224),
this commit uses bolt's preferred "instrumentation" approach, as well as adds some non-PIE
flags which enable much better optimizations from bolt.
The effects of this change are a bit more dependent on CPU microarchitecture
than other changes, since it optimizes i-cache behavior which seems
to be a bit more variable between architectures. The 1%/4% numbers
were collected on an Intel Skylake CPU, and on an AMD Zen 3 CPU I
got a slightly larger speedup (2%/4%), and on a c6i.xlarge EC2 instance
I got a slightly lower speedup (1%/3%).
The low speedup on pyperformance is not entirely unexpected, because
BOLT improves i-cache behavior, and the benchmarks in the pyperformance
suite are small and tend to fit in i-cache.
This change uses the existing pgo profiling task (`python -m test --pgo`),
though I was able to measure about a 1% macrobenchmark improvement by
using the macrobenchmarks as the training task. I personally think that
both the PGO and BOLT tasks should be updated to use macrobenchmarks,
but for the sake of splitting up the work this PR uses the existing pgo task.
* Simplify the build flags
* Add a NEWS entry
* Update Makefile.pre.in
Co-authored-by: Dong-hee Na <donghee.na92@gmail.com>
* Update configure.ac
Co-authored-by: Dong-hee Na <donghee.na92@gmail.com>
* Add myself to ACKS
* Add docs
* Other review comments
* fix tab/space issue
* Make it more clear that --enable-bolt is experimental
* Add link to bolt's github page
Co-authored-by: Dong-hee Na <donghee.na92@gmail.com>
* Treat tp_weakref and tp_dictoffset like other opaque slots for multiple inheritance.
* Document Py_TPFLAGS_MANAGED_DICT and Py_TPFLAGS_MANAGED_WEAKREF in what's new.
When a task catches CancelledError and raises some other error,
the other error should not silently be suppressed.
Any scenario where a task crashes in cleanup upon cancellation
will now result in an ExceptionGroup wrapping the crash(es)
instead of propagating CancelledError and ignoring the side errors.
NOTE: This represents a change in behavior (hence the need to
change several tests). But it is only an edge case.
Co-authored-by: Thomas Grainger <tagrain@gmail.com>
- On WASI `ENOTCAPABLE` is now mapped to `PermissionError`.
- The `errno` modules exposes the new error number.
- `getpath.py` now ignores `PermissionError` when it cannot open landmark
files `pybuilddir.txt` and `pyenv.cfg`.
If kernel fips is enabled, we get permission error upon doing
`import crypt`. So, if kernel fips is enabled, disable the
unallowed hashing methods.
Python 3.9.1 (default, May 10 2022, 11:36:26)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import crypt
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.9/crypt.py", line 117, in <module>
_add_method('MD5', '1', 8, 34)
File "/usr/lib/python3.9/crypt.py", line 94, in _add_method
result = crypt('', salt)
File "/usr/lib/python3.9/crypt.py", line 82, in crypt
return _crypt.crypt(word, salt)
PermissionError: [Errno 1] Operation not permitted
Signed-off-by: Shreenidhi Shedi <sshedi@vmware.com>
* Make sure that tp_dictoffset is correct with Py_TPFLAGS_MANAGED_DICT is set.
* Avoid traversing managed dict twice when subclassing class with Py_TPFLAGS_MANAGED_DICT set.
Automate WASM build with a new Python script. The script provides
several build profiles with configure flags for Emscripten flavors
and WASI. The script can detect and use Emscripten SDK and WASI SDK from
default locations or env vars.
``configure`` now detects Node arguments and creates HOSTRUNNER
arguments for Node 16. It also sets correct arguments for
``wasm64-emscripten``.
Co-authored-by: Brett Cannon <brett@python.org>
Have pathlib use `os.path.join()` to join arguments to the `PurePath` initialiser, which fixes a minor bug when handling relative paths with drives.
Previously:
```python
>>> from pathlib import PureWindowsPath
>>> a = 'C:/a/b'
>>> b = 'C:x/y'
>>> PureWindowsPath(a, b)
PureWindowsPath('C:x/y')
```
Now:
```python
>>> PureWindowsPath(a, b)
PureWindowsPath('C:/a/b/x/y')
```
An unrecognized format character in PyUnicode_FromFormat() and
PyUnicode_FromFormatV() now sets a SystemError.
In previous versions it caused all the rest of the format string to be
copied as-is to the result string, and any extra arguments discarded.