Remove the Tools/demo/ directory which contained old demo scripts. A
copy can be found in the old-demos project:
https://github.com/gvanrossum/old-demos
Remove the following old demo scripts:
* beer.py
* eiffel.py
* hanoi.py
* life.py
* markov.py
* mcast.py
* queens.py
* redemo.py
* rpython.py
* rpythond.py
* sortvisu.py
* spreadsheet.py
* vector.py
Changes:
* Remove a reference to the redemo.py script in the regex howto
documentation.
* Remove a reference to the removed Tools/demo/ directory in the
curses documentation.
* Update PC/layout/ to remove the reference to Tools/demo/ directory.
* gh-97740: Fix bang in Sphinx C domain ref target syntax
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
* Add NEWS entry for C domain bang fix
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Fix the Python path configuration used to initialized sys.path at
Python startup. Paths are no longer encoded to UTF-8/strict to avoid
encoding errors if it contains surrogate characters (bytes paths are
decoded with the surrogateescape error handler).
getpath_basename() and getpath_dirname() functions no longer encode
the path to UTF-8/strict, but work directly on Unicode strings. These
functions now use PyUnicode_FindChar() and PyUnicode_Substring() on
the Unicode path, rather than strrchr() on the encoded bytes string.
Use output from a 3.10+ REPL, showing invalid range, for the
SyntaxError examples in the tutorial introduction page.
Automerge-Triggered-By: GH:iritkatriel
Fix a shell code injection vulnerability in the
get-remote-certificate.py example script. The script no longer uses a
shell to run "openssl" commands. Issue reported and initial fix by
Caleb Shortt.
Remove the Windows code path to send "quit" on stdin to the "openssl
s_client" command: use DEVNULL on all platforms instead.
Co-authored-by: Caleb Shortt <caleb@rgauge.com>
Fix multiplying a list by an integer (list *= int): detect the
integer overflow when the new allocated length is close to the
maximum size. Issue reported by Jordan Limor.
list_resize() now checks for integer overflow before multiplying the
new allocated length by the list item size (sizeof(PyObject*)).
Previously, checkbuttons in different parent widgets could have the same
short name and share the same state if arguments "name" and "variable" are
not specified. Now they are globally unique.
Fix command line parsing: reject "-X int_max_str_digits" option with
no value (invalid) when the PYTHONINTMAXSTRDIGITS environment
variable is set to a valid limit.
This PR fixes undefined behaviour in the struct module unpacking support functions `bu_longlong`, `lu_longlong`, `bu_int` and `lu_int`; thanks to @kumaraditya303 for finding these.
The fix is to accumulate the bytes in an unsigned integer type instead of a signed integer type, then to convert to the appropriate signed type. In cases where the width matches, that conversion will typically be compiled away to a no-op.
(Evidence from Godbolt: https://godbolt.org/z/5zvxodj64 .)
To make the conversions efficient, I've specialised the relevant functions for their output size: for `bu_longlong` and `lu_longlong`, this only entails checking that the output size is indeed `8`. But `bu_int` and `lu_int` were used for format sizes `2` and `4` - I've split those into two separate functions each.
No tests, because all of the affected cases are already exercised by the test suite.
This is a preliminary PR to refactor `PyLong_FromString` which is currently quite messy and has spaghetti like code that mixes up different concerns as well as duplicating logic.
In particular:
- `PyLong_FromString` now only handles sign, base and prefix detection and calls a new function `long_from_string_base` to parse the main body of the string.
- The `long_from_string_base` function handles all string validation and then calls `long_from_binary_base` or a new function `long_from_non_binary_base` to construct the actual `PyLong`.
- The existing `long_from_binary_base` function is simplified by factoring duplicated logic to `long_from_string_base`.
- The new function `long_from_non_binary_base` factors out much of the code from `PyLong_FromString` including in particular the quadratic algorithm reffered to in gh-95778 so that this can be seen separately from unrelated concerns such as string validation.
The main problem was that an unluckily timed task cancellation could cause
the semaphore to be stuck. There were also doubts about strict FIFO ordering
of tasks allowed to pass.
The Semaphore implementation was rewritten to be more similar to Lock.
Many tests for edge cases (including cancellation) were added.
At Python exit, sometimes a thread holding the GIL can wait forever
for a thread (usually a daemon thread) which requested to drop the
GIL, whereas the thread already exited. To fix the race condition,
the thread which requested the GIL drop now resets its request before
exiting.
take_gil() now calls RESET_GIL_DROP_REQUEST() before
PyThread_exit_thread() if it called SET_GIL_DROP_REQUEST to fix a
race condition with drop_gil().
Issue discovered and analyzed by Mingliang ZHAO.
This reverts commit 0587810698.
Reason: This broke buildbots (some warnings added by that commit are turned to errors in the SSL buildbot).
Repro: ./python Lib/test/ssltests.py
- Improve error message when parameter without a default follows one with a default
- Show same error message when positional-only params precede the default/non-default sequence
Warn on loop initialization, when setting the wakeup fd disturbs a previously set wakeup fd, and on loop closing, when upon resetting the wakeup fd, we find it has been changed by someone else.
Previously codeop.compile_command() emitted compiler warnings (SyntaxWarning or
DeprecationWarning) and raised a SyntaxError for incomplete input containing
a potentially incorrect code. Now it always returns None for incomplete input
without emitting any warnings.
* fix: annotate `pstats.FunctionProfile.ncalls` as `str`
This change aligns the type annotation of `pstats.FunctionProfile.ncalls` with its runtime type.
The test file, a modified version of Lib/test/audiodata/pluck-pcm24.wav, was provided by Andrea Celletti on the bug tracker.
Co-authored-by: Zackery Spytz <zspytz@gmail.com>
Fix the faulthandler implementation of faulthandler.register(signal,
chain=True) if the sigaction() function is not available: don't call
the previous signal handler if it's NULL.
This makes tokenizer.c:valid_utf8 match stringlib/codecs.h:decode_utf8.
It also fixes an off-by-one error introduced in 3.10 for the line number when the tokenizer reports bad UTF8.
This doesn't happen naturally, but is allowed by the ASDL and compiler.
We don't want to change ASDL for backward compatibility reasons
(#57645, #92987)
Converting a large enough `int` to a decimal string raises `ValueError` as expected. However, the raise comes _after_ the quadratic-time base-conversion algorithm has run to completion. For effective DOS prevention, we need some kind of check before entering the quadratic-time loop. Oops! =)
The quick fix: essentially we catch _most_ values that exceed the threshold up front. Those that slip through will still be on the small side (read: sufficiently fast), and will get caught by the existing check so that the limit remains exact.
The justification for the current check. The C code check is:
```c
max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10
```
In GitHub markdown math-speak, writing $M$ for `max_str_digits`, $L$ for `PyLong_SHIFT` and $s$ for `size_a`, that check is:
$$\left\lfloor\frac{M}{3L}\right\rfloor \le \left\lfloor\frac{s - 11}{10}\right\rfloor$$
From this it follows that
$$\frac{M}{3L} < \frac{s-1}{10}$$
hence that
$$\frac{L(s-1)}{M} > \frac{10}{3} > \log_2(10).$$
So
$$2^{L(s-1)} > 10^M.$$
But our input integer $a$ satisfies $|a| \ge 2^{L(s-1)}$, so $|a|$ is larger than $10^M$. This shows that we don't accidentally capture anything _below_ the intended limit in the check.
<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
* gh-68163: Correct conversion of Rational instances to float
Also document that numerator/denominator properties are instances of Integral.
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
Link to #93884
* Test with some large negative and positive values(out of range of a longlong,i.e.[-2\*\*63, 2\*\*63-1])
* Test with objects of non-int type
Automerge-Triggered-By: GH:mdickinson
Integer to and from text conversions via CPython's bignum `int` type is not safe against denial of service attacks due to malicious input. Very large input strings with hundred thousands of digits can consume several CPU seconds.
This PR comes fresh from a pile of work done in our private PSRT security response team repo.
Signed-off-by: Christian Heimes [Red Hat] <christian@python.org>
Tons-of-polishing-up-by: Gregory P. Smith [Google] <greg@krypto.org>
Reviews via the private PSRT repo via many others (see the NEWS entry in the PR).
<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->
I wrote up [a one pager for the release managers](https://docs.google.com/document/d/1KjuF_aXlzPUxTK4BMgezGJ2Pn7uevfX7g0_mvgHlL7Y/edit#). Much of that text wound up in the Issue. Backports PRs already exist. See the issue for links.
⚠️⚠️ Note for reviewers, hackers and fellow systems/low-level/compiler engineers ⚠️⚠️
If you have a lot of experience with this kind of shenanigans and want to improve the **first** version, **please make a PR against my branch** or **reach out by email** or **suggest code changes directly on GitHub**.
If you have any **refinements or optimizations** please, wait until the first version is merged before starting hacking or proposing those so we can keep this PR productive.
- pre-build Emscripten ports and system libraries
- check for broken EMSDK versions
- use EMSDK's node for wasm32-emscripten
- warn when PKG_CONFIG_PATH is set
- add support level information
datetime.isoformat generates the tzoffset with colons, but there
was no format code to make strftime output the same format.
for simplicity and consistency the %:z formatting behaves mostly
as %z, with the exception of adding colons. this includes the
dynamic behaviour of adding seconds and microseconds only when
needed (when not 0).
this fixes the still open "generate" part of this issue:
https://github.com/python/cpython/issues/69142
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
find_unused_port() has an inherent race condition, but we can't use
bind_port() as that uses .getsockname() which this test is exercising.
Try binding to unused ports a few times before failing.
Signed-off-by: Ross Burton <ross.burton@arm.com>
* gh-93503: Add APIs to set profiling and tracing functions in all threads in the C-API
* Use a separate API
* Fix NEWS entry
* Add locks around the loop
* Document ignoring exceptions
* Use the new APIs in the sys module
* Update docs
* Add support for the BOLT post-link binary optimizer
Using [bolt](https://github.com/llvm/llvm-project/tree/main/bolt)
provides a fairly large speedup without any code or functionality
changes. It provides roughly a 1% speedup on pyperformance, and a
4% improvement on the Pyston web macrobenchmarks.
It is gated behind an `--enable-bolt` configure arg because not all
toolchains and environments are supported. It has been tested on a
Linux x86_64 toolchain, using llvm-bolt built from the LLVM 14.0.6
sources (their binary distribution of this version did not include bolt).
Compared to [a previous attempt](https://github.com/faster-cpython/ideas/issues/224),
this commit uses bolt's preferred "instrumentation" approach, as well as adds some non-PIE
flags which enable much better optimizations from bolt.
The effects of this change are a bit more dependent on CPU microarchitecture
than other changes, since it optimizes i-cache behavior which seems
to be a bit more variable between architectures. The 1%/4% numbers
were collected on an Intel Skylake CPU, and on an AMD Zen 3 CPU I
got a slightly larger speedup (2%/4%), and on a c6i.xlarge EC2 instance
I got a slightly lower speedup (1%/3%).
The low speedup on pyperformance is not entirely unexpected, because
BOLT improves i-cache behavior, and the benchmarks in the pyperformance
suite are small and tend to fit in i-cache.
This change uses the existing pgo profiling task (`python -m test --pgo`),
though I was able to measure about a 1% macrobenchmark improvement by
using the macrobenchmarks as the training task. I personally think that
both the PGO and BOLT tasks should be updated to use macrobenchmarks,
but for the sake of splitting up the work this PR uses the existing pgo task.
* Simplify the build flags
* Add a NEWS entry
* Update Makefile.pre.in
Co-authored-by: Dong-hee Na <donghee.na92@gmail.com>
* Update configure.ac
Co-authored-by: Dong-hee Na <donghee.na92@gmail.com>
* Add myself to ACKS
* Add docs
* Other review comments
* fix tab/space issue
* Make it more clear that --enable-bolt is experimental
* Add link to bolt's github page
Co-authored-by: Dong-hee Na <donghee.na92@gmail.com>
* Treat tp_weakref and tp_dictoffset like other opaque slots for multiple inheritance.
* Document Py_TPFLAGS_MANAGED_DICT and Py_TPFLAGS_MANAGED_WEAKREF in what's new.
When a task catches CancelledError and raises some other error,
the other error should not silently be suppressed.
Any scenario where a task crashes in cleanup upon cancellation
will now result in an ExceptionGroup wrapping the crash(es)
instead of propagating CancelledError and ignoring the side errors.
NOTE: This represents a change in behavior (hence the need to
change several tests). But it is only an edge case.
Co-authored-by: Thomas Grainger <tagrain@gmail.com>
- On WASI `ENOTCAPABLE` is now mapped to `PermissionError`.
- The `errno` modules exposes the new error number.
- `getpath.py` now ignores `PermissionError` when it cannot open landmark
files `pybuilddir.txt` and `pyenv.cfg`.
If kernel fips is enabled, we get permission error upon doing
`import crypt`. So, if kernel fips is enabled, disable the
unallowed hashing methods.
Python 3.9.1 (default, May 10 2022, 11:36:26)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import crypt
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.9/crypt.py", line 117, in <module>
_add_method('MD5', '1', 8, 34)
File "/usr/lib/python3.9/crypt.py", line 94, in _add_method
result = crypt('', salt)
File "/usr/lib/python3.9/crypt.py", line 82, in crypt
return _crypt.crypt(word, salt)
PermissionError: [Errno 1] Operation not permitted
Signed-off-by: Shreenidhi Shedi <sshedi@vmware.com>
* Make sure that tp_dictoffset is correct with Py_TPFLAGS_MANAGED_DICT is set.
* Avoid traversing managed dict twice when subclassing class with Py_TPFLAGS_MANAGED_DICT set.
Automate WASM build with a new Python script. The script provides
several build profiles with configure flags for Emscripten flavors
and WASI. The script can detect and use Emscripten SDK and WASI SDK from
default locations or env vars.
``configure`` now detects Node arguments and creates HOSTRUNNER
arguments for Node 16. It also sets correct arguments for
``wasm64-emscripten``.
Co-authored-by: Brett Cannon <brett@python.org>
Have pathlib use `os.path.join()` to join arguments to the `PurePath` initialiser, which fixes a minor bug when handling relative paths with drives.
Previously:
```python
>>> from pathlib import PureWindowsPath
>>> a = 'C:/a/b'
>>> b = 'C:x/y'
>>> PureWindowsPath(a, b)
PureWindowsPath('C:x/y')
```
Now:
```python
>>> PureWindowsPath(a, b)
PureWindowsPath('C:/a/b/x/y')
```
An unrecognized format character in PyUnicode_FromFormat() and
PyUnicode_FromFormatV() now sets a SystemError.
In previous versions it caused all the rest of the format string to be
copied as-is to the result string, and any extra arguments discarded.
gh-93243
This PR is required to reduce diffs of the following porting (no need to either maintain documentation and tests consistent with each porting step, or try to port everything and remove smtpd in a single PR).
Automerge-Triggered-By: GH:warsaw
Have `pathlib.WindowsPath.is_mount()` call `ntpath.ismount()`. Previously it raised `NotImplementedError` unconditionally.
https://bugs.python.org/issue42777
Remove the "configure --with-cxx-main" build option: it didn't work
for many years. Remove the MAINCC variable from configure and
Makefile.
The MAINCC variable was added by the issue gh-42471: commit
0f48d98b74. Previously, --with-cxx-main
was named --with-cxx.
Keep CXX and LDCXXSHARED variables, even if they are no longer used
by Python build system.
Adds a regression test for an re slowdown observed by rjsmin.
Uses multiprocessing to kill the test after SHORT_TIMEOUT.
Co-authored-by: Oleg Iarygin <dralife@yandex.ru>
Co-authored-by: Christian Heimes <christian@python.org>
* Add test for inheriting explicit __dict__ and weakref.
* Restore 3.10 behavior for multiple inheritance of C extension classes that store their dictionary at the end of the struct.
The API documentation for [findtext](https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element.findtext) states that this function gives back an empty string on "no text content." With the previous implementation, this would give back a empty string even on text content values such as 0 or False. This patch attempts to resolve that by only giving back an empty string if the text attribute is set to `None`. Resolves#91447.
Automerge-Triggered-By: GH:gvanrossum
If one selects whole lines, as the sidebar makes easy, do not
add an extra line. Only move the end of a selection to the
beginning of the next line when not already at the beginning
of a line. (Also improve the surrounding code.)
Move `Select All` above `Cut` as it is used with `Cut` and `Copy` but not `Paste`. Add a separator between `Replace` and `Go to Line` to separate items that belong to the 'Edit-find' (above) and 'Edit-show' (below) IDLE github project topics.
When keyword argument name is an instance of a str subclass with
overloaded methods __eq__ and __hash__, the former code could not find
the name of an extraneous keyword argument to report an error, and
_PyArg_UnpackKeywords() returned success without setting the
corresponding cell in the linearized arguments array. But since the number
of expected initialized cells is determined as the total number of passed
arguments, this lead to reading NULL as a keyword parameter value, that
caused SystemError or crash or other undesired behavior.
- check for ``dup()`` libc function
- handle missing ``F_DUPFD`` in ``dup2()`` replacement function
- add workaround for WASI libc bug in MSG_TRUNC
- ESHUTDOWN is missing, use EPIPE instead
- POLLPRI is missing, define as 0 (no-op)
* Add _Py_memory_repeat function to pycore_list
* Add _Py_RefcntAdd function to pycore_object
* Use the new functions in tuplerepeat, list_repeat, and list_inplace_repeat
* gh-95218: Move tests for importlib.resources into test_importlib.resources.
* Also update makefile
* Include test_importlib/resources in code ownership rule.
This PR partially reverts gh-24421 (PR) and fixes the remaining concerns
given in gh-93044 (issue):
- keyword arguments are passed as positional arguments to factory()
- if an argument is not passed to sqlite3.connect(), its default value
is passed to factory()
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
* gh-94949: Disallow parsing parenthesised ctx manager with old feature_version
* 📜🤖 Added by blurb_it.
* Allow it with feature_version=(3, 9) as well
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Support for bytes broke sometime between Python 3.2 and 3.6 and has been broken ever since. Trying to bring back supports is surprisingly difficult in the face of -b and checking for keys in sys.path_importer_cache. Since the support was broken for so long, trying to overcome the difficulty of bringing back the support has been deemed not worth it.
Co-authored-by: Eryk Sun <eryksun@gmail.com>
Co-authored-by: Brett Cannon <brett@python.org>
When binding a unix socket to an empty address on Linux, the socket is
automatically bound to an available address in the abstract namespace.
>>> s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
>>> s.bind("")
>>> s.getsockname()
b'\x0075499'
Since python 3.9, the socket is bound to the one address:
>>> s.getsockname()
b'\x00'
And trying to bind multiple sockets will fail with:
Traceback (most recent call last):
File "/home/nsoffer/src/cpython/Lib/test/test_socket.py", line 5553, in testAutobind
s2.bind("")
OSError: [Errno 98] Address already in use
Added 2 tests:
- Auto binding empty address on Linux
- Failing to bind an empty address on other platforms
Fixes f6b3a07b7d (bpo-44493: Add missing terminated NUL in sockaddr_un's length (GH-26866)