The switch cases (really TARGET(opcode) macros) have been moved from ceval.c to generated_cases.c.h. That file is generated from instruction definitions in bytecodes.c (which impersonates a C file so the C code it contains can be edited without custom support in e.g. VS Code).
The code generator lives in Tools/cases_generator (it has a README.md explaining how it works). The DSL used to describe the instructions is a work in progress, described in https://github.com/faster-cpython/ideas/blob/main/3.12/interpreter_definition.md.
This is surely a work-in-progress. An easy next step could be auto-generating super-instructions.
**IMPORTANT: Merge Conflicts**
If you get a merge conflict for instruction implementations in ceval.c, your best bet is to port your changes to bytecodes.c. That file looks almost the same as the original cases, except instead of `TARGET(NAME)` it uses `inst(NAME)`, and the trailing `DISPATCH()` call is omitted (the code generator adds it automatically).
The uuid.getnode() function has multiple implementations, tested sequentially.
The ifconfig implementation was incorrect and always failed: fix it.
In practice, functions of libuuid library are preferred, if available:
uuid_generate_time_safe(), uuid_create() or uuid_generate_time().
Co-authored-by: Dong-hee Na <donghee.na92@gmail.com>
The Python test suite now fails wit exit code 4 if no tests ran. It
should help detecting typos in test names and test methods.
* Add "EXITCODE_" constants to Lib/test/libregrtest/main.py.
* Fix a typo: "NO TEST RUN" becomes "NO TESTS RAN"
For wasmtime 2.0, the stack depth cost is 6% higher. This causes the default max `marshal` recursion depth to blow the stack.
As the default marshal depth is 2000 and Windows is set to 1000, split the difference and choose 1500 for WASI to be safe.
Fix subscription of type aliases containing bare generic types or types
like TypeVar: for example tuple[A, T][int] and tuple[TypeVar, T][int],
where A is a generic type, and T is a type variable.
Previously, the optional restrictions on subinterpreters were: disallow fork, subprocess, and threads. By default, we were disallowing all three for "isolated" interpreters. We always allowed all three for the main interpreter and those created through the legacy `Py_NewInterpreter()` API.
Those settings were a bit conservative, so here we've adjusted the optional restrictions to: fork, exec, threads, and daemon threads. The default for "isolated" interpreters disables fork, exec, and daemon threads. Regular threads are allowed by default. We continue always allowing everything For the main interpreter and the legacy API.
In the code, we add `_PyInterpreterConfig.allow_exec` and `_PyInterpreterConfig.allow_daemon_threads`. We also add `Py_RTFLAGS_DAEMON_THREADS` and `Py_RTFLAGS_EXEC`.
* As most of `test_embed` now uses `Py_InitializeFromConfig`, add
a specific test case to cover `Py_Initialize` (and `Py_InitializeEx`)
* Rename `_testembed` init helper to clarify the API used
* Add a `PyConfig_Clear` call in `Py_InitializeEx` to make
the code more obviously correct (it already didn't leak as
none of the dynamically allocated config fields were being
populated, but it's clearer if the wrappers follow the
documented API usage guidelines)
By default, :meth:`pathlib.PurePath.relative_to` doesn't deal with paths that are not a direct prefix of the other, raising an exception in that instance. This change adds a *walk_up* parameter that can be set to allow for using ``..`` to calculate the relative path.
example:
```
>>> p = PurePosixPath('/etc/passwd')
>>> p.relative_to('/etc')
PurePosixPath('passwd')
>>> p.relative_to('/usr')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "pathlib.py", line 940, in relative_to
raise ValueError(error_message.format(str(self), str(formatted)))
ValueError: '/etc/passwd' does not start with '/usr'
>>> p.relative_to('/usr', strict=False)
PurePosixPath('../etc/passwd')
```
https://bugs.python.org/issue40358
Automerge-Triggered-By: GH:brettcannon
Change FOR_ITER to have the same stack effect regardless of whether it branches or not.
Performance is unchanged as FOR_ITER (and specialized forms jump over the cleanup code).
(see https://github.com/python/cpython/issues/98608)
This change does the following:
1. change the argument to a new `_PyInterpreterConfig` struct
2. rename the function to `_Py_NewInterpreterFromConfig()`, inspired by `Py_InitializeFromConfig()` (takes a `_PyInterpreterConfig` instead of `isolated_subinterpreter`)
3. split up the boolean `isolated_subinterpreter` into the corresponding multiple granular settings
* allow_fork
* allow_subprocess
* allow_threads
4. add `PyInterpreterState.feature_flags` to store those settings
5. add a function for checking if a feature is enabled on an opaque `PyInterpreterState *`
6. drop `PyConfig._isolated_interpreter`
The existing default (see `Py_NewInterpeter()` and `Py_Initialize*()`) allows fork, subprocess, and threads and the optional "isolated" interpreter (see the `_xxsubinterpreters` module) disables all three. None of that changes here; the defaults are preserved.
Note that the given `_PyInterpreterConfig` will not be used outside `_Py_NewInterpreterFromConfig()`, nor preserved. This contrasts with how `PyConfig` is currently preserved, used, and even modified outside `Py_InitializeFromConfig()`. I'd rather just avoid that mess from the start for `_PyInterpreterConfig`. We can preserve it later if we find an actual need.
This change allows us to follow up with a number of improvements (e.g. stop disallowing subprocess and support disallowing exec instead).
(Note that this PR adds "private" symbols. We'll probably make them public, and add docs, in a separate change.)
Add Python implementations of certain longobject.c functions. These use
asymptotically faster algorithms that can be used for operations on
integers with many digits. In those cases, the performance overhead of
the Python implementation is not significant since the asymptotic
behavior is what dominates runtime. Functions provided by this module
should be considered private and not part of any public API.
Co-author: Tim Peters <tim.peters@gmail.com>
Co-author: Mark Dickinson <dickinsm@gmail.com>
Co-author: Bjorn Martinsson
Functions re.sub() and re.subn() and corresponding re.Pattern methods
are now 2-3 times faster for replacement strings containing group references.
Closes#91524
Primarily authored by serhiy-storchaka Serhiy Storchaka
Minor-cleanups-by: Gregory P. Smith [Google] <greg@krypto.org>
On Windows, when the Python test suite is run with the -jN option,
the ANSI code page is now used as the encoding for the stdout
temporary file, rather than using UTF-8 which can lead to decoding
errors.
Linux abstract sockets are insecure as they lack any form of filesystem
permissions so their use allows anyone on the system to inject code into
the process.
This removes the default preference for abstract sockets in
multiprocessing introduced in Python 3.9+ via
https://github.com/python/cpython/pull/18866 while fixing
https://github.com/python/cpython/issues/84031.
Explicit use of an abstract socket by a user now generates a
RuntimeWarning. If we choose to keep this warning, it should be
backported to the 3.7 and 3.8 branches.
* The compiler analyzes the usage of the first 64 local variables all at once using bit masks.
* Local variables beyond the first 64 are only partially analyzed, achieving linear time.
Added os.setns and os.unshare to easily switch between namespaces
on Linux.
Co-authored-by: Christian Heimes <christian@python.org>
Co-authored-by: CAM Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Victor Stinner <vstinner@python.org>
Make sys.setprofile() and sys.settrace() functions reentrant. They
can no long fail with: RuntimeError("Cannot install a trace function
while another trace function is being installed").
Make _PyEval_SetTrace() and _PyEval_SetProfile() functions reentrant,
rather than detecting and rejecting reentrant calls. Only delete the
reference to function arguments once the new function is fully set,
when a reentrant call is safe. Call also _PySys_Audit() earlier.
The os module and the PyUnicode_FSDecoder() function no longer accept
bytes-like paths, like bytearray and memoryview types: only the exact
bytes type is accepted for bytes strings.
Change summary:
+ There is now a `gzip.READ_BUFFER_SIZE` constant that is 128KB. Other programs that read in 128KB chunks: pigz and cat. So this seems best practice among good programs. Also it is faster than 8 kb chunks.
+ a zlib._ZlibDecompressor was added. This is the _bz2.BZ2Decompressor ported to zlib. Since the zlib.Decompress object is better for in-memory decompression, the _ZlibDecompressor is hidden. It only makes sense in file decompression, and that is already implemented now in the gzip library. No need to bother the users with this.
+ The ZlibDecompressor uses the older Cpython arrange_output_buffer functions, as those are faster and more appropriate for the use case.
+ GzipFile.read has been optimized. There is no longer a `unconsumed_tail` member to write back to padded file. This is instead handled by the ZlibDecompressor itself, which has an internal buffer. `_add_read_data` has been inlined, as it was just two calls.
EDIT: While I am adding improvements anyway, I figured I could add another one-liner optimization now to the python -m gzip application. That read chunks in io.DEFAULT_BUFFER_SIZE previously, but has been updated now to use READ_BUFFER_SIZE chunks.
#97530 fixed IDLE tests possibly crashing on a Mac without a GUI.
But it resulted in IDLE not starting in 3.10.8, 3.12.0a1, and
Microsoft Python 3.10.2288.0 when test/* is not installed.
After this patch, test.* is only imported when testing on Mac.
This is the next step for deprecating child watchers.
Until we've removed the API completely we have to use it, so this PR is mostly suppressing a lot of warnings when using the API internally.
Once the child watcher API is totally removed, the two child watcher implementations we actually use and need (Pidfd and Thread) will be turned into internal helpers.
On macOS, fix a crash in syslog.syslog() in multi-threaded
applications. On macOS, the libc syslog() function is not
thread-safe, so syslog.syslog() no longer releases the GIL to call
it.
This seems pretty straightforward. The issue mentions other calls in mmapmodule that we could release the GIL on, but those are in methods where we'd need to be careful to ensure that something sensible happens if those are called concurrently. In prior art, note that #12073 released the GIL for munmap. In a toy benchmark, I see the speedup you'd expect from doing this.
Automerge-Triggered-By: GH:gvanrossum
* gh-96821: Fix undefined behaviour in `audioop.c`
Left-shifting negative numbers is undefined behaviour.
Fortunately, multiplication works just as well, is defined behaviour,
and gets compiled to the same machine code as before by optimizing
compilers.
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
There is no reason for this watcher to be attached to any particular loop.
This should make it safe to use regardless of the lifetime of the event loop running in the main thread
(relative to other loops).
Co-authored-by: Yury Selivanov <yury@edgedb.com>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
In `_warnings.c`, in the C equivalent of `warnings.warn_explicit()`, if the module globals are given (and not None), the warning will attempt to get the source line for the issued warning. To do this, it needs the module's loader.
Previously, it would only look up `__loader__` in the module globals. In https://github.com/python/cpython/issues/86298 we want to defer to the `__spec__.loader` if available.
The first step on this journey is to check that `loader == __spec__.loader` and issue another warning if it is not. This commit does that.
Since this is a PoC, only manual testing for now.
```python
# /tmp/foo.py
import warnings
import bar
warnings.warn_explicit(
'warning!',
RuntimeWarning,
'bar.py', 2,
module='bar knee',
module_globals=bar.__dict__,
)
```
```python
# /tmp/bar.py
import sys
import os
import pathlib
# __loader__ = pathlib.Path()
```
Then running this: `./python.exe -Wdefault /tmp/foo.py`
Produces:
```
bar.py:2: RuntimeWarning: warning!
import os
```
Uncomment the `__loader__ = ` line in `bar.py` and try it again:
```
sys:1: ImportWarning: Module bar; __loader__ != __spec__.loader (<_frozen_importlib_external.SourceFileLoader object at 0x109f7dfa0> != PosixPath('.'))
bar.py:2: RuntimeWarning: warning!
import os
```
Automerge-Triggered-By: GH:warsaw
This is a small performance improvement, especially for one or two hot
places such as _handle_fromlist() that are called a lot and the
.format() method was being used just to join two strings with a dot.
Otherwise it is merely a readability improvement.
We keep `_ERR_MSG` and `_ERR_MSG_PREFIX` as those may be used elsewhere for canonical looking error messages.
Right now, the tokenizer only returns type and two pointers to the start and end of the token.
This PR modifies the tokenizer to return the type and set all of the necessary information,
so that the parser does not have to this.
The macOS 13 SDK includes support for the `mkfifoat` and `mknodat` system calls.
Using the `dir_fd` option with either `os.mkfifo` or `os.mknod` could result in a
segfault if cpython is built with the macOS 13 SDK but run on an earlier
version of macOS. Prevent this by adding runtime support for detection of
these system calls ("weaklinking") as is done for other newer syscalls on
macOS.
* improve performance of get_proxies_environment when there are many environment variables
* 📜🤖 Added by blurb_it.
* fix case of short env name
* fix formatting
* fix whitespace
* whitespace
* Update Lib/urllib/request.py
Co-authored-by: Carl Meyer <carl@oddbird.net>
* Update Lib/urllib/request.py
Co-authored-by: Carl Meyer <carl@oddbird.net>
* Update Lib/urllib/request.py
Co-authored-by: Carl Meyer <carl@oddbird.net>
* Update Lib/urllib/request.py
Co-authored-by: Carl Meyer <carl@oddbird.net>
* whitespace
* Update Misc/NEWS.d/next/Library/2022-04-15-11-29-38.gh-issue-91539.7WgVuA.rst
Co-authored-by: Carl Meyer <carl@oddbird.net>
* Update Lib/urllib/request.py
Co-authored-by: Carl Meyer <carl@oddbird.net>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Carl Meyer <carl@oddbird.net>
Relevant tests moved from test_exceptions to test_traceback to be able to
compare both implementations.
Co-authored-by: Carl Friedrich Bolz-Tereick <cfbolz@gmx.de>
Remove the sys.getdxp() function and the Tools/scripts/analyze_dxp.py
script. DXP stands for "dynamic execution pairs". They were related
to DYNAMIC_EXECUTION_PROFILE and DXPAIRS macros which have been
removed in Python 3.11. Python can now be built with "./configure
--enable-pystats" to gather statistics on Python opcodes.
I was perusing this file, and noticed that this part of the documentation is slightly out of date: the `struct` items in this TOML file currently contain `struct_abi_kind` members, which distinguish between the different types of ABI compatibility described in the comment.
I've updated the comment to reflect this.
dataclass used to get the annotations on a class object using
cls.__dict__.get('__annotations__'). Now that it always imports
inspect, it can use inspect.get_annotations, which is modern
best practice for coping with annotations.
It had to live as a global outside of PyConfig for stable ABI reasons in
the pre-3.12 backports.
This removes the `_Py_global_config_int_max_str_digits` and gets rid of
the equivalent field in the internal `struct _is PyInterpreterState` as
code can just use the existing nested config struct within that.
Adds tests to verify unique settings and configs in subinterpreters.
Remove the Tools/demo/ directory which contained old demo scripts. A
copy can be found in the old-demos project:
https://github.com/gvanrossum/old-demos
Remove the following old demo scripts:
* beer.py
* eiffel.py
* hanoi.py
* life.py
* markov.py
* mcast.py
* queens.py
* redemo.py
* rpython.py
* rpythond.py
* sortvisu.py
* spreadsheet.py
* vector.py
Changes:
* Remove a reference to the redemo.py script in the regex howto
documentation.
* Remove a reference to the removed Tools/demo/ directory in the
curses documentation.
* Update PC/layout/ to remove the reference to Tools/demo/ directory.
* gh-97740: Fix bang in Sphinx C domain ref target syntax
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
* Add NEWS entry for C domain bang fix
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Fix the Python path configuration used to initialized sys.path at
Python startup. Paths are no longer encoded to UTF-8/strict to avoid
encoding errors if it contains surrogate characters (bytes paths are
decoded with the surrogateescape error handler).
getpath_basename() and getpath_dirname() functions no longer encode
the path to UTF-8/strict, but work directly on Unicode strings. These
functions now use PyUnicode_FindChar() and PyUnicode_Substring() on
the Unicode path, rather than strrchr() on the encoded bytes string.
Use output from a 3.10+ REPL, showing invalid range, for the
SyntaxError examples in the tutorial introduction page.
Automerge-Triggered-By: GH:iritkatriel
Fix a shell code injection vulnerability in the
get-remote-certificate.py example script. The script no longer uses a
shell to run "openssl" commands. Issue reported and initial fix by
Caleb Shortt.
Remove the Windows code path to send "quit" on stdin to the "openssl
s_client" command: use DEVNULL on all platforms instead.
Co-authored-by: Caleb Shortt <caleb@rgauge.com>
Fix multiplying a list by an integer (list *= int): detect the
integer overflow when the new allocated length is close to the
maximum size. Issue reported by Jordan Limor.
list_resize() now checks for integer overflow before multiplying the
new allocated length by the list item size (sizeof(PyObject*)).
Previously, checkbuttons in different parent widgets could have the same
short name and share the same state if arguments "name" and "variable" are
not specified. Now they are globally unique.
Fix command line parsing: reject "-X int_max_str_digits" option with
no value (invalid) when the PYTHONINTMAXSTRDIGITS environment
variable is set to a valid limit.
This PR fixes undefined behaviour in the struct module unpacking support functions `bu_longlong`, `lu_longlong`, `bu_int` and `lu_int`; thanks to @kumaraditya303 for finding these.
The fix is to accumulate the bytes in an unsigned integer type instead of a signed integer type, then to convert to the appropriate signed type. In cases where the width matches, that conversion will typically be compiled away to a no-op.
(Evidence from Godbolt: https://godbolt.org/z/5zvxodj64 .)
To make the conversions efficient, I've specialised the relevant functions for their output size: for `bu_longlong` and `lu_longlong`, this only entails checking that the output size is indeed `8`. But `bu_int` and `lu_int` were used for format sizes `2` and `4` - I've split those into two separate functions each.
No tests, because all of the affected cases are already exercised by the test suite.
This is a preliminary PR to refactor `PyLong_FromString` which is currently quite messy and has spaghetti like code that mixes up different concerns as well as duplicating logic.
In particular:
- `PyLong_FromString` now only handles sign, base and prefix detection and calls a new function `long_from_string_base` to parse the main body of the string.
- The `long_from_string_base` function handles all string validation and then calls `long_from_binary_base` or a new function `long_from_non_binary_base` to construct the actual `PyLong`.
- The existing `long_from_binary_base` function is simplified by factoring duplicated logic to `long_from_string_base`.
- The new function `long_from_non_binary_base` factors out much of the code from `PyLong_FromString` including in particular the quadratic algorithm reffered to in gh-95778 so that this can be seen separately from unrelated concerns such as string validation.
HTTP links in the "HISTORY OF THE SOFTWARE" section of Doc/license.rst
were converted to HTTPS in f62ff97f31.
But there were other copies of these links, which were left HTTP links.
The main problem was that an unluckily timed task cancellation could cause
the semaphore to be stuck. There were also doubts about strict FIFO ordering
of tasks allowed to pass.
The Semaphore implementation was rewritten to be more similar to Lock.
Many tests for edge cases (including cancellation) were added.
At Python exit, sometimes a thread holding the GIL can wait forever
for a thread (usually a daemon thread) which requested to drop the
GIL, whereas the thread already exited. To fix the race condition,
the thread which requested the GIL drop now resets its request before
exiting.
take_gil() now calls RESET_GIL_DROP_REQUEST() before
PyThread_exit_thread() if it called SET_GIL_DROP_REQUEST to fix a
race condition with drop_gil().
Issue discovered and analyzed by Mingliang ZHAO.
This reverts commit 0587810698.
Reason: This broke buildbots (some warnings added by that commit are turned to errors in the SSL buildbot).
Repro: ./python Lib/test/ssltests.py
- Improve error message when parameter without a default follows one with a default
- Show same error message when positional-only params precede the default/non-default sequence
Warn on loop initialization, when setting the wakeup fd disturbs a previously set wakeup fd, and on loop closing, when upon resetting the wakeup fd, we find it has been changed by someone else.
Previously codeop.compile_command() emitted compiler warnings (SyntaxWarning or
DeprecationWarning) and raised a SyntaxError for incomplete input containing
a potentially incorrect code. Now it always returns None for incomplete input
without emitting any warnings.
* fix: annotate `pstats.FunctionProfile.ncalls` as `str`
This change aligns the type annotation of `pstats.FunctionProfile.ncalls` with its runtime type.
The test file, a modified version of Lib/test/audiodata/pluck-pcm24.wav, was provided by Andrea Celletti on the bug tracker.
Co-authored-by: Zackery Spytz <zspytz@gmail.com>