This reduces confusion between jumps at the bytecode level
(e.g. JUMPTO(), JUMPBY(), and various JUMP_*() opcodes)
and jumps in the C code (which are 'goto' statements).
Previously, the optional restrictions on subinterpreters were: disallow fork, subprocess, and threads. By default, we were disallowing all three for "isolated" interpreters. We always allowed all three for the main interpreter and those created through the legacy `Py_NewInterpreter()` API.
Those settings were a bit conservative, so here we've adjusted the optional restrictions to: fork, exec, threads, and daemon threads. The default for "isolated" interpreters disables fork, exec, and daemon threads. Regular threads are allowed by default. We continue always allowing everything For the main interpreter and the legacy API.
In the code, we add `_PyInterpreterConfig.allow_exec` and `_PyInterpreterConfig.allow_daemon_threads`. We also add `Py_RTFLAGS_DAEMON_THREADS` and `Py_RTFLAGS_EXEC`.
* As most of `test_embed` now uses `Py_InitializeFromConfig`, add
a specific test case to cover `Py_Initialize` (and `Py_InitializeEx`)
* Rename `_testembed` init helper to clarify the API used
* Add a `PyConfig_Clear` call in `Py_InitializeEx` to make
the code more obviously correct (it already didn't leak as
none of the dynamically allocated config fields were being
populated, but it's clearer if the wrappers follow the
documented API usage guidelines)
Change FOR_ITER to have the same stack effect regardless of whether it branches or not.
Performance is unchanged as FOR_ITER (and specialized forms jump over the cleanup code).
(see https://github.com/python/cpython/issues/98608)
This change does the following:
1. change the argument to a new `_PyInterpreterConfig` struct
2. rename the function to `_Py_NewInterpreterFromConfig()`, inspired by `Py_InitializeFromConfig()` (takes a `_PyInterpreterConfig` instead of `isolated_subinterpreter`)
3. split up the boolean `isolated_subinterpreter` into the corresponding multiple granular settings
* allow_fork
* allow_subprocess
* allow_threads
4. add `PyInterpreterState.feature_flags` to store those settings
5. add a function for checking if a feature is enabled on an opaque `PyInterpreterState *`
6. drop `PyConfig._isolated_interpreter`
The existing default (see `Py_NewInterpeter()` and `Py_Initialize*()`) allows fork, subprocess, and threads and the optional "isolated" interpreter (see the `_xxsubinterpreters` module) disables all three. None of that changes here; the defaults are preserved.
Note that the given `_PyInterpreterConfig` will not be used outside `_Py_NewInterpreterFromConfig()`, nor preserved. This contrasts with how `PyConfig` is currently preserved, used, and even modified outside `Py_InitializeFromConfig()`. I'd rather just avoid that mess from the start for `_PyInterpreterConfig`. We can preserve it later if we find an actual need.
This change allows us to follow up with a number of improvements (e.g. stop disallowing subprocess and support disallowing exec instead).
(Note that this PR adds "private" symbols. We'll probably make them public, and add docs, in a separate change.)
Add Python implementations of certain longobject.c functions. These use
asymptotically faster algorithms that can be used for operations on
integers with many digits. In those cases, the performance overhead of
the Python implementation is not significant since the asymptotic
behavior is what dominates runtime. Functions provided by this module
should be considered private and not part of any public API.
Co-author: Tim Peters <tim.peters@gmail.com>
Co-author: Mark Dickinson <dickinsm@gmail.com>
Co-author: Bjorn Martinsson
* The compiler analyzes the usage of the first 64 local variables all at once using bit masks.
* Local variables beyond the first 64 are only partially analyzed, achieving linear time.
Make sys.setprofile() and sys.settrace() functions reentrant. They
can no long fail with: RuntimeError("Cannot install a trace function
while another trace function is being installed").
Make _PyEval_SetTrace() and _PyEval_SetProfile() functions reentrant,
rather than detecting and rejecting reentrant calls. Only delete the
reference to function arguments once the new function is fully set,
when a reentrant call is safe. Call also _PySys_Audit() earlier.
The `}` marked with `/* End instructions */` is the end of the switch.
There is another pair of `{}` around the switch, which is vestigial
from ancient times when it was `for (;;) { switch (opcode) { ... } }`.
All `DISPATCH` macro calls should be inside that pair.
In `_warnings.c`, in the C equivalent of `warnings.warn_explicit()`, if the module globals are given (and not None), the warning will attempt to get the source line for the issued warning. To do this, it needs the module's loader.
Previously, it would only look up `__loader__` in the module globals. In https://github.com/python/cpython/issues/86298 we want to defer to the `__spec__.loader` if available.
The first step on this journey is to check that `loader == __spec__.loader` and issue another warning if it is not. This commit does that.
Since this is a PoC, only manual testing for now.
```python
# /tmp/foo.py
import warnings
import bar
warnings.warn_explicit(
'warning!',
RuntimeWarning,
'bar.py', 2,
module='bar knee',
module_globals=bar.__dict__,
)
```
```python
# /tmp/bar.py
import sys
import os
import pathlib
# __loader__ = pathlib.Path()
```
Then running this: `./python.exe -Wdefault /tmp/foo.py`
Produces:
```
bar.py:2: RuntimeWarning: warning!
import os
```
Uncomment the `__loader__ = ` line in `bar.py` and try it again:
```
sys:1: ImportWarning: Module bar; __loader__ != __spec__.loader (<_frozen_importlib_external.SourceFileLoader object at 0x109f7dfa0> != PosixPath('.'))
bar.py:2: RuntimeWarning: warning!
import os
```
Automerge-Triggered-By: GH:warsaw
This is a small performance improvement, especially for one or two hot
places such as _handle_fromlist() that are called a lot and the
.format() method was being used just to join two strings with a dot.
Otherwise it is merely a readability improvement.
We keep `_ERR_MSG` and `_ERR_MSG_PREFIX` as those may be used elsewhere for canonical looking error messages.
Right now, the tokenizer only returns type and two pointers to the start and end of the token.
This PR modifies the tokenizer to return the type and set all of the necessary information,
so that the parser does not have to this.
Remove the sys.getdxp() function and the Tools/scripts/analyze_dxp.py
script. DXP stands for "dynamic execution pairs". They were related
to DYNAMIC_EXECUTION_PROFILE and DXPAIRS macros which have been
removed in Python 3.11. Python can now be built with "./configure
--enable-pystats" to gather statistics on Python opcodes.
It had to live as a global outside of PyConfig for stable ABI reasons in
the pre-3.12 backports.
This removes the `_Py_global_config_int_max_str_digits` and gets rid of
the equivalent field in the internal `struct _is PyInterpreterState` as
code can just use the existing nested config struct within that.
Adds tests to verify unique settings and configs in subinterpreters.
Fix command line parsing: reject "-X int_max_str_digits" option with
no value (invalid) when the PYTHONINTMAXSTRDIGITS environment
variable is set to a valid limit.
At Python exit, sometimes a thread holding the GIL can wait forever
for a thread (usually a daemon thread) which requested to drop the
GIL, whereas the thread already exited. To fix the race condition,
the thread which requested the GIL drop now resets its request before
exiting.
take_gil() now calls RESET_GIL_DROP_REQUEST() before
PyThread_exit_thread() if it called SET_GIL_DROP_REQUEST to fix a
race condition with drop_gil().
Issue discovered and analyzed by Mingliang ZHAO.
Integer to and from text conversions via CPython's bignum `int` type is not safe against denial of service attacks due to malicious input. Very large input strings with hundred thousands of digits can consume several CPU seconds.
This PR comes fresh from a pile of work done in our private PSRT security response team repo.
Signed-off-by: Christian Heimes [Red Hat] <christian@python.org>
Tons-of-polishing-up-by: Gregory P. Smith [Google] <greg@krypto.org>
Reviews via the private PSRT repo via many others (see the NEWS entry in the PR).
<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->
I wrote up [a one pager for the release managers](https://docs.google.com/document/d/1KjuF_aXlzPUxTK4BMgezGJ2Pn7uevfX7g0_mvgHlL7Y/edit#). Much of that text wound up in the Issue. Backports PRs already exist. See the issue for links.
⚠️⚠️ Note for reviewers, hackers and fellow systems/low-level/compiler engineers ⚠️⚠️
If you have a lot of experience with this kind of shenanigans and want to improve the **first** version, **please make a PR against my branch** or **reach out by email** or **suggest code changes directly on GitHub**.
If you have any **refinements or optimizations** please, wait until the first version is merged before starting hacking or proposing those so we can keep this PR productive.
* gh-93503: Add APIs to set profiling and tracing functions in all threads in the C-API
* Use a separate API
* Fix NEWS entry
* Add locks around the loop
* Document ignoring exceptions
* Use the new APIs in the sys module
* Update docs
- "comparison of integers of different signs" in typeobject.c
- only define static_builtin_index_is_set in DEBUG builds
- only define recreate_gil with ifdef HAVE_FORK
Automate WASM build with a new Python script. The script provides
several build profiles with configure flags for Emscripten flavors
and WASI. The script can detect and use Emscripten SDK and WASI SDK from
default locations or env vars.
``configure`` now detects Node arguments and creates HOSTRUNNER
arguments for Node 16. It also sets correct arguments for
``wasm64-emscripten``.
Co-authored-by: Brett Cannon <brett@python.org>
We only statically initialize for core code and builtin modules. Extension modules still create
the tuple at runtime. We'll solve that part of interpreter isolation separately.
This change includes generated code. The non-generated changes are in:
* Tools/clinic/clinic.py
* Python/getargs.c
* Include/cpython/modsupport.h
* Makefile.pre.in (re-generate global strings after running clinic)
* very minor tweaks to Modules/_codecsmodule.c and Python/Python-tokenize.c
All other changes are generated code (clinic, global strings).
gh-93243
This PR is required to reduce diffs of the following porting (no need to either maintain documentation and tests consistent with each porting step, or try to port everything and remove smtpd in a single PR).
Automerge-Triggered-By: GH:warsaw
When keyword argument name is an instance of a str subclass with
overloaded methods __eq__ and __hash__, the former code could not find
the name of an extraneous keyword argument to report an error, and
_PyArg_UnpackKeywords() returned success without setting the
corresponding cell in the linearized arguments array. But since the number
of expected initialized cells is determined as the total number of passed
arguments, this lead to reading NULL as a keyword parameter value, that
caused SystemError or crash or other undesired behavior.
- check for ``dup()`` libc function
- handle missing ``F_DUPFD`` in ``dup2()`` replacement function
- add workaround for WASI libc bug in MSG_TRUNC
- ESHUTDOWN is missing, use EPIPE instead
- POLLPRI is missing, define as 0 (no-op)
Static builtin types are finalized by calling _PyStaticType_Dealloc(). Before this change, we were skipping finalizing such a type if it still had subtypes (i.e. its tp_subclasses hadn't been cleared yet). The problem is that types hold several heap objects, which leak if we skip the type's finalization. This change addresses that.
For context, there's an old comment (from e9e3eab0b8) that says the following:
// If a type still has subtypes, it cannot be deallocated.
// A subtype can inherit attributes and methods of its parent type,
// and a type must no longer be used once it's deallocated.
However, it isn't clear that is actually still true. Clearing tp_dict should mean it isn't a problem.
Furthermore, the only subtypes that might still be around come from extension modules that didn't clean them up when unloaded (i.e. extensions that do not implement multi-phase initialization, AKA PEP 489). Those objects are already leaking, so this change doesn't change anything in that regard. Instead, this change means more objects gets cleaned up that before.
This is the first of several precursors to storing tp_subclasses (and tp_weaklist) on the interpreter state for static builtin types.
We do the following:
* add `_PyStaticType_InitBuiltin()`
* add `_Py_TPFLAGS_STATIC_BUILTIN`
* set it on all static builtin types in `_PyStaticType_InitBuiltin()`
* shuffle some code around to be able to use _PyStaticType_InitBuiltin()
* rename `_PyStructSequence_InitType()` to `_PyStructSequence_InitBuiltinWithFlags()`
* add `_PyStructSequence_InitBuiltin()`.
* gh-93883: elide traceback indicators when possible
Elide traceback column indicators when the entire line of the
frame is implicated. This reduces traceback length and draws
even more attention to the remaining (very relevant) indicators.
Example:
```
Traceback (most recent call last):
File "query.py", line 99, in <module>
bar()
File "query.py", line 66, in bar
foo()
File "query.py", line 37, in foo
magic_arithmetic('foo')
File "query.py", line 18, in magic_arithmetic
return add_counts(x) / 25
^^^^^^^^^^^^^
File "query.py", line 24, in add_counts
return 25 + query_user(user1) + query_user(user2)
^^^^^^^^^^^^^^^^^
File "query.py", line 32, in query_user
return 1 + query_count(db, response['a']['b']['c']['user'], retry=True)
~~~~~~~~~~~~~~~~~~^^^^^
TypeError: 'NoneType' object is not subscriptable
```
Rather than going out of our way to provide indicator coverage
in every traceback test suite, the indicator test suite should
be responible for sufficient coverage (e.g. by adding a basic
exception group test to ensure that margin strings are covered).
Inlining of code that corresponds to source code lines, can make it hard to distinguish later between code which is only reachable from except handlers, and that which is reachable in normal control flow. This caused problems with the debugger's jump feature.
This PR turns off the inlining optimisation for code which has line numbers. We still inline things like the implicit "return None".
Seems in the past the copy of builtins was not made in some scenarios,
and the error was silenced. Write it now to stderr, so we have a chance
to see it.
pthread _PyThread_cond_after() implementation now uses the _PyTime_t
type to handle properly overflow: clamp to the maximum value.
Remove MICROSECONDS_TO_TIMESPEC() function.
On Windows, PyOS_StdioReadline() now gets
PyConfig.legacy_windows_stdio from _PyOS_ReadlineTState, rather than
using the deprecated global Py_LegacyWindowsStdioFlag variable.
Fix also a compiler warning in Py_SetStandardStreamEncoding().
Move the follow functions and type from frameobject.h to pyframe.h,
so the standard <Python.h> provide frame getter functions:
* PyFrame_Check()
* PyFrame_GetBack()
* PyFrame_GetBuiltins()
* PyFrame_GetGenerator()
* PyFrame_GetGlobals()
* PyFrame_GetLasti()
* PyFrame_GetLocals()
* PyFrame_Type
Remove #include "frameobject.h" from many C files. It's no longer
needed.
Reformat the pthread implementation of PyThread_acquire_lock_timed()
using a mutex and a conditioinal variable.
* Add goto to avoid multiple indentation levels and exit quickly
* Use "while(1)" and make the control flow more obvious.
* PEP 7: Add braces around if blocks.
Deprecate global configuration variable like
Py_IgnoreEnvironmentFlag: the Py_InitializeFromConfig() API should be
instead.
Fix declaration of Py_GETENV(): use PyAPI_FUNC(), not PyAPI_DATA().