This makes the Tier 2 interpreter a little faster.
I calculated by about 3%,
though I hesitate to claim an exact number.
This starts by doubling the trace size limit (to 512),
making it more likely that loops fit in a trace.
The rest of the approach is to only load
`oparg` and `operand` in cases that use them.
The code generator know when these are used.
For `oparg`, it will conditionally emit
```
oparg = CURRENT_OPARG();
```
at the top of the case block.
(The `oparg` variable may be referenced multiple times
by the instructions code block, so it must be in a variable.)
For `operand`, it will use `CURRENT_OPERAND()` directly
instead of referencing the `operand` variable,
which no longer exists.
(There is only one place where this will be used.)
This uses the new mechanism whereby certain uops
are replaced by others during translation,
using the `_PyUop_Replacements` table.
We further special-case the `_FOR_ITER_TIER_TWO` uop
to update the deoptimization target to point
just past the corresponding `END_FOR` opcode.
Two tiny code cleanups are also part of this PR.
- Ensure that `assert(type_version != 0);` always comes *before* using `type_version`
Also:
- In cases_generator, rename `-v` to from `--verbose` to `--viable`
- Double max trace size to 256
- Add a dependency on executor_cases.c.h for ceval.o
- Mark `_SPECIALIZE_UNPACK_SEQUENCE` as `TIER_ONE_ONLY`
- Add debug output back showing the optimized trace
- Bunch of cleanups to Tools/cases_generator/
* Replace jumps with deopts in tier 2
* Fewer special cases of uop names
* Add target field to uop IR
* Remove more redundant SET_IP and _CHECK_VALIDITY micro-ops
* Extend whitelist of non-escaping API functions.
_PyDict_Pop_KnownHash(): remove the default value and the return type
becomes an int.
Co-authored-by: Stefan Behnel <stefan_ml@behnel.de>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
In PGO mode, this function caused a compiler error in MSVC.
It turns out that optimizing for space only save the day, and is even faster.
However, without PGO, this is neither necessary nor slower.
Critical sections are helpers to replace the global interpreter lock
with finer grained locking. They provide similar guarantees to the GIL
and avoid the deadlock risk that plain locking involves. Critical
sections are implicitly ended whenever the GIL would be released. They
are resumed when the GIL would be acquired. Nested critical sections
behave as if the sections were interleaved.
* Revert "gh-111089: Use PyUnicode_AsUTF8() in Argument Clinic (#111585)"
This reverts commit d9b606b3d0.
* Revert "gh-111089: Use PyUnicode_AsUTF8() in getargs.c (#111620)"
This reverts commit cde1071b2a.
* Revert "gh-111089: PyUnicode_AsUTF8() now raises on embedded NUL (#111091)"
This reverts commit d731579bfb.
* Revert "gh-111089: Add PyUnicode_AsUTF8() to the limited C API (#111121)"
This reverts commit d8f32be5b6.
* Revert "gh-111089: Use PyUnicode_AsUTF8() in sqlite3 (#111122)"
This reverts commit 37e4e20eaa.
I added _Py_excinfo to the internal API (and added its functions in Python/errors.c) in gh-111530 (9322ce9). Since then I've had a nagging sense that I should have added the type and functions in its own PR. While I do plan on using _Py_excinfo outside crossinterp.c very soon (see gh-111572/gh-111573), I'd still feel more comfortable if the _Py_excinfo stuff went in as its own PR. Hence, here we are.
(FWIW, I may combine that with gh-111572, which I may, in turn, combine with gh-111573. We'll see.)
Joining a thread now ensures the underlying OS thread has exited. This is required for safer fork() in multi-threaded processes.
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>