cpython

Commit Graph

Author	SHA1	Message	Date
Ken Jin	22b0de2755	gh-117139: Convert the evaluation stack to stack refs (#118450) This PR sets up tagged pointers for CPython. The general idea is to create a separate struct _PyStackRef for everything on the evaluation stack to store the bits. This forces the C compiler to warn us if we try to cast things or pull things out of the struct directly. Only for free threading: We tag the low bit if something is deferred - that means we skip incref and decref operations on it. This behavior may change in the future if Mark's plans to defer all objects in the interpreter loop pan out. This implies a strict stack reference discipline is required. ALL incref and decref operations on stackrefs must use the stackref variants. It is unsafe to untag something then do normal incref/decref ops on it. The new incref and decref variants are called dup and close. They mimic a "handle" API operating on these stackrefs. Please read Include/internal/pycore_stackref.h for more information! --------- Co-authored-by: Mark Shannon <9448417+markshannon@users.noreply.github.com>	2024-06-27 03:10:43 +08:00
Mark Shannon	8f5a01707f	GH-120982: Add stack check assertions to generated interpreter code (GH-120992)	2024-06-25 16:42:29 +01:00
Tian Gao	9c14ed0618	gh-107674: Improve performance of `sys.settrace` (GH-117133) * Check tracing in RESUME_CHECK * Only change to RESUME_CHECK if not tracing	2024-05-03 19:49:24 +01:00
Dino Viehland	4a1cf66c5c	gh-117657: Fix small issues with instrumentation and TSAN (#118064) Small TSAN fixups for instrumentation	2024-04-30 11:38:05 -07:00
Mark Shannon	3e06c7f719	GH-118095: Add dynamic exit support and FOR_ITER_GEN support to tier 2 (GH-118279)	2024-04-26 18:08:50 +01:00
Mark Shannon	f180b31e76	GH-118095: Handle `RETURN_GENERATOR` in tier 2 (GH-118180)	2024-04-25 11:32:47 +01:00
Guido van Rossum	060a96f1a9	gh-116968: Reimplement Tier 2 counters (#117144) Introduce a unified 16-bit backoff counter type (``_Py_BackoffCounter``), shared between the Tier 1 adaptive specializer and the Tier 2 optimizer. The API used for adaptive specialization counters is changed but the behavior is (supposed to be) identical. The behavior of the Tier 2 counters is changed: - There are no longer dynamic thresholds (we never varied these). - All counters now use the same exponential backoff. - The counter for ``JUMP_BACKWARD`` starts counting down from 16. - The ``temperature`` in side exits starts counting down from 64.	2024-04-04 15:03:27 +00:00
Michael Droettboom	26d328b2ba	GH-117121: Add pystats to JIT builds (GH-117346)	2024-03-28 15:23:08 -07:00
Mark Shannon	bf82f77957	GH-116422: Tier2 hot/cold splitting (GH-116813) Splits the "cold" path, deopts and exits, from the "hot" path, reducing the size of most jitted instructions, at the cost of slower exits.	2024-03-26 09:35:11 +00:00
Tian Gao	7895a61168	gh-116098: Revert "gh-107674: Improve performance of `sys.settrace` (GH-114986)" (GH-116178) Revert "gh-107674: Improve performance of `sys.settrace` (GH-114986)" This reverts commit `0a61e23700`.	2024-03-01 07:46:33 +01:00
Brett Simmers	339c8e1c13	gh-115999: Disable the specializing adaptive interpreter in free-threaded builds (#116013) For now, disable all specialization when the GIL might be disabled.	2024-02-29 21:53:32 -05:00
Brandt Bucher	f0df35eeca	GH-115802: JIT "small" code for Windows (GH-115964)	2024-02-29 08:11:28 -08:00
Tian Gao	0a61e23700	gh-107674: Improve performance of `sys.settrace` (GH-114986)	2024-02-28 15:21:42 +00:00
Kirill Podoprigora	e4561e0501	gh-115778: Add `tierN` annotation for instruction definitions (#115815) This replaces the old `TIER_{ONE,TWO}_ONLY` macros. Note that `specialized` implies `tier1`. Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2024-02-23 17:31:57 +00:00
Brett Simmers	0749244d13	gh-112175: Add `eval_breaker` to `PyThreadState` (#115194) This change adds an `eval_breaker` field to `PyThreadState`. The primary motivation is for performance in free-threaded builds: with thread-local eval breakers, we can stop a specific thread (e.g., for an async exception) without interrupting other threads. The source of truth for the global instrumentation version is stored in the `instrumentation_version` field in PyInterpreterState. Threads usually read the version from their local `eval_breaker`, where it continues to be colocated with the eval breaker bits.	2024-02-20 09:57:48 -05:00
Mark Shannon	7b21403ccd	GH-112354: Initial implementation of warm up on exits and trace-stitching (GH-114142)	2024-02-20 09:39:55 +00:00
Sam Gross	5903190727	gh-115103: Implement delayed memory reclamation (QSBR) (#115180) This adds a safe memory reclamation scheme based on FreeBSD's "GUS" and quiescent state based reclamation (QSBR). The API provides a mechanism for callers to detect when it is safe to free memory that may be concurrently accessed by readers.	2024-02-16 15:25:19 -05:00
Mark Shannon	2ff072f21f	Delete unused macro (GH-114238)	2024-01-18 15:49:50 +00:00
Mark Shannon	6873555955	GH-112354: Treat _EXIT_TRACE like an unconditional side exit (GH-113104)	2023-12-14 14:26:44 +00:00
Michael Droettboom	dfaa9e060b	gh-113010: Don't decrement deferred in pystats (#113032) This fixes a recently introduced bug where the deferred count is being unnecessarily decremented to counteract an increment elsewhere that is no longer happening. This caused the values to flip around to "very large" 64-bit numbers.	2023-12-12 21:17:08 +00:00
Guido van Rossum	5b86644338	A smattering of cleanups in uop debug output and lltrace (#112980) * Include destination T1 opcode in Error debug message * Include destination T1 opcode in DEOPT debug message * Remove obsolete comment from remove_unneeded_uops * Change lltrace_instruction() to print caller's opcode/oparg	2023-12-11 16:42:30 -08:00
Guido van Rossum	8deb8bc2e5	gh-112287: Speed up Tier 2 (uop) interpreter a little (#112286) This makes the Tier 2 interpreter a little faster. I calculated a speedup of about 3%, though I hesitate to claim an exact number. This starts by doubling the trace size limit (to 512), making it more likely that loops fit in a trace. The rest of the approach is to only load `oparg` and `operand` in cases that use them. The code generator knows when these are used. For `oparg`, it will conditionally emit `oparg = CURRENT_OPARG();` at the top of the case block. (The `oparg` variable may be referenced multiple times by the instruction's code block, so it must be in a variable.) For `operand`, it will use `CURRENT_OPERAND()` directly instead of referencing the `operand` variable, which no longer exists. (There is only one place where this will be used.)	2023-11-20 11:25:32 -08:00
Guido van Rossum	7e135a48d6	gh-111520: Integrate the Tier 2 interpreter in the Tier 1 interpreter (#111428) - There is no longer a separate Python/executor.c file. - Conventions in Python/bytecodes.c are slightly different -- don't use `goto error`, you must use `GOTO_ERROR(error)` (same for others like `unused_local_error`). - The `TIER_ONE` and `TIER_TWO` symbols are only valid in the generated (.c.h) files. - In Lib/test/support/__init__.py, `Py_C_RECURSION_LIMIT` is imported from `_testcapi`. - On Windows, in debug mode, stack allocation grows from 8MiB to 12MiB. - Beware! This changes the env vars to enable uops and their debugging to `PYTHON_UOPS` and `PYTHON_LLTRACE`.	2023-11-01 13:13:02 -07:00
Mark Shannon	d27acd4461	GH-111485: Increment `next_instr` consistently at the start of the instruction. (GH-111486)	2023-10-31 10:09:54 +00:00
Irit Katriel	67a91f78e4	gh-109094: replace frame->prev_instr by frame->instr_ptr (#109095)	2023-10-26 13:43:10 +00:00
Mark Shannon	19b7ead5eb	GH-109214: Convert _SAVE_CURRENT_IP to _SET_IP in tier 2 trace creation. (GH-110755)	2023-10-12 10:34:32 +01:00
Mark Shannon	bf4bc36069	GH-109369: Merge all eval-breaker flags and monitoring version into one word. (GH-109846)	2023-10-04 16:09:48 +01:00
Brandt Bucher	22e65eecaa	GH-105848: Replace KW_NAMES + CALL with LOAD_CONST + CALL_KW (GH-109300)	2023-09-13 10:25:45 -07:00
Mark Shannon	0858328ca2	GH-108614: Add `RESUME_CHECK` instruction (GH-108630)	2023-09-07 14:39:03 +01:00
Victor Stinner	a0773b89df	gh-108753: Enhance pystats (#108754) Statistics gathering is now off by default. Use the "-X pystats" command line option or set the new PYTHONSTATS environment variable to 1 to turn statistics gathering on at Python startup. Statistics are no longer dumped at exit if statistics gathering was off or statistics have been cleared. Changes: * Add PYTHONSTATS environment variable. * sys._stats_dump() now returns False if statistics are not dumped because they are all equal to zero. * Add PyConfig._pystats member. * Add tests on sys functions and on setting PyConfig._pystats to 1. * Add Include/cpython/pystats.h and Include/internal/pycore_pystats.h header files. * Rename '_py_stats' variable to '_Py_stats'. * Exclude Include/cpython/pystats.h from the Py_LIMITED_API. * Move pystats.h include from object.h to Python.h. * Add _Py_StatsOn() and _Py_StatsOff() functions. Remove '_py_stats_struct' variable from the API: make it static in specialize.c. * Document API in Include/pystats.h and Include/cpython/pystats.h. * Complete pystats documentation in Doc/using/configure.rst. * Don't write "all zeros" stats: if _stats_off() and _stats_clear() or _stats_dump() were called. * _PyEval_Fini() now always calls _Py_PrintSpecializationStats() which does nothing if stats are all zeros. Co-authored-by: Michael Droettboom <mdboom@gmail.com>	2023-09-06 15:54:59 +00:00
Mark Shannon	059bd4d299	GH-108614: Remove non-debug uses of `#if TIER_ONE` and `#if TIER_TWO` from `_POP_FRAME` op. (GH-108685)	2023-08-31 11:34:52 +01:00
Guido van Rossum	61c7249759	gh-106581: Project through calls (#108067) This finishes the work begun in gh-107760. When, while projecting a superblock, we encounter a call to a short, simple function, the superblock will now enter the function using `_PUSH_FRAME`, continue through it, and leave it using `_POP_FRAME`, and then continue through the original code. Multiple frame pushes and pops are even possible. It is also possible to stop appending to the superblock in the middle of a called function, when running out of space or encountering an unsupported bytecode.	2023-08-17 11:29:58 -07:00
Mark Shannon	006e44f950	GH-108035: Remove the `_PyCFrame` struct as it is no longer needed for performance. (GH-108036)	2023-08-17 11:16:03 +01:00
Guido van Rossum	dc8fdf5fd5	gh-106581: Split `CALL_PY_EXACT_ARGS` into uops (#107760) * Split `CALL_PY_EXACT_ARGS` into uops This is only the first step for doing `CALL` in Tier 2. The next step involves tracing into the called code object and back. After that we'll have to do the remaining `CALL` specialization. Finally we'll have to deal with `KW_NAMES`. Note: this moves setting `frame->return_offset` directly in front of `DISPATCH_INLINED()`, to make it easier to move it into `_PUSH_FRAME`.	2023-08-16 16:26:43 -07:00
Brandt Bucher	8f4de57699	GH-106701: Move _PyUopExecute to Python/executor.c (GH-106924)	2023-07-20 20:37:19 +00:00
Irit Katriel	40f3f11a77	gh-105481: Generate the opcode lists in dis from data extracted from bytecodes.c (#106758)	2023-07-18 19:42:44 +01:00
Guido van Rossum	2b94a05a0e	gh-106581: Add 10 new opcodes by allowing `assert(kwnames == NULL)` (#106707) By turning `assert(kwnames == NULL)` into a macro that is not in the "forbidden" list, many instructions that formerly were skipped because they contained such an assert (but no other mention of `kwnames`) are now supported in Tier 2. This covers 10 instructions in total (all specializations of `CALL` that invoke some C code): - `CALL_NO_KW_TYPE_1` - `CALL_NO_KW_STR_1` - `CALL_NO_KW_TUPLE_1` - `CALL_NO_KW_BUILTIN_O` - `CALL_NO_KW_BUILTIN_FAST` - `CALL_NO_KW_LEN` - `CALL_NO_KW_ISINSTANCE` - `CALL_NO_KW_METHOD_DESCRIPTOR_O` - `CALL_NO_KW_METHOD_DESCRIPTOR_NOARGS` - `CALL_NO_KW_METHOD_DESCRIPTOR_FAST`	2023-07-17 11:02:58 -07:00
Mark Shannon	e5862113dd	GH-104584: Fix ENTER_EXECUTOR (GH-106141) * Check eval-breaker in ENTER_EXECUTOR. * Make sure that frame->prev_instr is set before entering executor.	2023-07-03 21:28:27 +01:00
Guido van Rossum	6b5166fb12	gh-104584: Change DEOPT_IF in uops executor (#106146) This effectively reverts `bb578a0`, restoring the original DEOPT_IF() macro in ceval_macros.h, and redefining it in the Tier 2 interpreter. We can get rid of the PREDICTED() macros there as well!	2023-06-27 14:17:41 -07:00
Guido van Rossum	bb578a0c30	gh-104584: Fix assert in DEOPT macro -- should fix buildbot (#106131)	2023-06-27 07:02:51 -07:00
Guido van Rossum	51fc725117	gh-104584: Baby steps towards generating and executing traces (#105924) Added a new, experimental, tracing optimizer and interpreter (a.k.a. "tier 2"). This currently pessimizes, so don't use yet -- this is infrastructure so we can experiment with optimizing passes. To enable it, pass ``-Xuops`` or set ``PYTHONUOPS=1``. To get debug output, set ``PYTHONUOPSDEBUG=N`` where ``N`` is a debug level (0-4, where 0 is no debug output and 4 is excessively verbose). All of this code is likely to change dramatically before the 3.13 feature freeze. But this is a first step.	2023-06-26 19:02:57 -07:00
Irit Katriel	d1b0297d3e	gh-105481: add HAS_JUMP flag to opcode metadata (#105791)	2023-06-14 23:14:22 +00:00
Mark Shannon	7199584ac8	GH-100987: Allow objects other than code objects as the "executable" of an internal frame. (GH-105727) * Add table describing possible executable classes for out-of-process debuggers. * Remove shim code object creation code as it is no longer needed. * Make lltrace a bit more robust w.r.t. non-standard frames.	2023-06-14 13:46:37 +01:00
Irit Katriel	be2779c0cb	gh-105481: add flags to each instr in the opcode metadata table, to replace opcode.hasarg/hasname/hasconst (#105482)	2023-06-13 21:42:03 +01:00
Mark Shannon	064de0e3fc	GH-104610: Remove the use of `PREDICT` macros. (GH-104651)	2023-06-07 17:04:53 +01:00
Mark Shannon	4bfa01b9d9	GH-104584: Plugin optimizer API (GH-105100)	2023-06-02 11:46:18 +01:00
Mark Shannon	68b5f08b72	GH-104580: Don't cache eval breaker in interpreter (GH-104581) Move eval-breaker to the front of the interpreter state.	2023-05-18 10:08:33 +01:00
Brandt Bucher	1eb950ca55	GH-104405: Add missing PEP 523 checks (GH-104406)	2023-05-12 22:23:13 +00:00
Mark Shannon	45f5aa8fc7	GH-103082: Filter LINE events in VM, to simplify tool implementation. (GH-104387) When monitoring LINE events, instrument all instructions that can have a predecessor on a different line. Then check that a new line has been hit in the instrumentation code. This brings the behavior closer to that of 3.11, simplifying implementation and porting of tools.	2023-05-12 12:21:20 +01:00
Mark Shannon	411b169281	GH-103082: Implementation of PEP 669: Low Impact Monitoring for CPython (GH-103083) * The majority of the monitoring code is in instrumentation.c * The new instrumentation bytecodes are in bytecodes.c * legacy_tracing.c adapts the new API to the old sys.settrace and sys.setprofile APIs	2023-04-12 12:04:55 +01:00
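
Commit 22b0de2755 describes tagging the low bit of a stack reference so that deferred objects skip incref and decref, with `dup` and `close` as the only permitted reference operations. The toy model below sketches that discipline in Python. It is an illustration under stated assumptions, not the code in Include/internal/pycore_stackref.h: `ToyHeap`, `make_ref`, `is_deferred` and `untag` are hypothetical names, and integer "addresses" stand in for object pointers.

```python
TAG_DEFERRED = 0b1  # low bit marks a deferred reference, as the commit message describes


class ToyHeap:
    """Hypothetical stand-in for object memory: aligned address -> refcount."""

    def __init__(self):
        self.refcounts = {}

    def allocate(self, address):
        # Aligned addresses are even, so the low bit is free to carry the tag.
        assert address % 2 == 0
        self.refcounts[address] = 1
        return address


def make_ref(address, deferred=False):
    """Build a toy stack reference, tagging the low bit when deferred."""
    return (address | TAG_DEFERRED) if deferred else address


def is_deferred(ref):
    return bool(ref & TAG_DEFERRED)


def untag(ref):
    """Recover the plain address by clearing the tag bit."""
    return ref & ~TAG_DEFERRED


def dup(heap, ref):
    """Stackref flavour of incref: a no-op for deferred references."""
    if not is_deferred(ref):
        heap.refcounts[untag(ref)] += 1
    return ref


def close(heap, ref):
    """Stackref flavour of decref: a no-op for deferred references."""
    if not is_deferred(ref):
        heap.refcounts[untag(ref)] -= 1
```

In this model, calling `dup` and `close` on a tagged reference leaves the count untouched, whereas untagging first and then adjusting the count by hand would change it. That mismatch is the hazard the commit message warns about.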


55 Commits
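
Commit 060a96f1a9 mentions countdown counters with exponential backoff, where `JUMP_BACKWARD` starts counting down from 16 and the side-exit `temperature` from 64. The sketch below models that general idea in Python. It is a simplified illustration, not the `_Py_BackoffCounter` bit layout: the class, its `MAX_BACKOFF` cap and the helper `ticks_until_trigger` are all hypothetical.

```python
class BackoffCounter:
    """Toy countdown counter that waits twice as long after each restart."""

    MAX_BACKOFF = 12  # hypothetical cap, chosen here only to bound the wait

    def __init__(self, initial, backoff):
        self.value = initial    # ticks remaining before the counter triggers
        self.backoff = backoff  # exponent: the current wait is 2 ** backoff

    def tick(self):
        """Count down once; return True when the counter has reached zero."""
        if self.value > 0:
            self.value -= 1
        return self.value == 0

    def restart(self):
        """After a failed attempt, double the wait (up to the cap)."""
        self.backoff = min(self.backoff + 1, self.MAX_BACKOFF)
        self.value = 1 << self.backoff


def ticks_until_trigger(counter):
    """Drive the counter until it triggers and report how many ticks it took."""
    ticks = 0
    while True:
        ticks += 1
        if counter.tick():
            return ticks


# Start from 16, the JUMP_BACKWARD starting value named in the commit message.
counter = BackoffCounter(initial=16, backoff=4)
first_wait = ticks_until_trigger(counter)   # 16 ticks
counter.restart()
second_wait = ticks_until_trigger(counter)  # 32 ticks after one backoff step
```

Doubling the wait after each unsuccessful attempt keeps code that repeatedly fails to optimize from being retried at a constant rate.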