Commit Graph

242 Commits

mpage 2e95c5ba3b
gh-115999: Implement thread-local bytecode and enable specialization for `BINARY_OP` (#123926)
In free-threaded builds, each thread specializes a thread-local copy of the bytecode, created on the first RESUME. All copies of the bytecode for a code object are stored in the co_tlbc array on the code object. At thread creation, each thread reserves a globally unique index identifying its copy of the bytecode in all co_tlbc arrays, and it releases that index at thread destruction. The first entry in every co_tlbc array always points to the "main" copy of the bytecode, which is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads.

Thread-local bytecode can be disabled at runtime by providing either the -X tlbc=0 command-line option or the PYTHON_TLBC=0 environment variable. Disabling thread-local bytecode also disables specialization.

Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.
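
For illustration only (not part of the commit message), a minimal Python sketch of how a program could check these settings; it assumes only that `-X` options appear in `sys._xoptions` and that the command-line option takes precedence over the environment variable, as is conventional for CPython runtime options.

```python
# Sketch: report whether thread-local bytecode was disabled via -X tlbc=0
# or PYTHON_TLBC=0.  The flag names come from the commit message above;
# the precedence (command line over environment) is assumed.
import os
import sys

xopt = sys._xoptions.get("tlbc")       # "0" when run with -X tlbc=0
env = os.environ.get("PYTHON_TLBC")    # "0" when PYTHON_TLBC=0 is set

if xopt is not None:
    enabled = xopt != "0"
elif env is not None:
    enabled = env != "0"
else:
    enabled = True                     # default: thread-local bytecode on

print("thread-local bytecode (and specialization):",
      "enabled" if enabled else "disabled")
```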
2024-11-04 11:13:32 -08:00
Mark Shannon faa3272fb8
GH-125837: Split `LOAD_CONST` into three. (GH-125972)
* Add LOAD_CONST_IMMORTAL opcode

* Add LOAD_SMALL_INT opcode

* Remove RETURN_CONST opcode
2024-10-29 11:15:42 +00:00
mpage e99f159be4
gh-115999: Stop the world when invalidating function versions (#124997)
Stop the world when invalidating function versions

The tier1 interpreter specializes `CALL` instructions based on the values
of certain function attributes (e.g. `__code__`, `__defaults__`). The tier1
interpreter uses function versions to verify that the attributes of a function
during execution of a specialization match those seen during specialization.
A function's version is initialized in `MAKE_FUNCTION` and is invalidated when
any of the critical function attributes are changed. The tier1 interpreter stores
the function version in the inline cache during specialization. A guard is used by
the specialized instruction to verify that the version of the function on the operand
stack matches the cached version (and therefore has all of the expected attributes).
It is assumed that once the guard passes, all attributes will remain unchanged
while executing the rest of the specialized instruction.

Stopping the world when invalidating function versions ensures that all critical
function attributes will remain unchanged after the function version guard passes
in free-threaded builds. It's important to note that this is only true if the remainder
of the specialized instruction does not enter and exit a stop-the-world point.

We will stop the world the first time any of the following function attributes
are mutated:

- defaults
- vectorcall
- kwdefaults
- closure
- code

This should happen rarely, and it only happens once per function, so the performance
impact on the majority of code should be minimal.

Additionally, refactor the API for manipulating function versions to more clearly
match the stated semantics.
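
As a rough, purely illustrative sketch (not part of the commit), the Python-visible subset of the attributes listed above can be mutated as shown below; per the description, the first such mutation of each kind would trigger a one-time stop-the-world invalidation on free-threaded builds. The closure and vectorcall slots are only reachable from C, so they are omitted.

```python
# Illustration only: Python-level mutations of the attributes named above.
def f(x, *, y=1):
    return x + y

f.__defaults__ = ()                               # defaults
f.__kwdefaults__ = {"y": 2}                       # kwdefaults
f.__code__ = (lambda x, *, y=1: x * y).__code__   # code (same free-var count)
print(f(3))   # 6 -- the swapped code multiplies, and y now defaults to 2
```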
2024-10-08 10:04:35 -04:00
Mark Shannon c87b0e4a46
GH-124284: Add stats for refcount operations on immortal objects (GH-124288) 2024-09-23 19:10:55 +01:00
Mark Shannon 0b0f7befad
GH-123232: Fix "not specialized" stats (GH-123236) 2024-08-23 10:46:03 +01:00
Mark Shannon 5d3201fe3f
GH-123040: Specialize shadowed `LOAD_ATTR`. (GH-123219) 2024-08-23 10:22:35 +01:00
Mark Shannon a3d8c0542e
GH-123197: Only count an instruction as deferred if it hasn't deopted first. (GH-123222)
Only count an instruction as deferred if it hasn't deopted first.
2024-08-22 14:17:10 +01:00
Brandt Bucher 427b106162
GH-118093: Specialize calls to non-vectorcall classes as `CALL_NON_PY_GENERAL` (GH-123212)
Specialize classes without vectorcall as CALL_NON_PY_GENERAL
2024-08-22 11:50:55 +01:00
Mark Shannon a4fd7aa4a6
GH-115776: Allow any fixed sized object to have inline values (GH-123192) 2024-08-21 15:52:04 +01:00
Mark Shannon bb1d30336e
GH-118093: Make `CALL_ALLOC_AND_ENTER_INIT` suitable for tier 2. (GH-123140)
* Convert CALL_ALLOC_AND_ENTER_INIT to micro-ops such that tier 2 supports it

* Allow inexact arguments for CALL_ALLOC_AND_ENTER_INIT.
2024-08-20 16:52:58 +01:00
Mark Shannon c13e7d98fb
GH-118093: Specialize `CALL_KW` (GH-123006) 2024-08-16 17:11:24 +01:00
Mark Shannon 7a65439b93
GH-122390: Replace `_Py_GetbaseOpcode` with `_Py_GetBaseCodeUnit` (GH-122942) 2024-08-13 14:22:57 +01:00
Brandt Bucher 5f6001130f
GH-118093: Add tier two support for LOAD_ATTR_PROPERTY (GH-122283) 2024-07-25 10:45:28 -07:00
Michael Droettboom f036a463db
GH-121583: Remove dependency from pystats.h to internal header file (GH-121587)
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
2024-07-16 15:38:29 -07:00
Nadeshiko Manju 223c03a43c
gh-121082: Fix build failure when the developer uses the `--enable-pystats` argument in the configure command after #118450 (#121083)
Signed-off-by: Manjusaka <me@manjusaka.me>
Co-authored-by: Ken Jin <kenjin4096@gmail.com>
2024-06-27 19:35:25 +08:00
Ken Jin 22b0de2755
gh-117139: Convert the evaluation stack to stack refs (#118450)
This PR sets up tagged pointers for CPython.

The general idea is to create a separate struct _PyStackRef for everything on the evaluation stack to store the bits. This forces the C compiler to warn us if we try to cast things or pull things out of the struct directly.

Only for free threading: We tag the low bit if something is deferred - that means we skip incref and decref operations on it. This behavior may change in the future if Mark's plans to defer all objects in the interpreter loop pan out.

This implies a strict stack reference discipline is required. ALL incref and decref operations on stackrefs must use the stackref variants. It is unsafe to untag something then do normal incref/decref ops on it.

The new incref and decref variants are called dup and close. They mimic a "handle" API operating on these stackrefs.

Please read Include/internal/pycore_stackref.h for more information!
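
A very loose Python-level sketch (illustrative only; the real `_PyStackRef` is a C struct whose tagged low pointer bit marks deferred references) of the handle discipline described above: all refcount adjustments go through dup/close, and deferred references skip them.

```python
# Illustration only: a toy model of the dup/close "handle" discipline.
from dataclasses import dataclass

adjustments: dict[int, int] = {}        # simulated stack-level refcount deltas

@dataclass(frozen=True)
class StackRef:
    obj: object
    deferred: bool = False              # stands in for the tagged low bit

def stackref_dup(ref: StackRef) -> StackRef:
    if not ref.deferred:                # deferred refs skip the incref
        adjustments[id(ref.obj)] = adjustments.get(id(ref.obj), 0) + 1
    return ref

def stackref_close(ref: StackRef) -> None:
    if not ref.deferred:                # deferred refs skip the decref
        adjustments[id(ref.obj)] = adjustments.get(id(ref.obj), 0) - 1

r = stackref_dup(StackRef([1, 2, 3]))
stackref_close(r)
print(adjustments)                      # the dup/close pair nets out to zero
```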

---------

Co-authored-by: Mark Shannon <9448417+markshannon@users.noreply.github.com>
2024-06-27 03:10:43 +08:00
Xie Yanbo 656a1c8108
Fix typos in comments (#120481) 2024-06-19 23:16:14 -04:00
Victor Stinner c2d5df5787
gh-83754: Use the Py_TYPE() macro (#120599)
Don't access PyObject.ob_type directly; use the Py_TYPE() macro
instead.
2024-06-17 10:34:29 +02:00
Mark Shannon 1ab6356ebe
GH-118095: Use broader specializations of CALL in tier 1, for better tier 2 support of calls. (GH-118322)
* Add the CALL_PY_GENERAL, CALL_BOUND_METHOD_GENERAL and CALL_NON_PY_GENERAL specializations.

* Remove CALL_PY_WITH_DEFAULTS specialization

* Use CALL_NON_PY_GENERAL in more cases where specialization would otherwise fail
2024-05-04 12:11:11 +01:00
Mark Shannon 72867c962c
GH-118095: Unify the behavior of tier 2 FOR_ITER branch micro-ops (GH-118420)
* Target _FOR_ITER_TIER_TWO at POP_TOP following the matching END_FOR

* Modify _GUARD_NOT_EXHAUSTED_RANGE, _GUARD_NOT_EXHAUSTED_LIST and _GUARD_NOT_EXHAUSTED_TUPLE so that they also target the POP_TOP following the matching END_FOR
2024-05-02 16:17:59 +01:00
Dino Viehland 8b541c017e
gh-112075: Make instance attributes stored in inline "dict" thread safe (#114742)
Make instance attributes stored in inline "dict" thread safe on free-threaded builds
2024-04-21 22:57:05 -07:00
Jeff Glass acf69e09c6
gh-115178: Add Counts of UOp Pairs to pystats (GH-115181) 2024-04-16 14:27:18 +01:00
Guido van Rossum 060a96f1a9
gh-116968: Reimplement Tier 2 counters (#117144)
Introduce a unified 16-bit backoff counter type (``_Py_BackoffCounter``),
shared between the Tier 1 adaptive specializer and the Tier 2 optimizer. The
API used for adaptive specialization counters is changed but the behavior is
(supposed to be) identical.

The behavior of the Tier 2 counters is changed:
- There are no longer dynamic thresholds (we never varied these).
- All counters now use the same exponential backoff.
- The counter for ``JUMP_BACKWARD`` starts counting down from 16.
- The ``temperature`` in side exits starts counting down from 64.
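
A rough Python model of the exponential backoff described above (illustrative only; the real `_Py_BackoffCounter` is a packed 16-bit C value, and the doubling scheme and cap below are assumptions). Each time the countdown reaches zero, the backed-off action fires and the next threshold grows.

```python
# Illustrative backoff model; the initial values 16 and 64 come from the
# commit message, the doubling scheme and cap are assumptions.
class BackoffCounter:
    def __init__(self, initial: int, cap: int = 1 << 12):
        self.threshold = initial
        self.remaining = initial
        self.cap = cap

    def tick(self) -> bool:
        """Count one event; return True when the backed-off action should fire."""
        self.remaining -= 1
        if self.remaining > 0:
            return False
        self.threshold = min(self.threshold * 2, self.cap)   # back off
        self.remaining = self.threshold
        return True

jump_backward = BackoffCounter(initial=16)   # JUMP_BACKWARD countdown
side_exit = BackoffCounter(initial=64)       # side-exit "temperature"

fires = [i for i in range(200) if jump_backward.tick()]
print(fires)   # [15, 47, 111] -- firing points spread out exponentially
```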
2024-04-04 15:03:27 +00:00
Mark Shannon c32dc47aca
GH-115776: Embed the values array into the object, for "normal" Python objects. (GH-116115) 2024-04-02 11:59:21 +01:00
Mark Shannon 23e4f80ce2
A few minor tweaks to get stats working and compiling cleanly. (#117219)
Fixes a compilation error when configured with `--enable-pystats`,
an array size issue, and an unused variable.
2024-03-25 13:43:51 -07:00
Michael Droettboom 50369e6c34
gh-116996: Add pystats about _Py_uop_analyse_and_optimize (GH-116997) 2024-03-22 01:27:46 +08:00
Ken Jin 41457c7fdb
gh-116381: Remove bad specializations, add fail stats (GH-116464)
* Remove bad specializations, add fail stats
2024-03-08 00:21:21 +08:00
Ken Jin 7114cf20c0
gh-116381: Specialize CONTAINS_OP (GH-116385)
* Specialize CONTAINS_OP

* 📜🤖 Added by blurb_it.

* Add PyAPI_FUNC for JIT

---------

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2024-03-07 03:30:11 +08:00
Michael Droettboom b05afdd5ec
gh-115168: Add pystats counter for invalidated executors (GH-115169) 2024-02-26 17:51:47 +00:00
Guido van Rossum 142502ea8d
Tier 2 cleanups and tweaks (#115534)
* Rename `_testinternalcapi.get_{uop,counter}_optimizer` to `new_*_optimizer`
* Use `_PyUOpName()` instead of `_PyOpcode_uop_name[]`
* Add `target` to executor iterator items -- `list(ex)` now returns `(opcode, oparg, target, operand)` quadruples
* Add executor methods `get_opcode()` and `get_oparg()` to get `vmdata.opcode`, `vmdata.oparg`
* Define a helper for printing uops, and unify various places where they are printed
* Add a hack to summarize_stats.py to fix legacy uop names (e.g. `POP_TOP` -> `_POP_TOP`)
* Define helpers in `test_opt.py` for accessing the set or list of opnames of an executor
2024-02-20 20:24:35 +00:00
Ken Jin 7cce857622
gh-114058: Foundations of the Tier2 redundancy eliminator (GH-115085)
---------

Co-authored-by: Mark Shannon <9448417+markshannon@users.noreply.github.com>
Co-authored-by: Jules <57632293+JuliaPoo@users.noreply.github.com>
Co-authored-by: Guido van Rossum <gvanrossum@users.noreply.github.com>
2024-02-13 21:24:48 +08:00
Mark Shannon 8144661017
GH-113710: Fix updating of dict version tag and add watched dict stats (GH-115221) 2024-02-12 16:07:38 +00:00
Mark Shannon e66d0399cc
GH-114806. Don't specialize calls to classes with metaclasses. (GH-114870) 2024-02-01 19:39:32 +00:00
Michael Droettboom ea3cd0498c
gh-114312: Collect stats for unlikely events (GH-114493) 2024-01-25 11:10:51 +00:00
Peter Lazorchak f653caa5a8
gh-89811: Check for valid tp_version_tag in specializer (GH-113558) 2024-01-11 13:33:05 +08:00
Mark Shannon 0ae60b66de
GH-113486: Do not emit spurious PY_UNWIND events for optimized calls to classes. (GH-113680) 2024-01-05 09:45:22 +00:00
Mark Shannon e96f26083b
GH-111485: Generate instruction and uop metadata (GH-113287) 2023-12-20 14:27:25 +00:00
Guido van Rossum 7316dfb0eb
gh-112320: Implement on-trace confidence tracking for branches (#112321)
We track the confidence as a scaled int.
2023-12-12 21:43:08 +00:00
Mark Shannon a7b0f63cdb
GH-111772: Specialize slot loads and stores for `_Py_T_OBJECT` (GH-111773) 2023-11-06 13:55:04 +00:00
Michael Droettboom 84b4533e84
gh-109329: Count tier2 opcode misses (#110561)
This keeps a separate 'miss' counter for each micro-opcode, incremented whenever a guard uop takes a deoptimization side exit.
2023-10-30 17:02:45 -07:00
Sam Gross 6dfb8fe023
gh-110481: Implement biased reference counting (gh-110764) 2023-10-30 16:06:09 +00:00
Irit Katriel a0c414c35d
gh-111354: define names for RESUME oparg values (#111365) 2023-10-26 16:30:18 +01:00
Irit Katriel 67a91f78e4
gh-109094: replace frame->prev_instr by frame->instr_ptr (#109095) 2023-10-26 13:43:10 +00:00
Mark Shannon b0699aa544
GH-111213: Fix a few broken stats (GH-111216) 2023-10-26 11:33:12 +01:00
Michael Droettboom 9eb2489266
gh-109329: Add stat for "trace too short" (GH-110402) 2023-10-05 16:12:06 +01:00
Michael Droettboom e561e98058
GH-109329: Add tier 2 stats (GH-109913) 2023-10-04 14:52:28 -07:00
Brandt Bucher 22e65eecaa
GH-105848: Replace KW_NAMES + CALL with LOAD_CONST + CALL_KW (GH-109300) 2023-09-13 10:25:45 -07:00
Michael Droettboom 5dcbbd8861
GH-109330: Dump and compare stats using opcode names, not numbers (GH-109335) 2023-09-12 14:12:57 -07:00
Guido van Rossum bcce5e2718
gh-109039: Branch prediction for Tier 2 interpreter (#109038)
This adds a 16-bit inline cache entry to the conditional branch instructions POP_JUMP_IF_{FALSE,TRUE,NONE,NOT_NONE} and their instrumented variants, which is used to keep track of the branch direction.

Each time we encounter these instructions we shift the cache entry left by one and set the bottom bit according to whether we jumped.

Then when it's time to translate such a branch to Tier 2 uops, we use the bit count from the cache entry to decide whether to continue translating the "didn't jump" branch or the "jumped" branch.

The counter is initialized to a pattern of alternating ones and zeros to avoid bias.
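
A small Python model of the scheme described above (illustrative only; the real history lives in a 16-bit inline cache entry, and the popcount threshold used here is an assumption): shift in each outcome, then use the bit count to guess the likelier direction.

```python
# Illustrative model of the 16-bit branch-direction history.
MASK = 0xFFFF
ALTERNATING = 0x5555                  # initial alternating-bits pattern

def record_branch(history: int, jumped: bool) -> int:
    """Shift the history left by one and set the low bit to the outcome."""
    return ((history << 1) | int(jumped)) & MASK

def likely_jumps(history: int) -> bool:
    """Guess the direction from the popcount of the recent outcomes."""
    return history.bit_count() >= 8   # assumed threshold: half the 16 bits

h = ALTERNATING
for outcome in (True, True, False, True, True, True, True, True):
    h = record_branch(h, outcome)
print(f"{h:016b}", likely_jumps(h))   # history now leans toward "jumped"
```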

The .pyc file magic number is updated. There's a new test, some fixes for existing tests, and a few miscellaneous cleanups.
2023-09-11 18:20:24 +00:00
Victor Stinner fd5989bda1
gh-108753: _Py_PrintSpecializationStats() uses Py_hexdigits (#109040) 2023-09-07 04:47:57 +02:00