cpython

Commit Graph

Author	SHA1	Message	Date
mpage	2e95c5ba3b	gh-115999: Implement thread-local bytecode and enable specialization for `BINARY_OP` (#123926) Each thread specializes a thread-local copy of the bytecode, created on the first RESUME, in free-threaded builds. All copies of the bytecode for a code object are stored in the co_tlbc array on the code object. Each thread reserves a globally unique index identifying its copy of the bytecode in all co_tlbc arrays at thread creation and releases the index at thread destruction. The first entry in every co_tlbc array always points to the "main" copy of the bytecode that is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads. Thread-local bytecode can be disabled at runtime by providing either -X tlbc=0 or PYTHON_TLBC=0. Disabling thread-local bytecode also disables specialization. Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.	2024-11-04 11:13:32 -08:00
Shantanu	500f5338a8	gh-123930: Better error for "from imports" when script shadows module (#123929)	2024-10-24 12:11:12 -07:00
Sam Gross	3c4a7fa617	gh-124218: Avoid refcount contention on builtins module (GH-125847) This replaces `_PyEval_BuiltinsFromGlobals` with `_PyDict_LoadBuiltinsFromGlobals`, which returns a new reference instead of a borrowed reference. Internally, the new function uses per-thread reference counting when possible to avoid contention on the refcount fields on the builtins module.	2024-10-24 12:44:38 -04:00
Pablo Galindo Salgado	3d1df3d84e	gh-125703: Correctly honour tracemalloc hooks on more PyDECREF specialized paths (#125712)	2024-10-21 15:39:05 +01:00
Eric Snow	6d93690954	gh-125604: Move _Py_AuditHookEntry, etc. Out of pycore_runtime.h (gh-125605) This is essentially a cleanup, moving a handful of API declarations to the header files where they fit best, creating new ones when needed. We do the following: * add pycore_debug_offsets.h and move _Py_DebugOffsets, etc. there * inline struct _getargs_runtime_state and struct _gilstate_runtime_state in _PyRuntimeState * move struct _reftracer_runtime_state to the existing pycore_object_state.h * add pycore_audit.h and move to it _Py_AuditHookEntry, _PySys_Audit(), and _PySys_ClearAuditHooks * add audit.h and cpython/audit.h and move the existing audit-related API there * move the perfmap/trampoline API from cpython/sysmodule.h to cpython/ceval.h, and remove the now-empty cpython/sysmodule.h	2024-10-18 09:26:08 -06:00
Michael Droettboom	37986e830b	gh-123153: Fix PGO builds with free-threading on Windows (#125607) * gh-123153: Fix PGO builds with free-threading * Redo how the #define works	2024-10-17 08:20:30 -04:00
Michael Droettboom	51410d8bdc	gh-125217: Turn off optimization around _PyEval_EvalFrameDefault to avoid MSVC crash (#125477)	2024-10-16 12:51:15 +00:00
Victor Stinner	b9a8ca0a6a	gh-115754: Use Py_GetConstant(Py_CONSTANT_EMPTY_STR) (#125194) Replace PyUnicode_New(0, 0), PyUnicode_FromString("") and PyUnicode_FromStringAndSize("", 0) with Py_GetConstant(Py_CONSTANT_EMPTY_STR).	2024-10-09 17:15:23 +02:00
Mark Shannon	da071fa3e8	GH-119866: Spill the stack around escaping calls. (GH-124392) * Spill the evaluation stack around escaping calls in the generated interpreter and JIT. * The code generator tracks live, cached values so they can be saved to memory when needed. * Spills the stack pointer around escaping calls, so that the exact stack is visible to the cycle GC.	2024-10-07 14:56:39 +01:00
Sam Gross	f4997bb3ac	gh-123923: Defer refcounting for `f_funcobj` in `_PyInterpreterFrame` (#124026) Use a `_PyStackRef` and defer the reference to `f_funcobj` when possible. This avoids some reference count contention in the common case of executing the same code object from multiple threads concurrently in the free-threaded build.	2024-09-24 20:08:18 +00:00
Mark Shannon	c87b0e4a46	GH-124284: Add stats for refcount operations on immortal objects (GH-124288)	2024-09-23 19:10:55 +01:00
Ken Jin	8810e286fa	gh-121459: Deferred LOAD_GLOBAL (GH-123128) Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com> Co-authored-by: Sam Gross <655866+colesbury@users.noreply.github.com>	2024-09-14 00:23:51 +08:00
Sam Gross	b2afe2aae4	gh-123923: Defer refcounting for `f_executable` in `_PyInterpreterFrame` (#123924) Use a `_PyStackRef` and defer the reference to `f_executable` when possible. This avoids some reference count contention in the common case of executing the same code object from multiple threads concurrently in the free-threaded build.	2024-09-12 12:37:06 -04:00
Tushar Sadhwani	3597642ed5	gh-122239: Add actual count in unbalanced unpacking error message when possible (#122244)	2024-09-10 16:07:30 +01:00
Sam Gross	556e855684	gh-117376: Make `Py_DECREF` a macro in ceval.c in free-threaded build (#122975) `Py_DECREF` and `PyStackRef_CLOSE` are now implemented as macros in the free-threaded build in ceval.c. There are two motivations: * MSVC has problems inlining functions in ceval.c in the PGO build. * We will want to mark escaping calls in order to spill the stack pointer in ceval.c and we will want to do this around `_Py_Dealloc` (or `_Py_MergeZeroLocalRefcount` or `_Py_DecRefShared`), not around the entire `Py_DECREF` or `PyStackRef_CLOSE` call.	2024-08-23 15:36:14 -04:00
Mark Shannon	bb1d30336e	GH-118093: Make `CALL_ALLOC_AND_ENTER_INIT` suitable for tier 2. (GH-123140) * Convert CALL_ALLOC_AND_ENTER_INIT to micro-ops such that tier 2 supports it * Allow inexact arguments for CALL_ALLOC_AND_ENTER_INIT.	2024-08-20 16:52:58 +01:00
Victor Stinner	4767a6e31c	gh-122728: Fix SystemError in PyEval_GetLocals() (#122735) Fix PyEval_GetLocals() to avoid SystemError ("bad argument to internal function"). Don't redefine the 'ret' variable in the if block. Add a unit test on PyEval_GetLocals().	2024-08-06 23:01:44 +02:00
Mark Shannon	7aca84e557	GH-117224: Move the body of a few large-ish micro-ops into helper functions (GH-122601)	2024-08-02 16:31:17 +01:00
Brandt Bucher	15d4cd0967	GH-116090: Fire RAISE events from _FOR_ITER_TIER_TWO (GH-122413)	2024-07-29 12:17:47 -07:00
Mark Shannon	afb0aa6ed2	GH-121131: Clean up and fix some instrumented instructions. (GH-121132) * Add support for 'prev_instr' to code generator and refactor some INSTRUMENTED instructions	2024-07-26 12:24:12 +01:00
Brandt Bucher	7b36b67b1e	GH-118093: Add tier two support to several instructions (GH-121884)	2024-07-18 14:24:58 -07:00
Mark Shannon	169324c27a	GH-120024: Use pointer for stack pointer (GH-121923)	2024-07-18 12:47:21 +01:00
Tian Gao	e65cb4c6f0	gh-118934: Make PyEval_GetLocals return borrowed reference (#119769) Co-authored-by: Alyssa Coghlan <ncoghlan@gmail.com>	2024-07-16 12:17:47 -07:00
Michael Droettboom	d69529d31c	gh-121338: Remove #pragma optimize (#121340)	2024-07-08 08:48:42 -04:00
Sam Gross	8e8d202f55	gh-117139: Add _PyTuple_FromStackRefSteal and use it (#121244) Avoids the extra conversion from stack refs to PyObjects.	2024-07-02 12:30:14 -04:00
Brandt Bucher	33903c53db	GH-116017: Get rid of _COLD_EXITs (GH-120960)	2024-07-01 13:17:40 -07:00
Ken Jin	e6543daf12	gh-117139: Fix a few wrong steals in bytecodes.c (GH-121127)	2024-06-29 02:14:48 +08:00
Ken Jin	22b0de2755	gh-117139: Convert the evaluation stack to stack refs (#118450) This PR sets up tagged pointers for CPython. The general idea is to create a separate struct _PyStackRef for everything on the evaluation stack to store the bits. This forces the C compiler to warn us if we try to cast things or pull things out of the struct directly. Only for free threading: We tag the low bit if something is deferred - that means we skip incref and decref operations on it. This behavior may change in the future if Mark's plans to defer all objects in the interpreter loop pan out. This implies a strict stack reference discipline is required. ALL incref and decref operations on stackrefs must use the stackref variants. It is unsafe to untag something then do normal incref/decref ops on it. The new incref and decref variants are called dup and close. They mimic a "handle" API operating on these stackrefs. Please read Include/internal/pycore_stackref.h for more information! --------- Co-authored-by: Mark Shannon <9448417+markshannon@users.noreply.github.com>	2024-06-27 03:10:43 +08:00
Irit Katriel	65a12c559c	gh-120834: fix type of *_iframe field in _PyGenObject_HEAD declaration (#120835)	2024-06-24 10:23:38 +01:00
Mark Shannon	9cefcc0ee7	GH-120507: Lower the `BEFORE_WITH` and `BEFORE_ASYNC_WITH` instructions. (#120640) * Remove BEFORE_WITH and BEFORE_ASYNC_WITH instructions. * Add LOAD_SPECIAL instruction * Reimplement `with` and `async with` statements using LOAD_SPECIAL	2024-06-18 12:17:46 +01:00
Xie Yanbo	9e052619a6	Fix typos in documentation and comments (#119763)	2024-06-04 10:22:22 +00:00
Irit Katriel	6e9863d7a3	gh-118692: Avoid creating unnecessary StopIteration instances for monitoring (#119216)	2024-05-21 20:42:51 +00:00
Nikita Sobolev	a8e5fed100	gh-118613: Fix error handling of `_PyEval_GetFrameLocals` in `ceval.c` (#118614)	2024-05-06 10:34:56 +03:00
Tian Gao	b034f14a4b	gh-74929: Implement PEP 667 (GH-115153)	2024-05-04 12:12:10 +01:00
Mark Shannon	1ab6356ebe	GH-118095: Use broader specializations of CALL in tier 1, for better tier 2 support of calls. (GH-118322) * Add CALL_PY_GENERAL, CALL_BOUND_METHOD_GENERAL and CALL_NON_PY_GENERAL specializations. * Remove CALL_PY_WITH_DEFAULTS specialization * Use CALL_NON_PY_GENERAL in more cases when otherwise failing to specialize	2024-05-04 12:11:11 +01:00
Tian Gao	9c14ed0618	gh-107674: Improve performance of `sys.settrace` (GH-117133) * Check tracing in RESUME_CHECK * Only change to RESUME_CHECK if not tracing	2024-05-03 19:49:24 +01:00
Guido van Rossum	7d83f7bcc4	gh-118335: Configure Tier 2 interpreter at build time (#118339) The code for Tier 2 is now only compiled when configured with `--enable-experimental-jit[=yes|interpreter]`. We drop support for `PYTHON_UOPS` and `-Xuops`, but you can disable the interpreter or JIT at runtime by setting `PYTHON_JIT=0`. You can also build it without enabling it by default using `--enable-experimental-jit=yes-off`; enable with `PYTHON_JIT=1`. On Windows, the `build.bat` script supports `--experimental-jit`, `--experimental-jit-off`, `--experimental-interpreter`. In the C code, `_Py_JIT` is defined as before when the JIT is enabled; the new variable `_Py_TIER2` is defined when the JIT or the interpreter is enabled. It is actually a bitmask: 1: JIT; 2: default-off; 4: interpreter.	2024-04-30 18:26:34 -07:00
Dino Viehland	4a1cf66c5c	gh-117657: Fix small issues with instrumentation and TSAN (#118064) Small TSAN fixups for instrumentation	2024-04-30 11:38:05 -07:00
Mark Shannon	3e06c7f719	GH-118095: Add dynamic exit support and FOR_ITER_GEN support to tier 2 (GH-118279)	2024-04-26 18:08:50 +01:00
Dino Viehland	07525c9a85	gh-116818: Make `sys.settrace`, `sys.setprofile`, and monitoring thread-safe (#116775) Makes sys.settrace, sys.setprofile, and monitoring generally thread-safe. Mostly uses a stop-the-world approach and synchronization around the code object's _co_instrumentation_version. There may be a little bit of extra synchronization around the monitoring data that's required to be TSAN clean.	2024-04-19 14:47:42 -07:00
Guido van Rossum	40f4d641a9	GH-118036: Fix a bug with CALL_STAT_INC (#117933) We were under-counting calls in `_PyEvalFramePushAndInit` because the `CALL_STAT_INC` macro was redefined to a no-op for the Tier 2 interpreter. The fix is not to `#undef` it at all. This results in ~37% more "Frames pushed" reported under "Call stats".	2024-04-18 07:59:02 -07:00
Jeff Glass	acf69e09c6	gh-115178: Add Counts of UOp Pairs to pystats (GH-115181)	2024-04-16 14:27:18 +01:00
Michael Droettboom	0edde64a41	GH-117457: Correct pystats uop "miss" counts (GH-117477)	2024-04-04 15:49:18 -07:00
Guido van Rossum	060a96f1a9	gh-116968: Reimplement Tier 2 counters (#117144) Introduce a unified 16-bit backoff counter type (``_Py_BackoffCounter``), shared between the Tier 1 adaptive specializer and the Tier 2 optimizer. The API used for adaptive specialization counters is changed but the behavior is (supposed to be) identical. The behavior of the Tier 2 counters is changed: - There are no longer dynamic thresholds (we never varied these). - All counters now use the same exponential backoff. - The counter for ``JUMP_BACKWARD`` starts counting down from 16. - The ``temperature`` in side exits starts counting down from 64.	2024-04-04 15:03:27 +00:00
Guido van Rossum	8eda146e87	Fix successor opcode name printing in Tier 2 DEOPT debug message (#117471)	2024-04-02 18:25:48 +00:00
Sam Gross	19c1dd60c5	gh-117323: Make `cell` thread-safe in free-threaded builds (#117330) Use critical sections to lock around accesses to cell contents. The critical sections are no-ops in the default (with GIL) build.	2024-03-29 13:35:43 -04:00
Michael Droettboom	26d328b2ba	GH-117121: Add pystats to JIT builds (GH-117346)	2024-03-28 15:23:08 -07:00
Mark Shannon	bf82f77957	GH-116422: Tier2 hot/cold splitting (GH-116813) Splits the "cold" path, deopts and exits, from the "hot" path, reducing the size of most jitted instructions, at the cost of slower exits.	2024-03-26 09:35:11 +00:00
Bogdan Romanyuk	a8e93d3dca	gh-115756: make PyCode_GetFirstFree an unstable API (GH-115781)	2024-03-19 09:20:38 +00:00
Guido van Rossum	76d0868907	Cleanup tier2 debug output (#116920) Various tweaks, including a slight refactor of the special cases for `_PUSH_FRAME`/`_POP_FRAME` to show the actual operand emitted.	2024-03-18 11:08:43 -07:00
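
Commit 7d83f7bcc4 in the log above describes `_Py_TIER2` as a bitmask (1: JIT; 2: default-off; 4: interpreter). The following is a minimal Python sketch of how such a value could be decoded. The constant names and the `describe_tier2` helper are hypothetical, chosen only for illustration; they are not CPython identifiers, and the log does not state which configure option produces which value.

```python
# Bit values as listed in the commit message for _Py_TIER2.
# The names below are hypothetical; only the numeric values come from the log.
TIER2_JIT = 1
TIER2_DEFAULT_OFF = 2
TIER2_INTERPRETER = 4


def describe_tier2(value):
    """Decode a _Py_TIER2-style bitmask into (backend, enabled_by_default)."""
    if value & TIER2_JIT:
        backend = "JIT"
    elif value & TIER2_INTERPRETER:
        backend = "interpreter"
    else:
        backend = "none"
    # A backend is on by default only if one is built and the
    # default-off bit is clear.
    enabled_by_default = backend != "none" and not (value & TIER2_DEFAULT_OFF)
    return backend, enabled_by_default


if __name__ == "__main__":
    for value in (0, 1, 3, 4, 6):
        print(value, describe_tier2(value))
```

Under this reading, a value with the default-off bit set would correspond to a build where the backend is compiled in but has to be switched on at runtime, which the commit message says is done with `PYTHON_JIT=1`.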

1539 Commits