Commit Graph

4368 Commits

Author SHA1 Message Date
Petr Viktorin 6f1d448bc1
gh-113993: Allow interned strings to be mortal, and fix related issues (GH-120520)
* Add an InternalDocs file describing how interning should work and how to use it.

* Add internal functions to *explicitly* request what kind of interning is done:
  - `_PyUnicode_InternMortal`
  - `_PyUnicode_InternImmortal`
  - `_PyUnicode_InternStatic`

* Switch uses of `PyUnicode_InternInPlace` to those.

* Disallow using `_Py_SetImmortal` on strings directly.
  You should use `_PyUnicode_InternImmortal` instead:
  - Strings should be interned before immortalization, otherwise you're possibly
    interning a immortalizing copy.
  - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
    `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
    backports, as they are now part of public API and version-specific ABI.

* Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.

* Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
  - `_Py_ID`
  - `_Py_STR` (including the empty string)
  - one-character latin-1 singletons

  Now, when you intern a singleton, that exact singleton will be interned.

* Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).

* Intern `_Py_STR` singletons at startup.

* For free-threaded builds, intern `_Py_LATIN1_CHR` singletons at startup.

* Beef up the tests. Cover internal details (marked with `@cpython_only`).

* Add lots of assertions

Co-Authored-By: Eric Snow <ericsnowcurrently@gmail.com>
2024-06-21 17:19:31 +02:00
Xarblu 285f42c850
GH-120602: Support LLVM_VERSION_SUFFIX for JIT builds (GH-120604) 2024-06-19 17:48:00 -07:00
yf-yang ace2045ea6
Fix types in pegen parser generator (GH-120720) 2024-06-19 16:12:40 +02:00
Diego Russo a0dce37895
GH-119726: Deduplicate JIT trampolines for out-of-range jumps (GH-120250) 2024-06-18 18:27:02 -07:00
Diego Russo 07daaf1ce1
Ignore some failing tests in emulated JIT CI (GH-120375) 2024-06-18 18:24:29 -07:00
Victor Stinner 35b16795d1
gh-120417: Remove unused imports in cases_generator (#120622) 2024-06-17 21:58:56 +02:00
Daniele Parmeggiani 362cd2680b
gh-117657: Fix `__slots__` thread safety in free-threaded build (#119368)
Fix a race in `PyMember_GetOne` and `PyMember_SetOne` for `Py_T_OBJECT_EX`.
These functions implement `__slots__` accesses for Python objects.
2024-06-17 18:44:54 +00:00
Sam Gross 460cc9e14e
gh-117657: Fix TSan reported data race on ioctl_works (#120175) 2024-06-17 13:23:40 -04:00
Victor Stinner d9b4316374
gh-120417: Remove unused imports in Tools (#120623) 2024-06-17 18:09:26 +02:00
Victor Stinner 6acf7776ef
gh-120507: Double WASI memory (#120648)
Use 16 MiB stack with 40 MiB memory limit, instead of 8 MiB stack
with 20 MiB memory limit.
2024-06-17 16:08:05 +00:00
Kirill Podoprigora 95737bbf18
gh-120433: Mention ``chocolatey`` for installing llvm on Windows as an alternative option (#120434) 2024-06-17 15:52:07 +00:00
AN Long 2bacc2343c
gh-117657: Add TSAN suppression for set_default_allocator_unlocked (#120500)
Add TSAN suppression for set_default_allocator_unlocked
2024-06-15 00:10:18 +08:00
Ken Jin eebae2c460
gh-117657: Make PyType_HasFeature atomic (GH-120210)
Make PyType_HasFeature atomic
2024-06-13 17:29:19 +08:00
neonene 127c1d2771
gh-71587: Drop local reference cache to `_strptime` module in `_datetime` (gh-120224)
The _strptime module object was cached in a static local variable (in the datetime.strptime() implementation).  That's a problem when it crosses isolation boundaries, such as reinitializing the runtme or between interpreters.  This change fixes the problem by dropping the static variable, instead always relying on the normal sys.modules cache (via PyImport_Import()).
2024-06-12 10:46:39 -06:00
Ken Jin e16aed63f6
gh-117657: Make Py_TYPE and Py_SET_TYPE thread safe (GH-120165)
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Nadeshiko Manju <me@manjusaka.me>
2024-06-12 20:41:07 +08:00
Sam Gross e21057b999
gh-117657: Fix TSAN race involving import lock (#118523)
This adds a `_PyRecursiveMutex` type based on `PyMutex` and uses that
for the import lock. This fixes some data races in the free-threaded
build and generally simplifies the import lock code.
2024-06-06 13:40:58 -04:00
Sam Gross e69d068ad0
gh-117657: Fix race involving GC and heap initialization (#119923)
The `_PyThreadState_Bind()` function is called before the first
`PyEval_AcquireThread()` so it's not synchronized with the stop the
world GC. We had a race where `gc_visit_heaps()` might visit a thread's
heap while it's being initialized.

Use a simple atomic int to avoid visiting heaps for threads that are not
yet fully initialized (i.e., before `tstate_mimalloc_bind()` is called).

The race was reproducible by running:
`python Lib/test/test_importlib/partial/pool_in_threads.py`.
2024-06-04 09:42:13 -04:00
Eric Snow 105f22ea46
gh-117398: Use Per-Interpreter State for the _datetime Static Types (gh-119929)
We make use of the same mechanism that we use for the static builtin types.  This required a few tweaks.

The relevant code could use some cleanup but I opted to avoid the significant churn in this change.  I'll tackle that separately.

This change is the final piece needed to make _datetime support multiple interpreters.  I've updated the module slot accordingly.
2024-06-03 17:09:18 -06:00
Sam Gross 47fb4327b5
gh-117657: Fix race involving immortalizing objects (#119927)
The free-threaded build currently immortalizes objects that use deferred
reference counting (see gh-117783). This typically happens once the
first non-main thread is created, but the behavior can be suppressed for
tests, in subinterpreters, or during a compile() call.

This fixes a race condition involving the tracking of whether the
behavior is suppressed.
2024-06-03 20:58:41 +00:00
Sam Gross 41c1cefbae
gh-117657: Avoid `sem_clockwait` in TSAN (#119915)
The `sem_clockwait` function is not currently instrumented, which leads
to false positives.
2024-06-03 13:42:27 -04:00
Steve Dower fd01271366
gh-119679: Ensures correct import libraries are included in Windows install packages (GH-119790) 2024-06-03 15:42:45 +01:00
Nikita Sobolev 1e5f615086
gh-116991: Improve `pygen --help` for `python` subparser (#116992) 2024-06-03 10:52:35 +03:00
Donghee Na 0594a27e5f
gh-117657: Fix data races report by TSAN unicode-hash (gh-119907) 2024-06-03 12:22:41 +09:00
Sam Gross f3b89a63cb
gh-117657: Fix TSAN reported race in `_PyEval_IsGILEnabled`. (#119921)
The GIL may be disabled concurrently with this call so we need to use a
relaxed atomic load.
2024-06-02 10:19:02 -04:00
Sam Gross 7dc745d1f5
gh-117657: Add TSAN suppression for `set_discard_entry` (#119908)
Seen in CI occasionally when running `test_weakref`.
2024-06-01 12:15:58 -04:00
Sam Gross 90ec19fd33
gh-117657: Fix TSAN race in QSBR assertion (#119887)
Due to a limitation in TSAN, all reads from `PyThreadState.state` must be
atomic to avoid reported races.
2024-06-01 10:04:38 -04:00
Sam Gross 60593b2052
gh-117657: Fix TSAN race in free-threaded GC (#119883)
Only call `gc_restore_tid()` from stop-the-world contexts.
`worklist_pop()` can be called while other threads are running, so use a
relaxed atomic to modify `ob_tid`.
2024-06-01 10:04:05 -04:00
dependabot[bot] 5152120ae7
Bump types-psutil from 5.9.5.20240423 to 5.9.5.20240516 in /Tools (#119900) 2024-06-01 10:38:13 +00:00
dependabot[bot] 51191dbfdd
build(deps-dev): bump types-setuptools from 69.5.0.20240423 to 70.0.0.20240524 in /Tools (#119899) 2024-06-01 10:11:53 +00:00
Katie Bell 010aaa32fb
gh-97747: Improvements to WASM browser REPL. (#97665)
Improvements to WASM browser REPL.

Adds a text box to write and run code outside the REPL, a stop button, and handling of Ctrl-D for EOF.
2024-05-31 09:58:46 +02:00
Petr Viktorin 48f21b3631
gh-118235: Move RAISE_SYNTAX_ERROR actions to invalid rules and make sure they stay there (GH-119731)
The Full Grammar specification in the docs omits rule actions, so grammar rules that raise a syntax error looked like valid syntax.
This was solved in ef940de by hiding those rules in the custom syntax highlighter.

This moves all syntax-error alternatives to invalid rules, adds a validator that ensures that actions containing RAISE_SYNTAX_ERROR are in invalid rules, and reverts the syntax highlighter hack.
2024-05-30 09:27:32 +02:00
Irit Katriel c1e9647107
gh-119689: generate stack effect metadata for pseudo instructions (#119691) 2024-05-29 09:47:56 +00:00
Victor Stinner 7ca74a760a
gh-119661: Add _Py_SINGLETON() include in Argumenet Clinic (#119712)
When the _Py_SINGLETON() is used, Argument Clinic now adds an
explicit "pycore_runtime.h" include to get the macro. Previously, the
macro may or may not be included indirectly by another include.
2024-05-29 11:37:04 +02:00
Victor Stinner 0518edc170
gh-119396: Optimize unicode_repr() (#119617)
Use stringlib to specialize unicode_repr() for each string kind
(UCS1, UCS2, UCS4).

Benchmark:

+-------------------------------------+---------+----------------------+
| Benchmark                           | ref     | change2              |
+=====================================+=========+======================+
| repr('abc')                         | 100 ns  | 103 ns: 1.02x slower |
+-------------------------------------+---------+----------------------+
| repr('a' * 100)                     | 369 ns  | 369 ns: 1.00x slower |
+-------------------------------------+---------+----------------------+
| repr(('a' + squote) * 100)          | 1.21 us | 946 ns: 1.27x faster |
+-------------------------------------+---------+----------------------+
| repr(('a' + nl) * 100)              | 1.23 us | 907 ns: 1.36x faster |
+-------------------------------------+---------+----------------------+
| repr(dquote + ('a' + squote) * 100) | 1.08 us | 858 ns: 1.25x faster |
+-------------------------------------+---------+----------------------+
| Geometric mean                      | (ref)   | 1.16x faster         |
+-------------------------------------+---------+----------------------+
2024-05-28 18:05:20 +02:00
Serhiy Storchaka b313cc68d5
gh-117557: Improve error messages when a string, bytes or bytearray of length 1 are expected (GH-117631) 2024-05-28 12:01:37 +03:00
Xie Yanbo bf08f0a5fe
Fix typos in comments (#119645) 2024-05-28 09:53:32 +02:00
Eric Snow b30d30c747
gh-117398: Statically Allocate the Datetime C-API (GH-119472) 2024-05-23 21:15:52 +02:00
neonene e12a6780bb
gh-117142: ctypes: Clean up c-analyzer .tsv files (GH-117544)
Co-authored-by: Petr Viktorin <encukou@gmail.com>
2024-05-22 20:30:41 +00:00
Michael Vincent c9073eb1a9
gh-117505: Run ensurepip in isolated env in Windows installer (GH-118257)
ensurepip forks a subprocess to run pip itself, but that subprocess only inherits a -I isolated mode flag (see _run_pip() in Lib/ensurepip/__init__.py), not the "-E -s" flags that the installer has been using. This means that parts of ensurepip don't actually run in an isolated environment and can make incorrect decisions based on packages installed in the user site-packages.
2024-05-22 18:59:47 +01:00
Eric Snow 81865002ae
gh-119213: Be More Careful About _PyArg_Parser.kwtuple Across Interpreters (gh-119331)
_PyArg_Parser holds static global data generated for modules by Argument Clinic.  The _PyArg_Parser.kwtuple field is a tuple object, even though it's stored within a static global.  In some cases the tuple is statically allocated and thus it's okay that it gets shared by multiple interpreters.  However, in other cases the tuple is set lazily, allocated from the heap using the active interprepreter at the point the tuple is needed.

This is a problem once that interpreter is destroyed since _PyArg_Parser.kwtuple becomes at dangling pointer, leading to crashes.  It isn't a problem if the tuple is allocated under the main interpreter, since its lifetime is bound to the lifetime of the runtime.  The solution here is to temporarily switch to the main interpreter.  The alternative would be to always statically allocate the tuple.

This change also fixes a bug where only the most recent parser was added to the global linked list.
2024-05-22 09:57:52 -06:00
Arnon Yaari 87939bd579
gh-117657: Fix itertools.count thread safety (#119268)
Fix itertools.count in free-threading mode
2024-05-21 10:16:34 -07:00
Seth Michael Larson 1195c164da
gh-112844: Update CPE references for external dependencies (#118521) 2024-05-20 13:27:09 -04:00
Brandt Bucher 4702b7b5bd
GH-118943: Fix a race condition when generating jit_stencils.h (GH-118957) 2024-05-16 12:11:42 -04:00
Miro Hrončok ab73bcdf73
Explain how to install LLVM on Fedora (GH-118983)
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
2024-05-16 12:09:52 -04:00
Michał Górny e04cd964eb
GH-118836: Fix JIT build error when SHT_NOTE section is present (GH-119000) 2024-05-13 14:37:02 -07:00
mpage b88889e9ff
gh-117657: Log TSAN warnings to separate files and archive them (#118747)
This ensures we don't lose races that occur in subprocesses or
interleave races from workers running in parallel.

Log files are collected and packaged into a zipfile that can be
downloaded from the "Artifacts" section of the workflow run.
2024-05-10 17:54:23 -04:00
Mark Shannon f5c6b9977a
GH-118910: Less boilerplate in the tier 2 optimizer (#118913) 2024-05-10 17:43:23 +01:00
Alex Turner 33d20199af
gh-117657: Fix QSBR race condition (#118843)
`_Py_qsbr_unregister` is called when the PyThreadState is already
detached, so the access to `tstate->qsbr` isn't safe without locking the
shared mutex. Grab the `struct _qsbr_shared` from the interpreter
instead.
2024-05-10 10:26:35 -04:00
mpage 22d5185308
gh-117657: Fix data races reported by TSAN on `interp->threads.main` (#118865)
Use relaxed loads/stores when reading/writing to this field.
2024-05-10 09:59:14 -04:00
Brett Simmers 98ff3f65c0
gh-117657: Replace TSAN suppresions with more specific rules (#118722)
Using `race:` filters out warnings if the function appears anywhere in
the stack trace. This can hide a lot of unrelated warnings, especially
for a function like `_PyEval_EvalFrameDefault`, which is somewhere on
the stack more often than not.

Change all free-threaded suppressions to `race_top:`, which only matches
the top frame, and add any new suppressions this exposes.
2024-05-09 17:02:39 -04:00