There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`
and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following execution
involving threads A, B, and C:
1. A starts.
2. B joins A, blocking on its `_tstate_lock`.
3. C joins A, blocking on its `_tstate_lock`.
4. A finishes and releases its `_tstate_lock`.
5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped
out before calling `_stop()`.
6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped
out before releasing it.
7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.
However, C holds it, so the assertion fails.
The race can be reproduced[^3] by inserting sleeps at the appropriate points in
the threading code. To do so, run the `repro_join_race.py` from the linked repo.
There are two main parts to this PR:
1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.
The event is set by the runtime prior to the thread being cleared (in the same
place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the
event to be set.
2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all
non-daemon threads to exit. To do so, an `is_daemon` predicate was added to
`PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`
now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on
`_tstate_lock`s.
[^1]: 441affc9e7/Lib/threading.py (L1201)
[^2]: 441affc9e7/Lib/threading.py (L1115)
[^3]: 8194653279
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Make `_thread.ThreadHandle` thread-safe in free-threaded builds
We protect the mutable state of `ThreadHandle` using a `_PyOnceFlag`.
Concurrent operations (i.e. `join` or `detach`) on `ThreadHandle` block
until it is their turn to execute or an earlier operation succeeds.
Once an operation has been applied successfully all future operations
complete immediately.
The `join()` method is now idempotent. It may be called multiple times
but the underlying OS thread will only be joined once. After `join()`
succeeds, any future calls to `join()` will succeed immediately.
The internal thread handle `detach()` method has been removed.
This marks dead ThreadHandles as non-joinable earlier in
`PyOS_AfterFork_Child()` before we execute any Python code. The handles
are stored in a global linked list in `_PyRuntimeState` because `fork()`
affects the entire process.
`threading.Lock` is now the underlying class and is constructable rather than the old
factory function. This allows for type annotations to refer to it which had no non-ugly
way to be expressed prior to this.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Always set a _MainThread as a main thread after os.fork() is called from
a thread started not by the threading module.
A new _MainThread was already set as a new main thread after fork if
threading.current_thread() was not called for a foreign thread before fork.
Now, if it was called before fork, the implicitly created _DummyThread will
be turned into _MainThread after fork.
It fixes, in particularly, an incompatibility of _DummyThread with
the threading shutdown logic which relies on the main thread
having tstate_lock.
Co-authored-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Joining a thread now ensures the underlying OS thread has exited. This is required for safer fork() in multi-threaded processes.
---------
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
In a few places we switch to another interpreter without knowing if it has a thread state associated with the current thread. For the main interpreter there wasn't much of a problem, but for subinterpreters we were *mostly* okay re-using the tstate created with the interpreter (located via PyInterpreterState_ThreadHead()). There was a good chance that tstate wasn't actually in use by another thread.
However, there are no guarantees of that. Furthermore, re-using an already used tstate is currently fragile. To address this, now we create a new thread state in each of those places and use it.
One consequence of this change is that PyInterpreterState_ThreadHead() may not return NULL (though that won't happen for the main interpreter).
The existence of background threads running on a subinterpreter was preventing interpreters from getting properly destroyed, as well as impacting the ability to run the interpreter again. It also affected how we wait for non-daemon threads to finish.
We add PyInterpreterState.threads.main, with some internal C-API functions.
Having a separate lock means Thread.join() doesn't need to wait for the thread to be cleaned up first. It can wait for the thread's Python target to finish running. This gives us some flexibility in how we clean up threads.
(This is a minor cleanup as part of a fix for gh-104341.)
Not comprehensive, best effort warning. There are cases when threads exist on some platforms that this code cannot detect. macOS when API permissions allow and Linux with a readable /proc procfs present are the currently supported cases where a warning should show up reliably.
Starting with a DeprecationWarning for now, it is less disruptive than something like RuntimeWarning and most likely to only be seen in people's CI tests - a good place to start with this messaging.
Previously, the optional restrictions on subinterpreters were: disallow fork, subprocess, and threads. By default, we were disallowing all three for "isolated" interpreters. We always allowed all three for the main interpreter and those created through the legacy `Py_NewInterpreter()` API.
Those settings were a bit conservative, so here we've adjusted the optional restrictions to: fork, exec, threads, and daemon threads. The default for "isolated" interpreters disables fork, exec, and daemon threads. Regular threads are allowed by default. We continue always allowing everything For the main interpreter and the legacy API.
In the code, we add `_PyInterpreterConfig.allow_exec` and `_PyInterpreterConfig.allow_daemon_threads`. We also add `Py_RTFLAGS_DAEMON_THREADS` and `Py_RTFLAGS_EXEC`.
* gh-93503: Add APIs to set profiling and tracing functions in all threads in the C-API
* Use a separate API
* Fix NEWS entry
* Add locks around the loop
* Document ignoring exceptions
* Use the new APIs in the sys module
* Update docs
If Condition.notify() was interrupted just after it released the waiter lock,
but before removing it from the queue, the following calls of notify() failed
with RuntimeError: cannot release un-acquired lock.
For threads, and for multiprocessing, it's always been the case that ``args=list`` works fine when passed to ``Process()`` or ``Thread()``, and such code is common in the wild. But, according to the docs, only a tuple can be used. This brings the docs into synch with reality.
Doc changes by Charlie Zhao.
Co-authored-by: Tim Peters <tim.peters@gmail.com>
Removed extra comma in comment that indicates state of a `Barrier` as it was confusing and breaking the flow while reading.
Co-authored-by: Priyank <5903604+cpriyank@users.noreply.github.com>
Fix the threading._shutdown() function when the threading module was
imported first from a thread different than the main thread: no
longer log an error at Python exit.
Fix a race condition in the Thread.join() method of the threading
module. If the function is interrupted by a signal and the signal
handler raises an exception, make sure that the thread remains in a
consistent state to prevent a deadlock.
The _bootstrap_inner() method of threading.Thread now reuses its
_delete() method rather than accessing _active() directly. It became
possible since _active_limbo_lock became reentrant. Moreover, it no
longer ignores any exception when deleting the thread from the
_active dictionary.
When a Thread is not joined after it has stopped, its lock may remain in the _shutdown_locks set until interpreter shutdown. If many threads are created this way, the _shutdown_locks set could therefore grow endlessly. To avoid such a situation, purge expired locks each time a new one is added or removed.
The snake_case names have existed since Python 2.6, so there is
no reason to keep the old camelCase names around. One similar
method, threading.Thread.isAlive, was already removed in
Python 3.9 (bpo-37804).
Fix the threading.Thread class at fork: do nothing if the thread is
already stopped (ex: fork called at Python exit). Previously, an
error was logged in the child process.
Add a private _at_fork_reinit() method to _thread.Lock,
_thread.RLock, threading.RLock and threading.Condition classes:
reinitialize the lock after fork in the child process; reset the lock
to the unlocked state.
Rename also the private _reset_internal_locks() method of
threading.Event to _at_fork_reinit().
* Add _PyThread_at_fork_reinit() private function. It is excluded
from the limited C API.
* threading.Thread._reset_internal_locks() now calls
_at_fork_reinit() on self._tstate_lock rather than creating a new
Python lock object.
Remove daemon threads from :mod:`concurrent.futures` by adding
an internal `threading._register_atexit()`, which calls registered functions
prior to joining all non-daemon threads. This allows for compatibility
with subinterpreters, which don't support daemon threads.
If fork was not called by a thread spawned by threading.Thread,
threading._after_fork() now creates a _MainThread instance for
_main_thread, instead of a _DummyThread instance.
* Use the 'p' format unit instead of manually called PyObject_IsTrue().
* Pass boolean value instead 0/1 integers to functions that needs boolean.
* Convert some arguments to boolean only once.