Reword and restructure the GIL API doc

This commit is contained in:
Antoine Pitrou 2011-01-15 12:54:19 +00:00
parent 9bf8d1c228
commit bedd2c2d88
1 changed files with 154 additions and 150 deletions


@ -366,48 +366,47 @@ Thread State and the Global Interpreter Lock
single: lock, interpreter
The Python interpreter is not fully thread-safe. In order to support
multi-threaded Python programs, there's a global lock, called the :dfn:`global
interpreter lock` or :dfn:`GIL`, that must be held by the current thread before
multi-threaded Python programs, there's a global lock, called the :term:`global
interpreter lock` or :term:`GIL`, that must be held by the current thread before
it can safely access Python objects. Without the lock, even the simplest
operations could cause problems in a multi-threaded program: for example, when
two threads simultaneously increment the reference count of the same object, the
reference count could end up being incremented only once instead of twice.
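
The lost increment described above can be reproduced deterministically with a toy model. The sketch below is only an illustration: ``ToyObject`` and the ``toy_*`` helpers are hypothetical names, not part of the Python/C API, and the two "threads" are simulated by interleaving their load and store steps by hand rather than by running real threads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an object header; not CPython's PyObject. */
typedef struct {
    long refcount;
} ToyObject;

/* An increment is really three steps: load, add one, store. */
long toy_load(const ToyObject *ob)
{
    assert(ob != NULL);
    return ob->refcount;
}

void toy_store(ToyObject *ob, long value)
{
    assert(ob != NULL);
    ob->refcount = value;
}

/* No lock: both simulated threads load the old value before either stores. */
long toy_unlocked_increments(ToyObject *ob)
{
    long a = toy_load(ob);   /* thread A loads */
    long b = toy_load(ob);   /* thread B loads the same value */
    toy_store(ob, a + 1);    /* thread A stores */
    toy_store(ob, b + 1);    /* thread B overwrites A's update */
    return ob->refcount;
}

/* With a lock, each load/add/store sequence completes before the next starts. */
long toy_locked_increments(ToyObject *ob)
{
    long a = toy_load(ob);
    toy_store(ob, a + 1);    /* thread A finishes first */
    long b = toy_load(ob);
    toy_store(ob, b + 1);    /* thread B sees A's update */
    return ob->refcount;
}
```

Starting from a reference count of 1, the unlocked interleaving ends at 2 while the serialized one ends at 3. This is the kind of discrepancy the lock is meant to rule out.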
.. index:: single: setcheckinterval() (in module sys)
.. index:: single: setswitchinterval() (in module sys)
Therefore, the rule exists that only the thread that has acquired the global
interpreter lock may operate on Python objects or call Python/C API functions.
In order to support multi-threaded Python programs, the interpreter regularly
releases and reacquires the lock --- by default, every 100 bytecode instructions
(this can be changed with :func:`sys.setcheckinterval`). The lock is also
released and reacquired around potentially blocking I/O operations like reading
or writing a file, so that other threads can run while the thread that requests
the I/O is waiting for the I/O operation to complete.
Therefore, the rule exists that only the thread that has acquired the
:term:`GIL` may operate on Python objects or call Python/C API functions.
In order to emulate concurrency of execution, the interpreter regularly
tries to switch threads (see :func:`sys.setswitchinterval`). The lock is also
released around potentially blocking I/O operations like reading or writing
a file, so that other Python threads can run in the meantime.
.. index::
single: PyThreadState
single: PyThreadState
The Python interpreter needs to keep some bookkeeping information separate per
thread --- for this it uses a data structure called :c:type:`PyThreadState`.
There's one global variable, however: the pointer to the current
:c:type:`PyThreadState` structure. Before the addition of :dfn:`thread-local
storage` (:dfn:`TLS`) the current thread state had to be manipulated
explicitly.
The Python interpreter keeps some thread-specific bookkeeping information
inside a data structure called :c:type:`PyThreadState`. There's also one
global variable pointing to the current :c:type:`PyThreadState`: it can
be retrieved using :c:func:`PyThreadState_Get`.
This is easy enough in most cases. Most code manipulating the global
interpreter lock has the following simple structure::
Releasing the GIL from extension code
-------------------------------------
Most extension code manipulating the :term:`GIL` has the following simple
structure::
Save the thread state in a local variable.
Release the global interpreter lock.
...Do some blocking I/O operation...
... Do some blocking I/O operation ...
Reacquire the global interpreter lock.
Restore the thread state from the local variable.
This is so common that a pair of macros exists to simplify it::
Py_BEGIN_ALLOW_THREADS
...Do some blocking I/O operation...
... Do some blocking I/O operation ...
Py_END_ALLOW_THREADS
.. index::
@ -416,9 +415,8 @@ This is so common that a pair of macros exists to simplify it::
The :c:macro:`Py_BEGIN_ALLOW_THREADS` macro opens a new block and declares a
hidden local variable; the :c:macro:`Py_END_ALLOW_THREADS` macro closes the
block. Another advantage of using these two macros is that when Python is
compiled without thread support, they are defined empty, thus saving the thread
state and GIL manipulations.
block. These two macros are still available when Python is compiled without
thread support (they simply have an empty expansion).
When thread support is enabled, the block above expands to the following code::
@ -428,65 +426,60 @@ When thread support is enabled, the block above expands to the following code::
...Do some blocking I/O operation...
PyEval_RestoreThread(_save);
Using even lower level primitives, we can get roughly the same effect as
follows::
PyThreadState *_save;
_save = PyThreadState_Swap(NULL);
PyEval_ReleaseLock();
...Do some blocking I/O operation...
PyEval_AcquireLock();
PyThreadState_Swap(_save);
.. index::
single: PyEval_RestoreThread()
single: errno
single: PyEval_SaveThread()
single: PyEval_ReleaseLock()
single: PyEval_AcquireLock()
There are some subtle differences; in particular, :c:func:`PyEval_RestoreThread`
saves and restores the value of the global variable :c:data:`errno`, since the
lock manipulation does not guarantee that :c:data:`errno` is left alone. Also,
when thread support is disabled, :c:func:`PyEval_SaveThread` and
:c:func:`PyEval_RestoreThread` don't manipulate the GIL; in this case,
:c:func:`PyEval_ReleaseLock` and :c:func:`PyEval_AcquireLock` are not available.
This is done so that dynamically loaded extensions compiled with thread support
enabled can be loaded by an interpreter that was compiled with disabled thread
support.
Here is how these functions work: the global interpreter lock is used to protect the pointer to the
current thread state. When releasing the lock and saving the thread state,
the current thread state pointer must be retrieved before the lock is released
(since another thread could immediately acquire the lock and store its own thread
state in the global variable). Conversely, when acquiring the lock and restoring
the thread state, the lock must be acquired before storing the thread state
pointer.
The global interpreter lock is used to protect the pointer to the current thread
state. When releasing the lock and saving the thread state, the current thread
state pointer must be retrieved before the lock is released (since another
thread could immediately acquire the lock and store its own thread state in the
global variable). Conversely, when acquiring the lock and restoring the thread
state, the lock must be acquired before storing the thread state pointer.
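
The ordering rule above can be sketched with a small single-file model. This is not how the interpreter is implemented: ``ToyThreadState``, ``toy_current``, ``toy_lock_owner`` and the ``toy_*`` functions are hypothetical, and the two threads are simulated by calling the functions in a fixed order. The model only shows where the pointer is read and written relative to the lock operations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread bookkeeping; not the real PyThreadState. */
typedef struct {
    int id;
} ToyThreadState;

ToyThreadState *toy_current = NULL;  /* protected by the toy lock */
int toy_lock_owner = 0;              /* 0 = unlocked, else owning thread id */

void toy_acquire(int who)
{
    assert(toy_lock_owner == 0);     /* a real lock would block here */
    toy_lock_owner = who;
}

void toy_release(int who)
{
    assert(toy_lock_owner == who);
    toy_lock_owner = 0;
}

/* Save: read the pointer while the lock is still held, then release. */
ToyThreadState *toy_save_thread(int who)
{
    ToyThreadState *saved = toy_current;
    toy_current = NULL;
    toy_release(who);
    return saved;
}

/* Restore: acquire the lock first, only then store the pointer. */
void toy_restore_thread(int who, ToyThreadState *saved)
{
    toy_acquire(who);
    toy_current = saved;
}

/* Hand the lock from thread 1 to thread 2 and back.
   Returns 1 if every thread saw its own state while it was running. */
int toy_handoff_demo(void)
{
    ToyThreadState a = {1}, b = {2};
    int ok = 1;

    toy_acquire(1);
    toy_current = &a;                               /* thread 1 is running */

    ToyThreadState *saved_a = toy_save_thread(1);   /* thread 1 starts I/O */
    toy_restore_thread(2, &b);                      /* thread 2 runs */
    ok = ok && (toy_current == &b);

    ToyThreadState *saved_b = toy_save_thread(2);   /* thread 2 yields */
    toy_restore_thread(1, saved_a);                 /* thread 1 resumes */
    ok = ok && (toy_current == &a) && (saved_b == &b);

    (void)toy_save_thread(1);                       /* leave the lock free */
    return ok;
}
```

If ``toy_save_thread`` read ``toy_current`` after calling ``toy_release``, thread 2 could already have stored its own pointer there, and thread 1 would later restore the wrong state.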
.. note::
Calling system I/O functions is the most common use case for releasing
the GIL, but it can also be useful before calling long-running computations
which don't need access to Python objects, such as compression or
cryptographic functions operating over memory buffers. For example, the
standard :mod:`zlib` and :mod:`hashlib` modules release the GIL when
compressing or hashing data.
It is important to note that when threads are created from C, they don't have
the global interpreter lock, nor is there a thread state data structure for
them. Such threads must bootstrap themselves into existence, by first
creating a thread state data structure, then acquiring the lock, and finally
storing their thread state pointer, before they can start using the Python/C
API. When they are done, they should reset the thread state pointer, release
the lock, and finally free their thread state data structure.
Non-Python created threads
--------------------------
Threads can take advantage of the :c:func:`PyGILState_\*` functions to do all of
the above automatically. The typical idiom for calling into Python from a C
thread is now::
When threads are created using the dedicated Python APIs (such as the
:mod:`threading` module), a thread state is automatically associated with them
and the code shown above is therefore correct. However, when threads are
created from C (for example by a third-party library with its own thread
management), they don't hold the GIL, nor is there a thread state structure
for them.
If you need to call Python code from these threads (often this will be part
of a callback API provided by the aforementioned third-party library),
you must first register these threads with the interpreter by
creating a thread state data structure, then acquiring the GIL, and finally
storing their thread state pointer, before you can start using the Python/C
API. When you are done, you should reset the thread state pointer, release
the GIL, and finally free the thread state data structure.
The :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release` functions do
all of the above automatically. The typical idiom for calling into Python
from a C thread is::
PyGILState_STATE gstate;
gstate = PyGILState_Ensure();
/* Perform Python actions here. */
/* Perform Python actions here. */
result = CallSomeFunction();
/* evaluate result */
/* evaluate result or handle exception */
/* Release the thread. No Python API allowed beyond this point. */
PyGILState_Release(gstate);
Note that the :c:func:`PyGILState_\*` functions assume there is only one global
interpreter (created automatically by :c:func:`Py_Initialize`). Python still
interpreter (created automatically by :c:func:`Py_Initialize`). Python
supports the creation of additional interpreters (using
:c:func:`Py_NewInterpreter`), but mixing multiple interpreters and the
:c:func:`PyGILState_\*` API is unsupported.
@ -509,6 +502,12 @@ being held by a thread that is defunct after the fork.
always able to.
High-level API
--------------
These are the most commonly used types and functions when writing C extension
code, or when embedding the Python interpreter:
.. c:type:: PyInterpreterState
This data structure represents the state shared by a number of cooperating
@ -550,21 +549,22 @@ always able to.
.. index:: module: _thread
When only the main thread exists, no GIL operations are needed. This is a
common situation (most Python programs do not use threads), and the lock
operations slow the interpreter down a bit. Therefore, the lock is not
created initially. This situation is equivalent to having acquired the lock:
when there is only a single thread, all object accesses are safe. Therefore,
when this function initializes the global interpreter lock, it also acquires
it. Before the Python :mod:`_thread` module creates a new thread, knowing
that either it has the lock or the lock hasn't been created yet, it calls
:c:func:`PyEval_InitThreads`. When this call returns, it is guaranteed that
the lock has been created and that the calling thread has acquired it.
.. note::
When only the main thread exists, no GIL operations are needed. This is a
common situation (most Python programs do not use threads), and the lock
operations slow the interpreter down a bit. Therefore, the lock is not
created initially. This situation is equivalent to having acquired the lock:
when there is only a single thread, all object accesses are safe. Therefore,
when this function initializes the global interpreter lock, it also acquires
it. Before the Python :mod:`_thread` module creates a new thread, knowing
that either it has the lock or the lock hasn't been created yet, it calls
:c:func:`PyEval_InitThreads`. When this call returns, it is guaranteed that
the lock has been created and that the calling thread has acquired it.
It is **not** safe to call this function when it is unknown which thread (if
any) currently has the global interpreter lock.
It is **not** safe to call this function when it is unknown which thread (if
any) currently has the global interpreter lock.
This function is not available when thread support is disabled at compile time.
This function is not available when thread support is disabled at compile time.
.. c:function:: int PyEval_ThreadsInitialized()
@ -575,37 +575,6 @@ always able to.
not available when thread support is disabled at compile time.
.. c:function:: void PyEval_AcquireLock()
Acquire the global interpreter lock. The lock must have been created earlier.
If this thread already has the lock, a deadlock ensues. This function is not
available when thread support is disabled at compile time.
.. c:function:: void PyEval_ReleaseLock()
Release the global interpreter lock. The lock must have been created earlier.
This function is not available when thread support is disabled at compile time.
.. c:function:: void PyEval_AcquireThread(PyThreadState *tstate)
Acquire the global interpreter lock and set the current thread state to
*tstate*, which should not be *NULL*. The lock must have been created earlier.
If this thread already has the lock, deadlock ensues. This function is not
available when thread support is disabled at compile time.
.. c:function:: void PyEval_ReleaseThread(PyThreadState *tstate)
Reset the current thread state to *NULL* and release the global interpreter
lock. The lock must have been created earlier and must be held by the current
thread. The *tstate* argument, which must not be *NULL*, is only used to check
that it represents the current thread state --- if it isn't, a fatal error is
reported. This function is not available when thread support is disabled at
compile time.
.. c:function:: PyThreadState* PyEval_SaveThread()
Release the global interpreter lock (if it has been created and thread
@ -624,6 +593,20 @@ always able to.
when thread support is disabled at compile time.)
.. c:function:: PyThreadState* PyThreadState_Get()
Return the current thread state. The global interpreter lock must be held.
When the current thread state is *NULL*, this issues a fatal error (so that
the caller needn't check for *NULL*).
.. c:function:: PyThreadState* PyThreadState_Swap(PyThreadState *tstate)
Swap the current thread state with the thread state given by the argument
*tstate*, which may be *NULL*. The global interpreter lock must be held
and is not released.
.. c:function:: void PyEval_ReInitThreads()
This function is called from :c:func:`PyOS_AfterFork` to ensure that newly
@ -631,6 +614,43 @@ always able to.
are not running in the child process.
The following functions use thread-local storage, and are not compatible
with sub-interpreters:
.. c:function:: PyGILState_STATE PyGILState_Ensure()
Ensure that the current thread is ready to call the Python C API regardless
of the current state of Python, or of the global interpreter lock. This may
be called as many times as desired by a thread as long as each call is
matched with a call to :c:func:`PyGILState_Release`. In general, other
thread-related APIs may be used between :c:func:`PyGILState_Ensure` and
:c:func:`PyGILState_Release` calls as long as the thread state is restored to
its previous state before the Release(). For example, normal usage of the
:c:macro:`Py_BEGIN_ALLOW_THREADS` and :c:macro:`Py_END_ALLOW_THREADS` macros is
acceptable.
The return value is an opaque "handle" to the thread state when
:c:func:`PyGILState_Ensure` was called, and must be passed to
:c:func:`PyGILState_Release` to ensure Python is left in the same state. Even
though recursive calls are allowed, these handles *cannot* be shared - each
unique call to :c:func:`PyGILState_Ensure` must save the handle for its call
to :c:func:`PyGILState_Release`.
When the function returns, the current thread will hold the GIL and be able
to call arbitrary Python code. Failure is a fatal error.
.. c:function:: void PyGILState_Release(PyGILState_STATE)
Release any resources previously acquired. After this call, Python's state will
be the same as it was prior to the corresponding :c:func:`PyGILState_Ensure` call
(but generally this state will be unknown to the caller, hence the use of the
GILState API).
Every call to :c:func:`PyGILState_Ensure` must be matched by a call to
:c:func:`PyGILState_Release` on the same thread.
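
Why each call has to keep its own handle can be seen in a toy model of the pairing. The names below (``ToyGilHandle``, ``toy_gil_ensure``, ``toy_gil_release``, ``toy_gil_held``) are hypothetical and model a single thread only. The sketch assumes the handle simply records whether the lock was already held when the call was made, and it leaves out thread state creation and everything else the real functions do.

```c
#include <assert.h>

/* Hypothetical handle: what the lock state was before the call. */
typedef enum {
    TOY_WAS_UNLOCKED,
    TOY_WAS_LOCKED
} ToyGilHandle;

int toy_gil_held = 0;   /* does the (single) modelled thread hold the lock? */

/* Record the previous state, then make sure the lock is held. */
ToyGilHandle toy_gil_ensure(void)
{
    ToyGilHandle handle = toy_gil_held ? TOY_WAS_LOCKED : TOY_WAS_UNLOCKED;
    toy_gil_held = 1;
    return handle;
}

/* Put the lock back into the state recorded in the handle. */
void toy_gil_release(ToyGilHandle handle)
{
    assert(toy_gil_held);
    toy_gil_held = (handle == TOY_WAS_LOCKED);
}

/* Nested use: the inner pair must not drop the lock the outer pair took.
   Returns 1 if the lock stayed held until the outer release. */
int toy_nested_demo(void)
{
    ToyGilHandle outer = toy_gil_ensure();   /* takes the lock */
    ToyGilHandle inner = toy_gil_ensure();   /* lock already held */

    toy_gil_release(inner);                  /* restores "held" */
    int still_held = toy_gil_held;

    toy_gil_release(outer);                  /* restores "not held" */
    return still_held && !toy_gil_held;
}
```

In this model, passing ``outer`` to the inner release would drop the lock while the outer caller still expects to hold it, so the handles cannot be interchanged.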
The following macros are normally used without a trailing semicolon; look for
example usage in the Python source distribution.
@ -664,6 +684,10 @@ example usage in the Python source distribution.
:c:macro:`Py_BEGIN_ALLOW_THREADS` without the opening brace and variable
declaration. It is a no-op when thread support is disabled at compile time.
Low-level API
-------------
All of the following functions are only available when thread support is enabled
at compile time, and must be called only when the global interpreter lock has
been created.
@ -709,19 +733,6 @@ been created.
:c:func:`PyThreadState_Clear`.
.. c:function:: PyThreadState* PyThreadState_Get()
Return the current thread state. The global interpreter lock must be held.
When the current thread state is *NULL*, this issues a fatal error (so that
the caller needn't check for *NULL*).
.. c:function:: PyThreadState* PyThreadState_Swap(PyThreadState *tstate)
Swap the current thread state with the thread state given by the argument
*tstate*, which may be *NULL*. The global interpreter lock must be held.
.. c:function:: PyObject* PyThreadState_GetDict()
Return a dictionary in which extensions can store thread-specific state
@ -742,38 +753,31 @@ been created.
exception (if any) for the thread is cleared. This raises no exceptions.
.. c:function:: PyGILState_STATE PyGILState_Ensure()
.. c:function:: void PyEval_AcquireThread(PyThreadState *tstate)
Ensure that the current thread is ready to call the Python C API regardless
of the current state of Python, or of the global interpreter lock. This may
be called as many times as desired by a thread as long as each call is
matched with a call to :c:func:`PyGILState_Release`. In general, other
thread-related APIs may be used between :c:func:`PyGILState_Ensure` and
:c:func:`PyGILState_Release` calls as long as the thread state is restored to
its previous state before the Release(). For example, normal usage of the
:c:macro:`Py_BEGIN_ALLOW_THREADS` and :c:macro:`Py_END_ALLOW_THREADS` macros is
acceptable.
The return value is an opaque "handle" to the thread state when
:c:func:`PyGILState_Ensure` was called, and must be passed to
:c:func:`PyGILState_Release` to ensure Python is left in the same state. Even
though recursive calls are allowed, these handles *cannot* be shared - each
unique call to :c:func:`PyGILState_Ensure` must save the handle for its call
to :c:func:`PyGILState_Release`.
When the function returns, the current thread will hold the GIL. Failure is a
fatal error.
Acquire the global interpreter lock and set the current thread state to
*tstate*, which should not be *NULL*. The lock must have been created earlier.
If this thread already has the lock, deadlock ensues.
.. c:function:: void PyGILState_Release(PyGILState_STATE)
.. c:function:: void PyEval_ReleaseThread(PyThreadState *tstate)
Release any resources previously acquired. After this call, Python's state will
be the same as it was prior to the corresponding :c:func:`PyGILState_Ensure` call
(but generally this state will be unknown to the caller, hence the use of the
GILState API.)
Reset the current thread state to *NULL* and release the global interpreter
lock. The lock must have been created earlier and must be held by the current
thread. The *tstate* argument, which must not be *NULL*, is only used to check
that it represents the current thread state --- if it isn't, a fatal error is
reported.
Every call to :c:func:`PyGILState_Ensure` must be matched by a call to
:c:func:`PyGILState_Release` on the same thread.
.. c:function:: void PyEval_AcquireLock()
Acquire the global interpreter lock. The lock must have been created earlier.
If this thread already has the lock, a deadlock ensues.
.. c:function:: void PyEval_ReleaseLock()
Release the global interpreter lock. The lock must have been created earlier.
Sub-interpreter support