cpython/Tools/lockbench/lockbench.py

# Measure the performance of PyMutex and PyThread_type_lock locks
# with short critical sections.
#
# Usage: python Tools/lockbench/lockbench.py [CRITICAL_SECTION_LENGTH]
#
# How to interpret the results:
#
# Acquisitions (kHz): Reports the total number of lock acquisitions in
# thousands of acquisitions per second. This is the most important metric,
# particularly for the 1 thread case because even in multithreaded programs,
# most lock acquisitions are not contended. Values for 2+ threads are
# only meaningful for `--disable-gil` builds, because the GIL prevents most
# situations where there is lock contention with short critical sections.
#
# Fairness: A measure of how evenly the lock acquisitions are distributed.
# A fairness of 1.0 means that all threads acquired the lock the same number
# of times. A fairness of 1/N means that only one thread ever acquired the
# lock.
# See https://en.wikipedia.org/wiki/Fairness_measure#Jain's_fairness_index

from _testinternalcapi import benchmark_locks
import sys

# Max number of threads to test
MAX_THREADS = 10

# How much "work" to do while holding the lock
CRITICAL_SECTION_LENGTH = 1


def jains_fairness(values):
    # Jain's fairness index: (sum of x)^2 / (n * sum of x^2)
    # See https://en.wikipedia.org/wiki/Fairness_measure#Jain's_fairness_index
    return (sum(values) ** 2) / (len(values) * sum(x ** 2 for x in values))


def main():
    print("Lock Type           Threads           Acquisitions (kHz)   Fairness")
    for lock_type in ["PyMutex", "PyThread_type_lock"]:
        use_pymutex = (lock_type == "PyMutex")
        for num_threads in range(1, MAX_THREADS + 1):
            acquisitions, thread_iters = benchmark_locks(
                num_threads, use_pymutex, CRITICAL_SECTION_LENGTH)

            acquisitions /= 1000  # report in kHz for readability
            fairness = jains_fairness(thread_iters)

            print(f"{lock_type: <20}{num_threads: <18}{acquisitions: >5.0f}{fairness: >20.2f}")


if __name__ == "__main__":
    if len(sys.argv) > 1:
        CRITICAL_SECTION_LENGTH = int(sys.argv[1])
    main()
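
The fairness bounds described in the header comment (1.0 when every thread acquires the lock equally often, 1/N when a single thread takes every acquisition) can be checked by hand. The sketch below copies `jains_fairness` from the script and applies it to made-up per-thread counts; the counts are illustrative assumptions, not output from `benchmark_locks`.

```python
def jains_fairness(values):
    # Same formula as in lockbench.py: (sum x)^2 / (n * sum x^2)
    return (sum(values) ** 2) / (len(values) * sum(x ** 2 for x in values))


# Hypothetical per-thread acquisition counts for N = 4 threads.
even = jains_fairness([100, 100, 100, 100])   # every thread gets an equal share
starved = jains_fairness([400, 0, 0, 0])      # one thread takes every acquisition
skewed = jains_fairness([300, 100, 0, 0])     # two of the four threads never run

print(f"even:    {even:.2f}")     # 1.00
print(f"starved: {starved:.2f}")  # 0.25, i.e. 1/N with N = 4
print(f"skewed:  {skewed:.2f}")   # 0.40
```

Values between 1/N and 1.0 indicate partial starvation: the closer the index is to 1/N, the more the acquisitions are concentrated in a few threads.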
gh-108724: Add PyMutex and _PyParkingLot APIs (gh-109344)

PyMutex is a one-byte lock with fast, inlineable lock and unlock functions for the common uncontended case. The design is based on WebKit's WTF::Lock.

PyMutex is built using the _PyParkingLot APIs, which provide a cross-platform futex-like API (based on WebKit's WTF::ParkingLot). This internal API will be used for building other synchronization primitives used to implement PEP 703, such as one-time initialization and events.

This also includes tests and a mini benchmark in Tools/lockbench/lockbench.py to compare with the existing PyThread_type_lock.

Uncontended acquisition + release:

* Linux (x86-64): PyMutex: 11 ns, PyThread_type_lock: 44 ns
* macOS (arm64): PyMutex: 13 ns, PyThread_type_lock: 18 ns
* Windows (x86-64): PyMutex: 13 ns, PyThread_type_lock: 38 ns

PR Overview:

The primary purpose of this PR is to implement PyMutex, but there are a number of support pieces (described below).

* PyMutex: A 1-byte lock that doesn't require memory allocation to initialize and is generally faster than the existing PyThread_type_lock. The API is internal only for now.
* _PyParkingLot: A futex-like API based on the API of the same name in WebKit. Used to implement PyMutex.
* _PyRawMutex: A word-sized lock used to implement _PyParkingLot.
* PyEvent: A one-time event. This was used frequently in the "nogil" fork and is useful for testing the PyMutex implementation, so it is included as part of the PR.
* pycore_llist.h: Defines common operations on doubly linked lists. Not strictly necessary (the list operations could be done manually), but they come up frequently in the "nogil" fork. (Similar to https://man.freebsd.org/cgi/man.cgi?queue)

---------

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>

2023-09-19 12:54:29 -03:00
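
The commit message quotes uncontended cost in nanoseconds per acquisition + release, while the benchmark script prints a rate in kHz. As a rough aid for relating the two, the sketch below converts one to the other. `ns_to_khz` is a hypothetical helper, not part of lockbench.py, and the conversion assumes the whole loop iteration costs exactly the quoted time, so rates printed by an actual run may differ.

```python
def ns_to_khz(ns_per_acquisition):
    # Hypothetical helper: 1e9 ns per second gives acquisitions per second,
    # and dividing by 1000 expresses that in thousands per second (kHz).
    return 1e9 / ns_per_acquisition / 1000


# Uncontended figures quoted in the commit message: (PyMutex, PyThread_type_lock).
reported_ns = {
    "Linux (x86-64)": (11, 44),
    "macOS (arm64)": (13, 18),
    "Windows (x86-64)": (13, 38),
}

for platform, (pymutex_ns, thread_lock_ns) in reported_ns.items():
    print(f"{platform: <18}"
          f"PyMutex: {ns_to_khz(pymutex_ns):>8.0f} kHz   "
          f"PyThread_type_lock: {ns_to_khz(thread_lock_ns):>8.0f} kHz")
```

Under that assumption, the 11 ns figure corresponds to roughly 90,909 kHz and the 44 ns figure to roughly 22,727 kHz.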