1995-01-17 12:29:31 -04:00
|
|
|
|
|
|
|
/* This code implemented by Dag.Gruneau@elsa.preseco.comm.se */
|
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 15:47:15 -03:00
|
|
|
/* Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru */
|
2002-02-28 17:34:34 -04:00
|
|
|
/* Eliminated some memory leaks, gsw@agere.com */
|
1995-01-17 12:29:31 -04:00
|
|
|
|
1997-08-14 17:12:58 -03:00
|
|
|
#include <windows.h>
|
|
|
|
#include <limits.h>
|
2006-06-10 09:23:46 -03:00
|
|
|
#ifdef HAVE_PROCESS_H
|
1997-08-14 17:12:58 -03:00
|
|
|
#include <process.h>
|
2006-06-10 09:23:46 -03:00
|
|
|
#endif
|
1995-01-17 12:29:31 -04:00
|
|
|
|
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 15:47:15 -03:00
|
|
|
typedef struct NRMUTEX {
|
|
|
|
LONG owned ;
|
|
|
|
DWORD thread_id ;
|
|
|
|
HANDLE hevent ;
|
|
|
|
} NRMUTEX, *PNRMUTEX ;
|
|
|
|
|
2006-06-04 09:59:59 -03:00
|
|
|
|
|
|
|
BOOL
|
|
|
|
InitializeNonRecursiveMutex(PNRMUTEX mutex)
|
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 15:47:15 -03:00
|
|
|
{
|
|
|
|
mutex->owned = -1 ; /* No threads have entered NonRecursiveMutex */
|
|
|
|
mutex->thread_id = 0 ;
|
|
|
|
mutex->hevent = CreateEvent(NULL, FALSE, FALSE, NULL) ;
|
|
|
|
return mutex->hevent != NULL ; /* TRUE if the mutex is created */
|
|
|
|
}
|
|
|
|
|
2006-06-04 09:59:59 -03:00
|
|
|
VOID
|
|
|
|
DeleteNonRecursiveMutex(PNRMUTEX mutex)
|
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 15:47:15 -03:00
|
|
|
{
|
|
|
|
/* No in-use check */
|
|
|
|
CloseHandle(mutex->hevent) ;
|
|
|
|
mutex->hevent = NULL ; /* Just in case */
|
|
|
|
}
|
|
|
|
|
2006-06-04 09:59:59 -03:00
|
|
|
DWORD
|
|
|
|
EnterNonRecursiveMutex(PNRMUTEX mutex, BOOL wait)
|
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 15:47:15 -03:00
|
|
|
{
|
|
|
|
/* Assume that the thread waits successfully */
|
|
|
|
DWORD ret ;
|
|
|
|
|
|
|
|
/* InterlockedIncrement(&mutex->owned) == 0 means that no thread currently owns the mutex */
|
|
|
|
if (!wait)
|
|
|
|
{
|
2007-05-03 17:27:03 -03:00
|
|
|
if (InterlockedCompareExchange(&mutex->owned, 0, -1) != -1)
|
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 15:47:15 -03:00
|
|
|
return WAIT_TIMEOUT ;
|
|
|
|
ret = WAIT_OBJECT_0 ;
|
|
|
|
}
|
|
|
|
else
|
|
|
|
ret = InterlockedIncrement(&mutex->owned) ?
|
|
|
|
/* Some thread owns the mutex, let's wait... */
|
|
|
|
WaitForSingleObject(mutex->hevent, INFINITE) : WAIT_OBJECT_0 ;
|
|
|
|
|
|
|
|
mutex->thread_id = GetCurrentThreadId() ; /* We own it */
|
|
|
|
return ret ;
|
|
|
|
}
|
|
|
|
|
2006-06-04 09:59:59 -03:00
|
|
|
BOOL
|
|
|
|
LeaveNonRecursiveMutex(PNRMUTEX mutex)
|
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 15:47:15 -03:00
|
|
|
{
|
|
|
|
/* We don't own the mutex */
|
|
|
|
mutex->thread_id = 0 ;
|
|
|
|
return
|
|
|
|
InterlockedDecrement(&mutex->owned) < 0 ||
|
|
|
|
SetEvent(mutex->hevent) ; /* Other threads are waiting, wake one on them up */
|
|
|
|
}
|
|
|
|
|
2006-06-04 09:59:59 -03:00
|
|
|
PNRMUTEX
|
|
|
|
AllocNonRecursiveMutex(void)
|
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 15:47:15 -03:00
|
|
|
{
|
|
|
|
PNRMUTEX mutex = (PNRMUTEX)malloc(sizeof(NRMUTEX)) ;
|
|
|
|
if (mutex && !InitializeNonRecursiveMutex(mutex))
|
|
|
|
{
|
|
|
|
free(mutex) ;
|
|
|
|
mutex = NULL ;
|
|
|
|
}
|
|
|
|
return mutex ;
|
|
|
|
}
|
|
|
|
|
2006-06-04 09:59:59 -03:00
|
|
|
void
|
|
|
|
FreeNonRecursiveMutex(PNRMUTEX mutex)
|
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 15:47:15 -03:00
|
|
|
{
|
|
|
|
if (mutex)
|
|
|
|
{
|
|
|
|
DeleteNonRecursiveMutex(mutex) ;
|
|
|
|
free(mutex) ;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
1998-12-21 15:32:43 -04:00
|
|
|
long PyThread_get_thread_ident(void);
|
1995-01-17 12:29:31 -04:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Initialization of the C package, should not be needed.
|
|
|
|
*/
|
2006-06-04 09:59:59 -03:00
|
|
|
static void
|
|
|
|
PyThread__init_thread(void)
|
1995-01-17 12:29:31 -04:00
|
|
|
{
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Thread support.
|
|
|
|
*/
|
2001-10-16 18:13:49 -03:00
|
|
|
|
|
|
|
typedef struct {
|
|
|
|
void (*func)(void*);
|
An Anonymous Coward on c.l.py posted a little program with bizarre
behavior, creating many threads very quickly. A long debugging session
revealed that the Windows implementation of PyThread_start_new_thread()
was choked with "laziness" errors:
1. It checked MS _beginthread() for a failure return, but when that
happened it returned heap trash as the function result, instead of
an id of -1 (the proper error-return value).
2. It didn't consider that the Win32 CreateSemaphore() can fail.
3. When creating a great many threads very quickly, it's quite possible
that any particular bootstrap call can take virtually any amount of
time to return. But the code waited for a maximum of 5 seconds, and
didn't check to see whether the semaphore it was waiting for got
signaled. If it in fact timed out, the function could again return
heap trash as the function result. This is actually what confused
the test program, as the heap trash usually turned out to be 0, and
then multiple threads all got id 0 simultaneously, confusing the
hell out of threading.py's _active dict (mapping id to thread
object). A variety of baffling behaviors followed from that.
WRT #1 and #2, error returns are checked now, and "thread.error: can't
start new thread" gets raised now if a new thread (or new semaphore)
can't be created. WRT #3, we now wait for the semaphore without a
timeout.
Also removed useless local vrbls, folded long lines, and changed callobj
to a stack auto (it was going thru malloc/free instead, for no discernible
reason).
Bugfix candidate.
2003-07-04 01:40:45 -03:00
|
|
|
void *arg;
|
2001-10-16 18:13:49 -03:00
|
|
|
} callobj;
|
|
|
|
|
2009-01-12 04:11:24 -04:00
|
|
|
/* thunker to call adapt between the function type used by the system's
|
|
|
|
thread start function and the internally used one. */
|
|
|
|
#if defined(MS_WINCE)
|
|
|
|
static DWORD WINAPI
|
|
|
|
#else
|
2009-01-09 16:03:27 -04:00
|
|
|
static unsigned __stdcall
|
2009-01-12 04:11:24 -04:00
|
|
|
#endif
|
2001-10-16 18:13:49 -03:00
|
|
|
bootstrap(void *call)
|
|
|
|
{
|
|
|
|
callobj *obj = (callobj*)call;
|
|
|
|
void (*func)(void*) = obj->func;
|
|
|
|
void *arg = obj->arg;
|
2009-01-09 16:03:27 -04:00
|
|
|
HeapFree(GetProcessHeap(), 0, obj);
|
2001-10-16 18:13:49 -03:00
|
|
|
func(arg);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
An Anonymous Coward on c.l.py posted a little program with bizarre
behavior, creating many threads very quickly. A long debugging session
revealed that the Windows implementation of PyThread_start_new_thread()
was choked with "laziness" errors:
1. It checked MS _beginthread() for a failure return, but when that
happened it returned heap trash as the function result, instead of
an id of -1 (the proper error-return value).
2. It didn't consider that the Win32 CreateSemaphore() can fail.
3. When creating a great many threads very quickly, it's quite possible
that any particular bootstrap call can take virtually any amount of
time to return. But the code waited for a maximum of 5 seconds, and
didn't check to see whether the semaphore it was waiting for got
signaled. If it in fact timed out, the function could again return
heap trash as the function result. This is actually what confused
the test program, as the heap trash usually turned out to be 0, and
then multiple threads all got id 0 simultaneously, confusing the
hell out of threading.py's _active dict (mapping id to thread
object). A variety of baffling behaviors followed from that.
WRT #1 and #2, error returns are checked now, and "thread.error: can't
start new thread" gets raised now if a new thread (or new semaphore)
can't be created. WRT #3, we now wait for the semaphore without a
timeout.
Also removed useless local vrbls, folded long lines, and changed callobj
to a stack auto (it was going thru malloc/free instead, for no discernible
reason).
Bugfix candidate.
2003-07-04 01:40:45 -03:00
|
|
|
long
|
|
|
|
PyThread_start_new_thread(void (*func)(void *), void *arg)
|
1995-01-17 12:29:31 -04:00
|
|
|
{
|
2009-01-09 16:03:27 -04:00
|
|
|
HANDLE hThread;
|
|
|
|
unsigned threadID;
|
|
|
|
callobj *obj;
|
|
|
|
|
An Anonymous Coward on c.l.py posted a little program with bizarre
behavior, creating many threads very quickly. A long debugging session
revealed that the Windows implementation of PyThread_start_new_thread()
was choked with "laziness" errors:
1. It checked MS _beginthread() for a failure return, but when that
happened it returned heap trash as the function result, instead of
an id of -1 (the proper error-return value).
2. It didn't consider that the Win32 CreateSemaphore() can fail.
3. When creating a great many threads very quickly, it's quite possible
that any particular bootstrap call can take virtually any amount of
time to return. But the code waited for a maximum of 5 seconds, and
didn't check to see whether the semaphore it was waiting for got
signaled. If it in fact timed out, the function could again return
heap trash as the function result. This is actually what confused
the test program, as the heap trash usually turned out to be 0, and
then multiple threads all got id 0 simultaneously, confusing the
hell out of threading.py's _active dict (mapping id to thread
object). A variety of baffling behaviors followed from that.
WRT #1 and #2, error returns are checked now, and "thread.error: can't
start new thread" gets raised now if a new thread (or new semaphore)
can't be created. WRT #3, we now wait for the semaphore without a
timeout.
Also removed useless local vrbls, folded long lines, and changed callobj
to a stack auto (it was going thru malloc/free instead, for no discernible
reason).
Bugfix candidate.
2003-07-04 01:40:45 -03:00
|
|
|
dprintf(("%ld: PyThread_start_new_thread called\n",
|
|
|
|
PyThread_get_thread_ident()));
|
1995-01-17 12:29:31 -04:00
|
|
|
if (!initialized)
|
1998-12-21 15:32:43 -04:00
|
|
|
PyThread_init_thread();
|
1995-01-17 12:29:31 -04:00
|
|
|
|
2009-01-09 16:03:27 -04:00
|
|
|
obj = (callobj*)HeapAlloc(GetProcessHeap(), 0, sizeof(*obj));
|
|
|
|
if (!obj)
|
An Anonymous Coward on c.l.py posted a little program with bizarre
behavior, creating many threads very quickly. A long debugging session
revealed that the Windows implementation of PyThread_start_new_thread()
was choked with "laziness" errors:
1. It checked MS _beginthread() for a failure return, but when that
happened it returned heap trash as the function result, instead of
an id of -1 (the proper error-return value).
2. It didn't consider that the Win32 CreateSemaphore() can fail.
3. When creating a great many threads very quickly, it's quite possible
that any particular bootstrap call can take virtually any amount of
time to return. But the code waited for a maximum of 5 seconds, and
didn't check to see whether the semaphore it was waiting for got
signaled. If it in fact timed out, the function could again return
heap trash as the function result. This is actually what confused
the test program, as the heap trash usually turned out to be 0, and
then multiple threads all got id 0 simultaneously, confusing the
hell out of threading.py's _active dict (mapping id to thread
object). A variety of baffling behaviors followed from that.
WRT #1 and #2, error returns are checked now, and "thread.error: can't
start new thread" gets raised now if a new thread (or new semaphore)
can't be created. WRT #3, we now wait for the semaphore without a
timeout.
Also removed useless local vrbls, folded long lines, and changed callobj
to a stack auto (it was going thru malloc/free instead, for no discernible
reason).
Bugfix candidate.
2003-07-04 01:40:45 -03:00
|
|
|
return -1;
|
2009-01-09 16:03:27 -04:00
|
|
|
obj->func = func;
|
|
|
|
obj->arg = arg;
|
2009-01-12 04:11:24 -04:00
|
|
|
#if defined(MS_WINCE)
|
|
|
|
hThread = CreateThread(NULL,
|
|
|
|
Py_SAFE_DOWNCAST(_pythread_stacksize, Py_ssize_t, SIZE_T),
|
|
|
|
bootstrap, obj, 0, &threadID);
|
|
|
|
#else
|
2009-01-09 16:03:27 -04:00
|
|
|
hThread = (HANDLE)_beginthreadex(0,
|
2007-05-03 17:27:03 -03:00
|
|
|
Py_SAFE_DOWNCAST(_pythread_stacksize,
|
2009-01-09 16:03:27 -04:00
|
|
|
Py_ssize_t, unsigned int),
|
|
|
|
bootstrap, obj,
|
|
|
|
0, &threadID);
|
2009-01-12 04:11:24 -04:00
|
|
|
#endif
|
2009-01-09 16:03:27 -04:00
|
|
|
if (hThread == 0) {
|
2009-01-12 04:11:24 -04:00
|
|
|
#if defined(MS_WINCE)
|
|
|
|
/* Save error in variable, to prevent PyThread_get_thread_ident
|
|
|
|
from clobbering it. */
|
|
|
|
unsigned e = GetLastError();
|
|
|
|
dprintf(("%ld: PyThread_start_new_thread failed, win32 error code %u\n",
|
|
|
|
PyThread_get_thread_ident(), e));
|
|
|
|
#else
|
An Anonymous Coward on c.l.py posted a little program with bizarre
behavior, creating many threads very quickly. A long debugging session
revealed that the Windows implementation of PyThread_start_new_thread()
was choked with "laziness" errors:
1. It checked MS _beginthread() for a failure return, but when that
happened it returned heap trash as the function result, instead of
an id of -1 (the proper error-return value).
2. It didn't consider that the Win32 CreateSemaphore() can fail.
3. When creating a great many threads very quickly, it's quite possible
that any particular bootstrap call can take virtually any amount of
time to return. But the code waited for a maximum of 5 seconds, and
didn't check to see whether the semaphore it was waiting for got
signaled. If it in fact timed out, the function could again return
heap trash as the function result. This is actually what confused
the test program, as the heap trash usually turned out to be 0, and
then multiple threads all got id 0 simultaneously, confusing the
hell out of threading.py's _active dict (mapping id to thread
object). A variety of baffling behaviors followed from that.
WRT #1 and #2, error returns are checked now, and "thread.error: can't
start new thread" gets raised now if a new thread (or new semaphore)
can't be created. WRT #3, we now wait for the semaphore without a
timeout.
Also removed useless local vrbls, folded long lines, and changed callobj
to a stack auto (it was going thru malloc/free instead, for no discernible
reason).
Bugfix candidate.
2003-07-04 01:40:45 -03:00
|
|
|
/* I've seen errno == EAGAIN here, which means "there are
|
|
|
|
* too many threads".
|
|
|
|
*/
|
2009-01-12 04:11:24 -04:00
|
|
|
int e = errno;
|
2009-01-09 16:03:27 -04:00
|
|
|
dprintf(("%ld: PyThread_start_new_thread failed, errno %d\n",
|
2009-01-12 04:11:24 -04:00
|
|
|
PyThread_get_thread_ident(), e));
|
|
|
|
#endif
|
2009-01-09 16:03:27 -04:00
|
|
|
threadID = (unsigned)-1;
|
|
|
|
HeapFree(GetProcessHeap(), 0, obj);
|
1995-01-17 12:29:31 -04:00
|
|
|
}
|
An Anonymous Coward on c.l.py posted a little program with bizarre
behavior, creating many threads very quickly. A long debugging session
revealed that the Windows implementation of PyThread_start_new_thread()
was choked with "laziness" errors:
1. It checked MS _beginthread() for a failure return, but when that
happened it returned heap trash as the function result, instead of
an id of -1 (the proper error-return value).
2. It didn't consider that the Win32 CreateSemaphore() can fail.
3. When creating a great many threads very quickly, it's quite possible
that any particular bootstrap call can take virtually any amount of
time to return. But the code waited for a maximum of 5 seconds, and
didn't check to see whether the semaphore it was waiting for got
signaled. If it in fact timed out, the function could again return
heap trash as the function result. This is actually what confused
the test program, as the heap trash usually turned out to be 0, and
then multiple threads all got id 0 simultaneously, confusing the
hell out of threading.py's _active dict (mapping id to thread
object). A variety of baffling behaviors followed from that.
WRT #1 and #2, error returns are checked now, and "thread.error: can't
start new thread" gets raised now if a new thread (or new semaphore)
can't be created. WRT #3, we now wait for the semaphore without a
timeout.
Also removed useless local vrbls, folded long lines, and changed callobj
to a stack auto (it was going thru malloc/free instead, for no discernible
reason).
Bugfix candidate.
2003-07-04 01:40:45 -03:00
|
|
|
else {
|
|
|
|
dprintf(("%ld: PyThread_start_new_thread succeeded: %p\n",
|
2009-01-09 16:03:27 -04:00
|
|
|
PyThread_get_thread_ident(), (void*)hThread));
|
|
|
|
CloseHandle(hThread);
|
An Anonymous Coward on c.l.py posted a little program with bizarre
behavior, creating many threads very quickly. A long debugging session
revealed that the Windows implementation of PyThread_start_new_thread()
was choked with "laziness" errors:
1. It checked MS _beginthread() for a failure return, but when that
happened it returned heap trash as the function result, instead of
an id of -1 (the proper error-return value).
2. It didn't consider that the Win32 CreateSemaphore() can fail.
3. When creating a great many threads very quickly, it's quite possible
that any particular bootstrap call can take virtually any amount of
time to return. But the code waited for a maximum of 5 seconds, and
didn't check to see whether the semaphore it was waiting for got
signaled. If it in fact timed out, the function could again return
heap trash as the function result. This is actually what confused
the test program, as the heap trash usually turned out to be 0, and
then multiple threads all got id 0 simultaneously, confusing the
hell out of threading.py's _active dict (mapping id to thread
object). A variety of baffling behaviors followed from that.
WRT #1 and #2, error returns are checked now, and "thread.error: can't
start new thread" gets raised now if a new thread (or new semaphore)
can't be created. WRT #3, we now wait for the semaphore without a
timeout.
Also removed useless local vrbls, folded long lines, and changed callobj
to a stack auto (it was going thru malloc/free instead, for no discernible
reason).
Bugfix candidate.
2003-07-04 01:40:45 -03:00
|
|
|
}
|
2009-01-09 16:03:27 -04:00
|
|
|
return (long) threadID;
|
1995-01-17 12:29:31 -04:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Return the thread Id instead of an handle. The Id is said to uniquely identify the
|
|
|
|
* thread in the system
|
|
|
|
*/
|
2006-06-04 09:59:59 -03:00
|
|
|
long
|
|
|
|
PyThread_get_thread_ident(void)
|
1995-01-17 12:29:31 -04:00
|
|
|
{
|
|
|
|
if (!initialized)
|
1998-12-21 15:32:43 -04:00
|
|
|
PyThread_init_thread();
|
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 15:47:15 -03:00
|
|
|
|
1995-01-17 12:29:31 -04:00
|
|
|
return GetCurrentThreadId();
|
|
|
|
}
|
|
|
|
|
2006-06-04 09:59:59 -03:00
|
|
|
void
|
|
|
|
PyThread_exit_thread(void)
|
1995-01-17 12:29:31 -04:00
|
|
|
{
|
2009-01-09 16:03:27 -04:00
|
|
|
dprintf(("%ld: PyThread_exit_thread called\n", PyThread_get_thread_ident()));
|
1995-01-17 12:29:31 -04:00
|
|
|
if (!initialized)
|
2009-01-09 16:03:27 -04:00
|
|
|
exit(0);
|
2009-01-12 04:11:24 -04:00
|
|
|
#if defined(MS_WINCE)
|
|
|
|
ExitThread(0);
|
|
|
|
#else
|
2009-01-09 16:03:27 -04:00
|
|
|
_endthreadex(0);
|
2009-01-12 04:11:24 -04:00
|
|
|
#endif
|
1995-01-17 12:29:31 -04:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Lock support. It has too be implemented as semaphores.
|
|
|
|
* I [Dag] tried to implement it with mutex but I could find a way to
|
|
|
|
* tell whether a thread already own the lock or not.
|
|
|
|
*/
|
2006-06-04 09:59:59 -03:00
|
|
|
PyThread_type_lock
|
|
|
|
PyThread_allocate_lock(void)
|
1995-01-17 12:29:31 -04:00
|
|
|
{
|
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 15:47:15 -03:00
|
|
|
PNRMUTEX aLock;
|
1995-01-17 12:29:31 -04:00
|
|
|
|
1998-12-21 15:32:43 -04:00
|
|
|
dprintf(("PyThread_allocate_lock called\n"));
|
1995-01-17 12:29:31 -04:00
|
|
|
if (!initialized)
|
1998-12-21 15:32:43 -04:00
|
|
|
PyThread_init_thread();
|
1995-01-17 12:29:31 -04:00
|
|
|
|
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 15:47:15 -03:00
|
|
|
aLock = AllocNonRecursiveMutex() ;
|
1995-01-17 12:29:31 -04:00
|
|
|
|
2000-06-30 12:01:00 -03:00
|
|
|
dprintf(("%ld: PyThread_allocate_lock() -> %p\n", PyThread_get_thread_ident(), aLock));
|
1995-01-17 12:29:31 -04:00
|
|
|
|
1998-12-21 15:32:43 -04:00
|
|
|
return (PyThread_type_lock) aLock;
|
1995-01-17 12:29:31 -04:00
|
|
|
}
|
|
|
|
|
2006-06-04 09:59:59 -03:00
|
|
|
void
|
|
|
|
PyThread_free_lock(PyThread_type_lock aLock)
|
1995-01-17 12:29:31 -04:00
|
|
|
{
|
2000-06-30 12:01:00 -03:00
|
|
|
dprintf(("%ld: PyThread_free_lock(%p) called\n", PyThread_get_thread_ident(),aLock));
|
1995-01-17 12:29:31 -04:00
|
|
|
|
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 15:47:15 -03:00
|
|
|
FreeNonRecursiveMutex(aLock) ;
|
1995-01-17 12:29:31 -04:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Return 1 on success if the lock was acquired
|
|
|
|
*
|
|
|
|
* and 0 if the lock was not acquired. This means a 0 is returned
|
|
|
|
* if the lock has already been acquired by this thread!
|
|
|
|
*/
|
2006-06-04 09:59:59 -03:00
|
|
|
int
|
|
|
|
PyThread_acquire_lock(PyThread_type_lock aLock, int waitflag)
|
1995-01-17 12:29:31 -04:00
|
|
|
{
|
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 15:47:15 -03:00
|
|
|
int success ;
|
1995-01-17 12:29:31 -04:00
|
|
|
|
2000-06-30 12:01:00 -03:00
|
|
|
dprintf(("%ld: PyThread_acquire_lock(%p, %d) called\n", PyThread_get_thread_ident(),aLock, waitflag));
|
1995-01-17 12:29:31 -04:00
|
|
|
|
2005-07-08 19:26:13 -03:00
|
|
|
success = aLock && EnterNonRecursiveMutex((PNRMUTEX) aLock, (waitflag ? INFINITE : 0)) == WAIT_OBJECT_0 ;
|
1995-01-17 12:29:31 -04:00
|
|
|
|
2000-06-30 12:01:00 -03:00
|
|
|
dprintf(("%ld: PyThread_acquire_lock(%p, %d) -> %d\n", PyThread_get_thread_ident(),aLock, waitflag, success));
|
1995-01-17 12:29:31 -04:00
|
|
|
|
|
|
|
return success;
|
|
|
|
}
|
|
|
|
|
2006-06-04 09:59:59 -03:00
|
|
|
void
|
|
|
|
PyThread_release_lock(PyThread_type_lock aLock)
|
1995-01-17 12:29:31 -04:00
|
|
|
{
|
2000-06-30 12:01:00 -03:00
|
|
|
dprintf(("%ld: PyThread_release_lock(%p) called\n", PyThread_get_thread_ident(),aLock));
|
1995-01-17 12:29:31 -04:00
|
|
|
|
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 15:47:15 -03:00
|
|
|
if (!(aLock && LeaveNonRecursiveMutex((PNRMUTEX) aLock)))
|
2007-04-24 21:10:50 -03:00
|
|
|
dprintf(("%ld: Could not PyThread_release_lock(%p) error: %ld\n", PyThread_get_thread_ident(), aLock, GetLastError()));
|
1995-01-17 12:29:31 -04:00
|
|
|
}
|
2006-06-13 12:04:24 -03:00
|
|
|
|
|
|
|
/* minimum/maximum thread stack sizes supported */
|
|
|
|
#define THREAD_MIN_STACKSIZE 0x8000 /* 32kB */
|
|
|
|
#define THREAD_MAX_STACKSIZE 0x10000000 /* 256MB */
|
|
|
|
|
|
|
|
/* set the thread stack size.
|
|
|
|
* Return 0 if size is valid, -1 otherwise.
|
|
|
|
*/
|
|
|
|
static int
|
|
|
|
_pythread_nt_set_stacksize(size_t size)
|
|
|
|
{
|
|
|
|
/* set to default */
|
|
|
|
if (size == 0) {
|
|
|
|
_pythread_stacksize = 0;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* valid range? */
|
|
|
|
if (size >= THREAD_MIN_STACKSIZE && size < THREAD_MAX_STACKSIZE) {
|
|
|
|
_pythread_stacksize = size;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
#define THREAD_SET_STACKSIZE(x) _pythread_nt_set_stacksize(x)
|
2009-01-09 16:03:27 -04:00
|
|
|
|
|
|
|
|
|
|
|
/* use native Windows TLS functions */
|
|
|
|
#define Py_HAVE_NATIVE_TLS
|
|
|
|
|
|
|
|
#ifdef Py_HAVE_NATIVE_TLS
|
|
|
|
int
|
|
|
|
PyThread_create_key(void)
|
|
|
|
{
|
|
|
|
return (int) TlsAlloc();
|
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
PyThread_delete_key(int key)
|
|
|
|
{
|
|
|
|
TlsFree(key);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* We must be careful to emulate the strange semantics implemented in thread.c,
|
|
|
|
* where the value is only set if it hasn't been set before.
|
|
|
|
*/
|
|
|
|
int
|
|
|
|
PyThread_set_key_value(int key, void *value)
|
|
|
|
{
|
|
|
|
BOOL ok;
|
|
|
|
void *oldvalue;
|
|
|
|
|
|
|
|
assert(value != NULL);
|
|
|
|
oldvalue = TlsGetValue(key);
|
|
|
|
if (oldvalue != NULL)
|
|
|
|
/* ignore value if already set */
|
|
|
|
return 0;
|
|
|
|
ok = TlsSetValue(key, value);
|
|
|
|
if (!ok)
|
|
|
|
return -1;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
void *
|
|
|
|
PyThread_get_key_value(int key)
|
|
|
|
{
|
2009-01-10 08:14:31 -04:00
|
|
|
/* because TLS is used in the Py_END_ALLOW_THREAD macro,
|
|
|
|
* it is necessary to preserve the windows error state, because
|
|
|
|
* it is assumed to be preserved across the call to the macro.
|
|
|
|
* Ideally, the macro should be fixed, but it is simpler to
|
|
|
|
* do it here.
|
|
|
|
*/
|
|
|
|
DWORD error = GetLastError();
|
|
|
|
void *result = TlsGetValue(key);
|
|
|
|
SetLastError(error);
|
|
|
|
return result;
|
2009-01-09 16:03:27 -04:00
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
PyThread_delete_key_value(int key)
|
|
|
|
{
|
|
|
|
/* NULL is used as "key missing", and it is also the default
|
|
|
|
* given by TlsGetValue() if nothing has been set yet.
|
|
|
|
*/
|
|
|
|
TlsSetValue(key, NULL);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* reinitialization of TLS is not necessary after fork when using
|
|
|
|
* the native TLS functions. And forking isn't supported on Windows either.
|
|
|
|
*/
|
|
|
|
void
|
|
|
|
PyThread_ReInitTLS(void)
|
|
|
|
{}
|
|
|
|
|
|
|
|
#endif
|