Thread-enabled Tcl and fork (the Python solution)

Andreas Otto

Apr 22, 2009, 6:53:05 AM

Hi,

I found an interesting link:

http://docs.python.org/3.0/c-api/sys.html?highlight=fork#PyOS_AfterFork

Python is able to continue working after "fork" if Python itself is
thread-enabled.

Regards,

Andreas Otto

Alexandre Ferrieux

Apr 22, 2009, 8:28:58 AM

Or is it?

Looking at the implementation, I see that the only care taken of
mutexes is for a few dedicated locks internal to the Python
interpreter (PyEval_ReInitThreads, PyImport_ReInitLock). So it would
seem that the very problem that we are failing to address in Tcl is
equally unsolved in Python: namely, that "alien" threads (i.e. threads
started within an extension through the OS thread-creation API, not
through the Py/Tcl library abstraction) may still be holding alien
mutexes. Leaving them as they are may lead to deadlocks in the case of
fork1(), while mass-unlocking them (which needs heavy instrumentation)
is not an option either, because the internal state that they were
protecting is left undefined.

So I would predict/guess that Python took the bold approach that this
problem is unsolvable, so let those cases fail without too much
publicity, while the "nominal" case (no alien threads) works.

Can anybody check this "prediction" against reality?

-Alex

dkf

Apr 22, 2009, 8:51:38 AM

On 22 Apr, 14:28, Alexandre Ferrieux <alexandre.ferri...@gmail.com>
wrote:

> So I would predict/guess that Python took the bold approach that this
> problem is unsolvable, so let those cases fail without too much
> publicity, while the "nominal" case (no alien threads) works.

Sounds like something that was discussed a few years back, but which
was not acted upon because people might be doing something odder (i.e.
using external threads). IMO it was letting the perfect be the enemy
of the good, but that's something to argue with the maintainer of the
Thread extension and not me.

Donal.

George Peter Staplin

Apr 22, 2009, 12:58:11 PM

Andreas Otto wrote:

Andreas,

Python's method leaks memory for locks in every fork() path that calls
PyOS_AfterFork(), and the source has comments indicating that the authors
don't understand which operations are async-signal-safe in pthread_atfork
handlers.

For instance this is from the 3.0.1 Python/ceval.c:

"/* This function is called from PyOS_AfterFork to ensure that newly
created child processes don't hold locks referring to threads which
are not running in the child process. (This could also be done using
pthread_atfork mechanism, at least for the pthreads implementation.) */"

That's incorrect. They could not portably use the pthread_atfork()
mechanism in any sort of reliable way for that.

"/*XXX Can't use PyThread_free_lock here because it does too
much error-checking. Doing this cleanly would require
adding a new function to each thread_*.h. Instead, just
create a new lock and waste a little bit of memory */"

The quote above indicates how they leak memory.

So if you have the pattern:
fork_after: fork(), PyOS_AfterFork()

And you do: fork_after -> fork_after -> fork_after -> ...

It will be leaking lock memory in every new process image.
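
The fork_after chain can be modelled with a short Python sketch. This is a
toy, not CPython's code: ToyLockTable and chain are hypothetical names, and
the "lock" is a plain object kept in a list to stand in for memory that is
never freed. It only illustrates that each generation inherits the previous
generation's abandoned locks and adds one more.

```python
import os


class ToyLockTable:
    """Hypothetical stand-in for an interpreter-internal lock."""

    def __init__(self):
        self.lock = object()
        self.abandoned = []  # models lock memory that is never freed

    def after_fork(self):
        # Mirrors the quoted comment: do not free the old lock,
        # just create a new one and waste a little memory.
        self.abandoned.append(self.lock)
        self.lock = object()


def chain(table, depth):
    """Fork `depth` generations deep and return the number of abandoned
    locks seen in the last generation (passed back via exit status)."""
    if depth == 0:
        return len(table.abandoned)
    pid = os.fork()
    if pid == 0:
        # Child: re-create the lock, then become the parent of the
        # next generation.
        table.after_fork()
        os._exit(chain(table, depth - 1))
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

On a POSIX system, chain(ToyLockTable(), 3) reports 3 abandoned locks in the
third-generation child, while the table in the original process is untouched.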

To quote David Butenhof (a POSIX thread committee member) on
pthread_atfork():

"The real answer is that pthread_atfork() is a completely useless and stupid
mechanism that was a well intentioned but ultimately pointless attempt to
carve a "back door" solution out of an inherently insoluble design
conflict."

That's from a larger article I suggest you read:
http://groups.google.com/group/comp.programming.threads/msg/3a43122820983fde

-George

George Peter Staplin

Apr 22, 2009, 1:11:52 PM

Alexandre Ferrieux wrote:

> On Apr 22, 12:53 pm, Andreas Otto <aotto1...@onlinehome.de> wrote:
>> Hi,
>>
>> I found an interesting link:
>>
>> http://docs.python.org/3.0/c-api/sys.html?highlight=fork#PyOS_AfterFork
>>
>> python is able to continue work after "fork" if python itself is thread
>> enabled
>
>
> Or is it?
>
> Looking at the implementation, I see that the only care taken of
> mutexes is for a few dedicated locks internal to the Python
> interpreter (PyEval_ReInitThreads, PyImport_ReInitLock). So it would
> seem that the very problem that we are failing to address in Tcl is
> equally unsolved in Python: namely, that "alien" threads (i.e. threads
> started within an extension through the OS thread-creation API, not
> through the Py/Tcl library abstraction) may still be holding alien
> mutexes. Leaving them as they are may lead to deadlocks in the case of
> fork1(), while mass-unlocking them (which needs heavy instrumentation)
> is not an option either, because the internal state that they were
> protecting is left undefined.

If you can guarantee that neither the C libraries in use nor anything else
is holding a lock, it might even work. You might be able to recreate all of
the threads, and you might be able to init the locks again. If you can't,
then it's a load of hogwash, and wasteful code.

For example: you might have thread T1 using a libc call that happens to use
an internal mutex managed by libc. Thread T2 is another thread that
fork()s. So T2 is the only thread duplicated after fork(), but the mutex
state for that part of libc is probably the same in the duplicated process
image. So what is the state of the lock that T1 was holding? Does this lead
to deadlock when T2, which is now the only thread, tries to restart the
other threads and restore the state? You tell me, because AFAIK it may or
may not. Eventually you will probably find that it deadlocks, though. So
you have to contend with that if you want a reliable mechanism for this. I
think it's unsolvable with the current implementations of pthreads.
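
The T1/T2 scenario can be reproduced in miniature with Python's own
threading.Lock standing in for the libc-internal mutex. This is only a
sketch under that assumption: child_sees_lock_free is a hypothetical helper,
it needs a POSIX system, and the child uses a non-blocking acquire so that
it reports the stuck lock instead of actually deadlocking.

```python
import os
import threading
import warnings


def child_sees_lock_free():
    """Fork while another thread holds `lock`; return True if the child
    can take the lock, False if it inherited it in the locked state."""
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder():  # plays the role of T1
        with lock:
            held.set()
            release.wait()  # stay inside the critical section

    t1 = threading.Thread(target=holder)
    t1.start()
    held.wait()  # T1 now owns the lock

    with warnings.catch_warnings():
        # Newer CPython versions warn about fork() in a threaded process.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()  # the calling thread plays the role of T2

    if pid == 0:
        # Child: T1 does not exist here, but its lock was copied as-is.
        got_it = lock.acquire(blocking=False)
        os._exit(0 if got_it else 1)

    _, status = os.waitpid(pid, 0)
    release.set()  # let T1 finish in the parent
    t1.join()
    return os.WEXITSTATUS(status) == 0
```

The call returns False: nobody in the child will ever release the lock, so a
blocking acquire at that point would wait forever.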

The only way you could reliably solve this problem with the current POSIX
thread support is if you have all threads in a known safe area. That means
having all threads work together at some point, and forcing them to do so.
You wouldn't be able to have a thread stuck in a while loop doing a [read]
and [puts] pattern, because every thread should be able to unite and be in
the safe path when one thread forks.
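
A minimal sketch of that "everybody meets at a safe point" idea, again in
Python and again only for locks the program knows about: each worker gets a
gate lock that it holds while doing work that is unsafe to fork across, and
the forking thread collects every gate before calling fork(). The names
fork_at_safe_point and demo are hypothetical, and nothing here helps with
alien threads or library-internal locks that never touch a gate.

```python
import os
import threading
import time
import warnings


def fork_at_safe_point(gates):
    """Take every worker's gate, fork, then release the gates on both sides."""
    for g in gates:  # blocks until each worker has left its critical section
        g.acquire()
    try:
        with warnings.catch_warnings():
            # Newer CPython versions warn about fork() in a threaded process.
            warnings.simplefilter("ignore", DeprecationWarning)
            return os.fork()
    finally:
        for g in gates:  # runs in the parent and in the child
            g.release()


def demo(n_workers=3):
    gates = [threading.Lock() for _ in range(n_workers)]
    stop = threading.Event()

    def worker(gate):
        while not stop.is_set():
            with gate:
                pass  # stand-in for work that must not be interrupted by fork
            time.sleep(0.001)  # outside the gate: forking here is tolerable

    threads = [threading.Thread(target=worker, args=(g,)) for g in gates]
    for t in threads:
        t.start()

    pid = fork_at_safe_point(gates)
    if pid == 0:
        # Child: no worker was inside a gate at fork time, so all are free.
        ok = all(g.acquire(blocking=False) for g in gates)
        os._exit(0 if ok else 1)

    _, status = os.waitpid(pid, 0)
    stop.set()
    for t in threads:
        t.join()
    return os.WEXITSTATUS(status) == 0
```

The cost is exactly the one described above: a worker that never comes back
to its gate (for example one stuck in a blocking read) stalls the fork
indefinitely.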

> So I would predict/guess that Python took the bold approach that this
> problem is unsolvable, so let those cases fail without too much
> publicity, while the "nominal" case (no alien threads) works.

I think the Python developers are a bit overly ambitious and don't
understand the problems they are creating.

-George

Andreas Otto

Apr 23, 2009, 2:01:57 AM

Hi,

My mistake: working after fork is possible because
Python itself is not thread-safe:

http://docs.python.org/3.0/c-api/init.html#thread-state-and-the-global-interpreter-lock

>>>>>>>>>>>>>>>>>>>>>>>>
The Python interpreter is not fully thread safe. In order to support
multi-threaded Python programs, there’s a global lock that must be held by
the current thread before it can safely access Python objects.
<<<<<<<<<<<<<<<<<<<<<<<<

The Python C API is missing something like the "Tcl_Interp *interp" first
argument of nearly every Tcl C API function. This means every thread is
working on something like a GLOBAL interpreter.

>>>>>>>>>>>>>>>>>>>>>
Therefore, the rule exists that only the thread that has acquired the global
interpreter lock may operate on Python objects or call Python/C API
functions. In order to support multi-threaded Python programs, the
interpreter regularly releases and reacquires the lock — by default, every
100 bytecode instructions (this can be changed with
sys.setcheckinterval()). The lock is also released and reacquired around
potentially blocking I/O operations like reading or writing a file, so that
other threads can run while the thread that requests the I/O is waiting for
the I/O operation to complete.
<<<<<<<<<<<<<<<<<<<<

Regards,

Andreas Otto

George Peter Staplin

Apr 23, 2009, 2:50:42 AM

Andreas Otto wrote:

Andreas, based on what I saw of their sources, they still have potential
deadlock issues. If a Python thread is doing some libc or other work, it
has the potential to cause a deadlock when another thread forks and the
child keeps running normal code paths rather than calling exec*(). The
global interpreter lock won't help with their design. Consider also that
some systems have frameworks/libraries available that are commonly used and
documented as creating multiple threads by default.

-George