I'm using pthread_kill() to request some other threads to do some
work and post a semaphore. The issuing thread counts the number
of _successful_ pthread_kill() calls and then waits this number
of times on the semaphore. The (pseudo) code is:
    install signal handler using sigaction()
    foreach thread-that-may-be-interested
        if ( pthread_kill(t, sig) == 0 )
            signalled++;
    while(signalled--)
        sem_wait(sem);
    restore signal handlers

    handler:
        <do some async-safe work>
        sem_post(sem);
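Spelled out a little more concretely, I mean something like the sketch
below (SIGUSR1, the thread list and the helper names are just assumptions
for illustration, not the real code):

    #include <string.h>
    #include <signal.h>
    #include <pthread.h>
    #include <semaphore.h>

    static sem_t reply_sem;       /* assumed initialised with sem_init(&reply_sem, 0, 0) */

    static void
    on_request(int sig)
    { /* <do some async-signal-safe work> */
      sem_post(&reply_sem);       /* async-signal-safe reply to the requester */
    }

    /* Ask every thread in threads[0..n-1] to run on_request() and wait
       for all replies.  Returns the number of threads actually signalled. */
    static int
    broadcast_and_wait(pthread_t *threads, int n, int sig)
    { struct sigaction act, old;
      int i, signalled = 0;

      memset(&act, 0, sizeof(act));
      act.sa_handler = on_request;
      sigaction(sig, &act, &old);             /* install handler */

      for(i=0; i<n; i++)
      { if ( pthread_kill(threads[i], sig) == 0 )
          signalled++;
      }
      for(i=0; i<signalled; i++)
        sem_wait(&reply_sem);                 /* one post per handler */

      sigaction(sig, &old, NULL);             /* restore previous handler */
      return signalled;
    }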
This works 99.99...% of the cases correctly (Linux) but we have
seen a few occasions where the main thread remains waiting on
the semaphore. Is there anything wrong with this code?
--- Jan
Yes, you are assuming that signals can be reliably delivered to
specific threads. You've confirmed that it is not true.
Joe Seigh
Thanks :-( What can go wrong? Can the thread die before executing
the handler? Can it miss it because it has masked the signal (which
it does at times it is not prepared to handle it)? Note that this
signal is *only* used for this purpose and as I wait for all threads
to reply it is not possible the receiving thread already has this
signal pending (well, except for third parties sending me the signal,
but we'll forget about that).
If I recall well, I've seen an implementation to suspend/resume a
thread reliably using POSIX threads using similar techniques. Is this
wrong too?
Is there a good article about what *is* guaranteed with regard to
pthread_kill()?
Thanks for any pointers
--- Jan
Jan Wielemaker wrote:
>
> >> This works 99.99...% of the cases correctly (Linux) but we have
> >> seen a few occasions where the main thread remains waiting on
> >> the semaphore. Is there anything wrong with this code?
> >
> > Yes, you are assuming that signals can be reliably delivered to
> > specific threads. You've confirmed that it is not true.
>
> Thanks :-( What can go wrong? Can the thread die before executing
> the handler? Can it miss it because it has masked the signal (which
> it does at times it is not prepared to handle it)? Note that this
> signal is *only* used for this purpose and as I wait for all threads
> to reply it is not possible the receiving thread already has this
> signal pending (well, except for third parties sending me the signal,
> but we'll forget about that).
Nothing is going wrong. It's working as it's supposed to. You are
making assumptions that are not valid. Not reliable means that not
all signals are guaranteed to be delivered. Your code should take
this into account. Obviously it didn't.
There are realtime signals which do have some reliability guarantees but
only with respect to processes, not necessarily to threads. Someone
else will have to clarify here. Not my area of expertise.
>
> If I recall well, I've seen an implementation to suspend/resume a
> thread reliably using POSIX threads using similar techniques. Is this
> wrong too?
I don't know. suspend/resume has non-reliable behavior so a non-reliable
implementation would probably not be noticeable.
Joe Seigh
Yes, but for a completely different reason. Currently nonexistent
pthread_sigqueue() is your friend, I guess. ;-) Why do you need
to "do some async-safe work" in the context of some particular
thread?
>
> Is there a good article about what *is* guaranteed with regard to
> pthread_kill()?
Try <http://www.unix.org>.
regards,
alexander.
In article <3F588121...@web.de>, Alexander Terekhov wrote:
>
> Jan Wielemaker wrote:
> [...]
>> If I recall well, I've seen an implementation to suspend/resume a
>> thread reliably using POSIX threads using similar techniques. Is this
>> wrong too?
>
> Yes, but for a completely different reason. Currently nonexistent
> pthread_sigqueue() is your friend, I guess. ;-) Why do you need
> to "do some async-safe work" in the context of some particular
> thread?
Non-existent functions are not really my favourites :-) For the `why',
this is part of multithreaded SWI-Prolog (www.swi-prolog.org) where I
thought I solved atom-garbage collection. Atoms (identifiers) are
global shared objects for which we sometimes wish to collect the
unused ones.
At some moment in time the system decides to run a global atom garbage
collection (AGC). This means it locks the atom table (preventing the
creation of new atoms), marks all atoms it references itself and sends
a signal to all other threads, which do the same marking and post the
semaphore when done. Next we destroy all unreferenced atoms and
unlock the table. The atom-marking routine is written such that it is
guaranteed to mark all referenced atoms when called asynchronously.
As I've got no idea what the other threads are doing (notably they may
be busy running some expensive non-Prolog routine or be blocked on
some system call) I've found only two options: the collecting thread
asks the others to do the work asynchronously, or it suspends each
thread in turn, does the marking for it (which requires visibility of
that thread's memory) and resumes it.
Please note that Prolog thread creation and termination is guarded by
a mutex that is also held by AGC, so no new Prolog thread comes into
the game while AGC is in play and no thread terminates.
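In very rough code the collector side is something like this (all names
below are made up for the sketch; this is not the actual SWI-Prolog
source):

    #include <pthread.h>
    #include <semaphore.h>
    #include <signal.h>

    #define SIG_ATOM_GC SIGUSR1                  /* assumed signal */

    extern void mark_my_atoms(void);             /* hypothetical */
    extern void sweep_unmarked_atoms(void);      /* hypothetical */

    extern pthread_mutex_t atom_mutex;           /* also guards thread creation/exit */
    extern sem_t           agc_sem;
    extern pthread_t       prolog_threads[];     /* registered Prolog threads */
    extern int             thread_count;

    void
    atom_gc(void)
    { int i, signalled = 0;

      pthread_mutex_lock(&atom_mutex);           /* no new atoms, no thread exit */
      mark_my_atoms();                           /* mark atoms the collector references */
      for(i=0; i<thread_count; i++)
      { if ( !pthread_equal(prolog_threads[i], pthread_self()) &&
             pthread_kill(prolog_threads[i], SIG_ATOM_GC) == 0 )
          signalled++;
      }
      while ( signalled-- > 0 )
        sem_wait(&agc_sem);                      /* each handler posts once when done marking */
      sweep_unmarked_atoms();                    /* destroy atoms nobody marked */
      pthread_mutex_unlock(&atom_mutex);
    }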
>> Is there a good article about what *is* guaranteed with regard to
>> pthread_kill()?
>
> Try <http://www.unix.org>.
A white paper on threading indeed confirms that POSIX.1 signals are
delivered `at most once' and POSIX.1b signals `exactly once'. Neither
the Linux (SuSE 8.1) documentation nor the Solaris (5.7) documentation
mentions the (I'm starting to believe) fact that a successful kill()
or pthread_kill() doesn't mean the signal is actually delivered to the
receiver.
I'm still wondering *why* though. I understand the absence of a
signal queue implies the signal may already be pending, so a second
one is ignored. In this particular design though it is ensured this
signal is *not* pending.
Well, if he has only one thread that does his pthread_kill-based
"broadcast" and waits for the others to complete the operation, I really
don't see any obvious reasons why it shouldn't work under POSIX,
presuming that the threads stay alive and don't block the signal forever.
What am I missing?
regards,
alexander.
Don't mess with async stuff. Try something along the lines of:
http://groups.google.com/groups?threadm=3D021EA4.E2217C09%40web.de
(Subject: Re: Objects in container: creation and deletion details)
regards,
alexander.
Well, for one thing it's not clear that you can extrapolate from
guarantees made for processes any guarantees for threads.
Another, the coordination of setting up and restoring of signal
handlers is a little dicey. There are other signal handlers?
What's coordinating the signals intended for those signal handlers?
And the increment of the semaphore in the signal handler does not
mean the thread has exited the signal handler. It's possible
for the thread to be preempted at that point and not resume for an
arbitrary amount of time. What's the behavior when you signal
a thread that's still in a signal handler?
Joe Seigh
Maybe some hope after all :-) As explained they do stay alive as they
will block on a mutex held by AGC if they try to terminate. They
won't mask the signal forever. They can mask it for a finite period
(while doing things where they cannot guarantee proper asynchronous
atom-marking). The handler doesn't call any library or system call
except for sem_post() at the end.
It normally does run for quite some time. I've got one report from a
user telling me the system sometimes freezes while waiting on the
semaphore. It is not totally impossible there are other reasons for
this problem (after all, it's a far from trivial program). I just want
to know the design is at least in theory sound.
As far as I know, POSIX.1 signals may not be delivered and in that sense
the design is flawed. Now the issue is whether there are conditions
under which delivery *is safe* on any `normal' system. I understand
an implementation is valid under POSIX.1 if it randomly ignores
signals, but I assume no real implementation will do that.
I *think* I understand that if N signals X are sent to thread Y it is
not guaranteed Y will execute handler(X) N times. Now however signal
X comes only from one source and this source waits for the handler in
Y to complete. We have no clue what Y is doing (computing, blocking
system call, ...), but it won't terminate.
Give me some hope :-)
--- Jan
Alexander,
> Don't mess with async stuff. Try something along the lines of:
>
> http://groups.google.com/groups?threadm=3D021EA4.E2217C09%40web.de
> (Subject: Re: Objects in container: creation and deletion details)
I knew this advice would come :-). It's not an option. Putting atoms
onto the Prolog stacks and throwing them away is so frequent it would
harm performance far too much. Especially Prolog backtracking can
remove many references. Right now it does a lot of the work by simply
discarding part of the stack. If I need to dereference I must analyse
what is on the stack.
A very complicated approach would be to have two types of atoms, atoms
shared in other shared data structures and atoms `local' to a thread.
This might work, but has so many serious implications I'm still trying
to avoid it.
--- Jan
Not for normal operation. Some signals (interrupt) are sometimes used to
trap to the debugger, but `works most of the time' is acceptable here.
Fatal errors (SEGV etc.) are normally routed to Prolog exceptions to help
development, but this too is allowed to go wrong `sometimes'.
Of course it is an open environment and people can (and often do) load
shared objects into it with extensions. It's OK to say: `stay away
from signal X'.
> And the increment of the semaphore in the signal handler does not
> mean the thread has exited the signal handler. It's possible
> for the thread to be preempted at that point and not resume for an
> arbitrary amount of time. What's the behavior when you signal
> a thread that's still in a signal handler?
It's fine if the handler has not yet returned, as long as it has done
its work. In theory there may be a problem here. I doubt it causes
the observation though as AGC is infrequent.
Hmmm --- Jan
To clarify, if you aren't guaranteed to get all the pending signals
for a process, but at least one, you cannot assume the same on a per
thread basis. Sending a bunch of signals to separate threads is
still sending a bunch of signals to a process and not all of them
have to be delivered to the process.
You should daisy chain the signals. Signal the first thread in the
list. Have it signal the next, etc ..., and have the last post the
semaphore. That way there's at most only one signal pending, not
multiple signals. And the signal handlers can't execute concurrently
anyhow since the signal is masked so you're not losing anything there.
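For example, something along these lines (only a sketch with made-up
names and a fixed thread list; whether pthread_self()/pthread_equal()
may be called from a signal handler is a separate question):

    #include <pthread.h>
    #include <semaphore.h>
    #include <signal.h>

    #define NTHREADS 4                        /* example size */

    static pthread_t threads[NTHREADS];       /* filled in before the chain starts */
    static sem_t     done_sem;                /* assumed initialised with sem_init(&done_sem, 0, 0) */

    static void
    chain_handler(int sig)
    { int i;
      pthread_t self = pthread_self();

      /* <do the async-signal-safe work for this thread> */

      for(i=0; i<NTHREADS; i++)               /* find my own position in the list */
      { if ( pthread_equal(threads[i], self) )
          break;
      }
      if ( i+1 < NTHREADS )
        pthread_kill(threads[i+1], sig);      /* pass the request on to the next thread */
      else
        sem_post(&done_sem);                  /* last in the chain: wake the requester */
    }

    static void
    broadcast_daisy_chain(int sig)
    { if ( pthread_kill(threads[0], sig) == 0 )   /* signal only the first thread ... */
        sem_wait(&done_sem);                      /* ... and wait for the last one */
    }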
Joe Seigh
Well, to begin with, docu says: "The pthread_kill() function provides
a mechanism for asynchronously directing a signal at a thread in the
calling process. This could be used, for example, by one thread to
affect broadcast delivery of a signal to a set of threads."
>
> You should daisy chain the signals. Signal the first thread in the
> list. Have it signal the next, etc ..., and have the last post the
> semaphore. That way there's at most only one signal pending, not
> multiple signals. And the signal handlers can't execute concurrently
> anyhow since the signal is masked so you're not losing anything there.
Each thread has its own signal mask. Currently, there's no per-thread
signal queues (for realtime signals)... addition of pthread_sigqueue()
would change that.
regards,
alexander.
In article <3F5A2BF7...@xemaps.com>, Joe Seigh wrote:
>
>> Well, for one thing it's not clear that you can extrapolate from
>> guarantees made for processes any guarantees for threads.
>
> To clarify, if you aren't guaranteed to get all the pending signals
> for a process, but at least one, you cannot assume the same on a per
> thread basis. Sending a bunch of signals to separate threads is
> still sending a bunch of signals to a process and not all of them
> have to be delivered to the process.
Hmm. Thanks. I guess this isn't true for LinuxThreads, but it can
be for other implementations.
> You should daisy chain the signals. Signal the first thread in the
> list. Have it signal the next, etc ..., and have the last post the
> semaphore. That way there's at most only one signal pending, not
> multiple signals. And the signal handlers can't execute concurrently
> anyhow since the signal is masked so you're not losing anything there.
Hmmm. Again I'd say this (probably) isn't true for LinuxThreads. For how
many/which implementations will this be true? Won't pthread_kill() in
general be implemented differently from kill()? Having all signal
handlers executed sequentially isn't optimal on SMP systems but it is
`acceptable'.
--- Jan
Alexander Terekhov wrote:
>
> Joe Seigh wrote:
> >
> > To clarify, if you aren't guaranteed to get all the pending signals
> > for a process, but at least one, you cannot assume the same on a per
> > thread basis. Sending a bunch of signals to separate threads is
> > still sending a bunch of signals to a process and not all of them
> > have to be delivered to the process.
>
> Well, to begin with, docu says: "The pthread_kill() function provides
> a mechanism for asynchronously directing a signal at a thread in the
> calling process. This could be used, for example, by one thread to
> affect broadcast delivery of a signal to a set of threads."
Unfortunately, they seem to have omitted an example implementation that
would work.
>
> >
> > You should daisy chain the signals. Signal the first thread in the
> > list. Have it signal the next, etc ..., and have the last post the
> > semaphore. That way there's at most only one signal pending, not
> > multiple signals. And the signal handlers can't execute concurrently
> > anyhow since the signal is masked so you're not losing anything there.
>
> Each thread has its own signal mask. Currently, there's no per-thread
> signal queues (for realtime signals)... addition of pthread_sigqueue()
> would change that.
>
The sigaction doc isn't too clear here.
Joe Seigh
Thanks for the insight so far. Summarising, my current assumptions are
that
* pthread_kill() is guaranteed to have a response in any
`sensible' implementation if the receiver has a handler
installed, the signal is not masked and it is guaranteed
that the process (= all threads) does not have the signal pending.
* It is unclear whether this still holds if only the receiving
thread doesn't have the signal pending (i.e. other threads
in the process may). As kill() doesn't address a specific
thread and pthread_kill() does, I'm tempted to assume
pthread_kill() makes the signal pending in a data structure
specific to the receiving thread, so there should be no
problem. The remark on broadcasting using pthread_kill()
supports this.
* Worse is that if I want to send the signal multiple times
it is difficult (impossible) to say it isn't pending any
longer, as the receiver may be preempted after posting my
semaphore but before the return from the signal handler is
complete. I guess it is pretty easy to test for a specific
implementation by forcing a long delay after posting
the semaphore.
* pthread_sigqueue() will solve my problems :-) When/where
is this expected? So do POSIX.1b real-time signals. On
which platforms are these available?
* Does anyone have a sensible alternative to make a number of threads
asynchronously execute some code and wait for all of them to
complete in a reliable and portable way? To relax things a
bit, it is acceptable if some thread does the work twice, as
long as it does the work while it is requested (waited for).
Given the observation is `works almost always', some time delay
if it goes wrong is fine. The absence of a timed wait for a
semaphore is just one problem (but see the sketch below) ... If I
signal a thread twice, am I guaranteed the first signal has either
been handled or is never going to be handled?
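The sketch mentioned above: if sem_timedwait() (POSIX.1-2001, not
available everywhere) can be used, the wait can time out and the
request can simply be re-broadcast; threads that already did the work
may then do it twice, which is acceptable as said:

    #include <errno.h>
    #include <time.h>
    #include <semaphore.h>

    /* Wait for up to 'expected' replies, giving up after 'seconds' seconds.
       Returns the number of replies seen; on a short count the caller may
       re-broadcast (threads that already answered just do the work again). */
    static int
    wait_replies(sem_t *sem, int expected, int seconds)
    { struct timespec deadline;
      int got = 0;

      clock_gettime(CLOCK_REALTIME, &deadline);
      deadline.tv_sec += seconds;

      while ( got < expected )
      { if ( sem_timedwait(sem, &deadline) == 0 )
          got++;
        else if ( errno == ETIMEDOUT )
          break;
        /* EINTR: simply retry */
      }
      return got;
    }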
pthread_kill() has really nothing to do with any thread other than
the kill target. "The pthread_kill() function shall request that a signal
be delivered to the specified thread."
>
> * It is unclear whether this still holds if only the receiving
> thread doesn't have the signal pending (i.e. other threads
> in the process may). As kill() doesn't address a specific
> thread and pthread_kill() does, I'm tempted to assume
> pthread_kill() makes the signal pending in a data structure
> specific to the receiving thread, so there should be no
> problem. The remark on broadcasting using pthread_kill()
> supports this.
>
> * Worse is that if I want to send the signal multiple times
> it is difficult (impossible) to say it isn't pending any
> longer, as the receiver may be preempted after posting my
> semaphore but before the return from the signal handler is
> complete. I guess it is pretty easy to test for a specific
> implementation by forcing a long delay after posting
> the semaphore.
>
> * pthread_sigqueue() will solve my problems :-)
Not knowing what you do in your handler(s) and how you "set up" the
whole thing, it does seem that your "broadcast-and-wait" is OKAY
as long as you have only one thread doing it (or serialize it). I
don't see how pthread_sigqueue() will solve your problem. It would
help if you had multiple threads doing it concurrently. In that
case, lack of queuing would explain your problem, to me. More info:
Well, <quote source=Butenhof> Ignoring the realtime queued signal
extension, any given signal number (e.g., SIGPIPE) could be pending
against EACH thread in the process AND against the process itself.
So for <n> threads, in that sense, you could say that each signal
may be pending <n>+1 times. That can only happen, of course, if it's
generated once synchronously or via pthread_kill() for each thread
in the process AND also generated asynchronously or by kill() for
the process. (And of course it must have been blocked by all
threads.) </quote>
http://www.opengroup.org/onlinepubs/007904975/functions/xsh_chap02_04.html#tag_02_04_01
(Signal Generation and Delivery)
regards,
alexander.
Almost there! My system (dual CPU, SuSE 8.1) has now been running an
extensive test-suite on this for over two days without any problems.
Others have found a problem that might (we are not sure yet) cause
one of the threads to crash in the signal handler, which would perfectly
explain why not all threads reply :-)
The document you mentioned here:
> http://www.opengroup.org/onlinepubs/007904975/functions/xsh_chap02_04.html#tag_02_04_01
> (Signal Generation and Delivery)
basically confirmed my mental model. About Joe's issue on signalling while
the handler is still executing (preempted between sem_post() and actual return)
I read:
"If, when a pending signal is delivered, there are additional signals
queued to that signal number, the signal shall remain
pending. Otherwise, the pending indication shall be reset."
I think I should read this as confirming there is no issue and that the
broadcast-and-wait is indeed safe.
Thanks for all the explanations (and hoping nobody comes with
a show-stopper :-)
--- Jan
What data structure are you using for your "atom table"? The problem
is that unless you lock it (and you can't do it in a signal handler),
you may have all sorts of visibility/ordering/tearing problems...
regards,
alexander.
Trying anyway :-) It's well appreciated! In the current implementation
there may be a problem here. Atoms are C structures (6 words long)
allocated using malloc() (well, actually using a layer on top of
that). After creation they never change, except for one field
`unsigned int references' which holds the reference count from other
objects on the (mostly shared) heap. Reference count changes are
guarded by the same mutex that guards atom creation and garbage
collection (they are not very frequent).
The rest of the machinery works with atom handles, which are
type-tagged unsigned long objects containing an offset into an array
of pointers to the actual atom structures. Normally all it does with
these handles is compare them and decide on equality. Handles come
from static program code, from dynamic program code (guarded by a
mutex) or are created by the thread itself.
The signal handler walks through the stacks. For each cell it checks
whether it is an atom (using the tag), finds the atom structure and
does a->references |= 0x8000000 (on a 32-bit machine). The collecting
thread is guaranteed to see this because of the semaphore
synchronisation.
What can go wrong is that in some thread the reference count is out of
date, so a->references |= 0x8000000 writes back an old reference
count. This never showed up (dual AMD-Athlon). I guess I should fix
this using atomic compare-and-swap? Are there any (portable)
alternatives?
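Something like this sketch is what I have in mind for the marking side
(i386/GCC inline assembly; the Atom type and the names are made up here
for illustration). Of course the normal reference-count updates would
have to go through the same CAS loop, otherwise the mark can still be
overwritten:

    typedef struct atom                       /* minimal made-up Atom for the sketch */
    { unsigned int references;                /* reference count; top bits used for marking */
    } *Atom;

    static inline unsigned int
    cas_uint(volatile unsigned int *p, unsigned int old, unsigned int new)
    { unsigned int prev;
      __asm__ __volatile__("lock; cmpxchgl %2,%1"
                           : "=a"(prev), "+m"(*p)
                           : "r"(new), "0"(old)
                           : "memory");
      return prev;                            /* value found in *p; equals old on success */
    }

    static void
    mark_atom(Atom a)                         /* async mark that cannot lose a concurrent update */
    { unsigned int old;

      do
      { old = a->references;
      } while ( cas_uint(&a->references, old, old | 0x8000000) != old );
    }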
Thanks for your attention and comments
--- Jan
That means that you read "an array of pointers to the actual atom
structures", right? Is this a fixed array or is it something like
std::vector? The problem is that you need to lock it in order to
have a consistent view.
> does a->references |= 0x8000000 (on 32-bit machine). The collecting
> thread is guaranteed to see this because of the semaphore
> synchronisation.
Yes, but that's not a problem.
>
> What can go wrong is that in some thread the reference count is out of
> date, so a->references |= 0x8000000 writes back an old reference
> count.
In general, since you don't lock it, any PREVIOUS transaction from
another thread can "overtake" your async-updates.
> This never showed up (dual AMD-Athlon). I guess I should fix
> this using atomic compare-and-swap?
Yes, it might help if ALL updates are done "lock free" via CAS.
> Are there any (portable) alternatives?
Not yet, I'm afraid.
regards,
alexander.
Uhmm. You can try to offload your marking to the collector thread.
The semaphore will ensure that the collector has the same memory
view as the signal handler (when the handler posts the semaphore) and,
since the collector does have a consistent view of your "array" (just
keep it locked while doing the marking for all your threads), it should work.
Probably.
regards,
alexander.
Yes, but there is a problem. After posting the semaphore the signal
handler returns and the thread continues modifying the stacks (as long
as it doesn't hit the mutex associated with the atoms). I would be
happy with
    foreach thread-but-myself
        suspend(thread),
        mark_atoms_for_thread(thread),
        resume(thread)
    done
provided suspend(thread) makes my memory view consistent with the
suspended thread. I've seen a few implementations of thread suspension
that didn't guarantee this, but I guess it is possible to write one that
does along the lines below (leaving out all the details).
    suspend(t)
    { pthread_kill(t, SIGUSR1);
      sem_wait(sem);
    }

    resume(t)
    { pthread_kill(t, SIGUSR2);
    }

    handler()
    { sem_post(sem);
      sigsuspend(<for SIGUSR2>)
    }
Of course this loses concurrency while doing the marking. Joe
claimed I won't get that anyway. Is that true? In theory of course I
could create as many threads as CPUs and have them doing the marking
concurrently. I can synchronise the view of these threads with the
collector thread using a semaphore.
There should be a way to make my view consistent inside the handler ...
Thanks again
--- Jan
Yep. Except that it's totally broken (under the current standard)
if you can have a thread cancel request pending on the suspending
thread. The problem is that sigsuspend() is a "shall occur" thread
cancelation point... and you certainly don't want it to occur in a
signal handler... but you can't disable it because that operation
isn't async-signal-safe. The response from DB was <quote> No
problem; I'm perfectly happy to add another ominous warning to
the "KEEP OUT" sign <wink> </quote>. ;-) ;-)
What's also not clear to me is a more general problem of messing
with errno (the signal handler had better NOT change it, but there
doesn't seem to be a requirement imposed on either applications
or implementations to save/restore it... async-signal-safety of
the "errno macro" itself aside for a moment).
regards,
alexander.
Annoying. I guess it is not so hard to hack something together with
atomic compare-and-swap instructions such that in the event of a
cancellation request you don't continue the suspended thread.
Unfortunately one loses the cancellation request and there is
no standard for compare-and-swap operations :-(
I've also seen many systems that provide some form of thread
suspend/resume, but they are all called differently. None of the ones
I've seen discusses whether <suspend> <play with its memory> <resume>
is possible. Some don't even guarantee the thread is actually suspended
at the moment the suspend call returns :-(
Wouldn't it be nice if pthread_kill() and the corresponding signal
handler execution act as a synchronization similar to sem_post()
and sem_wait()? Without that pthread_kill() isn't very useful
as the handler can only work on memory that is changed _only_
by the receiving thread and pthread_kill() doesn't provide for
passing a lot of context (of course you can send SIGUSR1 for 0
and SIGUSR2 for 1 :-))
> What's also not clear to me is a more general problem of messing
> with errno (the signal handler had better NOT change it, but there
> doesn't seem to be a requirement imposed on either applications
> or implementations to save/restore it... async-signal-safety of
> the "errno macro" itself aside for a moment).
If signal delivery does not save/restore errno, I guess it is
simply impossible (well, unsafe) to make _any_ system call from a
signal handler :-(
Cheers --- Jan
http://www.opengroup.org/onlinepubs/007904975/xrat/xbd_chap04.html#tag_01_04_10
"Conforming applications may only use the functions listed to
synchronize threads of control with respect to memory access.
There are many other candidates for functions that might also
be used. Examples are: signal sending and reception, or pipe
writing and reading. In general, any function that allows one
thread of control to wait for an action caused by another
thread of control is a candidate. IEEE Std 1003.1-2001 does
not require these additional functions to synchronize memory
access since this would imply the following:
All these functions would have to be recognized by advanced
compilation systems so that memory operations and calls to
these functions are not reordered by optimization.
All these functions would potentially have to have memory
synchronization instructions added, depending on the
particular machine.
The additional functions complicate the model of how memory
is synchronized and make automatic data race detection
techniques impractical."
Got it? ;-)
regards,
alexander.
Sure :-), except there is *no* (portable) synchronization primitive that
can be called from a signal handler :-( Plus there are no primitives in
pthread that allow for a sensible portable implementation of thread
suspend/resume :-( I do agree with the view of many in this group that
you should stay away from these primitives, but portable implementation
of high-level languages becomes impossible given this situation :-(
Next, it is extremely difficult (impossible?) to write GNU autoconf
macros that can detect the memory synchronization behaviour of a specific
platform, which would allow the most `acceptable' realisation for that
platform to be configured automatically :-(
The trouble given this situation is that the system can be configured
and compiled successfully on some platform. It can also pass the test
suite but still be fundamentally broken, and there is no way to extend
the test-suite to catch this :-(
I guess on most processors adding a few assembler instructions will
suffice to make the signal-and-wait work fine. Does anyone know whether
there is an overview or an open library based on these? Maybe solutions
that work on all GCC supported platforms?
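For what it's worth, the kind of thing I mean, for i386 with GCC, is
(just an example; every other architecture needs its own sequence):

    /* Full memory barrier on i386: a locked read-modify-write on the stack,
       plus a "memory" clobber to stop the compiler from reordering. */
    #define MEMORY_BARRIER() \
            __asm__ __volatile__("lock; addl $0,0(%%esp)" : : : "memory")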
--- Jan
File a DR or two seeking to declare (reassure) that errno-thing is
async-signal-safe and that when a signal is delivered to a signal-
catching function, the implementation shall save/restore errno (sort of
"for convenience") and also disable/restore thread cancelability.
Or something like that to make a sensible portable implementation
of thread suspend/resume that you seem to need. Give it a try. And
it's fun too, sometimes. ;-) http://opengroup.org/austin/aardvark
regards,
alexander.
Challenging task :-) I might, but I don't know whether I'm the right
person. I don't think I'm sufficiently knowledgeable to make a complete
proposal for change, especially as there are various ways that `almost'
achieve my goal of inspecting a thread asynchronously. Signal handling,
as my current implementation does it, appears to me the most sensible
solution but requires some way to ensure visibility. Thread
suspend/resume solves the problem too, provided the thread is really
suspended and its memory is synchronised with the calling thread by the
time the suspend() call returns. Thread suspension and inspecting its
memory is about as dangerous as async signalling.
So, maybe the specs for signal handling should change along your lines,
allowing for (a cumbersome) implementation of suspend/resume and maybe
there should be some more async-safe primitives or async-safe versions
of normal primitives. For example an async-safe version of sem_wait()
would allow for
    foreach thread
    { if ( pthread_kill(t, sig) == 0 )
      { signalled++;
        sem_post(sem1);
      }
    }
    while(signalled--)
      sem_wait(sem2);

    handler(sig)
    { sem_wait(sem1);
      <do my work>
      sem_post(sem2);
    }
As I don't think blocking calls inside signal handlers are a very
good idea, my favourite is that the system ensures visibility inside
a signal handler if it is triggered by pthread_kill().
For now I think I'm going to search for non-portable solutions to get
the signal handler based approach to work ...
Happy hacking
--- Jan
Oh, yeah. http://homepage.mac.com/dbutenhof/Threads/code/susp.c
Besides making this {wildly distributed} example really sensible,
it would also help many others. There are many async-signal-safe
cancelation points and "the problem" of errno is not limited to
the suspend handler (sigsuspend() returns -1 and sets errno to
EINTR).
regards,
alexander.
P.S. I guess it's time for our heavyweight to JOIN this thread
("join" in multiple senses ;-) ).
Ah. I found an older (?) copy (?) that uses a variable called sentinel
that is set in the handler and the caller waits until it sees the new
value. As this is sensitive to reordering I thought of the semaphore.
Happy to see I'm not the only one :-)
> Besides making this {wildly distributed} example really sensible,
> it would also help many others. There are many async-signal-safe
> cancelation points and "the problem" of errno is not limited to
> the suspend handler (sigsuspend() returns -1 and sets errno to
> EINTR).
So, if I write
    if ( open(somefile, ...) < 0 )
        <handle error>
I might get EINTR if the suspend/resume happens _just_ after the
(successful) return of open()? As this suspend/resume would be part of
normal behaviour of my program this is not acceptable to me. I could
wrap all my system calls, but it would still pose a lot of problems
in libraries that can be loaded dynamically into the system.
I've had a look at the sources of glibc's pthread implementation and
found the macros WRITE_MEMORY_BARRIER() and READ_MEMORY_BARRIER() and
now I'm tempted to believe the following would fix my problem:
    WRITE_MEMORY_BARRIER();
    foreach thread
        if ( pthread_kill(t, sig) == 0 )
            signalled++;
    while(signalled--)
        sem_wait(sem)

    handler(sig)
    { READ_MEMORY_BARRIER();
      <do my marking>
      sem_post(sem);
    }
These macros are empty for i386 which supports my experiments on a dual
AMD Athlon where my test program failed to produce errors after running
for 5 days. Ok, this is in Prolog and the system is doing a lot more
work and the theoretically possible problems are not very likely. I
should write a dedicated testing routine trying to make the problems as
likely as possible.
--- Jan
http://groups.google.com/groups?threadm=37B84369.F5B8AAEF%40csc.com
> that is set in the handler and the caller waits until it sees the new
> value. As this is sensitive to reordering I thought of the semaphore.
> Happy to see I'm not the only one :-)
>
> > Beside making this {wildly distributed} example really sensible,
> > it would also help many others. There are many async-signal-safe
> > cancelation points and "the problem" of errno is not limited to
> > suspend handler (sigsuspend() returns -1 and sets the errno to
> > EINTR).
>
> So, if I write
>
> if ( open(somefile, ...) < 0 )
> <handle error>
>
> I might get EINTR if the suspend/resume happens _just_ after the
> (successful) return of open()?
My understanding is that you can "get" EINTR at any point where you
have the suspend signal unblocked. But if it happens _just_ after
the successful return of open(), you probably won't notice it. ;-)
[...]
> I've had a look at the sources of glibc's pthread implementation and
> found the macros WRITE_MEMORY_BARRIER() and READ_MEMORY_BARRIER() and
> now I'm tempted to believe the following would fix my problem:
It won't fix it (portably). To begin with, there's no requirement
imposed on pthread_kill() to delay signal delivery until all writes
(from the "killing" thread) are made visible to the "killed" thread,
and no memory barrier can change that (unless you happen to know
that it will work due to pthread_kill() implementation specifics).
Besides that, Linux msync macros and atomics are brain-dead and many
"msync sensitive" places in glibc/nptl (kernel aside for a moment)
are totally broken. Nobody (among maintainers) seems to care...
because it just "seems to work", I guess.
0.666 euros.
regards,
alexander.
There are many arguments for making everything "async-signal safe". It's
easier to code, a conceptually simpler environment. It's also horrendously
complicated and expensive to implement in general unless everything is inside
the kernel, and it's a radical and fundamental change from existing UNIX
practice dating back to the beginning of UNIX.
It's not going to happen in POSIX. Nor should it.
And if by multiple senses of join you mean, aside from contributing these
shattered gems of wisdom, "waiting for the thread to terminate", I prefer
to execute concurrently on another processor, thank you. ;-)
--
/--------------------[ David.B...@hp.com ]--------------------\
| Hewlett-Packard Company Tru64 UNIX & VMS Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\----[ http://homepage.mac.com/dbutenhof/Threads/Threads.html ]---/
I just read a long thread on a glibc mailing list from May this year
that indicates some people do care. It doesn't seem there will be more
support for signals in pthread (certainly not soon).
There is no completely safe and portable solution to my problem so I
think the non-portable route is the way to go. Luckily there are other
open source projects from which to borrow the tricky parts.
Happy hacking
--- Jan
In article <slrnbm1jj...@ct.xs4all.nl>, Jan Wielemaker wrote:
> In article <3F60A3B0...@web.de>, Alexander Terekhov wrote:
>> Jan Wielemaker wrote:
>> [...]
>>> I've had a look at the sources of glibc's pthread implementation and
>>> found the macros WRITE_MEMORY_BARRIER() and READ_MEMORY_BARRIER() and
>>> now I'm tempted to believe the following would fix my problem:
As a final remark, I think according to your statements the following
program should fail on an SMP machine. Most likely this is true in
general, but it runs like a dream on a dual AMD Athlon (SuSE 8.1, Kernel
2.4.19-4GB-SMP, glibc 2.2.5), which might explain why Prolog runs fine on
this architecture :-)
--- Jan
Compile with gcc -pthread -Wall -o test test.c
================================================================
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <signal.h>
#include <pthread.h>
#include <semaphore.h>
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

static sem_t sem;
static long x;
static long y;

static void
handler(int sig)
{ y=x;
  sem_post(&sem);
}

static void *
run_thread(void *arg)
{ for(;;)
  { sleep(1);
  }
}

int
main(int argc, char **argv)
{ pthread_t th;
  struct sigaction act, oldact;

  sem_init(&sem, 0, 0);
  pthread_create(&th, NULL, run_thread, NULL);

  memset(&act, 0, sizeof(act));
  act.sa_handler = handler;
  sigaction(SIGUSR1, &act, &oldact);

  for(;;)
  { x++;
    if ( pthread_kill(th, SIGUSR1) == 0 )
    { sem_wait(&sem);
      assert(x == y);
    }
    if ( x % 10000 == 0 )
    { putchar('.');
      fflush(stdout);
    }
  }
}
The compiler is allowed to reorder x++ and pthread_kill(). Your "test"
is totally broken (it proves nothing other than the existence of "luck").
regards,
alexander.
So, you mean that if I decide to bother AG once again with a DR
or two (that I've already outlined in one of my previous postings
here), you're going to oppose it on the grounds of the vagueness
you wrote above? Interesting. I guess it's worth trying. Or?
regards,
alexander.
Yeah, there was a "[PATCH] PPC atomic cleanup" thread in May this year.
http://groups.google.com/groups?selm=3EBBE66B.98885A41%40web.de
https://www.redhat.com/archives/phil-list/2003-August/msg00039.html
Both problems are still there (as of "nptl-0.57.tar").
regards,
alexander.
P.S. http://groups.google.com/groups?selm=3F4BCDED.7A35B881%40web.de
I don't agree with this. True, the compiler is allowed to do that.
Apparently it doesn't (otherwise the test would have failed) and
it isn't so hard to stop the compiler from doing that. What the
test program does validate (well, after 1,000,000 apples falling
down the next may go up :-) is that the signal handler (on this
architecture) sees the correct new value of "x".
--- Jan