killing pthreads?

Steven Wong

Sep 23, 2003, 12:29:32 PM

Hi,

I've got some code that keeps a list of pthread IDs. When new threads
are created using pthreads, their IDs are added to the list. When
exiting the program, I have a method that kills any threads still
running (this method runs from the main thread). It uses the
pthread_kill(id, SIGKILL) function.

However, every now and then my program will be killed, with the
message "Killed" sent to stdout, while the program is killing the
remaining threads. The only reason I can think of is that somehow I'm
accidentally killing the main thread, even though the ID of the thread
to be killed is checked against the main thread's ID before killing
it.

Other than that, I can't think of any reason why the program is
being killed.

Any ideas?

Thanks!

Loic Domaigne

Sep 23, 2003, 1:45:54 PM

Hi Steven,

> I've got some code that keeps a list of pthread IDs. When new threads
> are created using pthreads, their IDs are added to the list. When
> exiting the program, I have a method that kills any threads still
> running (this method runs from the main thread). It uses the
> pthread_kill(id, SIGKILL) function.

The first question that naturally comes to mind is: why are you doing
this? When the process terminates (e.g. the main thread returns, or exit()
is called), all the running threads are automatically
terminated...

> However, every now and then my program will be killed, with the
> message "Killed" sent to stdout, while the program is killing the
> remaining threads. The only reason I can think of is that somehow I'm
> accidentally killing the main thread, even though the ID of the thread
> to be killed is checked against the main thread's ID before killing
> it.

Actually, you are quite lucky to see the problem only "every now and then".
I would expect you to always get the message "Killed"!

A question: are you using Linux with LinuxThreads?

> Other than that, I can't think of any reason why the program is
> being killed.
> Any ideas?

The message you are seeing is not surprising. I think your
problem is a misconception of what pthread_kill() means. As a matter
of fact, pthread_kill() is almost the same as kill() except for one point:
the signal is delivered to the targeted thread instead of being delivered to
the process.

However, the signal sent by pthread_kill() is handled like any other signal.
You are sending a SIGKILL, as a result you are terminating the PROCESS.
Indeed, that's the default action for that signal, and signal actions
affect the PROCESS, not THREADS. Hence, it's not surprising to see the
message "Killed"...
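
A minimal sketch of this behaviour, assuming a POSIX system with fork() and a pthreads library. The names signal_that_ended_child and idle_worker are made up for illustration and do not come from the original program. A child process sends a signal to one of its own threads, and the parent reports which signal ended the child as a whole:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical worker: it only waits and installs no signal handler. */
static void *idle_worker(void *arg)
{
    (void)arg;
    for (;;)
        pause();
    return NULL;
}

/*
 * Fork a child; the child creates one worker thread and sends `sig` to
 * that worker with pthread_kill(). Returns the number of the signal that
 * terminated the child PROCESS, 0 if it exited normally, -1 on error.
 */
int signal_that_ended_child(int sig)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                      /* child process */
        pthread_t tid;
        if (pthread_create(&tid, NULL, idle_worker, NULL) != 0)
            _exit(2);
        pthread_kill(tid, sig);          /* aimed at one thread only */
        sleep(5);                        /* only completes if the process survived */
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)    /* parent: see how the child ended */
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling signal_that_ended_child(SIGKILL) should report SIGKILL for the whole child, which is what the shell prints as "Killed".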

Regards,
Loic.
--
Article posted via Usenet access http://www.mes-news.com
Access via NNRP or Web

Steven Wong

Sep 24, 2003, 12:38:15 PM

Yeah, that was it exactly. I was working with someone else's code and
didn't realize pthread_kill() kills the whole process. I had thought it
would just kill a specific thread. pthread_cancel() is closer to what I
wanted....

Thanks!!

Loic Domaigne <loic...@gmx.net> wrote in message news:<mW.HLo...@mes-news.com>...

David Butenhof

Sep 25, 2003, 7:14:51 AM

Steven Wong wrote:

> Yeah, that was it exactly. I was working with someone else's code and
> didn't realize pthread_kill() kills the whole process. I had thought it
> would just kill a specific thread. pthread_cancel() is closer to what I
> wanted....

First, the UNIX function "kill" is poorly named. It doesn't "kill" (or even
injure -- though there's room for dispute on that point ;-) ) anything. All
it really does is send a signal to the target.

The function pthread_kill() allows you to send that signal to a thread
rather than to a process. In modern POSIX terms, a "process" is just a
resource bundle containing one or more threads and "stuff" (including
memory, files, etc). There's no way to send a signal to "a process", so
kill() really only sends a signal to some RANDOM thread in a process
whereas pthread_kill() allows a thread within the process to send that
signal to a PARTICULAR thread.

Some signals, like SIGKILL, are also defined to have some "side effect" on
the process. Some have argued that this effect ought to apply to the
thread. But think about what that means when all signals are delivered to
threads.

You start a program, and it goes bad. It dies with a SIGSEGV. Ooops, well,
actually, only one thread died with a SIGSEGV. The rest continue, with some
corrupted data. Maybe, eventually, they all die with SIGSEGVs. Perhaps you
notice something's going bad and hit ^C, generating a SIGINT for the
process. But, wait... there is no process. So SIGINT gets delivered to some
thread, and that thread is terminated. The rest of the process (now sorely
wounded, but still chugging on) continues. Hit ^C again, and another thread
bites the dust. The beast is badly wounded, tearing apart your filesystem,
but still going. Quick, "kill -9". Ah, yes... and yet another single thread
dies, but the process chugs on. Um, this isn't working too well, is it?

When we added POSIX threads, and in particular when we worked out the final
signal model, we thought it would be moderately useful to have the well
known and widely used "POSIX job control" model continue to work. So that
was why we said that signals with actions previously defined to operate on
THE PROCESS would continue to do so. Otherwise, after all, we'd have had no
choice but to invent a whole new mechanism that DID, and force everyone to
recode their shells and such. Which, really, on the whole, didn't seem like
a particularly good (or useful) idea...

Now, pthread_cancel(). There are some important differences. When you cancel
a thread, you don't get to "KILL it". You instead ask politely that the
thread terminate itself as soon as it can do so conveniently. The interface
is designed to allow it to indefinitely defer cancellation, as well as
cleaning up anything necessary before it terminates. You generally don't
cancel a "runaway" thread, because you've got no particular reason to
expect that it'll behave rationally. (You're better off terminating the
process with a core dump to analyze later.)

Threads you expect to be cancelled need to be carefully written with cleanup
handlers. (Note that while cancellation SHOULD run C++ object destructors,
there's no standard that can require this and many implementations of C++
are therefore useless with threads.)
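
A minimal sketch of a thread written to be cancelled, assuming deferred cancellation (the default) and unnamed POSIX semaphores. The names cancellable_worker, release_buffer and cancel_and_check_cleanup are invented for the example. The semaphore is there only so the cancel request is sent after the cleanup handler has been registered:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>
#include <unistd.h>

static sem_t handler_registered;  /* posted once the cleanup handler is in place */
static int cleanup_ran;           /* set by the cleanup handler */

/* Cleanup handler: releases what the worker owned when it was cancelled. */
static void release_buffer(void *buf)
{
    free(buf);
    cleanup_ran = 1;
}

static void *cancellable_worker(void *arg)
{
    (void)arg;
    char *buf = malloc(4096);
    if (buf == NULL) {
        sem_post(&handler_registered);
        return NULL;
    }

    /* Handlers pushed here run if the thread is cancelled before the pop. */
    pthread_cleanup_push(release_buffer, buf);
    sem_post(&handler_registered);
    for (;;)
        sleep(1);                 /* sleep() is a cancellation point */
    pthread_cleanup_pop(1);       /* not reached; must pair with the push */
    return NULL;
}

/* Returns 1 if the worker ended by cancellation and its handler ran. */
int cancel_and_check_cleanup(void)
{
    pthread_t tid;
    void *result = NULL;

    cleanup_ran = 0;
    if (sem_init(&handler_registered, 0, 0) != 0)
        return 0;
    if (pthread_create(&tid, NULL, cancellable_worker, NULL) != 0)
        return 0;

    while (sem_wait(&handler_registered) != 0)
        continue;                 /* retry if interrupted by a signal */

    pthread_cancel(tid);          /* a polite request, not a kill */
    pthread_join(tid, &result);   /* wait until cleanup has finished */
    sem_destroy(&handler_registered);

    return result == PTHREAD_CANCELED && cleanup_ran;
}
```

A thread that must not be interrupted in the middle of an update could additionally bracket that region with pthread_setcancelstate(), which is how it would defer the request.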

And, as someone else said, if you're trying to get rid of your running
threads at program termination, you generally might as well just call
exit(), which will "evaporate" all the threads anyway. The one exception
would be if you have threads maintaining externally visible state; in which
case you should probably cancel them and allow them to clean up completely.
You'll need to JOIN with those threads to be sure they've finished that
cleanup.
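
As a rough sketch of that cancel-then-join shutdown for a list of thread IDs: the helper names (cancel_and_join_all, start_and_shut_down, worker) and the limit of 16 threads are assumptions made for the example, not part of any existing code.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical worker: loops forever, blocking in a cancellation point. */
static void *worker(void *arg)
{
    (void)arg;
    for (;;)
        sleep(1);
    return NULL;
}

/*
 * Ask every thread in ids[0..count-1] to terminate, then wait for each.
 * Returns how many threads ended because of the cancel request.
 */
size_t cancel_and_join_all(const pthread_t *ids, size_t count)
{
    size_t cancelled = 0;

    for (size_t i = 0; i < count; i++)
        pthread_cancel(ids[i]);          /* request only; does not wait */

    for (size_t i = 0; i < count; i++) {
        void *result = NULL;
        /* Joining is what guarantees the thread's cleanup has completed. */
        if (pthread_join(ids[i], &result) == 0 && result == PTHREAD_CANCELED)
            cancelled++;
    }
    return cancelled;
}

/* Start up to 16 workers, shut them down, report how many were cancelled. */
size_t start_and_shut_down(size_t count)
{
    pthread_t ids[16];
    size_t started = 0;

    if (count > 16)
        count = 16;
    while (started < count &&
           pthread_create(&ids[started], NULL, worker, NULL) == 0)
        started++;

    return cancel_and_join_all(ids, started);
}
```

Sending all the cancel requests first and joining afterwards lets the threads clean up in parallel instead of one after another.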

--
/--------------------[ David.B...@hp.com ]--------------------\
| Hewlett-Packard Company Tru64 UNIX & VMS Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\----[ http://homepage.mac.com/dbutenhof/Threads/Threads.html ]---/

Steven Wong

Sep 26, 2003, 5:24:30 PM

Thanks for the detailed reply.

But a few questions....

> The function pthread_kill() allows you to send that signal to a thread
> rather than to a process. In modern POSIX terms, a "process" is just a
> resource bundle containing one or more threads and "stuff" (including
> memory, files, etc). There's no way to send a signal to "a process", so
> kill() really only sends a signal to some RANDOM thread in a process
> whereas pthread_kill() allows a thread within the process to send that
> signal to a PARTICULAR thread.

Is it possible to set up signal handlers for this particular thread
which would somehow prevent the whole process from dying (or at least
react differently to the signal than another thread in the same
process)? If not, what would be the point of picking a specific thread
to send the signal to?

> Some signals, like SIGKILL, are also defined to have some "side effect" on
> the process. Some have argued that this effect ought to apply to the
> thread. But think about what that means when all signals are delivered to
> threads.

For SIGKILL, the "side effect" being that the whole process gets
killed?

> And, as someone else said, if you're trying to get rid of your running
> threads at program termination, you generally might as well just call
> exit(), which will "evaporate" all the threads anyway. The one exception
> would be if you have threads maintaining externally visible state; in which
> case you should probably cancel them and allow them to clean up completely.
> You'll need to JOIN with those threads to be sure they've finished that
> cleanup.

The pthread_kill() code is in an existing "cleanup" method. I can't be
certain it's called only when the application exits (although it
probably is).

thanks again

Patrick TJ McPhee

Sep 27, 2003, 12:22:41 PM

In article <a7424901.03092...@posting.google.com>,
Steven Wong <gndm...@yahoo.com> wrote:

[...]

% > There's no way to send a signal to "a process", so
% > kill() really only sends a signal to some RANDOM thread in a process
% > whereas pthread_kill() allows a thread within the process to send that
% > signal to a PARTICULAR thread.
%
% Is it possible to set up signal handlers for this particular thread,
% which, somehow, prevented the whole process from dying?

You can't install a signal handler which is specific to a particular
thread -- signal handlers are process wide -- but you can, of course,
prevent the process from dying, by not aborting in the signal handler.
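
A small sketch of that point, assuming SIGUSR1 is not blocked in the calling thread. The names count_signal, wait_for_signal and signal_one_thread_and_survive are invented for the example. The handler is installed once for the whole process with sigaction(), the signal is aimed at one thread with pthread_kill(), and because the handler simply returns, the process carries on:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <time.h>

static volatile sig_atomic_t handler_calls;

/* Process-wide handler: it only counts, and returning keeps the process alive. */
static void count_signal(int sig)
{
    (void)sig;
    handler_calls++;
}

/* Hypothetical worker: polls until the handler has run in this thread. */
static void *wait_for_signal(void *arg)
{
    (void)arg;
    struct timespec tick = {0, 1000000};   /* 1 ms */
    while (handler_calls == 0)
        nanosleep(&tick, NULL);
    return NULL;
}

/* Returns the number of handler invocations (1 expected), -1 on error. */
int signal_one_thread_and_survive(void)
{
    struct sigaction sa, old;
    pthread_t tid;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old) != 0) /* applies to every thread */
        return -1;

    handler_calls = 0;
    if (pthread_create(&tid, NULL, wait_for_signal, NULL) != 0)
        return -1;

    pthread_kill(tid, SIGUSR1);             /* delivered to that thread only */
    pthread_join(tid, NULL);

    sigaction(SIGUSR1, &old, NULL);         /* restore the previous action */
    return (int)handler_calls;
}
```

This cannot be done for SIGKILL, which can be neither caught nor ignored.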

% (or at least
% reacted differently to the signal than another thread in the same
% process?

I don't think there's a legal way to do this. There's a limited set
of functions that you may call from a signal handler, and I don't think
it includes pthread_getspecific, pthread_self, or pthread_equal.

% If not, what would be the point of picking a specific thread
% to send the signal to?)

Normally, I do it to implement a time-out mechanism on a blocking
system call which doesn't support time-outs. You don't do it as a
replacement for pthread_cancel(), though.
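
One possible shape of such a time-out, sketched under some assumptions: the blocking call is a read() on a pipe nobody writes to, SIGUSR1 is free for this purpose, and the names (wake_up, blocking_reader, interrupt_blocked_read) are made up. The handler does nothing; it exists so that the signal interrupts the call, and it is installed without SA_RESTART so read() fails with EINTR rather than being restarted:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static void wake_up(int sig) { (void)sig; }  /* empty on purpose */

struct reader {
    int fd;
    int saved_errno;
    atomic_int done;
};

static void *blocking_reader(void *arg)
{
    struct reader *r = arg;
    char byte;
    ssize_t n = read(r->fd, &byte, 1);       /* blocks: the pipe stays empty */
    r->saved_errno = (n < 0) ? errno : 0;
    atomic_store(&r->done, 1);
    return NULL;
}

/* Returns the errno the interrupted read() reported, or -1 on setup failure. */
int interrupt_blocked_read(void)
{
    struct sigaction sa, old;
    struct reader r;
    struct timespec timeout = {0, 50 * 1000 * 1000};  /* 50 ms "time-out" */
    struct timespec retry = {0, 1000000};             /* 1 ms between nudges */
    pthread_t tid;
    int fds[2];

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = wake_up;                 /* sa_flags == 0: no SA_RESTART */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old) != 0 || pipe(fds) != 0)
        return -1;

    r.fd = fds[0];
    r.saved_errno = 0;
    atomic_init(&r.done, 0);
    if (pthread_create(&tid, NULL, blocking_reader, &r) != 0)
        return -1;

    nanosleep(&timeout, NULL);
    /* Repeat in case a signal lands before the thread has entered read(). */
    while (!atomic_load(&r.done)) {
        pthread_kill(tid, SIGUSR1);
        nanosleep(&retry, NULL);
    }

    pthread_join(tid, NULL);
    close(fds[0]);
    close(fds[1]);
    sigaction(SIGUSR1, &old, NULL);
    return r.saved_errno;
}
```

The retry loop is a design choice for the sketch: a single pthread_kill() can arrive just before the thread blocks, in which case the call would never be interrupted.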

% > Some signals, like SIGKILL, are also defined to have some "side effect" on

% For SIGKILL, the "side effect" being that the whole process gets
% killed?

As they say in films, with extreme prejudice.

% The pthread_kill() code is in an existing "cleanup" method.

It would be worthwhile finding out whether the intent was to
terminate the threads or simply make sure they're not blocked in
a read or something like that. In either case, pthread_cancel
may be a better way to go.

--

Patrick TJ McPhee
East York Canada
pt...@interlog.com

Steven Wong

Oct 1, 2003, 11:13:12 AM

Thanks for the reply Patrick!

> You can't install a signal handler which is specific to a particular
> thread -- signal handlers are process wide -- but you can, of course,
> prevent the process from dying, by not aborting in the signal handler.
>

I did a bit of digging around and it looks like sigaction() can be
used to set up signal handlers for specific threads, according to the
man pages on the SGI tech pubs library:

    The sigaction POSIX service allows for per-thread handlers to be
    installed for catching synchronous signals. It is called in a
    multithreaded process to establish thread specific actions for
    such signals.

They list two sigaction() functions, though: one for processes and
one for POSIX threads. The functions have the same signature,
so how can you tell which one you are actually calling?
(Perhaps it depends on whether or not you have pthreads linked into
your program?) And how would you specify which thread the signal
handler applies to? Would it be the calling thread?

Steven

Loic Domaigne

Oct 1, 2003, 12:37:54 PM

Steven Wong wrote:

That's interesting, because Posix.1c doesn't define anything like a
per-thread signal handler for sigaction()... Unless I missed something,
of course ;-)

In fact, signal masks are set on a per-thread basis, but signal actions and
signal handlers, as set with sigaction(), are shared between all threads.
IOW, whatever sigaction() sets up is process-wide.
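
A brief sketch of how the per-thread mask is commonly used, assuming SIGUSR1 as the example signal and with invented names (signal_waiter, receive_process_signal_in_one_thread). The signal is blocked before any thread is created, so every thread inherits the blocked mask, and one dedicated thread collects it synchronously with sigwait():

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Dedicated thread: waits synchronously for any signal in the given set. */
static void *signal_waiter(void *arg)
{
    sigset_t *set = arg;
    int sig = 0;

    if (sigwait(set, &sig) != 0)
        sig = -1;
    return (void *)(intptr_t)sig;
}

/* Returns the signal number the waiter thread received, or -1 on error. */
int receive_process_signal_in_one_thread(void)
{
    sigset_t set, old;
    pthread_t tid;
    void *result = NULL;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);

    /* The mask is per-thread, and new threads inherit their creator's mask. */
    if (pthread_sigmask(SIG_BLOCK, &set, &old) != 0)
        return -1;
    if (pthread_create(&tid, NULL, signal_waiter, &set) != 0)
        return -1;

    kill(getpid(), SIGUSR1);          /* process-directed signal */

    pthread_join(tid, &result);
    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return (int)(intptr_t)result;
}
```

Because the signal is blocked in every thread, its default action never fires; it stays pending until the waiter picks it up.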

I guess what the man page is telling you is that if you want to catch a
thread-directed signal (e.g. SIGSEGV or SIGFPE), then you should use
sigaction(). And indeed, in that case, we could call the signal handler
a per-thread handler, because the offending thread is the one that runs
the signal handler... But that wording is definitely confusing (!)

Oh yes, one question: how old is the thread library implementation you
are referring to? IIRC, there was something similar on older DIGITAL UNIX.
"Synchronous" signals could have different signal actions for each thread.
But those semantics were changed to comply with Posix.1c at version 4.0
of DIGITAL UNIX.

And of course, any non-POSIX library might implement whatever it likes.
Though I'm not sure per-thread signal handlers would be extremely
useful...

David Butenhof

Oct 2, 2003, 7:35:30 AM

Loic Domaigne wrote:

> Steven Wong wrote:
>
>> Thanks for the reply Patrick!
>
>> > You can't install a signal handler which is specific to a particular
>> > thread -- signal handlers are process wide -- but you can, of course,
>> > prevent the process from dying, by not aborting in the signal handler.
>
>> I did a bit of digging around and it looks like sigaction() can be
>> used to set up signal handlers for specific threads, according to the
>> man pages on the SGI tech pubs library:
>
>>   The sigaction POSIX service allows for per-thread handlers to be
>>   installed for catching synchronous signals. It is called in a
>>   multithreaded process to establish thread specific actions for
>>   such signals.
>
>> Although they list two sigaction() functions, one for processes and
>> one for posix threads. The functions have the same method signatures
>> though, so how can you tell which one you are actually calling?
>> (perhaps it would depend on whether or not you have pthreads linked in
>> your program?) and how would you specify which thread you want to use
>> the signal handler for? would it be the calling thread?
>
> That's interesting, because Posix.1c doesn't define anything like a
> per-thread signal handler for sigaction()... Unless I missed something,
> of course ;-)

No, you didn't miss anything. Well, at least you didn't miss THAT. ;-)

There's no standard or portable way to set a signal action that applies to a
single thread. I don't know whether IRIX provides such an "extension";
though if they did I can't imagine they'd have used the sigaction() name to
do it. That'd just be too confusing!

> In fact, signal masks are set on a per-thread basis, but signal actions
> and signal handlers, as set with sigaction(), are shared between all
> threads. IOW, whatever sigaction() sets up is process-wide.
>
> I guess what the man page is telling you is that if you want to catch a
> thread-directed signal (e.g. SIGSEGV or SIGFPE), then you should use
> sigaction(). And indeed, in that case, we could call the signal handler
> a per-thread handler, because the offending thread is the one that runs
> the signal handler... But that wording is definitely confusing (!)

Well, some believe that man pages are supposed to be confusing. In any case,
many certainly are!

> Oh yes, one question: how old is the thread library implementation you
> are referring to? IIRC, there was something similar on older DIGITAL UNIX.
> "Synchronous" signals could have different signal actions for each thread.
> But those semantics were changed to comply with Posix.1c at version 4.0
> of DIGITAL UNIX.

Technically, no; that was still "DEC OSF/1". ;-) In fact, 4.0 was still DEC
OSF/1. But, yes, the original OSF/1 model (not specific to Digital) was
rather bizarre: the signal ACTION was per-thread, but the MASK was
per-process. (Just try to make sense of THAT model!) And all "asynchronous"
signals were delivered to the initial thread of the process.

> And of course, any non-POSIX library might implement whatever it likes.
> Though I'm not sure per-thread signal handlers would be extremely
> useful...

There was an enormous and extended argument in the POSIX working group
between the "process signal model" and the "thread signal model". Both, in
fact, argued (persuasively, correctly, and blindly) that their preference
was "more compatible" with existing practice. I'm not even going to TRY to
capture the extent and violence of this argument now, but let's just say
that the accomplishment of Nawaf Bitar in eventually settling the argument
with his "Grand Signal Compromise" (which became the standard) was enough
to qualify him as a world class diplomat.

In short, BOTH make perfect sense... from a certain point of view. But the
final process model has the substantial advantage of retaining traditional
POSIX job control semantics, and everyone pretty much had to agree that
this was not a bad thing.

Steven Wong

Oct 3, 2003, 3:39:52 PM

I tried posting the same question to the SGI newsgroup; hopefully someone
there might know what's going on with the IRIX sigaction().

(I took that quote right off the sgi site, I swear!! =) )

Thanks

David Butenhof <David.B...@hp.com> wrote in message news:<3f7c...@usenet01.boi.hp.com>...