Standards, cancelation, blocked threads, poll and signals

iker

Dec 19, 2001, 12:00:27 PM

Hello all.

I'm currently working on an application where I find it convenient to
cancel worker threads that are spawned by a manager thread to handle
client requests. Each of these worker threads is usually blocked in
poll() waiting for IO but may also block in other unforeseen system
calls (something similar to ACE's select reactor - it blocks in select
but the service handlers it calls can block in anything).

I'm new to MT so I've started reading PwPT, the 1996 POSIX standard
and quite a few great postings to this newsgroup in order to come up
with a list of "standard" ways to do this which I include below. I
think the list summarizes some of the current thinking with respect to
this problem but I would be most appreciative if the folks here could
provide corrections and/or additional methods.

Regards,
Iker

1) Asynchronous cancelation: The general consensus in this newsgroup
is that this is bad. The target thread will be left in an
indeterminate state and, if it was doing anything even remotely
interesting you'll likely experience resource leaks or crashes.

2) Deferred cancelation: Nice, standard way of going about cancelation
but it suffers from some serious issues. First, target threads can
still block in poll and select and conform to the POSIX standard
(neither is on the list of cancelation points under POSIX 1996).
Second, cancelation cleanup under pthreads and cleanup under C++ do
not usually work together. On most platform/compiler combinations a
canceled thread will execute pthread cancelation handlers but will not
execute catch blocks or destructors for local objects.

3) Signals: This solution seems to work well but only under some
rather strict guidelines - your library client should not use signals,
should not create threads except via your library and should not make
use of other libraries that create threads. However, if this suits you
fine then cancelation using signals allows you to wake a thread stuck
in a blocking call - errno will return EINTR - which your client's
code can handle as it sees fit. If your client returns or throws in
response to the failure C++ catch blocks and destructors will be
called as expected.

Note: Just checked the standard (3.3.1.4 Signal Effects on Other
Functions) and it has this to say about functions that are interrupted
by signal handlers: "If the signal-catching function executes a return
the behavior of the interrupted function shall be as described
individually for that function". I guess this means that assuming the
function will fail or that errno will return EINTR as a result of the
failure may not be portable (although that's what you get on Linux).

4) Pipes or local sockets: IMO, the least intrusive solution of the
bunch if your code follows the reactor pattern. Your poll or select
call includes a descriptor that is only used to wake it when the
thread is to be canceled. This approach returns from blocked select
and poll calls, works well with C++ but does nothing for you if your
thread is stuck in anything other than select or poll. I've taken this
approach and simply placed a documented requirement that clients of my
class not make calls that take "too long".
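
A minimal sketch of the wake-up descriptor idea from item 4, in C. The names wake_fds, worker and run_pipe_wakeup_demo are made up for illustration and do not come from the post. A real reactor would poll its client descriptors alongside the wake-up pipe; this sketch polls only the pipe.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical names: wake_fds, worker, run_pipe_wakeup_demo. */
static int wake_fds[2];   /* [0] = read end polled by the worker, [1] = write end */

static void *worker(void *arg)
{
    int *woken = arg;
    struct pollfd pfd = { .fd = wake_fds[0], .events = POLLIN };

    for (;;) {
        int n = poll(&pfd, 1, -1);          /* block until something is readable */
        if (n < 0 && errno == EINTR)
            continue;                       /* interrupted by a signal: poll again */
        if (n > 0 && (pfd.revents & POLLIN)) {
            char byte;
            if (read(wake_fds[0], &byte, 1) == 1)
                *woken = 1;                 /* shutdown request seen */
        }
        break;                              /* ordinary return, no pthread_cancel */
    }
    return NULL;
}

int run_pipe_wakeup_demo(void)
{
    pthread_t tid;
    int woken = 0;

    if (pipe(wake_fds) != 0)
        return -1;
    int rc = pthread_create(&tid, NULL, worker, &woken);
    assert(rc == 0);

    /* One byte on the pipe makes poll() return in the worker. */
    if (write(wake_fds[1], "x", 1) != 1)
        return -1;

    pthread_join(tid, NULL);
    close(wake_fds[0]);
    close(wake_fds[1]);
    return woken ? 0 : -1;
}
```

Because the worker leaves through a normal return, the same shape in C++ would unwind the stack in the usual way. As the post says, it does nothing for a thread stuck in a call other than poll or select.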

iker

Dec 26, 2001, 10:54:19 AM

Maybe I should have titled the message "Thread cancelation summary" or
something? :)

Wil Evers

Dec 27, 2001, 6:18:39 AM

In article <b859c232.0112...@posting.google.com>, iker wrote:

> Maybe I should have titled the message "Thread cancelation summary" or
> something? :)

Well, what can we say? You wrote an excellent summary.

- Wil

--
Wil Evers, DOOSYS R&D, Utrecht, Holland
[Wil underscore Evers at doosys dot com]

iker

Dec 28, 2001, 9:51:33 AM

Thanks :)

I just thought I might have omitted some techniques or that some of
what I wrote might have been based on outdated information.

Iker

Wil Evers <bou...@dev.null> wrote in message news:<a0f02g$nvl$1...@news1.xs4all.nl>...

Drazen Kacar

Dec 28, 2001, 11:31:08 AM

iker wrote:

> 2) Deferred cancelation: Nice, standard way of going about cancelation
> but it suffers from some serious issues. First, target threads can
> still block in poll and select and conform to the POSIX standard
> (neither is on the list of cancelation points under POSIX 1996).

Does that version of POSIX define select and poll interfaces at all?
If not, that would be the reason. Both are cancellation points under Unix
98 and POSIX 2001.

--
.-. .-. Unlike good wine, bullshit doesn't improve with age.
(_ \ / _) -- John McLean
| da...@willfork.com
|

Alexander Terekhov

Dec 28, 2001, 2:34:34 PM

iker wrote:
[...]

> 1) Asynchronous cancelation: The general consensus in this newsgroup
> is that this is bad. The target thread will be left in an
> indeterminate state and, if it was doing anything even remotely
> interesting you'll likely experience resource leaks or crashes.

This is true only if you use async.cancellation
in a way NOT defined by the standard (provoke
undefined behavior). To me, why bother with
pthread_testcancel() injecting it at some
point(s) inside some long computation loop
that does not invoke any async-cancel-unsafe
operations? Why not just have cancel type set
to PTHREAD_CANCEL_ASYNCHRONOUS inside such
async-cancel-safe region?!
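
A rough C sketch of such a region, assuming the loop body touches only local data and makes no library calls. The names spin and run_async_region_demo, and the done flag, are invented for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical names: spin, run_async_region_demo, done. */
static void *spin(void *arg)
{
    volatile int *done = arg;
    volatile unsigned long counter = 0;
    int old_type, ignored;

    /* Region entry: from here on a cancel request may act at any instruction. */
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old_type);

    while (!*done)
        counter++;          /* stands in for a long computation on local data */

    /* Region exit: back to the previous (deferred) type. */
    pthread_setcanceltype(old_type, &ignored);
    return NULL;
}

int run_async_region_demo(void)
{
    pthread_t tid;
    void *result = NULL;
    volatile int done = 0;  /* never set here: the demo relies on cancelation */

    int rc = pthread_create(&tid, NULL, spin, (void *)&done);
    assert(rc == 0);

    pthread_cancel(tid);    /* no pthread_testcancel() is needed inside the loop */
    pthread_join(tid, &result);
    return result == PTHREAD_CANCELED ? 0 : -1;
}
```

The loop contains no cancelation point at all, so with the deferred type it could not be canceled without adding pthread_testcancel() calls.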

> 2) Deferred cancelation: Nice, standard way of going about cancelation
> but it suffers from some serious issues. First, target threads can
> still block in poll and select and conform to the POSIX standard
> (neither is on the list of cancelation points under POSIX 1996).

Umm... both are on the list of "shall occur"
cancellation points under POSIX 2001, AFAIK.
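
For illustration, a small C sketch of deferred cancelation of a thread blocked in poll(), assuming a platform where poll() is a cancelation point as the POSIX 2001 list requires. The struct job, release_job, blocked_in_poll and run_deferred_poll_demo names are made up for this example.

```c
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical names: struct job, release_job, blocked_in_poll. */
struct job {
    char *buf;
    int cleaned;
};

static void release_job(void *p)
{
    struct job *j = p;
    free(j->buf);
    j->buf = NULL;
    j->cleaned = 1;
}

static void *blocked_in_poll(void *arg)
{
    struct job *j = arg;

    j->buf = malloc(64);
    pthread_cleanup_push(release_job, j);

    /* No descriptors and an infinite timeout: blocks until canceled.
       Default cancel state/type is enabled/deferred. */
    poll(NULL, 0, -1);

    pthread_cleanup_pop(1);     /* reached only if poll() returns */
    return NULL;
}

int run_deferred_poll_demo(void)
{
    pthread_t tid;
    void *result = NULL;
    struct job j = { NULL, 0 };

    int rc = pthread_create(&tid, NULL, blocked_in_poll, &j);
    assert(rc == 0);

    pthread_cancel(tid);        /* acted upon inside poll() */
    pthread_join(tid, &result);
    return (result == PTHREAD_CANCELED && j.cleaned) ? 0 : -1;
}
```

On an implementation that follows only the 1996 list, the join in this sketch could wait forever, which is the problem iker describes.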

> Second, cancelation cleanup under pthreads and cleanup under C++ do
> not usually work together. On most platform/compiler combinations a
> canceled thread will execute pthread cancelation handlers but will not
> execute catch blocks or destructors for local objects.

Do you mean linuxthreads(glibc)/g++ combination? ;-)

Yeah, no standard defines interaction between
PTHREAD-cancellation/-exit/-cleanup-handlers
and C++ stack unwinding. However, catch(...)
handlers aside, any PTHREAD/*C++* implementation
that allows you to call pthread_cancel()/
pthread_exit() but does not destroy automatic
objects, clearly does not follow intent/rationale
of the C++ standard writers wrt destruction of
automatic objects and flow of control, IMHO.
Consider:

ISO/IEC 14882:1998(E), Pg 347:

"The function signature longjmp(jmp_buf jbuf, int val)
has more restricted behavior in this International
Standard. If any automatic objects would be destroyed
by a thrown exception transferring control to another
(destination) point in the program, then a call to
longjmp( jbuf, val) at the throw point that transfers
control to the same (destination) point has
undefined behavior."
^^^^^^^^^^^^^^^^^^

ISO/IEC 14882:1998(E), Pg 97:

"[stmt.jump]...
On exit from a scope (however accomplished),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
destructors (12.4) are called for all
constructed objects with automatic storage
duration (3.7.2) (named objects or temporaries)
that are declared in that scope, in the reverse
order of their declaration."

The only "exception" is exit() (and abort(),
of course) but that is the same as POSIX
does wrt process exit and C thread cleanup
handlers:

"It was suggested that the cancelation cleanup
handlers should also be called when the process
exits or calls the exec function. This was rejected
partly due to the performance problem caused by
having to call the cancelation cleanup handlers of
every thread before the operation could continue.
The other reason was that the only state expected
to be cleaned up by the cancelation cleanup handlers
would be the intraprocess state. Any handlers that
are to clean up the interprocess state would be
registered with atexit()."

> 3) Signals: This solution seems to work well but only under some
> rather strict guidelines - your library client should not use signals,
> should not create threads except via your library and should not make
> use of other libraries that create threads. However, if this suits you
> fine then cancelation using signals allows you to wake a thread stuck
> in a blocking call - errno will return EINTR - which your client's
> code can handle as it sees fit. If your client returns or throws in
> response to the failure C++ catch blocks and destructors will be
> called as expected.

Umm... but how do you know/ensure that by the
time your victim thread will process the NOOP
"cancel" signal your victim thread will be exactly
at the point/in the state of "stuck in a blocking
call - errno will return EINTR"?? It could occur
too early (or too late) wrt "interruptible blocking
call", or am I missing something? Probably you will
need more than "just fire a signal"...

regards,
alexander.

Iker Arizmendi

Dec 29, 2001, 7:31:15 AM

"Alexander Terekhov" wrote:
>
> iker wrote:
> [...]
> > 1) Asynchronous cancelation: The general consensus in this newsgroup
> > is that this is bad. The target thread will be left in an
> > indeterminate state and, if it was doing anything even remotely
> > interesting you'll likely experience resource leaks or crashes.
>
> This is true only if you use async.cancellation
> in a way NOT defined by the standard (provoke
> undefined behavior). To me, why bother with
> pthread_testcancel() injecting it at some
> point(s) inside some long computation loop
> that does not invoke any async-cancel-unsafe
> operations? Why not just have cancel type set
> to PTHREAD_CANCEL_ASYNCHRONOUS inside such
> async-cancel-safe region?!

This leads to two additional situations:
- If the thread you plan to cancel is free of potential resource leaks at
every point in its execution then asynchronous cancelation can be safely
used (although I would guess that such threads are fairly uncommon).
- If the thread is only free of potential leaks during certain points in its
execution then you have to carefully manage the transitions from
PTHREAD_CANCEL_DEFERRED to PTHREAD_CANCEL_ASYNCHRONOUS. When this transition
happens you have to be sure you either have no outstanding resources or set
up cleanup handlers to properly manage them.

> > 2) Deferred cancelation: Nice, standard way of going about cancelation
> > but it suffers from some serious issues. First, target threads can
> > still block in poll and select and conform to the POSIX standard
> > (neither is on the list of cancelation points under POSIX 1996).
>
> Umm... both are on the list of "shall occur"
> cancellation points under POSIX 2001, AFAIK.
>
> > Second, cancelation cleanup under pthreads and cleanup under C++ do
> > not usually work together. On most platform/compiler combinations a
> > canceled thread will execute pthread cancelation handlers but will not
> > execute catch blocks or destructors for local objects.
>
> Do you mean linuxthreads(glibc)/g++ combination? ;-)

Of course. :)

David Butenhof has posted a couple of detailed descriptions of this issue
and if I recall correctly only the Tru64 Unix team has made C++ behave as
one would expect in the presence of threads (even though it seems like the
right thing to do).

> > 3) Signals: This solution seems to work well but only under some
> > rather strict guidelines - your library client should not use signals,
> > should not create threads except via your library and should not make
> > use of other libraries that create threads. However, if this suits you
> > fine then cancelation using signals allows you to wake a thread stuck
> > in a blocking call - errno will return EINTR - which your client's
> > code can handle as it sees fit. If your client returns or throws in
> > response to the failure C++ catch blocks and destructors will be
> > called as expected.
>
> Umm... but how do you know/ensure that by the
> time your victim thread will process the NOOP
> "cancel" signal your victim thread will be exactly
> at the point/in the state of "stuck in a blocking
> call - errno will return EINTR"?? It could occur
> too early (or too late) wrt "interruptible blocking
> call", or am I missing something? Probably you will
> need more than "just fire a signal"...
>

Good point. I guess one could call pthread_cancel followed immediately by a
call to pthread_kill. If we install NOOP signal handlers and set the target
thread to deferred cancelation mode then we need to deal with two scenarios:
- Thread is not blocked in a call. In this case the call to pthread_cancel
will force the thread to exit at the next cancelation point. The call to
pthread_kill will do nothing (as we were only using it to "shake" threads out
of blocked calls).
- Thread is blocked. The call to pthread_cancel will queue the cancelation
request. The call to pthread_kill forces the target thread to return from
whatever call it was in at which point it can exit in response to EINTR or
keep executing (in which case it will exit at the next cancelation point
anyway).
Either way, this doesn't strike me as a "pretty" solution.

Regards,
Iker

Alexander Terekhov

Dec 31, 2001, 9:18:27 AM

Iker Arizmendi wrote:
>
> "Alexander Terekhov" wrote:
> >
> > iker wrote:
> > [...]
> > > 1) Asynchronous cancelation: The general consensus in this newsgroup
> > > is that this is bad. The target thread will be left in an
> > > indeterminate state and, if it was doing anything even remotely
> > > interesting you'll likely experience resource leaks or crashes.
> >
> > This is true only if you use async.cancellation
> > in a way NOT defined by the standard (provoke
> > undefined behavior). To me, why bother with
> > pthread_testcancel() injecting it at some
> > point(s) inside some long computation loop
> > that does not invoke any async-cancel-unsafe
> > operations? Why not just have cancel type set
> > to PTHREAD_CANCEL_ASYNCHRONOUS inside such
> > async-cancel-safe region?!
>
> This leads to two additional situations:

Well, why not just treat the whole async-cancel-region
as a single (but rather fatty ;) deferred cancellation
point? I do not see any additional situations here (except
an obvious need for async-cancel-safety to form a region).
If you need some cleanup to release something and/or
restore/fix shared objects invariants just do it; set up
your cleanup prior to setting PTHREAD_CANCEL_ASYNCHRONOUS
cancel type on async-cancel-region entry and dismiss
your cleanup (optionally invoking it) at some point after
you have set cancel type back to PTHREAD_CANCEL_DEFERRED
on async-cancel-region exit.
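
The ordering described above might look roughly like this in C. This is a sketch only, with invented names (struct work, release_work, crunch, run_region_cleanup_demo) and a region body that is assumed to do nothing but plain memory writes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical names: struct work, release_work, crunch. */
struct work {
    unsigned char *buf;
    size_t len;
    volatile int done;      /* never set in this demo */
    int cleaned;
};

static void release_work(void *p)
{
    struct work *w = p;
    free(w->buf);
    w->buf = NULL;
    w->cleaned = 1;
}

static void *crunch(void *arg)
{
    struct work *w = arg;
    int old_type, ignored;

    w->buf = malloc(w->len);                 /* acquired while still deferred */
    if (w->buf == NULL)
        return NULL;

    pthread_cleanup_push(release_work, w);   /* 1. set up cleanup first */
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old_type);  /* 2. enter */

    /* Region body: plain memory writes, no async-cancel-unsafe calls. */
    while (!w->done)
        for (size_t i = 0; i < w->len; i++)
            w->buf[i]++;

    pthread_setcanceltype(old_type, &ignored);   /* 3. leave the region */
    pthread_cleanup_pop(1);                      /* 4. dismiss, running the handler */
    return NULL;
}

int run_region_cleanup_demo(void)
{
    pthread_t tid;
    void *result = NULL;
    struct work w = { NULL, 4096, 0, 0 };

    int rc = pthread_create(&tid, NULL, crunch, &w);
    assert(rc == 0);

    pthread_cancel(tid);
    pthread_join(tid, &result);
    return (result == PTHREAD_CANCELED && w.cleaned) ? 0 : -1;
}
```

The handler is pushed while the type is still deferred, so there is no window in which the buffer is owned but unprotected. It calls free() safely because the canceled thread can only have been interrupted inside the region, never inside the allocator.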

[...]

> David Butenhof has posted a couple of detailed descriptions of this issue
> and if I recall correctly only the Tru64 Unix team has made C++ behave as
> one would expect in the presence of threads (even though it seems like the
> right thing to do).

For example, IBM C++/PTHREAD impls for zSeries/MVS and
iSeries/OS400 platforms also destroy automatic objects on
thread exit/cancel, AFAIK. Not sure wrt pSeries/AIX, but
hey, even pthreads-*win32* has C++ exception based (VCE)
impl. option ;-)

regards,
alexander.

Iker Arizmendi

Jan 2, 2002, 12:16:33 AM

"Alexander Terekhov" wrote:

> Well, why not just treat the whole async-cancel-region
> as a single (but rather fatty ;) deferred cancellation
> point? I do not see any additional situations here (except
> an obvious need for async-cancel-safety to form a region).
> If you need some cleanup to release something and/or
> restore/fix shared objects invariants just do it; set up
> your cleanup prior to setting PTHREAD_CANCEL_ASYNCHRONOUS
> cancel type on async-cancel-region entry and dismiss
> your cleanup (optionally invoking it) at some point after
> you have set cancel type back to PTHREAD_CANCEL_DEFERRED
> on async-cancel-region exit.

I just chose to call the combination of deferred/asynchronous cancelation a
new "situation" to distinguish it from the strict use of one or the other
but I think we're on the same page nonetheless.

Cheers,
Iker