TRANSMIT THREAD

for(;;)
    pthread_mutex_lock(&t_mutex);
    while(awnd == 0)
        pthread_cond_wait(&t_cond, &t_mutex);
    awnd--;
    //Do some message processing here
    pthread_mutex_unlock(&t_mutex);
    send message over the network

RECEIVE THREAD

for(;;)
    alarm(timeout)
    receive messages from network
    if( message has been received)
        turn off the timeout
    if(alarm has gone off, timeout expired)
        goto the location of timeout processing
    pthread_mutex_lock(&t_mutex);
    process message received
    increase awnd by some integer value
Location for timeout processing
    awnd = 1
    pthread_cond_signal(&t_cond);
    pthread_mutex_unlock(&t_mutex);

Try using PTHREAD_SCOPE_SYSTEM as an attribute when you create the threads so that when one
thread blocks it doesn't block all the threads.
Also, don't use alarm(). Unix signals were problematic to begin with. They are even more so
with threads. There's no really clean way to deal with signals with pthreads. I don't waste
time trying to use them myself. Instead you should try to use poll() or select() to do timed
waits for i/o, and pthread_cond_timedwait() for other timed wait situations.
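
To make the poll() suggestion concrete, here is a minimal sketch of a timed receive that needs no alarm() at all. The helper name recv_with_timeout and its -2 "timed out" return code are made up for illustration, and read() is used only for simplicity; a socket would normally use recv().

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable,
 * then read from it.
 * Returns the number of bytes read, 0 on end of file, -1 on error,
 * or -2 if nothing arrived before the timeout. */
ssize_t recv_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a stray signal interrupts the wait. Note that this simple
     * loop restarts the full timeout after each interruption. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc == -1 && errno == EINTR);

    if (rc == 0)
        return -2;              /* timeout expired, no data */
    if (rc == -1)
        return -1;              /* poll() itself failed */

    return read(fd, buf, len);  /* fd is readable (or at EOF / in error) */
}
```

The receive loop can then treat a -2 return as its timeout case and do the timeout processing in ordinary thread context, where locking a mutex is fine.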
IMO, you should never be using signals unless there's some reason you can't avoid it, e.g.
you are stuck with a legacy application or somebody screwed up big time on an api. Take
for instance, POSIX asynchronous i/o. What were they thinking? It's nearly impossible for
a signal handler and threads to cleanly interact.
Joe Seigh
> Prarthana Kukkalli wrote:
> >
> > The receive thread just signals and releases the lock after one loop of
> > execution. I have a timer with UNIX signal being generated in the case of no
> > message received from the network because I am using blocking I/O. The
> > design works fine as long as messages are received from the network. But if
> > the timer does expire, what happens is that the receive thread does go to
> > the correct location and update awnd = 1. However, when the receive thread
> > signals, the transmit thread does not wake up in time and the receive thread
> > just keeps setting the timer and causing timeouts.
>
> Try using PTHREAD_SCOPE_SYSTEM as an attribute when you create the threads so that when one
> thread blocks it doesn't block all the threads.
This MIGHT help (though I doubt it) if you're working with PCS (process contention scope)
threads on Solaris or AIX. It certainly won't help if you're using LinuxThreads, because all
of its threads are SCS (system contention scope), and it won't help if you're on a system with
real 2-level scheduling.
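
For anyone who wants to try the contention scope attribute anyway, a rough sketch of how it is requested follows. The names create_system_scope_thread and example_worker are invented for this example, and whether the attribute changes anything depends on the platform, as described above.

```c
#include <pthread.h>

/* Hypothetical thread body, used only to demonstrate the helper below. */
static void *example_worker(void *arg)
{
    return arg;
}

/* Hypothetical helper: create a thread with system contention scope.
 * Returns 0 on success, or the error number from the failing pthread call
 * (for example ENOTSUP where the implementation rejects the scope). */
int create_system_scope_thread(pthread_t *tid, void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc == 0)
        rc = pthread_create(tid, &attr, fn, arg);

    /* The attribute object is no longer needed once the thread exists. */
    pthread_attr_destroy(&attr);
    return rc;
}
```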
> Also, don't use alarm(). Unix signals were problematic to begin with. They are even more so
> with threads. There's no really clean way to deal with signals with pthreads. I don't waste
> time trying to use them myself. Instead you should try to use poll() or select() to do timed
> waits for i/o, and pthread_cond_timedwait() for other timed wait situations.
This is certainly true. Even worse, although it's hard to completely decode the fragmented
example code, it appears that the code may be locking a mutex and signalling a condition
variable inside the alarm handler... and that's both illegal and extremely dangerous.
Furthermore, an alarm() signal is a PROCESS signal, not a THREAD signal, and will attack some
arbitrary thread in the process. Therefore, it may not interrupt the extended blocking I/O
operation. (It's possible that this is what's happened to you: while your "bookkeeping" signal
code ran in another thread, the receive thread remained blocked in the original I/O until some
data arrived.)
> IMO, you should never be using signals unless there's some reason you can't avoid it, e.g.
> you are stuck with a legacy application or somebody screwed up big time on an api. Take
> for instance, POSIX asynchronous i/o. What were they thinking? It's nearly impossible for
> a signal handler and threads to cleanly interact.
POSIX asynchronous I/O mostly predates the addition of threads to POSIX. You don't really need
to use signals, though the addition of "completion ports" might have made it all easier to work
with. (E.g., Win32, Solaris 8 AIO.)
In general, yes, one should strive to keep threads and signals apart. Signals, unfortunately,
tend to have the mistaken impression that they're every bit as good as threads. They are
therefore tempted to perform feats of which they are not capable, because they have no
independent execution context. They inevitably end up falling down in the attempt, and often
take your application with them.
/------------------[ David.B...@compaq.com ]------------------\
| Compaq Computer Corporation POSIX Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\-----[ http://home.earthlink.net/~anneart/family/dave.html ]-----/
I noticed that there are two functions, sigwaitinfo() and sigtimedwait() that are part of the
POSIX realtime spec. These act like "completion ports" for signals, i.e. you call them from
a user thread. On Solaris 2.6, to get it to work, I had to block the signals and set up a
sigaction w/ SA_SIGINFO set even though I had the signals blocked.
It's kind of moot using it for Solaris POSIX aio since it appears that it's really Solaris aio
under the covers, which then uses sigqueue() to send a signal, which then gets trapped by the
pthread libraries so they can direct the signal to a specific thread in case you have thread
signal masks. The latter part will probably go away in a future release according to the man pages.
So it's a little redundant.
Anyway, if you had to deal with signals for one reason or another, I guess this would make them
less problematic.
Joe Seigh
> Dave Butenhof wrote:
> >
> > Joe Seigh wrote:
> >
> ...
> > > IMO, you should never be using signals unless there's some reason you can't avoid it, e.g.
> > > you are stuck with a legacy application or somebody screwed up big time on an api. Take
> > > for instance, POSIX asynchronous i/o. What were they thinking? It's nearly impossible for
> > > a signal handler and threads to cleanly interact.
> >
> > POSIX asynchronous I/O mostly predates the addition of threads to POSIX. You don't really need
> > to use signals, though the addition of "completion ports" might have made it all easier to work
> > with. (E.g., Win32, Solaris 8 AIO.)
> ...
>
> I noticed that there are two functions, sigwaitinfo() and sigtimedwait() that are part of the
> POSIX realtime spec. These act like "completion ports" for signals, i.e. you call them from
> a user thread. On Solaris 2.6, to get it to work, I had to block the signals and set up a
> sigaction w/ SA_SIGINFO set even though I had the signals blocked.
Now THAT'S a stretch! But, yeah, there is a loose analogy.
These functions are intended to avoid some of the worst problems of signals, by allowing you to
SYNCHRONOUSLY receive ASYNCHRONOUS (process directed) signals. The standard requires that you block
the signals for which you wait in the calling thread. In fact, you won't get reliable behavior unless
the signals are blocked in ALL threads. The standard doesn't require that the implementation prefer a
sigwait-er over another thread that doesn't have the signal blocked (though such preference is "the
rational thing to do"). In any case, there would be windows while the sigwaiter isn't waiting on a
signal (either before you call it, or after it returns to report a signal) when an incoming signal
might legitimately target another thread. Solaris also (unreasonably, in my opinion, though the
standard doesn't prohibit such irrationality) requires that you set the signal action (at least to be
sure the signal isn't being ignored).
This is NOT quite the same as a completion port, but with care it can be used in a manner that will
gain you similar benefits.
> It's kind of moot using it for Solaris POSIX aio since it appears that it's really Solaris aio
> under the covers, which then uses sigqueue() to send a signal, which then gets trapped by the
> pthread libraries so they can direct the signal to a specific thread in case you have thread
> signal masks. The latter part will probably go away in a future release according to the man pages.
POSIX unfortunately lacks a pthread_sigqueue() analogous to pthread_kill(). I really keep forgetting
to do anything about this. (Someone ought to file a POSIX interpretation request to get it on the
agenda for a future standard update; but since the 2001 revision is rather far along, I suspect it'll
be a while.)
> Anyway, if you had to deal with signals for one reason or another, I guess this would make them
> less problematic.
In situations where you really must deal with asynchronous (process-directed) UNIX signals in a
threaded application, you should always assign a thread to block on a sigwait* function rather than
actually taking the signal asynchronously. (The only exception would be if the signal action was
extremely quick and simple, and didn't need to make any external calls.)