
Suspending threads in Linux


Alan Cabrera

Jun 13, 2003, 12:39:14 AM
How can I suspend and resume threads in Linux? Do I have to do a
pthread_kill(SIGSTOP/SIGCONT)? I think that I read somewhere that this
feature might not be supported in the future.


Regards,
Alan


David Schwartz

Jun 13, 2003, 2:44:45 AM

"Alan Cabrera" <mag...@spam.users.sf.net> wrote in message
news:bcbkdq$iea$1...@reader1.panix.com...

> How can I suspend and resume threads in Linux? Do I have to do a
> pthread_kill(SIGSTOP/SIGCONT)? I think that I read somewhere that this
> feature might not be supported in the future.

The answer is really this simple:

With the cooperation of the thread you want to stop/continue, you can do
it any way you want. Set a flag that the thread checks periodically. Or do it
some other way.

Without the cooperation of the thread you want to stop/continue, it's
impossible. What if that thread holds a critical mutex such as the one that
protects malloc/free? There's no guarantee the thread that stopped the
thread could ever make sufficient forward progress to ever restart it.

DS
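
A minimal sketch of the "flag that the thread checks periodically" idea, assuming C11 atomics and a worker that polls at a point where it holds no locks. All names (pause_requested, paused, run_flag_demo) are hypothetical and are not from the post.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical names; the post only says "set a flag". */
static atomic_bool pause_requested = false;  /* written by the controller */
static atomic_bool paused = false;           /* written by the worker once parked */
static atomic_bool quit = false;
static atomic_long work_done = 0;

static void nap(void) {
    struct timespec ts = {0, 1000000};       /* 1 ms */
    nanosleep(&ts, NULL);
}

static void *worker(void *arg) {
    (void)arg;
    while (!atomic_load(&quit)) {
        /* Safe point: the worker holds no locks here, so parking cannot
           leave a mutex stranded. */
        if (atomic_load(&pause_requested)) {
            atomic_store(&paused, true);
            while (atomic_load(&pause_requested) && !atomic_load(&quit))
                nap();
            atomic_store(&paused, false);
        }
        atomic_fetch_add(&work_done, 1);     /* one unit of "work" */
        nap();
    }
    return NULL;
}

/* Returns 0 if the worker made no progress while paused and then resumed. */
int run_flag_demo(void) {
    pthread_t t;
    if (pthread_create(&t, NULL, worker, NULL) != 0) return -1;

    atomic_store(&pause_requested, true);
    while (!atomic_load(&paused)) nap();     /* wait for the worker to park */

    long before = atomic_load(&work_done);
    for (int i = 0; i < 20; i++) nap();
    long during = atomic_load(&work_done);

    atomic_store(&pause_requested, false);   /* resume */
    while (atomic_load(&work_done) == during) nap();

    atomic_store(&quit, true);
    pthread_join(t, NULL);
    assert(!atomic_load(&paused));           /* worker cleared it before exiting */
    return during == before ? 0 : 1;
}
```

Polling with a short sleep keeps the sketch small; it trades latency for simplicity. Call run_flag_demo() from main and build with -pthread.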


Alan Cabrera

Jun 13, 2003, 8:29:43 AM

"David Schwartz" <dav...@webmaster.com> wrote in message

>
> "Alan Cabrera" <mag...@spam.users.sf.net> wrote in message
>
> > How can I suspend and resume threads in Linux?
>
> Without the cooperation of the thread you want to stop/continue, it's
> impossible. What if that thread holds a critical mutex such as the one
> that protects malloc/free? There's no guarantee the thread that stopped
> the thread could ever make sufficient forward progress to ever restart it.

I'm porting a Virtual Machine to Linux. Modifying it to check some thread
is out of the question. The code uses suspend/resume for W2K, Mac OSX, and
Solaris. How is it that these operating systems provide these functions if
it's so impossible?


Regards,
Alan

Måns Rullgård

Jun 13, 2003, 8:57:39 AM
"Alan Cabrera" <mag...@spam.users.sf.net> writes:

> > > How can I suspend and resume threads in Linux?
> >
> > Without the cooperation of the thread you want to stop/continue, it's
> > impossible. What if that thread holds a critical mutex such as the one
> > that protects malloc/free? There's no guarantee the thread that stopped
> > the thread could ever make sufficient forward progress to ever restart it.
>
> I'm porting a Virtual Machine to Linux. Modifying it to check some thread
> is out of the question. The code uses suspend/resume for W2K, Mac OSX, and
> Solaris. How is it that these operating systems provide these functions if
> it's so impossible?

It's not impossible. It's just unsafe.

--
Måns Rullgård
m...@users.sf.net

David Schwartz

Jun 13, 2003, 2:21:28 PM

"Måns Rullgård" <m...@users.sourceforge.net> wrote in message
news:yw1xd6hi...@zaphod.guide...
> "Alan Cabrera" <mag...@spam.users.sf.net> writes:

> > > > How can I suspend and resume threads in Linux?

> It's not impossible. It's just unsafe.

It is impossible, assuming "How can I suspend and resume threads" means
"How can I suspend a thread such that I can resume it" rather than "How can
I suspend a thread such that I might be able to resume it if I'm lucky".

DS


Alan Cabrera

Jun 14, 2003, 12:07:22 PM

"David Schwartz" <dav...@webmaster.com> wrote in message news:bcd4n8

>
> "Måns Rullgård" <m...@users.sourceforge.net> wrote in message
> > "Alan Cabrera" <mag...@spam.users.sf.net> writes:
>
> > > > > How can I suspend and resume threads in Linux?
>
> > It's not impossible. It's just unsafe.
>
> It is impossible, assuming "How can I suspend and resume threads" means
> "How can I suspend a thread such that I can resume it" rather than "How
> can I suspend a thread such that I might be able to resume it if I'm lucky".

The program that I'm porting uses suspend/resume for W2K, Mac OSX, and
Solaris. How is it that these operating systems provide these functions if
it's so impossible? Is this just a limitation of Linux?


Regards,
Alan

Måns Rullgård

Jun 14, 2003, 2:01:13 PM
"Alan Cabrera" <mag...@spam.users.sf.net> writes:

> > > > > > How can I suspend and resume threads in Linux?
> >
> > > It's not impossible. It's just unsafe.
> >
> > It is impossible, assuming "How can I suspend and resume threads"
> > means "How can I suspend a thread such that I can resume it"
> > rather than "How can I suspend a thread such that I might be able
> > to resume it if I'm lucky".
>
> The program that I'm porting uses suspend/resume for W2K, Mac OSX, and
> Solaris. How is it that these operating systems provide these functions if
> it's so impossible? Is this just a limitation of Linux?

Do they do it in a safe way? Imagine this situation:

thread 1            thread 2
------------------------------------------
lock mutex
                    suspend thread 1
                    lock mutex
<suspended>         <waiting for thread 1>

How will you get out of this?

--
Måns Rullgård
m...@users.sf.net
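
The scenario above can be reproduced without actually hanging. This is a sketch under stated assumptions: the "suspension" of thread 1 is simulated by blocking it on a semaphore while it holds the mutex, and thread 2 uses pthread_mutex_timedlock (available in glibc, not on every platform) so that the attempt gives up instead of waiting forever. The names holding, resume_sem and run_deadlock_demo are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <time.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static sem_t holding;     /* thread 1 -> thread 2: "I hold the mutex now" */
static sem_t resume_sem;  /* thread 2 -> thread 1: stands in for "resume" */

static void wait_on(sem_t *s) {
    while (sem_wait(s) == -1 && errno == EINTR) { /* retry if interrupted */ }
}

static void *thread1(void *arg) {
    (void)arg;
    pthread_mutex_lock(&m);
    sem_post(&holding);
    wait_on(&resume_sem);          /* simulated suspension with the mutex held */
    pthread_mutex_unlock(&m);
    return NULL;
}

/* Plays the role of thread 2. Returns the result of its lock attempt. */
int run_deadlock_demo(void) {
    pthread_t t1;
    sem_init(&holding, 0, 0);
    sem_init(&resume_sem, 0, 0);
    if (pthread_create(&t1, NULL, thread1, NULL) != 0) return -1;
    wait_on(&holding);             /* thread 1 is now "suspended" holding m */

    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_nsec += 200000000L;               /* give up after 200 ms */
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    /* A plain pthread_mutex_lock here would never return: the only thread
       that can resume thread 1 would itself be waiting on thread 1. */
    int rc = pthread_mutex_timedlock(&m, &deadline);

    sem_post(&resume_sem);         /* we can still resume only because we timed out */
    pthread_join(t1, NULL);

    int free_again = pthread_mutex_trylock(&m);   /* thread 1 has released it */
    if (free_again == 0) pthread_mutex_unlock(&m);
    assert(free_again == 0);

    sem_destroy(&holding);
    sem_destroy(&resume_sem);
    return rc;
}
```

With a real suspend primitive, thread 2 has no control over where thread 1 stops, so the mutex in question could just as well be one inside the C library.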

Paul Pluzhnikov

Jun 14, 2003, 4:05:17 PM
"Alan Cabrera" <mag...@spam.users.sf.net> writes:

> The program that I'm porting uses suspend/resume

Too bad. Your program is most likely buggy, as it is almost
impossible to use suspend/resume *safely* ...

Your better choice is to recode it *not* to use suspend.

> for W2K, Mac OSX, and Solaris.
> How is it that these operating systems provide these functions if
> it's so impossible? Is this just a limitation of Linux?

Linux provides it too, just use pthread_kill(tid, SIGSTOP).

It's not that it is impossible for the OS to provide this
functionality, it's just that it is impossible to safely use this
functionality in an application (which is why it was *left out*
from the POSIX thread standard).

Search comp.programming.threads for "thread suspend" for some
expert opinions.

If you think you know what you are doing, go ahead.
But your app will deadlock sooner or later, and probably at the
most inopportune moment ;-(

Cheers,
--
In order to understand recursion you must first understand recursion.

David Schwartz

Jun 14, 2003, 7:39:21 PM

"Alan Cabrera" <mag...@spam.users.sf.net> wrote in message
news:bcfh44$qet$1...@reader1.panix.com...

> The program that I'm porting uses suspend/resume for W2K, Mac OSX, and
> Solaris. How is it that these operating systems provide these functions
> if it's so impossible? Is this just a limitation of Linux?

Let me put it another way:

You're in luck. Linux doesn't limit you to any one suspend/resume
strategy. You can code any suspend/resume strategy you want. Linux provides
all the low-level functions you need. A thread can suspend itself with
pthread_cond_wait, a thread can resume another thread with
pthread_cond_signal, and threads can communicate through the memory they
share, with mutexes to protect it.

So do it however you want to. Just make sure you understand what you
want.

DS
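
One way the building blocks named above could fit together, as a hedged sketch rather than a definitive design. The worker suspends itself in pthread_cond_wait at a checkpoint it chooses, and the controller resumes it by clearing a shared flag under the mutex. The gate_t type and the gate_* functions are hypothetical names. The sketch uses pthread_cond_broadcast instead of pthread_cond_signal because, as an assumption of this particular layout, the controller and the worker wait on the same condition variable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical names; the post only names the pthread_cond_* calls. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    bool suspend_requested;  /* set and cleared by the controller */
    bool parked;             /* true while the worker is waiting */
    bool stop;
    long ticks;              /* units of work done, protected by lock */
} gate_t;

/* The worker calls this only where it holds no other locks. */
static void gate_checkpoint(gate_t *g) {
    pthread_mutex_lock(&g->lock);
    if (g->suspend_requested) {
        g->parked = true;
        pthread_cond_broadcast(&g->changed);       /* tell the controller */
        while (g->suspend_requested)
            pthread_cond_wait(&g->changed, &g->lock);
        g->parked = false;
    }
    pthread_mutex_unlock(&g->lock);
}

/* Returns once the worker is parked at a checkpoint. */
static void gate_suspend(gate_t *g) {
    pthread_mutex_lock(&g->lock);
    g->suspend_requested = true;
    while (!g->parked)
        pthread_cond_wait(&g->changed, &g->lock);
    pthread_mutex_unlock(&g->lock);
}

static void gate_resume(gate_t *g) {
    pthread_mutex_lock(&g->lock);
    g->suspend_requested = false;
    pthread_cond_broadcast(&g->changed);
    pthread_mutex_unlock(&g->lock);
}

static void nap_ms(long ms) {
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

static void *worker(void *arg) {
    gate_t *g = arg;
    for (;;) {
        gate_checkpoint(g);
        pthread_mutex_lock(&g->lock);
        bool stop = g->stop;
        g->ticks++;
        pthread_mutex_unlock(&g->lock);
        if (stop) return NULL;
        nap_ms(1);
    }
}

static long read_ticks(gate_t *g) {
    pthread_mutex_lock(&g->lock);
    long n = g->ticks;
    pthread_mutex_unlock(&g->lock);
    return n;
}

/* Returns 0 if the worker stood still while suspended and ran again after. */
int run_gate_demo(void) {
    gate_t g = { .suspend_requested = false };     /* remaining fields zeroed */
    pthread_t t;
    pthread_mutex_init(&g.lock, NULL);
    pthread_cond_init(&g.changed, NULL);
    if (pthread_create(&t, NULL, worker, &g) != 0) return -1;

    gate_suspend(&g);
    long before = read_ticks(&g);
    nap_ms(50);
    long during = read_ticks(&g);
    gate_resume(&g);

    while (read_ticks(&g) == during) nap_ms(1);    /* worker is running again */
    pthread_mutex_lock(&g.lock);
    g.stop = true;
    pthread_mutex_unlock(&g.lock);
    pthread_join(t, NULL);
    assert(!g.parked);
    pthread_cond_destroy(&g.changed);
    pthread_mutex_destroy(&g.lock);
    return during == before ? 0 : 1;
}
```

Because the worker only ever stops inside gate_checkpoint, it cannot be holding the malloc lock or any other mutex at that moment, which is the property an asynchronous suspend cannot give.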


David Schwartz

Jun 14, 2003, 7:37:30 PM

"Alan Cabrera" <mag...@spam.users.sf.net> wrote in message
news:bcfh44$qet$1...@reader1.panix.com...

> The program that I'm porting uses suspend/resume for W2K, Mac OSX, and
> Solaris. How is it that these operating systems provide these functions
> if it's so impossible? Is this just a limitation of Linux?

I don't know, you tell me. What happens if the thread you're suspending
holds the memory allocation mutex and the thread that tries to resume the
suspended thread needs to allocate some memory?

DS


Arnold Hendriks

Jun 16, 2003, 6:25:49 PM
Paul Pluzhnikov <ppluz...@earthlink.net> wrote:
> Linux provides it too, just use pthread_kill(tid, SIGSTOP).

> It's not that it is impossible for the OS to provide this
> functionality, it just that it is impossible to safely use this
> functionality in an application (which is why it was *left out*
> from POSIX thread standard).

It shouldn't be impossible to use it safely. E.g., if the OS at least ensures
that all suspends happen in user mode, then you would be safe if you stick
to syscalls (or perhaps async-signal-safe functions) while a thread is
suspended, wouldn't you?

(are such guarantees about signal behaviour documented anywhere for Linux?)

Not that it would be useful for anything but low-level stuff such as garbage
collectors. I've only found a use for thread suspension so far on Win32, to
get around limitations in its memory mapping code.

--
Arnold Hendriks <a.hen...@b-lex.com>
B-Lex Information Technologies, http://www.b-lex.com/

David Schwartz

Jun 16, 2003, 8:12:28 PM

"Arnold Hendriks" <a.hen...@b-lex.com> wrote in message
news:bclg5d$r2s$1...@news.btcnet.nl...

> Paul Pluzhnikov <ppluz...@earthlink.net> wrote:
> > Linux provides it too, just use pthread_kill(tid, SIGSTOP).

> > It's not that it is impossible for the OS to provide this
> > functionality, it just that it is impossible to safely use this
> > functionality in an application (which is why it was *left out*
> > from POSIX thread standard).

> It shouldn't be impossible to use it safely - eg, if the OS at least
> ensures that all suspends happen in user mode, then you would be safe if
> you stick to syscalls (or perhaps, async-signal-safe functions) during a
> suspension only, wouldn't it?

Well, the problem is that you have no way (in principle) to know what's
a syscall and what isn't. However, limiting to async-signal-safe functions
should work. The problem is, it's hard to see how the thread resume function
could be made async-signal-safe. Functions that (might) access the table of
threads are generally not async-signal-safe. And if you can't resume the
thread, you're basically screwed.

DS


Arnold Hendriks

Jun 17, 2003, 4:55:08 AM
David Schwartz <dav...@webmaster.com> wrote:
> "Arnold Hendriks" <a.hen...@b-lex.com> wrote in message
> news:bclg5d$r2s$1...@news.btcnet.nl...

>> It shouldn't be impossible to use it safely - eg, if the OS at least
>> ensures that all suspends happen in user mode, then you would be safe if
>> you stick to syscalls (or perhaps, async-signal-safe functions) during a
>> suspension only, wouldn't it?

> Well, the problem is that you have no way (in principle) to know what's
> a syscall and what isn't. However, limiting to async-signal-safe functions
> should work. The problem is, it's hard to see how the thread resume function
> could be made async-signal-safe. Functions that (might) access the table of
> threads are generally not async-signal-safe. And if you can't resume the
> thread, you're basically screwed.

You could maintain your own thread list, mutex-protect it, and acquire that
mutex before you start the actual suspending of threads. Then you only need
an async-signal-safe call to suspend and resume every single thread in that
list. I'd expect it to be possible on LinuxThreads, but I have no clue how
the newer threading libraries work.
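
A hedged sketch of the scheme described above: a private, mutex-protected thread list, with the mutex taken before any thread is stopped and only async-signal-safe calls used until everything is resumed. Several things here are assumptions and not from the post: SIGUSR1/SIGUSR2 are used as the suspend and resume signals (on current Linux threading, stop signals such as SIGSTOP act on the whole process, so they are not used), each thread parks in sigsuspend inside its handler, and all function and variable names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

#define SIG_SUSPEND SIGUSR1   /* assumption: the application leaves these free */
#define SIG_RESUME  SIGUSR2
#define MAX_THREADS 8

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_t thread_list[MAX_THREADS];  /* our own list, not the library's */
static int thread_count;
static atomic_int parked_count;  /* assumed lock-free, so usable in a handler */
static atomic_long ticks;
static atomic_bool stop;

static void on_resume(int sig) { (void)sig; }  /* only has to wake sigsuspend */

static void on_suspend(int sig) {
    (void)sig;
    int saved_errno = errno;
    sigset_t mask;
    sigfillset(&mask);
    sigdelset(&mask, SIG_RESUME);
    atomic_fetch_add(&parked_count, 1);   /* acknowledge to the controller */
    sigsuspend(&mask);                    /* sleep until SIG_RESUME arrives */
    errno = saved_errno;
}

static void nap_ms(long ms) {
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

static void *worker(void *arg) {
    (void)arg;
    while (!atomic_load(&stop)) { atomic_fetch_add(&ticks, 1); nap_ms(1); }
    return NULL;
}

/* Caller must not be in thread_list. list_lock stays held until resume_all,
   so no suspended thread can be the one holding it. */
static void suspend_all(void) {
    pthread_mutex_lock(&list_lock);
    atomic_store(&parked_count, 0);
    for (int i = 0; i < thread_count; i++)
        pthread_kill(thread_list[i], SIG_SUSPEND);
    while (atomic_load(&parked_count) < thread_count) nap_ms(1);
}

static void resume_all(void) {
    for (int i = 0; i < thread_count; i++)
        pthread_kill(thread_list[i], SIG_RESUME);
    pthread_mutex_unlock(&list_lock);
}

/* Returns 0 if no worker made progress between suspend_all and resume_all. */
int run_signal_demo(void) {
    struct sigaction sa = {0};
    sigemptyset(&sa.sa_mask);
    sigaddset(&sa.sa_mask, SIG_RESUME);   /* keep an early resume pending */
    sa.sa_handler = on_suspend;
    sigaction(SIG_SUSPEND, &sa, NULL);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_resume;
    sigaction(SIG_RESUME, &sa, NULL);

    pthread_mutex_lock(&list_lock);
    for (thread_count = 0; thread_count < 2; thread_count++)
        pthread_create(&thread_list[thread_count], NULL, worker, NULL);
    pthread_mutex_unlock(&list_lock);

    suspend_all();
    long before = atomic_load(&ticks);
    nap_ms(50);                           /* nanosleep is async-signal-safe */
    long during = atomic_load(&ticks);
    resume_all();

    while (atomic_load(&ticks) == during) nap_ms(1);
    atomic_store(&stop, true);
    for (int i = 0; i < thread_count; i++) pthread_join(thread_list[i], NULL);
    assert(atomic_load(&parked_count) == thread_count);
    return during == before ? 0 : 1;
}
```

The sketch does not remove the hazard raised earlier in the thread: a worker can still be stopped while it holds the malloc lock, so the controller is only safe if it avoids everything except async-signal-safe calls between suspend_all and resume_all.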
