Issue with nested signals (signal while executing signal handler)

Ralf Pannemans

Jul 21, 2015, 9:16:56 AM
to dynamor...@googlegroups.com
Hi,

My application reacts to signals, and it can receive another signal while the handler for the first one is still executing. Without DR this works fine: the signal handler is called a second time as soon as the first invocation has finished.

But when I run the application under DR, it stops working. I found out that the first signal is propagated fine, but the second signal is not handled. The callback registered with dr_register_signal_event only receives the first signal. The second seems to be lost. Unfortunately, this causes a deadlock in my application.

Does anyone have an idea why the signal gets lost, or how I can handle such nested signals?

Best regards,
Ralf

Ralf Pannemans

Jul 27, 2015, 6:33:07 AM
to DynamoRIO Users, ralf.pa...@googlemail.com
Update:

I have reduced the problem to a minimal example. Running it under DynamoRIO results in one signal getting lost :( The code is rather simple: the signal handler just waits 1 s. Since two signals are sent to the thread, the second one should wait until the handler for the first has finished. This works fine without DynamoRIO. Any ideas what the problem could be?

Output without DynamoRIO:

before thread creation
signal 37 received
leaving signal handler after one second
signal 37 received
leaving signal handler after one second
good morning from thread

Output with DynamoRIO:

before thread creation
signal 37 received
leaving signal handler after one second
good morning from thread

Code:


#include <pthread.h>
#include <iostream>
#include <signal.h>
#include <unistd.h>

using std::cout;
using std::endl;

void* sayGoodMorningAfterFiveSeconds(void*)
{
    usleep(5000000);
    cout << "good morning from thread" << endl;
    return NULL;
}

static const int signalNumber = 37;

void registerSignalHandler(void (*handler)(int, siginfo_t*, void*))
{
    struct sigaction action;
    struct sigaction oldAction;
    action.sa_handler = NULL;
    action.sa_sigaction = handler;
    sigemptyset(&action.sa_mask);
    action.sa_flags = SA_ONSTACK | SA_SIGINFO;
    sigaction(signalNumber, &action, &oldAction);
}

void handleSignal(int signalNumber, siginfo_t* signalInfo, void* context)
{
    cout << "signal " << signalNumber << " received" << endl;
    usleep(1000000);
    cout << "leaving signal handler after one second" << endl;
}

int main()
{
    registerSignalHandler(handleSignal);

    pthread_t thread;

    cout << "before thread creation" << endl;

    pthread_create(&thread, NULL, sayGoodMorningAfterFiveSeconds, NULL);

    //send signal twice to thread, both should be processed
    pthread_kill(thread, signalNumber);
    pthread_kill(thread, signalNumber);

    pthread_join(thread, NULL);

    return 0;
}
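
The output of the test above has to be read by eye, so a self-checking variant may be easier to run many times in a row. The following is only a sketch and not part of the original report: the names countHandledSignals, countingHandler and waitForSignals are hypothetical, the handler delay is shortened to 100 ms, and it assumes a real-time signal number (37 is one on typical Linux/glibc systems), because only real-time signals are queued once per send.

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>
#include <signal.h>
#include <time.h>

// Number of times the handler has run. On common platforms
// std::atomic<int> is lock-free and may be modified from a signal handler.
static std::atomic<int> handledCount{0};

// Handler: count the delivery, then stay inside the handler for 100 ms so
// that a second signal can arrive while the first is still being handled.
static void countingHandler(int, siginfo_t*, void*)
{
    handledCount.fetch_add(1);
    timespec delay = {0, 100 * 1000 * 1000};
    // nanosleep is async-signal-safe (unlike iostream output).
    nanosleep(&delay, nullptr);
}

// Thread body: poll until the expected number of signals has been handled,
// giving up after roughly three seconds.
static void* waitForSignals(void* arg)
{
    const int expected = *static_cast<const int*>(arg);
    const timespec tick = {0, 10 * 1000 * 1000};
    for (int i = 0; i < 300 && handledCount.load() < expected; ++i)
        nanosleep(&tick, nullptr);
    return nullptr;
}

// Sends `sends` signals to a fresh thread and returns how many times the
// handler ran, or -1 if the setup failed.
int countHandledSignals(int signo, int sends)
{
    assert(sends > 0);
    handledCount.store(0);

    struct sigaction action = {};
    action.sa_sigaction = countingHandler;
    sigemptyset(&action.sa_mask);
    action.sa_flags = SA_SIGINFO;
    if (sigaction(signo, &action, nullptr) != 0)
        return -1;

    pthread_t thread;
    if (pthread_create(&thread, nullptr, waitForSignals, &sends) != 0)
        return -1;

    // Send the signals back to back; all of them should be processed.
    for (int i = 0; i < sends; ++i)
        pthread_kill(thread, signo);

    pthread_join(thread, nullptr);
    return handledCount.load();
}
```

A main() that calls countHandledSignals(SIGRTMIN + 3, 2) and exits non-zero unless the result is 2 gives a pass/fail result that can be looped in a shell script, with and without drrun. SIGRTMIN + 3 is used here instead of a hard-coded 37 because the first usable real-time signal number may differ between systems.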

Derek Bruening

Jul 30, 2015, 12:29:12 AM
to dynamor...@googlegroups.com, ralf.pa...@googlemail.com


Ralf Pannemans

Jul 31, 2015, 8:39:19 AM
to DynamoRIO Users, ralf.pa...@googlemail.com, brue...@google.com
Hi Derek,

Thanks a lot. Unfortunately, something is really strange.
On my machine, I cannot use the build because it needs a different version of libc, so I have just built the latest version from git. But that does not seem to work with the test: no signals are received at all. A colleague tested the build on his machine and got the same result. No signal 37 received :(

Could you do me a favor and paste the output of the test when started with drrun? Did you receive 2 signals? Unfortunately the test is not really self-validating; its output needs some interpretation. See my previous reply.


Your help is much appreciated.

Thanks,
Ralf

Ralf Pannemans

Jul 31, 2015, 9:40:52 AM
to DynamoRIO Users, ralf.pa...@googlemail.com, brue...@google.com
Update: There was an error on my side: I used a wrong value for vm_size. Interestingly enough, that did not cause a problem before, but with the latest version, or even with just the bugfix applied, it DOES make a difference. Using a smaller vm_size (or none at all) shows that the problem is really fixed.

A little BUT: from time to time a signal still gets lost, though no longer on every run as before.

Thanks a lot,
Ralf

Derek Bruening

Jul 31, 2015, 2:23:43 PM
to Ralf Pannemans, DynamoRIO Users
On Fri, Jul 31, 2015 at 8:39 AM, Ralf Pannemans <ralf.pa...@googlemail.com> wrote:
> On my machine, I cannot use the build due to another version of libc.

This is https://github.com/DynamoRIO/dynamorio/issues/1504 which we hope to address.

Derek Bruening

Jul 31, 2015, 3:18:17 PM
to Ralf Pannemans, DynamoRIO Users
On Fri, Jul 31, 2015 at 9:40 AM, Ralf Pannemans <ralf.pa...@googlemail.com> wrote:
> Update: There was an error on my side: I used a wrong value for vm_size. Interestingly enough, that did not cause a problem before, but with the latest version, or even with just the bugfix applied, it DOES make a difference. Using a smaller vm_size (or none at all) shows that the problem is really fixed.

You mean that you passed a reasonable value to -vm_size and something failed?  Can you file an issue on this with details (including your reasons for specifying a -vm_size)?

> A little BUT: from time to time a signal still gets lost, though no longer on every run as before.

Can you file an issue with an app that reproduces that behavior?  I assume it's something more complex than the one you emailed earlier, as for that earlier one I do not see any loss in 100 runs.

Ralf Pannemans

Aug 6, 2015, 10:29:11 AM
to DynamoRIO Users, ralf.pa...@googlemail.com
Done: https://github.com/DynamoRIO/dynamorio/issues/1759

In our tests it happened 1.4% of the time, so 100 OK runs in a row are not that unlikely.

The reason for -vm_size is that I run into OOM issues if it is not provided.
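
The estimate that 100 clean runs are not unlikely at a 1.4% loss rate can be checked with a little arithmetic. The following is a sketch that assumes every run fails independently with the same probability (an assumption that is not stated in the thread); the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Probability that `runs` independent runs all pass when each run fails
// with probability `failureRate`.
double probabilityAllRunsPass(double failureRate, int runs)
{
    assert(failureRate >= 0.0 && failureRate <= 1.0 && runs >= 0);
    return std::pow(1.0 - failureRate, runs);
}

// Smallest number of runs for which at least one failure shows up with
// probability `confidence` or higher.
int runsNeededToSeeFailure(double failureRate, double confidence)
{
    assert(failureRate > 0.0 && failureRate < 1.0);
    assert(confidence > 0.0 && confidence < 1.0);
    return static_cast<int>(
        std::ceil(std::log(1.0 - confidence) / std::log(1.0 - failureRate)));
}
```

With a failure rate of 0.014 and 100 runs, probabilityAllRunsPass gives about 0.24, so roughly one batch of 100 runs in four would show no lost signal at all. Under the same assumption, runsNeededToSeeFailure(0.014, 0.95) gives 213 runs to see at least one loss with 95% probability.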