Ah... I think I may have just been able to reproduce the issue. I wasn't testing with nearly enough data to reproduce it reliably.
I did get this segfault, though (I was able to catch the hanging issue separately in the debugger). I'm not sure whether it's related, since that signal shouldn't be modified at all during execution.
140737323644672 - 20:16:29: Dispatch Status: 1 Hints: DISPATCH_DATA_REMAINS=0
140737323644672 - 20:16:29: Dispatch Status: 1 Hints: DISPATCH_DATA_REMAINS=0
140737315251968 - 20:16:29: Connection::on_remove_timeout_callback enabled:1 interval: 25000
140737315251968 - 20:16:29: Dispatcher::on_remove_timeout enabled:1 interval: 25000
140737315251968 - 20:16:29: Dispatcher::on_wakeup_main
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff5aea700 (LWP 13411)]
0x00007ffff7b93a50 in emit (
_A_a1=@0x7ffff5ae9cfc: DBus::DISPATCH_DATA_REMAINS, impl=0x60c970)
at /usr/include/sigc++-2.0/sigc++/signal.h:1006
1006 for (iterator_type it = slots.begin(); it != slots.end(); ++it)
(gdb) bt
#0 0x00007ffff7b93a50 in emit (
_A_a1=@0x7ffff5ae9cfc: DBus::DISPATCH_DATA_REMAINS, impl=0x60c970)
at /usr/include/sigc++-2.0/sigc++/signal.h:1006
#1 emit (this=<optimized out>,
_A_a1=@0x7ffff5ae9cfc: DBus::DISPATCH_DATA_REMAINS)
at /usr/include/sigc++-2.0/sigc++/signal.h:2801
#2 DBus::Connection::on_dispatch_status_callback (connection=<optimized out>,
new_status=<optimized out>, data=<optimized out>) at connection.cpp:936
#3 0x00007ffff792853e in ?? () from /lib/x86_64-linux-gnu/libdbus-1.so.3
#4 0x00007ffff7928676 in ?? () from /lib/x86_64-linux-gnu/libdbus-1.so.3
#5 0x00007ffff7940d2a in dbus_watch_handle ()
from /lib/x86_64-linux-gnu/libdbus-1.so.3
#6 0x00007ffff7bc4cbe in DBus::Watch::handle (this=this@entry=0x609e50,
flags=flags@entry=1) at watch.cpp:87
#7 0x00007ffff7bc4d2b in DBus::Watch::handle_read (this=this@entry=0x609e50,
error=error@entry=false, hangup=hangup@entry=false) at watch.cpp:95
#8 0x00007ffff7b9dac3 in DBus::Dispatcher::watch_thread_main (this=0x608c60)
at dispatcher.cpp:341
#9 0x00007ffff74bf970 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00007ffff67090a4 in start_thread (arg=0x7ffff5aea700)
at pthread_create.c:309
#11 0x00007ffff6c2f87d in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
If the segfault is related, that at least gives us a point at which to start investigating.
Note that the above segfault was in the sending application. I also got this segfault in the receiving application:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff62eb700 (LWP 13572)]
0x00007ffff7b93a59 in empty (this=<optimized out>)
at /usr/include/sigc++-2.0/sigc++/functors/slot_base.h:290
290 { return (!rep_ || !rep_->call_); }
(gdb) bt
#0 0x00007ffff7b93a59 in empty (this=<optimized out>)
at /usr/include/sigc++-2.0/sigc++/functors/slot_base.h:290
#1 emit (_A_a1=@0x7ffff62ea5cc: DBus::DISPATCH_COMPLETE, impl=0x612970)
at /usr/include/sigc++-2.0/sigc++/signal.h:1008
#2 emit (this=<optimized out>, _A_a1=@0x7ffff62ea5cc: DBus::DISPATCH_COMPLETE)
at /usr/include/sigc++-2.0/sigc++/signal.h:2801
#3 DBus::Connection::on_dispatch_status_callback (connection=<optimized out>,
new_status=<optimized out>, data=<optimized out>) at connection.cpp:936
#4 0x00007ffff792853e in ?? () from /lib/x86_64-linux-gnu/libdbus-1.so.3
#5 0x00007ffff792944a in ?? () from /lib/x86_64-linux-gnu/libdbus-1.so.3
#6 0x00007ffff7b9317a in DBus::Connection::send (this=0x60f1c0, msg=
std::shared_ptr (count 2, weak 0) 0x7ffff0000920) at connection.cpp:338
#7 0x0000000000406c95 in DBus::Method<void, std::string, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil>::handle_call_message (
this=0x613570, connection=std::shared_ptr (count 8, weak 1) 0x60f1c0,
message=std::shared_ptr (count 3, weak 0) 0x7ffff0000a30)
at /usr/local/include/dbus-cxx-0.8/dbus-cxx/method.h:1206
#8 0x00007ffff7ba3155 in DBus::Interface::handle_call_message (
this=<optimized out>,
connection=std::shared_ptr (count 8, weak 1) 0x60f1c0,
message=std::shared_ptr (count 3, weak 0) 0x7ffff0000a30)
at interface.cpp:388
#9 0x00007ffff7bb375c in DBus::Object::handle_message (
this=this@entry=0x612f90,
connection=std::shared_ptr (count 8, weak 1) 0x60f1c0,
message=std::shared_ptr (count 1, weak 0) 0x7ffff0000a10) at object.cpp:367
#10 0x00007ffff7bb64a6 in DBus::ObjectPathHandler::message_handler_callback (
connection=<optimized out>, message=<optimized out>, user_data=0x612f90)
at objectpathhandler.cpp:119
#11 0x00007ffff793860f in ?? () from /lib/x86_64-linux-gnu/libdbus-1.so.3
#12 0x00007ffff792a194 in dbus_connection_dispatch ()
from /lib/x86_64-linux-gnu/libdbus-1.so.3
#13 0x00007ffff7b93541 in DBus::Connection::dispatch (this=0x60f1c0)
at connection.cpp:435
#14 0x00007ffff7b9cc61 in DBus::Dispatcher::dispatch_thread_main (
this=0x60ec60) at dispatcher.cpp:219
#15 0x00007ffff74bf970 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#16 0x00007ffff67090a4 in start_thread (arg=0x7ffff62eb700)
at pthread_create.c:309
#17 0x00007ffff6c2f87d in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Also, some good-ish news: when the call hangs, the return message does make it back to the calling program, which means that this is almost certainly a bug in the calling program.
I think that if we take a good look at these segfaults, we may be able to track down exactly what is going on. If the segfaults are related to the non-response at all, my guess is that the sigc++ signal is being modified from two different threads at once. My current thinking is that if the signal is modified after the response comes back, you see the hang. If it is modified while the response is coming back, the iterator is invalidated and we get the segfault. You might never notice the segfault if only that one thread is killed, but the library would effectively be dead at that point, since no further processing would happen in that thread.
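If that theory turns out to be right, one possible direction would be to serialize changes to the slot list against emission. The following is only a minimal sketch of that idea using the standard library. GuardedSignal and run_demo are hypothetical names made up for illustration. They are not sigc++ or dbus-cxx classes, and this is not how dbus-cxx currently handles its signals.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a signal with a slot list (not sigc++ API).
class GuardedSignal {
public:
    using Slot = std::function<void(int)>;

    // Modify the slot list only while holding the lock.
    void connect(Slot slot) {
        std::lock_guard<std::mutex> lock(mutex_);
        slots_.push_back(std::move(slot));
    }

    // Copy the slot list under the lock, then call the copies without
    // holding it, so a concurrent connect() cannot invalidate the
    // iterator used by this loop.
    void emit(int status) {
        std::vector<Slot> snapshot;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            snapshot = slots_;
        }
        for (const Slot& slot : snapshot) {
            slot(status);
        }
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(mutex_);
        return slots_.size();
    }

private:
    std::mutex mutex_;
    std::vector<Slot> slots_;
};

// Connect from one thread while emitting from another, then return the
// number of connected slots.
std::size_t run_demo(int iterations) {
    assert(iterations >= 0);
    std::atomic<long> calls{0};
    GuardedSignal signal;

    std::thread connector([&] {
        for (int i = 0; i < iterations; ++i) {
            signal.connect([&calls](int) { ++calls; });
        }
    });
    std::thread emitter([&] {
        for (int i = 0; i < iterations; ++i) {
            signal.emit(i);
        }
    });

    connector.join();
    emitter.join();
    return signal.size();
}
```

Copying the list on every emit costs something, but it keeps slot callbacks from running while the lock is held, so a slot that connects another slot would not deadlock. Whether something like this is the right fix depends on confirming that the signal really is being touched from two threads.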
-Robert Middleton