[Tango-cs-bug-info] [tango-cs:bugs] #814 Segmentation fault when reading attribute

tango-cs...@lists.sourceforge.net

Aug 16, 2016, 11:15:42 AM
to Tango-cs...@lists.sf.net

[bugs:#814] Segmentation fault when reading attribute

Status: open
Labels: Tango9.2.0
Created: Tue Aug 16, 2016 03:14 PM UTC by Johan Forsberg
Last Updated: Tue Aug 16, 2016 03:14 PM UTC
Owner: nobody
Attachments:

Another segmentation fault we found while upgrading to TANGO 9:

If a device reads an attribute from a device in the same server (i.e. the same process) and from within a thread, the server segfaults. It does not happen for devices in other servers, nor if the read is not done from a thread.

The problem is easy to reproduce and has been observed in both a Python device and a C++ device (included), so the issue should lie in libtango.

We are using TANGO 9.2.2. We have not yet tried to reproduce it with older v9 versions, but this behavior was not present in v8.

Thread 10 "ThreadReadSegfa" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffeb7fe700 (LWP 4848)]
0x00007ffff6b33c22 in omni_thread::get_value(unsigned int) () from /usr/lib/libomnithread.so.3
(gdb) bt
#0  0x00007ffff6b33c22 in omni_thread::get_value(unsigned int) () from /usr/lib/libomnithread.so.3
#1  0x00007ffff78b3138 in Tango::BlackBox::insert_attr (this=0x69b490, names=..., cl_id=..., vers=vers@entry=5, 
    sour=sour@entry=Tango::CACHE_DEV) at blackbox.cpp:643
#2  0x00007ffff7916447 in Tango::Device_5Impl::read_attributes_5 (this=0x69db60, names=..., source=Tango::CACHE_DEV, 
    cl_id=...) at device_5.cpp:114
#3  0x00007ffff7aadcfb in _0RL_lcfn_6fe2f94a21a10053_84000000 (cd=0x7fffeb7fd080, svnt=<optimized out>)
    at tangoSK.cpp:6272
#4  0x00007ffff6e1a94f in omni::omniOrbPOA::dispatch(omniCallDescriptor&, omniLocalIdentity*) ()
   from /usr/lib/libomniORB4.so.1
#5  0x00007ffff6dff939 in omniLocalIdentity::dispatch(omniCallDescriptor&) () from /usr/lib/libomniORB4.so.1
#6  0x00007ffff6e0da65 in omniObjRef::_invoke(omniCallDescriptor&, bool) () from /usr/lib/libomniORB4.so.1
#7  0x00007ffff7aadfbf in Tango::_objref_Device_5::read_attributes_5 (this=this@entry=0x7fffd8000d10, names=..., 
    source=<optimized out>, cl_ident=...) at tangoSK.cpp:6298
#8  0x00007ffff779a6d0 in Tango::DeviceProxy::read_attribute (this=0x7fffeb7fda40, attr_string="something")
    at devapi_base.cpp:5592
#9  0x000000000040dbf4 in Tango::DeviceProxy::read_attribute (this=0x7fffeb7fda40, att_name=0x41bc08 "something")
    at /usr/local/include/tango/DeviceProxy.h:665
#10 0x000000000040d21d in ThreadReadSegfaultTest_ns::ThreadReadSegfaultTest::_read_attribute (this=0x69db60)
    at ThreadReadSegfaultTest.cpp:387
#11 0x00000000004129f5 in std::_Mem_fn_base<void (ThreadReadSegfaultTest_ns::ThreadReadSegfaultTest::*)(), true>::operator()<, void>(ThreadReadSegfaultTest_ns::ThreadReadSegfaultTest*) const (this=0x7fffd4005418, __object=0x69db60)
    at /usr/include/c++/5/functional:600
#12 0x0000000000412949 in std::_Bind<std::_Mem_fn<void (ThreadReadSegfaultTest_ns::ThreadReadSegfaultTest::*)()> (ThreadReadSegfaultTest_ns::ThreadReadSegfaultTest*)>::__call<void, , 0ul>(std::tuple<>&&, std::_Index_tuple<0ul>) (
    this=0x7fffd4005418, 
    __args=<unknown type in /home/johfor/DeviceServers/ThreadReadSegfaultTest, CU 0x0, DIE 0x3d8b0>)
    at /usr/include/c++/5/functional:1074
#13 0x0000000000412899 in std::_Bind<std::_Mem_fn<void (ThreadReadSegfaultTest_ns::ThreadReadSegfaultTest::*)()> (ThreadReadSegfaultTest_ns::ThreadReadSegfaultTest*)>::operator()<, void>() (this=0x7fffd4005418)
    at /usr/include/c++/5/functional:1133
#14 0x000000000041285e in std::_Bind_simple<std::_Bind<std::_Mem_fn<void (ThreadReadSegfaultTest_ns::ThreadReadSegfaultTest::*)()> (ThreadReadSegfaultTest_ns::ThreadReadSegfaultTest*)> ()>::_M_invoke<>(std::_Index_tuple<>) (
    this=0x7fffd4005418) at /usr/include/c++/5/functional:1531
#15 0x00000000004127b4 in std::_Bind_simple<std::_Bind<std::_Mem_fn<void (ThreadReadSegfaultTest_ns::ThreadReadSegfaultTest::*)()> (ThreadReadSegfaultTest_ns::ThreadReadSegfaultTest*)> ()>::operator()() (this=0x7fffd4005418)
    at /usr/include/c++/5/functional:1520
#16 0x0000000000412744 in std::thread::_Impl<std::_Bind_simple<std::_Bind<std::_Mem_fn<void (ThreadReadSegfaultTest_ns::ThreadReadSegfaultTest::*)()> (ThreadReadSegfaultTest_ns::ThreadReadSegfaultTest*)> ()> >::_M_run() (
    this=0x7fffd4005400) at /usr/include/c++/5/thread:115
#17 0x00007ffff6430c80 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#18 0x00007ffff67016fa in start_thread (arg=0x7fffeb7fe700) at pthread_create.c:333
#19 0x00007ffff5e9fb5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb) 

Sent from sourceforge.net because Tango-cs...@lists.sf.net is subscribed to https://sourceforge.net/p/tango-cs/bugs/

To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/tango-cs/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.

message-footer.txt

tango-cs...@lists.sourceforge.net

Aug 17, 2016, 8:44:43 AM
to Tango-cs...@lists.sf.net

Hi Johan,

Thank you for the bug report.
I think this problem comes from the following change introduced in blackbox.cpp (r28946):

https://sourceforge.net/p/tango-cs/code/28946/tree//api/cpp/cppapi/branches/Tango_900/server/blackbox.cpp?diff=28853

I tried to remove the omni_thread::release_dummy() part and the example you provided didn't crash.

It didn't give the same result as when you execute ReadAttribute though.
I got an exception (Not able to acquire serialization (dev, class or process) monitor) which produces a timeout on the client level.
I tried with Tango 8 and I got the same behaviour.

In your real device where you are doing something similar, are you using the default serialization model (by device, as in your example) or did you change it?

To me, it looks logical to get such an exception since the monitor is already taken by the thread executing the ReadAttributeFromThread command when the newly created thread invokes read_attribute(), which is trying to acquire the monitor again, but from another thread, hence the exception.
So unless you are changing the serialization model, I don't see the point of a device reading its own attribute from a thread started by a command and then waiting for that thread to end before returning the result. But you are probably doing something different in your real-world example.
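
The monitor situation described above can be modelled without Tango at all. The following is only a sketch under stated assumptions: `std::timed_mutex` stands in for the per-device serialization monitor (Tango uses its own monitor class), and the function name and the timeout value are hypothetical. It shows why a read issued from a worker thread times out while the command that started the worker still holds the monitor and waits for the worker to finish.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical model of "serialization by device". std::timed_mutex is a
// stand-in for the device monitor; this is not Tango code.
// Returns true if the worker thread managed to acquire the monitor.
bool read_from_worker_while_holding_monitor(std::chrono::milliseconds timeout) {
    assert(timeout.count() >= 0);

    std::timed_mutex device_monitor;

    // The thread executing the command holds the monitor for the whole command.
    std::lock_guard<std::timed_mutex> command_holds_monitor(device_monitor);

    bool acquired = false;
    std::thread worker([&device_monitor, &acquired, timeout] {
        // The read needs the same monitor, but is issued from another thread.
        if (device_monitor.try_lock_for(timeout)) {
            acquired = true;
            device_monitor.unlock();
        }
    });

    // The command waits for the worker before returning, so the monitor is
    // never released while the worker is trying to take it.
    worker.join();
    return acquired;
}
```

Calling `read_from_worker_while_holding_monitor(std::chrono::milliseconds(50))` returns `false` in this model, which corresponds to the "Not able to acquire serialization monitor" exception and the resulting client-side timeout.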

Removing the omni_thread::release_dummy() part should solve your issue, but this means we would go back to a version with the memory leak that revision r28946 was trying to fix.
There might be other locations in the code and some edge cases where some methods are called directly on omni_thread::self(), without being in an omni_thread or having created a dummy omni_thread before, leading to a similar crash as the one you reported.

Cheers,
Reynald

tango-cs...@lists.sourceforge.net

Aug 17, 2016, 9:46:52 AM
to Tango-cs...@lists.sf.net

It might be a good idea to use the omni_thread::ensure_self class which will take care of creating a dummy omni_thread in cases like this where the code tries to invoke methods directly on omni_thread::self(). This class will ensure a dummy thread is created when needed and will take care of releasing it automatically in its destructor.
It seems that this class is provided by omni_thread for this purpose.
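
To illustrate the idea, here is a self-contained model of the pattern. It is a sketch, not omniORB or Tango code: `ModelThread`, `EnsureSelf`, `Observation` and `observe_from_plain_thread` are hypothetical names, and a `thread_local` pointer stands in for the per-thread omni_thread object. The assumption being modelled is that `self()` returns a null pointer in a thread that was not created as an omni_thread, so calling a member function on it crashes, while a guard object creates a dummy on construction and releases it on destruction.

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for omni_thread; NOT the real omnithread API.
class ModelThread {
public:
    // Per-thread object, or nullptr if this thread was never registered.
    static ModelThread* self() { return current_; }

    static ModelThread* create_dummy() {
        assert(current_ == nullptr);
        current_ = new ModelThread();
        return current_;
    }

    static void release_dummy() {
        delete current_;
        current_ = nullptr;
    }

    int get_value() const { return value_; }

    // RAII guard in the spirit of omni_thread::ensure_self: creates a dummy
    // only if needed and releases it only if this guard created it.
    class EnsureSelf {
    public:
        EnsureSelf() : self_(ModelThread::self()), created_(false) {
            if (self_ == nullptr) {
                self_ = ModelThread::create_dummy();
                created_ = true;
            }
        }
        ~EnsureSelf() {
            if (created_) ModelThread::release_dummy();
        }
        EnsureSelf(const EnsureSelf&) = delete;
        EnsureSelf& operator=(const EnsureSelf&) = delete;

        ModelThread* self() const { return self_; }

    private:
        ModelThread* self_;
        bool created_;
    };

private:
    ModelThread() : value_(42) {}
    int value_;
    static thread_local ModelThread* current_;
};

thread_local ModelThread* ModelThread::current_ = nullptr;

struct Observation {
    bool null_before;   // self() was null before the guard existed
    bool valid_inside;  // guard.self() was usable while the guard lived
    bool null_after;    // the dummy was released when the guard went away
};

// Runs the checks on a plain std::thread, i.e. a thread the model
// knows nothing about until the guard registers a dummy for it.
Observation observe_from_plain_thread() {
    Observation obs{false, false, false};
    std::thread worker([&obs] {
        obs.null_before = (ModelThread::self() == nullptr);
        {
            ModelThread::EnsureSelf guard;
            obs.valid_inside = (guard.self() != nullptr) &&
                               (guard.self()->get_value() == 42);
        }
        obs.null_after = (ModelThread::self() == nullptr);
    });
    worker.join();
    return obs;
}
```

In this model all three fields of the result of `observe_from_plain_thread()` are `true`: the guarded code never dereferences a null pointer, and no dummy is left behind once the guard goes out of scope, which is the property the release_dummy() change in r28946 was after.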


tango-cs...@lists.sourceforge.net

Aug 17, 2016, 10:15:50 AM
to Tango-cs...@lists.sf.net

Please note that in Tango 8 and in Tango 9 without the omni_thread::release_dummy() part, the blackbox is reporting the attribute read from the thread as being read from polling, which is not correct in that case.


tango-cs...@lists.sourceforge.net

Sep 2, 2016, 9:59:56 AM
to Tango-cs...@lists.sf.net

Bug fixed in SVN repo


tango-cs...@lists.sourceforge.net

Nov 8, 2016, 10:05:47 AM
to Tango-cs...@lists.sf.net

I saw that #823 was closed as a duplicate of this bug. But is it really the same? I don't think the fix mentioned above also solves the issue with commands sent from another thread. It looks like a similar omnithread issue, but it does not happen in the black box.

0x00007ffff4014d02 in omni_thread::get_value(unsigned int) () from /usr/local/lib/libomnithread.so.4
(gdb) bt
#0  0x00007ffff4014d02 in omni_thread::get_value(unsigned int) () from /usr/local/lib/libomnithread.so.4
#1  0x00007ffff504098a in Tango::DeviceImpl::get_client_ident (this=<optimized out>) at device.cpp:4416
#2  0x00007ffff504df15 in Tango::DeviceImpl::check_lock (this=this@entry=0x120ba30, 
    meth=meth@entry=0x7ffff52401c5 "command_inout4", cmd=cmd@entry=0x7fffe57f8680 "DummyCommand") at device.cpp:4820
#3  0x00007ffff5077792 in Tango::Device_4Impl::command_inout_4 (this=0x120ba30, in_cmd=0x7fffe57f8680 "DummyCommand", 
    in_data=..., source=Tango::CACHE_DEV, cl_id=...) at device_4.cpp:467
#4  0x00007ffff51f6162 in _0RL_lcfn_6fe2f94a21a10053_a3000000 (cd=0x7fffe57f7f20, svnt=<optimized out>)
    at tangoSK.cpp:5383
#5  0x00007ffff430917f in omni::omniOrbPOA::dispatch(omniCallDescriptor&, omniLocalIdentity*) ()
   from /usr/local/lib/libomniORB4.so.2
#6  0x00007ffff42e9cae in omniLocalIdentity::dispatch(omniCallDescriptor&) () from /usr/local/lib/libomniORB4.so.2
#7  0x00007ffff42f8ebe in omniObjRef::_invoke(omniCallDescriptor&, bool) () from /usr/local/lib/libomniORB4.so.2
#8  0x00007ffff51ff71c in Tango::_objref_Device_4::command_inout_4 (this=this@entry=0x7fffd4001be0, 
    command=<optimized out>, argin=..., source=<optimized out>, cl_ident=...) at tangoSK.cpp:5410
#9  0x00007ffff4f1108e in Tango::Connection::command_inout (this=0x7fffd4001440, command="DummyCommand", data_in=...)
    at devapi_base.cpp:1280
#10 0x00007ffff59bb6a3 in PyConnection::command_inout (self=..., cmd_name="DummyCommand", argin=...)
    at /tmp/pip-k4qsc2hy-build/ext/connection.cpp:25
...

tango-cs...@lists.sourceforge.net

Nov 8, 2016, 12:06:05 PM
to Tango-cs...@lists.sf.net

My crash in #823 was indeed caused by the blackbox, but we're getting your bug too in another case. I just posted it in #827.

