gRPC C++ server stuck at select

rajeev...@rediffmail.com

Mar 7, 2019, 9:51:00 PM
to grpc.io
Hi,

In streaming mode, the gRPC server gets stuck after ~2 years of uptime.

Platform: VxWorks 6.3

gRPC C++ stack version: 0.13.0

Pasting the thread traces below for reference.


Thread trace for pthr8, where streaming was running:
============================================================
0x004810a8 vxTaskEntry  +0x60 : 0x0045c88c ()
0x0045c8dc pthread_exit +0x178: 0x01775fa4 ()
0x01775fdc gpr_thd_options_is_joinable+0x54 : grpc::thread::thread_func(void *) ()
0x005b6dc0 grpc::thread::thread_func(void *)+0x1c : grpc::thread::thread_function<grpc::DynamicThreadPool::DynamicThread>::call() ()
0x005b8b24 grpc::thread::thread_function<grpc::DynamicThreadPool::DynamicThread>::call()+0x4c : grpc::DynamicThreadPool::DynamicThread::ThreadFunc() ()
0x00aacaf8 grpc::DynamicThreadPool::DynamicThread::ThreadFunc()+0x28 : grpc::DynamicThreadPool::ThreadFunc() ()
0x00aacd18 grpc::DynamicThreadPool::ThreadFunc()+0x17c: 0x00ab4a90 ()
0x00ab4a90 grpc::Server::RunRpc()+0x558: grpc::BidiStreamingHandler<openconfig::OpenConfig::Service, openconfig::SubscribeRequest ()
0x0059e1e4 grpc::BidiStreamingHandler<openconfig::OpenConfig::Service, openconfig::SubscribeRequest+0x64 : 0x00a64718 ()
0x00a6473c openconfig::OpenConfig::Service::Subscribe(grpc::ServerContext +0xac : 0x00a605b4 ()
0x00a605b4 SyncServiceImpl::Subscribe(grpc::ServerContext *, grpc::ServerReaderWriter+0x1488: grpc::WriterInterface<openconfig::SubscribeResponse>::Write(const openconfig::SubscribeResponse &) ()
0x0058ff6c grpc::WriterInterface<openconfig::SubscribeResponse>::Write(const openconfig::SubscribeResponse &)+0x28 : grpc::ServerReaderWriter<openconfig::SubscribeResponse, openconfig::SubscribeRequest> ()
0x005a246c grpc::ServerReaderWriter<openconfig::SubscribeResponse, openconfig::SubscribeRequest>+0x14c: grpc::CompletionQueue::Pluck(grpc::CompletionQueueTag *) ()
0x00aab014 grpc::CompletionQueue::Pluck(grpc::CompletionQueueTag *)+0x88 : grpc_completion_queue_pluck ()
0x00a9fb28 grpc_completion_queue_pluck+0x3b8: grpc_pollset_work ()
0x01757b2c grpc_pollset_work+0x280: 0x017584ec ()
0x01758678 grpc_poll_deadline_to_millis_timeout+0x7ac: grpc_poll ()
0x01756cf0 grpc_poll    +0x180: select ()
0x00423858 select       +0x25c: semTake ()
0x00498098 semTake      +0x98 : 0x00496834 ()

=====================

Thread 563 (pthr8 tid:0x2c2f36d8 ):
#0  0x004968a0 in semBTake (semId=0x2c2f49f0, timeout=1001) at semBLib.c:590
#1  0x00498098 in semTake (semId=0x2c2f49f0, timeout=1001) at semLib.c:442
#2  0x00423858 in select (width=1274, pReadFds=0x2c2f2d60, pWriteFds=0x2c2f2e60, pExcFds=0x0, pTimeOut=0x0) at selectLib.c:595
#3  0x01756cf0 in grpc_poll (p=0x2c2f2fa0, nfds=2, timeout=1000) at ...//src/core/iomgr/pollset_posix.c:188
#4  0x01758678 in basic_pollset_maybe_work_and_unlock (exec_ctx=0x2c2f30f0, pollset=0x2c87f37c, worker=0x2c2f30c0, deadline=Cannot access memory at address 0x2)
    at ...//src/core/iomgr/pollset_posix.c:740
#5  0x01757b2c in grpc_pollset_work (exec_ctx=0x2c2f30f0, pollset=0x2c87f37c, worker=0x2c2f30c0, now={tv_sec = 49610331, tv_nsec = 663474868, clock_type = GPR_CLOCK_MONOTONIC}, deadline=
      {tv_sec = 49610332, tv_nsec = 662792550, clock_type = GPR_CLOCK_MONOTONIC}) at ...//src/core/iomgr/pollset_posix.c:469
#6  0x00a9fb28 in grpc_completion_queue_pluck (cc=0x2c87f360, tag=0x2c2f31e0, deadline={tv_sec = 9223372036854775807, tv_nsec = 0, clock_type = GPR_CLOCK_MONOTONIC}, reserved=0x2c87f360)
    at ...//src/core/surface/completion_queue.c:430
#7  0x00aab014 in grpc::CompletionQueue::Pluck (this=0x2c2f3478, tag=0x2c2f31e0) at ...//src/cpp/common/completion_queue.cc:80
#8  0x005a246c in grpc::ServerReaderWriter<openconfig::SubscribeResponse, openconfig::SubscribeRequest>::Write (this=0x2c2f3390, msg=@0x2c2f31e0, options=@0x2c2f3260)
    at ...//include/grpc++/impl/codegen/call.h:583
#9  0x0058ff6c in grpc::WriterInterface<openconfig::SubscribeResponse>::Write (this=<value optimized out>, msg=<value optimized out>)
    at ...//include/grpc++/impl/codegen/call.h:69
#10 0x00a605b4 in SyncServiceImpl::Subscribe (this=<value optimized out>, context=0x2c2f3490, stream=0x2c2f3390)
    at .../server/ocfg_server.cc:322
#11 0x00a6473c in RpcSubscribe (service=<value optimized out>, context=<value optimized out>, stream=<value optimized out>)
    at ../server/OpenConfig.grpc.pb.cc:109
#12 0x0059e1e4 in grpc::BidiStreamingHandler<openconfig::OpenConfig::Service, openconfig::SubscribeRequest, openconfig::SubscribeResponse>::RunHandler (this=0x2c298e50, param=@0x2c2f3510)
    at ...//include/grpc++/impl/codegen/method_handler_impl.h:176
#13 0x00ab4a90 in grpc::Server::RunRpc (this=0x2c2fbee0) at ...//include/grpc++/impl/codegen/rpc_service_method.h:61
#14 0x00aacd18 in grpc::DynamicThreadPool::ThreadFunc (this=0x2c2bc6a0) at ...//src/cpp/server/dynamic_thread_pool.cc:82
#15 0x00aacaf8 in grpc::DynamicThreadPool::DynamicThread::ThreadFunc (this=0x2c2eaad8)
    at ...//src/cpp/server/dynamic_thread_pool.cc:51
#16 0x005b8b24 in grpc::thread::thread_function<grpc::DynamicThreadPool::DynamicThread>::call (this=<value optimized out>)
    at ...//include/grpc++/impl/thd_no_cxx11.h:88
#17 0x005b6dc0 in grpc::thread::thread_func (arg=<value optimized out>) at ...//include/grpc++/impl/thd_no_cxx11.h:77
#18 0x01775fdc in thread_body (v=<value optimized out>) at ...//src/core/support/thd_posix.c:58
#19 0x0045c8dc in wrapperFunc (function=0x1775fa4 <thread_body>, arg=741255992, creatorSigMask=0) at pthreadLib.c:2980
#20 0x004810a8 in vxTaskEntry () at taskArchLib.c:241
#21 0x00000000 in ?? ()
=================================

Thanks
Rajeev





David Klempner

Mar 8, 2019, 6:31:40 PM
to grpc.io
Note that you're asking about an ancient pre-1.0 release of grpc on an unsupported platform -- you're basically on your own, and you might consider solutions such as rebooting every once in a while.

One possible answer, however:

Your call stack includes calls to select(), which I presume is custom code, because I don't believe grpc ever wrote code that called select() directly. Are you checking that the descriptors passed to FD_SET are not larger than FD_SETSIZE? How is that handled? I see "width=1274", which is disturbingly high given that the VxWorks default FD_SETSIZE appears to be 256.

rajeev...@rediffmail.com

Mar 11, 2019, 1:08:24 AM
to grpc.io
Thanks David; yes, that's true, it is an ancient release, and we had to make changes while porting it onto VxWorks.