gRPC C++ BlockingUnaryCall hangs forever while socket data is available


fy m

Nov 10, 2025, 10:10:52 AM
to grpc.io
Hello, I have a client (A) / server (B) program built on gRPC 1.37.

We mainly use BlockingUnaryCall in a multithreaded environment, and no timeout (deadline) is set on the client side yet.
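For reference, a client-side deadline could be added roughly as follows. This is only a minimal sketch: `DeadlineFromNow` is a hypothetical helper (not part of gRPC), the 30-second value is arbitrary, and the stub call in the comment uses placeholder names. A deadline would only bound how long a call can block; it would not explain why the data is never read.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: converts a relative timeout into the absolute
// time point that grpc::ClientContext::set_deadline() accepts.
inline std::chrono::system_clock::time_point DeadlineFromNow(
    std::chrono::milliseconds timeout) {
  assert(timeout.count() > 0 && "timeout must be positive");
  return std::chrono::system_clock::now() + timeout;
}

// Intended use in the client (not compiled here, since it needs grpcpp;
// `stub`, `SomeMethod`, `request` and `response` are placeholders):
//
//   grpc::ClientContext context;
//   context.set_deadline(DeadlineFromNow(std::chrono::seconds(30)));
//   grpc::Status status = stub->SomeMethod(&context, request, &response);
//   if (status.error_code() == grpc::StatusCode::DEADLINE_EXCEEDED) {
//     // The call was abandoned instead of blocking forever.
//   }
```

With this in place, a stuck call should fail with DEADLINE_EXCEEDED instead of blocking the thread indefinitely, which at least makes the hang visible in logs.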

We find that after several calls (to different RPC methods), some pending RPCs occasionally never return. The call stack of a stuck thread looks like this:

Thread 31 (Thread 0x7ff695ffb700 (LWP 3073) "grpcpp_sync_ser"):
#0  0x00007ff707ee87f9 in syscall () from /lib64/libc.so.6
#1  0x00007ff708509987 in absl::lts_20230802::synchronization_internal::FutexWaiter::WaitUntil(std::atomic<int>*, int, absl::lts_20230802::synchronization_internal::KernelTimeout) () from /usr/lib64/libabsl_synchronization.so.2308.0.0
#2  0x00007ff708509a6a in absl::lts_20230802::synchronization_internal::FutexWaiter::Wait(absl::lts_20230802::synchronization_internal::KernelTimeout) () from /usr/lib64/libabsl_synchronization.so.2308.0.0
#3  0x00007ff708509c71 in AbslInternalPerThreadSemWait_lts_20230802 () from /usr/lib64/libabsl_synchronization.so.2308.0.0
#4  0x00007ff70850bbd3 in absl::lts_20230802::Mutex::Block(absl::lts_20230802::base_internal::PerThreadSynch*) () from /usr/lib64/libabsl_synchronization.so.2308.0.0
#5  0x00007ff70850c776 in absl::lts_20230802::Mutex::LockSlowLoop(absl::lts_20230802::SynchWaitParams*, int) () from /usr/lib64/libabsl_synchronization.so.2308.0.0
#6  0x00007ff70850cdac in absl::lts_20230802::Mutex::LockSlowWithDeadline(absl::lts_20230802::MuHowS const*, absl::lts_20230802::Condition const*, absl::lts_20230802::synchronization_internal::KernelTimeout, int) () from /usr/lib64/libabsl_synchronization.so.2308.0.0
#7  0x00007ff70850934a in absl::lts_20230802::Mutex::LockSlow(absl::lts_20230802::MuHowS const*, absl::lts_20230802::Condition const*, int) () from /usr/lib64/libabsl_synchronization.so.2308.0.0
#8  0x00007ff708a3c216 in ?? () from /usr/lib64/libgrpc.so.37
#9  0x00007ff708a3fef5 in ?? () from /usr/lib64/libgrpc.so.37
#10 0x00007ff708a49265 in grpc_pollset_work(grpc_pollset*, grpc_pollset_worker**, grpc_core::Timestamp) () from /usr/lib64/libgrpc.so.37
#11 0x00007ff708b5215e in ?? () from /usr/lib64/libgrpc.so.37
#12 0x00007ff709067a5d in grpc::CompletionQueue::Pluck (this=0x7ff695ff9b60, tag=0x7ff695ff9ba0) at /usr/include/grpcpp/completion_queue.h:322
#13 0x00007ff709071bce in grpc::internal::BlockingUnaryCallImpl<google::protobuf::MessageLite, google::protobuf::MessageLite>::BlockingUnaryCallImpl (this=0x7ff695ff9f00, channel=0xb14cc0, method=..., context=0x7ff695ffa030, request=..., result=0x7ff695ffa210) at /usr/include/grpcpp/impl/client_unary_call.h:80
#14 0x00007ff70906ed76 in grpc::internal::BlockingUnaryCall<bam_grpc::bam_get_prealloc_chunks_args, bam_grpc::bam_get_prealloc_chunks_res, google::protobuf::MessageLite, google::protobuf::MessageLite> (channel=0xb14cc0, method=..., context=0x7ff695ffa030, request=..., result=0x7ff695ffa210) at /usr/include/grpcpp/impl/client_unary_call.h:51
......

However, we can see that the underlying socket still has data waiting to be drained:

e5b2384394a9:/ # ss -antp | grep 49495
LISTEN     0      4096   [::ffff:127.0.0.46]:49495                   *:*     users:(("B",pid=3003,fd=11))
CLOSE-WAIT 397    0       [::ffff:127.0.0.1]:58096 [::ffff:127.0.0.46]:49495 users:(("A",pid=3042,fd=8))

There are 397 bytes in the Recv-Q that never get read. (The socket is in CLOSE-WAIT because B has closed the connection.)
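The same number can also be checked from inside a process with `ioctl(FIONREAD)`, for example from a watchdog thread in A. The sketch below is Linux-specific and rests on assumptions: `UnreadBytes` and `DemoUnreadBytes` are hypothetical names, and the demo uses an `AF_UNIX` socketpair as a stand-in for the real TCP connection (fd 8 in A), which can only be inspected this way by code running in that process.

```cpp
#include <cassert>
#include <string>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

// Returns the number of unread bytes queued on a socket fd, or -1 on error.
inline int UnreadBytes(int fd) {
  int pending = 0;
  if (ioctl(fd, FIONREAD, &pending) != 0) return -1;
  return pending;
}

// Demo: one end writes `payload_size` bytes and closes (like B), the other
// end never reads (like A). Returns the unread byte count seen on the
// reading end, or -1 on error.
inline int DemoUnreadBytes(int payload_size) {
  assert(payload_size > 0 && "payload_size must be positive");
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return -1;

  const std::string payload(static_cast<size_t>(payload_size), 'x');
  const ssize_t written = write(fds[0], payload.data(), payload.size());
  close(fds[0]);  // The peer goes away, but its data stays queued.

  int pending = -1;
  if (written == static_cast<ssize_t>(payload.size())) {
    pending = UnreadBytes(fds[1]);
  }
  close(fds[1]);
  return pending;
}
```

Calling `DemoUnreadBytes(397)` should report the same count that `ss` shows in the Recv-Q column, so logging `UnreadBytes(fd)` periodically could show whether the bytes arrive before or after the thread blocks.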

Can anyone advise on how to debug this further? Thanks.