Hello,
I'm in the long-overdue process of updating gRPC from 1.20 to 1.36.1. I am running into an issue where the streaming replies from the server never reach the client in about 50% of cases. The failure is binary: either the streaming call works perfectly or it doesn't work at all. After debugging a bit, I turned on HTTP tracing. From what I can tell, the HTTP messages are received in the client thread in both cases, but in the working case perform_stream_op[s=0x7f0e16937290]: RECV_MESSAGE is logged, and in the broken case it isn't. No error messages occur.
I've tried various other tracers, but none of them showed anything useful. The code follows pretty much the same pattern as the example, and there's no indication that a disconnect has occurred which would cause the call to terminate. Looking at the thread in gdb, I can see it is still in epoll_wait.
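For reference, a minimal sketch of how the tracing can be turned on, assuming the standard GRPC_VERBOSITY and GRPC_TRACE environment variables read by the gRPC C-core. The binary name and log file name are placeholders, not the real process:

```shell
# Assumption: the gRPC C-core picks these variables up when the process starts.
export GRPC_VERBOSITY=DEBUG   # log level: DEBUG, INFO, or ERROR
export GRPC_TRACE=http        # comma-separated tracer list; "http" is the tracer mentioned above

# Hypothetical binary name; replace with the real process and uncomment.
# gRPC writes its log output to stderr, so redirect that to a file.
# ./my_client 2> grpc_trace.log
```

Capturing one log from a working run and one from a broken run makes it possible to diff them around the point where RECV_MESSAGE should appear.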
The process in which this runs makes two different synchronous server-streaming calls to the same server, each in its own thread. The process is also a gRPC server itself. Everything runs over the local 'lo' (loopback) interface. Any ideas on where to look to debug this?
Thanks,
Bryan