Hi,
In our setup, we have a Kubernetes pod with two containers: one hosting the gRPC service and the other acting as the client. The service is implemented in Go, while the client is in Python. We create the client when the application starts and use the same client throughout the lifetime of the application.
Everything works fine until the server sends a keepalive ping (by default, 2 hours after the last activity). When it does not receive an acknowledgment within 20 seconds, it closes the transport.
When the client subsequently makes a gRPC call, it detects that the transport is unavailable and encounters an error. However, on the next attempt, knowing the transport is missing, it creates a new one, and the call succeeds.
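One way to mask the failed first call might be a retry policy in the channel's service config, so that calls failing with UNAVAILABLE are retried by the channel itself. The snippet below is only a sketch: the target address, attempt count, and backoff values are assumptions for illustration, not something we run today, and the helper names are made up.

```python
import json

# Hypothetical values; tune attempts and backoff for the actual workload.
RETRY_SERVICE_CONFIG = {
    "methodConfig": [
        {
            "name": [{}],  # an empty name entry is meant to match every method
            "retryPolicy": {
                "maxAttempts": 3,
                "initialBackoff": "0.1s",
                "maxBackoff": "1s",
                "backoffMultiplier": 2,
                "retryableStatusCodes": ["UNAVAILABLE"],
            },
        }
    ]
}


def retry_channel_options():
    """Return channel options that enable configured retries on UNAVAILABLE."""
    return [
        ("grpc.enable_retries", 1),
        ("grpc.service_config", json.dumps(RETRY_SERVICE_CONFIG)),
    ]


def make_channel(target="localhost:50051"):
    """Create a channel with the retry options (target is an assumed address)."""
    # grpc is imported lazily so the options can be inspected without it installed.
    import grpc

    return grpc.insecure_channel(target, options=retry_channel_options())
```

This would only hide the symptom; it does not explain why the keepalive ping goes unacknowledged.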
Since both the server and client run within the same Kubernetes pod, there is no firewall blocking the pings. They are on the same network and communicate via localhost.
I enabled debug logging and noticed that pings sent by the client are acknowledged by the server. These pings have an 8-byte payload containing arbitrary data. I assume these are not keepalive pings, as keepalive pings should contain all zeros in hex. (Please correct me if I’m wrong.)
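For reference when comparing raw frames: in HTTP/2 a PING frame has type 0x6, is sent on stream 0, carries an 8-byte opaque payload, and uses flag 0x1 to mark an acknowledgment. The protocol itself assigns no meaning to the payload, so whether a given payload marks a keepalive ping depends on the implementation. A small helper like the one below (hypothetical names, not part of gRPC) could be used to decode captured frames and check the payload and ACK flag.

```python
PING_FRAME_TYPE = 0x6
ACK_FLAG = 0x1


def parse_ping_frame(frame: bytes):
    """Parse one raw HTTP/2 PING frame and return (is_ack, payload)."""
    if len(frame) != 17:
        raise ValueError("a PING frame is 9 header bytes plus 8 payload bytes")
    length = int.from_bytes(frame[0:3], "big")   # 24-bit payload length
    frame_type = frame[3]
    flags = frame[4]
    # The top bit of the stream identifier field is reserved, so mask it off.
    stream_id = int.from_bytes(frame[5:9], "big") & 0x7FFFFFFF
    if frame_type != PING_FRAME_TYPE or length != 8 or stream_id != 0:
        raise ValueError("not a valid PING frame")
    return bool(flags & ACK_FLAG), frame[9:17]


def is_all_zero(payload: bytes) -> bool:
    """Return True if the 8-byte payload consists only of zero bytes."""
    return payload == bytes(8)
```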
From my debug log observations, I see the following pattern:
I have not overridden any gRPC channel options—these observations are based on the default configuration.
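For anyone who wants to reproduce this with non-default settings, overriding the client-side options in Python would look roughly like the sketch below. The option keys are gRPC core channel arguments; the values, the helper names, and the target address are assumptions chosen for illustration. Note that client keepalive pings sent without active calls may be rejected by the Go server (GOAWAY with too_many_pings) unless its keepalive enforcement policy permits them.

```python
def keepalive_channel_options(
    keepalive_time_ms=600_000,     # assumed: ping after 10 minutes without activity
    keepalive_timeout_ms=20_000,   # assumed: wait 20 seconds for the ping ack
    idle_timeout_ms=None,          # optionally override the client idle timeout
):
    """Build a list of (key, value) channel options for keepalive tuning."""
    options = [
        ("grpc.keepalive_time_ms", keepalive_time_ms),
        ("grpc.keepalive_timeout_ms", keepalive_timeout_ms),
        # Allow keepalive pings even when no call is in flight.
        ("grpc.keepalive_permit_without_calls", 1),
    ]
    if idle_timeout_ms is not None:
        options.append(("grpc.client_idle_timeout_ms", idle_timeout_ms))
    return options


def make_channel(target="localhost:50051", **kwargs):
    """Create a channel with the options above (target is an assumed address)."""
    # grpc is imported lazily so the options can be inspected without it installed.
    import grpc

    return grpc.insecure_channel(target, options=keepalive_channel_options(**kwargs))
```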
Can anyone help me debug and understand this issue?
Debug logs from the client after 30 minutes of inactivity:
I0000 00:00:1740647883.533817 23 init.cc:167] grpc_shutdown(void)
I0000 00:00:1740648903.774618 26 connectivity_state.cc:173] ConnectivityStateTracker client_channel[0x555dd05705d0]: get current state: READY
I0000 00:00:1740648903.774675 26 connectivity_state.cc:151] ConnectivityStateTracker client_channel[0x555dd05705d0]: READY -> IDLE (channel entering IDLE, OK)
I0000 00:00:1740648903.774714 26 connectivity_state.cc:151] ConnectivityStateTracker client_transport[0x7f0cd0001968]: READY -> SHUTDOWN (close_transport, OK)
I0000 00:00:1740648903.774719 26 connectivity_state.cc:159] ConnectivityStateTracker client_transport[0x7f0cd0001968]: notifying watcher 0x7f0cd0001500: READY -> SHUTDOWN
I0000 00:00:1740648903.774779 26 connectivity_state.cc:74] watcher 0x7f0cd0001500: delivering async notification for SHUTDOWN (OK)
I0000 00:00:1740648903.774788 26 init.cc:167] grpc_shutdown(void)
Debug logs from the server after 30 minutes of inactivity:
2025/02/26 08:01:55 INFO: [transport] [server-transport 0xc000338000] Closing: EOF
2025/02/26 08:01:55 INFO: [transport] [server-transport 0xc000338000] loopyWriter exiting with error: transport closed by client
Debug logs from the server after 2 hours of inactivity:
2025/02/26 15:27:44 INFO: [transport] [server-transport 0x14000214600] Closing: keepalive ping not acked within timeout 20s
2025/02/26 15:27:44 INFO: [transport] [server-transport 0x14000214600] loopyWriter exiting with error: transport closed by client