### What version of gRPC and what language are you using?
```
Python (client): grpcio==1.27.2
```
### What operating system (Linux, Windows,...) and version?
Linux Ubuntu 18.04
```
# uname -a
4.15.0-99-generic #100-Ubuntu SMP Wed Apr 22 20:32:56 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
```
### What did you do?
Go (server)
```
// MaxMessageSizeBytes 2GB to support large data size
// KeepalivePolicyMinTime The minimum time between keepalive pings. Client pings can not be more frequent than this.
const KeepalivePolicyMinTime = 5 * time.Second
// KeepalivePolicyPermitWithoutStream Permit keepalive pings even with no inflight RPCs.
const KeepalivePolicyPermitWithoutStream = true
// KeepaliveParamTime If the server doesn't see any activity after this time, it
// pings the client to see if the transport is still alive.
const KeepaliveParamTime = 5 * time.Second
// KeepaliveParamTimeOut After having pinged for keepalive check, the server waits for this duration and
// the connection is closed if no activity is seen after that.
const KeepaliveParamTimeOut = 5 * time.Second
...
...
grpcListener, err := net.Listen(network, address)
if err != nil {
	return nil, err
}
keepaliveParameters := keepalive.ServerParameters{
	Time:    KeepaliveParamTime,
	Timeout: KeepaliveParamTimeOut,
}
keepalivePolicy := keepalive.EnforcementPolicy{
	MinTime:             KeepalivePolicyMinTime,
	PermitWithoutStream: KeepalivePolicyPermitWithoutStream,
}
var (
	recoveryFunc grpc_recovery.RecoveryHandlerFunc
)
opts := []grpc_recovery.Option{
	grpc_recovery.WithRecoveryHandler(recoveryFunc),
}
grpcServer := grpc.NewServer(
	grpc_middleware.WithUnaryServerChain(
		grpc_recovery.UnaryServerInterceptor(opts...),
	),
	grpc_middleware.WithStreamServerChain(
		grpc_recovery.StreamServerInterceptor(opts...),
	),
	grpc.MaxRecvMsgSize(MaxMessageSizeBytes),
	grpc.KeepaliveEnforcementPolicy(keepalivePolicy),
	grpc.KeepaliveParams(keepaliveParameters),
)
// followed by grpcServer.Serve
```
Python (client)
```
import grpc
import time
# my_pb and my_grpc are placeholder names for the generated protobuf/gRPC modules
import my_pb
import my_grpc
GRPC_CHANNEL_OPTIONS = [
('grpc.keepalive_time_ms', 5000),
('grpc.keepalive_timeout_ms', 5000),
('grpc.keepalive_permit_without_calls', True),
('grpc.http2.max_pings_without_data', 0),
('grpc.http2.min_time_between_pings_ms', 5000),
('grpc.http2.min_ping_interval_without_data_ms', 5000),
('grpc.max_send_message_length', 2000 * 1024 * 1024),
('grpc.max_receive_message_length', 2000 * 1024 * 1024)]
if __name__ == "__main__":
    url = "localhost:50051"
    my_channels = [grpc.insecure_channel(url, options=GRPC_CHANNEL_OPTIONS) for i in range(10)]
    my_stubs = [my_grpc.MyStub(my_channels[i]) for i in range(10)]
    print(f"Setup grpc channel + stubs... {len(my_channels)}")
    # make a request on each connection
    for my_stub in my_stubs:
        try:
            request = my_pb.GetRequest()
            response = my_stub.Get(request)
            print(response)
        except Exception as e:
            # print(e)
            pass
    # test: grpc should keep the connection alive
    # simulate the scenario when there is no activity on the channel
    while True:
        print("Sleeping ...")
        time.sleep(10)
    # ... code to use the channel
```
### What did you expect to see?
Keepalive maintains the connection while no requests or responses flow through the channel.
I also do not expect to see any network, connection, or watchdog errors on the client.
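
To make the expectation concrete, here is a minimal sketch of the timeline I would expect on an idle channel. It assumes the client sends a ping once `grpc.keepalive_time_ms` passes without activity and closes the transport if no ack arrives within `grpc.keepalive_timeout_ms`. The helper `idle_keepalive_timeline` is hypothetical and only does arithmetic; it does not call gRPC.

```python
def idle_keepalive_timeline(keepalive_time_ms, keepalive_timeout_ms, pings=3):
    """Return (ping_sent_ms, ack_deadline_ms) pairs for an idle channel.

    Assumption: every ping is acknowledged immediately, so the next ping
    is scheduled keepalive_time_ms after the previous one was sent.
    """
    timeline = []
    sent = keepalive_time_ms
    for _ in range(pings):
        # The watchdog would fire if no ack arrives before the deadline.
        timeline.append((sent, sent + keepalive_timeout_ms))
        sent += keepalive_time_ms
    return timeline


if __name__ == "__main__":
    for sent, deadline in idle_keepalive_timeline(5000, 5000):
        print(f"ping sent at {sent} ms, ack must arrive by {deadline} ms")
```

With the values from this report (5000 ms for both settings), every ping should be answered well inside its 5 second window on a localhost connection, which is why the watchdog error is unexpected.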
### What did you see instead?
Errors seen in the client. (No error on server.)
```
Setup grpc channel + stubs... 10
Sleeping ...
Sleeping ...
E0505 16:54:08.762435018 47441 chttp2_transport.cc:2893] keepalive_ping_end state error: 0 (expect: 1)
Sleeping ...
E0505 16:54:18.762541095 47452 chttp2_transport.cc:2893] keepalive_ping_end state error: 0 (expect: 1)
Sleeping ...
E0505 16:54:28.766357886 47452 chttp2_transport.cc:2893] keepalive_ping_end state error: 0 (expect: 1)
Sleeping ...
E0505 16:54:38.765747109 47441 chttp2_transport.cc:2893] keepalive_ping_end state error: 0 (expect: 1)
Sleeping ...
E0505 16:54:48.766456333 47452 chttp2_transport.cc:2880] ipv6:[::1]:50051: Keepalive watchdog fired. Closing transport.
Sleeping ...
Sleeping ...
```
```
Setup grpc channel + stubs... 10
Sleeping ...
Sleeping ...
Sleeping ...
E0505 17:00:34.036852668 47929 chttp2_transport.cc:2880] ipv6:[::1]:50051: Keepalive watchdog fired. Closing transport.
Sleeping ...
Sleeping ...
Sleeping ...
Sleeping ...
```
----
In our production environment, I have also seen the same error message with the above configuration, though with a different log prefix (`E0330`, which is the error severity plus the log date, March 30, not an error code) and an older version of the grpcio client library (grpcio==1.23.0):
```
E0330 19:01:16.746628681 10533 chttp2_transport.cc:2825] ipv4:10.50.40.102:50051: Keepalive watchdog fired. Closing transport.
```
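
For comparing these log lines across runs, a small parser can help. This is a rough sketch that assumes the gRPC core log prefix is a severity letter followed by month and day, then time, thread id, and source location; `parse_log_line` and the regular expression are my own illustration, not part of grpcio.

```python
import re

# Assumed layout: <severity><MMDD> <HH:MM:SS.nanos> <thread id> <file>:<line>] <message>
LOG_LINE = re.compile(
    r"^(?P<severity>[DIWE])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+(?P<thread>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] (?P<message>.*)$"
)


def parse_log_line(line):
    """Split one log line into named fields, or return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        "E0330 19:01:16.746628681 10533 chttp2_transport.cc:2825] "
        "ipv4:10.50.40.102:50051: Keepalive watchdog fired. Closing transport."
    )
    print(parse_log_line(sample))
```

Read this way, `E0330` and `E0505` carry the same severity and differ only in the date, while the line number after `chttp2_transport.cc` differs between grpcio versions.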
------
I also tried the following configuration and still encountered the watchdog error.
Go (server)
```
// KeepalivePolicyMinTime The minimum time between keepalive pings. Client pings can not be more frequent than this.
// const KeepalivePolicyMinTime = 5 * time.Second
const KeepalivePolicyMinTime = 10 * time.Second
// KeepalivePolicyPermitWithoutStream Permit keepalive pings even with no inflight RPCs.
const KeepalivePolicyPermitWithoutStream = true
// KeepaliveParamTime If the server doesn't see any activity after this time, it
// pings the client to see if the transport is still alive.
// const KeepaliveParamTime = 5 * time.Second
const KeepaliveParamTime = 10 * time.Second
// KeepaliveParamTimeOut After having pinged for keepalive check, the server waits for this duration and
// the connection is closed if no activity is seen after that.
const KeepaliveParamTimeOut = 5 * time.Second
```
Python (client)
```
GRPC_CHANNEL_OPTIONS = [
('grpc.keepalive_time_ms', 10000),
('grpc.keepalive_timeout_ms', 5000),
('grpc.keepalive_permit_without_calls', True),
('grpc.http2.max_pings_without_data', 0),
('grpc.http2.min_time_between_pings_ms', 10000),
# ('grpc.http2.min_ping_interval_without_data_ms', 10000), # N/A on client side
('grpc.max_send_message_length', 2000 * 1024 * 1024),
('grpc.max_receive_message_length', 2000 * 1024 * 1024)]
```
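
To rule out one class of misconfiguration, the check below compares the client options with the server's enforcement policy. It is a sketch under two assumptions taken from the grpc-go keepalive documentation: the server rejects client pings sent more often than `MinTime`, and rejects pings on a connection with no active streams unless `PermitWithoutStream` is true. The function `find_policy_conflicts` is hypothetical and does not talk to gRPC.

```python
def find_policy_conflicts(client_options, server_min_time_ms, server_permit_without_stream):
    """Return human-readable conflicts between client keepalive options and server policy."""
    options = dict(client_options)
    conflicts = []

    keepalive_time_ms = options.get("grpc.keepalive_time_ms")
    if keepalive_time_ms is not None and keepalive_time_ms < server_min_time_ms:
        conflicts.append(
            f"client pings every {keepalive_time_ms} ms, "
            f"server MinTime is {server_min_time_ms} ms"
        )

    if options.get("grpc.keepalive_permit_without_calls") and not server_permit_without_stream:
        conflicts.append(
            "client pings without active calls, server PermitWithoutStream is false"
        )

    return conflicts


if __name__ == "__main__":
    options = [
        ("grpc.keepalive_time_ms", 10000),
        ("grpc.keepalive_timeout_ms", 5000),
        ("grpc.keepalive_permit_without_calls", True),
    ]
    # Server side from this report: MinTime = 10s, PermitWithoutStream = true
    print(find_policy_conflicts(options, 10000, True))
```

Both configurations in this report pass this check, so a policy mismatch of this kind does not seem to explain the watchdog error.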
1. Could you please help me understand the correct way to configure these options so that I do not encounter the `Keepalive watchdog fired. Closing transport` error? Thanks.
> GRPC_ARG_KEEPALIVE_TIME_MS
> This channel argument controls the period (in milliseconds) after which a keepalive ping is sent on the transport.
and
> GRPC_ARG_HTTP2_MIN_SENT_PING_INTERVAL_WITHOUT_DATA_MS
> If there are no data frames being received on the transport, this channel argument controls the minimum time (in milliseconds) gRPC Core will wait between successive pings.
Does GRPC_ARG_KEEPALIVE_TIME_MS send a keepalive ping regardless of whether there is data, provided GRPC_ARG_HTTP2_MAX_PINGS_WITHOUT_DATA = true?
Does that mean GRPC_ARG_HTTP2_MIN_SENT_PING_INTERVAL_WITHOUT_DATA_MS is only applicable when GRPC_ARG_HTTP2_MAX_PINGS_WITHOUT_DATA = false?
Thanks