Hi
I am running a number of source connectors (around 20) in a Connect cluster (3 nodes). All the connectors fetch records from Oracle tables and push them to different Kafka topics (different consumer groups) successfully.
Lately I have been getting a timeout error when trying to submit more connectors to the Connect cluster:
{
"error_code": 500,
"message": "Request timed out"
}
The worker log contains the following entry:
Member connect-1-ee4157cb-2dfe-4505-b38d-4dd7c4e300f8 sending LeaveGroup request to coordinator X.X.X.X:9093 (id: 2147483645 rack: null) due to consumer poll timeout has expired. This means the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time processing messages. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:979)
The timeout does not occur initially, when the number of connectors in the cluster is small. It appears once the load on the Connect cluster becomes substantially high (around 20 connectors).
#600 Secs
#KAFKA-10122, 60 Secs
Any suggestions on this issue would be greatly appreciated. I would also like to know how I can derive a correct value for max.poll.interval.ms, and whether it is reasonable to increase max.poll.interval.ms to a higher value, e.g. 10 minutes.
The Kafka cluster (7 nodes) and the Connect cluster (3 nodes) are running on OEL 6.10 / Confluent 5.5.1-CE.
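For context, connectors are submitted with a POST to the /connectors endpoint of the Connect REST API. Below is a minimal sketch of such a request in Python. The host name, connector name, and config values are placeholders and not the real ones from this cluster, and 8083 is only the default REST port.

```python
import json
import urllib.error
import urllib.request

# Placeholder endpoint: hypothetical host, default Connect REST port.
CONNECT_URL = "http://connect-host:8083/connectors"


def build_submit_request(name, config, url=CONNECT_URL):
    """Build the POST request that registers a new connector."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(request, timeout_s=120):
    """Send the request and return (HTTP status, decoded JSON body).

    On an HTTP error the worker's JSON error body is returned instead,
    e.g. the error_code/message pair shown earlier in this post.
    """
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as resp:
            return resp.status, json.loads(resp.read())
    except urllib.error.HTTPError as err:
        return err.code, json.loads(err.read())
```

Calling submit(build_submit_request("oracle-source-21", {"tasks.max": "1"})) against a real worker would send the request. The name and the single config entry here are illustrative only, and a real source connector needs its full set of connector properties.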
Regards