Regular network connection failure of a Kafka broker in a 3 broker cluster in OpenShift.

Matt Mcewan

Apr 24, 2018, 1:05:46 AM
to Confluent Platform
For unknown reasons, often several times a week in both Production and Test, we lose the ability to communicate with one Kafka broker, and this message repeats in the log: WARN Connection to node nnnn could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)

Strangely, this in turn stops Kafka from working altogether (we cannot produce or consume).

Neither OpenShift nor Kafka itself detects that the broker is not working.

I am about to add a livenessProbe to the Pod's YAML so that the Pod is restarted when the probe command fails, but naturally we'd like to find the root cause.
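A minimal sketch of the kind of check such a probe command could run, assuming the broker listens on port 9092 and that Python is available in the container (neither is confirmed for this setup). The names `broker_port_open` and `main` are hypothetical. The function only tests that a TCP connection can be opened, so it may still pass when the broker is reachable locally but not from its peers.

```python
import socket


def broker_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution errors.
        return False


def main(argv: list) -> int:
    """Return 0 when the port is reachable, 1 otherwise (probe exit code)."""
    # Defaults are assumptions: localhost and Kafka's usual port 9092.
    host = argv[1] if len(argv) > 1 else "localhost"
    port = int(argv[2]) if len(argv) > 2 else 9092
    return 0 if broker_port_open(host, port) else 1
```

To use it as an exec probe command, the script would end with `sys.exit(main(sys.argv))` (after `import sys`), so that a non-zero exit status marks the probe as failed. Passing the Pod's own advertised hostname rather than localhost would also exercise name resolution.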

If I run curl against host:port from another broker or ZooKeeper node, I get a reply from all the other brokers and ZooKeeper nodes. Yet curl to the Kafka node that has "failed" returns "Could not resolve host ...", even though I can still go into OpenShift and open a terminal on that Pod.
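"Could not resolve host" means the hostname lookup failed before any connection was attempted, which is a different failure from a refused or timed-out connection. A rough sketch for telling the two apart from another Pod, assuming Python is available there; the function name `diagnose` and its return strings are made up for illustration:

```python
import socket


def diagnose(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify reachability as 'dns_failure', 'connect_failure' or 'ok'."""
    try:
        # Step 1: name resolution only, no connection yet.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"

    # Step 2: try each resolved address until one accepts a TCP connection.
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except OSError:
            continue
    return "connect_failure"
```

Running `diagnose(<broker-hostname>, <port>)` from each of the other brokers would show whether the failed node has dropped out of name resolution, as the curl error suggests, or resolves but does not accept connections.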

I don't know if this is a Kafka or OpenShift/Kubernetes issue.

If anyone else has had this and resolved it, I'd be grateful for some pointers.

Priyanka Reddy

Feb 1, 2019, 7:32:39 AM
to Confluent Platform
Hi,

Were you able to find a solution? I have the same problem now.