Hi all,
I'm a little confused about how to get our Kafka consumers to reliably connect to a broker cluster.
Let's assume the following:
3x broker+ZooKeeper nodes: broker01, broker02, broker03. The Kafka broker listens on port 9092 and ZooKeeper on port 2181
1 worker node (to be configured with 3 consumer tasks): worker01
Using 'bootstrap.servers=broker01:9092,broker02:9092,broker03:9092' is not ideal, as (I think) it requires all 3 brokers in the cluster to be available when the worker comes up. Probability-wise you seem to be better off specifying just a single node here: if a single node fails and you've specified all 3 nodes in the list, the worker is guaranteed to fail to start. If you specify a single node and a single node fails, there's only a 1 in 3 (33.33%) chance that it's the one you specified and therefore causes the failure.
Does anyone have any thoughts on what the optimum strategy is here? To me the current behaviour of 'fail if any of the bootstrap.servers is unavailable' is counterintuitive. I'd have expected it to try each in turn and succeed as long as any one of them is available, but I don't think that's what was happening during our testing.
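To spell out the arithmetic I have in mind, here is a small Python sketch. It does not talk to Kafka at all; it only counts outcomes under two assumed startup rules. The function name `startup_failure_probability` and the `require_all` flag are made up for illustration: `require_all=True` models the behaviour I think we saw (every listed server must be reachable), and `require_all=False` models the behaviour I expected (one reachable server is enough). Neither is a claim about what the client actually does.

```python
from fractions import Fraction
from itertools import combinations

# Broker addresses from the setup described above.
BROKERS = ["broker01:9092", "broker02:9092", "broker03:9092"]


def startup_failure_probability(listed, brokers, failed_count, require_all):
    """Probability that the worker fails to start when `failed_count`
    brokers, chosen uniformly at random, are down.

    require_all=True  -> assumed rule: every listed server must be reachable.
    require_all=False -> assumed rule: one reachable listed server is enough.
    """
    outcomes = list(combinations(brokers, failed_count))
    failures = 0
    for down in outcomes:
        reachable = [b for b in listed if b not in down]
        if require_all:
            started = len(reachable) == len(listed)
        else:
            started = len(reachable) > 0
        if not started:
            failures += 1
    return Fraction(failures, len(outcomes))


# One broker down, all three listed, "all must be up" rule.
print(startup_failure_probability(BROKERS, BROKERS, 1, require_all=True))   # 1
# One broker down, only broker01 listed (same under either rule).
print(startup_failure_probability(BROKERS[:1], BROKERS, 1, require_all=True))  # 1/3
# One broker down, all three listed, "any one is enough" rule.
print(startup_failure_probability(BROKERS, BROKERS, 1, require_all=False))  # 0
```

The first two results are the 100% and 33.33% figures above. The third shows why the "try each in turn" behaviour would make the full list the better choice: with all 3 nodes listed, a single node failure would never stop the worker from starting.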
Thanks for any clarification you can provide.
Kind Regards,
Matt