Kafka Producer failover between different Kafka clusters

CMC

Mar 22, 2017, 4:58:21 AM
to Confluent Platform
Hi

I'm trying to understand the best options for achieving Kafka producer failover between different Kafka clusters.

My enterprise has two separate Kafka clusters in two separate datacenters.

The producers in DC1 typically connect to brokers in DC1, and producers in DC2 to brokers in DC2.

There might be circumstances when the producers in DC1 need to connect to brokers in DC2. The producer connection string is usually just a list of brokers in DC1, so I'd welcome any views on the best way to switch it to DC2.

Some ideas are:

1) Stopping the producers in DC1, changing the broker list in the configuration files to the DC2 brokers, and restarting. A bit clunky.

2) Using DNS labels for the brokers, then stopping the producers, repointing the DNS labels at the DC2 brokers, and restarting. Similar to the above, but it means a DNS change rather than a producer-level config change.

3) Listing all DC1 and DC2 brokers in the producer connection string at the same time. I have never tested this. Would a producer connect to DC1 if DC1 is listed first, and then automatically connect to DC2 if no DC1 broker is available? This option seems a bit random to me, i.e. I'm not entirely sure which cluster you'd connect to, and it isn't easy to control.

4) Using a load balancer VIP in front of the Kafka brokers. I'm not sure how common this connection method is. It seems to remove some of the resilience/load balancing that the producer connection string itself provides. It would rely on the LB preferring the DC1 brokers and only pointing at the DC2 brokers when required. I suspect the producers would still need to be restarted if there was much of a delay while the LB moved the VIP between DC1 and DC2.
Also with this option, I presume that once the initial connection is made to the first broker and the producer learns the topology of the cluster, it would then connect directly to the broker nodes? (i.e. the traffic wouldn't flow via the load balancer but would go directly between the producer and the leader broker)
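
To make the trade-off between options 1 and 3 more concrete, here is a minimal sketch (Python, standard library only) of a more controllable alternative to one mixed broker list: keep a separate list per cluster, probe them in preference order, and hand the producer only the list of the first cluster that answers. The hostnames and the names `tcp_reachable` and `choose_bootstrap` are made up for illustration, and treating a successful TCP connect as "cluster available" is an assumption, not a real health check.

```python
import socket
from typing import Callable, List, Sequence, Tuple

# Hypothetical broker addresses; substitute the real DC1/DC2 brokers.
DC1_BROKERS = ["dc1-kafka1.example.com:9092", "dc1-kafka2.example.com:9092"]
DC2_BROKERS = ["dc2-kafka1.example.com:9092", "dc2-kafka2.example.com:9092"]


def tcp_reachable(address: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to 'host:port' succeeds within timeout."""
    host, _, port = address.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def choose_bootstrap(
    clusters: Sequence[Tuple[str, List[str]]],
    probe: Callable[[str], bool] = tcp_reachable,
) -> Tuple[str, str]:
    """Pick the first cluster (in preference order) with a reachable broker.

    Returns (cluster_name, comma-separated broker list). Brokers from
    different clusters are never mixed in the returned string.
    """
    for name, brokers in clusters:
        if any(probe(broker) for broker in brokers):
            return name, ",".join(brokers)
    raise RuntimeError("no Kafka cluster reachable")
```

The returned string would be passed as the producer's `bootstrap.servers` setting at startup, so which cluster gets used is an explicit, ordered decision. It does not remove the restart: a producer that is already running would still have to be recreated to move to the other cluster.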
  

I'm aware it's probably best to keep producers and brokers within the same site, but these sites are connected by low-latency, high-bandwidth links, so that should not be an issue.

So all in all, I don't really see a good option. Option 2 is probably the best of this bunch. Are there any other practical ideas out there?
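
For reference, option 2 might look roughly like the sketch below (Python, standard library only). The alias `kafka-bootstrap.example.com` and the helper names are hypothetical. The producer configuration names only the alias, and a small check compares what the alias resolves to now against what it resolved to when the producer started, which could be used to decide when a restart is due.

```python
import socket
from typing import Callable, Dict, Iterable, List

# Hypothetical DNS alias that would be repointed from DC1 to DC2 brokers.
BOOTSTRAP_ALIAS = "kafka-bootstrap.example.com"


def resolve_ips(hostname: str, port: int = 9092) -> List[str]:
    """Return the sorted, de-duplicated IPs the name currently resolves to."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def producer_config(alias: str = BOOTSTRAP_ALIAS, port: int = 9092) -> Dict[str, str]:
    """Producer settings that name only the alias, never a DC-specific broker."""
    return {"bootstrap.servers": f"{alias}:{port}"}


def alias_moved(
    previous_ips: Iterable[str],
    alias: str = BOOTSTRAP_ALIAS,
    resolver: Callable[[str], List[str]] = resolve_ips,
) -> bool:
    """True when the alias now resolves to a different set of addresses."""
    return set(resolver(alias)) != set(previous_ips)
```

Only the bootstrap step goes through the alias. After the first metadata response the client talks to the brokers at the addresses they advertise, so those addresses also need to be reachable from the other datacenter.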

Thanks