Hi,
Reading through the Kafka Connect documentation, I'm wondering what the difference is, in terms of consumer groups, between standalone and distributed mode.
It seems that to run multiple consuming processes on different hosts the distributed setup is recommended.
But it looks like standalone mode creates a Kafka consumer group anyway, so if you run multiple standalone instances, doesn't that let you scale in the same way?
Each standalone process would consume from one or more partitions, right?
Also, in distributed mode, can multiple workers or tasks consume from the same partition?
Thanks
--
You received this message because you are subscribed to the Google Groups "Confluent Platform" group.
To unsubscribe from this group and stop receiving emails from it, send an email to confluent-platf...@googlegroups.com.
To post to this group, send email to confluent...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/confluent-platform/F6A8DEE6B40A30419B89834C7CFF43F4032DCD00%40gsdgeup01env2.firmwide.corp.gs.com.
For more options, visit https://groups.google.com/d/optout.
Thanks!
In my case I have a single, simple task that I want to parallelize by leveraging partitions. The instances don't have to share state or do anything on rebalance or failover, so I guess running as many standalone instances as there are partitions does the job.
The underlying consumer API, together with the consumer group, should coordinate which consumer gets data from which partition.
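
To make the idea concrete, here is a toy sketch in Python of what I expect the group coordination to do. It is only an illustration: the function assign_partitions and the instance names are made up, and the round-robin spread is a simplification, not Kafka's actual assignor. The one property it is meant to show is that within a consumer group each partition has exactly one owner at a time, and that partitions get redistributed when an instance goes away.

```python
def assign_partitions(partitions, consumers):
    """Toy model: give every partition exactly one owner within the group.

    Hypothetical helper for illustration only; real Kafka assignors differ.
    """
    members = sorted(consumers)
    if not members:
        return {}
    assignment = {member: [] for member in members}
    # Spread partitions over the members in round-robin order.
    for index, partition in enumerate(sorted(partitions)):
        owner = members[index % len(members)]
        assignment[owner].append(partition)
    return assignment


# Two standalone instances sharing one group over a 4-partition topic.
before = assign_partitions(range(4), ["standalone-1", "standalone-2"])
# One instance stops; the survivor takes over all partitions.
after = assign_partitions(range(4), ["standalone-1"])

print(before)
print(after)
```

If that model holds, running more instances than there are partitions would just leave the extra instances idle.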