One Kafka cluster limitation for Kafka Streams applications

Anca Sarb

Apr 7, 2016, 8:50:18 AM
to Confluent Platform
Hi all,

The developer guide for Kafka Streams (http://docs.confluent.io/2.1.0-alpha1/streams/developer-guide.html) states that a Kafka Streams application can currently only talk to a single Kafka cluster, but it also mentions that in the future, Kafka Streams will support connecting to different Kafka clusters for reading input streams and/or writing output streams. Will this feature be included in the 0.10.0 open source release, which is targeted for mid-April?

Thank you for the good work on Kafka Streams.

Anca

Guozhang Wang

Apr 7, 2016, 9:36:36 PM
to Confluent Platform
Anca,

We do not yet have plans to include multi-cluster support in the first release of Kafka Streams in 0.10.0.0.

Would you like to share your use case for stream processing across clusters, so I can better understand the motivation behind this feature request?

Guozhang

Anca Sarb

Apr 8, 2016, 4:20:23 AM
to Confluent Platform
Hi Guozhang,

Thanks for your reply. 

Regarding our use case: we're trying to synchronize two internal systems (each with its own Kafka cluster) by consuming records published by our system on one Kafka cluster and publishing a consolidated message to the Kafka cluster of the downstream system.

Do you have any suggestions on how best to go about achieving this? If possible, we'd still like to make use of the Kafka Streams library, as it's quite neat! Or do you recommend using the KafkaProducer/KafkaConsumer APIs instead?

Anca

Guozhang Wang

Apr 8, 2016, 3:16:33 PM
to Confluent Platform
How complex is the consolidation process?

One thing you can do, though, is use the customizable process() call in Kafka Streams: after the consolidation, an embedded producer client inside your processor can send the result to the different destination cluster. When Kafka Streams adds multi-cluster support, you can simply get rid of that customized processor at the end of your topology.
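For illustration only (a rough, untested sketch against the 0.10.0 Processor API; the topic name and the destination bootstrap address are made up), such a processor could look roughly like this:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;

// Custom processor that forwards each consolidated record to a second,
// "destination" cluster via its own embedded producer.
public class CrossClusterForwarder implements Processor<String, String> {

    private KafkaProducer<String, String> producer;

    @Override
    public void init(ProcessorContext context) {
        Properties props = new Properties();
        // bootstrap servers of the *destination* cluster (made-up address)
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "destination-cluster:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        producer = new KafkaProducer<>(props);
    }

    @Override
    public void process(String key, String consolidatedValue) {
        // consolidation happens upstream in the topology; here we only forward
        producer.send(new ProducerRecord<>("downstream-topic", key, consolidatedValue));
    }

    @Override
    public void punctuate(long timestamp) {
        // nothing to do periodically
    }

    @Override
    public void close() {
        producer.close();
    }
}

You would attach it at the end of the topology, e.g. via KStream#process(() -> new CrossClusterForwarder()), and replace it with a regular sink once multi-cluster support lands.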


Guozhang

Saravanan Tirugnanum

Apr 21, 2017, 8:20:16 AM
to Confluent Platform
Hi Guozhang

Is this multi-cluster support feature available in 0.10.2.0, or do we still need to write a custom producer client and add it to a processor?
In that case, we will not have a sink at all in our topology builder. I hope that's fine.

Regards
Saravanan

Eno Thereska

Apr 21, 2017, 11:25:45 AM
to Confluent Platform
The multi-cluster feature is not yet available in 0.10.2.0, and it's unlikely it will be available in 0.11 either (in three or so months). It's on our radar though, and it sounds useful. We could use some help from the community if anyone is interested in picking this up.

Thanks
Eno

Saravanan Tirugnanum

Apr 26, 2017, 2:17:14 PM
to Confluent Platform
Thanks Eno. I am happy to help contribute to this. Could you please guide me on how to start?

Eno Thereska

Apr 27, 2017, 12:42:41 PM
to Confluent Platform
I think this will require what we call a KIP: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals. It's a form of design proposal that you put forth and on which the community then provides feedback. We can provide some guidance too if needed.

Thanks, this is much appreciated,
Eno

m.mohamed....@gmail.com

Dec 31, 2017, 7:50:28 AM
to Confluent Platform
Hi,

I have a great need for the multi-cluster feature.
Could you please tell me whether it is supported in version 1.0.0?

Thanks
Mohamed

Matthias J. Sax

Dec 31, 2017, 1:06:41 PM
to confluent...@googlegroups.com
This feature is not supported yet, and there is no concrete roadmap for it at the moment either.

It's recommended to write the output to a topic in the source cluster and replicate the data into the target cluster (a sketch follows after the notes).

Note:
- the output topic in the source cluster can have quite a short retention time, as it is only used as an intermediate "buffer" while the data is safely stored with a larger retention time in the target cluster; thus, the storage overhead can be kept small
- for replicating the data you can use MirrorMaker, which ships with Apache Kafka (or other third-party tools for cross-cluster replication)
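For illustration (this sketch is not part of the original recommendation; the topic names, application id, and broker address are made up, and it assumes the 1.0 Streams API), the Streams application itself then only ever talks to the source cluster:

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class ConsolidationApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "consolidation-app");
        // only the source cluster is configured; the app never sees the target cluster
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "source-cluster:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic")
               // ... consolidation logic goes here ...
               .to("consolidated-output");  // short-retention "buffer" topic in the source cluster

        new KafkaStreams(builder.build(), props).start();
    }
}

MirrorMaker (or another replication tool) then runs separately and mirrors "consolidated-output" from the source cluster into the target cluster.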


Hope this helps.


-Matthias

m.mohamed....@gmail.com

Apr 26, 2018, 2:45:44 PM
to Confluent Platform
Thank you Matthias

I tried to write the output to a topic in the source cluster and replicate the data into the target cluster using MirrorMaker. However, I get this error message:
[2018-04-26 17:42:22,003] ERROR Error when sending message to topic output-1-func003 with key: 9 bytes, value: 8 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for output-1-func003-0: 30013 ms has passed since batch creation plus linger time
I get this error 30 seconds after launching a program that uses Kafka Streams to produce messages to the topic output-1-func003. The message is a long number which is sent every 5 seconds. After googling the error, I understood that the sending frequency may be the cause. So, as recommended, I changed the "linger.ms" and "batch.size" configuration of the MirrorMaker producer. However, this didn't solve the problem.


This is the command I use to launch MirrorMaker:
bin/kafka-mirror-maker.sh \
--consumer.config config/sourceClusterConsumer.config \
--producer.config config/targetClusterProducer.config \
--whitelist=output-1-func003
This is the content of my sourceClusterConsumer.config
bootstrap.servers=localhost:9092
client.id=func003.Consumer
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
value.deserializer=org.apache.kafka.common.serialization.LongDeserializer
group.id=func003.Consumer
partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
This is the content of my targetClusterProducer.config
bootstrap.servers=192.168.10.5:9092
client.id=func003.Producer
key.serializer=org.apache.kafka.common.serialization.StringDeserializer
value.serializer=org.apache.kafka.common.serialization.LongDeserializer
compression.type=lz4
batch.size=65536


Could you please help me?

Thanks
Mohamed

Matthias J. Sax

Apr 30, 2018, 5:35:53 AM
to confluent...@googlegroups.com
You might be hitting a bug that is addressed by KIP-91.

As a workaround, try increasing the parameter `request.timeout.ms` (the default is 30 seconds).
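For example (the value here is arbitrary, just for illustration), you could add a line like the following to targetClusterProducer.config:

request.timeout.ms=120000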


-Matthias