Kafka Streams: ordering records after rekeying

peter...@gmail.com

Feb 7, 2017, 5:49:30 PM
to Confluent Platform
Hi,

We have a Kafka Streams (0.10.0.1) application that consumes keyless messages. Unfortunately, this can't be changed.
After repartitioning the messages with KStream#selectKey, we'd like messages with the same key to appear within a partition
in exactly the same order in which they hit the input topic.
This is important because it affects the order of the updates we apply to the record in reduceByKey.

From what I have read, the order is no longer guaranteed after applying selectKey. Is there any workaround or solution to this
problem other than having the right keys already in the source topic we read from?

Thank you,
Peter

Matthias J. Sax

Feb 7, 2017, 8:51:36 PM
to confluent...@googlegroups.com
There is no support offered by Streams... It's a general Kafka design
decision not to guarantee ordering between partitions; thus, you
don't have an ordering guarantee even in your original non-keyed topic.
And because the data is already out-of-order, Streams cannot fix this.
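
To illustrate the point above, here is a minimal, hedged simulation in plain Java (no Kafka client involved). It assumes the keyless producer spreads records round-robin over two partitions and that the reader happens to drain partition 0 before partition 1; the class and method names are made up for this sketch. Three updates to the same logical entity then come back in a different order than they were sent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simulation only: models a keyless producer that spreads records round-robin
// over partitions, and a reader that drains the partitions one after another
// (just one of many possible interleavings).
class KeylessOrderingDemo {

    // Distribute records round-robin across numPartitions in-memory lists.
    static List<List<String>> roundRobin(List<String> records, int numPartitions) {
        List<List<String>> partitions = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            partitions.add(new ArrayList<>());
        }
        for (int i = 0; i < records.size(); i++) {
            partitions.get(i % numPartitions).add(records.get(i));
        }
        return partitions;
    }

    // Read partition 0 completely, then partition 1, and so on.
    static List<String> drainSequentially(List<List<String>> partitions) {
        List<String> seen = new ArrayList<>();
        for (List<String> partition : partitions) {
            seen.addAll(partition);
        }
        return seen;
    }

    public static void main(String[] args) {
        // Three updates to the same logical entity, sent in this order.
        List<String> sent = Arrays.asList("user1:v1", "user1:v2", "user1:v3");
        List<List<String>> partitions = roundRobin(sent, 2);
        System.out.println("partitions = " + partitions);
        System.out.println("read order = " + drainSequentially(partitions));
    }
}
```

Order holds inside each partition, but nothing relates a record in one partition to a record in the other, so a later re-keying step has no way to recover the original send order.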

You could build a custom "ordering" processor with Streams, though -- but
you still need to know what your ordering criteria will be in the first
place. If you get two records with the same key from two different
partitions, how do you decide which one was first?
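
As a rough sketch of what such an "ordering" step could look like, the plain-Java buffer below assumes an ordering criterion that the thread does not provide: the producer stamps every record with a gap-free, per-key sequence number starting at 0. The class `SequenceReorderBuffer` and its `accept` method are hypothetical and not part of the Kafka Streams API; in a real application the same logic would have to live inside a custom processor with a state store.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical reordering buffer; not part of the Kafka Streams API.
// Assumption: every record carries a gap-free, per-key sequence number
// starting at 0. That number is the ordering criterion.
class SequenceReorderBuffer {

    // Next sequence number that may be emitted, per key.
    private final Map<String, Long> nextExpected = new HashMap<>();
    // Records that arrived too early, per key, sorted by sequence number.
    private final Map<String, TreeMap<Long, String>> pending = new HashMap<>();

    // Accept one record and return every value that can now be emitted in order.
    public List<String> accept(String key, long sequence, String value) {
        TreeMap<Long, String> buffered = pending.computeIfAbsent(key, k -> new TreeMap<>());
        long expected = nextExpected.getOrDefault(key, 0L);
        List<String> ready = new ArrayList<>();
        if (sequence < expected) {
            return ready; // already emitted (duplicate), so drop it
        }
        buffered.put(sequence, value);
        // Emit the longest gap-free run that starts at the expected number.
        while (!buffered.isEmpty() && buffered.firstKey() == expected) {
            ready.add(buffered.pollFirstEntry().getValue());
            expected++;
        }
        nextExpected.put(key, expected);
        return ready;
    }

    public static void main(String[] args) {
        SequenceReorderBuffer buffer = new SequenceReorderBuffer();
        System.out.println(buffer.accept("user1", 0, "v1")); // in order, emitted at once
        System.out.println(buffer.accept("user1", 2, "v3")); // too early, held back
        System.out.println(buffer.accept("user1", 1, "v2")); // fills the gap, releases both
    }
}
```

Note that this sketch buffers without bound if a sequence number never arrives; a real processor would need a timeout or size limit, and without some producer-side criterion like this there is nothing to sort by.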


-Matthias

peter...@gmail.com

Feb 8, 2017, 2:20:21 AM
to Confluent Platform
I see, thanks.
When the records are initially produced with the keys and later there's no change in the keys, then is the order of the messages
preserved when they are processed in reduceByKey? Can you please confirm this?
However, when records are reshuffled after selectKey, the order (the order based on the offsets) won't be kept anymore.
Did I understand correctly?

Peter

Matthias J. Sax

Feb 8, 2017, 12:27:38 PM
to confluent...@googlegroups.com
Yes. That's correct.

I put some comments inline.


-Matthias

On 2/7/17 11:20 PM, peter...@gmail.com wrote:
> I see, thanks.
> When the records are initially produced with the keys and later there's
> no change in the keys, then is the order of the messages
> preserved when they are processed in reduceByKey? Can you please confirm
> this?

Yes. As no re-partitioning happens, order is preserved.
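
A small plain-Java simulation of why this holds (no Kafka client involved): when the partition is derived from the key, all records for one key land in the same partition in send order. `String.hashCode()` is only a stand-in here for the hash a real partitioner computes over the serialized key, and the class and method names are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simulation only: String.hashCode() stands in for the hash that a real
// partitioner computes over the serialized key.
class KeyedOrderingDemo {

    // Map a key to a partition; the same key always yields the same partition.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Append each {key, value} record to the partition chosen by its key.
    static List<List<String[]>> partitionByKey(List<String[]> records, int numPartitions) {
        List<List<String[]>> partitions = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            partitions.add(new ArrayList<>());
        }
        for (String[] record : records) {
            partitions.get(partitionFor(record[0], numPartitions)).add(record);
        }
        return partitions;
    }

    // Collect the values for one key in the order its partition stores them.
    static List<String> valuesFor(String key, List<List<String[]>> partitions) {
        List<String> values = new ArrayList<>();
        for (String[] record : partitions.get(partitionFor(key, partitions.size()))) {
            if (record[0].equals(key)) {
                values.add(record[1]);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Interleaved updates for two keys, in send order.
        List<String[]> sent = Arrays.asList(
            new String[] {"a", "a1"}, new String[] {"b", "b1"},
            new String[] {"a", "a2"}, new String[] {"b", "b2"},
            new String[] {"a", "a3"});
        List<List<String[]>> partitions = partitionByKey(sent, 2);
        System.out.println(valuesFor("a", partitions)); // [a1, a2, a3]
        System.out.println(valuesFor("b", partitions)); // [b1, b2]
    }
}
```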

> However, when records are reshuffled after selectKey, the order (the
> order based on the offsets) won't be kept anymore.

Yes. If you set a new key, it is basically the case that your data is
already out-of-order (with respect to the new key) in the original stream.
And thus, the re-partitioning step can't do anything to "fix" the order.

peter...@gmail.com

Feb 8, 2017, 3:07:49 PM
to Confluent Platform
Thank you Matthias for the clarification!

--Peter