to Confluent Platform
Hi,
We have a Kafka Streams (0.10.0.1) application that consumes keyless messages. Unfortunately, this can't be changed.
After repartitioning the messages with KStream#selectKey, we'd like the messages within one
partition (for the same key) to keep exactly the order in which they hit the input topic.
This is important because it affects the order of the updates we apply to the record in reduceByKey.
As far as I have read, the order is no longer guaranteed after applying selectKey. Is there any workaround/solution to this
problem other than having the right keys already in the source topic we read from?
Thank you,
Peter
Matthias J. Sax
Feb 7, 2017, 8:51:36 PM
to confluent...@googlegroups.com
Streams offers no support for this. It is a general Kafka design
decision not to guarantee ordering across partitions; thus, you
don't have an ordering guarantee even in your original non-keyed topic.
Because the data is already out of order, Streams cannot fix this.
You could build a custom "ordering" processor with Streams, but
you would still need to know what your ordering criterion is in the first
place. If you get two records with the same key from two different
partitions, how do you decide which one was first?
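To make the "ordering" processor idea concrete, here is a minimal sketch in plain Java, not tied to any Streams API. It assumes the producer stamps every record with a sequence number that serves as the ordering criterion; the names OrderingBuffer, Event, sequence and capacity are hypothetical and only for illustration. Records are held in a bounded buffer and released smallest sequence first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical reordering buffer; the ordering criterion (sequence) is an assumption.
final class OrderingBuffer {

    // Hypothetical record shape: the producer must supply the sequence number.
    static final class Event {
        final String key;
        final long sequence;
        final String payload;

        Event(String key, long sequence, String payload) {
            this.key = key;
            this.sequence = sequence;
            this.payload = payload;
        }
    }

    // Min-heap on the sequence number, so poll() returns the earliest buffered record.
    private final PriorityQueue<Event> buffer =
            new PriorityQueue<>(Comparator.comparingLong((Event e) -> e.sequence));
    private final int capacity;

    OrderingBuffer(int capacity) {
        this.capacity = capacity;
    }

    // Buffers the record; once more than `capacity` records are held,
    // releases the ones with the smallest sequence numbers.
    List<Event> add(Event event) {
        buffer.add(event);
        List<Event> released = new ArrayList<>();
        while (buffer.size() > capacity) {
            released.add(buffer.poll());
        }
        return released;
    }

    // Releases everything still buffered, in sequence order.
    List<Event> flush() {
        List<Event> released = new ArrayList<>();
        while (!buffer.isEmpty()) {
            released.add(buffer.poll());
        }
        return released;
    }
}
```

The buffer can only repair disorder smaller than its capacity: a record that arrives after later records have already been released is still emitted late. Without an ordering criterion carried in the data itself, there is nothing for such a buffer to sort on.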
to Confluent Platform
I see, thanks.
When the records are initially produced with the keys and later there's no change in the keys, then is the order of the messages
preserved when they are processed in reduceByKey? Can you please confirm this?
However, when records are reshuffled after selectKey, the order (the order based on the offsets) won't be kept anymore.
Did I understand correctly?
Peter
Matthias J. Sax
Feb 8, 2017, 12:27:38 PM
to confluent...@googlegroups.com
Yes. That's correct.
Put some comments inline.
-Matthias
On 2/7/17 11:20 PM, peter...@gmail.com wrote:
> I see, thanks.
> When the records are initially produced with the keys and later there's
> no change in the keys, then is the order of the messages
> preserved when they are processed in reduceByKey? Can you please confirm
> this?
Yes. As no re-partitioning happens, order is preserved.
> However, when records are reshuffled after selectKey, the order (the
> order based on the offsets) won't be kept anymore.
Yes. If you set a new key, your data is basically already
out-of-order (with respect to the new key) in the original stream.
Thus, the re-partitioning step can't do anything to "fix" the order.
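The point about data being out-of-order with respect to the new key can be illustrated with a small plain-Java simulation (no Kafka involved). It assumes two input partitions whose records all map to the same new key, and models two possible ways a consumer might interleave reads from them; RepartitionOrderDemo and interleave are hypothetical names for this sketch only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of reading two partitions before writing to a repartition topic.
final class RepartitionOrderDemo {

    // Merges two partitions by alternating reads, starting with p0 or p1.
    // Each call models one possible interleaving; order inside each
    // partition is always preserved, order across partitions is not.
    static List<String> interleave(List<String> p0, List<String> p1, boolean startWithP0) {
        List<String> merged = new ArrayList<>();
        int i = 0;
        int j = 0;
        boolean takeP0 = startWithP0;
        while (i < p0.size() || j < p1.size()) {
            if ((takeP0 && i < p0.size()) || j >= p1.size()) {
                merged.add(p0.get(i++));
            } else {
                merged.add(p1.get(j++));
            }
            takeP0 = !takeP0;
        }
        return merged;
    }
}
```

For example, if partition 0 holds "a1", "a3" and partition 1 holds "a2", "a4" (the digit being the order in which the records were produced), starting with partition 0 yields a1, a2, a3, a4, while starting with partition 1 yields a2, a1, a4, a3. Both results respect the per-partition order, and nothing in the offsets tells the re-partitioning step which one is "right".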