Hey everyone,
While investigating an issue we ran into, I found out that the Kafka message keys written by the Ingress are not hashed. They are basically a concatenation of the class and the labels. This means that all points related to a class go into the same partition / group of partitions.
Of course, this is fine in most cases.
However, when the data of a specific class is significantly bigger than the rest of the data (big strings versus ints / floats, for example), this can become an issue: some store threads end up with lots of work to do while others have very little.
Besides performance concerns, do you think hashing the key could cause problems? It would spread the messages of any class over all partitions while still keeping the order guarantee for a given GTS. Maybe this could be offered as an option?
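To make the idea concrete, here is a small toy model in Python. It is only a sketch and not the actual Ingress or Kafka partitioner code: the partition count, the key layout, the MD5-based hash and the function names (`partition_by_class`, `partition_by_hashed_key`) are all assumptions I made up for illustration. It compares a placement driven by the class alone with a placement driven by a hash of the full key (class + labels).

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical partition count, not taken from any real setup


def stable_hash(data: bytes) -> int:
    """Deterministic hash (MD5 here, purely for illustration)."""
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")


def gts_key(class_name: str, labels: dict) -> bytes:
    """Toy key: class name followed by the sorted labels."""
    label_part = ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
    return f"{class_name}{{{label_part}}}".encode()


def partition_by_class(class_name: str, labels: dict) -> int:
    """Toy model of the observed behaviour: the class alone picks the partition."""
    return stable_hash(class_name.encode()) % NUM_PARTITIONS


def partition_by_hashed_key(class_name: str, labels: dict) -> int:
    """Proposed behaviour: hash the whole key, so each GTS gets its own placement."""
    return stable_hash(gts_key(class_name, labels)) % NUM_PARTITIONS


# 1000 series of one "heavy" class, differing only by a label.
series = [("big.strings", {"host": f"h{i}"}) for i in range(1000)]

by_class = Counter(partition_by_class(c, l) for c, l in series)
by_hash = Counter(partition_by_hashed_key(c, l) for c, l in series)

print("partitions used, class-based:", len(by_class))
print("partitions used, hashed key: ", len(by_hash))
```

In this model the class-based placement puts every series of the heavy class on a single partition, while the hashed key spreads them over the partitions. A given GTS (same class and same labels) always produces the same key and therefore the same partition, which is why the per-GTS ordering would be preserved.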
Thanks for your input!