druid.broker.cache.unCacheable=[] |
--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+...@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/3ebd3103-2237-493d-8efb-4c5dd17274f1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
The following shows the segment scan times we observed in production. As you can see, the uneven distribution of segment scan times across nodes leads to some nodes having scan times of up to 5 or even 10 seconds. We haven't updated our production system with the new memory settings yet, but in our test systems we observed, as shown above, that the likely cause is this: after a node has been in operation for a while, it no longer has enough heap to fully utilize all of its cores, hence the uneven distribution in scan times.
With the right heap settings, the oscillating scan times in the first graph above would reduce to something much nicer:
Below you see a sequence of six different types of queries sent sequentially: x queries of type 1, then x queries of type 2, and so on.
The results show that complex metrics (with filter expressions) and numeric sort order put the most stress on memory and GC and lead to performance penalties. The implementation of numeric sorting looks quite wasteful: it receives string objects and instantiates BigDecimal objects from them just to determine which of the two is numerically smaller.
The leftmost scan times are for alphanumeric sorting, the rightmost are for numeric sorting with an otherwise identical query type. As one can see, segment scan times with numeric sorting are four times as long, so there is huge potential for speed improvements if someone were to tune the numeric sort routines in Druid.
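To make the allocation pattern concrete, here is a minimal sketch of the two approaches. This is not Druid's actual code: the class name NumericSortSketch, the comparator PARSE_PER_COMPARE and the method sortParsingOnce are all hypothetical names made up for illustration. The first variant parses both strings into new BigDecimal objects on every comparison, as described above. The second parses each value exactly once and then sorts on the parsed keys, which is one possible way to reduce the garbage produced during a sort.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Druid.
public class NumericSortSketch {

    // Pattern described above: each comparison parses both strings again,
    // allocating two BigDecimal objects per call. A sort of n values makes
    // on the order of n*log(n) comparisons, so the garbage adds up quickly.
    static final Comparator<String> PARSE_PER_COMPARE =
            (a, b) -> new BigDecimal(a).compareTo(new BigDecimal(b));

    // Assumed alternative: parse every value once (n allocations in total),
    // sort an index array by the parsed keys, then map back to the strings.
    static List<String> sortParsingOnce(List<String> values) {
        int n = values.size();
        BigDecimal[] keys = new BigDecimal[n];
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            keys[i] = new BigDecimal(values.get(i));
            order[i] = i;
        }
        // Arrays.sort on object arrays is stable, so numerically equal
        // values keep their input order.
        Arrays.sort(order, (x, y) -> keys[x].compareTo(keys[y]));

        List<String> sorted = new ArrayList<>(n);
        for (int idx : order) {
            sorted.add(values.get(idx));
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("10", "9", "2.5", "100", "-1");

        List<String> perCompare = new ArrayList<>(values);
        perCompare.sort(PARSE_PER_COMPARE);

        System.out.println(perCompare);              // [-1, 2.5, 9, 10, 100]
        System.out.println(sortParsingOnce(values)); // [-1, 2.5, 9, 10, 100]
    }
}
```

Both variants produce the same ordering; they differ only in how many temporary objects they create. Whether this accounts for the whole slowdown we measured would need to be confirmed by profiling.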