Trying to understand why high cardinality queries are so slow.

MikeKulls

Feb 20, 2019, 7:37:26 PM
to OpenTSDB
It seems that this high cardinality issue is OpenTSDB's Achilles heel. If I have a metric with a large number of distinct tag values, then querying it becomes very slow. I'm trying to work out why, since that is the first step to solving the problem. My understanding of the issue is below; please let me know if this is incorrect. This assumes I have a metric with a single tag of username.

So the row key will be "metric_name", "time to nearest hour", "UserNameKey", "UserNameValue"
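To make the layout concrete, here is a minimal sketch of how such a row key could be assembled. It assumes OpenTSDB's default 3-byte UIDs, a 4-byte base timestamp rounded down to the hour, and no salting. The `uid` and `row_key` helpers and the UID numbers are made up for illustration; the real key holds UIDs assigned by OpenTSDB rather than the strings themselves.

```python
import struct

UID_WIDTH = 3  # assumed: default UID width in bytes


def uid(n: int) -> bytes:
    """Encode a numeric UID as a fixed-width big-endian byte string."""
    return n.to_bytes(UID_WIDTH, "big")


def row_key(metric_uid: int, timestamp: int, tags: dict) -> bytes:
    """Build metric + hour-aligned time + sorted (tagk, tagv) UID pairs."""
    base_time = timestamp - (timestamp % 3600)  # round down to the hour
    key = uid(metric_uid) + struct.pack(">I", base_time)
    for tagk, tagv in sorted(tags.items()):  # pairs ordered by tag key UID
        key += uid(tagk) + uid(tagv)
    return key
```

Two data points for the same user within the same hour produce the same row key, which is why the tag pair sits after the timestamp in the sort order.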

Because OpenTSDB doesn't know that every row contains only a tag of Username, it can't use that part of the row key in a get or scan request. So it simply does a scan based on the start and end time and filters the returned data manually. Hence it needs to look through all the data in the time range for that metric.
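A rough way to picture this is a sorted in-memory list standing in for the HBase table. This is only a sketch of my understanding, not OpenTSDB's code: `make_table` and `scan_and_filter` are hypothetical names, and the row keys are readable tuples instead of byte arrays. Whether the filtering happens in the TSD or inside HBase, the number that matters here is how many rows get read.

```python
from bisect import bisect_left

HOUR = 3600


def make_table(metric, users, start, hours):
    """One row per (hour, user), sorted the way HBase sorts row keys."""
    rows = [(metric, start + h * HOUR, "username", u)
            for h in range(hours) for u in users]
    return sorted(rows)


def scan_and_filter(table, metric, start, end, username):
    """Scan every row of the metric in [start, end), then filter by user."""
    lo = bisect_left(table, (metric, start))  # start row: metric + start time
    hi = bisect_left(table, (metric, end))    # stop row: metric + end time
    examined = table[lo:hi]
    matches = [row for row in examined if row[3] == username]
    return matches, len(examined)
```

With 1000 users and a two-hour range, this reads 2000 rows to return the 2 rows that belong to one user.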

Is my understanding correct?

Gabe Nydick

Feb 20, 2019, 9:19:47 PM
to MikeKulls, OpenTSDB
The meta table is used as an index to help inform queries. Also, if you have salting enabled, you will get better performance: very similar time series will be grouped together, but as they differ more they'll be spread across region servers, so you'll effectively get scatter-gather results.
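As a rough illustration of the salting idea: each series is assigned to a bucket from a hash of its metric and tags, and a query fans out one scan per bucket. The hash function (`zlib.crc32`), the bucket count of 20, and the helper names below are all assumptions chosen for the example, not OpenTSDB's actual implementation.

```python
import zlib

BUCKETS = 20  # assumed bucket count, for illustration only


def salt_bucket(metric: str, tags: dict, buckets: int = BUCKETS) -> int:
    """Pick a bucket from the series identity (metric + tags, no timestamp)."""
    series_id = metric + "".join(f"|{k}={v}" for k, v in sorted(tags.items()))
    return zlib.crc32(series_id.encode()) % buckets


def scanners_for_query(metric: str, start: int, end: int,
                       buckets: int = BUCKETS) -> list:
    """A query cannot know the bucket up front, so it scans all of them."""
    return [(bucket, metric, start, end) for bucket in range(buckets)]
```

Because the timestamp is not part of the hash, a given series always lands in the same bucket, while different series spread out across buckets and can be scanned in parallel.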

Gabe E. Nydick
Principal Infrastructure Eng.

MikeKulls

Feb 20, 2019, 9:50:37 PM
to OpenTSDB
So in the above example, would it do an indexed lookup on username, or is it basically scanning all data for that metric? From the testing I've done, it scans all data.

MikeKulls

Feb 21, 2019, 12:57:20 AM
to OpenTSDB
To explain further what I mean by "indexed lookup", suppose we have a row key like this:

'metric'-'time'-'username tag'-'username value'

Assuming I query my metric for a short time range and specify a filter on username, I can see 2 ways OpenTSDB could run the query:
1) It could do a scan in HBase specifying a start row of my metric and start time. From there it could scan the table looking for my user
or
2) It could do a get in HBase specifying a row key of metric, time, username tag and username value. It would then repeat this for each hour in the time range.

I presume it does no. 1, as otherwise it would be quicker.
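For comparison, option 2 could be sketched like this, with a plain dict standing in for HBase point gets. `get_per_hour` and the tuple-shaped keys are hypothetical, and it assumes username is the only tag on the metric, so the full row key can be built from the query alone.

```python
HOUR = 3600


def get_per_hour(rows_by_key, metric, start, end, username):
    """Issue one point lookup per hourly row instead of scanning the range."""
    first = start - start % HOUR  # align to the hourly row boundary
    found, lookups = [], 0
    for base_time in range(first, end, HOUR):
        lookups += 1
        key = (metric, base_time, "username", username)
        if key in rows_by_key:  # stands in for a single HBase get
            found.append(rows_by_key[key])
    return found, lookups
```

Here the work grows with the number of hours in the range rather than with the number of users stored under the metric.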

Gabe Nydick

Feb 21, 2019, 1:39:28 AM
to MikeKulls, OpenTSDB
The data is stored sorted and the tag key/value pairs are sorted in the row itself. It doesn't have to scan to find the data.

What you're describing isn't as inefficient as it sounds.

It can't do #2 because it doesn't know all of the other tags.
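A small sketch of why that matters, using readable tuples in place of the real byte keys. The `host` tag and the helper names are made up for the example, and sorting by tag name here is a simplification standing in for sorting by tag key UID.

```python
def row_key(metric, base_time, tags):
    """Row key as a tuple: metric, time, then tag pairs sorted by tag key."""
    key = (metric, base_time)
    for tagk in sorted(tags):
        key += (tagk, tags[tagk])
    return key


def is_prefix(prefix, key):
    """True if `key` starts with `prefix`."""
    return key[:len(prefix)] == prefix


# What is actually stored: the series carries a second tag the query omits.
stored = row_key("cpu", 0, {"username": "mike", "host": "web01"})
# What the query could build knowing only the username filter.
guess = row_key("cpu", 0, {"username": "mike"})
```

The guessed key is neither equal to the stored key nor a prefix of it, because the `host` pair sorts ahead of `username`. Only the metric and time portion is a reliable prefix, which is what a scan can use.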

MikeKulls

Feb 21, 2019, 9:25:59 PM
to OpenTSDB
I presume, though, that it does a fast lookup to find the metric and timestamp but then needs to do a scan to find the tags? Hence whether I want to return the data for 1 device or for 1000 devices, it needs to scan the data for all 1000 devices.

Gabe Nydick

Feb 22, 2019, 2:51:26 AM
to MikeKulls, OpenTSDB
It does, but the values are sorted and the meta table acts like an index.