[OpenTSDB]: Different query times for the same amount of points

Benjamin Vandenberghe

Oct 30, 2015, 12:40:45 PM

to OpenTSDB

Hello there.

First of all, thank you! We use OpenTSDB every day and it helps a lot.

We are running OpenTSDB 2.1.0.

Here is the situation:

A node is sending its cpu.pidle every minute to OpenTSDB with one tag, host.

The same node is also sending its network traffic (net.rxkb for instance) every minute with two tags, host and interface name, for each interface (8 on the node).

I first request cpu.pidle host=mynode over a month with 30-minute downsampling (around 42,000 raw datapoints), which takes about 5 seconds. That is fine.

Then I request net.rxkb host=mynode interface=eth0 over the same month with the same downsampling (same number of datapoints), which takes around 25 to 30 seconds!

I wonder where this huge difference comes from.

Is it from tag handling?

Is it because the node sends 8 datapoints for net.rxkb every minute versus 1 for cpu.pidle, making the "net.rxkb column" much bigger over a month?

Is there any way to solve this problem?

Cheers.

Benjamin

nha...@gmail.com

Nov 2, 2015, 3:40:54 AM

to OpenTSDB

> Is it from the fact that every minute, node is sending 8 datapoints for net.rxkb while 1 for cpu.pidle, thus making "net.rxkb column" way bigger on a month?

You just nailed the problem. Since the row key is a tuple of (metric, timestamp, tags), in that order, a query has to scan all rows for that metric in the specified time range, regardless of the tags you specified. With more data points stored for the metric over the same period, the query time increases accordingly.
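To make the row-key point concrete, here is a minimal Python sketch (a simulation, not OpenTSDB code). It assumes the default unsalted layout of metric UID, hour-aligned base timestamp, then sorted tag key/value UID pairs, with 3-byte UIDs. The UID values, the three hosts, and the one-day range are made up for illustration.

```python
import struct

UID_WIDTH = 3   # assumed default UID width in bytes
HOUR = 3600     # each row holds one hour of data for one series

# Made-up UIDs, for illustration only.
CPU_PIDLE, NET_RXKB = 1, 2   # metric UIDs
HOST, INTERFACE = 1, 2       # tag key UIDs


def uid(n):
    return n.to_bytes(UID_WIDTH, "big")


def row_key(metric, base_time, tags):
    """metric UID + hour-aligned base time + sorted (tagk, tagv) UID pairs."""
    key = uid(metric) + struct.pack(">I", base_time)
    for tagk, tagv in sorted(tags.items()):
        key += uid(tagk) + uid(tagv)
    return key


def build_table(hosts, interfaces, hours):
    """One cpu.pidle row per host/hour, one net.rxkb row per host/interface/hour."""
    rows = []
    for h in range(hours):
        base = h * HOUR
        for host in hosts:
            tags = {HOST: host}
            rows.append((row_key(CPU_PIDLE, base, tags), tags))
            for iface in interfaces:
                tags = {HOST: host, INTERFACE: iface}
                rows.append((row_key(NET_RXKB, base, tags), tags))
    return sorted(rows, key=lambda r: r[0])


def scan(rows, metric, start, end, wanted):
    """Return (rows scanned, rows matched) for one metric and time range."""
    lo = uid(metric) + struct.pack(">I", start)
    hi = uid(metric) + struct.pack(">I", end)
    # The key range can only be narrowed by metric and time, not by tags.
    scanned = [tags for key, tags in rows if lo <= key < hi]
    matched = [t for t in scanned
               if all(t.get(k) == v for k, v in wanted.items())]
    return len(scanned), len(matched)


table = build_table(hosts=[1, 2, 3], interfaces=range(1, 9), hours=24)
cpu = scan(table, CPU_PIDLE, 0, 24 * HOUR, {HOST: 1})
net = scan(table, NET_RXKB, 0, 24 * HOUR, {HOST: 1, INTERFACE: 1})
print("cpu.pidle scanned/matched:", cpu)
print("net.rxkb  scanned/matched:", net)
```

Both queries return the same 24 rows, but the net.rxkb scan walks 8 times as many rows to find them, because the tags come after the timestamp in the key.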

> Is there any way of solving this problem?

Move one of the tags into the metric name (e.g. mynode.net.rxkb). However, you will not be able to aggregate across different values of the chosen tag with a single query (e.g. in the aforementioned example, you won't be able to aggregate across different hosts with a single query), and it may cause an explosion in the number of metrics if the tag has many distinct values.
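As an illustration of that trade-off, the sketch below formats the same sample both ways using the telnet-style put line (put, metric, timestamp, value, tags). The helper names and the sample timestamp and value are hypothetical.

```python
def put_line(metric, timestamp, value, tags):
    """Format one telnet-style put line; tags are emitted in sorted order."""
    tag_str = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"put {metric} {timestamp} {value} {tag_str}"


def host_as_tag(host, iface, timestamp, value):
    # Original scheme: one metric shared by every host.
    return put_line("net.rxkb", timestamp, value,
                    {"host": host, "interface": iface})


def host_in_metric(host, iface, timestamp, value):
    # Suggested scheme: the host is part of the metric name, so a scan
    # for this metric only touches this host's rows.
    return put_line(f"{host}.net.rxkb", timestamp, value,
                    {"interface": iface})


print(host_as_tag("mynode", "eth0", 1446208845, 42))
print(host_in_metric("mynode", "eth0", 1446208845, 42))
```

With the second scheme each host gets its own metric, which is where the warning about the number of metrics comes from: 6,000 hosts would mean 6,000 metric names per measurement.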

Benjamin Vandenberghe

Nov 2, 2015, 5:27:53 AM

to OpenTSDB

Thank you for your response, nhahtdh.

Much appreciated.

ManOLamancha

Nov 10, 2015, 5:54:33 PM

to OpenTSDB

Also, if you don't mind starting with a new set of tables, you can enable salting with 2.2.0 and spread the high-cardinality data around a bit: http://opentsdb.net/docs/build/html/user_guide/writing.html#salting

The time taken also depends on the number of unique hosts you have.
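A rough Python sketch of the idea behind salting, under stated assumptions: the bucket is derived from a hash of the metric and tags (never the timestamp), so every row of one series stays in one bucket while the series of a metric spread across buckets. The CRC32 hash, the bucket count of 20, and the helper names are stand-ins for illustration, not OpenTSDB's actual implementation. In a real deployment the bucket count and salt width are set in the TSD configuration (the tsd.storage.salt.buckets and tsd.storage.salt.width properties, as far as the linked page describes them).

```python
import zlib

SALT_BUCKETS = 20  # example value only


def series_id(metric, tags):
    """Stable identifier built from the metric and its sorted tags."""
    pairs = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"{metric}{{{pairs}}}"


def salt_bucket(metric, tags, buckets=SALT_BUCKETS):
    """Stand-in hash: the same series always maps to the same bucket."""
    return zlib.crc32(series_id(metric, tags).encode("utf-8")) % buckets


def salted_prefix(metric, tags, buckets=SALT_BUCKETS):
    """One-byte salt that would be prepended to the row key."""
    return bytes([salt_bucket(metric, tags, buckets)])


def scanners_for_query(buckets=SALT_BUCKETS):
    # A read no longer maps to one contiguous key range: it needs one
    # scanner per bucket, and the results are merged afterwards.
    return buckets


hosts = [f"host{i:04d}" for i in range(6000)]
used = {salt_bucket("sys.cpu.pidle", {"host": h}) for h in hosts}
print("buckets used by sys.cpu.pidle:", len(used))
print("scanners per query:", scanners_for_query())
```

The design consequence is that writes and reads for a hot metric are spread over several regions, at the cost of fanning every query out to all buckets.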

Benjamin Vandenberghe

Apr 14, 2016, 8:48:49 AM

to OpenTSDB

Hi there!

Back to this discussion...

We've upgraded our cluster to 2.2 but did not enable salting, as we don't want to lose all our data, and migrating from one HBase cluster to another may be a bit tricky.

We still have the same problem as in my first post in this thread, and it is getting worse over time.

Now a query for sys.cpu.pidle with one tag (host) over a month with 30m downsampling takes 10 seconds (vs. 5 s five months ago).

We really need a solution for our cluster, which handles around 6,000 different hosts. Our constraint is that current data must remain available, even if we have to build a new HBase cluster.

A very interesting potential solution is explicit tags: http://opentsdb.net/docs/build/html/user_guide/query/index.html#explicit-tags

Can someone confirm that this fits our case, and that we will be able to enable this feature without resetting (or migrating) our data?
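For reference, a hedged sketch of what such a request could look like. It only builds the JSON body for a POST to /api/query and sends nothing. The field names (explicitTags, filters, literal_or) follow the HTTP API documentation as best understood here; the linked page describes explicit tags as a 2.3 feature, so check it against the version you run. The function name and default values are hypothetical.

```python
import json


def explicit_tags_query(metric, tags, start="30d-ago", downsample="30m-avg"):
    """Build a query body that lists every tag key of the series explicitly."""
    filters = [
        {"type": "literal_or", "tagk": k, "filter": v, "groupBy": False}
        for k, v in sorted(tags.items())
    ]
    return {
        "start": start,
        "queries": [
            {
                "aggregator": "sum",
                "metric": metric,
                "downsample": downsample,
                "filters": filters,
                # Assumption: only series with exactly these tag keys match.
                "explicitTags": True,
            }
        ],
    }


body = explicit_tags_query("net.rxkb", {"host": "mynode", "interface": "eth0"})
print(json.dumps(body, indent=2))
```

Note that the filters must name all tag keys of the series (here host and interface) for the flag to match anything.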

If this solution is not enough, we will be forced to build another set of tables and enable salting.

Will salting improve read performance? How do we determine the best number of scanners for this feature?

Thanks for your help!

Mathias Herberts

Apr 14, 2016, 11:00:17 AM

to OpenTSDB

How many different series do you have per host? How many datapoints per series?

Benjamin Vandenberghe

Apr 14, 2016, 11:56:46 AM

to OpenTSDB

At least 30 series per host.

The series in my example gets 1 datapoint every minute.

Over a month, that represents 43,200 datapoints.
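A quick back-of-the-envelope check of these numbers in Python. The 30-day month, the one-row-per-series-per-hour storage layout, and the assumption that all 6,000 hosts report the metric are assumptions made for the estimate.

```python
DAYS = 30                 # assumed length of "a month"
POINTS_PER_HOUR = 60      # one datapoint per minute
HOSTS = 6000              # cluster size mentioned earlier in the thread

raw_points = POINTS_PER_HOUR * 24 * DAYS   # raw datapoints for one series
downsampled_points = raw_points // 30      # after 30-minute downsampling
rows_per_series = 24 * DAYS                # one row per series per hour
rows_scanned = rows_per_series * HOSTS     # rows for the metric, all hosts

print("raw points per series:", raw_points)
print("points after 30m downsampling:", downsampled_points)
print("rows per series:", rows_per_series)
print("rows scanned for a single-host query:", rows_scanned)
```

Under these assumptions a one-month query for a single host returns 720 rows but has to scan about 4.3 million, which is consistent with query time growing as hosts are added.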

Benjamin Vandenberghe

May 2, 2016, 8:50:56 AM

to OpenTSDB

Bumping this post!
