Hey,
I need to handle 40,000 hosts, 325 metrics each, 1 point per minute.
I've managed to load the data at around 240,000 data points per second fairly easily (a full day's worth is 18.7 billion data points). But query speeds were not what I originally anticipated.
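The arithmetic behind those figures, as a quick Python check:

    hosts, metrics = 40_000, 325
    per_minute = hosts * metrics          # 13,000,000 data points per minute
    print(per_minute / 60)                # ~216,667 points/sec needed to keep up
    print(per_minute * 60 * 24)           # 18,720,000,000 points per day (~18.7 billion)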
Let's say I have 1 day's worth of data:
- If the 325 metrics are stored as metric names and the 40,000 hosts as tags -> it takes 12 seconds to query a new metric.
- If the 40,000 hosts are stored as metric names and the 325 metrics as tags -> it takes 50 milliseconds to query a new metric.
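Concretely, the two layouts look something like this on the query side (the TSD address and the host/metric names below are just placeholders):

    import requests

    TSD = "http://opentsdb.example.com:4242/api/query"  # placeholder TSD address

    # Layout A: the measurement is the metric name, the host is a tag.
    layout_a = {
        "start": "1d-ago",
        "queries": [{
            "aggregator": "avg",
            "metric": "sys.cpu.user",            # one of ~325 metric names
            "tags": {"host": "web0001"},         # one of ~40,000 tag values
        }],
    }

    # Layout B: the host is the metric name, the measurement is a tag.
    layout_b = {
        "start": "1d-ago",
        "queries": [{
            "aggregator": "avg",
            "metric": "web0001",                 # one of ~40,000 "metric" names
            "tags": {"metric": "sys.cpu.user"},  # one of ~325 tag values
        }],
    }

    for body in (layout_a, layout_b):
        resp = requests.post(TSD, json=body)
        resp.raise_for_status()
        print(len(resp.json()[0]["dps"]), "data points returned")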
Since 325 * 40,000 = 13 million, and we'll likely end up with more than 325 metrics, I can't combine the host and the metric into a single metric name (I'd blow past 16 million UIDs quickly). So I think these are my only two options.
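Quick check on that UID ceiling, assuming the default 3-byte metric UID width:

    print(2 ** 24)        # 16,777,216 possible metric UIDs at 3 bytes per UID
    print(325 * 40_000)   # 13,000,000 host*metric combinations already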
Is there a downside to using the host as the metric name like this in OpenTSDB?
I can see it confuses Grafana, but I can probably copy the OpenTSDB plugin for Grafana and modify it to work this way.
Extra notes:
- Compaction disabled.
- Appends disabled.
- LZ4 compression (we're on MapR-DB).
- Using 3 nodes, salting with 3 buckets.
- Using random UID assignment.
- Data block encoding set to FAST_DIFF, though I'm not sure if MapR-DB honors that.
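In opentsdb.conf terms, that setup would look roughly like this (assuming OpenTSDB 2.2+ option names; the LZ4 compression and FAST_DIFF encoding are set on the table itself rather than in the TSD config):

    tsd.storage.enable_compaction = false
    tsd.storage.enable_appends = false
    # a 1-byte salt width is an assumption; only the 3-bucket count is from above
    tsd.storage.salt.width = 1
    tsd.storage.salt.buckets = 3
    tsd.core.uid.random_metrics = true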
Thanks!
-John