// TSys100NStoEpoch converts a Windows FILETIME value, i.e. the number of
// 100-nanosecond intervals since January 1, 1601 UTC, to Unix epoch seconds.
func TSys100NStoEpoch(ticks uint64) int64 {
	// Number of 100-ns intervals between 1601-01-01 and the Unix epoch.
	ticks -= 116444736000000000
	// There are 1e7 hundred-nanosecond intervals per second.
	seconds := ticks / 1e7
	return int64(seconds)
}
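A quick sanity check of the conversion above: the magic constant is exactly the number of 100-ns ticks between 1601-01-01 and the Unix epoch, so that FILETIME value should map to 0. Self-contained sketch (the function body is repeated so the snippet compiles on its own):

```go
package main

import "fmt"

// Windows FILETIME (100-ns ticks since 1601-01-01 UTC) to Unix epoch seconds.
func TSys100NStoEpoch(ticks uint64) int64 {
	ticks -= 116444736000000000 // ticks between 1601-01-01 and 1970-01-01
	return int64(ticks / 1e7)   // 1e7 ticks per second
}

func main() {
	fmt.Println(TSys100NStoEpoch(116444736000000000)) // the epoch itself -> 0
	fmt.Println(TSys100NStoEpoch(116444772000000000)) // epoch + 1 hour  -> 3600
}
```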
Grouping
Downsampling
Interpolation
Aggregation
Rate Calculation
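To illustrate where a step like downsampling fits in that pipeline, here is a minimal sketch. The `DataPoint` type and the bucket-average policy are assumptions for the example, not OpenTSDB internals:

```go
package main

import (
	"fmt"
	"sort"
)

// DataPoint is an assumed minimal point shape for illustration.
type DataPoint struct {
	Timestamp int64 // Unix seconds
	Value     float64
}

// downsample averages raw points into fixed-width time buckets, one
// straightforward way a downsampling step can be implemented.
func downsample(points []DataPoint, bucketSecs int64) []DataPoint {
	type acc struct {
		sum float64
		n   int
	}
	buckets := map[int64]*acc{}
	for _, p := range points {
		b := p.Timestamp - p.Timestamp%bucketSecs // start of the bucket
		if buckets[b] == nil {
			buckets[b] = &acc{}
		}
		buckets[b].sum += p.Value
		buckets[b].n++
	}
	out := []DataPoint{}
	for b, a := range buckets {
		out = append(out, DataPoint{b, a.sum / float64(a.n)})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Timestamp < out[j].Timestamp })
	return out
}

func main() {
	pts := []DataPoint{{0, 1}, {30, 3}, {60, 5}}
	fmt.Println(downsample(pts, 60)) // [{0 2} {60 5}]
}
```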
Here is an example query:
The main change to the query API here is the expression parameter: x=difference(sum:proc.stat.cpu.percpu{cpu=1},sum:proc.stat.cpu.percpu{cpu=2})
You can create more complex expressions like:
x=scale(difference(sum:proc.stat.cpu.percpu{cpu=1},sum:proc.stat.cpu.percpu{cpu=2}), 0.003) // scale the result of difference to .3%
x=alias(scale(difference(...{cpu=1}, ...{cpu=2}), 0.003), 'newname') // alias the result name to 'newname'
When you encode this string as a URL, note that { and } must be percent-encoded as %7B and %7D respectively.
Bump. Any update on getting this moved into 2.2?
So are you saying the put branch handles counters differently and I can try it now, or that the put branch lays the foundations for this change?
Any progress on this front?
Think it will be part of 2.3?
There is no notion of “missing data” in OpenTSDB, since no assumptions
are made on the interval between data points. Things don’t have to
tick at a fixed rate.
For monotonically increasing counters, resets are generally easily
detected by the first derivative (aka rate) becoming negative,
typically only for one data point (although something crash looping
might produce more erratic behavior). Again this is where a more
powerful query language comes in handy as it allows you to express
things like ge(rate(foo), 0) where ge(x, 0) means only keep values of
x for x >= 0.
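The reset-detection idea described above can be sketched as follows. This is a minimal illustration of ge(rate(foo), 0), not OpenTSDB's actual rate code; the DataPoint type is assumed:

```go
package main

import "fmt"

type DataPoint struct {
	Timestamp int64 // Unix seconds
	Value     float64
}

// rate computes the first derivative between consecutive points.
func rate(points []DataPoint) []DataPoint {
	out := []DataPoint{}
	for i := 1; i < len(points); i++ {
		dt := float64(points[i].Timestamp - points[i-1].Timestamp)
		out = append(out, DataPoint{points[i].Timestamp,
			(points[i].Value - points[i-1].Value) / dt})
	}
	return out
}

// geZero keeps only values >= 0, i.e. ge(x, 0). For a monotonically
// increasing counter, a reset shows up as a single negative rate,
// which this filter drops.
func geZero(points []DataPoint) []DataPoint {
	out := []DataPoint{}
	for _, p := range points {
		if p.Value >= 0 {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	// The counter resets to 0 at t=30 (e.g. the process restarted).
	counter := []DataPoint{{0, 100}, {10, 200}, {20, 300}, {30, 0}, {40, 100}}
	fmt.Println(geZero(rate(counter))) // the negative spike at t=30 is dropped
}
```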
There absolutely needs to be a notion of missing data. Half the problems in this thread wouldn't exist if TSDB had a concept of missing data. Here's a trivial way to support it without a fixed polling interval: the first time a reporter pushes a metric it marks a "this is my first report" flag that introduces a barrier for that data.
Using the first derivative to detect resets is a heuristic; there are many cases where it fails. Those seem like corner cases, but they're not ignorable.
The current approach of introducing 0, or worse yet letting the interpolation do its thing, is inventing data. In my opinion, inventing data is a cardinal sin for any product that has "DB" in its name.
You mentioned "we now have rate options"; have these been updated somewhere? The main thing I wanted with regard to rate calculation was for the data point to be dropped instead of a zero being emitted during rollover; is that possible now?
On Wednesday, March 2, 2016 at 11:01:36 AM UTC-8, adam.l...@turn.com wrote:

> There absolutely needs to be a notion of missing data. Half the problems in this thread wouldn't exist if TSDB had a concept of missing data. Here's a trivial way to support it without a fixed polling interval: the first time a reporter pushes a metric it marks a "this is my first report" flag that introduces a barrier for that data.

We kinda have that already with the meta data system, it just needs better tuning for high-throughput uses. What else would you have the first metric report?

> Using the first derivative to detect resets is a heuristic; there are many cases where it fails. Those seem like corner cases, but they're not ignorable.

For a counter we now have the rate options that handle rollovers properly. What other cases are you thinking of?

> The current approach of introducing 0, or worse yet letting the interpolation do its thing, is inventing data. In my opinion, inventing data is a cardinal sin for any product that has "DB" in its name.

With 2.2 we now have the NaN handling for missing data so that has been helping a ton to detect upstream issues. And I agree, we definitely don't want to invent data when writing to storage but at query time, it can be nice to interpolate or invent the data in some situations. But we should always have the fallback of showing the raw data.
This isn't related to metadata in any way. When a machine restarts, and the counter that used to be 200M is now 0, the metadata is the same. What I mean is that the reporter knows for sure that there is a hole in the data; all you need is that when it first reports a metric *since booting* it marks the *datapoint* as first (not the metric). The point is to insert a break between the last time the metric was reported (i.e. before the restart) and the first report. That means that TSDB now knows to 1) not do rate across that hole and 2) not interpolate across that hole.
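The proposal above might look something like this. A sketch under assumed types: the First flag and the break-at-barrier behavior are the suggestion being made here, not anything TSDB implements:

```go
package main

import "fmt"

type DataPoint struct {
	Timestamp int64 // Unix seconds
	Value     float64
	First     bool // set by the reporter on its first report since booting
}

// rateWithBarriers computes per-second rates but never differences across
// a point flagged First, so a post-reboot counter reset cannot produce a
// bogus rate, and nothing is interpolated across the hole.
func rateWithBarriers(points []DataPoint) []DataPoint {
	out := []DataPoint{}
	for i := 1; i < len(points); i++ {
		if points[i].First {
			continue // hole in the data: skip, don't invent a rate
		}
		dt := float64(points[i].Timestamp - points[i-1].Timestamp)
		out = append(out, DataPoint{Timestamp: points[i].Timestamp,
			Value: (points[i].Value - points[i-1].Value) / dt})
	}
	return out
}

func main() {
	// The machine restarts between t=20 and t=30; the first post-boot
	// report carries the barrier flag.
	pts := []DataPoint{
		{Timestamp: 0, Value: 100}, {Timestamp: 10, Value: 200},
		{Timestamp: 20, Value: 300},
		{Timestamp: 30, Value: 0, First: true}, // counter reset after reboot
		{Timestamp: 40, Value: 100},
	}
	fmt.Println(rateWithBarriers(pts)) // no rate is emitted across the barrier
}
```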
The biggest case that's not handled properly is when the machine (aka the reporter) is restarted. Whenever I see rollover discussed with respect to rate, it's about integer overflow, which is a comparatively rare event in the world of 64-bit ints.
> With 2.2 we now have the NaN handling for missing data so that has been helping a ton to detect upstream issues. And I agree, we definitely don't want to invent data when writing to storage but at query time, it can be nice to interpolate or invent the data in some situations. But we should always have the fallback of showing the raw data.

I disagree. Never invent data. Full stop. The correct way to handle missing data is to exclude it from computation. I've witnessed multiple cases where what's nice to you means I get incorrect plots. The counter+rate stuff at least has the courtesy to be obviously wrong, but the same problems occur with other metrics where the effect is subtle and easy to miss.
I understand that this is a big ask, because it needs to extend all the way to the UI layers.
As for the reordering of 2.3 vs 2.4 and HBaseCon: not only is HBaseCon in May, but my birthday is in May as well. And I want proper counters for my birthday ... not sure if that changes priorities?