Linear interpolation calculation


anku...@gmail.com

Oct 9, 2013, 1:06:57 PM
to open...@googlegroups.com
Hi, I was going through this link: http://opentsdb.net/docs/build/html/user_guide/query/aggregators.html

I came across some negative values in the aggregation example and would like to know exactly how they were calculated.

Could anyone please explain the table below with reference to the above link? Any help is appreciated.

series  ts0  ts0+10s  ts0+20s  ts0+30s  ts0+40s  ts0+50s  ts0+60s
A       na   5        na       15       na       5        na
B       10   na       20       na       10       na       20
Result  10   15       30       -7       -2       15       20

ManOLamancha

Oct 14, 2013, 6:43:53 PM
to open...@googlegroups.com

Oops, the -7 should be "5" and the -2 should be "0". I'll fix the docs. The linear interpolation formula is just above that: y = y0 + (x - x0) * (y1 - y0) / (x1 - x0)
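
As a minimal sketch, the formula quoted above can be written as a small Python function. The name `lerp` and its argument order are illustrative assumptions, not taken from the OpenTSDB code base; the sample call uses series B's known points at ts0 and ts0+20s from the table, with time in seconds after ts0.

```python
def lerp(x, x0, y0, x1, y1):
    """Linearly interpolate y at x between the points (x0, y0) and (x1, y1)."""
    if x1 == x0:
        raise ValueError("x0 and x1 must differ")
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


# Series B is 10 at ts0 and 20 at ts0+20s; estimate its value at ts0+10s.
print(lerp(10, 0, 10, 20, 20))  # 15.0
```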

anku...@gmail.com

Oct 15, 2013, 1:27:35 AM
to open...@googlegroups.com
Thanks. It would be great if you could plug the values from the table into the formula to show how you arrive at "5" and "0" at ts0+30s and ts0+40s.

anku...@gmail.com

Oct 17, 2013, 2:59:02 AM
to open...@googlegroups.com
Anyone, please reply.

ManOLamancha

Oct 21, 2013, 12:08:05 PM
to open...@googlegroups.com
> On Thursday, October 17, 2013 2:59:02 AM UTC-4, anku...@tcs.com wrote:
> Anyone, please reply.
>
> On Tuesday, October 15, 2013 10:57:35 AM UTC+5:30, anku...@tcs.com wrote:
> Thanks. It would be great if you could plug the values from the table into the formula to show how you arrive at "5" and "0" at ts0+30s and ts0+40s.

OK, the formula I put in the docs was transposed and I screwed up the calculations. I updated the docs so take a look and let me know if that looks right. Thanks. http://opentsdb.net/docs/build/html/user_guide/query/aggregators.html 

Andrew H

Jan 17, 2014, 2:57:55 PM
to open...@googlegroups.com
I think it's still wrong in the table at least at one point.  I use the same formula, structured a little differently:


         g  - g1
d = d1 + ------- * (d2 - d1)
         g2 - g1 

With the following intuitive definitions:

d -- the desired table value
d1 -- the table value immediately below d
d2 -- the table value immediately above d
g -- table index (in this case, the time) of the desired value
g1 -- table index immediately below g
g2 -- table index immediately above g

So, looking at "Interpolated B" at ts0+30s, we get these values for the variables:

d = unknown without calculation
g = 30 (or, ts0+30s)
g1 = 20 (or, ts0+20s)
g2 = 40 (or, ts0+40s)
d1 = 20
d2 = 10

That makes the formula:

         30 - 20
d = 20 + ------- * (10 - 20)
         40 - 20 

Which reduces to:

         10
d = 20 + --- * (-10) = 20 - (1/2)(10) = 15
         20 
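
The arithmetic above can be checked with a short Python sketch. The helper name `interpolate` and its keyword arguments are assumptions chosen to mirror the d/g notation; the data points are the ones for series B in the table.

```python
def interpolate(g, g1, d1, g2, d2):
    """Return d = d1 + (g - g1) / (g2 - g1) * (d2 - d1)."""
    return d1 + (g - g1) / (g2 - g1) * (d2 - d1)


# Series B is 20 at ts0+20s and 10 at ts0+40s; interpolate at ts0+30s.
d = interpolate(g=30, g1=20, d1=20, g2=40, d2=10)
print(d)  # 15.0
```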

Andrew H

Jan 17, 2014, 3:11:33 PM
to open...@googlegroups.com
Oh, and the comment "the interpolated value is negative" is also incorrect.  The first derivative of a curve-match of the data points would be negative at that point, but the interpolation is not.  The only way it could be negative is if the preceding point, the following point, or both were negative.
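
One way to see this, using the d and g notation from the earlier post: for g1 <= g <= g2 the interpolated value is a weighted average of d1 and d2, so it cannot fall outside the range spanned by those two values. The symbol t is introduced here only for illustration.

```latex
t = \frac{g - g_1}{g_2 - g_1} \in [0, 1],
\qquad
d = d_1 + t\,(d_2 - d_1) = (1 - t)\,d_1 + t\,d_2,
\qquad
\min(d_1, d_2) \le d \le \max(d_1, d_2).
```

In particular, if d1 and d2 are both non-negative, d is non-negative as well.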

Andrew H

Jan 17, 2014, 8:34:55 PM
to open...@googlegroups.com
Sorry, I really should have thought this through and posted everything at once instead of lots of add-on posts. I've created the interpolation in the universal programming language of Google Sheets :-)

A minor point... Shouldn't the first and last point sums (at 0 and 60) be indeterminate?  It would require 2 extrapolations to properly complete them.  It's a vanishingly small problem and highly pedantic for large data sets, but possibly relevant in some cases.

ManOLamancha

Jan 20, 2014, 11:46:34 AM
to open...@googlegroups.com
> On Friday, January 17, 2014 8:34:55 PM UTC-5, Andrew H wrote:
> Sorry, I really should have thought this through and posted everything at once instead of lots of add-on posts. I've created the interpolation in the universal programming language of Google Sheets :-)

No problem, took me a few tries to get it right too :)

> A minor point... Shouldn't the first and last point sums (at 0 and 60) be indeterminate?  It would require 2 extrapolations to properly complete them.  It's a vanishingly small problem and highly pedantic for large data sets, but possibly relevant in some cases.

Yes, technically they should be indeterminate but it can happen often if you are summing a large number of data sets in OpenTSDB. For example we have thousands of hosts publishing data to OpenTSDB every 5 minutes. If we synced all of the data generators, we would have massive spikes of data across the network and need a ton of TSDs to handle the spike. The rest of the time they'd sit idle. Instead we randomly splay the publishers so that we have a nice, even flow of traffic. Therefore, when we go to execute an aggregated query, there are tons of NAs in there since the timestamps won't line up. Instead of just returning an NA for almost every data point, we just sum what we have and move along. Another aggregator could be written to return NAs though.
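
The "sum what we have" behavior described above can be sketched in Python. This is an illustration under stated assumptions, not OpenTSDB's actual implementation: the function names `interpolate_at` and `sum_series` are hypothetical, time is measured in seconds after ts0, and a series contributes nothing at a timestamp that would require extrapolation.

```python
def interpolate_at(points, t):
    """Return the series value at time t, or None when t lies outside the
    series' known points (which would require extrapolation)."""
    if t in points:
        return float(points[t])
    before = [ts for ts in points if ts < t]
    after = [ts for ts in points if ts > t]
    if not before or not after:
        return None
    t0, t1 = max(before), min(after)
    y0, y1 = points[t0], points[t1]
    return y0 + (t - t0) * (y1 - y0) / (t1 - t0)


def sum_series(series, timestamps):
    """Sum every series at each timestamp, skipping series with no value."""
    totals = []
    for t in timestamps:
        values = [interpolate_at(points, t) for points in series]
        totals.append(sum(v for v in values if v is not None))
    return totals


# Known points of series A and B from the table, keyed by seconds after ts0.
a = {10: 5, 30: 15, 50: 5}
b = {0: 10, 20: 20, 40: 10, 60: 20}

print(sum_series([a, b], range(0, 70, 10)))
# [10.0, 20.0, 30.0, 30.0, 20.0, 20.0, 20.0]
```

At ts0 and ts0+60s only series B contributes, because A has no point on one side. The totals are what this sketch yields under its own assumptions and are not quoted from the OpenTSDB docs. A stricter aggregator of the kind mentioned above could return NA whenever any series yields None.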

Andrew H

Jan 21, 2014, 1:03:50 AM
to open...@googlegroups.com
To be clear, my "sorry" was for not putting everything into one post. My calculations above and my Sheets application of the calculation are correct.  The Sheets version is just a different way of looking at it.  The ts0+30s Interpolated B calculation in the aggregators doc should be 15, not 10.  Also, intuitively, it makes no sense for the interpolation strictly between to unequal, known data points to be exactly equal to one of the known points.

Andrew H

Jan 21, 2014, 7:37:52 AM
to open...@googlegroups.com
"*to* unequal, known data points"?  Sorry, that's "two". :-)  This is what I get for posting replies after about 1.5 hours of sleep and a 3:45 AM flight across the country. sigh...