Oh, I forgot to say: one of OpenTSDB's core philosophies is to never
throw away, downsample, or alter data. For the kind of systems work
OpenTSDB was designed for, there really is no such thing as "signal
noise". Every datapoint is an accurate snapshot of some system metric
at the time it was read. If there is variability in the data, that is
in and of itself valuable data to be kept, not thrown away. Your
network traffic could be jittery, you could have bursts of traffic or
spikes in GC activity. These things change over time, and you never
know when you'll need to compare the jitteriness of last month's or
last year's data to today's.
If you want smoothing or outlier suppression, there are good plot-time
algorithms for that; for us, storage time is the wrong time to do it.
Lossy storage algorithms run fundamentally counter to that goal.
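To illustrate the distinction (this is just a sketch, not OpenTSDB code): a plot-time moving average, here as a hypothetical `moving_average` helper, smooths only the view you render, while the raw datapoints remain untouched in storage.

```python
def moving_average(points, window=5):
    """Return a smoothed copy of `points` for display.

    The raw series is never modified; smoothing happens at
    read/plot time, not at storage time.
    """
    smoothed = []
    for i in range(len(points)):
        lo = max(0, i - window + 1)      # trailing window, clipped at start
        chunk = points[lo:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

raw = [10, 12, 90, 11, 13, 12, 95, 10]   # spiky raw metric values
view = moving_average(raw, window=3)      # smoothed only for plotting
```

Because the smoothing is applied on the way out, you can re-plot the same stored series later with a different window, or with no smoothing at all, and still see every original spike.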
With disk prices so cheap, and with all of the lossless compression
and space-saving techniques available, there really is no excuse to
throw away data.