Histogram.mean & stddev become Double.NaN (ExponentiallyDecayingReservoir, WeightedSnapshot)


michaels

Apr 15, 2015, 4:10:05 AM
to metric...@googlegroups.com
Hello.

We use Metrics to track database access times and similar measurements in our project. The metrics are written to a log file every minute.

Recently we found some strange log entries (our comments are marked with -- below):

-- the application has had no load for several hours --
-- the following periodic traces show the latest values of the histogram (the same trace repeats for hours) --

16:26:55 type=HISTOGRAM, name=xxxxlatency, count=108288, min=1, max=137, mean=1.123272970726895, stddev=0.5904154125651903, median=1.0, p75=1.0, p95=2.0, p98=3.0, p99=4.0, p999=4.0

16:27:55 type=HISTOGRAM, name=xxxxlatency, count=108288, min=1, max=137, mean=1.123272970726895, stddev=0.5904154125651903, median=1.0, p75=1.0, p95=2.0, p98=3.0, p99=4.0, p999=4.0

16:28:55 type=HISTOGRAM, name=xxxxlatency, count=108288, min=1, max=137, mean=1.123272970726895, stddev=0.5904154125651903, median=1.0, p75=1.0, p95=2.0, p98=3.0, p99=4.0, p999=4.0

-- according to other logs and code analysis, the application now updates the histogram with a few values --
-- the periodic trace then changes to: --

16:29:55 type=HISTOGRAM, name=xxxxlatency, count=108292, min=1, max=2, mean=NaN, stddev=NaN, median=1.0, p75=1.0, p95=1.0, p98=1.0, p99=1.0, p999=1.0


Unfortunately we don't have traces of the values we pass to histogram.update.
But they are just time values (differences between subsequent calls of System.currentTimeMillis or System.nanoTime, converted to milliseconds).

We currently don't understand which (strange) long values passed to histogram.update could cause "mean" to stop being a number.

Is it possible that there is a bug in Histogram?

Version and class information:
* metrics-core-3.1.0
* We use histogram with ExponentiallyDecayingReservoir (which uses a WeightedSnapshot internally).


Any hints for us? - Thank you,

Michael



Marshall Pierce

Apr 15, 2015, 10:17:50 AM
to metric...@googlegroups.com
System.currentTimeMillis can go backwards, so you should never use it
for elapsed time measurements.
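
A minimal sketch of the point above, assuming the measured code can be wrapped in a Runnable. The names ElapsedTimeSketch and elapsedMillis are hypothetical and not part of the Metrics library. System.nanoTime is meant for measuring elapsed time and is not affected by wall-clock adjustments, so the difference between two readings should not come out negative the way a System.currentTimeMillis difference can.

```java
import java.util.concurrent.TimeUnit;

class ElapsedTimeSketch {

    /**
     * Runs the task and returns its duration in whole milliseconds,
     * measured with System.nanoTime instead of the wall clock.
     * (Hypothetical helper, for illustration only.)
     */
    static long elapsedMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        long elapsedNanos = System.nanoTime() - start;
        return TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
    }

    public static void main(String[] args) {
        long ms = elapsedMillis(() -> {
            try {
                Thread.sleep(20); // stand-in for a database call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("elapsed ms = " + ms);
        // In the application this is the value that would be passed
        // to histogram.update(ms).
    }
}
```

The conversion to milliseconds happens once, at the end, so no precision is lost while subtracting.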

michaels

Apr 15, 2015, 10:33:50 AM
to metric...@googlegroups.com
Thanks for the answer. Yes, that's why we have replaced it with System.nanoTime in almost all places (but not all yet).

Still, why would histogram.mean become Double.NaN if we occasionally passed a negative value to histogram.update (which might NOT even be the case)?

Thanks

Michael

Marshall Pierce

Apr 15, 2015, 10:46:57 AM
to metric...@googlegroups.com
That I couldn't tell you without digging into the code, but it strikes
me as possible. :)
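
One way a NaN mean can arise in a weighted histogram, offered as a hypothesis and not as a confirmed diagnosis of metrics-core 3.1.0: if the weights of all samples decay to exactly 0.0 after a long idle period, normalizing them by their sum computes 0.0 / 0.0, which is NaN in IEEE 754 arithmetic. The sketch below is a standalone illustration, not the library's code. The class and method names are hypothetical, the 20-hour idle time is an assumption, and 0.015 is, as far as I know, the default decay factor of ExponentiallyDecayingReservoir.

```java
class WeightedMeanNaNSketch {

    /** Weighted mean with normalized weights: sum of v_i * (w_i / sum(w)). */
    static double weightedMean(long[] values, double[] weights) {
        double sumWeight = 0.0;
        for (double w : weights) {
            sumWeight += w;
        }
        double mean = 0.0;
        for (int i = 0; i < values.length; i++) {
            // If sumWeight is 0.0, this division is 0.0 / 0.0 = NaN.
            mean += values[i] * (weights[i] / sumWeight);
        }
        return mean;
    }

    /** Factor by which a weight shrinks after idleSeconds of decay. */
    static double decayFactor(double alpha, long idleSeconds) {
        return Math.exp(-alpha * idleSeconds);
    }

    public static void main(String[] args) {
        double alpha = 0.015;             // assumed decay factor, per second
        long idleSeconds = 20L * 60 * 60; // assumed idle period of 20 hours

        // exp(-1080) is far below the smallest positive double,
        // so the result underflows to exactly 0.0.
        double decayed = decayFactor(alpha, idleSeconds);
        System.out.println("decayed weight = " + decayed); // prints 0.0

        long[] values = {1, 2};
        System.out.println(weightedMean(values, new double[] {1.0, 1.0}));         // prints 1.5
        System.out.println(weightedMean(values, new double[] {decayed, decayed})); // prints NaN
    }
}
```

If something like this is happening, negative input values would not be required to produce NaN. A long gap between updates could be enough.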