Can CollectAndCompare validate range for non-deterministic values e.g. latency?


Michael Rybak

May 28, 2020, 2:59:35 PM
to Prometheus Developers
TL;DR: I'm unit-testing Prometheus metrics: my server's request count and latencies. The server is written in Go. Latencies are non-deterministic, and there seems to be no good way to fake the clock in Prometheus or Go. Does Prometheus provide a method similar to CollectAndCompare that validates a value (e.g. latency) against a range rather than an exact expectation? If not, would it be a good feature to add to Prometheus? Alternatively, is there a way to fake the clock in Prometheus, so that e.g. InstrumentHandlerDuration would record a pre-defined fake duration? Or is there a good way to fake the clock in Go?

I'm measuring the server's request count (Counter) and latencies (Summary), and I cover both with unit tests. I record the metrics via InstrumentHandlerCounter and InstrumentHandlerDuration respectively.

For testing requests count, I hardcode the expectation in a string constant, and use CollectAndCompare to perform an exact match validation:

// Make some test requests.
..

// Validate them
expectation := strings.NewReader(`
# HELP my_metric_requests_total My help.
# TYPE my_metric_requests_total counter
my_metric_requests_total{code="200"} 2
my_metric_requests_total{code="304"} 1
my_metric_requests_total{code="502"} 1
my_metric_requests_total{code="503"} 1
`)

this.Require().NoError(promtest.CollectAndCompare(myMetricRequestsTotal, expectation, "my_metric_requests_total"))

I couldn't find a way to do the same for latencies, because they are non-deterministic. So instead of the one-liner check above I dig into the internals of the gathered metrics:

// Make some test requests. 
hintPrefix := "My test."
...

// Validate them
type codeLabelPair string
type scenarioExpectedSampleCountMap map[codeLabelPair]uint64

expectedSampleCountMap := scenarioExpectedSampleCountMap{
	`name:"code" value:"200" `: 3,
	`name:"code" value:"304" `: 1,
	`name:"code" value:"502" `: 2,
}

reg := prometheus.NewPedanticRegistry()
if err := reg.Register(promRequestsLatency); err != nil {
	this.T().Errorf(hintPrefix+" - registering collector failed: %s", err)
}

actualMetricFamilyArr, err := reg.Gather()
if err != nil {
	this.T().Errorf(hintPrefix+" - gathering metrics failed: %s", err)
}

assert.Equal(this.T(), 1, len(actualMetricFamilyArr),
	hintPrefix+" expects exactly one metric family.")

assert.Equal(this.T(), "request_latencies_in_seconds", *actualMetricFamilyArr[0].Name,
	hintPrefix+" expects the right metric name.")

assert.Equal(this.T(), len(expectedSampleCountMap), len(actualMetricFamilyArr[0].Metric),
	hintPrefix+" expects the right amount of metrics collected and gathered.")

for _, actualMetric := range actualMetricFamilyArr[0].Metric {
	// Expect the right sample count.
	code := actualMetric.Label[0].String()
	expectedSampleCount := expectedSampleCountMap[codeLabelPair(code)]
	actualSampleCount := actualMetric.Summary.GetSampleCount()
	assert.Equal(this.T(), expectedSampleCount, actualSampleCount, hintPrefix+" expects the right sample count for "+code)

	// Test quantiles.
	expectedQuantileKeys := []float64{0.5, 0.9, 0.99}

	// Expect the right number of quantiles.
	assert.Equal(this.T(), len(expectedQuantileKeys), len(actualMetric.Summary.Quantile), hintPrefix+" expects the right number of quantiles.")

	// Expect the right quantiles.
	// Expect positive quantile values, because latencies are non-zero.
	// Don't check the exact values, because latencies are non-deterministic.
	for i, quantile := range actualMetric.Summary.Quantile {
		assert.Equal(this.T(), expectedQuantileKeys[i], quantile.GetQuantile(), hintPrefix+" expects the right quantile.")
		assert.True(this.T(), quantile.GetValue() > .0, hintPrefix+" expects non-zero quantile value (latency).")
	}
}

This seems more complex than it should be. Is there a one-liner, similar to the CollectAndCompare call I use above to validate the request count?

Alternatively, is there a way to fake the clock in Prometheus, so that e.g. InstrumentHandlerDuration would record a pre-defined fake duration? Or is there a good way to fake the clock in Go? There is an option that doesn't look safe enough:


Thanks.

Bartłomiej Płotka

May 28, 2020, 4:06:56 PM
to Michael Rybak, Prometheus Developers
That is a very good question (: 

I don't think the current API of client_golang's testutil package (https://godoc.org/github.com/prometheus/client_golang/prometheus/testutil) allows this scenario. It might be good to open a GitHub issue on the client_golang project to explore those possibilities.

In terms of a potential API for such utility functions, I would recommend looking at the e2e framework we use in Thanos, for example here: https://github.com/thanos-io/thanos/blob/c733564d44745af1a023bfa5d51d6d205404dc82/test/e2e/compact_test.go#L556 The framework is actually maintained in the Cortex repository; it's something we use, maintain, and recommend, especially if you want to test against a container environment.

It would also be possible to pull this code out so that it operates on raw text files and put it into client_golang; that API might work. Note that Peter is also working on something even better: PromQL-based unit tests on a metric text file! (: Quite amazing stuff, and something to consider as well.
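
To illustrate what a range check on raw metric text could look like, here is a minimal sketch using only the Go standard library. It is not the Cortex e2e code and not part of client_golang: the names Range and CheckSampleRanges are hypothetical, and the parser assumes sample lines carry no trailing timestamp.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// Range is an inclusive interval for an expected sample value.
type Range struct{ Min, Max float64 }

// CheckSampleRanges is a hypothetical helper (not a client_golang API).
// It scans Prometheus text-format output and returns an error if any
// sample of the named metric lies outside r, or if no such sample exists.
// Assumption: sample lines have no trailing timestamp, so the value is
// the last whitespace-separated field.
func CheckSampleRanges(text, metric string, r Range) error {
	found := 0
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and HELP/TYPE comments
		}
		// The series name ends at the label block or at the first space.
		end := strings.IndexAny(line, "{ ")
		if end < 0 || line[:end] != metric {
			continue // other metric, e.g. the _sum or _count series
		}
		fields := strings.Fields(line)
		v, err := strconv.ParseFloat(fields[len(fields)-1], 64)
		if err != nil {
			return fmt.Errorf("parsing %q: %w", line, err)
		}
		found++
		// Written this way so that NaN also fails the check.
		if !(v >= r.Min && v <= r.Max) {
			return fmt.Errorf("%q: value %v outside [%v, %v]", line, v, r.Min, r.Max)
		}
	}
	if err := sc.Err(); err != nil {
		return err
	}
	if found == 0 {
		return fmt.Errorf("no samples found for metric %q", metric)
	}
	return nil
}
```

In a test, the text could come from whatever renders the registry in the exposition format, and the call would look like CheckSampleRanges(text, "request_latencies_in_seconds", Range{Min: 0, Max: 1}). Because the name match is exact, the _sum and _count series of a summary are not checked against the latency range.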

Kind Regards,
Bartek



Bjoern Rabenstein

May 29, 2020, 8:50:38 AM
to Michael Rybak, Prometheus Developers
The testutil package is really meant more for end-to-end testing of
exporters and such, where you want to mirror an external metric source
and therefore know exactly the input and expected output.

If you want to test if your code was instrumented properly, I would
rather go with a mock registry or mock metrics. (But that's tedious at
the moment for various reasons. v2 of the instrumentation client will
make that easier.)

Having said that, if you only want to test for presence of a metric
and are not interested in the value, the new filtering in
`CollectAndCount` and `GatherAndCount` comes in handy. (It is not
released yet, but I'll cut a release soon.)

Check it out here:
https://github.com/prometheus/client_golang/blob/master/prometheus/testutil/testutil.go#L134-L138

You would filter for the metric name whose presence you want to test for.

--
Björn Rabenstein
[PGP-ID] 0x851C3DA17D748D03
[email] bjo...@rabenste.in