--
Brian Brazil
2016-06-14 11:38 GMT+02:00 Brian Brazil <brian....@robustperception.io>:

> On 14 June 2016 at 09:52, Romain Vrignaud <rvri...@gmail.com> wrote:
>
>> Hello,
>>
>> I'm looking for some advice. For various reasons I'm using Telegraf (https://github.com/influxdata/telegraf/) with its Prometheus output to get metrics, and I would like to stick to it.
>>
>> It seems that whenever a metric is published to the Prometheus output, it never leaves it, even if the Telegraf input doesn't send it again (in my case, metrics for RabbitMQ ephemeral queues). You can find the bug description here: https://github.com/influxdata/telegraf/issues/1334.
>>
>> If I understand the Telegraf maintainer correctly, this is the Prometheus way of doing things. Is that true?
>>
>> To be more specific, I have an alert rule on a sum of messages in ephemeral queues. But with the Telegraf RabbitMQ input plugin and the Telegraf Prometheus output plugin, queues that are deleted keep the last known value from before the queue was deleted. This is quite problematic for us, as it totally skews our computation of the number of messages. How should we handle this kind of use case?
>
> This is a problem with Telegraf: the API it provides for outputs doesn't tell us which metrics do and don't exist. The original PR in which I proposed adding Prometheus support to Telegraf didn't have this issue.

Would you mind commenting on the Telegraf issue that the statement "Prometheus basically comes with the assumption that once a metric has been reported, it must be reported at every interval." is not true and not aligned with the Prometheus philosophy?
I'd suggest using https://github.com/kbudde/rabbitmq_exporter

For various reasons:

* only one project to maintain (Telegraf vs. lots of different exporters)
* push to durable metric storage in InfluxDB (given the fact that, AFAIK, Prometheus will drop the InfluxDB write)

I would prefer to maintain only one tool for metric gathering. As exporters are not able to push metrics to InfluxDB, I would prefer to keep Telegraf.
My above statement is a bit out of context; I meant that Prometheus would expect _Telegraf_ to continue reporting the same metrics, as I don't quite see a way for Telegraf to report ephemeral metrics to Prometheus.
@Brian-Brazil I'd be very interested to know how Telegraf can let Prometheus know which metrics do and don't exist, and how it can unregister and re-register them later.
I'm certainly not a prometheus expert, but I was basing that statement off of this example code: https://godoc.org/github.com/prometheus/client_golang/prometheus#example-Register
--
You received this message because you are subscribed to the Google Groups "Prometheus Developers" group.
> I'm certainly not a prometheus expert, but I was basing that statement off of this example code: https://godoc.org/github.com/prometheus/client_golang/prometheus#example-Register
That's about direct instrumentation, which is not what we're doing here.
Brian
Telegraf is doing direct instrumentation. I don't quite follow: what should Telegraf be doing instead?
I see, and what would stop only Telegraf's "prometheus_client" output plugin from implementing the Collector interface, rather than the entire Telegraf agent?
It would be possible for the Telegraf prometheus output plugin to cache its metrics from each collection interval, and then send them down the channel any time Collect is called.
But I also think that doing this would in some ways go against a tenet of Prometheus, that the metrics should be "collected" at the time the HTTP request is made. Is that correct?
Would you recommend this as an OK workaround for Telegraf to take, without changing its core workflow?