I would like to use Prometheus to do alerting and gather data from a Varnish cache instance (https://www.varnish-cache.org).
I did a bit of googling, and the closest thing I could find is using munin (https://github.com/munin-monitoring/contrib/blob/master/plugins%2Fvarnish4%2Fvarnish4_) together with the munin exporter. I would rather use the varnishstat command and write a custom Varnish exporter in Go that parses its XML output.
I don't have any experience in Go, but I thought it would be a nice holiday project.
Your thoughts?
Regards
Rudi
Great idea!
SoundCloud doesn't use Varnish anymore, but I'm happy to review your code.
Tobi
--
You received this message because you are subscribed to the Google Groups "Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-devel...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Hi guys, not sure if this is relevant to you anymore. I also looked for a Varnish exporter and could not find one.
I decided to make one if you want to check it out https://github.com/jonnenauha/prometheus_varnish_exporter
You'll find more details in the readme. Was a quick one night project but should do the trick.
On Friday, December 11, 2015 at 10:30:09 PM UTC+2, Rudi Kramer wrote:
> Hello,
>
> I would like to use Prometheus to do alerting and gather data from a Varnish cache instance (https://www.varnish-cache.org).
>
> I did a bit of googling, and the closest thing I could find is using munin (https://github.com/munin-monitoring/contrib/blob/master/plugins%2Fvarnish4%2Fvarnish4_) together with the munin exporter. I would rather use the varnishstat command and write a custom Varnish exporter in Go that parses its XML output.
>
> I don't have any experience in Go, but I thought it would be a nice holiday project.
>
> Your thoughts?
>
> Regards
> Rudi
Maybe it tells you more as Prometheus folks. I also added the version as
varnish_version{major="3",minor="0",patch="5",revision="1a89b1f",version="3.0.5"} 1
Should this metric be sent on each Collect, or, if the value does not change, can it be sent only on the first query? I suppose it has to be sent every time, even if static, so Prometheus won't think it's absent.
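To make the version metric above concrete, here is a minimal Go sketch that parses a version string and renders the sample line with the constant value 1. The input string format and the names versionInfo, parseVersion and formatVersionMetric are assumptions for illustration and are not taken from the exporter. Since Prometheus only sees what each scrape returns, a sample like this would be rebuilt on every scrape.

```go
package main

import (
	"fmt"
	"regexp"
)

// versionInfo is a hypothetical holder for the parsed Varnish version.
type versionInfo struct {
	Major, Minor, Patch, Revision, Version string
}

// versionRe matches a dotted version and an optional revision, e.g.
// "varnish-3.0.5 revision 1a89b1f". The input format is an assumption.
var versionRe = regexp.MustCompile(`(\d+)\.(\d+)\.(\d+)(?: revision ([0-9a-f]+))?`)

// parseVersion extracts the version parts; ok is false if nothing matched.
func parseVersion(s string) (v versionInfo, ok bool) {
	m := versionRe.FindStringSubmatch(s)
	if m == nil {
		return versionInfo{}, false
	}
	return versionInfo{
		Major:    m[1],
		Minor:    m[2],
		Patch:    m[3],
		Revision: m[4],
		Version:  m[1] + "." + m[2] + "." + m[3],
	}, true
}

// formatVersionMetric renders one sample line: all information sits in the
// labels and the value is always 1.
func formatVersionMetric(v versionInfo) string {
	return fmt.Sprintf(
		`varnish_version{major=%q,minor=%q,patch=%q,revision=%q,version=%q} 1`,
		v.Major, v.Minor, v.Patch, v.Revision, v.Version)
}

func main() {
	v, ok := parseVersion("varnishstat (varnish-3.0.5 revision 1a89b1f)")
	if !ok {
		fmt.Println("could not parse version")
		return
	}
	fmt.Println(formatVersionMetric(v))
}
```

With the assumed input, this prints the same line as shown above.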
>
> Thanks for doing this, you may want to read our guidelines for exporters to help make it as useful as possible: https://docs.google.com/document/d/1JapuiRbp-XoyECgl2lPdxITrhm5IyCUq9iA_h6jp3OY/edit
Yeah, I did skim through that doc. I'll be honest with you: I tried to understand what kind of metrics (Gauge, GaugeVec, Counter etc.) I should be using, but I'm not sure I understood the idea completely. I read the haproxy exporter a lot, as the kind of stats are quite similar to haproxy's. It seems to use GaugeVec all over the place except for the up/scrape num counters.
My current understanding is that GaugeVec should be used if I have e.g. a metric called "varnish_main_fetch" with an attached label "code" that holds the status code.
I'm also using GaugeVec for generic backend events like "varnish_backend_happy", attaching each backend's label (user-provided name) and uuid (server-generated) to the event.
Did I understand this correctly? When the same metric exists for multiple "things", I should use the same metric name, use a GaugeVec, and set values under unique labels?
>
> In particular you should be using ConstMetrics, rather than creating an abstraction on top of prometheus instrumentation and sharing state between scrapes (please, never ever create an abstraction on top of instrumentation). This will be simpler and eliminate the race condition in your current implementation.
I did create a very thin abstraction for both what I get from Varnish and what is sent to Prometheus. The Prometheus metric struct owns the gauge and adds some helper functions.
I don't see how this thin struct is any different from the map[int]Gauge used in the haproxy exporter. Sure, I added a bit more than a number, but it helps me resolve the metric that needs to be updated when I get new data from Varnish.
I wanted to have that -test mode that dumps resolved metrics to the screen. As far as I can see, once I create the metric/gauge it won't let me ask for the namespace, name, description etc. How would I accomplish this without some abstraction around the Prometheus metric?
Is "ConstMetrics" something you provide? I can't find it in the prom or client_golang packages.
Could you point me to the race condition if you spotted one? Both exporters handle their own mutex when modifying state.
I did not go the route the haproxy CSV reader took with channels; I feel that would have just made the code harder to read. The whole scrape on my VM takes 1-20 msec, so I felt no need to do work in parallel; the gains would be minimal.
Describe and Collect should both be safe to call concurrently. I do understand this, as it's essentially an HTTP handler.
>
> Is there some reason you can't decode the json into a []varnishMetric directly?
> To answer your question in the comment, the version number should go as a label on its own metric and have the value 1.
Yeah, the use of reflection and fiddling there is unfortunate but minimal. I have yet to figure out how to define a struct that correctly reads the kind of JSON that varnishstat emits. It has a single timestamp that messes up the parsing; afaik the JSON package would return an error at that point and not continue.
Here are examples from Varnish 3.x and 4.x output:
https://github.com/jonnenauha/prometheus_varnish_exporter/blob/master/varnish_test.go#L84-L97
https://github.com/jonnenauha/prometheus_varnish_exporter/blob/master/varnish_test.go#L44-L55
If you know a way to directly get that into a map[string]xMetric let me know.
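One possible way to get there without reflection, sketched under assumptions: decode the document into a map[string]json.RawMessage first, then decode each entry into the metric struct and skip entries whose JSON type does not fit, such as the top-level timestamp string. The struct fields, the name parseStats and the sample document are assumptions modeled on the description in this thread, not the exporter's actual code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sort"
)

// varnishMetric holds the fields of one counter object. The field names are
// an assumption about the varnishstat JSON output.
type varnishMetric struct {
	Description string  `json:"description"`
	Flag        string  `json:"flag"`
	Value       float64 `json:"value"`
}

// parseStats decodes in two passes: first into raw messages keyed by counter
// name, then each raw message into a varnishMetric. Entries whose JSON type
// does not fit the struct (e.g. the "timestamp" string) are skipped.
func parseStats(data []byte) (map[string]varnishMetric, error) {
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(data, &raw); err != nil {
		return nil, err
	}
	metrics := make(map[string]varnishMetric, len(raw))
	for name, msg := range raw {
		var m varnishMetric
		if err := json.Unmarshal(msg, &m); err != nil {
			var typeErr *json.UnmarshalTypeError
			if errors.As(err, &typeErr) {
				continue // not a counter object
			}
			return nil, err
		}
		metrics[name] = m
	}
	return metrics, nil
}

// sampleJSON is an assumed, shortened document for illustration only.
const sampleJSON = `{
  "timestamp": "2016-01-05T12:00:00",
  "MAIN.uptime": {"description": "Child process uptime", "flag": "c", "value": 18},
  "LCK.sms.colls": {"description": "Collisions", "flag": "c", "value": 0}
}`

func main() {
	metrics, err := parseStats([]byte(sampleJSON))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	names := make([]string, 0, len(metrics))
	for name := range metrics {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random, so sort for stable output
	for _, name := range names {
		fmt.Printf("%s = %v\n", name, metrics[name].Value)
	}
}
```

The first pass never looks inside the values, so the timestamp cannot abort the decode; only the second pass decides what counts as a metric.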
I'll add that version thing. I parse major, minor, patch and commit revision, so there are plenty of labels :)
>
>
> Can you share some sample output? I'm confused as to what you're doing with labels, and as far as I can tell from the docs varnish doesn't expose anything that would require a label.
Here is a -test run that will verify varnishstat, scrape it and dump stuff to stdout
https://dl.dropboxusercontent.com/u/3589544/code/prom/prometheus_varnish_exporter-test-mode.txt
Maybe this will give you better idea of how I'm using labels. One good example is the locks section of varnish. For example it emits
LCK.sms.colls
LCK.cli.colls
LCK.herder.colls
etc.
If I understood the label system correctly, I'm combining all of these into a single GaugeVec "lck_colls", each with a unique "ident" label whose value is sms, cli, herder etc. Is this how they should be used?
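The lock mapping described above can be sketched in Go as a small string split. This is only an illustration of the idea; the function name splitLock is hypothetical, and the strict three-part LCK.<ident>.<counter> form is an assumption based on the three names listed above.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLock turns a counter name such as "LCK.sms.colls" into the shared
// metric name "lck_colls" and the label value "sms". ok is false when the
// name does not have the assumed LCK.<ident>.<counter> form.
func splitLock(name string) (metric, ident string, ok bool) {
	parts := strings.Split(name, ".")
	if len(parts) != 3 || parts[0] != "LCK" || parts[1] == "" || parts[2] == "" {
		return "", "", false
	}
	return "lck_" + parts[2], parts[1], true
}

func main() {
	names := []string{"LCK.sms.colls", "LCK.cli.colls", "LCK.herder.colls", "MAIN.uptime"}
	for _, n := range names {
		if metric, ident, ok := splitLock(n); ok {
			fmt.Printf("%s{ident=%q}\n", metric, ident)
		} else {
			fmt.Printf("%s: not a lock counter\n", n)
		}
	}
}
```

All three lock counters end up under one metric name and differ only in the "ident" label value.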
Another example is "main_fetch". I'm not sure whether this one is OK. I basically saw that there were a bunch of common fetch_xxx counters, so I trimmed the end and put the last identifier into a label.
>
>
> Brian Brazil
> www.robustperception.io
Hmm, I'm now realizing I might have mixed up labels and const labels.
What is the difference between them in terms of .With and friends?
I do understand that with const labels the value never changes, and this seems to be the case for me: the server name/id, the main_fetch code and other identifiers scraped from Varnish never change. So I think I should be using const labels instead.
Will gaugeVec.With(prometheus.Labels{"ident":"mybackend1"}) work exactly the same regardless of whether it's a const or a "runtime changing" label?
One thing I also need to handle better is runtime changes. Users can add/remove backends in Varnish while the monitoring is running, and my code should adapt to this.
I'm now looking into the NewConstMetric stuff. I think this answers my question above. It can both describe and collect at the same time (done inside Collect). In this case I assume I will only describe the static, always-present up/version/scrapes metrics in Describe.
This looks like a perfect solution for Varnish with its labeled metrics that get added and removed at runtime. I'll investigate this approach a bit more. Right now the exporter would simply log a warning and drop such a metric on the floor.
It looks like (in your example) the ConstMetric *Desc should still be shared per unique metric name/key between all Collect runs; it's just that the *Metric always gets created on the spot.
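The pattern described above, with descriptors shared between collects and metric values created fresh each time, can be sketched with the standard library alone. The types desc and sample below are hypothetical stand-ins, not client_golang types, and descCache, get and collect are names invented for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// desc and sample are hypothetical stand-ins for a metric descriptor and a
// per-scrape metric value. They are not client_golang types.
type desc struct {
	name string
	help string
}

type sample struct {
	desc  *desc
	value float64
}

// descCache hands out one shared descriptor per metric name.
type descCache struct {
	mu    sync.Mutex
	descs map[string]*desc
}

func newDescCache() *descCache {
	return &descCache{descs: make(map[string]*desc)}
}

// get returns the cached descriptor for name, creating it on first use.
func (c *descCache) get(name, help string) *desc {
	c.mu.Lock()
	defer c.mu.Unlock()
	d, ok := c.descs[name]
	if !ok {
		d = &desc{name: name, help: help}
		c.descs[name] = d
	}
	return d
}

// collect builds fresh samples from the current stats on every call, so
// counters that appear or disappear between scrapes need no bookkeeping.
func (c *descCache) collect(stats map[string]float64) []sample {
	out := make([]sample, 0, len(stats))
	for name, v := range stats {
		out = append(out, sample{desc: c.get(name, "help for "+name), value: v})
	}
	return out
}

func main() {
	cache := newDescCache()
	// First scrape sees one backend counter, the second sees a new one too.
	first := cache.collect(map[string]float64{"varnish_backend_happy": 1})
	second := cache.collect(map[string]float64{
		"varnish_backend_happy": 1,
		"varnish_backend_conn":  4,
	})
	fmt.Println(len(first), len(second))
}
```

Nothing except the descriptor map survives between the two calls, which is what removes the shared-state bookkeeping between scrapes.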
I have rewritten the exporter to use const metrics. This simplified the codebase massively. All the bookkeeping between collects and figuring out the metrics before Describe is executed was a pain.
The new approach also has the benefit of handling runtime changes to Varnish, like adding new backends.
I have removed the pretty pointless scrape num/fail counters. I left up and version as they were before. up will only go to zero if the varnishstat command fails. I noticed that it does not fail even if Varnish is stopped; it keeps giving zero values. This won't make up go to zero, so it's a bad indicator of whether Varnish is actually up. I will have to devise a way to reliably detect if the server is up.
If you could review the code again, it's a much shorter read now :)
Ben: I'd be glad if it gets added to the 3rd party exporter list so people can find it. I reckon Google won't find that repo for a while :) I would still wait for some more code review, and until I have tested this in production with our company's Varnish instances and graphed something useful in Grafana.
A few things that you'll find in the doc:
Avoid a label called 'type', it's very generic. From a quick read of the docs, "server" may be a better label than "id", as that's what the docs call it and it indicates what it's the id of.
You should not include a "total" field in a metric, as that'll break aggregation. If you have to have it (i.e. the sum is less than the total), give it a different metric name.
How you're specifying the host/port for the exporter to listen on is different from the other go exporters. You should standardise that, and grab a default port number as described in the doc.
> A few things that you'll find in the doc: Avoid a label called 'type', it's very generic. From a quick read of the docs, "server" may be a better label than "id", as that's what the docs call it and it indicates what it's the id of.
I implemented a way to rename fq names and label keys. It's looking a bit better, but I still have generic "type" keys; "id" is gone now. Lock "target" somehow felt right, but that's pretty generic as well. Seems all I can come up with is generic ones :)
Here is a simplified dump of the data from /metrics: https://dl.dropboxusercontent.com/u/3589544/code/prom/prometheus_varnish_exporter-http-metrics.txt
> You should not include a "total" field in a metric, as that'll break aggregation. If you have to have it (i.e. the sum is less than the total), give it a different metric name.
Ah, good point about the sum() etc. Fixed now: totals have a "_total" postfix and no label.
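A small Go sketch of why a "total" label value breaks aggregation: summing over all label values counts the total twice, while a separate metric name keeps the sum correct. The series type, the sumByName helper, the label values and the numbers are all made up for illustration.

```go
package main

import "fmt"

// series is a hypothetical flattened sample: metric name, one label value
// and the sample value.
type series struct {
	name  string
	label string
	value float64
}

// sumByName adds up all samples sharing a metric name, which is roughly
// what a sum() over that metric would do.
func sumByName(samples []series, name string) float64 {
	var total float64
	for _, s := range samples {
		if s.name == name {
			total += s.value
		}
	}
	return total
}

func main() {
	// The total is exposed as just another label value.
	withTotalLabel := []series{
		{"varnish_main_fetch", "head", 2},
		{"varnish_main_fetch", "length", 8},
		{"varnish_main_fetch", "total", 10},
	}
	// The total gets its own metric name and no label.
	separateTotal := []series{
		{"varnish_main_fetch", "head", 2},
		{"varnish_main_fetch", "length", 8},
		{"varnish_main_fetch_total", "", 10},
	}
	fmt.Println(sumByName(withTotalLabel, "varnish_main_fetch")) // 20, double counted
	fmt.Println(sumByName(separateTotal, "varnish_main_fetch"))  // 10
}
```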
> How you're specifying the host/port for the exporter to listen on is different from the other go exporters. You should standardise that, and grab a default port number as described in the doc.
Alright, will do at some point.