blackbox_exporter 0.24.0 and smokeping_prober 0.7.1 - DNS cache "nscd" not working


Alexander Wilke

Mar 15, 2024, 10:41:37 AM
to Prometheus Users
Hello,

I am running blackbox_exporter and smokeping_prober in a RHEL8 environment. Unfortunately, with our configuration we have around 4-5 million DNS queries per 24 hours.

The reason is that we run very frequent TCP probes against various destinations, which results in many DNS requests.

To reduce the DNS load on the DNS server we tried to implement "nscd" as a DNS cache.

However, running strace we notice that blackbox_exporter checks resolv.conf, then nsswitch.conf, then /etc/hosts, and then sends the query directly to the DNS server without using the DNS cache. That happens for every target of blackbox_exporter.

For smokeping_prober I am aware that it resolves DNS only at restart, and we notice the same behaviour there. All requests are sent directly to the DNS server, not to the cache.

Is anyone using nscd on RHEL8 to cache lookups for blackbox_exporter and/or smokeping_prober?

If not, does anyone have a working, simple unbound configuration for this specific scenario?

Do blackbox_exporter and smokeping_prober use glibc functions to resolve DNS, or something else?

Thank you very much!

Ben Kochie

Mar 15, 2024, 12:52:09 PM
to Alexander Wilke, Prometheus Users
All of the Prometheus components you're talking about are statically compiled Go binaries. They use Go's native DNS resolver, not glibc. So looking for solutions related to Go and nscd might help. I've not looked into this myself.

But on the subject of node-local DNS caches, I can highly recommend CoreDNS's cache plugin[0]. It even has built-in Prometheus support, so you can see how well your cache is working. The CoreDNS cache specifically supports prefetching, which is important for making sure there's no gap or latency in updating the cache when the TTL is close to expiring.



Alexander Wilke

Mar 15, 2024, 7:05:21 PM
to Prometheus Users
Thanks for the hint. I checked the Go DNS feature and found these hints:

  1. export GODEBUG=netdns=go # force pure Go resolver 
  2. export GODEBUG=netdns=cgo # force cgo resolver 


I set the cgo variant of the environment variable and restarted the services. However, neither systemd-resolved nor nscd seems to be able to cache the lookups.
I may have to wait for a colleague who is more experienced with Linux than I am. Perhaps we can figure out why it's not working with the new behaviour.

Anthony Cairncross

Mar 20, 2024, 2:04:43 AM
to Prometheus Users
Hello there,

I hope I can add some detail to the discussion.

Had a go - no pun intended ;) - at trying to use GODEBUG variables to see what happens.
When using "export GODEBUG=netdns=cgo+1" and running the precompiled blackbox_exporter (for example "blackbox_exporter-0.24.0.linux-amd64"), you get something like the following in the output:

go package net: built with netgo build tag; using Go's resolver

Looking at the net package in the Go source at
https://github.com/golang/go/blob/go1.20.4/src/net/conf.go#L61
or the explanation in newer versions at
https://github.com/golang/go/blob/master/src/net/conf.go#L18

It shows that if the app was built with the "netgo" build tag, the Go resolver is always used; in other words, the cgo resolver ("netcgo") cannot be selected. As Ben stated, glibc is not used.
So trying to get it to use glibc functions with "GODEBUG=netdns=cgo" won't work here.

Having a quick look at the binary, it seems, that netgo build tag was applied:

$ strings blackbox_exporter-0.24.0.linux-amd64/blackbox_exporter | egrep '\-tags.*net.*'
build   -tags=netgo
build   -tags=netgo


Or as per another var:

$ strings blackbox_exporter-0.24.0.linux-amd64/blackbox_exporter | grep CGO_ENABLED
build   CGO_ENABLED=0
build   CGO_ENABLED=0

So Go would usually look in /etc/nsswitch.conf and /etc/hosts, and then directly call the DNS server from /etc/resolv.conf if there is no local hosts entry.
To be able to use DNS caching (without rebuilding), one would need a local DNS server with enabled cache on the system which is referenced in the resolv.conf. Like with CoreDNS, bind, dnsmasq, unbound, etc.
I tried to find something about how Go uses nsswitch.conf that would get it to use nscd, but nothing has helped so far.
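
One rough way to check that a local cache is actually answering is to send the same query to it twice with Go's own resolver and compare the timings. The sketch below assumes a cache listening on 127.0.0.1:53 and uses example.com as a placeholder name; newResolver is a hypothetical helper and not part of blackbox_exporter.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// newResolver returns a pure-Go resolver that sends every query to addr
// (host:port) instead of the servers listed in /etc/resolv.conf.
func newResolver(addr string) *net.Resolver {
	return &net.Resolver{
		PreferGo: true,
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 2 * time.Second}
			return d.DialContext(ctx, network, addr)
		},
	}
}

func main() {
	// Assumed address of the local cache; adjust to your setup.
	r := newResolver("127.0.0.1:53")
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// The second lookup should be noticeably faster if the cache is working.
	for i := 1; i <= 2; i++ {
		start := time.Now()
		addrs, err := r.LookupHost(ctx, "example.com")
		if err != nil {
			fmt.Println("lookup failed:", err)
			return
		}
		fmt.Printf("lookup %d: %v in %v\n", i, addrs, time.Since(start))
	}
}
```

The prebuilt exporters need no such code, because they already use the Go resolver and simply follow whatever /etc/resolv.conf points to.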

Brian Candler

Mar 20, 2024, 4:18:26 AM
to Prometheus Users
> To be able to use DNS caching (without rebuilding), one would need a local DNS server with enabled cache on the system which is referenced in the resolv.conf.

That's what systemd does: its cache binds to 127.0.0.53, and then you point to 127.0.0.53 in /etc/resolv.conf
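
Whether a host is set up this way can be checked by looking at the nameserver lines in /etc/resolv.conf. A minimal Go sketch, assuming the usual resolv.conf syntax; nameservers and allLoopback are hypothetical helper names:

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"os"
	"strings"
)

// nameservers extracts the addresses from the "nameserver" lines of
// resolv.conf-style content. Comments and other directives are ignored.
func nameservers(content string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) >= 2 && fields[0] == "nameserver" {
			out = append(out, fields[1])
		}
	}
	return out
}

// allLoopback reports whether at least one nameserver is listed and every
// listed nameserver is a loopback address such as 127.0.0.53.
func allLoopback(servers []string) bool {
	if len(servers) == 0 {
		return false
	}
	for _, s := range servers {
		ip := net.ParseIP(s)
		if ip == nil || !ip.IsLoopback() {
			return false
		}
	}
	return true
}

func main() {
	data, err := os.ReadFile("/etc/resolv.conf")
	if err != nil {
		fmt.Println("cannot read /etc/resolv.conf:", err)
		return
	}
	servers := nameservers(string(data))
	fmt.Println("nameservers:", servers)
	fmt.Println("all loopback:", allLoopback(servers))
}
```

If any non-loopback server is listed, some queries from the Go resolver may still bypass the local cache.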

Chris Siebenmann

Mar 21, 2024, 11:33:06 AM
to Anthony Cairncross, Prometheus Users, Chris Siebenmann
> Having a quick look at the binary, it seems, that netgo build tag was
> applied:
>
> $ strings blackbox_exporter-0.24.0.linux-amd64/blackbox_exporter | egrep '\-tags.*net.*'
> build -tags=netgo
> build -tags=netgo

As a side note: if you have the Go toolchain available, you can use 'go
version -m <binary>' to conveniently dump out all of this information
(among other things). Taken from the current blackbox_exporter binary
release:

build -tags=netgo
build CGO_ENABLED=0

(In the case of standard Prometheus exporters like node_exporter and
blackbox_exporter, it looks like the tags information is reported in
their '--version' output, although not the CGO setting. But 'go version
-m' is authoritative for all Go binaries.)

- cks