--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/9bdbec6d-e8ca-402e-947f-18199a514619%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Prometheus should automatically upgrade to TCP for large DNS responses.

What DNS server are you using to host your SRV records?
On Tue, Dec 12, 2017 at 1:22 AM, <craig....@fluxfederation.com> wrote:
Hi,

Just starting down the road of using Prometheus, and we're looking at using dns_sd to find all our nodes. However, it seems that it is limited to ~512 byte replies; anything longer results in various DNS resolution failures, e.g.: "dns: bad rdlength", "dns: overflow unpacking uint16", "dns: overflow unpacking uint32". Which error we get depends on the precise length of the response.

This surprises me; I can only get to around 10 nodes per SRV record, which seems like a very low number; it would take a very short host-naming scheme to get many more. Am I missing something about how dns_sd should be working or how I should be doing things?

(Yes, there are other ways I could achieve this, but dns_sd seems quite elegant, and works well with our future plans, so I'm curious to know if there's ways I can make it work)

Prometheus version is 2.0.0, if it matters.

Thanks,
Craig Miskell
On Tue, Dec 12, 2017 at 01:58:27PM -0800, craig....@fluxfederation.com wrote:
> An excellent question: it is dnsmasq, which is where it all goes wrong.
Oh my.
> For those playing at home:
> *) Prometheus sends a query, with no EDNS0 extra-size options included
> *) dnsmasq sends an oversize (>512byte) reply, with no TC bit set (bad
> dnsmasq, no cookie)
> *) The Prometheus dns client is, justifiably, displeased with this offering
>
> If I put bind9 on my prometheus node, forwarding to our dnsmasq servers,
> and make prometheus use that instead, it all works. When bind9 replies to
> prometheus, it correctly truncates the reply and sets the TC bit;
> prometheus then re-requests with EDNS0 extra-size options set, and gets the
> full reply properly encoded by bind9.
Hmm... what's different about how BIND queries dnsmasq that allows it to
proceed? Is it just that it ignores the fact of the oversized reply, and
somehow manages to parse the response anyway? If there's a difference in
the way that BIND does the *query* (or sequence of queries), there may be
scope for changing Prometheus to mimic that (say, by sending the
EDNS0-enabled query first, perhaps).
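
To make the "EDNS0-enabled query first" idea concrete, here is a minimal sketch of what such a query looks like on the wire. It is hand-rolled with the Python standard library purely for illustration; it is not how Prometheus or miekg/dns build queries, and the function names and the record name in the usage note are hypothetical. The only difference from a plain query is ARCOUNT=1 plus an 11-byte OPT pseudo-record whose CLASS field carries the UDP payload size the client is willing to accept.

```python
import struct

def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += struct.pack("!B", len(raw)) + raw
    return out + b"\x00"

def build_srv_query(name: str, query_id: int, udp_payload_size: int = 0) -> bytes:
    """Build an SRV query; a nonzero udp_payload_size appends an EDNS0 OPT record."""
    arcount = 1 if udp_payload_size else 0
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, arcount)
    # Question: QNAME, QTYPE=33 (SRV), QCLASS=1 (IN)
    question = encode_name(name) + struct.pack("!HH", 33, 1)
    if not udp_payload_size:
        return header + question
    # OPT pseudo-record: root name, TYPE=41, CLASS=payload size, TTL=0, RDLENGTH=0
    opt = b"\x00" + struct.pack("!HHIH", 41, udp_payload_size, 0, 0)
    return header + question + opt
```

For example, `build_srv_query("_prom._tcp.example.com", 0x1234, 4096)` (a made-up record name) yields the same bytes as the plain query apart from the ARCOUNT field and the trailing OPT record. Whether dnsmasq would then encode its large reply correctly is a separate question that this sketch does not answer.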
> For the record, a friend of mine who I trust deeply on DNS matters says
> that the DNS library should be able to handle that reply, and that the
> 512-byte limit is historical and a bit silly. So it's possible my
> statement that the client is justifiably displeased is wrong.
The 512 byte limit is historical, and potentially a bit silly, but dnsmasq
is deranged for sending oversized responses, because some clients won't
handle it, and can't be updated to handle it. "Be conservative in what you
send" and all that.
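
As a rough illustration of the three cases in play here, the sketch below classifies a raw UDP response by its size and its TC bit. It is an assumption-laden toy using only the Python standard library, not the logic in miekg/dns or Prometheus; the function name and the return labels are made up for this example.

```python
import struct

CLASSIC_UDP_LIMIT = 512  # historical limit for DNS over UDP without EDNS0
TC_BIT = 0x0200          # "truncated" flag in the 16-bit header flags field

def classify_udp_response(packet: bytes, advertised_size: int = 0) -> str:
    """Classify a UDP DNS response by header flags and length.

    advertised_size is the EDNS0 payload size the query offered (0 if none).
    """
    if len(packet) < 12:
        return "malformed"  # shorter than the fixed DNS header
    (flags,) = struct.unpack("!H", packet[2:4])
    if flags & TC_BIT:
        return "truncated"  # well-behaved server: retry with EDNS0 or TCP
    limit = max(advertised_size, CLASSIC_UDP_LIMIT)
    if len(packet) > limit:
        return "oversized"  # larger than the client offered, TC not set
    return "ok"
```

The behaviour described for dnsmasq corresponds to the "oversized" case: a reply above 512 bytes to a query that advertised nothing, with no TC bit to tell the client to retry. The BIND behaviour corresponds to "truncated" followed by a second, EDNS0-enabled query.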
Insofar as the problem exists in the DNS library that Prometheus uses
(https://github.com/miekg/dns), I think you're going to have to have the
argument with them about whether to support parsing over-sized but
non-truncated responses -- if there's nothing Prometheus can do differently
(short of changing DNS libraries, which I doubt is going to happen for a
problem which isn't *strictly* the client library's fault), then there's
nothing that can be fixed in Prometheus, and the problem will have to be
addressed elsewhere.