Google Public DNS queries > 25/sec


jim...@gmail.com

Dec 14, 2015, 9:13:04 AM
to public-dns-discuss
Hey Google,

I rate limit DNS queries at 25/sec.   What kind of queries, from one subnet, would you be sending above that limit?


Dec 14 10:21:28 ns1 named[1857]: increase from 500 to 750 RRL entries with 503 bins; average search length 1.8
Dec 14 10:21:28 ns1 named[1857]: increase from 750 to 1125 RRL entries with 503 bins; average search length 1.9
Dec 14 10:21:28 ns2 named[1870]: increase from 500 to 750 RRL entries with 503 bins; average search length 1.8
Dec 14 10:21:28 ns3 named[1851]: increase from 500 to 750 RRL entries with 503 bins; average search length 1.8
Dec 14 10:21:28 ns1 named[1857]: increase from 1125 to 1688 RRL entries with 503 bins; average search length 2.1
Dec 14 10:21:28 ns2 named[1870]: increase from 750 to 1125 RRL entries with 503 bins; average search length 1.9
Dec 14 10:21:28 ns3 named[1851]: increase from 750 to 1125 RRL entries with 503 bins; average search length 1.9
Dec 14 10:21:28 ns2 named[1870]: increase from 1125 to 1688 RRL entries with 503 bins; average search length 2.1
Dec 14 10:21:28 ns2 named[1870]: limit responses to 2404:6800:4003:c00::/56 for www.flagspot.net IN A  (356a3d32)
Dec 14 10:21:28 ns2 named[1870]: client 2404:6800:4003:c00::109#40758 (www.flagspot.net): rate limit slip response to 2404:6800:4003:c00::/56 for www.flagspot.net IN A  (356a3d32)
Dec 14 10:21:28 ns2 named[1870]: client 2404:6800:4003:c00::112#55722 (www.flagspot.net): rate limit drop response to 2404:6800:4003:c00::/56 for www.flagspot.net IN A  (356a3d32)
Dec 14 10:21:28 ns2 named[1870]: client 2404:6800:4003:c00::10d#64649 (www.flagspot.net): rate limit slip response to 2404:6800:4003:c00::/56 for www.flagspot.net IN A  (356a3d32)
Dec 14 10:21:28 ns3 named[1851]: increase from 1125 to 1688 RRL entries with 503 bins; average search length 2.1
Dec 14 10:21:28 ns2 named[1870]: client 2404:6800:4003:c00::102#57063 (www.flagspot.net): rate limit drop response to 2404:6800:4003:c00::/56 for www.flagspot.net IN A  (356a3d32)
Dec 14 10:21:28 ns2 named[1870]: client 2404:6800:4003:c00::117#65508 (www.flagspot.net): rate limit slip response to 2404:6800:4003:c00::/56 for www.flagspot.net IN A  (356a3d32)
Dec 14 10:21:29 ns1 named[1857]: increase from 1688 to 2532 RRL entries with 503 bins; average search length 2.5
Dec 14 10:21:32 ns1 named[1857]: increase from 503 to 2539 RRL bins for 2532 entries; average search length 3.9
Dec 14 10:21:32 ns3 named[1851]: increase from 503 to 1693 RRL bins for 1688 entries; average search length 3.5
Dec 14 10:21:32 ns2 named[1870]: increase from 503 to 1693 RRL bins for 1688 entries; average search length 3.6
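To see how many responses were slipped versus dropped for a given prefix, log lines like the ones above can be tallied with a short script. This is only a rough sketch: the regular expression is inferred from the excerpt in this post rather than from BIND documentation, and the names `RRL_LINE` and `tally_rrl` are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the log excerpt in this thread (an assumption, not a
# documented BIND log format). It matches the per-client "rate limit slip/drop"
# lines and ignores everything else.
RRL_LINE = re.compile(
    r"client (?P<client>[0-9a-fA-F:.]+)#\d+ \((?P<qname>[^)]+)\): "
    r"rate limit (?P<action>slip|drop) response to (?P<prefix>\S+)"
)


def tally_rrl(lines):
    """Count slip and drop actions per rate-limited prefix."""
    counts = Counter()
    for line in lines:
        match = RRL_LINE.search(line)
        if match:
            counts[(match["prefix"], match["action"])] += 1
    return counts


# Lines copied from the ns2 log excerpt above.
sample = [
    "Dec 14 10:21:28 ns2 named[1870]: limit responses to 2404:6800:4003:c00::/56 for www.flagspot.net IN A  (356a3d32)",
    "Dec 14 10:21:28 ns2 named[1870]: client 2404:6800:4003:c00::109#40758 (www.flagspot.net): rate limit slip response to 2404:6800:4003:c00::/56 for www.flagspot.net IN A  (356a3d32)",
    "Dec 14 10:21:28 ns2 named[1870]: client 2404:6800:4003:c00::112#55722 (www.flagspot.net): rate limit drop response to 2404:6800:4003:c00::/56 for www.flagspot.net IN A  (356a3d32)",
    "Dec 14 10:21:28 ns2 named[1870]: client 2404:6800:4003:c00::10d#64649 (www.flagspot.net): rate limit slip response to 2404:6800:4003:c00::/56 for www.flagspot.net IN A  (356a3d32)",
    "Dec 14 10:21:28 ns2 named[1870]: client 2404:6800:4003:c00::102#57063 (www.flagspot.net): rate limit drop response to 2404:6800:4003:c00::/56 for www.flagspot.net IN A  (356a3d32)",
    "Dec 14 10:21:28 ns2 named[1870]: client 2404:6800:4003:c00::117#65508 (www.flagspot.net): rate limit slip response to 2404:6800:4003:c00::/56 for www.flagspot.net IN A  (356a3d32)",
]

for (prefix, action), n in sorted(tally_rrl(sample).items()):
    print(prefix, action, n)
```

Run against a full day of logs, the same tally would show whether the limiting was a one-off spike or a recurring pattern from the same prefix.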


Is this possibly the same problem that leads GoDaddy to block Google Public DNS?

-Jim P.

jim...@gmail.com

Dec 14, 2015, 12:28:17 PM
to public-dns-discuss
Query Graphs:   http://i.imgur.com/XwFyAha.png  (all times UTC)

-Jim P.

Alex Dupuy

Dec 15, 2015, 1:07:15 PM
to public-dns-discuss
Hi Jim,

Thanks for operating a great site (I say that as a strictly amateur vexillologist; I was a junior member of NAVA as a kid back in the 1970s). And congratulations on operating a DNSSEC-signed domain that is actually correctly configured (http://dnsviz.net/d/www.flagspot.net/dnssec/); that's not so common.
 
I rate limit DNS queries at 25/sec.   What kind of queries, from one subnet, would you be sending above that limit?

Dec 14 10:21:28 ns2 named[1870]: limit responses to 2404:6800:4003:c00::/56 for www.flagspot.net IN A  (356a3d32)
Dec 14 10:21:28 ns2 named[1870]: client 2404:6800:4003:c00::109#40758 (www.flagspot.net): rate limit slip response to 2404:6800:4003:c00::/56 for www.flagspot.net IN A  (356a3d32)
Dec 14 10:21:28 ns2 named[1870]: client 2404:6800:4003:c00::112#55722 (www.flagspot.net): rate limit drop response to 2404:6800:4003:c00::/56 for www.flagspot.net IN A  (356a3d32)

The Google Public DNS FAQ identifies all our IPv4 and IPv6 locations; 2404:6800:4003:c00::/56 is "tul", aka Tulsa, Oklahoma. But since we have fewer IPv6 resolver anycast locations, and the IPv6 routing topology has a lot more tunneling, the queries could be originating from almost anywhere in the Western Hemisphere. Google Public DNS has many separate nameserver and resolver engines at each cluster, and the cluster size is substantially larger than 25. A single client with a 20 millisecond fixed timeout can easily generate 50 queries in a second, and these would be load-balanced across as many nameservers in the cluster. We do have nameserver rate limiting, so that no cluster will send more than a few thousand QPS (consistent with the RRL entry sizes you see) to any single nameserver IP, but that's at a much higher level than your rate limiting (and it does need to be that high to support our legitimate query volume to nameservers that handle many, many, many domains).
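The arithmetic behind that example can be sketched as follows. This is a toy model only: the function names are made up, the per-second window is a simplification of how BIND's RRL actually accounts for responses, and the `slip=2` value is an assumption, since the actual setting on Jim's servers is not stated in this thread.

```python
def queries_per_second(timeout_ms):
    """Queries sent in one second by a client that re-sends after every fixed timeout."""
    return 1000 // timeout_ms


def simulate_second(queries, limit=25, slip=2):
    """Toy one-second model of response rate limiting (not BIND's real algorithm).

    Up to `limit` identical responses to one prefix are answered normally.
    Of the excess, every `slip`-th one is "slipped" (answered with a short
    truncated reply) and the rest are dropped.
    """
    answered = min(queries, limit)
    excess = queries - answered
    slipped = excess // slip if slip else 0
    dropped = excess - slipped
    return answered, slipped, dropped


qps = queries_per_second(20)          # 20 ms fixed timeout
answered, slipped, dropped = simulate_second(qps, limit=25)
print(f"{qps} queries/sec: {answered} answered, {slipped} slipped, {dropped} dropped")
```

Because all of the resolvers in the cluster share one /56, their queries are counted together against the same 25/sec limit even though each individual resolver sends only one or two.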
 
Is this possibly the same problem that leads GoDaddy to block Google Public DNS?

In your case it was probably just a misconfigured (or mis-designed) DNS client somewhere generating a very short-lived traffic spike; your rate limiting protected you (if you even really needed protection at all).  In GoDaddy's case, the ultimate source of the problem is typically a Denial-of-Service attack; they are probably getting very high rates of (bogus) queries from all over, including other public open resolvers.  While we limit the traffic we send, we get a far larger volume of queries, and it could be that it is easier for them to just block us (and drop incoming DNS traffic by maybe 25%?) than to implement more sophisticated load-shedding approaches (like RRL).

Alex Dupuy

Dec 15, 2015, 1:09:06 PM
to public-dns-discuss
I forgot to include this link to our rate-limiting protections in my previous reply: https://developers.google.com/speed/public-dns/docs/security#rate_limit

jim...@gmail.com

Dec 17, 2015, 9:21:23 PM
to public-dns-discuss
Alex, thank you for the reply, compliments, and detailed explanation. My concern lies in what happens to normal DNS queries that transit Google Public DNS when my DNS server rate-limits a single Google Public DNS server. Am I correct in assuming that my rate limiting caused Google Public DNS to return SERVFAIL for my domains during that period?

-Jim P.
 