On Fri 05 May 2023 15:17:37 +0000, Jens Meißner wrote:
> dnsmasq on bookworm fails to start after installation because the dns port 53 is already is use by systemd-resolved.
> After stopping systemd-resolved dnsmasq will start but refuses all dns queries with the Extended DNS Error Code 14 "Not Ready".
> This error is reproducible on new installation.
First of all, this should block dnsmasq.service (binary package "dnsmasq"), but
it should NOT block /usr/sbin/dnsmasq (binary package "dnsmasq-base").
The latter is needed by things like libvirtd and network-manager!
Here is how I solved this on my Debian 11 router:
1. in /etc/dnsmasq.d/cyber-kludges.conf
# Don't fight nsd and systemd-resolved for control over ports.
# Also don't shit yourself at boot time if dnsmasq starts before the ifaces are up.
# The combination of options is a little confusing.
# "--ignore-address=203.7.155.4" does something COMPLETELY unrelated, so
# instead we need to whitelist the OTHER dmz address (203.7.155.1).
# Then we need to whitelist the other ifaces, else it would bind ONLY to 203.7.155.1.
bind-dynamic
interface=lo
interface=byod
interface=lan
listen-address=203.7.155.1
no-dhcp-interface=dmz
listen-address=10.194.71.1
no-dhcp-interface=vpn
except-interface=internet
# Proxy DNSv4/DNSv6 from the internet.
# HARD CODE the upstream servers.
# We SHOULD get them dynamically from systemd-networkd (the DHCPv4/RA/DHCPv6 client on the "internet" interface).
# However, that requires third-party software like https://gitlab.com/craftyguy/networkd-dispatcher
# Since Aussie Broadband rarely (if ever) change these, hard-coding them is Good Enough™.
# I considered also/instead adding the Cloudflare and/or Google anycast DNS servers from here:
# https://github.com/systemd/systemd/blob/main/docs/DISTRO_PORTING.md
# ...but those DNS servers will direct us to more distant hosts.
# For example, "deb.debian.org" is
# 8ms away using the address from AB or CF, but
# 22ms away using the address from Google.
no-resolv
all-servers
cache-size=8192
server=202.142.142.142
server=202.142.142.242
server=2403:5800:100:1::142
server=2403:5800:1:5::242
# THIS BIT IS ONLY NEEDED BECAUSE I *ALSO* RUN NSD.
# IT IS NOT NEEDED FOR systemd-resolved + dnsmasq.
# Don't go out to the internet and back in, for our own domains.
# This also means e.g. "logserv" still works when the internet is down.
server=/cyber.com.au/155.7.203.in-addr.arpa/203.7.155.4
2. in /etc/systemd/network/00-dmz.network, tell systemd-networkd (and thus resolved) about dnsmasq
[Match]
Name=dmz
[Link]
RequiredForOnline=no
[Network]
Domains=cyber.com.au
Address=203.7.155.1/26
Address=203.7.155.4/26
Address=203.7.155.49/26
# THESE NEXT TWO LINES ARE THE RELEVANT ONES FOR 1035568
Domains=cyber.com.au ~155.7.203.in-addr.arpa
DNS=203.7.155.1
3. install libnss-resolve and make this link
lrwxrwxrwx 1 root root 24 Feb 24 2021 /etc/resolv.conf -> /lib/systemd/resolv.conf
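Both step 1 (the server=/.../ line) and step 2 (the ~ routing domain) need the reverse zone name for the local prefix. For anyone adapting this to their own addresses, here is a minimal Python sketch that derives it. It assumes a /24-aligned IPv4 zone, and the helper names reverse_zone_24 and scoped_server_line are made up for illustration; they are not part of dnsmasq or systemd.

```python
import ipaddress


def reverse_zone_24(addr: str) -> str:
    """Return the /24 reverse zone that contains an IPv4 address."""
    # reverse_pointer gives the full PTR name, e.g. "4.155.7.203.in-addr.arpa".
    ptr = ipaddress.IPv4Address(addr).reverse_pointer
    # Drop the host label to get the enclosing /24 zone.
    return ptr.split(".", 1)[1]


def scoped_server_line(domain: str, server: str) -> str:
    """Build a dnsmasq 'server=/domain/zone/address' line (hypothetical helper)."""
    return f"server=/{domain}/{reverse_zone_24(server)}/{server}"


if __name__ == "__main__":
    print(scoped_server_line("cyber.com.au", "203.7.155.4"))
```

Prefixes that are not /24-aligned would need different handling (e.g. RFC 2317 style delegation), which this sketch does not attempt.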
In other words, what I have is:
a. local nss users go
   libnss_resolve
   -> resolved (via socket)
      -> dnsmasq on 203.7.155.1 (for cyber.com.au and 155.7.203.in-addr.arpa)
      -> whatever systemd-networkd got from upstream DHCP/DHCPv6 (for every other domain)
b. local /etc/resolv.conf users go
   -> resolved on 127.0.0.53 (via UDP)
      [rest as above]
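The per-domain split in (a) can be modelled roughly as a longest-suffix match on the query name. The Python below is a simplified sketch, not resolved's actual algorithm: it ignores search-domain expansion, DNSSEC, and per-link default-route settings. The "." catch-all entry and the upstream address on the "internet" link are assumptions standing in for whatever DHCP supplies.

```python
# Each link: (name, DNS servers, routing domains).
# "." stands in for "everything not claimed by another link" (an assumption
# made for this model; the real DHCP link has no explicit domain configured).
LINKS = [
    ("dmz", ["203.7.155.1"], ["cyber.com.au", "155.7.203.in-addr.arpa"]),
    ("internet", ["202.142.142.142"], ["."]),  # assumed DHCP-supplied server
]


def pick_link(qname, links=LINKS):
    """Return (link name, servers) for the longest domain that is a suffix of qname."""
    labels = qname.rstrip(".").lower().split(".")
    best = None  # (number of matching labels, link name, servers)
    for name, servers, domains in links:
        for dom in domains:
            dom_labels = [] if dom == "." else dom.lower().split(".")
            n = len(dom_labels)
            # n == 0 is the catch-all; otherwise compare whole trailing labels.
            if n == 0 or labels[-n:] == dom_labels:
                if best is None or n > best[0]:
                    best = (n, name, servers)
    return (best[1], best[2]) if best else None


if __name__ == "__main__":
    for q in ("logserv.cyber.com.au", "4.155.7.203.in-addr.arpa", "deb.debian.org"):
        print(q, "->", pick_link(q))
```

Matching is done on whole labels, so a name like notcyber.com.au does not get routed to the dmz link.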
Because of the quirky way to code this in dnsmasq,
there is no good way to write a general default dnsmasq.conf to hook it up this way.
The other potential way to hook this up is to simply tell resolved not to listen on
127.0.0.53:53 (DNSStubListener=no in /etc/systemd/resolved.conf).
HOWEVER, it then means that name resolution is different for glibc (nss) versus everyone else, because
a. local nss users go
   libnss_resolve
   -> resolved (via socket)
      -> whatever systemd-networkd got from upstream DHCP/DHCPv6 (for all domains)
   ...and NEVER see RRs in dnsmasq.
b. local resolv.conf users cannot go to resolved, because it now only listens on an AF_UNIX socket, not on AF_INET (UDP).
So it either points directly upstream (typical legacy setup in dhclient) and bypasses BOTH dnsmasq and resolved; or
it's set to 127.0.0.1 (i.e. dnsmasq) and bypasses resolved.
Note that networkd has NO WAY to tell dnsmasq what DNS server(s) are supplied by upstream .network files / DHCP responses.
networkd can only tell resolved that (I last checked back in v247).
PS: I have also seen deeply inconsistent results when there are unqualified names in /etc/hosts (e.g. "10.1.2.3 alice")
because libnss_files.so, dnsmasq, and resolved treat those differently. In essence, libnss_files.so is a third "path" on top of (a) and (b) above.
My solution was to stop using /etc/hosts and
instead use /etc/hosts.dnsmasq-only (dnsmasq --addn-hosts=), then
make all name resolution paths pass through *AT LEAST* dnsmasq.
This is part of why the knee-jerk answer of "FFS, just patch IP_FREEBIND into dnsmasq" probably isn't a comprehensive fix.
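To find the /etc/hosts entries affected by the unqualified-name problem described in the PS, something like the following Python sketch may help. It only flags names without a dot; it does not move anything. The sample data is made up, apart from the "10.1.2.3 alice" entry taken from the example above.

```python
def unqualified_names(hosts_text):
    """Yield (address, name) for every hosts entry whose name contains no dot."""
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        addr, *names = line.split()
        for name in names:
            if "." not in name:
                yield addr, name


# Made-up sample in /etc/hosts syntax.
SAMPLE = """\
127.0.0.1 localhost
10.1.2.3 alice
10.1.2.4 bob.example.com bob  # alias
"""

if __name__ == "__main__":
    for addr, name in unqualified_names(SAMPLE):
        print(addr, name)
```

Note that "localhost" is flagged too; that entry presumably stays in /etc/hosts regardless.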
PPS: for the record, here is the "ip -c -4 a" of the host the above config comes from.
It only has legacy IP at the ISP, so I haven't even considered solving this for IPv6 :-(
bash5$ ip -c -4 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: byod: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    inet 203.7.155.65/26 brd 203.7.155.127 scope global byod
       valid_lft forever preferred_lft forever
    inet 203.7.155.193/26 brd 203.7.155.255 scope global byod
       valid_lft forever preferred_lft forever
4: lan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    inet 203.7.155.129/26 brd 203.7.155.191 scope global lan
       valid_lft forever preferred_lft forever
6: internet: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    inet 119.17.136.37/22 brd 119.17.139.255 scope global dynamic internet
       valid_lft 1066sec preferred_lft 1066sec
7: dmz: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    inet 203.7.155.1/26 brd 203.7.155.63 scope global dmz
       valid_lft forever preferred_lft forever
    inet 203.7.155.4/26 brd 203.7.155.63 scope global secondary dmz
       valid_lft forever preferred_lft forever
    inet 203.7.155.49/26 brd 203.7.155.63 scope global secondary dmz
       valid_lft forever preferred_lft forever
8: vpn: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1000
    inet 10.194.71.1/24 brd 10.194.71.255 scope global vpn
       valid_lft forever preferred_lft forever