Bug#1035568: dnsmasq is broken on new bookworm installations


Jens Meißner

May 5, 2023, 11:30:04 AM
Package: dnsmasq
Version: 2.89-1
Severity: grave
Justification: renders package unusable
X-Debbugs-Cc: hept...@gmx.de

Hello,

dnsmasq on bookworm fails to start after installation because DNS port 53 is already in use by systemd-resolved.
After stopping systemd-resolved, dnsmasq will start but refuses all DNS queries with the Extended DNS Error Code 14 "Not Ready".
This error is reproducible on a new installation.

Setting severity to grave because it affects clean installs.

Regards,
Jens


Steps to reproduce the problem:

1. Create a new instance from the generic bookworm image: https://cdimage.debian.org/images/cloud/bookworm/daily/20230505-1371/debian-12-generic-amd64-daily-20230505-1371.qcow2
2. Update package cache and install dnsmasq: apt update && apt install -y dnsmasq
3. dnsmasq will fail to start:

May 05 13:57:17 bookworm systemd[1]: Starting dnsmasq.service - dnsmasq - A lightweight DHCP and caching DNS server...
May 05 13:57:17 bookworm dnsmasq[1078]: dnsmasq: failed to create listening socket for port 53: Address already in use
May 05 13:57:17 bookworm dnsmasq[1078]: failed to create listening socket for port 53: Address already in use
May 05 13:57:17 bookworm dnsmasq[1078]: FAILED to start up
May 05 13:57:17 bookworm systemd[1]: dnsmasq.service: Control process exited, code=exited, status=2/INVALIDARGUMENT
May 05 13:57:17 bookworm systemd[1]: dnsmasq.service: Failed with result 'exit-code'.
May 05 13:57:17 bookworm systemd[1]: Failed to start dnsmasq.service - dnsmasq - A lightweight DHCP and caching DNS server.

4. This first problem can be solved by disabling systemd-resolved: systemctl disable --now systemd-resolved.service

5. Now dnsmasq can be started (systemctl start dnsmasq.service), but it logs an error:

May 05 13:58:51 bookworm systemd[1]: Starting dnsmasq.service - dnsmasq - A lightweight DHCP and caching DNS server...
May 05 13:58:51 bookworm dnsmasq[1184]: started, version 2.89 cachesize 150
May 05 13:58:51 bookworm dnsmasq[1184]: DNS service limited to local subnets
May 05 13:58:51 bookworm dnsmasq[1184]: compile time options: IPv6 GNU-getopt DBus no-UBus i18n IDN2 DHCP DHCPv6 no-Lua TFTP conntrack ipset nftset auth cryptohash DNSSEC loop-detect inotify dumpfile
May 05 13:58:51 bookworm dnsmasq[1184]: read /etc/hosts - 8 names
May 05 13:58:51 bookworm resolvconf[1193]: Dropped protocol specifier '.dnsmasq' from 'lo.dnsmasq'. Using 'lo' (ifindex=1).
May 05 13:58:51 bookworm resolvconf[1193]: Failed to set DNS configuration: Unit dbus-org.freedesktop.resolve1.service not found.
May 05 13:58:51 bookworm systemd[1]: Started dnsmasq.service - dnsmasq - A lightweight DHCP and caching DNS server.

6. Install dnsutils: apt install -y dnsutils

7. Try to query the local nameserver. It will refuse to respond:

$ dig @127.0.0.1 debian.org

; <<>> DiG 9.18.12-1-Debian <<>> @127.0.0.1 debian.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 14242
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; EDE: 14 (Not Ready)
;; QUESTION SECTION:
;debian.org. IN A

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) (UDP)
;; WHEN: Fri May 05 14:00:51 UTC 2023
;; MSG SIZE rcvd: 45
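
For anyone who wants to script the check in step 7, here is a minimal sketch that pulls the response status and the Extended DNS Error out of dig's text output. The function name `parse_dig_status` and the `SAMPLE` constant are made up for illustration (they are not part of dig or dnsmasq), and the regular expressions only assume the output layout shown above.

```python
import re


def parse_dig_status(output: str) -> dict:
    """Extract the status and any Extended DNS Error from dig's text output.

    Hypothetical helper: it only assumes the "status: X" header field and
    the "; EDE: <code> (<text>)" line layout seen in the report above.
    """
    status = re.search(r"status:\s*([A-Z]+)", output)
    ede = re.search(r"^;\s*EDE:\s*(\d+)\s*\(([^)]*)\)", output, re.MULTILINE)
    return {
        "status": status.group(1) if status else None,
        "ede_code": int(ede.group(1)) if ede else None,
        "ede_text": ede.group(2) if ede else None,
    }


# Excerpt of the dig output quoted in step 7.
SAMPLE = """\
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 14242
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; EDE: 14 (Not Ready)
"""

if __name__ == "__main__":
    print(parse_dig_status(SAMPLE))
```

In practice the text would come from running dig (for example via `subprocess.run`) rather than from a constant; that part is left out so the sketch stays self-contained.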


-- System Information:
Debian Release: 12.0
APT prefers testing-security
APT policy: (500, 'testing-security'), (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 6.1.0-7-amd64 (SMP w/2 CPU threads; PREEMPT)
Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages dnsmasq depends on:
ii dnsmasq-base [dnsmasq-base] 2.89-1
ii init-system-helpers 1.65.2
ii netbase 6.4
ii runit-helper 2.15.2
ii sysvinit-utils [lsb-base] 3.06-4

dnsmasq recommends no packages.

Versions of packages dnsmasq suggests:
ii systemd-resolved [resolvconf] 252.6-1

-- no debconf information

Trent W. Buck

May 14, 2023, 11:32:19 PM
On Fri 05 May 2023 15:17:37 +0000, Jens Meißner wrote:
> dnsmasq on bookworm fails to start after installation because DNS port 53 is already in use by systemd-resolved.
> After stopping systemd-resolved, dnsmasq will start but refuses all DNS queries with the Extended DNS Error Code 14 "Not Ready".
> This error is reproducible on a new installation.

First of all, this should block dnsmasq.service (binary package "dnsmasq"), but
it should NOT block /usr/sbin/dnsmasq (binary package "dnsmasq-base").
The latter is needed by things like libvirtd and network-manager!



Here is how I solved this on my Debian 11 router:

1. in /etc/dnsmasq.d/cyber-kludges.conf

# Don't fight nsd and systemd-resolved for control over ports.
# Also don't shit yourself at boot time if dnsmasq starts before the ifaces are up.
# The combination of options is a little confusing.
# "--ignore-address=203.7.155.4" does something COMPLETELY unrelated, so
# instead we need to whitelist the OTHER dmz address (203.7.155.1).
# Then we need to whitelist the other ifaces, else it would bind ONLY to 203.7.155.1.
bind-dynamic
interface=lo
interface=byod
interface=lan
listen-address=203.7.155.1
no-dhcp-interface=dmz
listen-address=10.194.71.1
no-dhcp-interface=vpn
except-interface=internet


# Proxy DNSv4/DNSv6 from the internet.
# HARD CODE the upstream servers.
# We SHOULD get them dynamically from systemd-networkd (the DHCPv4/RA/DHCPv6 client on the "internet" interface).
# However, that requires third-party software like https://gitlab.com/craftyguy/networkd-dispatcher
# Since Aussie Broadband rarely (if ever) changes these, hard-coding them is Good Enough™.
# I considered also/instead adding the Cloudflare and/or Google anycast DNS servers from here:
# https://github.com/systemd/systemd/blob/main/docs/DISTRO_PORTING.md
# ...but those DNS servers will direct us to more distant hosts.
# For example, "deb.debian.org" is
# 8ms away using the address from AB or CF, but
# 22ms away using the address from Google.
no-resolv
all-servers
cache-size=8192
server=202.142.142.142
server=202.142.142.242
server=2403:5800:100:1::142
server=2403:5800:1:5::242

# THIS BIT IS ONLY NEEDED BECAUSE I *ALSO* RUN NSD.
# IT IS NOT NEEDED FOR systemd-resolved + dnsmasq.
# Don't go out to the internet and back in, for our own domains.
# This also means e.g. "logserv" still works when the internet is down.
server=/cyber.com.au/155.7.203.in-addr.arpa/203.7.155.4

2. in /etc/systemd/network/00-dmz.network, tell systemd-networkd (and thus resolved) about dnsmasq

[Match]
Name=dmz
[Link]
RequiredForOnline=no
[Network]
Domains=cyber.com.au
Address=203.7.155.1/26
Address=203.7.155.4/26
Address=203.7.155.49/26
# THESE NEXT TWO LINES ARE THE RELEVANT ONES FOR 1035568
Domains=cyber.com.au ~155.7.203.in-addr.arpa
DNS=203.7.155.1

3. install libnss-resolve and make this link

lrwxrwxrwx 1 root root 24 Feb 24 2021 /etc/resolv.conf -> /lib/systemd/resolv.conf

In other words, what I have is:

a. local nss users go

libnss_resolve
-> resolved (via socket)
-> dnsmasq on 203.7.155.1 (for cyber.com.au and 155.7.203.in-addr.arpa)

-> whatever systemd-networkd got from upstream DHCP/DHCPv6 (for every other domain)

b. local /etc/resolv.conf users go

-> resolved on 127.0.0.53 (via UDP)
[rest as above]
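
The split described in (a) can be pictured with a small, deliberately simplified model: pick the server whose configured domain is the longest suffix of the query name, and fall back to upstream otherwise. This is only a sketch of the idea, not resolved's actual routing logic (which has more rules, e.g. for single-label names and default routes). The names `pick_server`, `ROUTES` and `UPSTREAM` are hypothetical; the domains and the 203.7.155.1 address come from the .network file above.

```python
def pick_server(name, routes, default):
    """Return the server for the longest matching domain suffix, else default.

    Simplified illustration only; matching is done on whole labels, so
    "notcyber.com.au" does not match "cyber.com.au".
    """
    name = name.rstrip(".").lower()
    best_domain, best_server = None, default
    for domain, server in routes.items():
        # A leading "~" marks a routing-only domain in systemd .network files.
        d = domain.lstrip("~").rstrip(".").lower()
        if name == d or name.endswith("." + d):
            if best_domain is None or len(d) > len(best_domain):
                best_domain, best_server = d, server
    return best_server


# Domains from the Domains= line above, both pointing at dnsmasq.
ROUTES = {
    "cyber.com.au": "203.7.155.1",
    "~155.7.203.in-addr.arpa": "203.7.155.1",
}
# Placeholder for whatever systemd-networkd learned via DHCP/DHCPv6.
UPSTREAM = "upstream"

if __name__ == "__main__":
    for query in ("logserv.cyber.com.au", "4.155.7.203.in-addr.arpa", "deb.debian.org"):
        print(query, "->", pick_server(query, ROUTES, UPSTREAM))
```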

Because of the quirky way this has to be coded in dnsmasq,
there is no good way to write a general default dnsmasq.conf to hook it up this way.

The other potential way to hook this up is to simply tell resolved not to listen on 127.0.0.53:53 (DNSStubListener=no in /etc/systemd/resolved.conf).
HOWEVER, it then means that name resolution is different for glibc (nss) versus everyone else, because

a. local nss users go

libnss_resolve
-> resolved (via socket)
-> whatever systemd-networkd got from upstream DHCP/DHCPv6 (for all domains)

...NEVER see RRs in dnsmasq.

b. local resolv.conf users cannot go to resolved, because it now only listens on an AF_UNIX socket, not on a UDP socket (AF_INET, SOCK_DGRAM).

So it either points directly upstream (typical legacy setup in dhclient) and bypasses BOTH dnsmasq and resolved; or
it's set to 127.0.0.1 (i.e. dnsmasq) and bypasses resolved.

Note that networkd has NO WAY to tell dnsmasq what DNS server(s) are supplied by upstream .network files / DHCP responses.
networkd can only tell resolved that (I last checked back in v247).


PS: I have also seen deeply inconsistent results when there are unqualified names in /etc/hosts (e.g. "10.1.2.3 alice")
because libnss_files.so, dnsmasq, and resolved treat those differently. In essence, libnss_files.so is a third "path" on top of (a) and (b) above.

My solution was to stop using /etc/hosts and
instead use /etc/hosts.dnsmasq-only (dnsmasq --addn-hosts=), then
make all name resolution paths pass through *AT LEAST* dnsmasq.
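
To find the entries that are likely to trigger this before migrating them, something like the following sketch could list every unqualified (dot-free) name in hosts-format text. `unqualified_names` is a hypothetical helper, and treating "no dot in the name" as "unqualified" is an assumption made for illustration.

```python
def unqualified_names(hosts_text):
    """Return (address, name) pairs for names that contain no dot.

    Hypothetical helper: comments and blank lines are skipped, and every
    name after the address on a line is checked.
    """
    found = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        addr, *names = line.split()
        for name in names:
            if "." not in name:
                found.append((addr, name))
    return found


if __name__ == "__main__":
    sample = "127.0.0.1 localhost\n10.1.2.3 alice\n10.1.2.4 bob.example.com bob  # laptop\n"
    for addr, name in unqualified_names(sample):
        print(addr, name)
```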

This is part of why the knee-jerk answer of "FFS, just patch IP_FREEBIND into dnsmasq" probably isn't a comprehensive fix.


PPS: for the record, here is the "ip -c -4 a" of the host the above config comes from.
It only has legacy IP at the ISP, so I haven't even considered solving this for IPv6 :-(

bash5$ ip -c -4 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: byod: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
inet 203.7.155.65/26 brd 203.7.155.127 scope global byod
valid_lft forever preferred_lft forever
inet 203.7.155.193/26 brd 203.7.155.255 scope global byod
valid_lft forever preferred_lft forever
4: lan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
inet 203.7.155.129/26 brd 203.7.155.191 scope global lan
valid_lft forever preferred_lft forever
6: internet: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
inet 119.17.136.37/22 brd 119.17.139.255 scope global dynamic internet
valid_lft 1066sec preferred_lft 1066sec
7: dmz: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
inet 203.7.155.1/26 brd 203.7.155.63 scope global dmz
valid_lft forever preferred_lft forever
inet 203.7.155.4/26 brd 203.7.155.63 scope global secondary dmz
valid_lft forever preferred_lft forever
inet 203.7.155.49/26 brd 203.7.155.63 scope global secondary dmz
valid_lft forever preferred_lft forever
8: vpn: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1000
inet 10.194.71.1/24 brd 10.194.71.255 scope global vpn
valid_lft forever preferred_lft forever

Antonio Terceiro

May 16, 2023, 2:00:05 PM
Control: severity -1 normal

On Fri, 05 May 2023 15:17:37 +0000 Jens Meißner <hept...@gmx.de> wrote:
> Package: dnsmasq
> Version: 2.89-1
> Severity: grave
> Justification: renders package unusable
> X-Debbugs-Cc: hept...@gmx.de
>
> Hello,
>
> dnsmasq on bookworm fails to start after installation because DNS port 53 is already in use by systemd-resolved.
> After stopping systemd-resolved, dnsmasq will start but refuses all DNS queries with the Extended DNS Error Code 14 "Not Ready".
> This error is reproducible on a new installation.
>
> Setting severity to grave because it affects clean installs.

This is a bug that needs fixing, for sure. But systemd-resolved is
installed by default only on the cloud images, and not on all Debian
installs. Therefore this does not really affect all users, and grave
severity does not apply.

Enabling (uncommenting) the `bind-interfaces` option in /etc/dnsmasq.conf fixes
the issue in my tests, and maybe that should be set by default.
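
A rough illustration of why `bind-interfaces` would help: without it, dnsmasq binds the wildcard address, which collides with any socket already bound to the same port on a specific address (such as resolved's stub listener on 127.0.0.53). With it, dnsmasq binds only the addresses it actually listens on. The sketch below reproduces that collision with plain UDP sockets on Linux. `try_bind` and `demo` are hypothetical names, and a kernel-chosen high port stands in for port 53 so that no root privileges are needed.

```python
import socket


def try_bind(addr, port):
    """Return True if a UDP socket can be bound to (addr, port), else False."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.bind((addr, port))
        return True
    except OSError:  # e.g. "Address already in use"
        return False
    finally:
        s.close()


def demo():
    """Bind a stand-in stub listener, then try wildcard and loopback binds."""
    # Stand-in for the stub listener on 127.0.0.53; port 0 lets the kernel
    # pick a free high port instead of 53.
    stub = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    stub.bind(("127.0.0.53", 0))
    port = stub.getsockname()[1]
    try:
        wildcard_ok = try_bind("0.0.0.0", port)    # like dnsmasq's default
        loopback_ok = try_bind("127.0.0.1", port)  # like bind-interfaces
    finally:
        stub.close()
    return wildcard_ok, loopback_ok


if __name__ == "__main__":
    wildcard_ok, loopback_ok = demo()
    print("wildcard bind succeeded:", wildcard_ok)
    print("127.0.0.1 bind succeeded:", loopback_ok)
```

This assumes a Linux host where the whole 127.0.0.0/8 range is local to the loopback interface; other systems may not allow binding 127.0.0.53 at all.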