Bug#993051: avahi-daemon CPU usage increases over time

Ryan Armstrong

Aug 26, 2021, 7:40:04 PM
Package: avahi-daemon
Version: 0.8-5
Severity: normal

Dear Maintainer,

After upgrading to Debian Bullseye, I noticed that Avahi's CPU usage on my server machine was
quite high (eventually 100% of one core). After restarting Avahi, the CPU usage returned to normal,
then climbed again over time until it was once more rather high. The
increase appears to be (very roughly) 1 or 2% per hour on my rather humble Intel(R) Celeron(R)
CPU 4205U @ 1.80GHz.
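
The growth rate described above can be put into numbers by sampling the daemon's accumulated CPU time from `/proc`. What follows is only a minimal sketch, assuming a Linux system; the function names (`parse_stat_ticks`, `cpu_percent`, `sample`) are illustrative and not part of Avahi or any Debian tool.

```python
import os
import time


def parse_stat_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces, so cut at the last ')' before splitting the other fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before: int, ticks_after: int,
                elapsed_s: float, ticks_per_s: int) -> float:
    """CPU usage over the interval, as a percentage of one core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * elapsed_s)


def sample(pid: int, interval_s: float = 60.0) -> float:
    """Measure the CPU usage of `pid` over `interval_s` seconds."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")

    def read_ticks() -> int:
        with open(f"/proc/{pid}/stat") as f:
            return parse_stat_ticks(f.read())

    before, t0 = read_ticks(), time.monotonic()
    time.sleep(interval_s)
    after, t1 = read_ticks(), time.monotonic()
    return cpu_percent(before, after, t1 - t0, ticks_per_s)
```

Calling `sample(pid)` once an hour with the daemon's PID and logging the result would show whether the increase is steady or comes in steps, for instance when a particular device joins the network.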

Checking the journal, I only see the following sorts of lines:

Aug 25 16:21:30 zeta avahi-daemon[313333]: avahi_normalize_name() failed.
Aug 25 16:21:30 zeta avahi-daemon[313333]: avahi_key_new() failed.
Aug 25 16:21:30 zeta avahi-daemon[313333]: avahi_normalize_name() failed.
Aug 25 16:21:30 zeta avahi-daemon[313333]: avahi_key_new() failed.
Aug 25 16:21:31 zeta avahi-daemon[313333]: avahi_normalize_name() failed.
Aug 25 16:21:31 zeta avahi-daemon[313333]: avahi_key_new() failed.

These appeared around the time I turned on another machine on my network.
However, that timing does not line up with when Avahi's CPU usage increased.
Instead, the increase seems to line up with when I turn on my printer, but nothing
of note was logged when that happened.

Is there any means for me to gather additional information to help
diagnose this problem?

Thanks,
Ryan

-- System Information:
Debian Release: 11.0
APT prefers stable-security
APT policy: (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 5.10.0-8-amd64 (SMP w/2 CPU threads)
Locale: LANG=en_CA.UTF-8, LC_CTYPE=en_CA.UTF-8 (charmap=UTF-8), LANGUAGE=en_CA:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages avahi-daemon depends on:
ii adduser 3.118
ii bind9-host [host] 1:9.16.15-1
ii dbus 1.12.20-2
ii init-system-helpers 1.60
ii libavahi-common3 0.8-5
ii libavahi-core7 0.8-5
ii libc6 2.31-13
ii libcap2 1:2.44-1
ii libdaemon0 0.14-7.1
ii libdbus-1-3 1.12.20-2
ii libexpat1 2.2.10-2
ii lsb-base 11.1.0

Versions of packages avahi-daemon recommends:
ii libnss-mdns 0.14.1-2

Versions of packages avahi-daemon suggests:
pn avahi-autoipd <none>

-- no debconf information

Ryan Armstrong

Sep 14, 2021, 7:20:03 PM
I have made an attempt at profiling where the avahi-daemon is getting
stuck. Hopefully this is useful to someone. I downloaded the source
package for avahi-daemon, then rebuilt it and installed the resulting
debug packages. After waiting for the CPU usage to reach around 70% or
so, I attempted to profile where it is executing.

My initial attempt at running `perf record -p 514 -g` failed; when
running `perf report` afterwards, the report was completely empty. As a
result, I decided to go back to my original plan and just break a few
times in gdb to see where it stopped. See the attached log for the full
session. It was usually in one of two places:

#0  0x00007ff359801fb0 in find_next_timeout (s=<optimized out>) at simple-watch.c:431
#1  0x00007ff3598027ea in avahi_simple_poll_prepare (s=s@entry=0x558f0b1e5ff0, timeout=timeout@entry=-1) at simple-watch.c:481
#2  0x00007ff359802c69 in avahi_simple_poll_iterate (s=0x558f0b1e5ff0, timeout=timeout@entry=-1) at simple-watch.c:599
#3  0x0000558f09c929ee in run_server (c=0x558f09cb01e0 <config>) at main.c:1268
#4  main (argc=<optimized out>, argv=<optimized out>) at main.c:1686

or

#0  0x00007ff35962d3c3 in __GI___poll (fds=0x558f0b1eec90, nfds=10, timeout=2209504) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007ff359802aa1 in avahi_simple_poll_run (s=0x558f0b1e5ff0) at simple-watch.c:527
#2  avahi_simple_poll_run (s=0x558f0b1e5ff0) at simple-watch.c:518
#3  0x00007ff359802c78 in avahi_simple_poll_iterate (s=0x558f0b1e5ff0, timeout=timeout@entry=-1) at simple-watch.c:602
#4  0x0000558f09c929ee in run_server (c=0x558f09cb01e0 <config>) at main.c:1268
#5  main (argc=<optimized out>, argv=<optimized out>) at main.c:1686

Ryan

On 2021-09-11 1:11 p.m., Ryan Armstrong wrote:
> I edited my Avahi service to add the --debug flag to see if it added
> anything useful. It doesn't seem so, but here it is regardless. I've
> attached both the full log (gzipped) and an annotated and simplified log.
>
> I may try and attach to the process with GDB in the future to trace
> what part of the code it is primarily working in.
>
> Ryan
>
avahi-debug session.log

Andreas Schneider

Nov 9, 2021, 8:00:03 AM
I have had the exact same problem for quite some time (certainly 6 months, likely more), using Debian sid.

I'm surprised that this bug report is the only mention of the issue I could find on the net. It's reproducible every time I switch on my printer, but it takes many hours, on the order of a few days, before it turns into a significant problem. Running `sudo avahi-daemon -k` kills the process and a new one is started, which behaves well until I switch on my printer again.

It looks like Ryan already did what I was planning to do, i.e. debug through the daemon and do some Monte Carlo profiling. If there is anything I can do to help triage the problem, please let me know. For now I can only share details of my home setup that may or may not be helpful: the printer is an HP Deskjet-3630; DNS is handled by a FRITZ!Box, but I believe the issue also showed up while I was using my homegrown dnsmasq server.

Best,

Andreas

Gustavo Noronha Silva

Jan 3, 2022, 6:50:03 AM
Hey,

I noticed this problem a while ago and did some investigation
yesterday. perf did indeed show find_next_timeout as the culprit, and gdb
helped me realize the timeout linked list was just growing without
bound...
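
To see why an ever-growing timeout list would show up as slowly rising CPU usage, here is a toy model. It is not Avahi's code: the class, field, and function names are hypothetical, and it assumes (the thread does not say) that the growth comes from freed timeouts being marked dead but never unlinked. Each scan for the next timeout then visits every entry ever created.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Timeout:
    expiry: float
    dead: bool = False


@dataclass
class ToyPoll:
    """Toy event loop state; NOT Avahi's actual data structure."""
    reclaim_dead: bool = True
    timeouts: List[Timeout] = field(default_factory=list)

    def add(self, expiry: float) -> Timeout:
        t = Timeout(expiry)
        self.timeouts.append(t)
        return t

    def free(self, t: Timeout) -> None:
        # Freed entries are only marked; unlinking is deferred to cleanup().
        t.dead = True

    def cleanup(self) -> None:
        # Assumed failure mode: when this step is skipped, dead entries pile up.
        if self.reclaim_dead:
            self.timeouts = [t for t in self.timeouts if not t.dead]

    def find_next_timeout(self) -> Tuple[Optional[Timeout], int]:
        """Return the earliest live timeout and the number of entries visited."""
        best, visited = None, 0
        for t in self.timeouts:
            visited += 1
            if t.dead:
                continue
            if best is None or t.expiry < best.expiry:
                best = t
        return best, visited


def scan_cost_after(iterations: int, reclaim_dead: bool) -> int:
    """Entries visited by one scan after `iterations` add/free cycles."""
    poll = ToyPoll(reclaim_dead=reclaim_dead)
    poll.add(expiry=1e9)  # one long-lived timer that stays active
    for i in range(iterations):
        poll.free(poll.add(expiry=float(i)))  # short-lived timer
        poll.cleanup()
    return poll.find_next_timeout()[1]
```

With cleanup working, `scan_cost_after(n, True)` stays constant no matter how large `n` gets; with cleanup skipped, `scan_cost_after(n, False)` grows linearly with `n`, and since the scan runs on every loop iteration, that matches the gradual increase reported in this thread.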

Here's the fix: https://github.com/lathiat/avahi/pull/366

Cheers,

Gustavo

Michael Welsh Duggan

Mar 11, 2022, 1:10:03 PM
I've been running with the patch linked to by Gustavo Noronha Silva
<k...@debian.org> for a few days now, and for me it seems to have solved
this problem. I'd suggest patching Debian's version until upstream
makes a release including this fix.

--
Michael Welsh Duggan
(md...@md5i.com)