
Bug#1025618: cloud-init and firewalld systemd unit files have ordering cycles


Guillaume Knispel

Dec 6, 2022, 12:40:03 PM
Package: cloud-init
Version: 20.4.1-2+deb11u1
Severity: important
X-Debbugs-Cc: xi...@australdx.fr

Dear Maintainer,

firewalld and cloud-init have ordering cycles between their systemd unit
files, leading to more or less broken boot results when both are installed
and active, because at each boot systemd decides to skip a
non-deterministically chosen service (not necessarily cloud-init or
firewalld) to break the cycle.

I'm not sure whether firewalld or cloud-init is more at fault (perhaps
by not respecting some systemd rule?), so I'm also opening a duplicate
of this bug against the other package.

The consequences vary but can be serious, depending on which unit
should have been started but was not.

Example boot traces showing the issue:

* example 1:
sysinit.target: Found ordering cycle on cloud-init.service/start
sysinit.target: Found dependency on networking.service/start
sysinit.target: Found dependency on network-pre.target/start
sysinit.target: Found dependency on firewalld.service/start
sysinit.target: Found dependency on basic.target/start
sysinit.target: Found dependency on sockets.target/start
sysinit.target: Found dependency on uuidd.socket/start
sysinit.target: Found dependency on sysinit.target/start
sysinit.target: Job cloud-init.service/start deleted to break ordering cycle starting with sysinit.target/start

* example 2:
sysinit.target: Found ordering cycle on cloud-init.service/start
sysinit.target: Found dependency on networking.service/start
sysinit.target: Found dependency on network-pre.target/start
sysinit.target: Found dependency on firewalld.service/start
sysinit.target: Found dependency on dbus.service/start
sysinit.target: Found dependency on sysinit.target/start
sysinit.target: Job cloud-init.service/start deleted to break ordering cycle starting with sysinit.target/start

* example 3:
firewalld.service: Found ordering cycle on dbus.socket/start
firewalld.service: Found dependency on sysinit.target/start
firewalld.service: Found dependency on cloud-init.service/start
firewalld.service: Found dependency on networking.service/start
firewalld.service: Found dependency on network-pre.target/start
firewalld.service: Found dependency on firewalld.service/start
firewalld.service: Job dbus.socket/start deleted to break ordering cycle starting with firewalld.service/start

* example 4:
firewalld.service: Found ordering cycle on dbus.service/start
firewalld.service: Found dependency on sysinit.target/start
firewalld.service: Found dependency on cloud-init.service/start
firewalld.service: Found dependency on networking.service/start
firewalld.service: Found dependency on network-pre.target/start
firewalld.service: Found dependency on firewalld.service/start
firewalld.service: Job dbus.service/start deleted to break ordering cycle starting with firewalld.service/start
basic.target: Found ordering cycle on sysinit.target/start
basic.target: Found dependency on cloud-init.service/start
basic.target: Found dependency on networking.service/start
basic.target: Found dependency on network-pre.target/start
basic.target: Found dependency on firewalld.service/start
basic.target: Found dependency on basic.target/start
basic.target: Job cloud-init.service/start deleted to break ordering cycle starting with basic.target/start

* example 5:
basic.target: Found ordering cycle on sockets.target/start
basic.target: Found dependency on uuidd.socket/start
basic.target: Found dependency on sysinit.target/start
basic.target: Found dependency on cloud-init.service/start
basic.target: Found dependency on networking.service/start
basic.target: Found dependency on network-pre.target/start
basic.target: Found dependency on firewalld.service/start
basic.target: Found dependency on dbus.service/start
basic.target: Found dependency on basic.target/start
basic.target: Job sockets.target/start deleted to break ordering cycle starting with basic.target/start
firewalld.service: Found ordering cycle on dbus.socket/start
firewalld.service: Found dependency on sysinit.target/start
firewalld.service: Found dependency on cloud-init.service/start
firewalld.service: Found dependency on networking.service/start
firewalld.service: Found dependency on network-pre.target/start
firewalld.service: Found dependency on firewalld.service/start
firewalld.service: Job dbus.socket/start deleted to break ordering cycle starting with firewalld.service/start

* example 6:
networking.service: Found ordering cycle on network-pre.target/start
networking.service: Found dependency on firewalld.service/start
networking.service: Found dependency on dbus.service/start
networking.service: Found dependency on basic.target/start
networking.service: Found dependency on sockets.target/start
networking.service: Found dependency on uuidd.socket/start
networking.service: Found dependency on sysinit.target/start
networking.service: Found dependency on cloud-init.service/start
networking.service: Found dependency on networking.service/start
networking.service: Job network-pre.target/start deleted to break ordering cycle starting with networking.service/start
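
To compare which job gets dropped across several boots, the "Job ... deleted
to break ordering cycle" lines can be tallied. The following is only a rough
sketch: the function name deleted_jobs and the SAMPLE list are made up for
illustration, and the regular expression assumes the message wording shown in
the traces above.

```python
import re
from collections import Counter

# Matches the message systemd logs when it drops a job to break a cycle.
# The wording is taken from the traces quoted in this report.
DELETED_JOB = re.compile(
    r"(?P<origin>\S+): Job (?P<job>\S+)/start deleted to break ordering cycle"
)


def deleted_jobs(log_lines):
    """Count, per unit, how often its start job was deleted to break a cycle."""
    counts = Counter()
    for line in log_lines:
        match = DELETED_JOB.search(line)
        if match:
            counts[match.group("job")] += 1
    return counts


# A few "deleted" lines copied from the examples above.
SAMPLE = [
    "sysinit.target: Job cloud-init.service/start deleted to break ordering cycle starting with sysinit.target/start",
    "firewalld.service: Job dbus.socket/start deleted to break ordering cycle starting with firewalld.service/start",
    "basic.target: Job sockets.target/start deleted to break ordering cycle starting with basic.target/start",
    "networking.service: Job network-pre.target/start deleted to break ordering cycle starting with networking.service/start",
]

if __name__ == "__main__":
    for unit, count in deleted_jobs(SAMPLE).most_common():
        print(f"{unit}: {count}")
```

Feeding it the journal of several boots should make the non-determinism
visible: different units show up as the deleted job from one boot to the next.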


I first hit the issue on Debian stable (Bullseye), and was then able to
reproduce it on an up-to-date Bookworm.

Note that "systemd-analyze verify" seems unable to find the issue.
However, the following repro (tried on Bookworm) shows a way to detect
the cycles statically (and during a reboot you can observe them
directly):

$ sudo apt install firewalld cloud-init
$ echo "datasource_list: [ Fallback ]" | sudo tee /etc/cloud/cloud.cfg.d/99_fallback.cfg
$ sudo reboot
$ wget https://raw.githubusercontent.com/jantman/misc-scripts/4560db773f463101273539e625c9b48e9f53f87f/dot_find_cycles.py
$ # ^ I found that script by reading https://github.com/systemd/systemd/issues/3829
$ sudo apt install 2to3 python3-pygraphviz python3-pydotplus python3-pydot python3-graphviz python3-networkx
$ 2to3 -w dot_find_cycles.py
$ chmod +x dot_find_cycles.py
$ sudo systemd-analyze dot --no-pager --order 2>/dev/null | python3 ./dot_find_cycles.py - | tee order-cycle_cloud-init-firewalld.txt
networking.service -> network-pre.target -> firewalld.service -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> basic.target -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> basic.target -> sockets.target -> cloud-init-hotplugd.socket -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> basic.target -> sockets.target -> dbus.socket -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> basic.target -> systemd-pcrphase-sysinit.service -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> dbus.socket -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> dbus.service -> basic.target -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> dbus.service -> basic.target -> sockets.target -> cloud-init-hotplugd.socket -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> dbus.service -> basic.target -> sockets.target -> dbus.socket -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> dbus.service -> basic.target -> systemd-pcrphase-sysinit.service -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> dbus.service -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> dbus.service -> dbus.socket -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> polkit.service -> dbus.socket -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> polkit.service -> basic.target -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> polkit.service -> basic.target -> sockets.target -> cloud-init-hotplugd.socket -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> polkit.service -> basic.target -> sockets.target -> dbus.socket -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> polkit.service -> basic.target -> systemd-pcrphase-sysinit.service -> sysinit.target -> cloud-init.service -> networking.service
networking.service -> network-pre.target -> firewalld.service -> polkit.service -> sysinit.target -> cloud-init.service -> networking.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> basic.target -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> basic.target -> sockets.target -> cloud-init-hotplugd.socket -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> basic.target -> sockets.target -> dbus.socket -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> basic.target -> systemd-pcrphase-sysinit.service -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> dbus.socket -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> dbus.service -> basic.target -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> dbus.service -> basic.target -> sockets.target -> cloud-init-hotplugd.socket -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> dbus.service -> basic.target -> sockets.target -> dbus.socket -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> dbus.service -> basic.target -> systemd-pcrphase-sysinit.service -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> dbus.service -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> dbus.service -> dbus.socket -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> polkit.service -> dbus.socket -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> polkit.service -> basic.target -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> polkit.service -> basic.target -> sockets.target -> cloud-init-hotplugd.socket -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> polkit.service -> basic.target -> sockets.target -> dbus.socket -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> polkit.service -> basic.target -> systemd-pcrphase-sysinit.service -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
systemd-networkd.service -> network-pre.target -> firewalld.service -> polkit.service -> sysinit.target -> cloud-init.service -> systemd-networkd-wait-online.service -> systemd-networkd.service
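
For readers who would rather not pull in the third-party script and its
dependencies, the same kind of static check can be approximated with the
Python standard library. This is a minimal sketch, not a replacement for
dot_find_cycles.py: it reports only one cycle, the names parse_edges and
find_cycle are invented here, and the edge syntax in EDGE is an assumption
about what "systemd-analyze dot --order" prints. SAMPLE encodes the shortest
cycle listed above.

```python
import re

# Assumed edge syntax of `systemd-analyze dot --order`:
#   "a.service"->"b.target" [color="green"];
EDGE = re.compile(r'"([^"]+)"\s*->\s*"([^"]+)"')


def parse_edges(dot_text):
    """Return an adjacency map {unit: set of units it points to}."""
    graph = {}
    for src, dst in EDGE.findall(dot_text):
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    return graph


def find_cycle(graph):
    """Return one cycle as a list of units (first == last), or None."""
    unseen, in_progress, done = 0, 1, 2
    state = {node: unseen for node in graph}
    path = []

    def visit(node):
        state[node] = in_progress
        path.append(node)
        for nxt in sorted(graph[node]):
            if state[nxt] == in_progress:
                # Back edge: the cycle is the tail of the current path.
                return path[path.index(nxt):] + [nxt]
            if state[nxt] == unseen:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = done
        return None

    for node in sorted(graph):
        if state[node] == unseen:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# The shortest cycle from the output above, written as dot edges.
SAMPLE = '''
"networking.service"->"network-pre.target" [color="green"];
"network-pre.target"->"firewalld.service" [color="green"];
"firewalld.service"->"sysinit.target" [color="green"];
"sysinit.target"->"cloud-init.service" [color="green"];
"cloud-init.service"->"networking.service" [color="green"];
'''

if __name__ == "__main__":
    cycle = find_cycle(parse_edges(SAMPLE))
    print(" -> ".join(cycle) if cycle else "no ordering cycle found")
```

To check a live system, SAMPLE could be replaced by the captured output of
"systemd-analyze dot --order". The search is recursive, which should be fine
for the few hundred units of a typical boot.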


Best regards,
Guillaume Knispel


-- System Information:
Debian Release: 11.5
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-19-amd64 (SMP w/64 CPU threads)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages cloud-init depends on:
ii fdisk 2.36.1-8+deb11u1
ii gdisk 1.0.6-1.1
ii ifupdown 0.8.36
ii locales 2.31-13+deb11u5
ii lsb-base 11.1.0
ii lsb-release 11.1.0
ii net-tools 1.60+git20181103.0eebece-1
ii procps 2:3.3.17-5
ii python3 3.9.2-3
ii python3-configobj 5.0.6-4
ii python3-jinja2 2.11.3-1
ii python3-jsonpatch 1.25-3
ii python3-jsonschema 3.2.0-3
ii python3-oauthlib 3.1.0-2
ii python3-requests 2.25.1+dfsg-2
ii python3-yaml 5.3.1-5
ii util-linux 2.36.1-8+deb11u1

Versions of packages cloud-init recommends:
ii cloud-guest-utils 0.31-2
ii eatmydata 105-9
ii sudo 1.9.5p2-3

Versions of packages cloud-init suggests:
pn btrfs-progs <none>
ii e2fsprogs 1.46.2-2
pn xfsprogs <none>

-- no debconf information

Ross Vandegrift

Dec 7, 2022, 8:20:04 PM
Control: forwarded -1 https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1956629

Hi Guillaume,

On Tue, Dec 06, 2022 at 06:26:26PM +0100, Guillaume Knispel wrote:
> firewalld and cloud-init have ordering cycles between their systemd unit
> files, leading to more or less broken boot results when both are installed
> and active, because at each boot systemd decides to skip a
> non-deterministically chosen service (not necessarily cloud-init or
> firewalld) to break the cycle.

Thanks for bringing this to our attention. There are a few useful
discussions:

https://github.com/firewalld/firewalld/issues/414
https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1956629

From my quick read: Michael Biebl proposes dropping network-pre.target
from cloud-init's After=, and replacing it with each of the config
backends that cloud-init supports. This sounds pretty reasonable, but
also like something that upstream should address first.

Should we consider adding "Conflicts: firewalld" to cloud-init before
the freeze? That's not optimal of course, but it'd prevent a user from
ending up in this situation for now.

Thanks,
Ross

Sam Hartman

Dec 12, 2022, 8:00:04 PM
>>>>> "Ross" == Ross Vandegrift <rvand...@debian.org> writes:

Ross> From my quick read: Michael Biebl proposes dropping
Ross> network-pre.target from cloud-init's After=, and
Ross> replacing it with each of the
Ross> config backends that cloud-init supports. This sounds pretty
Ross> reasonable, but also like something that upstream should
Ross> address first.

Why wait for upstream?
It's a bug affecting Debian users, and our systemd maintainer has a solution
that you (and I) think is reasonable.
The symptom is quite serious.
We often make changes before upstream in situations like that,
especially when the alternative is:

Ross> Should we consider adding "Conflicts: firewalld" to cloud-init
Ross> before the freeze? That's not optimal of course, but it'd
Ross> prevent a user from ending up in this situation for now.

I'd much rather see Debian local changes than conflicts.

Noah Meyerhans

Dec 12, 2022, 8:50:04 PM
We should simply move this discussion to an upstream pull request rather
than wait passively for their response. I agree that diverging from
upstream is preferable to unnecessary conflicts, but it shouldn't be
done without first consulting with upstream on our proposed solution.

noah

Ross Vandegrift

Dec 16, 2022, 7:00:04 PM
I played with the suggested solution and was unable to get it working:
cloud-init.service doesn't have a /direct/ After=network-pre.target to remove.
The ordering is implicit in the combination of units.

I think Michael probably knew that when he made the suggestion - but I had to
play with it for a few hours first. :)

At a high level the issue is: firewalld.service forces network-pre.target after
sysinit.target, but cloud-init.service forces the other way around. In detail,
using < to represent Before, the imposed orderings look like:

- from firewalld:
sysinit.target < dbus.service < firewalld.service < network-pre.target
- from cloud-init:
cloud-init-local.service < network-pre.target < systemd-networkd-wait-online.service < cloud-init.service < sysinit.target
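
The contradiction between the two chains can be checked mechanically. The
sketch below assumes Python 3.9 or later (for graphlib) and simply encodes the
two "<" chains above as ordering constraints; the helper names predecessors
and start_order are invented for this illustration and nothing here reads real
unit files.

```python
from graphlib import CycleError, TopologicalSorter

# The two chains from above, read left to right as "starts before".
FIREWALLD_CHAIN = [
    "sysinit.target",
    "dbus.service",
    "firewalld.service",
    "network-pre.target",
]
CLOUD_INIT_CHAIN = [
    "cloud-init-local.service",
    "network-pre.target",
    "systemd-networkd-wait-online.service",
    "cloud-init.service",
    "sysinit.target",
]


def predecessors(*chains):
    """Map each unit to the set of units that must start before it."""
    graph = {}
    for chain in chains:
        for earlier, later in zip(chain, chain[1:]):
            graph.setdefault(later, set()).add(earlier)
            graph.setdefault(earlier, set())
    return graph


def start_order(*chains):
    """Return one valid start order, or None if the constraints form a cycle."""
    try:
        return list(TopologicalSorter(predecessors(*chains)).static_order())
    except CycleError:
        return None


if __name__ == "__main__":
    print("firewalld alone: ", start_order(FIREWALLD_CHAIN))
    print("cloud-init alone:", start_order(CLOUD_INIT_CHAIN))
    print("both together:   ", start_order(FIREWALLD_CHAIN, CLOUD_INIT_CHAIN))
```

Each chain on its own can be ordered. Together they cannot, because one chain
puts sysinit.target before network-pre.target and the other puts it after.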

There are a few approaches to resolving this. As far as I can tell, the only
immediately viable one (at the bottom) requires users to manually fix this
and accept some trade-offs. Anyone have any better ideas?



Modify firewalld to run before sysinit.target
---------------------------------------------

This would let cloud-init and firewalld agree to do network-pre.target before
sysinit.target.

This is probably not possible since firewalld requires dbus, which starts after
sysinit.target. There's a thread at [1] about why moving firewalld to be an
early boot service is difficult.


Modify cloud-init to run after sysinit.target
---------------------------------------------

This would let cloud-init and firewalld agree to do network-pre.target after
sysinit.target. This might not be advisable (see comments in [1] about running
network management services in late boot), but it looks like this is how RHEL
does it [2].

From [3], I think cloud-init.service added Before=basic.target (which
eventually became Before=sysinit.target) to ensure cloud-init-configured block
device mounts were ready early enough in the boot process. The network needs to be
online for this, since some block device config can come from network sources.
So changing this in the Debian package seems risky to me.


Locally override firewalld.service's order
------------------------------------------

If you need to use both together, create an override unit that removes
Before=network-pre.target. This eliminates the cycle by allowing cloud-init's
order to win. But the network will be up without firewalld for a period.
Unfortunately, dependencies can't be removed in a drop-in - so I think you need
to copy the unit to /etc/systemd/system and modify it.
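
As a rough illustration of the edit involved, the following sketch removes one
unit from an ordering line in a unit file's text. It works on a string only
and does not touch the system. The helper name drop_ordering is made up, and
SAMPLE_UNIT is a hypothetical excerpt, not the firewalld.service that Debian
actually ships.

```python
def drop_ordering(unit_text, key="Before", unit="network-pre.target"):
    """Return unit_text with `unit` removed from every `key=` line."""
    kept = []
    for line in unit_text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            remaining = [u for u in value.split() if u != unit]
            if not remaining:
                continue  # the line listed only this unit, so drop it
            line = f"{key}={' '.join(remaining)}"
        kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical excerpt for illustration, NOT the real firewalld.service.
SAMPLE_UNIT = """\
[Unit]
Description=example firewall daemon
Before=network-pre.target
Wants=network-pre.target
After=dbus.service

[Service]
ExecStart=/usr/sbin/example-firewall --nofork
"""

if __name__ == "__main__":
    print(drop_ordering(SAMPLE_UNIT), end="")
```

The result would then be saved as a full copy of the unit under
/etc/systemd/system, followed by "systemctl daemon-reload". Only the Before=
line is touched here; whether to keep a Wants=network-pre.target line is a
separate decision. The trade-off described above still applies: the network
can come up before the firewall does.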

Ross

[1] - https://lists.freedesktop.org/archives/systemd-devel/2022-March/047538.html
[2] - https://github.com/canonical/cloud-init/blob/main/systemd/cloud-init.service.tmpl#L4-L6
[3] - https://github.com/canonical/cloud-init/commit/80f5ec4be0f781b26eca51d90d51abfab396b3f6

Ross Vandegrift

Jan 18, 2023, 1:50:04 AM
On Fri, Dec 16, 2022 at 03:48:00PM -0800, Ross Vandegrift wrote:
> At a high level the issue is: firewalld.service forces network-pre.target after
> sysinit.target, but cloud-init.service forces the other way around. In detail,
> using < to represent Before, the imposed orderings look like:
>
> - from firewalld:
> sysinit.target < dbus.service < firewalld.service < network-pre.target
> - from cloud-init:
> cloud-init-local.service < network-pre.target < systemd-networkd-wait-online.service < cloud-init.service < sysinit.target
>
> There are a few approaches to resolving this. As far as I can tell, the only
> immediately viable one (at the bottom) requires users to manually fix this
> and accept some trade-offs. Anyone have any better ideas?

We discussed this issue on the recent cloud-team meeting and had some
revised options.

> Modify firewalld to run before sysinit.target
> ---------------------------------------------
[snip]

This one still seems impossible.

> Modify cloud-init to run after sysinit.target
> ---------------------------------------------
[snip]

The main downside of this one is that cloud-init will be running too
late to configure block devices. But this feature didn't always work
well. So maybe we'd affect a non-working feature.

I've confirmed that cloud-init's block device setup is working well on
AWS at least. So I think this will break working cloud-init features.
IMO, that means it is not viable.

> Locally override firewalld.service's order
> ------------------------------------------
[snip]

This remains unattractive since unsuspecting users will be left with
broken images and no clear path to fix the problem.


Modify dbus to run later
------------------------

We discussed a way to improve things by shuffling dbus later, but I didn't
take good enough notes, and I can't reconstruct the details. Sorry for
forgetting - Bastian do you recall the details?


Add Breaks or Conflicts to prevent coinstallation
-------------------------------------------------

None of the alternatives seem reasonable and installing cloud-init and
firewalld cannot produce a working Debian image. So we should prevent
this state.

We thought Conflicts might be required because once both are unpacked,
the problematic cycle technically exists. Though it may not cause harm
unless both services are (re-)started simultaneously.

Ross