Impact of systemd on CFEngine (and configuration management tools in general)


Marco Marongiu

Jan 17, 2015, 10:28:56 AM
to help-c...@googlegroups.com
Hi there

I'd appreciate your opinions on the following, as I'm sure this will
have an impact on many of us pretty soon.

My Christmastime readings about systemd and a recent message to this
list from Bryan Burke (see below) seem to suggest that CFEngine
practitioners will have to develop a best practice pretty quick
regarding service management. And not only us: all of the people
managing services using Puppet, Chef, Salt or whatever may be impacted.

One of the features of systemd is that forked processes cannot escape
their parent in any way. The moment you kill the parent, you kill the
children too, and there is no way out of this.

Bryan's attempt to have CFEngine upgrade itself on a systemd-bound OS seems
to suggest exactly that: the new package installs, the post-installation
scripts request a service restart and boom! The whole thing is killed,
leaving the upgrade half-way done.

This suggests a few things to me.

The most important one is: on systemd-bound OSs we should get into the
habit of asking systemd to start/stop services and let go of the (now) bad
habit of restarting a service by running its script from init.d
directly. If not, that process may end up as a child of CFEngine, and
should CFEngine stop for any reason, part of the system will go down
with it.
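
A minimal Python sketch of that habit, for illustration only: pick
"systemctl restart" when systemd is the running init, and fall back to
the init script otherwise. The check for the /run/systemd/system
directory is the same test sd_booted() performs; the function name
restart_command and the /etc/init.d fallback path are assumptions made
for the example, not anything CFEngine ships.

```python
import os


def restart_command(service, systemd_marker="/run/systemd/system"):
    """Return the argv that restarts `service` on this host.

    On a systemd host the restart is requested from systemd, so the
    service is started by systemd inside its own unit instead of being
    forked from whatever process asked for the restart.
    """
    if os.path.isdir(systemd_marker):
        return ["systemctl", "restart", service]
    # Assumed fallback for non-systemd hosts: the classic SysV init script.
    return [os.path.join("/etc/init.d", service), "restart"]
```

The returned list can be handed to subprocess.run(); keeping the
decision in a pure function makes it easy to check without touching any
real service.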

The second one is: on such systems we should maybe delegate the upgrade
of CFEngine to atd (for example, by launching "echo
/usr/local/sbin/upgrade-cfengine | at now + 1 minute") or to anything
else that doesn't kill the process that upgrades CFEngine.
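
As a rough sketch of the at-based hand-off, assuming at(1) and a running
atd are available on the host: the job is passed to at on standard
input, and atd runs it later from outside the caller's process tree. The
helper names build_at_job and schedule are made up for the example, and
/usr/local/sbin/upgrade-cfengine is only the placeholder script named
above.

```python
import subprocess


def build_at_job(command, delay_minutes=1):
    """Return (argv, stdin_text) for scheduling `command` through at(1)."""
    if delay_minutes < 1:
        raise ValueError("delay_minutes must be at least 1")
    argv = ["at", "now + {} minutes".format(delay_minutes)]
    # at(1) reads the job to run from standard input, one command per line.
    return argv, command + "\n"


def schedule(command, delay_minutes=1):
    """Queue `command` with atd; raises CalledProcessError if at(1) fails."""
    argv, job = build_at_job(command, delay_minutes)
    subprocess.run(argv, input=job, text=True, check=True)
```

Calling schedule("/usr/local/sbin/upgrade-cfengine", 5) would queue the
script to run about five minutes later.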

Finally: all this should be kept in mind when writing the pre/post
installation scripts in the CFEngine packages for systemd-bound OSs.

I look forward to your considerations

Ciao
-- bronto


-------- Forwarded Message --------
Subject: Re: [help-cfengine] CFEngine update
Date: Sat, 17 Jan 2015 01:09:07 -0500
From: Bryan Burke <bbur...@gmail.com>
To: help-cfengine <help-c...@googlegroups.com>



>
> I find doing the update via cfengine is kinda tricky. If something
> goes wrong your node could be left stranded. We have adopted a stance
> that upgrades to CFEngine should be handled by an external tool due to
> the snake eating its tail scenario.

Indeed, to give some specifics, I’d like to update cfengine via
cfengine, but I’ve recently run into the problem where cfengine trying
to update itself presents problems.

I *think* what is happening in my case is that, when cfengine
successfully updates its RPM (these are RHEL7 systems), part of the
postinstall/update script attempts to issue the command “service
cfengine3 restart”, which gets redirected to “systemctl restart
cfengine3”. However, because of cgroup-type management of processes and
their children, this ends up sending a SIGTERM to the running cf-agent,
the yum command, or both. Whatever the exact actions are, I get left in
a state where both the old and new versions of cfengine-community are
installed. Running ‘yum-complete-transaction’ finishes/fixes the
process, but obviously this is not ideal.
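
One way to spot that half-done state from policy or a script, sketched in
Python under the assumption that the package is never meant to be
installed in two versions at once: "rpm -q <name>" prints one line per
installed version, so more than one line for cfengine-community points
to an interrupted transaction. The function names are hypothetical.

```python
def installed_versions(rpm_q_output):
    """Return the package lines printed by `rpm -q <name>`."""
    lines = [line.strip() for line in rpm_q_output.splitlines()]
    # rpm reports "package <name> is not installed" when nothing matches.
    return [l for l in lines if l and "is not installed" not in l]


def upgrade_left_half_done(rpm_q_output):
    """True when more than one version of the package is installed."""
    return len(installed_versions(rpm_q_output)) > 1
```

The text to feed in would come from something like
subprocess.run(["rpm", "-q", "cfengine-community"], capture_output=True,
text=True).stdout.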

If there is interest in trying to handle this, I could open a redmine
ticket on the issue.


Bryan

Bryan Burke

Jan 19, 2015, 10:03:03 AM
to help-c...@googlegroups.com

>
> I'd appreciate your opinions on the following, as I'm sure this will
> have an impact on many of us pretty soon.
>
> My Christmastime readings about systemd and a recent message to this
> list from Bryan Burke (see below) seem to suggest that CFEngine
> practitioners will have to develop a best practice pretty quick
> regarding service management. And not only us: all of the people
> managing services using Puppet, Chef, Salt or whatever may be impacted.
>
> One of the features of systemd is that forked processes cannot escape
> their parent in any way. The moment you kill the parent, you kill the
> children too, and there is no way out of this.

Do you know this for certain? One of the things I wanted to try was having a process daemonize, that is, fork (once, maybe twice), and then call setsid(); I wanted to see if this could break out of the cgroup prison. I haven’t read up too much on cgroups, so I don’t know, but I feel there must be some way for a process (perhaps, only one owned by root) to declare itself independent of cgroup monitoring.

If so, I’ve already written a cfengine module that could possibly handle situations like this. It is designed more for long-running processes, and cfe merely ensures it’s running, but it might work for certain issues like this.

>
> Bryan's attempt to have CFEngine upgrade itself on a systemd-bound OS seems
> to suggest exactly that: the new package installs, the post-installation
> scripts request a service restart and boom! The whole thing is killed,
> leaving the upgrade half-way done.
>
> This suggests a few things to me.
>
> The most important one is: on systemd-bound OSs we should get into the
> habit of asking systemd to start/stop services and let go of the (now) bad
> habit of restarting a service by running its script from init.d
> directly. If not, that process may end up as a child of CFEngine, and
> should CFEngine stop for any reason, part of the system will go down
> with it.

I suppose one solution to this problem is to decouple the upgrade of the package from the restart. At least then the package installation would succeed. There’s still an issue of how to eventually restart the service, but we can reduce the problem down to its smallest area of effect.


Bryan

Marco Marongiu

Jan 19, 2015, 10:18:29 AM
to help-c...@googlegroups.com
On 19/01/15 16:03, Bryan Burke wrote:
>>> One of the features of systemd is that forked processes cannot
>>> escape their parent in any way. The moment you kill the parent,
>>> you kill the children too, and there is no way out of this.
> Do you know this for certain?

Not tested, but that's what I read from
http://0pointer.de/blog/projects/systemd.html section "Keeping Track of
Processes"


> One of the things I wanted to try was
> having a process daemonize, that is, fork (once, maybe twice), and
> then call setsid();

From the same page, section "Writing Daemons", you read:

> We ask daemon writers not to fork or even double fork in their
> processes, but run their event loop from the initial process systemd
> starts for you. Also, don't call setsid().

What happens if you do, I don't know.


> I wanted to see if this could break out of the cgroup prison. I
> haven’t read up too much on cgroups, so I don’t know, but I feel
> there must be some way for a process (perhaps, only one owned by
> root) to declare itself independent of cgroup monitoring.

If you do some experiments on the subject, I'd be glad if you'd share
the results.


>>> The most important one is: on systemd-bound OSs we should get
>>> into the habit of asking systemd to start/stop services and let
>>> go of the (now) bad habit of restarting a service by running its
>>> script from init.d directly. If not, that process may end up as a
>>> child of CFEngine, and should CFEngine stop for any reason, part
>>> of the system will go down with it.
> I suppose one solution to this problem is to decouple the upgrade of
> the package from the restart.

Yes


> At least then the package installation would succeed.

Yes


> There’s still an issue of how to eventually restart the service,

Exactly... which brings us back to square one... Isn't spinning the
upgrade off from CFEngine the most reliable solution (e.g., by
delegating it to at)? It would work whether or not one has the package
installation and service restart in the same software package.

Ciao!
-- bronto

Bryan Burke

Jan 19, 2015, 10:25:46 AM
to help-cfengine
>
> Not tested, but that's what I read from
> http://0pointer.de/blog/projects/systemd.html section "Keeping Track of
> Processes"
>
>
>> One of the things I wanted to try was
>> having a process daemonize, that is, fork (once, maybe twice), and
>> then call setsid();
>
> From the same page, section "Writing Daemons", you read:
>
>> We ask daemon writers not to fork or even double fork in their
>> processes, but run their event loop from the initial process systemd
>> starts for you. Also, don't call setsid().
>
> What happens if you do, I don't know.

Of course, when told not to do something like that, I have to try to see what it breaks :)

>
>> I wanted to see if this could break out of the cgroup prison. I
>> haven’t read up too much on cgroups, so I don’t know, but I feel
>> there must be some way for a process (perhaps, only one owned by
>> root) to declare itself independent of cgroup monitoring.
>
> If you do some experiments on the subject, I'd be glad if you'd share
> the results.

Will do.

>> There’s still an issue of how to eventually restart the service,
>
> Exactly... which brings us back to square one... Isn't spinning the
> upgrade off from CFEngine the most reliable solution (e.g., by
> delegating it to at)? It would work whether or not one has the package
> installation and service restart in the same software package.

If we went this route, I wouldn’t want the cfengine package to assume anything about when/how it needs to restart. It’d be up to the sysadmin/policy-writer to determine what mechanism should be used to restart the service out-of-band.


Bryan

Bryan Burke

Apr 5, 2015, 10:35:35 PM
to help-cfengine
> From the same page, section "Writing Daemons", you read:
>
>> We ask daemon writers not to fork or even double fork in their
>> processes, but run their event loop from the initial process systemd
>> starts for you. Also, don't call setsid().
>
> What happens if you do, I don't know.

Of course, when told not to do something like that, I have to try to see what it breaks :)

So, I finally got around to trying this (things have been insane at work for the last 1.5 months or so). Unfortunately, it seems easy to break out of one of the normal cgroup controllers (cpu, memory, etc), but there's some special controller for systemd, and every time I try to break out of it, I more or less get "controller does not exist". I don't know a lot about cgroups, but things I can do with the others don't work for the systemd one.

In addition, calling daemon(3) (double-fork, setsid(2), etc) may confuse systemd with respect to which process is the "main" one for a given unit when starting it up, but it does not seem to have any effect on moving the process out of systemd's control.
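
For anyone who wants to repeat the experiment, here is a small Python
sketch (Linux only, and only an illustration of the observation above,
not the exact test that was run). It forks a child, calls setsid() in
it, and compares the contents of /proc/self/cgroup before and after.
setsid() changes the session, not the cgroup membership, so the two
readings should match. The function names are made up for the example.

```python
import os


def parse_cgroup(text):
    """Map controller list -> cgroup path from /proc/<pid>/cgroup content."""
    membership = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        _hierarchy, controllers, path = line.split(":", 2)
        membership[controllers] = path
    return membership


def cgroup_before_and_after_setsid():
    """Return (parent_membership, child_membership_after_setsid)."""
    with open("/proc/self/cgroup") as f:
        before = f.read()
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: start a new session, then report its cgroup membership.
        try:
            os.close(read_fd)
            os.setsid()
            with open("/proc/self/cgroup") as f:
                os.write(write_fd, f.read().encode())
        finally:
            os._exit(0)
    # Parent: read everything the child wrote, then reap it.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return parse_cgroup(before), parse_cgroup(b"".join(chunks).decode())
```

On a cgroup-v1 system such as RHEL7, the entry to look at is the one
whose controller list is "name=systemd".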
 
>> There’s still an issue of how to eventually restart the service,

As best as I can tell, with the current implementation, restarting the service must happen out-of-band, and therefore so must updating the package. I can see 3 natural methods for doing this:

1. cron: Natural, but has the problem that if the update task is to only happen once, extra intervention is required, and I'm guessing most people would not want this; rather, they'd want cfengine to be able to trigger a one-time update.

2. at: This is by far the simplest to set up and satisfies all requirements (is de-coupled from cfengine, and can be triggered by cfengine). Setting a "yum --quiet -y update cfengine-community" task to execute at say "now + 5 minutes" (or something configurable by the policy-writer) is pretty easy.

3. systemd: Systemd supports "timer units" which are units that simply activate other units based on some timespec. This has potential for the most natural solution for the platform but is also the most complex to set up. I don't suspect many would take the time. Possibly a sketch could do this for us.
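
To give an idea of what option 3 involves, here is a hedged Python sketch
that generates a one-shot service unit plus a timer unit that fires once,
a fixed delay after the timer is started. OnActiveSec= and Type=oneshot
are standard systemd directives; the unit name, the helper function and
the command passed in are assumptions chosen for the example.

```python
def one_shot_timer_units(name, command, delay="5min"):
    """Return {filename: unit text} for a service and the timer that runs it once.

    A timer unit activates the service unit of the same name by default,
    so the two files only need to share `name`.
    """
    service = (
        "[Unit]\n"
        "Description=One-shot job {name}\n"
        "\n"
        "[Service]\n"
        "Type=oneshot\n"
        "ExecStart={command}\n"
    ).format(name=name, command=command)
    timer = (
        "[Unit]\n"
        "Description=Run {name}.service once, {delay} after activation\n"
        "\n"
        "[Timer]\n"
        "OnActiveSec={delay}\n"
    ).format(name=name, delay=delay)
    return {name + ".service": service, name + ".timer": timer}
```

The files would typically be written under /etc/systemd/system, followed
by "systemctl daemon-reload" and "systemctl start <name>.timer". The
command should be given with an absolute path, for instance the
hypothetical /usr/local/sbin/upgrade-cfengine script mentioned earlier in
the thread.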

I've thought more about the idea of removing the service restart from the cfengine update script in the RPM. Since cfengine would presumably still have to schedule a restart of itself, we'd still run into the need for an out-of-band restart. Functionality that might avoid this would be including support for a daemon-reload mechanism like OpenSSH's: Basically, when it receives SIGHUP, it calls execve(2) with its original arguments. Then the RPM restart could be changed to a reload and perhaps this whole thing could be avoided.

I understand that without a controlling process, this behavior might be difficult to implement, as cf-agent/execd doesn't necessarily have knowledge of the other cf-* daemons (if any). I'm curious what some of the developers think about the potential of this in the future?

-- 
Bryan

Marco Marongiu

Apr 6, 2015, 1:22:54 PM
to help-c...@googlegroups.com
Thanks Bryan for trying this and for sharing. As I said previously, I've
implemented something with cfengine and at, and I am looking for (or trying
to make) a clear slot in my schedule to post about it.

Ciao
-- bronto