cfengine on OS X: found issue with cf-execd/launchd


robu...@gmail.com

Feb 28, 2017, 7:00:28 AM
to help-cfengine
Hi all,
I've been running cfengine happily on OS X for a while now.
Lately I discovered an issue, I believe, with launchd and cf-execd.
If I keep the systems (various MacBooks) running for a couple of days,
only letting them hibernate (no full reboot), cf-execd stops executing cf-agent.
I first noticed this by comparing the timestamps from cf-key -s on the server with the timestamps I got from our munki environment (which we use to deploy software).
On the OS X systems the cf-execd, cf-serverd and cf-monitord processes are running (sane, no zombies ;-),
and 'launchctl list' shows that all three daemons are loaded,
but cf-agent is no longer executed at its regular interval (5 min here).

To fix this, one has to either reboot the machine or
unload and reload cf-execd through launchctl.

After that, cf-agent is executed regularly again.

Can anyone confirm this behaviour?
Does anyone have an idea what causes this hiccup? (Why does hibernation affect cf-execd but not other services?)
And: before I start hacking something together myself that reloads the service after each wake-up: does anyone maybe already have a fix for it?

TIA
and best regards

Stefan Skoglund (lokal användare)

Mar 1, 2017, 12:37:08 PM
to help-c...@googlegroups.com
I saw something like this but in this case it was caused by a mismatch
between time-zone of os (Linux) and hardware which caused a sudden
change in OS time.
That broke cf-hub's pull of info from that machine so the status for
that machine ended up as not ok.

robu...@gmail.com

Mar 7, 2017, 11:14:00 AM
to help-cfengine

> I saw something like this but in this case it was caused by a mismatch
> between time-zone of os (Linux) and hardware which caused a sudden
> change in OS time.
> That broke cf-hub's pull of info from that machine so the status for
> that machine ended up as not ok.


Thanks Stefan,
no, I don't think my issue has to do with time or time zones in this case.

(I forgot to report the versions:
"CFEngine Core 3.7.2" from cfengineers (thanks to you guys, by the way ;-)
on "darwin_x86_64_16_4_0")

I solved it for now by creating a script to reload cf-execd and a launchd plist to start that script.
Seems to be okay for now.

bash script:

/usr/local/sbin/restart_cf_execd.sh
-------------------------
#!/bin/bash
#
# cf-execd keeps dreaming after OS X wakes up from hibernation.
#
# Restart cf-execd if cf-key reports no outgoing connection with a
# timestamp in the current hour.
# To avoid restarting cf-execd too often, call this script through launchd
# at an interval longer than 5 min (more precisely, longer than the
# interval between cf-agent executions).
# That results in a maximum "offline" time for cfengine of:
#   cf-agent-interval + launchd-interval
# If you call this script every 10 min, cf-execd will be woken up at the
# latest 15 min after system wakeup.
#

PLIST=/Library/LaunchDaemons/com.cfengine.cf-execd.plist

# The " +" in the date format is deliberate: used as a regex it matches one
# or more spaces before the day of the month.
if ! cf-key -s | egrep "Outgoing" | egrep -q "$(date "+%a %b +%-d %H")"; then
    /bin/launchctl unload "$PLIST"
    sleep 1
    /bin/launchctl load "$PLIST"
fi

exit 0
--------------------------
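
To make the timestamp check in the script easier to follow, here is a minimal sketch of the same matching logic run against a fixed sample line. The helper name seen_this_hour and the sample line (address, host name, key) are made up for illustration; real cf-key -s output may be laid out differently.

```shell
# Hypothetical helper: succeeds if any "Outgoing" line on stdin matches the
# given hour pattern (an extended regex).
seen_this_hour() {
    grep -E "Outgoing" | grep -Eq "$1"
}

# Made-up sample line; the real cf-key -s layout may differ.
sample='Outgoing   192.0.2.10   policyhub   Tue Mar  7 11:05:12 2017   SHA=abc'

# Pattern as date "+%a %b +%-d %H" would print it at 11:xx on Tue Mar 7.
# The " +" matches the padding spaces in front of the single-digit day.
if printf '%s\n' "$sample" | seen_this_hour 'Tue Mar +7 11'; then
    echo "fresh"   # last outgoing connection was within this hour
else
    echo "stale"   # this is the case where the script reloads cf-execd
fi
```

With the sample line above this prints "fresh"; with a pattern for any other hour the helper fails and the reload branch would run.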


and the simple launchctl file:

/Library/LaunchAgents/com.cfengine.cf-execd-wake-up.plist
--------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.cfengine.cf-execd-wake-up</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/sbin/restart_cf_execd.sh</string>
    </array>
    <key>StartInterval</key>
    <integer>600</integer>
    <key>AbandonProcessGroup</key>
    <true/>
    <key>StandardErrorPath</key>
    <string>/dev/null</string>
    <key>StandardOutPath</key>
    <string>/dev/null</string>
</dict>
</plist>
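
As a rough sketch of how the StartInterval relates to the estimate in the script comments: the variable names below are made up for illustration, and the 5-minute agent interval is the one mentioned earlier in the thread. This only reproduces the estimate from the script comments; it is not a measured value.

```shell
# Assumed values: cf-agent interval from the thread, StartInterval from the plist.
agent_interval=300     # seconds between cf-agent runs (5 min)
launchd_interval=600   # StartInterval of the wake-up job (10 min)

# Estimated worst-case delay before cf-execd is reloaded after wakeup,
# following the estimate in the script comments.
worst_case=$(( agent_interval + launchd_interval ))
echo "worst case: $(( worst_case / 60 )) min"
```

This prints "worst case: 15 min". Choosing a shorter StartInterval reduces that delay, as long as it stays above the cf-agent interval.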

mike.w...@verticalsysadmin.com

Mar 11, 2017, 12:43:10 AM
to help-cfengine
Interesting!  May I recommend you open a bug report with CFEngine?  https://tracker.mender.io/secure/CreateIssue!default.jspa

I looked, and I couldn't find any existing bug reports matching this description.

Best,
--Mike Weilgart
Vertical Sysadmin, Inc.

robu...@gmail.com

Mar 21, 2017, 12:11:38 PM
to help-cfengine
Well, that was why I was looking for confirmation here.
I had the issue on a few MacBooks, but I'm now struggling to reproduce the behavior reliably.
I'll keep an eye on it and report back if and when I'm more confident.
Thanks