We're seeing this error on a small percentage of our clients. Most of
them have recently installed Munki, rebooted, and manually launched
the MSU GUI. Most of them don't see the error again, and everything
works fine, if they launch the MSU GUI a second or third time. Some
get it repeatedly; we obviously need to look closer at those
machines. Clients are a mix of 10.5.x and 10.6.x.
Does anyone else see this occasionally or for a period of time after
the client is installed and the machine is rebooted? I haven't looked
into the MSU code much yet - figured I would ask first.
- Justin
MSU.app attempts to start `managedsoftwareupdate --manualcheck` by touching /private/tmp/.com.googlecode.munki.updatecheck.launchd
A launchd job defined at /Library/LaunchDaemons/com.googlecode.munki.managedsoftwareupdate-manualcheck.plist is triggered when the file is created.
So a failure to start the `managedsoftwareupdate --manualcheck` process is most likely caused by the com.googlecode.munki.managedsoftwareupdate-manualcheck job not being active.
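For reference, such a trigger-file job is a launchd plist keyed on WatchPaths. A minimal sketch (the program path and exact keys here are illustrative; compare against the plist munki actually installs):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.googlecode.munki.managedsoftwareupdate-manualcheck</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/munki/managedsoftwareupdate</string>
        <string>--manualcheck</string>
    </array>
    <!-- launchd starts the job when this path is created or modified -->
    <key>WatchPaths</key>
    <array>
        <string>/private/tmp/.com.googlecode.munki.updatecheck.launchd</string>
    </array>
</dict>
</plist>
```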
I'm guessing one of two things:
1) Your users aren't giving you accurate information, and the machine has not been rebooted after the initial munki install, and therefore the launchd com.googlecode.munki.managedsoftwareupdate-manualcheck job is not active.
2) There's something wrong with your installation so that the com.googlecode.munki.managedsoftwareupdate-manualcheck job is not being created.
On the machines that are showing this issue, what is the output of:
% sudo launchctl list | grep munki
It should be similar to:
% sudo launchctl list | grep munki
- 0 com.googlecode.munki.managedsoftwareupdate-manualcheck
- 0 com.googlecode.munki.managedsoftwareupdate-install
20142 - com.googlecode.munki.managedsoftwareupdate-check
Next time I hear of this problem I'll have the user run `sudo launchctl
list` before attempting to run MSU a second time.
Of course, none of this would happen if we simply didn't email users
pointing them to docs that explain they can manually run MSU if
they're eager ;)
# run the preflight script if it exists
preflightscript = os.path.join(scriptdir, 'preflight')
try:
    result, output = utils.runExternalScript(preflightscript, runtype)
    munkicommon.display_info(output)
except utils.ScriptNotFoundError:
    result = 0
    pass  # script is not required, so pass
except utils.RunExternalScriptError, e:
    result = 0
    munkicommon.display_warning(str(e))
if result:
    # non-zero return code means don't run
    munkicommon.display_info('managedsoftwareupdate run aborted by'
                             ' preflight script: %s' % result)
    # record the check result for use by Managed Software Update.app
    # right now, we'll return the same code as if the munki server
    # was unavailable. We need to revisit this and define additional
    # update check results.
    recordUpdateCheckResult(-2)
    if options.munkistatusoutput:
        # connect to socket and quit
        munkistatus.activate()
        munkistatus.quit()
    exit(-1)
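As the snippet shows, the preflight's exit status is the contract: zero lets the run proceed, non-zero aborts it, and the run type is passed to the script as an argument (per the runExternalScript call above). A tiny hypothetical illustration of that contract (the function name and the policy are made up):

```python
def preflight_exit_code(runtype):
    # A preflight receives the run type ('auto', 'manual', ...) as its
    # argument and signals via exit status: 0 = proceed, non-zero = abort.
    # Hypothetical policy: allow manual runs, skip background 'auto' runs.
    return 1 if runtype == 'auto' else 0
```

A real preflight script would end with something like `sys.exit(preflight_exit_code(sys.argv[1]))`.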
I'd be curious to see if there are any entries in /Library/Managed Installs/Logs/ManagedSoftwareUpdate.log (or wherever you've relocated this log) around the time of the manual invocation of Managed Software Update.app.
-Greg
Will this behaviour occur if the preflight script takes longer than 30
seconds to complete?
I can't see any command before the preflight in managedsoftwareupdate
that would trigger munkistatus (and thus connection to the munkistatus
socket).
Thus, MSU.app drops the trigger file to cause a root
'managedsoftwareupdate --manualcheck' run. MSU.app then immediately creates
the socket:
"/tmp/com.googlecode.munki.munkistatus.%s" % os.getpid()
and waits for up to 30 seconds for managedsoftwareupdate to connect to
this socket.
managedsoftwareupdate does a few things before connecting to
that socket, one of them being the preflight script, and the time that
takes to execute is not under munki's control.
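The timing window described above can be sketched as follows. This is not munki's actual code, just a minimal stand-in for the listen-and-wait side, with the 30-second window shrunk to 0.2 seconds so the sketch finishes quickly; since nothing ever connects, the accept times out, which is exactly the failure mode a slow preflight produces:

```python
import os
import socket
import tempfile

# Stand-in for the socket MSU.app creates, e.g.
# "/tmp/com.googlecode.munki.munkistatus.%s" % os.getpid()
sock_path = os.path.join(tempfile.mkdtemp(), 'munkistatus.%d' % os.getpid())

server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(sock_path)
server.listen(1)
server.settimeout(0.2)  # MSU.app's real window is 30 seconds

try:
    # Blocks until managedsoftwareupdate connects, or the window expires.
    conn, _ = server.accept()
    connected = True
    conn.close()
except socket.timeout:
    # A preflight that runs longer than the window lands here.
    connected = False
finally:
    server.close()
    os.unlink(sock_path)
```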
Cheers,
Rob.
> Greg,
>
> Will this behaviour occur if the preflight script takes longer than 30 seconds to complete?
I believe so, yes. Could this be your issue, Justin? Maybe for some reason sometimes your preflight takes longer than 30 seconds to execute, and sometimes less?
If your preflight is written in Python, a workaround that would not require modifying munki would be to add:

    from munkilib import munkicommon
    munkicommon.display_status('Running preflight script...')
This could absolutely be it. We're doing quite a bit of work in our
preflight (running a slow "facter -p", 3-4 HTTP roundtrips, etc).
We've certainly seen our preflight take longer in some cases.
> If your preflight is written in Python, a workaround that would not require modifying munki would be to add
Yes. I'll try that. Thanks for the find, Rob, and the suggested fix, Greg!
I added a preflight script that simply did a `sleep 31` before exiting and it triggered the error message we've been discussing.
Looking at the best fix for this...
-Greg
Even calling munkicommon.display_status within managedsoftwareupdate
didn't do the trick.
As a quick fix I think we're going to deploy an MSU.app with a 60-second
timeout, to avoid people hitting and reporting this error, but that's
obviously not the best solution.
Confirmed. This fixes the problem for us too. Thanks!