Installation order -- change to random?

Justin McWilliams

Apr 19, 2012, 10:03:04 AM
to Greg Neagle, munk...@googlegroups.com
Greg and all,

I recently came across a couple of hosts that were hanging when
attempting to install a particular package. This particular package
was always the first in the list, so there were many other packages
that the client never even attempted to install. It turns out when we
fixed that single package, all others installed without a problem.
This got me thinking: a single bad package should not totally prevent
Munki from working for all packages.

Is there a reason we install packages in a particular order (the same
order) on every execution? I haven't dug too deep yet, but I think
it's installing items based on the order defined in the manifest. I
can't think of a reason why this would be necessary for the admin to
control, since they have "update_for" and "requires" pkginfo keys.
So ... would it be worthwhile to change Munki to install items in a
random order, so a single bad package doesn't completely kill a client
and other updates can continue to install successfully on subsequent
runs?

- Justin

Nate

Apr 19, 2012, 10:05:49 AM
to munk...@googlegroups.com, Greg Neagle
I think installing them in a random order would be kind of a hacky way around this problem.  It would be better if munki could skip the bad package and continue in the order as listed in my manifest.  I like that it installs in the order listed as it provides a deeper level of organization which can be helpful.

Nate

Justin McWilliams

Apr 19, 2012, 10:23:35 AM
to munk...@googlegroups.com, Greg Neagle
On Thu, Apr 19, 2012 at 10:05 AM, Nate <nate....@gmail.com> wrote:
> I think installing them in a random order would be kind of a hacky way
> around this problem.  It would be better if munki could skip the bad package
> and continue in the order as listed in my manifest.

I agree that it would be ideal to just skip pkgs that previously
failed, but that's a ton more work to implement, wouldn't be
foolproof, etc.

> I like that it installs
> in the order listed as it provides a deeper level of organization which can
> be helpful.

Can you elaborate on the advantages of maintaining control of the
order? I don't understand what you mean when mentioning
organization...

Raúl Cuza

Apr 19, 2012, 10:28:30 AM
to munk...@googlegroups.com, Greg Neagle, munk...@googlegroups.com
A random order would make it difficult to troubleshoot problems that are affected by installation order.

Sent from my eMate 300

Nate

Apr 19, 2012, 11:16:58 AM
to munk...@googlegroups.com
I guess I could still organize my manifests, but I dislike the idea of a random install order. It doesn't *really* fix the problem: you could randomly install the bad package first and run into a similar issue. Granted, after several runs you may get the majority of the packages installed, but I would rather be sure that they are installed.

Also, as you said, the real answer is to fix the broken package (which you did), so do we really need more munki functionality for this?  I've run into a similar issue in the past.  Even with copious amounts of testing, there are some edge cases that will be missed (been there done that).  Discovered it, fixed it and everything was happy.  

Nate

Greg Neagle

Apr 19, 2012, 11:29:59 AM
to munk...@googlegroups.com

On Apr 19, 2012, at 7:03 AM, Justin McWilliams wrote:

> Greg and all,
>
> I recently came across a couple of hosts that were hanging when
> attempting to install a particular package. This particular package
> was always the first in the list, so there were many other packages
> that the client never even attempted to install. It turns out when we
> fixed that single package, all others installed without a problem.
> This got me thinking: a single bad package should not totally prevent
> Munki from working for all packages.

It depends on how you define "bad package". If you mean one that causes /usr/sbin/installer to exit with an installation error, Munki continues on with the rest of the packages from there.

I could easily create a package with a preflight script that did `/sbin/shutdown -h now`. This would kill everything, and so Munki would attempt to install it over and over, always failing.

>
> Is there a reason we install packages in a particular order (the same
> order) on every execution? I haven't dug too deep yet, but I think
> it's installing items based on the order defined in the manifest.

Yes, sort of, with dependencies injected and included_manifests in there, too.

> I
> can't think of a reason why this would be necessary for the admin to
> control, since they have "update_for" and "requires" pkginfo keys.

"update_for" and "requires" information is not currently used at install time; updatecheck.py writes out items in intended install order; installer.py installs them in that order.
So adding some randomness to the install order without screwing up the correct install order for items in a dependency relationship would not be trivial.
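
To illustrate why this is not trivial, here is a minimal sketch of a randomized order that still respects dependencies. It is not Munki code: `randomized_install_order` and its `requires` mapping are hypothetical, and the mapping is only a simplified stand-in for the pkginfo "requires" key ("update_for" is not modeled).

```python
import random


def randomized_install_order(requires, rng=None):
    """Return item names in a random order that still places every
    prerequisite before the item that requires it.

    `requires` maps each item name to the list of item names it
    requires (a simplified, hypothetical stand-in for pkginfo data).
    """
    rng = rng or random.Random()
    remaining = {name: set(deps) for name, deps in requires.items()}
    order = []
    while remaining:
        # Items whose prerequisites have all been placed already.
        ready = [name for name, deps in remaining.items()
                 if deps.issubset(order)]
        if not ready:
            # A cycle, or a prerequisite that is not in the mapping.
            raise ValueError("circular or unsatisfiable requires")
        choice = rng.choice(sorted(ready))
        order.append(choice)
        del remaining[choice]
    return order
```

Given `{"iLife": [], "iLifeMCX": ["iLife"], "Firefox": []}`, every order this returns places iLife before iLifeMCX, while Firefox may land anywhere.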

> So ... would it be worthwhile to change Munki to install items in a
> random order, so a single bad package doesn't completely kill a client
> and other updates can continue to install successfully on subsequent
> runs?

Seems like an edge case better handled by fixing the bad package(s).

Not opposed to the idea (though there may be people who rely on the current behavior), but not sure it's worth the effort.

-Greg

>
> - Justin

Nick McSpadden

Apr 19, 2012, 11:30:50 AM
to munki-dev


On Apr 19, 7:23 am, Justin McWilliams <o...@google.com> wrote:
> Can you elaborate on the advantages of maintaining control of the
> order?  I don't understand what you mean when mentioning
> organization...

Look at my particular situation. I have a server on a private gigabit
ethernet switch that does not connect to the WAN that I use for
deploying to locally attached computers. It has a Munki server on it,
so that way these machines can load up all their software and updates
without affecting the network as a whole. The last thing that happens
in the Munki list is a change to the Munki config itself, which
switches it over to the Munki server sitting on the WAN, so that once
the machines leave my private deployment network and enter the "real
world," they'll still be able to receive updates from Munki.

If this happens in random order, my entire deployment strategy is
shot, because if the configuration for Munki access changes halfway
through, that means half of my private server's Munki updates won't go
in.

Whether this is an ideal setup or not is irrelevant to
the debate; the point is that randomizing the Munki installs from the
list means you actually have no idea what is going to install when. I
don't particularly understand how it's advantageous to not know how
your software deployment is actually going to go. Isn't the entire
point of something like Munki to make deployment predictable and
easy? Why would I want to play Russian roulette with updates every
time I bootstrap a new computer with Munki?

As Raul points out, that would make it excruciatingly difficult to
troubleshoot problems with installation order, especially if you have
lots of packages that are dependent on other packages being present
first. I don't want packages deploying MCX settings for iLife
preferences before iLife has been installed.

Justin McWilliams

Apr 19, 2012, 11:57:44 AM
to munk...@googlegroups.com
By "bad package" I mean one that either crashes Munki entirely or hangs
indefinitely. I've seen both. Ones that fail gracefully (installer
returns non-zero) are fine, as other updates will then be installed.

I agree that fixing the broken package is the best thing to do, but
discovering such a problem isn't immediate. In the cases I've found, a
package was working well on 99.x% of clients, so it took a while to
realize that there were a handful of clients that went *weeks* without
installing *any* updates. I'd prefer during those weeks *most* updates
install just fine, and the *one* update that is causing problems
doesn't hold everything else up, as then the machine would at least be
mostly updated while we're not yet aware of the bad package.

As is, installation order seems undocumented and questionable when
taking into consideration mixed managed_updates/managed_installs,
included_manifests, etc., so I just thought this wouldn't be an issue.
But it seems like enough people are (wrongfully?) relying on manifest
defined order too heavily to change this.

Justin McWilliams

Apr 19, 2012, 12:05:29 PM
to munk...@googlegroups.com
On Thu, Apr 19, 2012 at 11:30 AM, Nick McSpadden <nmcsp...@gmail.com> wrote:
>
>
> On Apr 19, 7:23 am, Justin McWilliams <o...@google.com> wrote:
>> Can you elaborate on the advantages of maintaining control of the
>> order?  I don't understand what you mean when mentioning
>> organization...
>
> Look at my particular situation.  I have a server on a private gigabit
> ethernet switch that does not connect to the WAN that I use for
> deploying to locally attached computers.  It has a Munki server on it,
> so that way these machines can load up all their software and updates
> without affecting the network as a whole.  The last thing that happens
> in the Munki list is a change to the Munki config itself, which
> switches it over to the Munki server sitting on the WAN, so that once
> the machines leave my private deployment network and enter the "real
> world," they'll still be able to receive updates from Munki.

That sounds like something you should handle in postflight, not a package.

> If this happens in random order, my entire deployment strategy is
> shot, because if the configuration for Munki access changes halfway
> through, that means half of my private server's Munki updates won't go
> in.
>
> Regardless of whether this is an ideal setup or not is irrelevant to
> the debate; the point is that randomizing the Munki installs from the
> list means you actually have no idea what is going to install when.  I
> don't particularly understand how it's advantageous to not know how
> your software deployment is actually going to go.  Isn't the entire
> point of something like Munki to make deployment predictable and
> easy?  Why would I want to play Russian roulette with updates every
> time I bootstrap a new computer with Munki?

If packages have dependencies, the pkginfo files should ensure those
dependencies are defined and met before installation takes place.
Order in manifest doesn't guarantee that previous installations were
successful and dependencies are all met. See my example below for
iLife.

> As Raul points out, that would make it excruciatingly difficult to
> troubleshoot problems with installation order, especially if you have
> lots of packages that are dependent on other packages being present
> first.  I don't want packages deploying MCX settings for iLife
> preferences before iLife has been installed.

For this, the MCX settings package should use an "installs" key in the
pkginfo, so it doesn't get installed until iLife exists. Simply
relying on manifest order is not sufficient. Think of the case where
iLife fails to install for whatever reason (installer returns
non-zero); if you're only relying on order in the manifest, then the
MCX package would still get installed so the same problem you've just
identified would still occur.

Greg Neagle

Apr 19, 2012, 12:19:39 PM
to munk...@googlegroups.com
On Apr 19, 2012, at 9:05 AM, Justin McWilliams wrote:

>> As Raul points out, that would make it excruciatingly difficult to
>> troubleshoot problems with installation order, especially if you have
>> lots of packages that are dependent on other packages being present
>> first.  I don't want packages deploying MCX settings for iLife
>> preferences before iLife has been installed.
>
> For this, the MCX settings package should use an "installs" key in the
> pkginfo, so it doesn't get installed until iLife exists.  Simply
> relying on manifest order is not sufficient.  Think of the case where
> iLife fails to install for whatever reason (installer returns
> non-zero); if you're only relying on order in the manifest, then the
> MCX package would still get installed so the same problem you've just
> identified would still occur.

Actually, with the current implementation, Munki _will_ attempt to install the MCX package even if the iLife install failed (MCX is a bad example; no harm if it is installed, and install won't fail if iLife isn't installed).

Right now, dependencies are checked by code in updatecheck.py, and items are written to InstallInfo.plist in their intended installation order. installer.py does not currently check the dependencies during a normal install run; in fact, until recently, the dependency info wasn't even available at this stage.

Last fall, I changed the handling of unattended_install items so that more of them could be installed in an unattended manner; prior to this change, the unattended_install skipped any items that were part of a dependency chain since it couldn't be sure all prerequisites were in place.  So now update_for/requires information is available at install time, and it would be possible to make installer.py use this information during the install phase (though still not trivial).
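
As a rough sketch of what using that information at install time could look like (this is not how installer.py works today; `run_installs`, the item dicts, and the `install_item` callback are all hypothetical):

```python
def run_installs(install_list, install_item):
    """Install items in the given order, skipping any item whose
    prerequisite failed or was skipped earlier in this run.

    install_list: list of dicts with a "name" and an optional
        "requires" list of item names (hypothetical structure).
    install_item: callable that takes an item dict and returns an
        installer-style exit status (0 means success).

    Assumption: a prerequisite that is not part of this run is
    treated as already installed.
    """
    not_installed = set()
    results = {}
    for item in install_list:
        name = item["name"]
        blocked = [req for req in item.get("requires", [])
                   if req in not_installed]
        if blocked:
            # A prerequisite did not go in, so do not attempt this item.
            not_installed.add(name)
            results[name] = "skipped"
            continue
        if install_item(item) == 0:
            results[name] = "installed"
        else:
            not_installed.add(name)
            results[name] = "failed"
    return results
```

With a list of iLife, an MCX item that requires iLife, and an unrelated item, a failed iLife install would mark the MCX item as skipped while the unrelated item still installs.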

-Greg

Rob Middleton

Apr 19, 2012, 6:31:30 PM
to munk...@googlegroups.com
Justin,

You can randomise the order of managed_installs items in Simian & that would give you much of the desired behaviour (particularly if you are not using included_manifests).

Rob.

Greg Neagle

Apr 19, 2012, 6:35:09 PM
to munk...@googlegroups.com
Oh, yeah -- good point.

If the order of the content of "managed_installs" is randomized by Simian, Justin can get what he wants without affecting those who rely on the current behavior and without affecting dependency ordering.

In other words, randomize the INPUT to updatecheck instead of the OUTPUT...
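
A minimal sketch of the idea, assuming the server can rewrite the manifest plist before handing it to the client. `shuffle_managed_installs` is a hypothetical helper, not Simian's actual code:

```python
import plistlib
import random


def shuffle_managed_installs(manifest_bytes, rng=None):
    """Return a copy of a manifest plist (as bytes) with its
    "managed_installs" list in random order.

    Only the input list is shuffled; every other manifest key is
    passed through untouched, and dependency ordering is still left
    to the client's update check.
    """
    rng = rng or random.Random()
    manifest = plistlib.loads(manifest_bytes)
    items = list(manifest.get("managed_installs", []))
    rng.shuffle(items)
    manifest["managed_installs"] = items
    return plistlib.dumps(manifest)
```

Because only the top-level list is shuffled, items pulled in through included_manifests would keep their existing order.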

-Greg

Justin McWilliams

Apr 19, 2012, 6:38:29 PM
to munk...@googlegroups.com
On Thu, Apr 19, 2012 at 6:31 PM, Rob Middleton <rrmid...@gmail.com> wrote:
> Justin,
>
> You can randomise the order of managed_installs items in Simian & that would give you much of the desired behaviour (particularly if you are not using included_manifests).

Yeah, I thought of that before mailing munki-dev, but thought perhaps
Munki would benefit from a similar change. I was obviously wrong ;)

Currently Simian sorts alphabetically so viewing the manifest in the
web interface is easier on the eyes, but we could just randomly sort
before sending off to the clients. Or we could automatically detect
repeated failures, omit those items from the manifest entirely, and
raise alerts to admins ... if only the day had more hours.
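
The second option could be sketched roughly as follows. The function name, the `failure_counts` mapping, and the threshold are all hypothetical; nothing here reflects an existing Simian feature:

```python
def filter_failing_items(managed_installs, failure_counts, max_failures=3):
    """Split manifest items into those to keep and those to omit.

    managed_installs: list of item names from the manifest.
    failure_counts: mapping of item name -> number of consecutive
        failed attempts reported by this client (hypothetical data).
    max_failures: assumed threshold at which an item is dropped.

    Returns (kept, omitted); the omitted list is what an admin alert
    would report.
    """
    kept, omitted = [], []
    for name in managed_installs:
        if failure_counts.get(name, 0) >= max_failures:
            omitted.append(name)
        else:
            kept.append(name)
    return kept, omitted
```

Note that a package that hangs or crashes the client may never report a failure at all, so the counts would have to come from something the server can observe on its own.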

Rob Middleton

Apr 19, 2012, 6:41:17 PM
to munk...@googlegroups.com
Yeah - I think that should largely meet both needs.

I don't rely on the order - but for me it helps in debugging to have the same behaviour on many machines.

I would prefer a deterministic outcome -- I prefer 100 computers failing in the same way rather than 100 computers failing in an additionally random way. When I'm busy I need a critical mass of failure before I take a look.

Rob.

Raúl Cuza

Apr 20, 2012, 8:08:08 AM
to munk...@googlegroups.com, munk...@googlegroups.com
On Apr 19, 2012, at 11:57, Justin McWilliams <og...@google.com> wrote:

> (snip snip)


>
> I agree that fixing the broken package is the best thing to do, but
> discovering such a problem isn't immediate. In the cases I've found, a
> package was working well on 99.x% of clients, so it took a while to
> realize that there were a handful of clients that went *weeks* without
> installing *any* updates. I'd prefer during those weeks *most* updates
> install just fine, and the *one* update that is causing problems
> doesn't hold everything else up, as then the machine would at least be
> mostly updated while we're not yet aware of the bad package.
>

When I read this I hear the need for an auditing system where I can ask the question "which systems are not fully patched?" and get an answer. For 'mission critical' packages, this same system could raise an alert.

The auditing system is ideally independent of the processes it is checking so it doesn't fail in the same way. But ideal, schmideal.
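
A minimal sketch of the kind of question such a system would answer, assuming each client's list of still-pending items is collected somewhere. The function and its input format are hypothetical:

```python
def not_fully_patched(pending_by_client, critical=()):
    """Report clients that still have pending items.

    pending_by_client: mapping of client name -> list of item names
        that client has not yet installed (hypothetical input).
    critical: item names considered mission critical.

    Returns a mapping of client name -> {"pending": [...],
    "alert": [...]}, where "alert" lists the pending critical items.
    Fully patched clients are left out.
    """
    critical = set(critical)
    report = {}
    for client, pending in pending_by_client.items():
        if pending:
            report[client] = {
                "pending": sorted(pending),
                "alert": sorted(critical & set(pending)),
            }
    return report
```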

And this comment is tangential to your desire that a package causing munki to fail in some way not interfere with other package installations.

Raúl
