makecatalogs - deletes catalogs if in no pkginfo file


Rob Middleton

May 18, 2010, 12:30:10 AM
to munk...@googlegroups.com
Hi folks,

Getting a bit of a problem when a catalog does not exist but is referenced in a manifest.

In my current case, no pkginfo currently references the catalog 'preprod', so it does not get built by makecatalogs (and the old copy gets deleted). When managedsoftwareupdate --checkonly runs, it cannot download that non-existent catalog and crashes.

I'm not sure whether it is best for managedsoftwareupdate to understand that a catalog may sometimes not exist, or whether makecatalogs should accept some arguments to define which catalogs should be created even if they would be empty.

Output of 'managedsoftwareupdate --checkonly' below.
(using version 0.5.1)

Cheers,
Rob Middleton.
Centenary Institute
Sydney, Australia.


$ sudo /usr/local/munki/managedsoftwareupdate --checkonly
Managed Software Update Tool
Copyright 2010 The Munki Project
http://code.google.com/p/munki

ERROR: Unexpected error in updatecheck:
ERROR: Traceback (most recent call last):
  File "/usr/local/munki/managedsoftwareupdate", line 416, in main
    updatecheckresult = updatecheck.check(id=options.id)
  File "/usr/local/munki/munkilib/updatecheck.py", line 2374, in check
    installinfo)
  File "/usr/local/munki/munkilib/updatecheck.py", line 1282, in processManifestForInstalls
    getCatalogs(cataloglist)
  File "/usr/local/munki/munkilib/updatecheck.py", line 1671, in getCatalogs
    message=message)
  File "/usr/local/munki/munkilib/updatecheck.py", line 2298, in getHTTPfileIfChangedAtomically
    message=message)
  File "/usr/local/munki/munkilib/updatecheck.py", line 2224, in curl
    downloadedsize = os.path.getsize(tempdownloadpath)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/genericpath.py", line 49, in getsize
    return os.stat(filename).st_size
OSError: [Errno 2] No such file or directory: '/Library/Managed Installs/catalogs/preprod.download'
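
The last frame shows os.path.getsize() being called on a '.download' temp file that was never created. A minimal sketch of one way to guard that step, so that a failed download can be reported instead of raising. `downloaded_size` is a hypothetical helper, not munki's actual code, and it is written for current Python rather than the 2.6 shown in the traceback:

```python
import os


def downloaded_size(tempdownloadpath):
    """Return the size of a finished download, or None if no file was written.

    Hypothetical helper: a failed transfer (for example a 404) may leave no
    temp file behind, so a missing file is treated as "nothing downloaded"
    rather than as a fatal error.
    """
    try:
        return os.path.getsize(tempdownloadpath)
    except OSError:
        # File is missing or unreadable; let the caller report the error.
        return None
```

A caller could then log that the catalog could not be retrieved and carry on with the remaining catalogs.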

--
You received this message because you are subscribed to the Google Groups "munki-dev" group.
To post to this group, send email to munk...@googlegroups.com.
To unsubscribe from this group, send email to munki-dev+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/munki-dev?hl=en.

Greg Neagle

May 18, 2010, 12:50:03 AM
to munk...@googlegroups.com, munk...@googlegroups.com
That's definitely a bug. It should just report an error retrieving the catalog and continue on. I'll look at it tomorrow AM. (I'm curious - does this error occur with 0.5.0?)

Sent from my iPad

Rob Middleton

May 18, 2010, 2:37:20 AM
to munk...@googlegroups.com
Unsure about 0.5.0. I only got munki set up properly for testing on Sunday. I needed the time & the curl munki - verifying server certs is important to me (thanks for that change).

My next steps with Munki are:
- a server-side dynamic manifest (probably tied to our database of machine MAC addresses).
- looking again at securing /Library/Preferences/ManagedInstalls.plist from privilege escalation (admin -> root). An acl of "everyone deny delete" should have done the trick, but doesn't because munki rewrites this file on every run using a python filesystem write command (wiping out the acl).
- wondering how I prevent automatic download of big updates when a client is off our main network, but allowing download of updates if users choose to 'update now'. (Must support users at home on maternity leave, must not use people's mobile 3G data allowances on updates.)
- speeding up initial unattended wait->install->reboot->wait->install->reboot cycle on deployment of a fresh computer.
- and ... more repackaging

Regards,
Rob.

Greg Neagle

May 18, 2010, 12:33:12 PM
to munk...@googlegroups.com
I wasn't able to reproduce your exact error, but did see a series of other errors caused by non-existent catalogs.

Replace your /usr/local/munki/munkilib/updatecheck.py with this one:

http://munki.googlecode.com/svn-history/r537/trunk/code/client/munkilib/updatecheck.py

and see if it addresses your issue as well.

-Greg

Greg Neagle

May 18, 2010, 1:08:35 PM
to munk...@googlegroups.com
On May 17, 2010, at 11:37 PM, Rob Middleton wrote:

> - looking again at securing /Library/Preferences/ManagedInstalls.plist from privilege escalation (admin -> root).

As per our previous discussion on this topic, I don't really understand how you protect _anything_ from an administrative user...

> - wondering how I prevent automatic download of big updates when a client is off our main network, but allowing download of updates if users choose to 'update now'. (Must support users at home on maternity leave, must not use people's mobile 3G data allowances on updates.)

You'd have to selectively enable/disable the launchd job specified by /Library/LaunchDaemons/com.googlecode.munki.managedsoftwareupdate-check.plist, which controls the automatic check, or rewrite/replace /usr/local/munki/updatecheckhelper to do the "right thing" when off your network.
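
As a rough sketch of the second option, a replacement helper could first test whether an internal-only host is reachable and skip the automatic check otherwise. Everything here is an assumption for illustration: the function name, the host and port you would pass in, and the idea that reaching that host means "on the main network".

```python
import socket


def on_main_network(host, port, timeout=3.0):
    """Return True if a TCP connection to an internal-only host succeeds.

    Assumption: `host` is only reachable from inside the organization's
    network, so a successful connection implies we are on it.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

A user-initiated "update now" would simply not consult this test, so updates stay available off-network on request.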

Our munki server is not available from outside our network; to get updates remotely, a user would have to connect via VPN. And we have virtually zero users using 3G data remotely...

> - speeding up initial unattended wait->install->reboot->wait->install->reboot cycle on deployment of a fresh computer.

Can you give more detail on this?

When I reimage a machine, I touch /Volumes/NewlyImagedVolume/Users/Shared/.com.googlecode.munki.checkandinstallatstartup.

On the first boot, munki then runs immediately when the loginwindow loads; it then downloads and installs everything in one fell swoop. I'm curious why you have multiple wait->install->reboot cycles.

-Greg

Rob Middleton

May 19, 2010, 5:30:38 AM
to munk...@googlegroups.com
1. That fixes the crash thanks.

And indeed - I can't see how the error I had could have been caused by this :-/. Your fix clears up the problem of a catalog appearing in the cataloglist of a manifest, but not having a valid catalog loaded into catalog[catalogname].


2. Further digging shows a related bug -- when there is a 404 error but a previously downloaded file exists, Munki will use the previous file so long as it parses as a valid plist.


def getHTTPfileIfChangedAtomically(url, destinationpath,
                                   message=None, resume=False):
    ...
    return destinationpath, err
# In this function, destinationpath is returned even if there was a failure. The only case where destinationpath = None is returned is when no previous file existed at that location.

That function is called by:

def getCatalogs(cataloglist):
    ...
    (newcatalog, err) = getHTTPfileIfChangedAtomically(catalogurl,
                                                       catalogpath,
                                                       message=message)
    if newcatalog:
        # This function assumes that if newcatalog != None, a valid download occurred. That is not true: it could mean a 404 error plus a previously downloaded file.

[outcome - if that catalog is now empty, Munki will use the last previously cached non-empty catalog]
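
One way to remove the ambiguity would be for the download function to report what happened rather than just a path. The sketch below is hypothetical: `fetch_if_changed`, the `fetch` callable and the status names are invented for illustration and are not munki's API.

```python
import os
from enum import Enum


class FetchStatus(Enum):
    DOWNLOADED = "downloaded"                    # new copy written to disk
    NOT_MODIFIED = "not modified"                # server confirmed cached copy
    FAILED_CACHED = "failed, cached copy kept"   # stale file still on disk
    FAILED_NO_FILE = "failed, nothing on disk"


def fetch_if_changed(fetch, url, destinationpath):
    """Return (status, err) so callers can tell a fresh file from a stale one.

    `fetch(url, destinationpath)` is a stand-in for the real transfer; it is
    assumed to return an HTTP status code and to write the file on 200.
    """
    try:
        code = fetch(url, destinationpath)
    except OSError as e:
        code, err = None, str(e)
    else:
        err = None if code in (200, 304) else "HTTP %s" % code

    if code == 200:
        return FetchStatus.DOWNLOADED, None
    if code == 304:
        return FetchStatus.NOT_MODIFIED, None
    # Any other outcome is a failure; report whether an old copy survives.
    if os.path.exists(destinationpath):
        return FetchStatus.FAILED_CACHED, err
    return FetchStatus.FAILED_NO_FILE, err
```

getCatalogs could then decide for itself whether FAILED_CACHED is acceptable for a catalog, instead of treating any non-None path as a valid download.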



3. An unimportant imperfection is that a catalog that failed to download will be retried for every nested manifest processed. Each manifest processed calls getCatalogs(cataloglist) for all relevant catalogs. If a valid catalog is already loaded (catalogname in catalog), it is not downloaded again; but if the catalog failed to download for the previous manifest, another download attempt is made.

def getCatalogs(cataloglist):
    ...
    for catalogname in cataloglist:
        if not catalogname in catalog:
            ....
            (newcatalog, err) = getHTTPfileIfChangedAtomically(catalogurl,
                                                               catalogpath,
                                                               message=message)


Slowly getting my head around the code. I still haven't got my head around the nice curl wrapper. I like the use of extended attributes to store the etag.

Regards,
Rob Middleton.

Greg Neagle

May 19, 2010, 12:53:09 PM
to munk...@googlegroups.com

On May 19, 2010, at 2:30 AM, Rob Middleton wrote:

> 1. That fixes the crash thanks.
>
> And indeed - I can't see how the error I had could have been caused by this :-/. Your fix clears up the problem of a catalog appearing in the cataloglist of a manifest, but not having a valid catalog loaded into catalog[catalogname].
>
>
> 2. Further digging shows a related bug -- in the case of a 404 error but there is a previously downloaded file, Munki will use the previous file so long as the previous file parses as a valid plist.
>
>
> def getHTTPfileIfChangedAtomically(url, destinationpath,
> message=None, resume=False):
> ...
> return destinationpath, err
> # in this function, destinationpath is returned even if there was a failure. The only case destinationpath = None is returned is where no previous file existed at that location.
>
> That function called by --
> def getCatalogs(cataloglist):
> ...
> (newcatalog, err) = getHTTPfileIfChangedAtomically(catalogurl,
> catalogpath,
> message=message)
> if newcatalog:
> # this function assumes that if newcatalog != None that a valid download occurred - this is not true, it could mean a 404 error but a previously downloaded file.
>
> [outcome - if that catalog is now empty, Munki will use the last previously cached non-empty catalog]

I need to think about this a bit; the original intention was to have managedsoftwareupdate _not_ fail when it couldn't download things from the munki server, so that it could continue on and possibly install things that had been previously downloaded. Here's the scenario:

1) managedsoftwareupdate runs while a laptop is on the organization's network and it downloads some items to be installed. A user is logged in however, and elects not to install at that time.

2) The user takes the laptop home. managedsoftwareupdate runs again, cannot contact the munki server, but continues on and sees pre-downloaded items ready to be installed. If a user is logged in, they might be notified at this time; otherwise we go ahead and install.

So I try to proceed as much as possible even if I cannot get things from the munki server, but there are probably logic errors due to this... This seems to be one.

>
> 3. An unimportant imperfection is that the failed-to-download catalog will be reattempted for download for every nested manifest processed. Each manifest processed will call getCatalogs(cataloglist) for all relevant catalogs. If a valid catalog is already loaded it will not download it again (catalogname in catalog), if the catalog failed to download for the last manifest a download attempt will again be made.
>
> def getCatalogs(cataloglist):
> ...
> for catalogname in cataloglist:
> if not catalogname in catalog:
> ....
> (newcatalog, err) = getHTTPfileIfChangedAtomically(catalogurl,
> catalogpath,
> message=message)

I consider the case of a non-existent catalog in a manifest a mistake on the part of the munki admin. We don't want managedsoftwareupdate to crash in this case, but we don't need to spend any effort optimizing for this case, either. Do you disagree? In this case, the fix is simple -- the admin should remove the reference to the non-existent catalogs from the manifests, or populate the non-existent catalog!

I suppose I could just build an empty catalog dictionary for non-existent catalogs, which would prevent the reattempted downloads...
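
Roughly, that idea could look like the following sketch. The `catalog` dictionary mirrors the name in the excerpts quoted above, but `get_catalogs`, the `load_catalog` callable and the overall shape are assumptions, not the real updatecheck.py.

```python
catalog = {}  # module-level cache: catalog name -> parsed catalog contents


def get_catalogs(cataloglist, load_catalog):
    """Load each catalog once per run, remembering failures as empty.

    `load_catalog(name)` is a hypothetical callable that returns the parsed
    catalog, or None when it could not be retrieved.
    """
    for catalogname in cataloglist:
        if catalogname in catalog:
            continue  # already loaded, or already known to be unavailable
        loaded = load_catalog(catalogname)
        # Store an empty placeholder on failure so later manifests
        # do not trigger another download attempt.
        catalog[catalogname] = loaded if loaded is not None else {}
```

The trade-off is that a transient failure early in a run is not retried until the next run.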

> Slowly getting my head around the code. I still haven't got my head around the nice curl wrapper. I like the use of extended attributes to store the etag.

That solution made me happy, since I didn't have to make some other structure to keep track of Etags.

-Greg

Rob Middleton

May 19, 2010, 6:58:25 PM
to munk...@googlegroups.com
On 20/05/2010 2:53 AM, Greg Neagle wrote:
> On May 19, 2010, at 2:30 AM, Rob Middleton wrote:
>
>
>> 2. Further digging shows a related bug -- in the case of a 404 error but there is a previously downloaded file, Munki will use the previous file so long as the previous file parses as a valid plist.
>>
>>
>> def getHTTPfileIfChangedAtomically(url, destinationpath,
>> message=None, resume=False):
>> ...
>> return destinationpath, err
>> # in this function, destinationpath is returned even if there was a failure. The only case destinationpath = None is returned is where no previous file existed at that location.
>>
>> That function called by --
>> def getCatalogs(cataloglist):
>> ...
>> (newcatalog, err) = getHTTPfileIfChangedAtomically(catalogurl,
>> catalogpath,
>> message=message)
>> if newcatalog:
>> # this function assumes that if newcatalog != None that a valid download occurred - this is not true, it could mean a 404 error but a previously downloaded file.
>>
>> [outcome - if that catalog is now empty, Munki will use the last previously cached non-empty catalog]
>>
> I need to think about this a bit; the original intention was to have managedsoftwareupdate _not_ fail when it couldn't download things from the munki server, so that it could continue on and possibly install things that had been previously downloaded. Here's the scenario:
>
> 1) managedsoftwareupdate runs while a laptop is on the organization's network and it downloads some items to be installed. A user is logged in however, and elects not to install at that time.
>
> 2) The user takes the laptop home. managedsoftwareupdate runs again, cannot contact the munki server, but continues on and sees pre-downloaded items ready to be installed. If a user is logged in, they might be notified at this time; otherwise we go ahead and install.
>
> So I try to proceed as much as possible even if I cannot get things from the munki server, but there are probably logic errors due to this... This seems to be one.
>
I agree, not straightforward.

A 404 error is almost a clear "no longer on the server / should no
longer be on the client" directive ... at least for HTTPS. There is
still a problem of a laptop connecting to a wireless network through a
captive portal (wireless login/payment page) - you may not be talking to
the real webserver, so deleting the file on a 404 that has been
generated by an intercepting web server is problematic. But perhaps that
is just an argument for only supporting https for manifest and catalog
files :-).
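
Put as code, the policy being floated here might look like this sketch. `should_discard_cached` is a made-up name, and treating only an HTTPS 404 as authoritative is the assumption under discussion, not existing munki behaviour. It also presumes the server certificate was actually verified.

```python
from urllib.parse import urlparse


def should_discard_cached(url, http_status):
    """Decide whether a failed fetch means the cached copy should go.

    Assumed policy: a 404 over HTTPS came from the real server and is
    authoritative; over plain HTTP it may have come from a captive portal,
    and any other failure (no network, 5xx) says nothing about the file.
    """
    if http_status != 404:
        return False
    return urlparse(url).scheme.lower() == "https"
```

With this rule, a laptop at home with no connection at all keeps its cached catalogs, while a catalog deleted on the server is dropped on the next successful HTTPS contact.
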
>> 3. An unimportant imperfection is that the failed-to-download catalog will be reattempted for download for every nested manifest processed. Each manifest processed will call getCatalogs(cataloglist) for all relevant catalogs. If a valid catalog is already loaded it will not download it again (catalogname in catalog), if the catalog failed to download for the last manifest a download attempt will again be made.
>>
>> def getCatalogs(cataloglist):
>> ...
>> for catalogname in cataloglist:
>> if not catalogname in catalog:
>> ....
>> (newcatalog, err) = getHTTPfileIfChangedAtomically(catalogurl,
>> catalogpath,
>> message=message)
>>
> I consider the case of a non-existent catalog in a manifest a mistake on the part of the munki admin. We don't want managedsoftwareupdate to crash in this case, but we don't need to spend any effort optimizing for this case, either. Do you disagree? In this case, the fix is simple -- the admin should remove the reference to the non-existent catalogs from the manifests, or populate the non-existent catalog!
>
> I suppose I could just build an empty catalog dictionary for non-existent catalogs, which would prevent the reattempted downloads...
>
I think in my first email I suggested this alternate (or additional)
solution of allowing makecatalogs to take a list of catalogs that need
to exist. The mistake isn't so much on the part of the munki admin when
we can't give the tools the right parameters yet :-), makecatalogs wipes
out my empty catalogs. The question then is whether makecatalogs just
takes the catalog list on the command line (my preference), or whether
it parses the manifest directory to find the list of catalogs that
should exist.

It should be fairly normal for the testing catalog to be empty. A
pkginfo is either pushed to production within 2 days or removed from
testing ... if no more packages are loaded for 2 weeks the testing
catalog might be empty (or missing) for 12 days. The manifests should
not need to be modified in this case, they are hand coded unlike the
catalogs.
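
A sketch of the first variant, to make the proposal concrete. `build_catalogs` and its `required` parameter are hypothetical; the real makecatalogs reads pkginfo files from the repo and writes plists, which is left out here.

```python
def build_catalogs(pkginfos, required=()):
    """Group pkginfo dicts by catalog name.

    Every name in `required` gets an entry even if no pkginfo references it,
    so an empty 'testing' catalog would still be written out.
    """
    catalogs = {name: [] for name in required}
    for pkginfo in pkginfos:
        # An item may list several catalogs and appears in each of them.
        for name in pkginfo.get("catalogs", []):
            catalogs.setdefault(name, []).append(pkginfo)
    return catalogs
```

How the required names reach the tool (command-line arguments or a scan of the manifests) is exactly the open question above; the grouping logic would be the same either way.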

Cheers,
Rob Middleton.


Greg Neagle

May 19, 2010, 7:20:54 PM
to munk...@googlegroups.com
On May 19, 2010, at 3:58 PM, Rob Middleton wrote:

> It should be fairly normal for the testing catalog to be empty. A pkginfo is either pushed to production within 2 days or removed from testing ... if no more packages are loaded for 2 weeks the testing catalog might be empty (or missing) for 12 days. The manifests should not need to be modified in this case, they are hand coded unlike the catalogs.

Fantastic point. I always have _something_ in testing, so that didn't occur to me, but is obvious now in retrospect. Until I address this, you should keep _something_ in testing, even if it's an unused package that's just a placeholder.

And remember that items can be in multiple catalogs, so you can have your current munkitools pkg (for example) in both testing and production catalogs simultaneously as another workaround for now.

-Greg