fqdn fact is set to "localhost" at boot time

David Alden

Mar 30, 2016, 5:48:47 PM
to puppet...@googlegroups.com
Hi,
   I'm running into a problem with puppet agent on my Macs.  I have ~200 Macs and when they reboot, a few of them (typically 5-10) come up with the "fqdn" fact set to "localhost" (as is the "hostname" fact).  This causes them to request a new cert using the certname "localhost", and therefore puppet (which is run by launchd) never works: it just sits there printing "Notice: Did not receive certificate" every 2 minutes.  I have a few "hacks" that I've thought of to fix this, but I'm not a fan of them:

1)  Write a script that runs every <n> minutes, checks whether the last few lines of the puppet.log file contain "Did not receive certificate", and if so unloads and reloads puppet.

2) Write a wrapper for puppet that makes sure the network is up (and the fqdn is returning a true fqdn) before it starts puppet.

3) Set the "certname" in puppet.conf -- I don't like this because we rename hosts fairly often and we want their certname to be their fqdn.  I don't want to add another step ("don't forget to log in and edit the puppet.conf file before you rename the computer").
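
For what it's worth, option 2 could look roughly like the sketch below. It is only an illustration under some assumptions: the helper names (looks_like_real_fqdn, wait_for_fqdn), the 300-second timeout and the 5-second poll interval are all made up for the example, and it asks Python's socket.getfqdn() for the name, which may not resolve exactly the way Facter's "fqdn" fact does (calling "facter fqdn" instead would be closer). It also assumes "puppet" is on the PATH that launchd gives the job.

```python
import os
import socket
import sys
import time


def looks_like_real_fqdn(name):
    """Return True if name has a domain part and is not a localhost placeholder."""
    name = name.strip().rstrip(".").lower()
    labels = name.split(".")
    return len(labels) >= 2 and all(labels) and labels[0] != "localhost"


def wait_for_fqdn(get_name=socket.getfqdn, timeout=300.0, interval=5.0,
                  sleep=time.sleep):
    """Poll get_name() until it returns a usable FQDN.

    Returns the name, or None if nothing usable shows up within `timeout`
    seconds. The timeout and interval defaults are arbitrary example values.
    """
    waited = 0.0
    while True:
        name = get_name()
        if looks_like_real_fqdn(name):
            return name
        if waited >= timeout:
            return None
        sleep(interval)
        waited += interval


if __name__ == "__main__":
    if wait_for_fqdn() is None:
        # Exit non-zero so the job shows up as failed instead of
        # requesting a cert for "localhost".
        sys.exit("hostname never resolved to a real FQDN; not starting puppet")
    # Replace this process with the agent so launchd supervises puppet itself.
    os.execvp("puppet", ["puppet", "agent", "--no-daemonize"])
```

The launchd job would then point at this script instead of at puppet directly. Because the name lookup is passed in as a parameter, the polling logic can be tried out with a fake lookup before it goes anywhere near a real machine.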

I'd much prefer to figure out how to properly fix this (as I would expect that no one really wants the fqdn fact to be "localhost" :) - any suggestions?

...dave

jcbollinger

Mar 31, 2016, 9:55:58 AM
to Puppet Users


If your machines rely on the network to determine their hostnames, and Puppet is configured to use the hostname as certname (as it is by default), then it follows that you cannot successfully perform Puppet catalog runs until the network is up and the machine has determined (or maybe just can determine) its own hostname.  I'm not well versed in OS X service management, which in any case I understand has changed somewhat over time, but surely it has a conventional idiom for expressing that sort of dependency.

Alternatively, and perhaps a bit quicker and dirtier, you could run the agent via an external scheduler instead of as a daemon in its own right (i.e. with --onetime --no-daemonize).  This has a lot of advantages, one of the lesser being that if the agent does issue a certificate request, it will do so only once per scheduled run (and fail the run if it does not receive the requested cert).  Supposing that you can rely on the network eventually coming up, and the machines learning their correct hostnames, the problem will go away (on a per-machine, per-boot basis) once that happens.  Overall, you will see far fewer cert-request errors logged.
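
As a minimal sketch of the one-shot approach, the scheduler could invoke something like the following. Only the two flags come from the paragraph above; the function name run_agent_once is hypothetical, and the script assumes "puppet" is on the scheduler's PATH.

```python
import subprocess

# The flags are the ones named above; everything else here is illustrative.
ONETIME_COMMAND = ["puppet", "agent", "--onetime", "--no-daemonize"]


def run_agent_once(runner=subprocess.call):
    """Run a single foreground agent pass and return its exit status."""
    return runner(ONETIME_COMMAND)


if __name__ == "__main__":
    # Pass the agent's exit status back to the scheduler.
    raise SystemExit(run_agent_once())
```

Each scheduled invocation makes one attempt and exits, so a machine that still thinks it is "localhost" fails that run and simply tries again at the next interval.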


John
