Puppet Agent starting too soon


Laverne Schrock

Mar 28, 2017, 6:02:10 AM
to Puppet Users
I have a box on which puppet-agent does not start correctly on reboot. Well, to be more precise, the puppet-agent starts, but never contacts the server.


$ sudo journalctl -b 0 -u puppet -f
-- Logs begin at Wed 2017-02-01 18:27:11 CST. --
Mar 06 12:42:03 localhost.localdomain systemd[1]: Started Puppet agent.
Mar 06 12:42:16 localhost.localdomain puppet-agent[927]: Could not request certificate: getaddrinfo: Temporary failure in name resolution
Mar 06 12:44:16 a.real.hostname.tld puppet-agent[927]: Could not request certificate: getaddrinfo: Temporary failure in name resolution
Mar 06 12:46:16 a.real.hostname.tld puppet-agent[927]: Could not request certificate: getaddrinfo: Temporary failure in name resolution
Mar 06 12:48:16 a.real.hostname.tld puppet-agent[927]: Could not request certificate: getaddrinfo: Temporary failure in name resolution

Note how when the puppet-agent starts, the box doesn't yet know its hostname because the network stack is (apparently) not fully up. Running `systemctl restart puppet` resolves the issue until the next reboot.

I was able to find a work-around. In the systemd unit file for puppet, I changed

After=basic.target network.target
to
After=basic.target network-online.target

See: https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
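A minimal sketch of the same change made as a systemd drop-in, so that the packaged unit file is left untouched and a package update does not revert it. The function name `write_dropin`, the optional root-directory argument, and the file name `10-network-online.conf` are illustrative assumptions, not something from the puppet-agent package. The `Wants=` line follows the freedesktop page linked above, which pairs `After=network-online.target` with `Wants=network-online.target`; ordering alone does not pull the target in.

```shell
#!/bin/sh
# Sketch only: write a drop-in that orders puppet.service after
# network-online.target. The optional first argument is a hypothetical
# alternate root directory, handy for a dry run; leave it empty to write
# under the real /etc (which needs root).
write_dropin() {
    root="${1:-}"
    dir="$root/etc/systemd/system/puppet.service.d"
    mkdir -p "$dir" || return 1
    # After= entries in a drop-in are added to the ones already in the
    # unit, so the existing ordering on basic.target is kept.
    cat > "$dir/10-network-online.conf" <<'EOF'
[Unit]
Wants=network-online.target
After=network-online.target
EOF
}
```

Run as root, `write_dropin` followed by `systemctl daemon-reload` should make the override take effect on the next boot, and `systemctl cat puppet.service` shows the merged result.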

The box is a fresh install of Fedora 25 and is using the following packages:
puppet-agent-1.9.3-1.fedoraf25.x86_64
puppetlabs-release-pc1-1.1.0-5.fedoraf25.noarch

I have another box with the same setup (but a little more RAM) and the issue does not occur there.

I have three thoughts on this.
1) This is a subtle timing issue, which is why I see it on one box but not the other.
2) puppet-agent is misbehaving and ought to properly detect when the networking stack comes up.
3) If I want to resolve this, I should just use my workaround.

Does #3 seem like the best plan? I'd appreciate any insight into why the issue is occurring.

Cheers,
-Laverne Schrock
 



Trevor Vaughan

Mar 28, 2017, 7:34:43 AM
to Puppet Users
This does seem to be the fix that is required both for Fedora and EL7.

Trevor

--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to puppet-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-users/36ca1e68-fa09-4654-ba62-9c13c2561c76%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Trevor Vaughan
Vice President, Onyx Point, Inc

-- This account not approved for unencrypted proprietary information --

Rob Nelson

Mar 28, 2017, 9:00:38 AM
to puppet...@googlegroups.com
That wiki page says that all you should need to do is have NetworkManager or systemd-networkd services enabled. Do you by any chance have them both disabled on the affected node? I know many of us don't like NetworkManager, but disabling it entirely can cause some problems. Just a guess, as I haven't seen this issue on any of my nodes, including the low-RAM ones.
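One way to check this on the affected node is sketched below. It assumes the stock unit names for the two network managers and their wait-online helpers, which are the services that normally hold back network-online.target until the network is configured. The function name `report_network_units` is made up for the example.

```shell
#!/bin/sh
# Sketch only: print the enablement state of the network management
# units and their wait-online helpers. Unit names are the usual stock
# names and are assumed to match this distribution.
NET_UNITS="NetworkManager.service systemd-networkd.service NetworkManager-wait-online.service systemd-networkd-wait-online.service"

report_network_units() {
    for unit in $NET_UNITS; do
        # is-enabled exits non-zero for disabled or missing units, so
        # ignore the exit status and keep whatever it printed.
        state=$(systemctl is-enabled "$unit" 2>/dev/null) || true
        printf '%-45s %s\n' "$unit" "${state:-unknown}"
    done
}
```

Calling `report_network_units` prints one line per unit. If neither wait-online unit shows as enabled, network-online.target probably has nothing to wait for, and ordering puppet after it would not delay the agent.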
On Mon, Mar 27, 2017 at 11:33 PM, Laverne Schrock <lverns...@gmail.com> wrote:

Trevor Vaughan

Mar 28, 2017, 9:08:22 AM
to Puppet Users
Disabling NetworkManager hasn't caused any issues for me so far. That said, EL7.3 might break that, so Fedora may already be broken.

Trevor



Rob Nelson

Mar 28, 2017, 9:23:59 AM
to puppet...@googlegroups.com
Did you replace it with systemd-networkd, or just ditch it? Regardless, I've updated us to EL7.3 (CentOS, though) and have not observed this issue, so I'm not sure it's that generic a problem; there must be *something* triggering it. Even a simple race condition should result in some successes if Laverne has rebooted it a few dozen times.

Trevor Vaughan

Mar 29, 2017, 9:24:56 AM
to Puppet Users

Rob Nelson

Mar 29, 2017, 6:00:43 PM
to puppet...@googlegroups.com
I'm guessing those race conditions come from not replacing it with something else. It seems those services are the "magic" components that remove the need to rely on network-online.target. Hooray :/

Игорь Тиунов

Mar 30, 2017, 2:51:22 AM
to Puppet Users
Hi, I have the same issue on my Red Hat 7 servers (immutable hypervisors). For some reason the puppet.service unit does not depend on the network. I don't know whether that is a bug or a feature. I overrode the dependency for the puppet unit so that it depends on network.service (I use the LSB network service).