Weird Puppet Master issue


Peter Berghold

Mar 23, 2016, 10:43:44 AM
to puppet-users
Luckily this doesn't happen all the time, but I've seen it twice now in about a year's time on two different Puppet masters.  Here's some background.

I have a central "Grand Master" that serves only the "Remote Masters", each located in a different data center.  Each remote master holds a copy of the Puppet modules and site.pp as they exist on the Grand Master.  Each remote master also hosts an RPM repository containing RPMs developed in house as well as supporting RPMs for Puppet, since nothing in production can talk over the network to anything outside its data center, with the exception of the remote master.

These two sets of data get updated during the normal Puppet agent run on the remote master. Here is a sanitized version of the Puppet classes doing the work:

file { '/data/repos':
    source             => $repo_src,
    backup             => false,
    source_permissions => use,
    purge              => true,
    recurse            => true,
    ignore             => '*/repodata/*',
    force              => true,
}

file { '/data/puppet-modules':
    source             => $src_uri,
    backup             => false,
    purge              => true,
    source_permissions => use,
    recurse            => true,
    force              => true,
}

Here's where things get weird.  For the second time this has failed and the transfer between the grand master and the remote master has hung in mid session.  In this last case it wasn't until it started to process the RPM tree that it hung.  

Restarting the master process on the grand master did not help. However, when I stopped the master process and manually started it in debug mode the problem went away.  This just doesn't make sense to me.


Has anybody else observed this behavior and were you able to resolve it?

jcbollinger

Mar 24, 2016, 9:43:22 AM
to Puppet Users


On Wednesday, March 23, 2016 at 9:43:44 AM UTC-5, Salty Old Cowdawg wrote:

Here's where things get weird.  For the second time this has failed and the transfer between the grand master and the remote master has hung in mid session.  In this last case it wasn't until it started to process the RPM tree that it hung.  

Restarting the master process on the grand master did not help. However, when I stopped the master process and manually started it in debug mode the problem went away.  This just doesn't make sense to me.



Are you saying that after working as expected for many cycles on all the machines involved, one or more of the remote masters' own catalog runs against the central master started failing consistently? Because it's unclear to me why restarting a master process would be expected to rescue a stalled catalog run of one of the agents associated with it.
 


Has anybody else observed this behavior and were you able to resolve it?



I have not observed the behavior you describe, but the most prominent red flag (ok, maybe yellow flag) that I see in the resource declarations you presented is their reliance on recursive File resources.  These are a continual sore spot for Puppet, and they frequently cause problems when applied to directory trees that have either large numbers of files or a large volume of data, for values of "large" that are somewhat environment-dependent.

I cannot be confident that it will solve the problem you observed, but on general principles I would recommend that you choose a different mechanism for synching those files.  Git and rsync seem to be popular tools for that, and both can be driven by Puppet.
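
As a rough illustration of the rsync route, here is a minimal sketch of an Exec resource that could stand in for the recursive File resource on /data/repos. It is not a tested configuration: the host name grandmaster.example.com, the rsync module name "repos", and the 30-minute timeout are placeholders, and it assumes an rsync daemon is exporting the tree on the grand master.

```puppet
# Hypothetical sketch: mirror the RPM tree with rsync rather than a
# recursive File resource. The source URL below is a placeholder.
exec { 'sync-rpm-repos':
    # -a preserves permissions and timestamps; --delete plays the role of
    # purge => true; --exclude leaves local repodata directories untouched.
    command => 'rsync -a --delete --exclude=repodata/ rsync://grandmaster.example.com/repos/ /data/repos/',
    path    => ['/usr/bin', '/bin'],
    # Exec's default timeout is 300 seconds, which a large tree may exceed.
    timeout => 1800,
}
```

Declared this way, the command runs on every agent run and is reported as a change each time, even when rsync finds nothing to transfer. Adding an unless or onlyif guard, or moving the sync to a scheduled job, would be ways to avoid that.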


John
