Service entry for puppet agents not working


Bret Wortman

Aug 8, 2016, 8:40:24 AM
to Puppet Users
We've been using cron to manage our puppet agents for the past few years, but we've discovered that under cron the agent runs in a different environment and sometimes has trouble completing, while it works fine as a daemon or from the command line. So I'm preparing to switch over.

Unfortunately, the following doesn't work for my 3.8.6 agents on CentOS 6 systems, even though it works fine for 4.3 agents:

service { "puppet":
    ensure => running,
    enable => true,
    hasstatus => true,
    hasrestart => true,
}

What we see on some agents is that puppet will restart the service each and every time it runs, which gives us lots of false "changes".

# service puppet status
puppet dead but pid file exists
# ps aux | grep puppet | grep agent
root      9879  0.0  0.0 134404 43516 ?       Ss     12:22    0:00 /usr/bin/ruby /usr/bin/puppet agent

Has anyone else seen this or know of a workaround? I've tried various ways of providing a "status => " command but haven't found anything that works yet.
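For context, one shape a `status`-style workaround could take is to stop trusting the init script entirely and let Puppet match against the process table instead. This is only a sketch, not a confirmed fix; the `pattern` value is an assumption:

```puppet
# Sketch: bypass the broken SysV status check and match on ps output.
# The pattern string is an assumption -- adjust to your ps listing.
service { 'puppet':
  ensure     => running,
  enable     => true,
  hasstatus  => false,          # don't trust `service puppet status`
  pattern    => 'puppet agent', # matched against the process table
  hasrestart => true,
}
```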

Christopher Wood

Aug 8, 2016, 10:13:38 AM
to puppet...@googlegroups.com
On Mon, Aug 08, 2016 at 05:40:24AM -0700, Bret Wortman wrote:
> We've been using cron to manage our puppet agents for the past few years
> but have discovered some issues where it's running under a different
> environment and is having trouble completing when run in cron, but it
> works fine as a daemon or from the command line. So I'm preparing to
> switch over.

This sounds like an XY problem. Your underlying issue is that the agent sometimes runs under a different environment than desired, and you'd like that to stop.

Over here we've had agents (3 and 4) running from cron with the following:

usecacheonfailure = false
environment = (whatever that is)

I presume that if we used the cache on failure, then after an environment change in the ENC a failed catalog retrieval would cause the agent to run with a cached catalog from the old, undesired environment.

Do you set an environment in your External Node Classifier? If not, and you don't specify one in puppet.conf either, the agent starts in the 'production' environment, which may not be what you want.
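As a sketch, the puppet.conf fragment we run with looks roughly like this (the environment name is a placeholder):

```ini
# /etc/puppet/puppet.conf -- sketch; "myenv" is a placeholder
[agent]
environment       = myenv
usecacheonfailure = false
```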

Have you been able to narrow down and reproduce the conditions under which your agent runs happen in an undesired environment? You could have a different issue, though I never hit yours in 3.8.6 with multiple environments.

NB, xy problem: http://www.perlmonks.org/?node=XY+Problem

> Unfortunately, the following doesn't work for my 3.8.6 agents on Centos 6
> systems even though it works fine for 4.3 agents:
>
> service { "puppet":
>     ensure => running,
>     enable => true,
>     hasstatus => true,
>     hasrestart => true,
> }
>
> What we see on some agents is that puppet will restart the service each
> and every time it runs, which gives us lots of false "changes".

Offhand, this sounds like the service checker can't find the pid file. If it happens a measurable number of times per day in your place, I would crank up the debug logging and see what's going on.

On the other hand, if it works in 4.3, why not upgrade the remaining 3.x agents and call it a day? We've had fewer issues in 4 than we had in 3.

> # service puppet status
> puppet dead but pid file exists
> # ps aux | grep puppet | grep agent
> root      9879  0.0  0.0 134404 43516 ?       Ss     12:22    0:00
> /usr/bin/ruby /usr/bin/puppet agent
>
> Has anyone else seen this or know of a workaround? I've tried various ways
> of providing a "status => " command but haven't found anything that works
> yet.
>

Bret Wortman

Aug 8, 2016, 11:09:39 AM
to Puppet Users
Yep, it's not finding the pidfile: the init script is looking in /var/run/puppet/agent.pid, but the daemon is writing it to /var/lib/puppet/run/agent.pid. So for now I'm going to modify the init script on the hosts where we're having this problem.
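A demo of the one-line edit described above, run here against a throwaway copy so it's safe to try. On a real host you'd run the same `sed` against /etc/init.d/puppet instead (and keep the `.bak` backup):

```shell
# Demo: rewrite the init script's hard-coded pidfile path to the path
# the daemon actually uses (paths are the ones from this thread).
printf 'pidfile=/var/run/puppet/agent.pid\n' > /tmp/puppet-init-demo
sed -i.bak 's|/var/run/puppet/agent.pid|/var/lib/puppet/run/agent.pid|' /tmp/puppet-init-demo
cat /tmp/puppet-init-demo
```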

We're upgrading the agents as we move the systems to CentOS 7.

I don't know what it is about the environment that's different; that'll be an investigation for another day. The change to cron was basically to spread our agent check-ins throughout the hour (we only have them check in hourly -- for us, that's plenty). It then turned into a way to figure out whether something in the environments really is different. I think something is; now I just need to figure out what.
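For what it's worth, the hourly spread can be had without hand-editing crontabs. A sketch, using core Puppet's `fqdn_rand()` so each host gets a stable, pseudo-random minute:

```puppet
# Sketch: one run per hour, offset per host so check-ins spread
# across the hour. fqdn_rand(60) is deterministic per node.
cron { 'puppet-agent':
  command => '/usr/bin/puppet agent --onetime --no-daemonize',
  user    => 'root',
  minute  => fqdn_rand(60),
}
```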

Thanks!

Rob Nelson

Aug 8, 2016, 12:07:19 PM
to puppet...@googlegroups.com

On Mon, Aug 8, 2016 at 11:09 AM, Bret Wortman <br...@thewortmans.org> wrote:
Yep, it's not finding the pidfile because the init script is looking in /var/run/puppet/agent.pid and the daemon is putting it at /var/lib/puppet/run/agent.pid. So for now I'm going to modify the init script wherever we are having this problem.

We saw this issue after an upgrade, when puppet.conf's rundir differed from where the service file was looking for the .pid file. The recommendation was to remove the rundir setting from puppet.conf rather than hard-code the correct value: the default location matched what the service file expected, and a hard-coded path could break again if the default changed in the future.

Bret Wortman

Aug 8, 2016, 1:12:46 PM
to puppet...@googlegroups.com
The affected node (or, at least, the one I'm looking at) doesn't actually have rundir set. The hunt for a rational explanation and ideal solution goes on. :-)

Thanks, Rob!


Bret Wortman

Aug 9, 2016, 8:51:10 AM
to Puppet Users
In trying to work out the environmental differences, I _have_ noticed that the daemon runs as the puppet user, but when I run it interactively, it runs as root. It works when run as root and when run as puppet, but not from root's crontab.

What I failed to mention before is that the failure scenario is that I'll get a "Could not evaluate: Could not retrieve file metadata for puppet:///modules/ourlib/pip.conf: end of file reached". It's always the same file, but that file works just fine for most hosts and works as described above. It's just that root's crontab doesn't play nice on some systems. Scratching my head.



Rob Nelson

Aug 9, 2016, 9:11:38 AM
to puppet...@googlegroups.com

On Tue, Aug 9, 2016 at 8:51 AM, Bret Wortman <br...@thewortmans.org> wrote:
What I failed to mention before is that the failure scenario is that I'll get a "Could not evaluate: Could not retrieve file metadata for puppet:///modules/ourlib/pip.conf: end of file reached"

Maybe try adding something like "/usr/bin/ruby /usr/bin/puppet config print modulepath > /root/puppet.debug" to cron and compare the result with what you get when you run that interactively. It's possible there's some difference due to the cron environment, which I always find tricky to debug, but it's best to determine whether that's the issue at all before banging your head against that wall. Just grasping at straws here, admittedly.
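Concretely, a pair of one-shot crontab lines along these lines (file names are placeholders) would capture both the cron environment and Puppet's view of it for diffing against an interactive run:

```
# Sketch crontab entries; remove after one run and diff each file
# against the same command run from an interactive root shell.
05 * * * * env | sort > /root/cron.env
06 * * * * /usr/bin/ruby /usr/bin/puppet config print modulepath > /root/cron.modulepath
```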

Bret Wortman

Aug 9, 2016, 9:16:56 AM
to puppet...@googlegroups.com
They matched. Good thought, though.
