Official puppetlabs position on cron vs puppet as a service?


Brian Gupta

Sep 23, 2011, 6:42:45 PM
to Puppet Users
Over the years, many shops have started running puppet via cron to work around memory leaks in earlier versions of Ruby, but the official position was that puppet was meant to run as a continuously running service.

I am wondering if the official position has changed. On one hand, many if not all of the early Ruby issues have been fixed. On the other, the addition of mcollective into the mix as a lightweight agent for triggering ad hoc puppet runs and other tasks somewhat lowers the need for puppet to run as a service (or out of cron, for that matter).

I understand that in cases where old Ruby versions are, for whatever reason, mandated, the answer may be different.

Thanks,
Brian

--


Len Rugen

Sep 23, 2011, 10:22:01 PM
to puppet...@googlegroups.com
We're switching back to running as a service, but we observed that the puppet runs tended to cluster together. (We are also using foreman.) We used "splay" at boot to randomize the puppet runs, in case we rebooted a lot of systems at the same time, but over time the puppet runs would cluster together, causing performance issues on the puppet master.

So, we now restart the service a few times a week via cron to re-splay the runs.

--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet...@googlegroups.com.
To unsubscribe from this group, send email to puppet-users...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.

treydock

Sep 24, 2011, 10:22:18 AM
to Puppet Users

Could those memory leak problems cause the Puppet daemon to crash with
no logs indicating why? I have about 20 systems all running CentOS 5
and 6, with Puppet 2.6.9, and I now have to have Zabbix run
"/etc/init.d/puppet start" every time the daemon crashes, which is almost
a daily occurrence for every client. I would be interested to know of a
known fix, or whether the only "fix" is the workaround of using cron.

Thanks
- Trey

Aaron Grewell

Sep 24, 2011, 10:42:38 PM
to puppet...@googlegroups.com

We had frequent inexplicable daemon crashes on Solaris, but not on RHEL5 (at least not yet). Given known issues with memory leakage in older Ruby releases, cron seemed more likely to be reliable. We stuck a random wait in the cron job to spread load on the master, and so far it works well.

treydock

Sep 25, 2011, 3:33:45 PM
to Puppet Users


On Sep 24, 9:42 pm, Aaron Grewell <aaron.grew...@gmail.com> wrote:
> We had frequent inexplicable daemon crashes on Solaris, but not on RHEL5 (at
> least not yet) .   Given known issues with memory leakage in older Ruby
> releases Cron seemed more likely to be reliable.   We stuck a random wait in
> the Cron job to spread load on the master and so far it works well.
Could you share how you did the random wait? I may have to switch to
a cron job with how often my daemons are crashing and having to be
restarted by Zabbix.

Thanks
- Trey

Ohad Levy

Sep 25, 2011, 4:03:09 PM
to puppet...@googlegroups.com

I used the ip_to_cron function from
http://projects.puppetlabs.com/projects/1/wiki/Cron_Patterns

Afterwards, I just do a sleep random 59, so it's also random within the minute.

Ohad
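
A minimal sketch of the "random within the minute" idea as a cron wrapper. The function names and the use of /dev/urandom are assumptions for illustration, not something posted in this thread; the agent command line mirrors the cron example given later in the thread.

```shell
#!/bin/sh
# Hypothetical helper: print a random delay in seconds within [0, max].
# Reads two bytes from /dev/urandom so hosts started in the same second
# still get different delays.
random_delay() {
    max="${1:-59}"
    n=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
    echo $(( n % (max + 1) ))
}

# Hypothetical wrapper: wait a random part of the minute, then do one run.
run_puppet_once() {
    sleep "$(random_delay 59)"
    /usr/bin/puppet agent --onetime --logdest syslog > /dev/null 2>&1
}
```

Calling run_puppet_once from the cron entry (instead of calling puppet directly) would spread agents that share the same cron minute across that minute.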

Scott Smith

Sep 25, 2011, 5:29:45 PM
to puppet...@googlegroups.com

Ohad, was fqdn_rand not sufficient for you?

Ohad Levy

Sep 26, 2011, 4:10:57 AM
to puppet...@googlegroups.com
On Mon, Sep 26, 2011 at 12:29 AM, Scott Smith <sc...@ohlol.net> wrote:
> Ohad, was fqdn_rand not sufficient for you?

Well, I did it a long time ago, so I'm not 100% sure, but I think the
main reason was to be able to manage cron entries over an interval, e.g.
3 times an hour, or 7 times a day, in a random fashion.

Ohad

Brian Gallew

Sep 26, 2011, 10:51:48 AM
to puppet...@googlegroups.com
I ended up writing a custom function based heavily on the standard fqdn_rand. In my environment, we have a lot of related systems (e.g. webs001, webs002, webs003), many of which have significant startup times. I changed the function to split an incoming hostname into a name plus a numeric suffix (or zero if there is none). Then it uses the standard rand algorithm on the name part, multiplies the suffix by 5, and adds that in with an appropriate modulus. This means that for all the "webs" hosts there is a standard base time, after which they are staggered at 5-minute intervals (overlapping every 6th, of course). The end result is that no matter what happens with the rand() function, an entire group of servers is never restarted at the same time.
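
The staggering scheme described above can be sketched in shell roughly as follows. This is only an illustration under stated assumptions: the function name stagger_minute is hypothetical, cksum stands in for the fqdn_rand hashing (it is not the same algorithm), and the default 30-minute period is inferred from the "overlapping every 6th" remark.

```shell
#!/bin/sh
# Hypothetical sketch: derive a stagger offset in minutes from a hostname.
#   stagger_minute HOSTNAME [PERIOD]
stagger_minute() {
    host="$1"
    period="${2:-30}"   # assumed period in minutes
    step=5              # minutes between consecutive numbered hosts

    # Split "webs003" into the name part "webs" and the suffix "003".
    base=$(printf '%s\n' "$host" | sed 's/[0-9]*$//')
    suffix="${host#"$base"}"

    # Drop leading zeros so the shell does not read "008" as octal;
    # a missing suffix counts as 0.
    suffix=$(printf '%s\n' "$suffix" | sed 's/^0*//')
    suffix="${suffix:-0}"

    # Stand-in hash of the name part: every "webs" host gets the same base.
    hash=$(printf '%s' "$base" | cksum | cut -d ' ' -f 1)

    echo $(( (hash % period + suffix * step) % period ))
}
```

With these assumptions, stagger_minute webs001 and stagger_minute webs002 differ by 5 minutes (modulo the period), and webs001 and webs007 land on the same minute.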

Aaron Grewell

Sep 26, 2011, 12:32:15 PM
to puppet...@googlegroups.com
I picked this up online somewhere and modified to suit:

    # Puppet will run twice an hour on a random schedule to spread load.
    # fqdn_rand is seeded from the node's FQDN, so both calls return the
    # same value and the two runs land exactly 30 minutes apart.
    $first_run  = fqdn_rand(30)
    $second_run = fqdn_rand(30) + 30

    cron { 'puppet_cron':
        command => '/usr/bin/puppet agent --onetime --logdest syslog > /dev/null 2>&1',
        user    => 'root',
        minute  => ["$first_run","$second_run"],
        require => File['puppet_conf'],
    }

Joshua Anderson

Oct 1, 2011, 2:58:00 AM
to puppet...@googlegroups.com
Are you using custom facts?

If so, you should check to see if any of them are unintentionally doing Bad Things, e.g., modifying global state like environment variables.

-Josh

Larry Ludwig

Oct 7, 2011, 9:27:54 PM
to puppet...@googlegroups.com
We mostly still run it from cron, though in some instances we run it as a daemon.

Matthew Nicholson

Oct 8, 2011, 2:32:35 PM
to puppet...@googlegroups.com
We combine these. We run as a service, but have a daily cron job, with its run time spread randomly among our hosts, that stops and starts the service and cleans up stale .pid files. This is mostly a holdover from our early days, but it works, doesn't cause issues, and keeps the runs spread out.






--
Matthew Nicholson

Chris Phillips

Oct 8, 2011, 4:22:09 PM
to puppet...@googlegroups.com
My take on it is to run it from our Nagios server. What better way to monitor the puppet runs than by executing that run as part of the check? Retry intervals also help push changes out much quicker if they need multiple runs to apply.

We also run a single daily cron job.

Chris
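
A rough sketch of what a check that doubles as a puppet run might look like. Nothing here is taken from Chris's actual setup: the function names are hypothetical, and the mapping relies on the documented `puppet agent --detailed-exitcodes` codes (0 no changes, 1 run failed, 2 changes applied, 4 failures, 6 changes and failures) and the usual Nagios plugin codes (0 OK, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Hypothetical helper: translate a puppet agent exit status (as produced
# with --detailed-exitcodes) into a Nagios status line and return code.
puppet_exit_to_nagios() {
    case "$1" in
        0) echo "OK - run succeeded, no changes";             return 0 ;;
        2) echo "OK - run succeeded, changes applied";        return 0 ;;
        1) echo "CRITICAL - run failed or was not attempted"; return 2 ;;
        4) echo "CRITICAL - some resources failed";           return 2 ;;
        6) echo "CRITICAL - changes applied, some resources failed"; return 2 ;;
        *) echo "UNKNOWN - puppet exited with status $1";     return 3 ;;
    esac
}

# Hypothetical check body: do one foreground run and report on it.
check_puppet_run() {
    rc=0
    /usr/bin/puppet agent --onetime --no-daemonize --detailed-exitcodes \
        > /dev/null 2>&1 || rc=$?
    puppet_exit_to_nagios "$rc"
}
```

A plugin script built on this would end with check_puppet_run followed by exit $?, and the plugin timeout would need to be longer than the slowest expected run.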

Jonathan Gazeley

Oct 10, 2011, 8:05:57 AM
to puppet...@googlegroups.com
On 08/10/11 21:22, Chris Phillips wrote:
> What better way to monitor the puppet runs than by executing that run as
> part of the check?

I assume your Nagios plugin execution timeout must be insanely long? :)

In the past I have considered using Nagios for things other than
monitoring, and likewise using Puppet for things other than
configuration. On both counts I decided it was probably best to set a
boundary and not wilfully abuse these tools, since it's likely to go
wrong sooner or later! In my organisation we use Nagios only to monitor,
and Puppet only to configure.

Have fun!

Jonathan

Ohad Levy

Oct 10, 2011, 9:32:35 AM
to puppet...@googlegroups.com

If you are using foreman, it's very easy to query the last puppet
report state, e.g.

curl -k -u $user:$pass "https://foreman/hosts/`hostname -f`/reports/last?format=json" | prettify_json.rb
{
  "report": {
    "reported_at": "2011-10-10T13:03:02Z",
    "metrics": {
      "time": {
        "group": 0.001799,
        "class": 0.002389,
        "config_retrieval": 2.4686119556427,
        "cron": 0.00056,
        "schedule": 0.002556,
        "service": 0.702501,
        "yumrepo": 0.081921,
        "total": 4.6954209556427,
        "mailalias": 0.000351,
        "package": 0.012924,
        "exec": 0.336481,
        "file": 1.079741,
        "filebucket": 0.000226,
        "user": 0.00536
      },
      "events": {
        "total": 0
      },
      "resources": {
        "total": 212
      },
      "changes": {
        "total": 0
      }
    },
    "id": 269755,
    "summary": "Success",
    "host": "super.tlv.redhat.com",
    "logs": [],
    "status": {
      "failed": 0,
      "restarted": 0,
      "applied": 0,
      "skipped": 0,
      "failed_restarts": 0
    }
  }
}


Ohad
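
For monitoring, the failed count from such a report could be pulled out with something like the sketch below. The function name is hypothetical and the sed-based extraction is a deliberately crude assumption; it only relies on the "failed" key shown in the output above, and a real JSON parser would be more robust.

```shell
#!/bin/sh
# Hypothetical helper: read a Foreman report (JSON) on stdin and print the
# value of the "failed" key from its status section. The closing quote in
# the pattern keeps "failed_restarts" from matching.
report_failed_count() {
    sed -n 's/.*"failed"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}
```

It would be used by piping the curl output from the message above into report_failed_count and alerting when the result is greater than zero.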

Chris Phillips

Oct 10, 2011, 10:00:24 AM
to puppet...@googlegroups.com
Runs are always done within 30 seconds, and even if an occasional rollout took longer, it wouldn't impact puppet at all, temporarily messy as the monitoring results might be.

Fundamentally though, with cron or puppetd being trivially simple, I'm more than happy to be doing it this way.


Chris

Brian Gallew

Oct 10, 2011, 11:01:00 AM
to puppet...@googlegroups.com
Most of my puppet runs take ~15 seconds or so; however, the runs on my Nagios servers take up to 4 minutes to complete.


Craig White

Oct 10, 2011, 11:16:09 AM
to puppet...@googlegroups.com
that always seems to redirect me to 'login' (even though I am passing the -u username:password)

Craig

--
Craig White ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ craig...@ttiltd.com
1.800.869.6908 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ www.ttiassessments.com


Ohad Levy

Oct 10, 2011, 2:13:06 PM
to puppet...@googlegroups.com
On Mon, Oct 10, 2011 at 5:16 PM, Craig White <craig...@ttiltd.com> wrote:
> that always seems to redirect me to 'login' (even though I am passing the -u username:password)
>
I'm guessing you have ssl redirection turned on and you are using http
instead of https?

Ohad

Craig White

Oct 10, 2011, 2:33:41 PM
to puppet...@googlegroups.com

On Oct 10, 2011, at 11:13 AM, Ohad Levy wrote:

> On Mon, Oct 10, 2011 at 5:16 PM, Craig White <craig...@ttiltd.com> wrote:
>> that always seems to redirect me to 'login' (even though I am passing the -u username:password)
>>
> I'm guessing you have ssl redirection turned on and you are using http
> instead of https?

----
strange... just tried again and it worked

and an fyi for anyone trying to use nginx/foreman, this seems to work fairly well..

passenger_pre_start https://$SERVER:8142/;
server {
    server_name $SERVER;
    listen 8142;
    root /var/www/foreman/public;
    passenger_enabled on;
    passenger_min_instances 1;
    rails_env production;
    rails_spawn_method smart;
    passenger_user puppet;
    passenger_use_global_queue off;

    error_log logs/foreman_error.log error;
    access_log logs/foreman_access.log combined;

    ssl on;
    ssl_certificate /etc/puppet/ssl/certs/$SERVER.pem;
    ssl_certificate_key /etc/puppet/ssl/private_keys/$SERVER.pem;
    ssl_crl /etc/puppet/ssl/ca/ca_crl.pem;
    ssl_session_timeout 5m;
    ssl_protocols SSLv3 TLSv1;
    ssl_ciphers ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:!kEDH:+EXP:-SSLv2;
    ssl_prefer_server_ciphers on;
    ssl_verify_client off;
    ssl_verify_depth 1;
    ssl_session_cache builtin:1000 shared:SSL:10m;
}

Craig

Scott Smith

Oct 10, 2011, 11:02:32 PM
to puppet...@googlegroups.com

Most things are ok if you only have 10 servers
