Configuring snmp v3 (or stopping/starting a service in the same cf-agent process)


Beto

Dec 14, 2017, 2:47:42 PM
to help-cfengine
Configuring SNMP v3 with net-snmp requires that the so-called "persistent" snmpd.conf, /var/lib/net-snmp/snmpd.conf, only be copied into place while snmpd is stopped.

At times, snmpd may stop responding and the only way I have found to fix the issue is to:

1. manually delete /var/lib/net-snmp/snmpd.conf
2. cf-agent stops snmpd
3. cf-agent copies /var/lib/net-snmp/snmpd.conf back into place.
4. cf-agent starts snmpd

The problem with this approach is that, evidently due to function caching, the snmpd service status (active/inactive) is only determined once per cf-agent process.  So step 4 fails to start snmpd because the cached status shows it already active.

Currently I work around this issue by delaying step 4 until the next execution of cf-agent.  But that means a delay of up to 10 minutes in getting snmpd running.

I suppose I could replace the service promise in step 4 with a command promise, but I'm wondering if there's a better solution, like somehow temporarily disabling function caching.

Your thoughts?

Here's the current policy:

#########################################################
#
# snmpd.cf - configure snmpd
#
#########################################################
bundle agent snmpd
{

  meta:
      "description"  string     => "Configure SNMP";
      "tags"         slist      => { "autorun" };

  files:
    linux::
      "/etc/snmp/snmpd.conf"
        edit_template   => "$(glb.templates)/snmpd.conf.mustache",
        edit_defaults   => empty,
        template_method => "mustache",
        classes         => results("bundle", "snmpd_conf"),
        perms           => mog("0600","$(snmpd_conf_owner)","root");

      # NOTE:  snmpd saves persistent data in /var/lib/net-snmp/snmpd.conf (referred to here as the
      # "persistent" snmpd.conf).  snmpd must be stopped before /var/lib/net-snmp/snmpd.conf is copied
      # or snmpd will revert any changes on its next startup.  So, we first copy the persistent snmpd.conf
      # to a staging area, then stop snmpd, copy snmpd.conf to /var/lib/net-snmp and finally restart snmpd.
      "/srv/sysadmin/etc/persistent_snmpd.conf"
        comment         => "stage the persistent snmpd.conf",
        copy_from       => force_cp("/var/lib/net-snmp/snmpd.conf"),
        classes         => results("bundle", "staged_snmpd_conf"),
        perms           => mog("0640","root","root");

    snmpd_stop_repaired::
      "/var/lib/net-snmp/snmpd.conf"
        delete          => file;

      "/var/lib/net-snmp/snmpd.conf"
        comment         => "copy the persistent snmpd.conf to /var/lib/net-snmp",
        copy_from       => local_cp("/srv/sysadmin/etc/persistent_snmpd.conf"),
        classes         => results("bundle", "persistent_snmpd_conf"),
        perms           => mog("0600","root","root");

  services:
    snmpd_conf_repaired::
      "snmpd"
        service_policy  => "reload";

    staged_snmpd_conf_repaired::
      "snmpd"
        service_policy  => "stop",
        classes         => results("bundle", "snmpd_stop");

    !(snmpd_stop_repaired|staged_snmpd_conf_repaired)::
      # NOTE: Due to CFEngine function caching, the service status (active/inactive) is only determined once
      # per cf-agent process.  snmpd service restart must therefore be delayed if it was stopped by the
      # current cf-agent process.  This means a 5 minute delay in restarting.
      "snmpd"
        service_policy  => "start";

  reports:
    "DEBUG|DEBUG_$(this.bundle)"::
      "DEBUG $(this.bundle): snmpd.conf owner = $(snmpd_conf_owner)";
    "DEBUG|DEBUG_$(this.bundle).hpsmh"::
      "DEBUG $(this.bundle): Configuring hpsmh dlmod";
    "DEBUG|DEBUG_$(this.bundle).!hpsmh"::
      "DEBUG $(this.bundle): hpsmh dlmod not configured";

}


Nick Anderson

Dec 14, 2017, 2:58:16 PM
to help-cfengine
Perhaps you could key off of the pid file being newer than the conf file?

I have done that before to determine when a host needs to be rebooted (pid 1 newer than grub.conf for example).
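
One way to read this suggestion, as a minimal sketch: `isnewerthan()` compares file modification times, so a class can flag the case where the configuration file changed after the daemon last wrote its pid file. The pid file path `/var/run/snmpd.pid`, the bundle name, and the class name are assumptions for illustration, not taken from the thread.

```cf3
bundle agent snmpd_restart_check
{
  classes:
      # Hypothetical class: set when snmpd.conf was modified after snmpd last
      # wrote its pid file, i.e. the running daemon predates the configuration.
      # The pid file location is an assumption and varies by distribution.
      "snmpd_conf_newer_than_pid"
        expression => isnewerthan("/etc/snmp/snmpd.conf", "/var/run/snmpd.pid");

  reports:
    snmpd_conf_newer_than_pid::
      "snmpd.conf is newer than the snmpd pid file; a restart may be needed";
}
```

This only detects that a restart is pending; it does not by itself address stopping and starting the service within one agent run.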

Beto

Dec 14, 2017, 3:25:19 PM
to help-cfengine
Hello Nick,

The basic problem is that I can't stop and then start a service in the same cf-agent process.

This is because function caching caches the current state (active) of the service (e.g., snmpd).  So the (re)start of the service is skipped because the cached function state is already "active".

I don't see how examining the PID could solve this problem.

Marco Marongiu

Dec 14, 2017, 4:27:40 PM
to help-c...@googlegroups.com


On 14/12/2017 21:25, Beto wrote:
> The basic problem is that I can't stop and then start a service in the
> same cf-agent process.
>
> This is because function caching caches the current state (active) of
> the service (e.g., snmpd).  So the (re)start of the service is skipped
> because the cached function state is already "active".

Can you use commands promises to stop and start snmpd and set classes
based on the return code of the command, so that you know if
killing/starting the process succeeded and act accordingly?
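
A minimal sketch of what this could look like, assuming the standard library's `paths` bundle is loaded so that `$(paths.service)` resolves. The classes body `snmpd_cmd_result` and all class names are hypothetical. It relies on the default behaviour of commands promises: when no `*_returncodes` attributes are set, exit status 0 counts as repaired and any other exit status as failed.

```cf3
body classes snmpd_cmd_result(prefix)
# Hypothetical body (not from the thread or the standard library).
# Defines bundle-scoped classes from the outcome of the promise.
{
      scope            => "bundle";
      promise_repaired => { "$(prefix)_ok" };
      repair_failed    => { "$(prefix)_failed" };
}

bundle agent snmpd_stop_start_sketch
{
  commands:
      "$(paths.service) snmpd stop"
        classes => snmpd_cmd_result("snmpd_stop");

    snmpd_stop_ok::
      # The persistent snmpd.conf would be put in place between these two
      # steps; that part is omitted here.
      "$(paths.service) snmpd start"
        classes => snmpd_cmd_result("snmpd_start");

  reports:
    snmpd_stop_failed::
      "Stopping snmpd failed; start was not attempted";
    snmpd_start_failed::
      "Starting snmpd failed";
}
```

Because the start command is guarded by a class set from the stop command's exit status, it does not depend on any previously discovered service state.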

Ciao
-- bronto

Beto

Dec 14, 2017, 8:45:56 PM
to help-cfengine
Ciao bronto,

Yes, I think commands are an option, but I wanted to ask you experts here if I might be missing any other solutions, such as temporarily setting the body common control attribute cache_system_functions to false.
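
For reference, a sketch of where that attribute lives. It is shown in isolation here; with the masterfiles policy framework, `body common control` already exists in promises.cf, so the attribute would be added to that body rather than declared a second time. Note that this is a setting for the whole agent run, not a per-promise or temporary switch.

```cf3
body common control
{
      # Disables caching of system function results for the entire agent run.
      # Assumption: every other attribute of this body stays as the framework
      # defines it; only this line is new.
      cache_system_functions => "false";
}
```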

Nick Anderson

Dec 14, 2017, 9:14:48 PM
to help-cfengine
I don't think that policy restart is affected by the cached function results for running state class discovery.

Changing function caching temporarily might take two agent runs, but that could be coordinated between the update.cf policy and promises.cf. I can imagine interesting ways to do this with the new multiple augments support.

I think it might be cleaner if we had a way to do per promise cache invalidation.

Beto

Dec 15, 2017, 8:48:15 AM
to help-cfengine
Nick,

I believe policy restart (i.e., stopping then starting) absolutely is affected by function caching.  When I first implemented the snmp policy I couldn't understand why snmpd was never getting restarted; it was like the start promise was being skipped.  After much testing and tracing it became very obvious what was happening.  I suppose this is a corner case, as it's likely not very often that one needs to stop and then start a service in a bundle.

Nick Anderson

Dec 15, 2017, 11:11:55 AM
to Beto, help-cfengine
I was only looking at the current MPF, and specifically at the parts related to System V init, so perhaps you're using a different init system or an older MPF.

https://github.com/cfengine/masterfiles/blob/master/lib/services.cf#L319


    sysvservice.restart::
      "$(paths.service) $(service) restart"
        handle => "standard_services_sysvservice_restart",
        classes => kept_successful_command,
        comment => "If the service should be restarted we issue the
                    standard service command to restart or reload the service.
                    There is no restriction based on the services current state as
                    restart can start a service that was not already
                    running.";

It runs the restart regardless of the current state (running|not_running). But this promise will still only be actuated ONCE per policy run because of promise locking.

If you need to issue that same command multiple times/multiple restarts of the same service within ONE agent execution then we will need to look at making the promise unique based on something like the call stack. I think this is possible in the most recent versions.


--
You received this message because you are subscribed to the Google Groups "help-cfengine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to help-cfengine+unsubscribe@googlegroups.com.

To post to this group, send email to help-c...@googlegroups.com.
Visit this group at https://groups.google.com/group/help-cfengine.
For more options, visit https://groups.google.com/d/optout.

Marco Marongiu

Dec 15, 2017, 11:26:20 AM
to help-c...@googlegroups.com


On 15/12/17 17:11, 'Nick Anderson' via help-cfengine wrote:
> If you need to issue that same command multiple times/multiple restarts
> of the same service within ONE agent execution then we will need to look
> at making the promise unique based on something like the call stack

...or the promise handle, possibly?

-- M

Nick Anderson

Dec 15, 2017, 11:49:15 AM
to Marco Marongiu, help-cfengine
Yes, my thought was to set the handle to something based on `callstack_callers()` so that each evaluation of the promise would be different.

But I can't seem to get what I expected:

```
  vars:
    "callers" data => callstack_callers();

  reports:
    "$(with)" with => string_mustache( "{{$-top-}}", @(callers) );
```

$(with) doesn't expand :-/

This works:

```
bundle agent main
# User Defined Service Catalogue
{

  vars:

    "callers" string => string_mustache( "{{$-top-}}", callstack_callers() );

  reports:
    "$(main.callers)";

}

```

But I think we would want it all done in the RHS of the handle so that the callstack_callers would be correct for the actual promise and not some intermediary variable.



Mike Weilgart

Dec 15, 2017, 3:55:17 PM
to Beto, help-cfengine
I think using services promises for anything other than declaring the desired end state of the service is illogical.  More to the point, modifying the standard services bundle so it doesn't cache the function call would be an extreme violation of the design of services promises in the first place.

Under the hood services promises are just syntactic sugar sprinkled on top of processes promises and commands promises.
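
As a rough illustration of that pattern (a sketch only, not the standard library's actual implementation): a processes promise detects that the daemon is missing and sets a class, and a commands promise guarded by that class starts it. The bundle and class names are hypothetical, and `$(paths.service)` assumes the standard library's `paths` bundle is loaded.

```cf3
bundle agent snmpd_running_sketch
{
  processes:
      # restart_class is defined when no process matching "snmpd" is found.
      "snmpd"
        restart_class => "snmpd_not_running";

  commands:
    snmpd_not_running::
      # Runs only when the processes promise above found nothing to match.
      "$(paths.service) snmpd start";
}
```

In normal ordering, processes promises are evaluated before commands promises, so the class is available by the time the command is considered.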

In this case you aren't declaring an end state; you're asking for an explicit action to be taken.  So use a commands promise—it will make your life a lot simpler.  And it will keep the benefits of services promises nice and clean and simple for everyone who's using them simply for their usual use case (i.e. it won't add the additional overhead for such people).

But I can't make head or tail of your policy.  Where is the source of truth for /var/lib/net-snmp/snmpd.conf?  You copy it from /srv/sysadmin/etc/persistent_snmpd.conf which in turn gets circularly copied from /var/lib/net-snmp/snmpd.conf in the first place.

Best,
—Mike Weilgart
Vertical Sysadmin, Inc.


Beto

Dec 18, 2017, 8:48:22 AM
to help-cfengine
Nick,

I'm using the 3.10 masterfiles.  But remember, I'm not using "restart"; I'm using a "stop", copying snmpd.conf into place, and then doing a "start".  snmpd must be stopped when the persistent snmpd.conf is copied into place.  I don't expect to see this issue using a simple "restart".

Beto

Dec 18, 2017, 9:43:09 AM
to help-cfengine
This is the requirement that you may not understand, although it's all explained in the comments in the above policy:

snmpd must be stopped when /var/lib/net-snmp/snmpd.conf is copied into place. If snmpd is running when /var/lib/net-snmp/snmpd.conf is copied, any changes are reverted whenever snmpd is restarted for ANY reason.  This is because snmpd itself updates /var/lib/net-snmp/snmpd.conf when it starts.  So we, in effect, are only "priming" /var/lib/net-snmp/snmpd.conf from /srv/sysadmin/etc/persistent_snmpd.conf, which is kept in sync with /var/lib/net-snmp/snmpd.conf from the policy host.

I'm beginning to think you're right about using commands.  As I have said, I am just doing my due diligence to ensure I hadn't missed a simpler way.

Thanks!

Mike Weilgart

Dec 19, 2017, 5:52:05 AM
to Beto, help-cfengine
In fact I did dig through your comments extensively before I wrote my email.  And the man page.  Probably more than was warranted; I got nerd sniped.  :)

The trouble is that you wrote: "If snmpd is running when /var/lib/net-snmp/snmpd.conf is copied...." which is extremely unclear.  I expect you meant to say, "If /var/lib/net-snmp/snmpd.conf is modified while snmpd is running..." but that's not what you actually stated.  Taken literally, your statement would mean that if you run the command "cp /var/lib/net-snmp/snmpd.conf /tmp/" while snmpd is running, the "changes" (what changes?) would be reverted.

From your last email I take it that the "force_cp" copy_from body includes a non-parameterized "servers" attribute and thus is in fact a remote copy.  (force_cp is not in the standard library and you didn't include the definition inline, so again I'm guessing here.)

But yes, commands promises are the simpler way here.  :)  I would use the "if_ok" classes body from the standard library, and the "if_repaired" classes body as well.  So you would have:

  files:
      "$(snmp_conf)"
        whatever
        classes => if_repaired("snmpd_restart_needed");

  files:
      "$(staged_persistent_snmp_conf)"
        whatever
        classes => if_repaired("snmpd_restart_needed");

  commands:
    snmpd_restart_needed::
      "$(paths.service) snmpd stop"
        classes => if_repaired("stop_repaired");

  files:
    stop_repaired::
      "$(persistent_snmp_conf)"
        whatever
        classes => if_ok("persistent_snmp_conf_ok");

  commands:
    persistent_snmp_conf_ok::
      "$(paths.service) snmpd start";

I don't see why you would even have a reload command in the mix.  Seems you'll just confuse yourself.  With the above approach, there is only one way in which snmpd will ever get restarted.

Best,
—Mike Weilgart
Vertical Sysadmin, Inc.

Beto

Dec 19, 2017, 7:57:34 AM
to help-cfengine
Actually, since the policy is working just as intended, I would say you're the one that is confused.

The reload is there to reload snmpd after /etc/snmp/snmpd.conf is modified.  It has nothing to do with the problem I described related to handling the persistent /var/lib/net-snmp/snmpd.conf.  They are related but totally different files.

At any rate, yesterday I slightly reworked the policy to add a command to (re)start snmpd after stopping it and updating the persistent file.  This solves the problem I posted about.

I truly appreciate you and others taking the time to comment on this.

In case anyone is interested here's the updated bundle (minus the vars section which I omitted for brevity and to protect the innocent):

#########################################################
#
# snmpd.cf - configure snmpd
#
#########################################################
bundle agent snmpd
{

  meta:
      "description"  string     => "Configure SNMP";
      "tags"         slist      => { "autorun" };

  classes:
      "hpsmh"        expression => regcmp("true", "$(hpsmh)");

  files:
    linux::
      "/etc/snmp/snmpd.conf"
        edit_template   => "$(glb.templates)/snmpd.conf.mustache",
        edit_defaults   => empty,
        template_method => "mustache",
        classes         => results("bundle", "snmpd_conf"),
        perms           => mog("0600","$(snmpd_conf_owner)","root");

      # NOTE:  snmpd saves persistent data in /var/lib/net-snmp/snmpd.conf (referred to here as the
      # "persistent" snmpd.conf).  snmpd must be stopped before /var/lib/net-snmp/snmpd.conf is copied
      # or snmpd will revert any changes on its next startup.  So, we first copy the persistent snmpd.conf
      # to a staging area, then stop snmpd, copy snmpd.conf to /var/lib/net-snmp and finally restart snmpd.
      "/srv/sysadmin/etc/persistent_snmpd.conf"
        comment         => "stage the persistent snmpd.conf",
        copy_from       => force_cp("/var/lib/net-snmp/snmpd.conf"),
        classes         => results("bundle", "staged_snmpd_conf"),
        perms           => mog("0640","root","root");

    snmpd_stop_repaired::
      "/var/lib/net-snmp/snmpd.conf"
        delete          => file;

      "/var/lib/net-snmp/snmpd.conf"
        comment         => "snmpd is stopped, copy the persistent snmpd.conf to /var/lib/net-snmp",
        copy_from       => local_cp("/srv/sysadmin/etc/persistent_snmpd.conf"),
        classes         => results("bundle", "persistent_snmpd_conf"),
        perms           => mog("0600","root","root");

  services:
    snmpd_conf_repaired::
      "snmpd"
        service_policy  => "reload";

    staged_snmpd_conf_repaired::
      "snmpd"
        service_policy  => "stop",
        classes         => results("bundle", "snmpd_stop");

    any::
      "snmpd"
        service_policy  => "start";

  commands:
    persistent_snmpd_conf_repaired::
      # NOTE: Due to CFEngine function caching, a service promise cannot be used to start snmpd
      # if it was stopped by the current cf-agent process.  So we use a command instead to start
      # snmpd.
      "$(paths.path[service]) snmpd start"
        ifvarclass      => "!systemd";

      # systemctl takes the verb before the unit name.
      "$(paths.path[systemctl]) start snmpd"
        ifvarclass      => "systemd";

  reports:
    "DEBUG|DEBUG_$(this.bundle)"::
      "DEBUG $(this.bundle): snmpd.conf owner = $(snmpd_conf_owner)";
    "DEBUG|DEBUG_$(this.bundle).hpsmh"::
      "DEBUG $(this.bundle): Configuring hpsmh dlmod";
    "DEBUG|DEBUG_$(this.bundle).!hpsmh"::
      "DEBUG $(this.bundle): hpsmh dlmod not configured";

}