Jira (PUP-1054) Services should support 'reload' in addition to 'restart'


Colin Alston (JIRA)

Nov 4, 2014, 6:25:27 AM
to puppe...@googlegroups.com
Colin Alston commented on New Feature PUP-1054
 
Re: Services should support 'reload' in addition to 'restart'

bump*bump*bump This has been outstanding for about 4 years and it's a pretty critical feature for hundreds of services. Having to write exec extensions all the time just to do 'reload' instead of 'restart' is pretty sad.

This message was sent by Atlassian JIRA (v6.3.7#6337-sha1:2ed701e)

Luke Kanies (JIRA)

Nov 6, 2014, 9:46:27 PM
to puppe...@googlegroups.com
Luke Kanies commented on New Feature PUP-1054

Eric Sorenson Does this show up anywhere in our plans?

Seems like there's a lot of interest.

For those who are watching, do you have a clear idea of how you want this to work?

Spencer Krum (JIRA)

Nov 6, 2014, 11:04:30 PM
to puppe...@googlegroups.com
Spencer Krum commented on New Feature PUP-1054

This can be pretty easily worked around by:

file { '/etc/config':
  ensure => present,
}
~>
service { 'myservice':
  ensure  => running,
  restart => '/usr/bin/apachectl reload', # or something like that
}

This should probably be better documented/publicized though. I used to think that this ticket was a huge problem. I've kind of gotten over that. It's seductive to say that we should add a reload parameter to the service type. I think it will lead us into having more and more parameters to the service type (reload, graceful_reload, close_all_connections_and_turn_off, give_me_a_pony, and so on). I also think we will quickly end up with two services that define the word 'reload' and/or 'graceful reload' differently.

So what do I think should be done? I think the workaround of overloading the restart parameter should be exercised. Are there places it doesn't work? Where are those places? If we change Puppet we should make sure we don't leave those cases out in the cold.

To go a bit further, this makes me wonder if notify/subscribe should be extended. I'm not sure exactly what that would look like. But right now notify/subscribe is 'You should restart.' I think what I'm suggesting is a richer language for resources to interact. A file resource could notify the service resource with 'refresh' or with 'restart.' I'm usually pretty wary of adding complexity, but I can imagine types and providers writing handlers for arbitrary events, and that could enable some super cool behaviors in puppet.

Patrick Hemmer (JIRA)

Nov 6, 2014, 11:13:27 PM
to puppe...@googlegroups.com

There's a significant problem with your solution. Let's say you upgrade apache. How are you going to restart it?

Colin Alston (JIRA)

Nov 7, 2014, 12:06:27 AM
to puppe...@googlegroups.com
Colin Alston commented on New Feature PUP-1054

Some things like Postgres need restart under different conditions. Eg a nicer thing to do might be

service { 'postgresql': ensure => running }

file { '/etc/postgresql/postgresql.conf': notify => Service['postgresql'] }

file { '/etc/postgresql/pg_hba.conf': reload => Service['postgresql'] }

You need control over /when/ you reload, not replacing the entire restart command.
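
For comparison, the `reload` metaparameter above does not exist in Puppet, but the same split can be approximated today with the Exec-based workaround mentioned at the start of this thread. The following is only a sketch: the `systemctl` command, the search path and the file locations are assumptions, not taken from any particular module.

```puppet
# Sketch only: service name, paths and reload command are assumptions.
service { 'postgresql':
  ensure => running,
}

# A refresh-only Exec does nothing unless another resource notifies it.
exec { 'postgresql-reload':
  command     => 'systemctl reload postgresql',
  path        => ['/usr/bin', '/bin'],
  refreshonly => true,
  require     => Service['postgresql'],
}

# Changes here trigger the Service's normal refresh, i.e. a restart.
file { '/etc/postgresql/postgresql.conf':
  ensure => file,
  notify => Service['postgresql'],
}

# Changes here only trigger the reload Exec.
file { '/etc/postgresql/pg_hba.conf':
  ensure => file,
  notify => Exec['postgresql-reload'],
}
```

If both files change in the same run, the service is restarted and the reload is then issued as well, which should be harmless but is redundant.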

Daniele Sluijters (JIRA)

Nov 7, 2014, 3:03:29 AM
to puppe...@googlegroups.com

So far I've mostly used the solution Spencer Krum provided because with a lot of services, once it's running after a startup all you need is reload.

Patrick Hemmer makes an interesting point: at package upgrade time you want to restart the service. However, package managers already do this for you. If Puppet upgrades apache2 for me, at least on Debian, the package scripts will stop the service, replace the necessary files and then start the service. The issue at upgrade time is mostly non-existent, except for packages like nginx or weechat which support 'hot'/in-place upgrades, though in a lot of places the package manager decides against it even when the tool supports it.

I do agree with Colin Alston though, especially in the case of databases, because they tend to have a bunch of different 'types' of configuration: some that require a full restart and others that can be reloaded or tweaked at runtime, in which case it helps to be able to tell Puppet what to do.

Since you can't pass 'actions' to a notify like Service['postgresql' => restart] I don't really see another way except for implementing a new metaparameter. I'm a bit hesitant towards this though as I believe a metaparameter should make sense on all resource types it can be applied to (yes there are exceptions) but it doesn't really make sense to tell a file/user/package/cron/whatever that it should reload.

Matthew Burgess (JIRA)

Nov 7, 2014, 4:02:27 PM
to puppe...@googlegroups.com

I'm not sure that a metaparameter is the right approach here either. At the moment, I can only see the service type here requiring a choice of behaviours. That said, I guess there might be cases where custom types might want to do different things if notified by different dependents. As Colin Alston mentioned, it's the dependents rather than the type itself that need to tell the type what action to take. As mentioned above, sometimes a restart is required, sometimes a reload is sufficient, and it's only the thing that will trigger the service reload/refresh that knows which one is appropriate.

So, how about something like a notifyactions metaparameter that has a one-to-one mapping with the elements specified in the notify metaparameter. If the notifyactions metaparameter is not specified, then the existing default refresh actions are taken, and that also occurs if an element in the notify array has an undef entry in the notifyactions metaparameter. Something like the following, perhaps:

service { 'dnsmasq': ensure => 'running' }
file { '/etc/dnsmasq.conf': notify => Service['dnsmasq'], } # this would restart dnsmasq
file { '/etc/hosts.d/testservers': notify => Service['dnsmasq'], notifyactions => '/sbin/service dnsmasq reload' } # this reloads dnsmasq
file { '/etc/hosts.d/testservers': notify => [Class['someotherclass'], Service['dnsmasq']], notifyactions => [undef, '/sbin/service dnsmasq reload'] } # as does this, but also carries out the default refresh behaviour on someotherclass

John Duarte (JIRA)

May 15, 2017, 1:48:06 PM
to puppe...@googlegroups.com
John Duarte updated an issue
 
Puppet / New Feature PUP-1054
Change By: John Duarte
Labels: redmine  triaged

John Duarte (JIRA)

May 15, 2017, 1:49:05 PM
to puppe...@googlegroups.com

John Duarte (JIRA)

May 15, 2017, 1:50:05 PM
to puppe...@googlegroups.com
John Duarte updated an issue
Change By: John Duarte
Labels: needs_decision redmine triaged

Josh Cooper (JIRA)

May 16, 2017, 2:46:07 PM
to puppe...@googlegroups.com
Josh Cooper updated an issue
Change By: Josh Cooper
Labels: needs_decision  redmine triaged

Moses Mendoza (JIRA)

May 18, 2017, 1:56:03 PM
to puppe...@googlegroups.com

Sam McLeod (JIRA)

Aug 1, 2017, 11:29:06 PM
to puppe...@googlegroups.com
Sam McLeod commented on New Feature PUP-1054

Is this ticket able to be actioned?

Still badly need this unless there's a new way to handle reloads instead of restarts that we haven't come across.

Matthew Patton (JIRA)

Aug 4, 2017, 2:04:04 AM
to puppe...@googlegroups.com

IMO 'reload' is missing the point of Puppet. If you allow Puppet to change the state of the machine, you are by definition doing so during an outage window and have notified upstream devices (cluster manager, load-balancer, etc.) that you are unhealthy, since service disruption is to be expected. QED a restart is proper if more heavy-handed than you might like. That some handful of software has fancy in-place abilities is immaterial.

I suspect implementing 'reload' will cause far more headache for users who wonder why their new configuration is not (or only partly) deployed and then either blame Puppet or have to go delve into their particular software to figure out what changes are honored under what conditions. Why make life difficult? KISS - just restart it.

Colin Alston (JIRA)

Aug 4, 2017, 3:12:03 AM
to puppe...@googlegroups.com
Colin Alston commented on New Feature PUP-1054

No, you're missing the point of 'reload' and continuous orchestration.

Assume you have some configuration change which alters a PostgreSQL pg_hba.conf for example, you can apply those changes without restarting the service and terminating active connections which would disrupt other services. A reload mitigates that by re-configuring the service without a restart.

HUP signals are used by numerous other services; another good example is something like HAproxy for edge routing in cloud infrastructures, where again it would be quite annoying to have your entire infrastructure blip just to on-board a new service through Puppet.

Matthew Patton (JIRA)

Aug 4, 2017, 3:55:04 AM
to puppe...@googlegroups.com

It's just happy coincidence that pg_hba.conf changes and certain changes to Apache/Nginx and some others can be done in a theoretically non-disruptive manner. Say you reload Postgres and sure, while your current sessions can continue, all new sessions are failing because you screwed something up. Or even better, some other CI push invoked reload() on the LDAP server(s) and they didn't like it for whatever obscure reason. Now your PG users are unable to authenticate.

I'll happily stipulate that INIT script quality is all over the map, but I'll wager a restart() results in a reliable service outcome whereas reload() is at best a coin flip. Puppet and its ilk are about enforcing and achieving a deterministic state. End-user disruption doesn't enter the equation one bit.

To that end nobody sane runs CI on production equipment in an uncontrolled manner, EVEN IF you happen to be running daemons that (most of the time) let you get away with it. Nodes are properly drained and rotated out, they are restarted, all pertinent aspects of their health are ascertained, and then and ONLY then, are they put back into the pool. With PG you're in luck; you can leverage pgpool and pgbouncer so there is no reason not to do the Right Thing(r). There are no shortcuts in operations as much as CI/DevOPS would like to pretend they can hand-wave it away.

Colin Alston (JIRA)

Aug 4, 2017, 4:05:03 AM
to puppe...@googlegroups.com
Colin Alston commented on New Feature PUP-1054

Inflammatory statements aside, Puppet is used in numerous production environments to coordinate cluster orchestration and other tasks which heavily rely on non-disruptive configuration changes. There is no need for that to be considered any less controlled than running Kubernetes to shift workloads.

pgpool and pgbouncer are not at all one-size-fits-all solutions; in fact they're exceptionally bad solutions to a far simpler issue. However, managing cluster failover is an important example of why you want reloads: when performing maintenance on a warm/hot standby pair, it might be useful not to bring an entire cluster down just to update the sync destination.

Matthew Patton (JIRA)

Aug 4, 2017, 8:02:04 AM
to puppe...@googlegroups.com

At the risk of jacking the thread, what happens to your cluster when there is anywhere from minutes, to maybe even hours (whenever the next successful puppet run completes) where the members don't all have the same configuration? Does it split-brain? Does it improperly migrate the service or trigger competing master elections? Does it corrupt or return inconsistent data? Is it even able to establish quorum or respond correctly to a legitimate member state change?

It seems you're under the impression that 'service reload' always succeeds. Sure, it might for 99% of attempts, but it's the 1% that matters. I suspect the reason this has been open for 4 years is because while it may be convenient to some, much self-inflicted harm will result from unthinking, careless use. In the scheme of things a restart doesn't take all that much more time than a reload, and it has the benefit of being an observable state transition.

Colin Alston (JIRA)

Aug 4, 2017, 10:55:04 AM
to puppe...@googlegroups.com
Colin Alston commented on New Feature PUP-1054

I don't understand why you're trying to block this feature simply because it "could be dangerous", absolutely everything that every configuration management tool does could be dangerous or work unexpectedly in inexperienced hands, but that's no reason to hamstring the rest of us into ugly workarounds for something which is a simple syntactic feature - especially when almost every other competing tool offers it.

Matthew Patton (JIRA)

Aug 4, 2017, 11:38:03 AM
to puppe...@googlegroups.com

I'm not blocking anything. I'm simply pointing out that 'reload' has all kinds of problems. A feature shouldn't be "experts only" or else be prepared to get burned in non-obvious ways. Maybe some consider it semantics but Puppet has no way to know the daemon did anything in response to the reload. Maybe the process was smart enough to ignore syntax errors, maybe instead it exited on said errors and now isn't running at all. Puppet isn't necessarily going to issue a Start(). So much for that 'reload', eh? Maybe it closed some file handles and not others, maybe it has processes/sockets open that are running the old config until the client disconnects or the LB lets go (if ever). That's a whole lot of Maybes and I wasn't even trying.

State machines are not about Maybes, they are about (plausible) Guarantees.

Restart on the other hand ensures as best it can that the previous process has been terminated, and that subsequently all configuration files were re-read on startup; it has reasonable assurance and expectation that the intended outcome has been reached.

You're able to implement your own workaround, so maybe do a wrapper type and publish on Forge? That other tools may have capitulated to the demands of people who insist on being cavalier about their state machine is not a worthwhile argument. Correctness trumps convenience and expediency. Every time.

Sam McLeod (JIRA)

Aug 6, 2017, 8:45:03 PM
to puppe...@googlegroups.com
Sam McLeod commented on New Feature PUP-1054

Matthew Patton, respectfully I think you're missing the point of what is debatably the most common use of reload / SIGHUP on modern Linux systems.

I agree with Colin Alston, a great deal of functionality within Puppet or indeed any automation platform or subsystem 'could be dangerous' if used unwisely.

If you take the stance that people should only use software in one specific way, you'll end up with an inflexible system and, in this case, a product that doesn't work for many people.
When it comes to systems / platform automation and configuration there is (likely more than in any other field of deployment) a diverse range of operating modes with a varying range of scope that the software may be deployed.

One example (of many I can think of) is PostgreSQL: perhaps your solution requires the use of an ACID compliant database, or maybe it's just what your organisation uses as a SOE.
PostgreSQL, like a lot of software, has configuration that can be changed online without needing to restart the service. For example, you might dynamically adjust the size of various memory buffers based on the number of clients deployed to a cluster, etc. You don't need to restart your database server for every change, and there could be many undesired side effects of doing so: not just disconnections if not handled correctly, but also things like loss of memory caching and live query planner optimisations that could affect performance.

What might be useful is a meta-parameter to Service Reload which could take a command to test that reload will work for a specific change and if it doesn't exit 0, subsequent puppet runs will by effect continue to show as failed rather than applying the failed configuration, issuing a Service Reload, failing once and being forgotten about.

You can do dangerous things with many commands in Puppet (or indeed any configuration management / automation system), but that doesn't mean you should completely rule out the commands that might lead to such situations: you'd end up with an inflexible, one-size-fits-none product that would suit very few customers' needs.
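
Something close to the test-before-reload idea above can be sketched with attributes that already exist, by validating the file rather than the reload itself. This assumes HAProxy as the service, a hypothetical `profile` module as the file source, and `systemctl` as the reload mechanism; the `validate_cmd` attribute of the file type runs a check against the candidate content, with `%` standing in for a temporary copy of the new file.

```puppet
# Sketch only: module name, paths and commands are assumptions.
file { '/etc/haproxy/haproxy.cfg':
  ensure       => file,
  source       => 'puppet:///modules/profile/haproxy.cfg', # hypothetical module
  # The new content is checked before it replaces the file on disk.
  # If the check exits non-zero, the file is left alone and the resource fails.
  validate_cmd => '/usr/sbin/haproxy -c -f %',
  notify       => Exec['haproxy-reload'],
}

# Only runs when notified, so a failed validation never reaches the reload.
exec { 'haproxy-reload':
  command     => 'systemctl reload haproxy',
  path        => ['/usr/bin', '/bin'],
  refreshonly => true,
}
```

Because the invalid content is never written, each later run attempts the same change and fails again, so the problem stays visible in reports instead of being forgotten after one failed reload. It does not prove that the daemon actually picked up a valid change.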

Matthew Patton (JIRA)

Aug 7, 2017, 7:03:04 AM
to puppe...@googlegroups.com

As an almost 30-year sysadmin I think I know how HUP can be used. Since Postgres keeps coming up as a use-case, maybe instead focus on a popular Forge recipe and define a HUP capable service class? BTW what makes you think that all future settings to pg_hba.conf will necessarily be HUP-safe? Either way, neither of you has shown why it's perfectly acceptable to make a change to a live production system, and a (clustered) database at that, without doing all of the necessary legwork of site reliability engineering 101. Furthermore, as part of that process you would have written the necessary scripts to re-warm your cache and preload query planner rules (hey look, there's a Git project for that). Sorry to say it but HUP is just a dodge. Do the work and Restart is no big deal.

But even among HUP-friendly programs like Apache, Sendmail, Postfix and I'm sure a zillion more, each has settings that simply will not or can not be honored/updated in a HUP situation. The perennial favorite RSyslog doesn't even honor HUP anymore. Puppet's not about to deduce those nuances. Are you going to write your Apache or Postfix manifest to compare the incoming and current configuration file(s) and grep for the non-HUP friendly keywords and have your manifest branch its Notify accordingly? If this were Chef and you wanted to write all that logic, you absolutely can.

HUP is "dangerous" because it's a lie. My developers and sysadmins lie enough as it is. I don't need my configuration enforcement mechanism to do it too.

Sam McLeod (JIRA)

Aug 7, 2017, 7:09:03 AM
to puppe...@googlegroups.com
Sam McLeod commented on New Feature PUP-1054

I think you should check your attitude; it's neither constructive nor appreciated.

Colin Alston (JIRA)

Aug 7, 2017, 7:47:03 AM
to puppe...@googlegroups.com
Colin Alston commented on New Feature PUP-1054

Matthew it appears clear that if you are unable to respect your own colleagues in a public forum then you're unlikely to respect anything other than your own view on this matter. I think on that basis we can all safely disregard your "input" here.

Matthew Patton (JIRA)

Aug 7, 2017, 7:58:03 AM
to puppe...@googlegroups.com

You are welcome to explain in detail how any of my points are provably wrong and how you would have Puppet itself or your manifest figure out under what circumstances and under what set of changes 'reload' is demonstrably safe and guarantees your end state has been enforced.

Sam McLeod (JIRA)

Sep 12, 2017, 8:16:03 PM
to puppe...@googlegroups.com
Sam McLeod commented on New Feature PUP-1054

Just heard back from Puppet support:

I'm so sorry for the delay, and I thank you for your patience with this ticket. I was able to speak to our product managers about the feature request https://tickets.puppetlabs.com/browse/PUP-1054. The feature is on our roadmap, but we don't have a timeline for when it will be completed. We do regularly go through and evaluate feature requests to reprioritize them and I have brought this up with the engineering team to take a look. The best place to look for status updates will be in that JIRA ticket. Please let me know if you have any questions or concerns. If not, is it alright to close this ticket?

Adam Bottchen (JIRA)

Sep 13, 2017, 6:31:05 PM
to puppe...@googlegroups.com
Adam Bottchen updated an issue
 
Change By: Adam Bottchen
CS Priority: Needs Priority

Nick Walker (JIRA)

Sep 13, 2017, 6:56:04 PM
to puppe...@googlegroups.com
Nick Walker commented on New Feature PUP-1054

There is a way to cause a service resource to use reload instead of restart: it's just the combination of the hasrestart and restart attributes.

   hasrestart => true,
   restart    => "service ${service_name} reload",

Unfortunately, a lot of services have some settings that can be reloaded and some settings that require a full restart. So, we need a way for other resources to inform the service resource about whether it should perform a restart or reload.

I think that's what Redmine #7594 is suggesting but I can't find a JIRA for that.

Thomas Kishel (JIRA)

Sep 18, 2017, 6:49:04 PM
to puppe...@googlegroups.com
Thomas Kishel commented on New Feature PUP-1054

Yes, the restart parameter workaround limits Notify/Subscribe to only perform a reload since, like in Highlander, "There can be only one" response to a refresh event.

But it seems outside the scope of the Service resource to differentiate between changes to a service's configuration that require a restart vs changes that only require a reload.

Adding a hasreload parameter (and reload) to the Service resource, and modifying the notify / subscribe metaparameters to handle multiple parameters would allow this:

service {'example':
  hasreload  => true,
}
 
file {'/etc/example.conf':
  ensure  => 'file',
  content => 'xxx=yyy',
  notify  => {resource => Service['example'], event=> 'reload'},
}

But doing so just for Service resources doesn't seem orthogonal, if I'm using that word correctly. Using an Exec resource with refreshonly to execute a reload seems reasonable. Wrapping that Exec in a class/module to handle variations in operating systems and their service providers would be even better.
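
A minimal sketch of that wrapper, written as a defined type so it can be reused per service. The name `service_reload`, its parameters and the default `systemctl` command are all hypothetical; a real module would pick the command per operating system and service provider. The parameter data types assume Puppet 4 or later.

```puppet
# Hypothetical defined type; name, parameters and default command are illustrative.
define service_reload (
  String $service = $title,
  String $command = "systemctl reload ${title}",
) {
  # Refresh-only: runs the reload command only when this resource is notified.
  exec { "reload-${service}":
    command     => $command,
    path        => ['/usr/sbin', '/usr/bin', '/sbin', '/bin'],
    refreshonly => true,
    require     => Service[$service],
  }
}

service { 'example':
  ensure => running,
}

service_reload { 'example': }

# Notifying the defined type passes the refresh on to the Exec inside it.
file { '/etc/example.conf':
  ensure  => 'file',
  content => 'xxx=yyy',
  notify  => Service_reload['example'],
}
```

Resources that need a full restart keep notifying `Service['example']` directly, so the choice between reload and restart stays with the resource that triggers it.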

Branan Riley (JIRA)

May 9, 2018, 2:35:09 PM
to puppe...@googlegroups.com
Branan Riley updated an issue
 
Change By: Branan Riley
Labels: redmine service triaged type_and_provider

Branan Riley (JIRA)

May 9, 2018, 2:35:09 PM
to puppe...@googlegroups.com
Branan Riley updated an issue
Change By: Branan Riley
Labels: redmine service type_and_provider

Bogdan Irimie (Jira)

Nov 5, 2020, 3:51:03 AM
to puppe...@googlegroups.com
Bogdan Irimie updated an issue
Change By: Bogdan Irimie
Sprint:

Bogdan Irimie (Jira)

Nov 5, 2020, 3:52:04 AM
to puppe...@googlegroups.com

Ciprian Badescu (Jira)

Oct 20, 2021, 4:53:02 AM
to puppe...@googlegroups.com
Ciprian Badescu updated an issue
Change By: Ciprian Badescu
Sprint: ready for triage