As a Puppet user I expect the puppetserver to gracefully handle a transient PuppetDB maintenance mode state without reporting agent failures when there are multiple PuppetDB instances configured.
When one of the PuppetDB instances is in maintenance mode, it returns a 503 to the Puppetserverwhich was implemented in PDB-1019 should be handled when {{command_broadcast = true}} and {{min_successful_submissions = 1}}. Puppetserver However, puppetserver sees this as a failure and sends a 500 error to the agent. The 503 indicates that Since the Puppetserver or the agent should resubmit the query again later other PuppetDB instance is available, but puppetserver does not handle this gracefully and it results in a failed agent run.
PUP-7451 implemented Retry-After headers in the puppet agent {{min_successful_submissions = 1}}, so it may be an option to relay the 503 to should be ignored and the agent command to retry it later the other PuppetDB should be successful. Another option The actual result is to implement a queue or retry within that Puppetserver when sends a 503 is received from 500 back to the agent and the agent runs are all failures for the duration that any of the PuppetDB nodes are in maintenance mode.
Steps to reproduce: # Configure PuppetDB replication # Configure {{command_broadcast = true}} in the {{puppetdb.conf}} on the master # Run a puppet agent while PuppetDB one of the PuppetDBs is in maintenance mode and the other one is available. h2. Logs:
From the puppetserver.log
{code:java}2018-06-08T10:35:23.831-07:00 WARN [qtp2042713953-1209] [puppetserver] Puppet Error connecting to pe-201810-agent-replica.puppetdebug.vlan on 8081 at route /pdb/cmd/v1?checksum=c36ef6428032c548e53a17e5c22e5b4447748574&version=9&certname=pe-201810-agent.puppetdebug.vlan&command=replace_catalog&producer-timestam p=1528479323, error message received was ''. Failing over to the next PuppetDB server_url in the 'server_urls' list 2018-06-08T10:35:23.835-07:00 ERROR [qtp2042713953-1209] [puppetserver] Puppet [503 ] PuppetDB is currently down. Try again later. 2018-06-08T10:35:23.835-07:00 ERROR [qtp2042713953-1209] [puppetserver] Puppet /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/puppetdb/command.rb:8 2:in `submit' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/puppetdb.rb:62:in `block in submit_command' /opt/puppetlabs/puppet/lib/ruby/vendor_rub y/puppet/util/profiler/around_profiler.rb:58:in `profile' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/profiler.rb:51:in `profile' /opt/puppetlab s/puppet/lib/ruby/vendor_ruby/puppet/util/puppetdb.rb:99:in `profile' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/puppetdb.rb:59:in `submit_comm and' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/indirector/catalog/puppetdb.rb:14:in `block in save' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/pup pet/util/profiler/around_profiler.rb:58:in `profile' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/profiler.rb:51:in `profile' /opt/puppetlabs/pup pet/lib/ruby/vendor_ruby/puppet/util/puppetdb.rb:99:in `profile' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/indirector/catalog/puppetdb.rb:11:in `sa ve' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/indirector/store_configs.rb:24:in `save' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/indirecto r/indirection.rb:204:in `find' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/network/http/api/indirected_routes.rb:121:in `do_find' /opt/puppetlabs/pup pet/lib/ruby/vendor_ruby/puppet/network/http/api/indirected_routes.rb:48:in `block in call' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/context.rb:65 :in `override' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet.rb:260:in `override' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/network/http/api/i ndirected_routes.rb:47:in `call' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/network/http/route.rb:82:in `block in process' org/jruby/RubyArray.java: 1735:in `each' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/network/http/route.rb:81:in `process' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/n etwork/http/route.rb:87:in `process' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/network/http/route.rb:87:in `process' /opt/puppetlabs/puppet/lib/rub y/vendor_ruby/puppet/network/http/handler.rb:64:in `block in process' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/profiler/around_profiler.rb:58 :in `profile' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/profiler.rb:51:in `profile' /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/network /http/handler.rb:62:in `process' uri:classloader:/puppetserver-lib/puppet/server/master.rb:42:in `handleRequest'{code}
From the agent, which gets a 500 instead of a 503.
{code:java} -> "HTTP/1.1 500 Server Error\r\n" -> "Date: Fri, 08 Jun 2018 17:35:33 GMT\r\n" -> "Content-Type: application/json;charset=utf-8\r\n" -> "X-Puppet-Version: 5.5.1\r\n" -> "Content-Length: 108\r\n" -> "Server: Jetty(9.4.z-SNAPSHOT)\r\n" -> "\r\n" reading 108 bytes... -> "{\"message\":\"Server Error: [503 ] PuppetDB is currently down. Try again later.\",\"issue_kind\":\"RUNTIME_ERROR\"}" read 108 bytes Conn keep-alive Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: [503 ] PuppetDB is currently down. Try again later. Warning: Not using cache on failed catalog Error: Could not retrieve catalog; skipping run{code}