Jira (PUP-10664) Puppet 6 should log connection error details when a functional puppet master cannot be located

Charlie Sharpsteen (Jira)

Sep 11, 2020, 4:55:04 PM
to puppe...@googlegroups.com
Charlie Sharpsteen created an issue
 
Puppet / Improvement PUP-10664
Puppet 6 should log connection error details when a functional puppet master cannot be located
Issue Type: Improvement
Affects Versions: PUP 6.18.0
Assignee: Unassigned
Created: 2020/09/11 1:54 PM
Priority: Normal
Reporter: Charlie Sharpsteen

At the beginning of each run, the Puppet agent performs a health check to locate a functional Puppet Server to make API calls to. If this health check fails, the run fails with the following message:

Error: Could not run Puppet configuration client: Could not select a functional puppet master from server_list: 'localhost:8140'

In order to facilitate troubleshooting, the messages logged at error level should include some detail of what happened to the health check.

Reproduction Case

  • Install Puppet 6 on CentOS 7:

yum install -y http://yum.puppetlabs.com/puppet6-release-el-6.noarch.rpm
yum install -y puppetserver

  • Configure the agent to check in locally and bootstrap the Puppet Server CA:

source /etc/profile.d/puppet-agent.sh
 
puppet config set server $(hostname -f)
puppetserver ca setup
 
systemctl start puppetserver

  • Provoke a health check failure by running the Puppet agent with the server URL set to localhost:

puppet agent -t --server_list=localhost:8140

Outcome

The error message is very terse and just states that a healthy server could not be found:

# puppet --version
6.18.0
 
# puppet agent -t --server_list=localhost:8140
Error: Could not run Puppet configuration client: Could not select a functional puppet master from server_list: 'localhost:8140'

Expected Outcome

Raising the log level to DEBUG reveals that the health check failed due to an SSL validation error. These details should be logged at ERROR level so that the root cause of connection failures is visible for post-mortem debugging:

# puppet agent -t --server_list=localhost:8140 --debug
...
Debug: Unable to connect to server from server_list setting: Server hostname 'localhost' did not match server certificate; expected one of dull-sanatorium.delivery.puppetlabs.net, DNS:puppet, DNS:dull-sanatorium.delivery.puppetlabs.net
...
Error: Could not run Puppet configuration client: Could not select a functional puppet master from server_list: 'localhost:8140'
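
Until such details are logged at error level, the health check can be approximated by hand. The following is a minimal sketch, not Puppet's own code: `status_url` and `probe_server` are hypothetical helper names, the status path is the one that appears in Puppet 6 agent debug logs (`/status/v1/simple/master`), and the CA certificate path assumes the default agent `ssldir` for a root-run install.

```shell
#!/bin/sh
# Hypothetical helpers for reproducing the agent's health check manually.

# Build the status URL for one host:port entry from server_list.
status_url() {
  printf 'https://%s/status/v1/simple/master\n' "$1"
}

# Probe one server with curl. TLS and connection errors go to stderr.
# Assumption: the CA certificate lives in the default agent ssldir.
probe_server() {
  curl --silent --show-error --fail \
       --cacert /etc/puppetlabs/puppet/ssl/certs/ca.pem \
       "$(status_url "$1")"
}
```

Running `probe_server localhost:8140` against the setup above would be expected to fail with a certificate hostname error from curl, which mirrors the detail that the agent currently reports only at debug level.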

Charlie Sharpsteen (Jira)

Sep 11, 2020, 4:56:04 PM
to puppe...@googlegroups.com

Charlie Sharpsteen (Jira)

Sep 11, 2020, 5:11:04 PM
to puppe...@googlegroups.com
Charlie Sharpsteen updated an issue
At the beginning of each run, the Puppet agent performs a health check to locate a functional Puppet Server to make API calls to. If this health check fails, the run fails with the following message:

{noformat}

Error: Could not run Puppet configuration client: Could not select a functional puppet master from server_list: 'localhost:8140'
{noformat}


In order to facilitate troubleshooting, the messages logged at error level should include some detail of what happened to the health check.

h2. Reproduction Case

  - Install Puppet 6 on CentOS 7:

{code:bash}

yum install -y http://yum.puppetlabs.com/puppet6-release-el-6.noarch.rpm
yum install -y puppetserver
{code}


  - Configure the agent to check in locally and bootstrap the Puppet Server CA:

{code:bash}
source /etc/profile.d/puppet-agent.sh

puppet config set server $(hostname -f)
puppetserver ca setup

systemctl start puppetserver
{code}

  - Provoke a health check failure by running the Puppet agent with the server URL set to {{localhost}}:

{code:bash}

puppet agent -t --server_list=localhost:8140
{code}

h3. Outcome


The error message is very terse and just states that a healthy server could not be found:

{noformat}

# puppet --version
6.18.0

# puppet agent -t --server_list=localhost:8140
Error: Could not run Puppet configuration client: Could not select a functional puppet master from server_list: 'localhost:8140'
{noformat}

h3. Expected Outcome


Raising the log level to DEBUG reveals that the health check failed due to an SSL validation error. These details should be logged at ERROR level so that the root cause of connection failures is visible for post-mortem debugging:

{noformat}

# puppet agent -t --server_list=localhost:8140 --debug
...
Debug: Unable to connect to server from server_list setting: Server hostname 'localhost' did not match server certificate; expected one of dull-sanatorium.delivery.puppetlabs.net, DNS:puppet, DNS:dull-sanatorium.delivery.puppetlabs.net
...
Error: Could not run Puppet configuration client: Could not select a functional puppet master from server_list: 'localhost:8140'
{noformat}

Charlie Sharpsteen (Jira)

Sep 11, 2020, 5:12:03 PM
to puppe...@googlegroups.com

Josh Cooper (Jira)

Sep 14, 2020, 10:59:04 AM
to puppe...@googlegroups.com

Josh Cooper (Jira)

Sep 14, 2020, 8:45:03 PM
to puppe...@googlegroups.com
Josh Cooper commented on Improvement PUP-10664
 
Re: Puppet 6 should log connection error details when a functional puppet master cannot be located

There are two parts to this:

1. The current intended behavior is to not log errors if we eventually find a server to connect to. In other words, only log the exceptions at the error level if we exhaust the server list:

$ bx puppet agent -t --server_list localhost:8141,localhost:8140
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
..

But if you run with debugging you'll see the first failure followed by the second success:

$ bx puppet agent -td --server_list localhost:8141,localhost:8140
...
Debug: Resolving service 'puppet' using Puppet::HTTP::Resolver::ServerList
Debug: Creating new connection for https://localhost:8141
Debug: Starting connection for https://localhost:8141
Debug: Unable to connect to server from server_list setting: Request to https://localhost:8141/status/v1/simple/master failed after 0.002 seconds: Failed to open TCP connection to localhost:8141 (Connection refused - connect(2) for "localhost" port 8141)
Debug: Creating new connection for https://localhost:8140
Debug: Starting connection for https://localhost:8140
Debug: Using TLSv1.2 with cipher DHE-RSA-AES128-GCM-SHA256
Debug: HTTP GET https://localhost:8140/status/v1/simple/master returned 200 OK
Debug: Caching connection for https://localhost:8140
Debug: Resolved service 'puppet' to https://localhost:8140/puppet/v3
...

2. If the server_list is exhausted, the code that would normally log the exceptions at error level is bypassed.

To confirm the expected behavior, are you asking for Puppet to log at error level any/all exceptions that occur during server list resolution, even if the resolution is eventually successful for that run? Or should it only log the exceptions if resolution fails?

Josh Cooper (Jira)

Sep 14, 2020, 8:46:04 PM
to puppe...@googlegroups.com

Josh Cooper (Jira)

Sep 14, 2020, 9:27:03 PM
to puppe...@googlegroups.com

Charlie Sharpsteen (Jira)

Sep 15, 2020, 2:07:04 PM
to puppe...@googlegroups.com
Charlie Sharpsteen commented on Improvement PUP-10664
 
Re: Puppet 6 should log connection error details when a functional puppet master cannot be located

Yes, I think we should log all errors that result in the agent failing over to the next server in the list even if the run is ultimately successful. The error messages have important context that may be needed to restore the first server in the list to a healthy state.

Josh Cooper (Jira)

Sep 16, 2020, 1:13:04 PM
to puppe...@googlegroups.com

Josh Cooper (Jira)

Sep 28, 2020, 2:41:04 PM
to puppe...@googlegroups.com

Josh Cooper (Jira)

Sep 28, 2020, 2:45:04 PM
to puppe...@googlegroups.com
Josh Cooper updated an issue
Change By: Josh Cooper
Release Notes: Bug Fix
Release Notes Summary: Puppet agents now always log errors that occur when trying to connect to each server in their "server_list" setting at the "err" level. Previously, the errors were only logged at the "debug" level, or at the "err" level if no servers were available.
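
The failover behavior described in this release note can be modeled roughly as follows. This is an illustrative shell sketch, not Puppet's Ruby implementation: `select_server` and `fake_probe` are hypothetical names, the probe stands in for the real HTTPS status request, and the exact wording of the error-level messages is an assumption based on the debug output quoted earlier in this thread.

```shell
#!/bin/sh
# Illustrative model only; Puppet's real resolver is written in Ruby.

# Stand-in probe (hypothetical): succeeds only for localhost:8140.
fake_probe() {
  if [ "$1" = "localhost:8140" ]; then
    return 0
  fi
  echo "Connection refused for $1"
  return 1
}

# select_server LIST PROBE
#   LIST:  comma-separated host:port entries, like the server_list setting
#   PROBE: command that takes one entry, prints failure detail, and
#          returns non-zero when the server is unusable
select_server() {
  server_list=$1
  probe=$2
  old_ifs=$IFS
  IFS=','
  set -- $server_list        # split the list on commas
  IFS=$old_ifs
  for entry in "$@"; do
    if detail=$("$probe" "$entry" 2>&1); then
      printf '%s\n' "$entry" # first healthy server wins
      return 0
    fi
    # Report every failover at error level, even if a later server works.
    printf 'Error: Unable to connect to server from server_list setting: %s\n' "$detail" >&2
  done
  printf "Error: Could not select a functional puppet master from server_list: '%s'\n" "$server_list" >&2
  return 1
}
```

With the stand-in probe, `select_server "localhost:8141,localhost:8140" fake_probe` prints `localhost:8140` on stdout and one error line for `localhost:8141` on stderr, so the failed first server stays visible even though the run succeeds.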

Claire Cadman (Jira)

Oct 12, 2020, 9:15:02 AM
to puppe...@googlegroups.com