The Problem
When using an agent-specified environment workflow and the requested environment does not exist, there is no way to halt the Puppet run early and prevent a catalog compilation. Additionally, the automatic switch to the "production" environment is unexpected and undesirable in an agent-specified environment workflow. This behavior exposes multiple issues:
- The agent gets a 404 from the file_metadatas endpoint, but it still submits a catalog request:
[root@agent7 ~]# puppet agent -t --environment fake --http_debug |
Info: Using environment 'fake' |
opening connection to server7.vagrant:8140... |
opened |
starting SSL for server7.vagrant:8140... |
SSL established, protocol: TLSv1.3, cipher: TLS_AES_128_GCM_SHA256 |
<- "GET /puppet/v3/file_metadatas/plugins?recurse=false&links=manage&checksum_type=sha256&source_per |
.5-p203 (x86_64-linux)\r\nAccept: application/json, text/pson\r\nAccept-Encoding: gzip;q=1.0,deflate |
-> "HTTP/1.1 404 Not Found\r\n" |
-> "Date: Mon, 31 Jan 2022 21:47:28 GMT\r\n" |
-> "Content-Type: application/json;charset=utf-8\r\n" |
-> "X-Puppet-Version: 7.14.0\r\n" |
-> "Content-Length: 87\r\n" |
-> "\r\n" |
reading 87 bytes... |
-> "{\"message\":\"Not Found: Could not find environment 'fake'\",\"issue_kind\":\"RUNTIME_ERROR\"}" |
read 87 bytes |
Conn keep-alive |
Notice: Environment 'fake' not found on server, skipping initial pluginsync. |
<- "POST /puppet/v3/catalog/agent7.vagrant?environment=fake HTTP/1.1\r\nX-Puppet-Version: 7.14.0\r\n |
- This puts unneeded load on the Puppetserver while it compiles a catalog.
- The server responds with a 200, which is odd considering the environment doesn't exist.
-> "HTTP/1.1 200 OK\r\n" |
-> "Date: Mon, 31 Jan 2022 21:47:28 GMT\r\n" |
-> "Content-Type: application/vnd.puppet.rich+json; charset=utf-8\r\n" |
-> "X-Puppet-Version: 7.14.0\r\n" |
- The agent then switches to the "production" environment, supposedly because that environment is server-specified. In my case, however, the external node classifier (ENC) is NOT specifying any environment at all.
Notice: Local environment: 'fake' doesn't match server specified environment 'production', restarting agent run with environment 'production' |
- Here's my ENC script used for testing:
#!/bin/bash |
cat <<EOF |
--- |
class: {} |
parameters: {} |
EOF |
- The agent then does pluginsync against the production environment and follows up with another catalog request, this time against production.
<- "POST /puppet/v3/catalog/agent7.vagrant?environment=production HTTP/1.1\r\nX-Puppet-Version: 7.14.0\r\nUser-Agent: Puppet/7.14.0 Ruby/2.7.5-p203 (x86_64-linux)\r\nAccept: application/vnd.puppet.rich+json, application/json, text/pson\r\nContent-Type: application/x-www-form-urlencoded\r\nAccept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3\r\nHost: server7.vagrant:8140\r\nContent-Length: 25797\r\n\r\n" |
- This puts even more unnecessary load on the Puppetserver.
- With --strict_environment_mode, the agent again gets the 404 from file_metadatas, but for some reason it still requests a catalog from the server.
[root@agent7 ~]# puppet agent -t --environment fake --http_debug --strict_environment_mode |
Info: Using environment 'fake' |
opening connection to server7.vagrant:8140... |
opened |
starting SSL for server7.vagrant:8140... |
SSL established, protocol: TLSv1.3, cipher: TLS_AES_128_GCM_SHA256 |
<- "GET /puppet/v3/file_metadatas/plugins?recurse=false&links=manage&checksum_type=sha256&source_permissions=ignore&environment=fake HTTP/1.1\r\nX-Puppet-Version: 7.14.0\r\nUser-Agent: Puppet/7.14.0 Ruby/2.7.5-p203 (x86_64-linux)\r\nAccept: application/json, text/pson\r\nAccept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3\r\nHost: server7.vagrant:8140\r\n\r\n" |
-> "HTTP/1.1 404 Not Found\r\n" |
-> "Date: Mon, 31 Jan 2022 22:03:16 GMT\r\n" |
-> "Content-Type: application/json;charset=utf-8\r\n" |
-> "X-Puppet-Version: 7.14.0\r\n" |
-> "Content-Length: 87\r\n" |
-> "\r\n" |
reading 87 bytes... |
-> "{\"message\":\"Not Found: Could not find environment 'fake'\",\"issue_kind\":\"RUNTIME_ERROR\"}" |
read 87 bytes |
Conn keep-alive |
Notice: Environment 'fake' not found on server, skipping initial pluginsync. |
<- "POST /puppet/v3/catalog/agent7.vagrant?environment=fake HTTP/1.1\r\nX-Puppet-Version: 7.14.0\r\nUser-Agent: Puppet/7.14.0 Ruby/2.7.5-p203 (x86_64-linux)\r\nAccept: application/vnd.puppet.rich+json, application/json, text/pson\r\nContent-Type: application/x-www-form-urlencoded\r\nAccept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3\r\nHost: server7.vagrant:8140\r\nContent-Length: 25793\r\n\r\n" |
- This puts unneeded load on the Puppetserver as it compiles a catalog.
- The agent again receives a 200 from the server after the catalog request, which is odd considering the environment doesn't exist:
-> "HTTP/1.1 200 OK\r\n" |
-> "Date: Mon, 31 Jan 2022 22:03:16 GMT\r\n" |
-> "Content-Type: application/vnd.puppet.rich+json; charset=utf-8\r\n" |
-> "X-Puppet-Version: 7.14.0\r\n" |
-> "Vary: Accept-Encoding, User-Agent\r\n" |
-> "Content-Encoding: gzip\r\n" |
-> "Content-Length: 316\r\n" |
-> "\r\n" |
- Then the agent gives up with a misleading error:
Error: Not using catalog because its environment 'production' does not match agent specified environment 'fake' and strict_environment_mode is set |
- This is misleading because the server is NOT specifying an environment (see the ENC script above).
- The real problem is that the "fake" environment doesn't exist.
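For contrast, here is a minimal sketch of what an ENC that does specify an environment might look like. It is not the script used in the test above; the function name `enc_with_environment` is hypothetical, and the output uses the `classes`, `parameters`, and `environment` keys from the ENC YAML format. With an ENC like this, a "server specified environment" message would be accurate; with the test ENC above, it is not.

```shell
#!/bin/bash
# Hypothetical ENC for contrast only: unlike the test ENC in this report,
# this one pins the node to an environment via the top-level
# "environment" key, which makes the environment server-specified.
enc_with_environment() {
  cat <<EOF
---
classes: {}
parameters: {}
environment: production
EOF
}

# An ENC prints its YAML to stdout; the node name passed as $1 is ignored here.
enc_with_environment
```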
Desired Behavior
When using an agent-specified environment workflow:
- The agent should not request a catalog after the initial 404 from the file_metadatas API (i.e. when pluginsync failed).
- Error messages for non-existent environments shouldn't assume you're using a server-specified environment.
The points above may be too specific to the current implementation, so a more generic way to phrase the desired behavior is something like:
- There should be an agent-side option to fail the Puppet run fast when the requested environment doesn't exist.
- That option should not cause a catalog compilation on the Puppetserver at all.
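Until such an option exists, the idea can be approximated outside the agent. The following is only a sketch of a wrapper-side preflight, not the requested agent feature: the function names are hypothetical, and it assumes that `curl` is available, that the agent's certificate is permitted to query the file_metadatas endpoint, and that `puppet config print hostcert`, `hostprivkey`, and `localcacert` return usable paths on the node. The query string is copied from the debug output above.

```shell
#!/bin/bash
# Sketch of an agent-side preflight; all function names are hypothetical.

# Map the HTTP status of the plugins metadata request to a decision.
# 404 is what the server returned for the missing 'fake' environment.
preflight_verdict() {
  case "$1" in
    200) echo "proceed" ;;
    404) echo "abort: environment not found" ;;
    *)   echo "abort: unexpected status $1" ;;
  esac
}

# Probe the endpoint the agent hits during pluginsync and print only the
# HTTP status code. Arguments: server name, environment name.
# Assumption: the agent certificate is allowed to make this request.
probe_environment() {
  local server="$1" environment="$2"
  curl --silent --output /dev/null --write-out '%{http_code}' \
    --cert "$(puppet config print hostcert)" \
    --key "$(puppet config print hostprivkey)" \
    --cacert "$(puppet config print localcacert)" \
    "https://${server}:8140/puppet/v3/file_metadatas/plugins?recurse=false&links=manage&checksum_type=sha256&source_permissions=ignore&environment=${environment}"
}

# Count catalog requests in a file of captured --http_debug output, to
# check that a run really caused no catalog compilation.
# "|| true" keeps the exit status 0 when grep finds no match (it still prints 0).
count_catalog_requests() {
  grep -c 'POST /puppet/v3/catalog/' "$1" || true
}
```

A wrapper could run `status=$(probe_environment server7.vagrant fake)` and invoke `puppet agent -t --environment fake` only when `preflight_verdict "$status"` prints `proceed`. Running `count_catalog_requests` against saved debug output should then report 0 for a non-existent environment.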
Related Info
This looks like it might be a regression in behavior related to PUP-10582, possibly introduced with the changes made for PUP-6802.