Confused Puppet Manifest ... Possible caching issue?

Matt Wise

Aug 14, 2014, 1:25:06 PM
to puppet...@googlegroups.com
I've got a pretty strange issue here. Imagine we have two servers... ServerA and ServerB. Last night ServerB pulled down some configuration bits from our puppet servers and tried to re-name itself to ServerA.

How? Well, there are two things that may have triggered this behavior.

1. We use a custom Puppet Node Name fact to set our node names, rather than the hostnames:

[main]
...
    # Use the fact 'puppet_node' as our node classifier rather than the hostname.
    node_name = facter
    node_name_fact = puppet_node

2. We have Nginx proxy_cache all of our GET/HEAD requests to avoid hammering the Puppet Master processes with calls to the mostly static content like templates:

        # Never, ever, ever cache our certificate or API requests... always pass them to the puppet master.
        location ~ /(.*)/certificate(.*)/(.*)$ { proxy_pass http://unicorn; }
        # If a request comes in for the 'master' environment, do not cache it at all.
        location ~ /master/(.*)$ { proxy_pass http://unicorn; }
        location / {
            # Cache all requests to the Puppet Unicorn process for at least 10 minutes.
            proxy_cache nginx;
            proxy_cache_methods GET HEAD;
            proxy_cache_key "$scheme$proxy_host$request_uri";
            proxy_cache_valid 10m;
            proxy_cache_valid 404     1m;
            proxy_ignore_headers X-Accel-Expires Expires Cache-Control Set-Cookie;
            proxy_pass http://unicorn;
        }

Digging into the logs, it looks like we're caching a bit too much and are actually caching the /<env>/node/<puppet node name> queries. Here you can see that we generate the results once, then return cached results on the next several queries:

"GET /production/node/nsp_node_prod? HTTP/1.1" 200 13834 "-" "-" 0.021
"GET /production/node/nsp_node_prod? HTTP/1.1" 200 13834 "-" "-" 0.000
"GET /production/node/nsp_node_prod? HTTP/1.1" 200 13834 "-" "-" 0.000
"GET /production/node/nsp_node_prod? HTTP/1.1" 200 13834 "-" "-" 0.000
"GET /production/node/nsp_node_prod? HTTP/1.1" 200 13834 "-" "-" 0.000
"GET /production/node/nsp_node_prod? HTTP/1.1" 200 13834 "-" "-" 0.000
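
One way to keep these node lookups out of the cache is a sketch like the following, modeled on the certificate rule in the config above. It assumes the node queries always have the form /<env>/node/<puppet node name>. The regex is illustrative and has not been tested against this setup.

```nginx
        # Sketch only: pass node lookups straight to the puppet master, uncached.
        # Assumes URLs of the form /<env>/node/<puppet node name>.
        # nginx tries regex locations before falling back to the "location /"
        # prefix block, so matching requests never reach the proxy_cache rules.
        location ~ ^/[^/]+/node/ { proxy_pass http://unicorn; }
```

Because the block carries no proxy_cache directive, every matching request is generated fresh by the Unicorn backend.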

So, I have two questions ..

1. What is the purpose of calling the Node API? Is the agent doing this? Why?
2. Is it possible that if an agent called the Node API and got wrong information back as "its own node information", it could then request an invalid catalog?

(Note, we're running Puppet 3.4.3 behind Nginx with Unicorn... and yes, even though we use a single node name for these machines, they use different 'facts' to define which packages and roles they are serving up...)

Matt Wise
Sr. Systems Architect
Nextdoor.com

Matt W

Aug 22, 2014, 10:38:20 AM
to puppet...@googlegroups.com
Even with the caching disabled, I think we ran into this again. Can one of the puppet-devs chime in here and let me know what might be going on?

jcbollinger

Aug 25, 2014, 10:29:06 AM
to puppet...@googlegroups.com


On Friday, August 22, 2014 9:38:20 AM UTC-5, Matt W wrote:
> Even with the caching disabled, I think we ran into this again. Can one of the puppet-devs chime in here and let me know what might be going on?



I am not among the Puppet developers, but I think I already touched on the likely problem in your other thread.  You have multiple nodes identifying themselves to Puppet as the same machine, and if you rely on facts that differ among identity-sharing nodes then you are poking at exactly the point where your shared-identity model breaks down.

Even so, I think your approach would probably work if you serialized catalog requests, e.g. by using the built-in webrick server, since it seems likely that you are experiencing a race on the server.  Specifically, I suspect you'll find that those calls to the REST API are all originating from the master itself.  If an ENC is in use then it would be high on my list of suspects.


John

Matt Wise

Aug 25, 2014, 11:53:37 AM
to puppet...@googlegroups.com
It's tricky because we use an ELB in front of the puppet masters, and we know that the calls to the /node/<node_name> REST API are coming from the ELB, but because of the way we have the ELB configured (pure TCP passthrough), we don't get extra headers like X-Forwarded-For. This makes it hard to tell where the requests for the node information are coming from. That said, it feels odd that the puppet master itself would reach out to its own Node API to get node information, rather than just using the information passed in for the catalog request.

Matt Wise
Sr. Systems Architect
Nextdoor.com


--
You received this message because you are subscribed to a topic in the Google Groups "Puppet Users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/puppet-users/EorzYWGEUUE/unsubscribe.
To unsubscribe from this group and all its topics, send an email to puppet-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-users/17f251ea-b694-4c65-9b92-7150b693ba3e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Felix Frank

Aug 25, 2014, 5:08:39 PM
to puppet...@googlegroups.com
On 08/14/2014 07:24 PM, Matt Wise wrote:
>
> 1. What is the purpose of calling the Node API? Is the agent doing
> this? Why?

That's a good one. Does your log not indicate where those calls originate?

Matt Wise

Aug 25, 2014, 5:17:14 PM
to puppet...@googlegroups.com
The log shows the remote connecting IP -- but the IP is the ELB in front of our puppet servers. Unfortunately because we're doing pure TCP-passthrough, ELB logging itself is not useful either in this case. :/

Matt Wise
Sr. Systems Architect
Nextdoor.com



Felix Frank

Aug 25, 2014, 5:32:58 PM
to puppet...@googlegroups.com
On 08/25/2014 11:17 PM, Matt Wise wrote:
> The log shows the remote connecting IP -- but the IP is the ELB in
> front of our puppet servers. Unfortunately because we're doing pure
> TCP-passthrough, ELB logging itself is not useful either in this case. :/
Uhm I see. Bummer.

Is that a Linux box? Could you resort to TPROXY to pass the client IP
through? =)
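
A different technique from TPROXY, not raised in this thread, is the PROXY protocol, which ELB can prepend to TCP connections and which nginx 1.5.12 and later can read. A rough sketch follows. It assumes the PROXY protocol policy is enabled on the ELB listener. The port, log format name, and log path are hypothetical.

```nginx
# Sketch only. Assumption: the ELB listener has the PROXY protocol policy
# enabled. If it does not, nginx will reject connections on this listener.
http {
    # Log the client address carried in the PROXY protocol header next to
    # the address of the ELB that opened the TCP connection.
    log_format puppet_proxy '$proxy_protocol_addr (via $remote_addr) '
                            '"$request" $status $body_bytes_sent $request_time';

    server {
        # Port 8140 is an assumption (Puppet's default master port).
        listen 8140 ssl proxy_protocol;

        # Hypothetical log path; existing ssl_* directives and location
        # blocks would stay as they are.
        access_log /var/log/nginx/puppet_access.log puppet_proxy;
    }
}
```

With this in place, each /node/ request in the access log would show the originating address rather than only the ELB's, which would indicate whether the calls come from agents or from the master itself.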