Re: Apache mod_cache/mod_disk_cache...?

Matt W

Jun 13, 2012, 1:29:54 PM
to puppet...@googlegroups.com
Any thoughts on this problem? With caching enabled, we could significantly reduce the response times and CPU time for file requests. We'd even be able to cache the Puppet manifests for short periods when clients request them over and over (e.g., during first boot/configuration of a system). I think the main question here is: why does Puppet add a ? to the end of each file-download request even if there is no metadata attached?

On Saturday, June 9, 2012 8:37:10 PM UTC-4, Matt W wrote:
Is anybody using mod_cache/mod_disk_cache with Puppet? I found a post talking about it here (http://paperairoplane.net/?p=380) and tried to implement it, but found that nothing was being cached. As near as I can tell, Apache refuses to cache any URL that has a query string attached to it:


• If the URL included a query string (e.g. from a HTML form GET method) it will not be cached unless the response specifies an explicit expiration by including an "Expires:" header or the max-age or s-maxage directive of the "Cache-Control:" header, as per RFC2616 sections 13.9 and 13.2.1.

However, when you look at the mod_cache doc itself, it says:
Ordinarily, requests with query string parameters are cached separately for each unique query string. This is according to RFC 2616/13.9 done only if an expiration time is specified. The CacheIgnoreQueryString directive tells the cache to cache requests even if no expiration time is specified, and to reply with a cached reply even if the query string differs. From a caching point of view the request is treated as if having no query string when this directive is enabled.

These two things seem at odds with each other. When I turn CacheIgnoreQueryString On in Apache, the caching starts to work, but as I understand it, it means that a request for /foo.sh?bar will be cached and return the same result as /foo.sh?xyz, so the query string is ignored completely. However, if I leave it off, I get no caching at all, because Puppet seems to make every single file request with a ? attached to it:

Jun 10 00:17:59.000000 puppetmaster-20372704.cloud.XYZ.com apache: puppetmaster-20372704.cloud.XYZ.com:443 204.236.165.198 - - - puppet.XYZ.com:8140 "GET /production/file_metadata/modules/zk/ssl/cacert.pem? HTTP/1.1" 200 330 "-" "-" 0/6260
Jun 10 00:17:59.000000 puppetmaster-20372704.cloud.XYZ.com apache: puppetmaster-20372704.cloud.XYZ.com:443 204.236.165.198 - - - puppet.XYZ.com:8140 "GET /production/file_metadata/modules/zk/ssl/zookeeper.XYZ.com.key? HTTP/1.1" 200 346 "-" "-" 0/4499
Jun 10 00:17:59.000000 puppetmaster-20372704.cloud.XYZ.com apache: puppetmaster-20372704.cloud.XYZ.com:443 204.236.165.198 - - - puppet.XYZ.com:8140 "GET /production/file_metadata/modules/stunnel/stunnel? HTTP/1.1" 200 328 "-" "-" 0/4703
Jun 10 00:18:00.000000 puppetmaster-20372704.cloud.XYZ.com apache: puppetmaster-20372704.cloud.XYZ.com:443 204.236.165.198 - - - puppet.XYZ.com:8140 "GET /production/file_metadatas/modules/zk/code?&recurse=true&links=manage&checksum_type=md5& HTTP/1.1" 200 660 "-" "-" 0/7805
Jun 10 00:18:02.000000 puppetmaster-20372704.cloud.XYZ.com apache: puppetmaster-20372704.cloud.XYZ.com:443 204.236.165.198 - - - puppet.XYZ.com:8140 "GET /production/file_metadata/modules/zk/upstart? HTTP/1.1" 200 323 "-" "-" 0/4843
Jun 10 00:18:03.000000 puppetmaster-20372704.cloud.XYZ.com apache: puppetmaster-20372704.cloud.XYZ.com:443 204.236.165.198 - - - puppet.XYZ.com:8140 "GET /production/file_metadatas/modules/prod_ve/certs?&recurse=true&links=manage&checksum_type=md5& HTTP/1.1" 200 2765 "-" "-" 0/16361

If Puppet did not put the ? at the end of the URL, I think Apache would cache these requests, but obviously that would still not let me cache the catalogs. Any thoughts?

—Matt
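
The two documented behaviours quoted above can be modelled with a minimal Python sketch. This is a simplified model, not Apache's actual logic: the function name `cache_decision` and its parameters are hypothetical, and it assumes (as the logs above suggest) that a bare trailing `?` is treated as a query string.

```python
def cache_decision(url, has_explicit_expiry, ignore_query_string=False):
    """Return (cacheable, cache_key) under the simplified rules quoted above.

    Hypothetical model for illustration only; real mod_cache applies many
    more checks (status code, headers, size limits, and so on).
    """
    path, sep, _query = url.partition("?")
    # Assumption: a bare trailing "?" already counts as "has a query string".
    has_query = sep == "?"

    if ignore_query_string:
        # Models CacheIgnoreQueryString On: the query string is dropped from
        # the cache key and no explicit expiry is required.
        return True, path

    if has_query and not has_explicit_expiry:
        # Models the RFC 2616 13.9 rule: query-string URLs are only cached
        # when the response carries an explicit expiration.
        return False, None

    # Otherwise the full URL, including any query string, is the cache key.
    return True, url


print(cache_decision("/production/file_metadata/modules/zk/upstart?", False))
print(cache_decision("/foo.sh?bar", False, ignore_query_string=True))
print(cache_decision("/foo.sh?xyz", False, ignore_query_string=True))
```

Under this model, the trailing `?` alone is enough to block caching when the response has no expiry, while turning the directive on makes `/foo.sh?bar` and `/foo.sh?xyz` share a single cache entry.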


Jeff McCune

Jun 13, 2012, 8:15:36 PM
to puppet...@googlegroups.com
On Wed, Jun 13, 2012 at 10:29 AM, Matt W <ma...@nextdoor.com> wrote:
> Any thoughts on this problem? With caching enabled, we could
> significantly reduce the response times and CPU time for file requests.
> We'd even be able to cache the Puppet manifests for short periods when
> clients request them over and over (e.g., during first
> boot/configuration of a system). I think the main question here is: why
> does Puppet add a ? to the end of each file-download request even if
> there is no metadata attached?

In Puppet 2.6 and earlier, all of the Facter data for a node was passed
to the master as a query parameter when the catalog was requested.
We changed this in Puppet 2.7 to POST the data, to avoid the length
limit on the GET URI. As you see in your logs for file metadata, we
also pass resource parameters, such as recurse=true, directly in the
URI.

What's likely happening is that we always set the query string for all
REST API requests, even if the parameters are an empty hash or the
like.
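
As a rough illustration of how that could produce the trailing ? in the logs, here is a minimal Python sketch. It is not Puppet's code (Puppet is written in Ruby, and the real request builder is not shown in this thread); the function names `build_uri_always` and `build_uri_if_needed` are hypothetical.

```python
from urllib.parse import urlencode


def build_uri_always(path, params):
    # Hypothetical builder that appends the query string unconditionally.
    # With an empty params dict, urlencode returns "" and the URI ends in "?".
    return path + "?" + urlencode(params)


def build_uri_if_needed(path, params):
    # Alternative sketch: only append "?" when there is something to send.
    query = urlencode(params)
    return f"{path}?{query}" if query else path


path = "/production/file_metadata/modules/zk/upstart"
print(build_uri_always(path, {}))
print(build_uri_if_needed(path, {}))
print(build_uri_if_needed(path, {"recurse": "true", "links": "manage"}))
```

The first call yields a URI with a bare trailing ?, matching the shape of the file_metadata requests in the logs; the second yields the plain path.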

You might want to consider the caching in nginx. I've successfully
deployed this configuration at a large customer site and it does
respect query parameters.

-Jeff