|
The 4 requests are due to a bug. We first make a HEAD request to get the Content-MD5 or Last-Modified "checksums". Note the server must return one of these headers, otherwise puppet downloads the file every time it runs. Puppet also doesn't support ETag.
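To make the header requirement concrete, here is a rough sketch of how a "checksum" could be picked from the HEAD response headers. This is not puppet's actual code: `checksum_from_headers` and the returned hash shape are hypothetical names for illustration. It assumes Content-MD5 carries the base64-encoded MD5 digest (per RFC 1864) and Last-Modified is an HTTP date.

```ruby
require 'time'

# Hypothetical helper (not puppet's API): choose a "checksum" from the
# headers of a HEAD response. Returns nil when the server sent neither
# header, in which case the file would be downloaded on every run.
def checksum_from_headers(headers)
  # Header names are case-insensitive, so normalize the keys first.
  h = headers.each_with_object({}) { |(k, v), acc| acc[k.to_s.downcase] = v }

  if (md5 = h['content-md5'])
    # Content-MD5 is base64; decode it and convert to a hex digest.
    { type: :md5, value: md5.unpack1('m').unpack1('H*') }
  elsif (modified = h['last-modified'])
    # Weaker fallback: a timestamp, not a content checksum.
    { type: :mtime, value: Time.httpdate(modified) }
  end
end
```

ETag is deliberately ignored here, matching the note above that puppet doesn't support it.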
For reasons unknown the code would make 2 HEAD requests. I eliminated the second request in https://github.com/puppetlabs/puppet/commit/deac457fe305b487614eb280fe9e67d518d13d97.
If the destination file is not "insync" (it's missing, "checksums" don't match, or the server didn't send a checksum), then we try to download the file contents. We make another HEAD request, apparently to see if we should be redirected, and then issue the GET. I don't believe this HEAD request is necessary, and it should probably be eliminated. It seems more correct to just issue the GET request and follow the redirect(s), if any.
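For illustration, issuing the GET directly and following redirects could look roughly like the sketch below. It is not the puppet implementation: `get_following_redirects`, the `limit` default, and the injectable `fetch` callable (there so the loop can be exercised without a network) are all assumptions of mine.

```ruby
require 'net/http'
require 'uri'

# Hypothetical sketch: GET a URL and follow redirects, with no HEAD
# request beforehand. `fetch` defaults to a plain Net::HTTP GET.
def get_following_redirects(url, limit: 10,
                            fetch: ->(uri) { Net::HTTP.get_response(uri) })
  uri = URI(url)
  # One initial request plus at most `limit` redirects.
  (limit + 1).times do
    response = fetch.call(uri)
    return response unless response.is_a?(Net::HTTPRedirection)

    location = response['location']
    raise 'redirect without a Location header' unless location

    # Location may be relative, so resolve it against the current URI.
    uri = URI.join(uri.to_s, location)
  end
  raise "too many redirects (limit #{limit})"
end
```

The redirect target comes from the GET response itself, so the extra HEAD round trip isn't needed to discover it.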
Whenever puppet downloads a file, it should write to a tempfile and verify that the downloaded content matches the expected checksum. Note if the HTTP server doesn't send the Content-MD5 header, then we have to fall back to Last-Modified, which can't confirm that the downloaded content is correct.
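The tempfile-then-verify idea can be sketched as follows. Again this is only an illustration under my own assumptions: `install_verified` is a hypothetical name, the content is passed in as a string rather than streamed, and the expected checksum is an MD5 hex digest.

```ruby
require 'digest'
require 'tempfile'

# Hypothetical sketch: write downloaded content to a tempfile next to the
# destination, verify its MD5, and only then move it into place.
def install_verified(content, expected_md5_hex, dest)
  # Same directory as dest, so the final rename stays on one filesystem.
  tmp = Tempfile.create('download', File.dirname(dest))
  path = tmp.path
  begin
    tmp.binmode
    tmp.write(content)
    tmp.close

    actual = Digest::MD5.file(path).hexdigest
    unless actual == expected_md5_hex
      raise "checksum mismatch: expected #{expected_md5_hex}, got #{actual}"
    end

    File.rename(path, dest)
  ensure
    tmp.close unless tmp.closed?
    # Clean up the tempfile if it was not renamed into place.
    File.unlink(path) if File.exist?(path)
  end
end
```

On a mismatch the existing destination file is left untouched. With only Last-Modified there is no digest to compare against, so this check can't be applied in that case.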
So for this ticket it sounds like there are two things:
- Eliminate the one extra HEAD request
- Improve documentation that the server must reply with Content-MD5 or Last-Modified in the HEAD response so that the agent doesn't download the file on every run.
And we can keep PUP-6380 open for how to handle servers like S3 that don't allow HEAD requests. Does that make sense?
|