Does get_url prevent caching?


Joshua J. Kugler

Feb 2, 2016, 8:59:06 PM
to Ansible Project
I am trying to download some ISOs to multiple machines (via a proxy to
conserve bandwidth).

The ISO is being stored by the proxy, and the machine is using the proxy, but
each run downloads the file from the upstream source again. Squid shows this in
its logs:

1454462392.008 532579 192.168.122.10 TCP_CLIENT_REFRESH_MISS/200 632291702 GET
http://mirrors.kernel.org/centos/7.2.1511/isos/x86_64/CentOS-7-x86_64-Minimal-1511.iso - HIER_DIRECT/198.145.20.143 application/octet-stream

According to the squid docs:

TCP_CLIENT_REFRESH_MISS
The client issued a "no-cache" pragma, or some analogous cache control command
along with the request. Thus, the cache has to refetch the object.

Using a standard wget (or, say, yum to retrieve packages) does not cause
TCP_CLIENT_REFRESH_MISSes.

Is there something in the get_url code that causes it to send a no-cache
pragma? Or is it perhaps not turning off some default option in the underlying
urllib (or whatever it uses under the hood)?
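
For reference, the task is essentially just this (a minimal sketch; the dest
path and proxy address are placeholders, not the real values):

- name: Download the CentOS ISO through the proxy
  get_url:
    url: http://mirrors.kernel.org/centos/7.2.1511/isos/x86_64/CentOS-7-x86_64-Minimal-1511.iso
    dest: /var/tmp/CentOS-7-x86_64-Minimal-1511.iso
  environment:
    # placeholder address; the real value is wherever Squid is listening
    http_proxy: http://192.168.122.1:3128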

j

--
Joshua J. Kugler - Fairbanks, Alaska
Azariah Enterprises - Programming and Website Design
jos...@azariah.com - Jabber: peda...@gmail.com
PGP Key: http://pgp.mit.edu/ ID 0x73B13B6A

Andrew Edelstein

Aug 10, 2016, 6:37:52 PM
to ansible...@googlegroups.com
While I'd be interested to know the answer to this as well, why don't you just download the ISO to a local machine, then have your Ansible play grab the ISO from that machine?
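
Something along these lines, roughly (hostnames and paths are just placeholders):

- name: Fetch the ISO once on the control machine
  get_url:
    url: http://mirrors.kernel.org/centos/7.2.1511/isos/x86_64/CentOS-7-x86_64-Minimal-1511.iso
    dest: /tmp/CentOS-7-x86_64-Minimal-1511.iso
  delegate_to: localhost
  run_once: true

- name: Push the ISO out to the target machines
  copy:
    src: /tmp/CentOS-7-x86_64-Minimal-1511.iso
    dest: /var/tmp/CentOS-7-x86_64-Minimal-1511.iso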


Richard James Salts

Aug 10, 2016, 11:49:07 PM
to ansible...@googlegroups.com

On 03/02/16 12:58, Joshua J. Kugler wrote:
> Is there something in the get_url code that causes it to send a no-cache
> pragma? Or is it perhaps not turning off some default option in the
> underlying urllib (or whatever it uses under the hood)?
I'd say it has nothing to do with get_url, or with the fact that Ansible is
involved at all. It's more likely the configuration of Squid, particularly
http://www.squid-cache.org/Doc/config/maximum_object_size/.
The default is 4 MB, which is far smaller than the ~600 MB CentOS ISO, so
Squid never caches the object in the first place.
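
If that's what's happening, raising the limit in squid.conf should let the
second and subsequent downloads come from the cache. A rough sketch of doing
that from Ansible (config path and size are assumptions, adjust for your setup):

- name: Let Squid cache large objects such as ISOs
  lineinfile:
    dest: /etc/squid/squid.conf
    regexp: '^maximum_object_size '
    line: 'maximum_object_size 1 GB'
  # Squid needs a reload/restart afterwards for the new limit to take effect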
