I've been noticing some issues with my app and have traced them back
to how the urlfetch API caches responses. Looking at the headers, I
see that Google Web Accelerator (GWA) is used as a proxy to handle
response caching:
{'Content-Length': '14692', 'Via': 'HTTP/1.1 GWA (remote cache hit)',
'Age': ' 20828', 'X-Google-Cache-Control': 'remote-cache-hit', 'Vary':
'Cookie,Accept-Encoding', 'Server': 'nginx/0.5.30', 'Connection':
'keep-alive', 'Date': 'Mon, 25 Aug 2008 23:03:52 GMT', 'Content-Type':
'application/xml'}
The problem is that the actual URL I'm trying to retrieve was updated
several hours ago, yet AE/GWA is still getting cache hits (as
indicated by the "X-Google-Cache-Control" header). The response also
comes back with a 200 HTTP status code, so there's little clue that a
cached version is being used.
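For what it's worth, the cached responses can at least be flagged from
the headers alone. This is only a minimal sketch: looks_cached is a
made-up helper name, and it assumes the Via, X-Google-Cache-Control and
Age headers always look like they do in the sample above, which I
haven't confirmed.

```python
def looks_cached(headers):
    """Guess whether a response was served from a proxy cache.

    Assumption: the proxy marks hits the way the sample headers do
    ('cache hit' in Via, 'remote-cache-hit' in X-Google-Cache-Control,
    or a non-zero Age). Other proxies may behave differently.
    """
    # Header names are case-insensitive; values may carry stray spaces.
    h = {k.lower(): v.strip() for k, v in headers.items()}
    via_hit = "cache hit" in h.get("via", "").lower()
    gwa_hit = h.get("x-google-cache-control", "").lower() == "remote-cache-hit"
    try:
        age = int(h.get("age", "0"))
    except ValueError:
        age = 0  # unparseable Age: treat it as absent
    return via_hit or gwa_hit or age > 0


# A subset of the headers from the response above
sample = {
    "Via": "HTTP/1.1 GWA (remote cache hit)",
    "Age": " 20828",
    "X-Google-Cache-Control": "remote-cache-hit",
}
print(looks_cached(sample))
```

Inside the app this would be called on the headers of the urlfetch
result, so at least the stale case could be logged.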
The GWA cache may well have been updated by the time you read this,
but here's a test AE app making the request:
http://urltest.appspot.com/
and here's the actual URL:
http://lazytweet.disqus.com/c/23350/comments.rss
There is one more item from today in the disqus feed that is not yet
reflected in the AE/GWA response.
I've added Pragma and Cache-Control headers to the request with
"no-cache" as the value, and it didn't help. The Disqus response
headers don't help much either: there is no Last-Modified header or
any other caching hint. But is there a way on the AE/GWA end to tell
it to go check for the latest version?
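One workaround I can think of is a cache-busting query parameter, so
that every request looks like a new URL to the proxy. This is only a
sketch under assumptions: add_cache_buster and the "_cb" parameter name
are made up for illustration, and I have not confirmed that GWA keys
its cache on the full URL or that the origin server ignores the extra
parameter.

```python
import time
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# The request headers I already tried; on their own they did not help.
NO_CACHE_HEADERS = {"Cache-Control": "no-cache", "Pragma": "no-cache"}


def add_cache_buster(url, param="_cb", value=None):
    """Return url with a changing query parameter appended.

    'param' is a hypothetical name; any key the origin server ignores
    would do. By default the value is the current Unix timestamp.
    """
    if value is None:
        value = str(int(time.time()))
    parts = urlsplit(url)
    # Keep existing query parameters, dropping any earlier cache buster.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    query.append((param, value))
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), parts.fragment))


fresh_url = add_cache_buster("http://lazytweet.disqus.com/c/23350/comments.rss")
# In the App Engine app the request would then be something like:
#   result = urlfetch.fetch(fresh_url, headers=NO_CACHE_HEADERS)
```

The downside is that this bypasses the cache on every request, so it
would be better to use it only when a fresh copy really matters.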