Show Metadata Cache cache_ok:false and a lot of file cache misses

Adrian Petre

Aug 17, 2014, 3:26:47 PM
to mod-pagesp...@googlegroups.com
I looked up pagespeed_admin/cache#show_metadata for the URL "http://adrhc.go.ro/mypersonalwebsite/#/.../.../30/asc/1" and got:
Metadata cache key:rname/aj_zELwxrfTIMCpEsrEcbeu/http://adrhc.go.ro/mypersonalwebsite/#/.../.../30/asc/1@@_
cache_ok:false
can_revalidate:false
partitions:

How can I find out why the cache is not OK?

I also notice a lot of file_cache_misses (60%). How can I find out why this happens?

Joshua Marantz

Aug 17, 2014, 8:34:59 PM
to mod-pagespeed-discuss
Sorry for the confusion; this is a new feature that we haven't even doc'd yet.

Metadata cache keys are entire rewritten ".pagespeed." URLs, e.g. http://modpagespeed.com/images/xBikeCrashIcn.png.pagespeed.ic.ueGwoh55pQ.webp

We don't have the capability to look up all the metadata entries associated with the original resource (even if there happens to be only one).  In this case, the original resource http://modpagespeed.com/images/BikeCrashIcn.png might have a metadata cache entry for the JPEG-transcoded version (for Firefox, IE, Safari) and another for the WebP-transcoded version (for Chrome, Opera, and Android).  So the "show metadata cache" UI requires you to put in a .pagespeed. URL.

Cache misses may happen because your cache is not working properly, or because your origin might employ a cache-busting technique for its URL generation, or your origin resources may be privately cached.  They may happen because of cache evictions as well, if the cache size is too small to hold your site's resources.

However cache misses can also happen when mod_pagespeed is simply learning a large site, and might diminish as you keep revisiting the same page.

-Josh


--
You received this message because you are subscribed to the Google Groups "mod-pagespeed-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mod-pagespeed-di...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/mod-pagespeed-discuss/8c2595f2-fbbd-4241-96c9-c739f115d7ff%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Adrian Petre

Aug 18, 2014, 3:30:26 AM
to mod-pagesp...@googlegroups.com
Is it possible to find out what the .pagespeed. URLs are?

Joshua Marantz

Aug 18, 2014, 8:19:45 AM
to mod-pagespeed-discuss
Normally you would get them from the HTML after URLs are rewritten, either via "View Source" or the Developer Tools in your browser.

If you are trying to diagnose why a URL is *not* rewritten then this obviously won't work.  I'd try turning on the "debug" filter and then examining the rewritten HTML.  We did a lot of work a couple of months ago to make the "debug" filter print more useful hints about why PageSpeed makes certain decisions.  Though these "debug" filter improvements are not released yet, I think you are using a recent trunk build, so you should have them.

Another thing you can do to figure out .pagespeed. URLs is to scan your file cache directory, but you will need to take into account the filename escaping we do so arbitrary URLs can be written out as files.  You might be able to figure it out just by looking at the filenames, or you can check the source here: https://code.google.com/p/modpagespeed/source/browse/trunk/src/pagespeed/kernel/util/url_to_filename_encoder.cc
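
As a rough illustration of the directory-scan approach, the following Python sketch walks a file cache directory and collects every relative path that contains ".pagespeed.". It assumes that the literal ".pagespeed." marker survives the filename escaping unchanged; it does not try to undo the escaping, so the exact rules still have to be read from url_to_filename_encoder.cc. The function name is hypothetical.

```python
import os


def find_pagespeed_entries(cache_root):
    """Return sorted relative paths under cache_root that contain '.pagespeed.'.

    Assumption: the '.pagespeed.' marker is not altered by the filename
    escaping. The returned paths are still in their escaped on-disk form.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), cache_root)
            if ".pagespeed." in rel:
                matches.append(rel)
    return sorted(matches)
```

Calling find_pagespeed_entries() with the directory configured as the file cache path gives a list of candidates to map back to URLs by hand.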

In any case, we are aware that the admin page user-interface for cache exploration needs a lot of work.  To make matters worse, you are looking at a version of that user interface that was probably being modified at the time you grabbed a snapshot of trunk to port to ARM.

Can I ask: what do you hope to achieve by examining the metadata cache?  We created this interface initially for ourselves but put it in the UI primarily so we could help people debug problems via email.

-Josh




Adrian Petre

Aug 18, 2014, 9:10:28 AM
to mod-pagesp...@googlegroups.com
I have the 17-Aug-2014 version (X-Mod-Pagespeed: 1.7.0.0-4161).
I'm trying to learn more about how the different cache types relate to each other (as presented at pagespeed_admin/graphs#cache_type): cache, file cache, LRU, shared memory. I have 53% file cache misses and I don't understand why (the space allocated is infinite), so I'm trying to find the problem by learning as much as I can. Besides running on ARM, I also have to use non-standard Linux paths, e.g. /ffp/var as the /var directory; I'm trying to understand whether this could influence the file cache misses. Occasionally I get "Failed to stat" errors, but I'm fairly sure they have nothing to do with the file cache misses.

Adrian Petre

Aug 18, 2014, 9:19:04 AM
to mod-pagesp...@googlegroups.com
In addition, I'm puzzled by the result of running http://www.webpagetest.org/compare against /mypersonalwebsite, which is already optimized with pagespeed. I'm comparing with the "Test mobile page" option, and Google's timing is far better (7x).
With http://www.webpagetest.org/result/140818_AY_HJQ/ I get:
B First Byte Time
A Keep-alive Enabled
A Compress Transfer
F Compress Images -> related only to my personal images (family album) not images on buttons or links
F Progressive JPEGs -> related only to my personal images (family album) not images on buttons or links
A Cache static content
X Effective use of CDN -> here only ask to put the images (family album) on a CDN
so I don't think the adjustments needed to get an overall A mark justify the 7x time difference reported by http://www.webpagetest.org/compare. Or do they?

Joshua Marantz

Aug 18, 2014, 9:21:02 AM
to mod-pagespeed-discuss
RE learning about caching in MPS, please read: http://modpagespeed.jmarantz.com/2012/12/caching-in-modpagespeed.html

RE 53% file cache misses: is this steady state?  Can you leave your server running and send load over your top 100 URLs with a script running recursive wget?  I would like to know the miss-rate under that load after 10 minutes of spinning that wget-recursive script.



Adrian Petre

Aug 18, 2014, 9:35:48 AM
to mod-pagesp...@googlegroups.com
I just let a page refresh every 7 s with the Chrome Developer Tools window open and "Disable cache (while DevTools is open)" checked.

Adrian Petre

Aug 18, 2014, 10:12:28 AM
to mod-pagesp...@googlegroups.com
After 30 minutes of a full page reload every 7 s, I get:

Cache hits = backend hits = 50%
File Cache (infinite):
hits 36.7%
inserts 3.1%
misses 60.2%
LRU hits = 57.88% -> I set it very small 512K
Shared Memory hits = 98.5% -> large (for my system) 10M
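
A quick arithmetic check on the file cache figures above, as a Python sketch. It assumes that the three percentages are shares of all file cache operations (they do sum to 100%) and that inserts are writes rather than lookups; both are assumptions about how the graph reports its numbers. Under those assumptions the miss rate per lookup comes out slightly higher than the 60.2% shown, at roughly 62%.

```python
def lookup_miss_rate(hits, misses):
    """Fraction of lookups (hits + misses) that missed."""
    lookups = hits + misses
    if lookups == 0:
        return 0.0
    return misses / lookups


# Shares reported for the file cache, in percent of all operations.
hits, inserts, misses = 36.7, 3.1, 60.2

print(f"miss rate per lookup: {lookup_miss_rate(hits, misses):.1%}")
```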

Adrian Petre

Aug 18, 2014, 10:14:11 AM
to mod-pagesp...@googlegroups.com
And no errors or warnings in pagespeed_admin/message_history.

Joshua Marantz

Aug 18, 2014, 11:21:04 AM
to mod-pagespeed-discuss
Briefly scanning your WPT results, I see that the URL http://adrhc.go.ro/fotografii/thumbs/2012-11-07_Amalia/DSC_9734.jpg/1405035003000 is served with this Cache-Control header:

  Cache-Control: max-age=31536000, public, must-revalidate

That "must-revalidate" means that PageSpeed cannot legally (by HTTP rules) serve an optimized version of it from its own cache, without validating it with the origin. So we consider it uncacheable, and I suspect this might be related to your cache misses.  Can you remove that "must-revalidate" attribute from your origin server for those images?

It's fine to serve the images with a shorter TTL, say 1 hour or 1 day, but if you force clients (in this case, mod_pagespeed) to check back with the server whether the image is up-to-date every time it's served then you are not going to get great performance.

-Josh



Adrian Petre

Aug 18, 2014, 11:29:16 AM
to mod-pagesp...@googlegroups.com
max-age=31536000, public, must-revalidate -> this means the resource becomes stale only after 1 year (max-age=31536000), and only then does the client have to check it (must-revalidate).

Why would pagespeed check back with the server on every request to see whether the image is up to date, rather than only after 1 year (max-age=31536000), as it should?

Adrian Petre

Aug 18, 2014, 11:37:19 AM
to mod-pagesp...@googlegroups.com
I also use ModPagespeedModifyCachingHeaders off.

Jeff Kaufman

Aug 18, 2014, 11:40:08 AM
to mod-pagespeed-discuss
Why did you set ModPagespeedModifyCachingHeaders off? That's almost
never a good idea. In the doc we write:

"""
We do not suggest you turn this option off. It breaks mod_pagespeed's
caching assumptions and can lead to unoptimized HTML being served from
a proxy caches set up in front of the server. If you do turn it off,
we suggest that you do not set long caching headers to HTML or users
may receive stale or unoptimized content.
"""

https://developers.google.com/speed/pagespeed/module/install#ModifyCachingHeaders




Jeff Kaufman

Aug 18, 2014, 11:41:07 AM
to mod-pagespeed-discuss
The "must-revalidate" in your Cache-Control header requires checking
back every time to look for changes.

Adrian Petre

Aug 18, 2014, 11:46:09 AM
to mod-pagesp...@googlegroups.com
Because I have:

ExpiresByType text/html "access plus 0 seconds"
<FilesMatch "\.(css|html?|jpe?g|json|js|bmp|gif|png|tiff|x-icon)$">
Header append Cache-Control "public, must-revalidate"
</FilesMatch>

user1 asks for xxx.html: gets a partially optimized xxx.html
user2 asks for xxx.html: gets a partially optimized xxx.html
...
usern asks for xxx.html: gets a fully optimized xxx.html

All users will get valid/correct HTML, but only some of them will get it fully optimized; that's not a problem. But with this config, the next time user1 asks for xxx.html the server won't have to send it again if it is not modified, because user1 already has it in the browser cache.

Adrian Petre

Aug 18, 2014, 11:51:32 AM
to mod-pagesp...@googlegroups.com
max-age=31536000, public, must-revalidate -> this means the resource becomes stale only after 1 year (max-age=31536000), and only then does the client have to check it (must-revalidate).
Why would pagespeed check back with the server on every request to see whether the image is up to date, rather than only after 1 year (max-age=31536000), as it should?

Do you know why?

Joshua Marantz

Aug 18, 2014, 12:25:06 PM
to mod-pagespeed-discuss
Actually I think Adrian is right.  See http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.4 and the debate on http://stackoverflow.com/questions/2932890/http-cache-control-max-age-must-revalidate

And in fact PageSpeed interprets it that way, and I was misremembering.  From pagespeed/kernel/http/caching_headers_test.cc:

  // must-revalidate does not imply uncacheability: it just means
  // that stale content should not be trusted.
  SetCacheControl("must-revalidate,max-age=600");
  EXPECT_FALSE(headers_->ProxyRevalidate());
  EXPECT_TRUE(headers_->MustRevalidate());

and (I just checked), adding after this:

  EXPECT_TRUE(headers_->IsProxyCacheable());

works fine.  So my suggestion about removing the "must-revalidate" from your server settings can be ignored.

The only implication of "must-revalidate" on PageSpeed is if there is *no* max-age, then we would not cache it based on heuristics.
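
To make the rule concrete, here is a simplified Python sketch of the behaviour described above. It is not PageSpeed's code: the function names are hypothetical, the heuristic_ttl default is a placeholder, and the sketch only looks at max-age, must-revalidate, no-store, no-cache and private, ignoring Expires, s-maxage and the other inputs a real cache considers.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"')
    return directives


def freshness_lifetime(value, heuristic_ttl=300):
    """Seconds a shared cache may reuse the response without revalidating.

    Returns 0 when the response must not be reused without revalidation.
    heuristic_ttl is a placeholder used only when there is neither an
    explicit max-age nor a must-revalidate directive.
    """
    d = parse_cache_control(value)
    if "no-store" in d or "no-cache" in d or "private" in d:
        return 0
    if "max-age" in d:
        try:
            return max(int(d["max-age"]), 0)
        except ValueError:
            return 0
    # No explicit lifetime: must-revalidate rules out heuristic caching.
    if "must-revalidate" in d:
        return 0
    return heuristic_ttl


print(freshness_lifetime("max-age=31536000, public, must-revalidate"))  # 31536000
print(freshness_lifetime("public, must-revalidate"))                    # 0
```

With Adrian's header the explicit max-age wins, so the image stays fresh for a year; must-revalidate only matters once that lifetime has expired, or when no lifetime is given at all.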

-Josh


Adrian Petre

Aug 30, 2014, 7:54:56 AM
to mod-pagesp...@googlegroups.com
It seems that I misconfigured the file cache.
I had:
ModPagespeedFileCachePath "/usr/local/zy-pkgs/ffproot/ffp/var/cache/mod_pagespeed/file-cache/"
and a different path in ModPagespeedCreateSharedMemoryMetadataCache.
After setting both to the same path:
ModPagespeedCreateSharedMemoryMetadataCache "/usr/local/zy-pkgs/ffproot/ffp/var/cache/mod_pagespeed/file-cache/" 10240
everything is fine.
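
A minimal Python sketch for spotting this kind of mismatch, assuming both directives appear literally (with this exact capitalization) in one Apache configuration text and that the first argument of each is the path. The function names are hypothetical, the regular expression only handles the simple quoted or unquoted form shown above, and paths are compared as exact strings, so a differing trailing slash is reported as a mismatch.

```python
import re

_DIRECTIVE = re.compile(
    r'^\s*(ModPagespeedFileCachePath|ModPagespeedCreateSharedMemoryMetadataCache)'
    r'\s+("([^"]*)"|(\S+))',
    re.MULTILINE,
)


def cache_paths(config_text):
    """Map each of the two directives to the set of paths it is given."""
    paths = {}
    for m in _DIRECTIVE.finditer(config_text):
        # Group 3 is the quoted form, group 4 the unquoted form.
        path = m.group(3) if m.group(3) is not None else m.group(4)
        paths.setdefault(m.group(1), set()).add(path)
    return paths


def paths_match(config_text):
    """True when both directives are present and name the same path(s)."""
    p = cache_paths(config_text)
    file_cache = p.get("ModPagespeedFileCachePath")
    shm = p.get("ModPagespeedCreateSharedMemoryMetadataCache")
    return bool(file_cache) and file_cache == shm


config = '''
ModPagespeedFileCachePath "/usr/local/zy-pkgs/ffproot/ffp/var/cache/mod_pagespeed/file-cache/"
ModPagespeedCreateSharedMemoryMetadataCache "/usr/local/zy-pkgs/ffproot/ffp/var/cache/mod_pagespeed/file-cache/" 10240
'''
print(paths_match(config))  # True
```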