Caching an image over 12,000 times


Devon

Feb 2, 2015, 4:03:14 PM
to mod-pagesp...@googlegroups.com
I am running mod_pagespeed 1.9.32.3 on Apache 2.2 on RHEL5.

While trying to tune to solve some serious CPU burn issues, I discovered that there's some oddness going on with the caching.  It appears that some assets are getting cached in the /var/cache/mod_pagespeed area thousands of times.  For instance one small PNG image has 12,683 references in there.  They all have the same host/path, but appear to have different css-keys?

.............



Here is the output for that file from the metadata cache in the admin - http://dcloud.sparkred.com/image/3i0i3y352s2V

Here is the configuration from the admin:

Version: 13: on

Filters
ah	Add Head
cw	Collapse Whitespace
ch	Combine Heads
jc	Combine Javascript
gp	Convert Gif to Png
jp	Convert Jpeg to Progressive
jw	Convert Jpeg To Webp
mc	Convert Meta Tags
pj	Convert Png to Jpeg
ec	Cache Extend Css
ei	Cache Extend Images
es	Cache Extend Scripts
fc	Fallback Rewrite Css 
if	Flatten CSS Imports
hw	Flushes html
ci	Inline Css
ii	Inline Images
il	Inline @import to Link
ji	Inline Javascript
id	Insert Image Dimensions
js	Jpeg Subsampling
cm	Move Css To Head
rj	Recompress Jpeg
rp	Recompress Png
rw	Recompress Webp
ri	Resize Images
cf	Rewrite Css
jm	Rewrite External Javascript
jj	Rewrite Inline Javascript
cu	Rewrite Style Attributes With Url
is	Sprite Images
cp	Strip Image Color Profiles
md	Strip Image Meta Data




Any ideas?  I have 222,000 files under the cache dir right now, which seems excessive....


Jeff Kaufman

Feb 2, 2015, 4:14:06 PM
to mod-pagespeed-discuss
It looks to me like you have files like
/static/images/closeBtn.png@@_css-key=W7EdkrCSZv_RANDOM as part of the
design of your site, and RANDOM is something for cachebusting? Do you
know what this is for? It's not part of pagespeed.
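
One way to see which resources are duplicated is to group cache keys by the part before the `@@_css-key=` marker shown above. The following is only a rough Python sketch: the sample keys are made up for illustration, and how you list the keys from your own cache is left out because the on-disk layout is not described in this thread.

```python
from collections import Counter

# Marker taken from the key shown above; everything after it is the suffix.
MARKER = "@@_css-key="

def count_by_resource(cache_keys):
    """Count how many css-key variants exist for each resource path."""
    counts = Counter()
    for key in cache_keys:
        resource, sep, _suffix = key.partition(MARKER)
        if sep:  # only count keys that actually carry a css-key suffix
            counts[resource] += 1
    return counts

# Hypothetical sample keys, not copied from a real cache.
sample = [
    "/static/images/closeBtn.png@@_css-key=aaa_111",
    "/static/images/closeBtn.png@@_css-key=aaa_222",
    "/static/images/closeBtn.png@@_css-key=aaa_333",
    "/static/images/logo.png@@_css-key=bbb_111",
    "/static/css/site.css",
]

counts = count_by_resource(sample)
print(counts.most_common(1))
```

A resource whose count is in the thousands, like the PNG described above, is a candidate for this kind of duplication.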

Devon

Feb 2, 2015, 4:22:21 PM
to mod-pagesp...@googlegroups.com
Thank you for your reply.  The two references we have to closeBtn.png in the application are:

an in-line css block on one page:

 .ui-dialog-titlebar-close{
 background:url(/static/images/closeBtn.png);


and in jquery.ui.theme.css:


.ui-widget-content .ui-dialog-titlebar-close.ui-state-hover {
 background: url(../../images/closeBtn.png) no-repeat;


Which gets rewritten by mod_pagespeed to this in the output CSS:




 
 MD5Hasher hasher;
 key_suffix_ = StrCat("css-key=", hasher.Hash(css_text), "_",
                      hasher.Hash(css_url.AllExceptLeaf()));


So it looks like something internal to mod_pagespeed?

Jeff Kaufman

Feb 2, 2015, 4:31:58 PM
to mod-pagespeed-discuss
I'm sorry, you're right! css-key is something we add to our cache
keys as part of our image spriting code, I'd forgotten about that.

I wonder if maybe css_url.AllExceptLeaf() doesn't do the right thing
for inline css? I'll look more.

Devon

Feb 2, 2015, 4:35:49 PM
to mod-pagesp...@googlegroups.com
Thank you!  Let me know if there's anything else I can provide that would be helpful, or if there's a good workaround in the meantime.

Maksim Orlovich

Feb 3, 2015, 8:22:01 AM
to mod-pagesp...@googlegroups.com
On Mon, Feb 2, 2015 at 4:35 PM, Devon <devon...@gmail.com> wrote:
> Thank you! Let me know if there's anything else I can provide that would be
> helpful or if there's a good workaround in the mean time?

Disabling image spriting (sprite_images) seems like it ought to work?

@jefftk: Using something like the path for the HTML page in the inline
case could work but would still suck if we have tons of directories
--- we ran into a problem with that with the CSS filter as well...

Jeff Kaufman

Feb 3, 2015, 10:07:06 AM
to mod-pagespeed-discuss
On Tue, Feb 3, 2015 at 8:22 AM, 'Maksim Orlovich' via
mod-pagespeed-discuss <mod-pagesp...@googlegroups.com> wrote:
>
> @jefftk: Using something like the path for the HTML page in the inline
> case could work but would still suck if we have tons of directories
> --- we ran into problem with that with the CSS filter as well...

@devon: Is this your situation? Do you have 12k+ html pages on your
site that include "background:url(/static/images/closeBtn.png);" in an
inline style block?

@morlovich: Why do we need a hash of the path in the metadata cache
key at all? Why not just hash the contents of the style block?

Devon

Feb 3, 2015, 10:14:36 AM
to mod-pagesp...@googlegroups.com
@jefftk - quite possibly yes.  The inline block is in a fragment which is included in product pages in an eCommerce site with more than 12,000 products.  It's the same JSP, but rendering out different pages at different URIs.
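
To illustrate why one inline block can end up with thousands of cache entries, here is a minimal Python sketch of the key suffix quoted earlier in the thread. It is not mod_pagespeed code. The `all_except_leaf` helper is a guess at what `AllExceptLeaf()` returns (the URL up to and including the last slash), the product URLs are hypothetical, and it assumes that for inline CSS the URL being hashed is the URL of the HTML page, which is what the discussion above suspects.

```python
import hashlib
from urllib.parse import urlsplit

def md5_hex(text: str) -> str:
    """Stand-in for the hasher in the quoted line; the real output format may differ."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def all_except_leaf(url: str) -> str:
    """Assumed behaviour: scheme, host, and path up to and including the last '/'."""
    parts = urlsplit(url)
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return f"{parts.scheme}://{parts.netloc}{directory}"

def key_suffix(css_text: str, css_url: str) -> str:
    """Mirror the shape of the quoted StrCat call: css hash, underscore, path hash."""
    return f"css-key={md5_hex(css_text)}_{md5_hex(all_except_leaf(css_url))}"

CSS = ".ui-dialog-titlebar-close{background:url(/static/images/closeBtn.png);}"

# Hypothetical product pages, each living in its own directory.
pages = [f"http://shop.example/products/sku-{n}/index.html" for n in range(12000)]

# Keys that include the page path: one per distinct directory.
with_path = {key_suffix(CSS, url) for url in pages}

# Keys built from the style block contents alone: identical on every page.
content_only = {f"css-key={md5_hex(CSS)}" for _ in pages}

print(len(with_path), len(content_only))
```

Under these assumptions the identical style block gets a separate key for every product directory, while hashing only its contents would give a single key. Pages that share a directory would share a key, so the actual count depends on the site's URL layout.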

Jeff Kaufman

Feb 3, 2015, 10:17:26 AM
to mod-pagespeed-discuss
On Tue, Feb 3, 2015 at 10:14 AM, Devon <devon...@gmail.com> wrote:
> @jefftk - quite possibly yes. The inline block is in a fragment which is
> included in product pages in an eCommerce site with more than 12,000
> products. It's the same JSP, but rendering out different pages at different
> URIs.

Turning off the sprite_images filter would probably be a good idea if
you're running out of cache space then.

Devon

Feb 3, 2015, 10:19:33 AM
to mod-pagesp...@googlegroups.com
I will give it a try.  The issue is less disk space than the CPU and disk I/O burn from writing out 250,000+ cache files.  If this works and we can reduce the duplicate copies of many items by a factor of 12,000, that would help a great deal.

Maksim Orlovich

Feb 3, 2015, 10:39:21 AM
to mod-pagesp...@googlegroups.com
It's potentially relevant for resolving relative URLs --- the same css
snippet in foo.com/bar and foo.com/glarch may have images resolve
differently (if it's something like background:url(images/closeBtn.png),
without the leading slash). I suppose with inline css we could actually
check, though that's a bit more computation than we usually want for
computing the cache key.

See also CssFilter::Context::CacheKeySuffix() --- it may make some
sense to refine that further, and propagate it to the spriter, too.
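
The relative-URL point can be checked with a few lines of Python using the standard `urljoin`. This is only an illustration of URL resolution, not of anything pagespeed does internally, and the two page URLs are hypothetical pages placed under the /bar/ and /glarch/ directories.

```python
from urllib.parse import urljoin

# Hypothetical pages in two different directories on the same host.
PAGES = [
    "http://foo.com/bar/index.html",
    "http://foo.com/glarch/index.html",
]

def resolve_all(ref: str) -> set:
    """Resolve one CSS url() reference against every page and collect the results."""
    return {urljoin(page, ref) for page in PAGES}

# No leading slash: the result depends on the directory of the page.
relative = resolve_all("images/closeBtn.png")

# Leading slash: the result is the same for every page on the host.
absolute = resolve_all("/static/images/closeBtn.png")

print(sorted(relative))
print(sorted(absolute))
```

The relative reference resolves to a different image URL per directory, so a path component in the key is needed there. The root-relative reference, which is the form used in the inline block in this thread, resolves to one URL regardless of the page.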

Devon

Feb 3, 2015, 10:57:19 AM
to mod-pagesp...@googlegroups.com
It's still early, but disabling the image spriting seems to have made a big difference.  The cache is now at 80 MB and 18,000 files, instead of 1.5 GB and 340,000 files.  CPU burn is much lower: 1.5 instead of 5-8.

Thanks!