Using mod_pagespeed as an image optimization/resizing tool


Andrei Bocan

Aug 7, 2014, 5:18:01 PM
to mod-pagesp...@googlegroups.com
Hey there,

I've got sort of an oddball use case here, namely I'd like to just use the image resizing / optimization features in mod_pagespeed in front of my image-serving services.

The problem I've run into is URL signing.

My best guess so far was AcceptInvalidSignatures, but my tests have pretty much proved that wrong: the TTL I get back for both the original and the resized versions is 300s.

What am I doing wrong here? Where is the hash/signature check actually done?

Joshua Marantz

Aug 7, 2014, 5:40:50 PM
to mod-pagespeed-discuss
Are you trying to construct the .pagespeed.ic. URLs yourself, in JS or something?  Is that right?  And you don't know the MD5 sum of the optimized file until it's optimized, so you just put in a "0" there or something?

It's not released yet, but in the 'trunk' there happens to be a new option:
  ModPagespeedAcceptInvalidSignatures on
which will do exactly what you want!

1.9 is coming "soon" (I can't tell you when because we don't know), and it will have that as a supported feature.
-Josh


--
You received this message because you are subscribed to the Google Groups "mod-pagespeed-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mod-pagespeed-di...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/mod-pagespeed-discuss/c7650e36-588f-4b19-9625-42e33c56aa4a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Andrei Bocan

Aug 8, 2014, 10:16:05 AM
to mod-pagesp...@googlegroups.com
Hey Josh,

Thanks for replying.

Does this behavior happen to be in trunk already?

I compiled trunk and tried setting ModPagespeedAcceptInvalidSignatures to true (since that's what the new sample configs use), and it still seemed to send the private Cache-Control directive.

I also noticed that image resizing was being done, and mod_pagespeed was storing results in memcached and everything, but the X-Mod-Pagespeed headers were missing for some reason. Is this the expected behavior for trunk?

Joshua Marantz

Aug 8, 2014, 10:30:26 AM
to mod-pagespeed-discuss
Yes, it's in trunk.  That directive appears in "debug.conf.template", but that's not meant for production use.  It creates a large number of virtual hosts which are only used to test various configuration corner-cases, and this is one of them.  I would suggest cleaning that up for serving traffic and doing your own testing.

Are you sure you are setting ModPagespeedAcceptInvalidSignatures in the right vhost?

mod_pagespeed puts X-Mod-Pagespeed on HTML response headers, but does not (currently) put it on rewritten .pagespeed. resources.

-Josh





Jeff Kaufman

Aug 8, 2014, 10:44:04 AM
to mod-pagespeed-discuss
PageSpeed puts two different hashes in URLs, and it responds to
a mismatch in each one differently.

In basically all versions of pagespeed there's a hash in the url which
is a hash of the content the url should point to. This is used for
longcaching, and if there's a hash mismatch between what pagespeed
received in the request url and what it's about to send out then it
sets the cache lifetime to 300s to avoid poisoning the cache. This
sounds like what you're running into. This can be disabled in trunk
with PubliclyCacheMismatchedHashesExperimental, but this is something
we're testing and we can't promise it will stay around.

There's another hash, new in trunk and not released in any version
yet, that's optional and used to verify that urls pagespeed received
are ones that it actually generated. This one is enabled if
UrlSigningKey is set, in which case requests that fail the check will
get a 403 (or maybe 404) instead of a short cache lifetime. The
AcceptInvalidSignatures setting is for people enabling this signing
feature, who don't want to reject requests during that rollout.
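
To make the difference between the two checks concrete, here is a minimal C++ sketch of the decision described above. It is not PageSpeed code: the struct and function names are hypothetical, the exact rejection status is uncertain (403 or maybe 404), and the one-year max-age for the long-cached case is an assumption.

```cpp
#include <string>

// Hypothetical model of a served .pagespeed. response; not a PageSpeed type.
struct ServedResponse {
  int status;
  std::string cache_control;
};

// Hypothetical per-request inputs; the field names are made up for this sketch.
struct RequestChecks {
  bool signing_enabled;            // UrlSigningKey is set
  bool signature_valid;            // the URL signature check passed
  bool accept_invalid_signatures;  // AcceptInvalidSignatures is on
  bool content_hash_matches;       // hash in URL == hash of the content being sent
};

ServedResponse Decide(const RequestChecks& c) {
  // Check 1: URL signature. Only applies when signing is enabled, and
  // AcceptInvalidSignatures lets failing requests through during rollout.
  if (c.signing_enabled && !c.signature_valid && !c.accept_invalid_signatures) {
    return {403, ""};  // the thread says "403 (or maybe 404)"
  }
  // Check 2: content hash. A mismatch is still served, but with a short
  // lifetime so that caches are not poisoned.
  if (!c.content_hash_matches) {
    return {200, "private, max-age=300"};
  }
  // Hash matches: safe to long-cache (one year is an assumed value).
  return {200, "max-age=31536000"};
}
```

Note that with signing disabled, AcceptInvalidSignatures never enters the picture, which would explain a 300s lifetime regardless of that setting.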

Andrei Bocan

Aug 8, 2014, 11:34:21 AM
to mod-pagesp...@googlegroups.com
I suspect I'm missing some part of the nomenclature around what a 'hash'
is and what a 'signature' is.

PubliclyCacheMismatchedHashesExperimental gets me part of what I want,
but not 100% of the way there.

When resizing an image, the first request serves up the original image as a temporary
fix until the actual resize has been created. With
PubliclyCacheMismatchedHashesExperimental, this original image also ends up being served with a long
TTL, which wouldn't be great as far as my CDN is concerned.

As far as I've been able to discern, the filename format for the images I'd like to serve up is
<width>x<height>x<filename>.pagespeed.ic.<hash>.<format>
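
As an illustration of that naming, a minimal C++ sketch that assembles such a filename might look like the following. The helper name and its parameters are hypothetical, the format is only what has been discerned in this thread, and the default "0" hash is a placeholder standing in for a hash the client cannot know in advance.

```cpp
#include <string>

// Hypothetical helper: builds a filename of the form
//   <width>x<height>x<filename>.pagespeed.ic.<hash>.<format>
// The hash defaults to the placeholder "0", since the real hash of the
// optimized image is not known before the image has been optimized.
std::string ResizedImageLeaf(int width, int height,
                             const std::string& filename,
                             const std::string& format,
                             const std::string& hash = "0") {
  return std::to_string(width) + "x" + std::to_string(height) + "x" +
         filename + ".pagespeed.ic." + hash + "." + format;
}
```

For example, ResizedImageLeaf(100, 50, "cat.jpg", "jpg") yields 100x50xcat.jpg.pagespeed.ic.0.jpg, which carries a hash that will not match the optimized content.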

And it seems that the hash part is what's tripping me up in this particular case.

As in, the first request notices the mismatched hash, queues up the image resize,
and serves up the image with a short TTL, and something then computes this
hash and puts it in subsequent requests.

Is there any way to separate the two? As in, get mod_pagespeed to ignore the hash and
serve the right size image and get the short ttl for the case when the original image gets
served up?

Would my best bet be to re-implement the hash implementation in another process and
just use that filename to request the images?

Thanks again for the help, really appreciate it.

Joshua Marantz

Aug 8, 2014, 2:06:28 PM
to mod-pagespeed-discuss
Sorry for the confusion.

Ignore my previous suggestion of using ModPagespeedAcceptInvalidSignatures; that won't have any effect for your case. 

Jeff is right, ModPagespeedPubliclyCacheMismatchedHashesExperimental is what you want, and as the name implies, it's experimental.  But we'd love to know more about your use-case.

Don't worry about the signatures for now, they are not affecting you.  Only the hash matters.

You can reduce the risk of serving an unoptimized file by extending the rewrite deadline, say to 10 seconds.
    ModPagespeedRewriteDeadlinePerFlushMs 10000

-Josh



Andrei Bocan

Aug 8, 2014, 2:17:35 PM
to mod-pagesp...@googlegroups.com
Right, yeah, that flag is closer to what I want, but extending the rewrite deadline makes me a
bit antsy because it opens this whole setup up to a backpressure scenario where all of the asset
serving is backed up because connections stick around waiting for too long.

The use case is this: I'd like to have all of my thumbnail generation replaced by mod_pagespeed.

I've got one service that serves up the original version of my assets, and then in front of that there's
mod_pagespeed that would be tasked with producing the various scaled and optimized versions of 
the assets.

And then, in front of mod_pagespeed, there would be the usual array of LB/CDN-type boxes.

Joshua Marantz

Aug 8, 2014, 2:28:12 PM
to mod-pagespeed-discuss
So you like the behavior of background rewriting and falling back to the original resource the first time we see it.

However, you want the fallback to result in cache-control:private,max-age=300, and once it's optimized you want it served with a long TTL, without checking the hash.  Is that right?

-Josh



Andrei Bocan

Aug 8, 2014, 2:32:09 PM
to mod-pagesp...@googlegroups.com
Yup, exactly!

Is there a configuration that'll get me there or do I have to dive into the code and try to sort it out myself? What would be a good place to start?

Jeff Kaufman

Aug 8, 2014, 3:23:43 PM
to mod-pagespeed-discuss
There's not a configuration that will do that currently. This needs
code changes.

Andrei Bocan

Aug 8, 2014, 3:36:11 PM
to mod-pagesp...@googlegroups.com
Cool, thanks for letting me know.

What would be a good place to start on something like this?

I was looking at net/instaweb/rewriter/rewrite_context.cc:1022, but I'm not super sure what the exact flow of a request would be.

Are there any docs that I could consult when digging into the source that would shed some more light on that?

Joshua Marantz

Aug 8, 2014, 5:17:35 PM
to mod-pagespeed-discuss
The private/300 cache setting is done on rewrite_context.cc:3076, in RewriteContext::FixFetchFallbackHeaders

The trick here is to fix the code to distinguish between arriving here because of a deadline expiration, in which case we should write the 'private/300' even if the experimental flag is on, versus any other reason that method is called.

This looks easy (if you are comfortable hacking some C++).  In class RewriteContext::FetchContext, method HandleDeadline calls FetchFallbackDoneImpl, which calls FixFetchFallbackHeaders.  Thread a bool argument deadline_exceeded=true through that call chain.

Then fix the other call-chains reaching FixFetchFallbackHeaders to pass through that new bool argument deadline_exceeded=false.

You'll have to change some method signatures to add the bool in net/instaweb/rewriter/public/rewrite_options.h
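
Roughly, the shape of that change could be sketched as below. This is a simplified stand-alone model, not the real mod_pagespeed code: the function names are borrowed from the call chain above, but the signatures, the Options struct, the returned header strings, and the one-year max-age are all assumptions made for illustration.

```cpp
#include <string>

namespace sketch {

// Hypothetical stand-in for the relevant option.
struct Options {
  bool publicly_cache_mismatched_hashes = false;  // the experimental flag
};

// Simplified model of FixFetchFallbackHeaders: returns the Cache-Control
// value instead of mutating response headers.
std::string FixFetchFallbackHeaders(const Options& options,
                                    bool deadline_exceeded) {
  // A deadline expiration means the original image is being served as a
  // fallback, so keep private/300 even if the experimental flag is on.
  if (deadline_exceeded || !options.publicly_cache_mismatched_hashes) {
    return "private, max-age=300";
  }
  return "max-age=31536000";  // assumed long TTL (one year)
}

// The new bool is passed straight through the middle of the chain.
std::string FetchFallbackDoneImpl(const Options& options,
                                  bool deadline_exceeded) {
  return FixFetchFallbackHeaders(options, deadline_exceeded);
}

// The deadline path passes deadline_exceeded=true.
std::string HandleDeadline(const Options& options) {
  return FetchFallbackDoneImpl(options, /*deadline_exceeded=*/true);
}

// Every other path into the chain passes deadline_exceeded=false.
std::string OtherFallbackPath(const Options& options) {
  return FetchFallbackDoneImpl(options, /*deadline_exceeded=*/false);
}

}  // namespace sketch
```

With the flag on, HandleDeadline still yields private/300 while the other path yields the long TTL; with the flag off, both paths yield private/300.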

-Josh






Andrei Bocan

Aug 8, 2014, 5:26:52 PM
to mod-pagesp...@googlegroups.com
Awesome!

I'll check back in with a patch once I've got this up and running, just in case it might be useful for anyone else.

Thank you again for all the help in dealing with this, you guys have been nothing short of amazing.

Andrei Bocan

Aug 11, 2014, 9:28:24 PM
to mod-pagesp...@googlegroups.com
Hey,

I think I'm more or less done with this. If you feel it would be useful to other people, let me know what you'd like the flag to be called,
or any changes I could help with.

I haven't added any tests for this particular flag, but I did make sure that the patch keeps all other tests passing on both the latest
stable and trunk.


Thanks again for the help in getting this off the ground.

David Jack

Feb 5, 2015, 1:09:14 PM
to mod-pagesp...@googlegroups.com
Resurrecting this thread as we're trying to deprecate our patched version of mod_pagespeed.  Couldn't we move the ModPagespeedPubliclyCacheMismatchedHashesExperimental option check to the statement where we do the hash comparison?  Something along the lines of this gist: https://gist.github.com/davars/a9e40c8e19311e37f771

Jeff Kaufman

Feb 6, 2015, 9:39:31 AM
to mod-pagespeed-discuss
What's the advantage of moving the check like that?