Mobile devices not getting optimized images through CDN (cache-filling vs user-agent)


aro...@webscalenetworks.com

Sep 1, 2017, 11:05:03 PM
to mod-pagespeed-discuss
We are having a problem where mobile devices often do not get optimized images when they go through the CDN; instead, pagespeed continually re-optimizes those images rather than caching them.

My questions are:
* Is there a way to configure pagespeed to encode the full image-optimization configuration into the URL so that it does NOT depend on the user-agent to generate the correct, cacheable image?
* Would this problem be solved if we configured pagespeed to use memcached for the metadata cache?
* Can I configure pagespeed to not consider the user-agent when optimizing images?

More details on the actual problem:

We have narrowed this problem down to the fact that the CDN is overwriting the user-agent -- so if a mobile-optimized URL comes to a server through the CDN that doesn't have the value cached, it sees the CDN's user-agent and generates a non-mobile optimized version.  Unfortunately, the digest of the newly optimized resource won't match the requested URL, so the server responds with a short cache timeout and doesn't cache that URL.  Unless the mobile users hit one of the pagespeed servers that has the correct, mobile-optimized resource already cached, it will never respond with the correct content and it will continuously try to re-optimize.
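
The mismatch described above can be sketched in a few lines of shell. This is only a model of the behaviour reported in this post, not PageSpeed's implementation: the function names `url_digest` and `response_policy` are hypothetical, and the sketch assumes the rewritten URL has the form `<name>.pagespeed.<filter>.<digest>.<ext>`.

```shell
# Hypothetical helpers; they model the behaviour described in the post,
# they are not PageSpeed code.

# Print the digest embedded in a rewritten URL of the form
# <name>.pagespeed.<filter>.<digest>.<ext>
url_digest() {
  leaf=${1##*/}                 # strip everything up to the last slash
  stem=${leaf%.*}               # drop the trailing extension
  printf '%s\n' "${stem##*.}"   # the digest is the last remaining dot-field
}

# Print "cacheable" when the digest of the regenerated image matches the
# digest in the requested URL, otherwise "short-ttl" (the mismatch case).
response_policy() {
  if [ "$(url_digest "$1")" = "$2" ]; then
    echo cacheable
  else
    echo short-ttl
  fi
}
```

For example, `response_policy http://example.com/img/304x228x1.jpg.pagespeed.ic.vVBnt1_794.jpg qU8xPtAbAL` prints `short-ttl`, which corresponds to the case where the CDN's user-agent leads the server to produce a different optimized image than the one named in the URL.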

I noticed that pagespeed used to encode more details about the image optimization parameters into the URL itself, so that it was less reliant on the user-agent to determine those parameters; however, this was removed in commit 9741fa186 quite a while back.  There's no explanation of why that extra configuration was removed.  By manually editing the URLs to insert the "legacy" mobile codes, I've confirmed that those extra parameters are not sufficient to generate a mobile-optimized version of the image.

Unfortunately, it's not possible for us to configure the CDN to pass the user-agent through without partitioning the cache space (which would be disastrous).

FYI, we're running pagespeed 1.11.33.5.  A cursory investigation of the recent codebase suggests that upgrading will not solve my problem.

Thanks!
- Augusto



Joshua Marantz

Sep 1, 2017, 11:14:29 PM
to mod-pagespeed-discuss
That's some good sleuthing you did there finding that commit.  You may be right; 1.11 may be good enough for this purpose, however the 1.12 release has been marked stable and has tons of bug fixes so I still recommend upgrading. The configuration you want, I think, is AllowVaryOn: https://www.modpagespeed.com/doc/reference-image-optimize#AllowVaryOn

Can you configure your CDN to allow varying on the Accept header?  The Accept header will not fragment the cache as badly as User-Agent would.  If you can do that, put this in your configuration:
   ModPagespeedAllowVaryOn Accept,Save-Data

-Josh

--
You received this message because you are subscribed to the Google Groups "mod-pagespeed-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mod-pagespeed-discuss+unsub...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/mod-pagespeed-discuss/d4c2ef1c-c09f-40d5-8cbb-d981e71eda57%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

aro...@webscalenetworks.com

Sep 2, 2017, 5:08:44 PM
to mod-pagespeed-discuss
Thanks Josh,

I tried turning that on, but it appears that the user-agent is still causing pagespeed to optimize the image differently.  I'm now making these requests:

URL: ....pagespeed.ic.vVBnt1_794.jpg <-- expected digest is vVBnt1_794

iphone:
  Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
  User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B137 Safari/601.1
  Actual image digest: vVBnt1_794

CDN:
  Accept: */*
  User-Agent: Anything else here
  Actual image digest: qU8xPtAbAL

CDN w/ Accept passthrough:
  Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
  User-Agent: Anything else here
  Actual image digest: qU8xPtAbAL

Unfortunately, even with AllowVaryOn set to Accept,Save-Data, I get the desktop image digest when contacting any server other than the original one, which already has the optimized URL in its cache.

- Augusto

Joshua Marantz

Sep 3, 2017, 3:48:44 PM
to mod-pagespeed-discuss
One trivial issue: you need to make sure your CDN passes through Save-Data as well, and includes it in its cache-key.  However, I don't expect this is your issue with private/300 on iPhone, because I don't expect iPhone clients to send Save-Data -- that header comes from Chrome's bandwidth-reduction feature.

Can you replicate the scenario in terms of a shell script using wget or curl, setting exactly the headers you want?  Once you do that, can you repeat the shell-script a few times (with at least a few seconds between each run)?  In the scenario you give, I do expect the unoptimized image version to come from a second server until its cache gets warmed up in the background.  It should be served with private caching so it doesn't get captured by your CDN.

Let's also make sure your CDN respects the 'private' header and doesn't accidentally cache the unoptimized result.
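
One way to check the 'private' behaviour is to inspect saved response headers. The following is a minimal sketch, assuming the headers were written to a file (for example with curl's `-D` option); the function name `is_private_response` is hypothetical.

```shell
# Succeeds when the saved response headers in file $1 contain a
# Cache-Control header that includes the "private" directive.
# Carriage returns are stripped first because HTTP headers end in CRLF.
is_private_response() {
  tr -d '\r' < "$1" | grep -i '^cache-control:' | grep -qi 'private'
}
```

Running it against the headers returned by the origin and against the headers returned through the CDN for the same URL would show whether the CDN is storing a response that the origin marked private.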





aro...@webscalenetworks.com

Sep 5, 2017, 12:44:40 AM
to mod-pagespeed-discuss
Thanks Josh.  Indeed, all of these tests that I'm running are actually communicating directly with the pagespeed servers via curl (or a custom script) but simulating the CDN.  The only difference between the requests is the User-Agent header value (and until recently, the Accept header).

I am also accounting for receiving the unoptimized image, so I make requests repeatedly until I get an optimized result.  I can consistently reproduce getting the desktop-optimized version instead of the mobile-optimized version.
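
The retry loop described above might look roughly like the sketch below. It is not the actual test script: `poll_until_optimized` is a hypothetical helper, and it assumes the caller supplies a command that prints the digest of the image it just fetched.

```shell
# Hypothetical helper: run a fetch command repeatedly until it prints a
# digest other than the original (unoptimized) one, or attempts run out.
# Usage: poll_until_optimized ORIGINAL_DIGEST MAX_TRIES DELAY_SECONDS CMD [ARGS...]
poll_until_optimized() {
  original=$1; max_tries=$2; delay=$3
  shift 3
  try=1
  while [ "$try" -le "$max_tries" ]; do
    digest=$("$@")                    # CMD must print the observed digest
    if [ -n "$digest" ] && [ "$digest" != "$original" ]; then
      printf '%s\n' "$digest"         # first optimized digest seen
      return 0
    fi
    try=$((try + 1))
    sleep "$delay"                    # give background optimization time to finish
  done
  return 1                            # still unoptimized after MAX_TRIES
}
```

The digest printed on success can then be compared with the digest in the requested URL to tell a mobile-optimized result from a desktop-optimized one.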

- Augusto

Joshua Marantz

Sep 5, 2017, 8:57:45 AM
to mod-pagespeed-discuss
Can you share the script?  If you don't want to reveal the site maybe you can email it just to me.




aro...@webscalenetworks.com

Sep 6, 2017, 12:20:07 PM
to mod-pagespeed-discuss
I sent a private message but I don't know whether you got it.  Regardless, here's the updated script that runs against a test server.  The cache must be cleared at the start of the script, so if someone else runs it at the same time it will interfere with other concurrent executions.

Here's the output that I get:

Clearing the pagespeed cache: Done

Requesting the page from server 1 several times so pagespeed optimizes the images: ........ - Done
Now requesting one of the optimized images referenced on that page from each of the servers:
  http://o15qsw-nssqpmay9vos.lagrange.ninja/fBdnswUSt/304x228x1.jpg.pagespeed.ic.vVBnt1_794.jpg
That URL points to the CDN, but we'll bypass the CDN and make requests directly
to each of the servers since we don't want the CDN randomly picking one of our
servers.
The original image has a digest of DxERqlNBe1
The mobile-optimized image should have a digest of vVBnt1_794
The desktop-optimized image should have a digest of qU8xPtAbAL


Requesting image from server 1 (should be optimized, served from cache):
++ curl --resolve o15qsw-nssqpmay9vos.lagrange.ninja:80:35.197.45.86 -s -D server1-cdn-image.jpg.headers -o server1-cdn-image.jpg http://o15qsw-nssqpmay9vos.lagrange.ninja/fBdnswUSt/304x228x1.jpg.pagespeed.ic.vVBnt1_794.jpg -H 'Accept-Encoding: gzip' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'User-Agent: CloudFront'
Digest from server 1: vVBnt1_794     expected: vVBnt1_794
Requesting image from server 2 (should start optimizing, but return unoptimized image):
++ curl --resolve o15qsw-nssqpmay9vos.lagrange.ninja:80:35.197.12.104 -s -D server2-cdn-image.jpg.headers -o server2-cdn-image.jpg http://o15qsw-nssqpmay9vos.lagrange.ninja/fBdnswUSt/304x228x1.jpg.pagespeed.ic.vVBnt1_794.jpg -H 'Accept-Encoding: gzip' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'User-Agent: CloudFront'
Digest from server 2: DxERqlNBe1     expected: DxERqlNBe1
Requesting image from server 2 (should be done optimizing, but will optimize for desktop):
++ curl --resolve o15qsw-nssqpmay9vos.lagrange.ninja:80:35.197.12.104 -s -D server2-cdn-image.jpg.headers -o server2-cdn-image.jpg http://o15qsw-nssqpmay9vos.lagrange.ninja/fBdnswUSt/304x228x1.jpg.pagespeed.ic.vVBnt1_794.jpg -H 'Accept-Encoding: gzip' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'User-Agent: CloudFront'
Digest from server 2: qU8xPtAbAL     expected: vVBnt1_794
Requesting image from server 2 with the mobile user-agent (might get the original image here):
++ curl --resolve o15qsw-nssqpmay9vos.lagrange.ninja:80:35.197.12.104 -s -D server2-mobile-image.jpg.headers -o server2-mobile-image.jpg http://o15qsw-nssqpmay9vos.lagrange.ninja/fBdnswUSt/304x228x1.jpg.pagespeed.ic.vVBnt1_794.jpg -H 'Accept-Encoding: gzip' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B137 Safari/601.1'
Digest from server 2: DxERqlNBe1     expected: DxERqlNBe1
Requesting image from server 2 with the mobile user-agent again (now we'll get the correct image):
++ curl --resolve o15qsw-nssqpmay9vos.lagrange.ninja:80:35.197.12.104 -s -D server2-mobile-image.jpg.headers -o server2-mobile-image.jpg http://o15qsw-nssqpmay9vos.lagrange.ninja/fBdnswUSt/304x228x1.jpg.pagespeed.ic.vVBnt1_794.jpg -H 'Accept-Encoding: gzip' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B137 Safari/601.1'
Digest from server 2: vVBnt1_794     expected: vVBnt1_794
Requesting image from server 2 (getting the correct image from the fixed cache):
++ curl --resolve o15qsw-nssqpmay9vos.lagrange.ninja:80:35.197.12.104 -s -D server2-cdn-image.jpg.headers -o server2-cdn-image.jpg http://o15qsw-nssqpmay9vos.lagrange.ninja/fBdnswUSt/304x228x1.jpg.pagespeed.ic.vVBnt1_794.jpg -H 'Accept-Encoding: gzip' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'User-Agent: CloudFront'
Digest from server 2: vVBnt1_794     expected: vVBnt1_794

- Augusto
pagespeed-test.sh

aro...@webscalenetworks.com

Sep 11, 2017, 3:44:03 PM
to mod-pagespeed-discuss
So I've dug into that, made some local changes that restore the URL encoding, and confirmed that they fix our problems.  I've identified a similar bug with the CSS URL encoding as of this commit.  The commit explains explicitly that the metadata cache is where the actual cache key is stored, and implies that the URL encoding was removed to avoid partitioning the cache, since multiple similar user-agent settings may result in identical files.

Seems like the right thing to do is to have a content-based cache rather than name-based cache instead, but that's a more involved change.
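
To illustrate what "content-based" means here, a minimal sketch follows. It is not a proposal for PageSpeed's actual cache design: `content_key` is a hypothetical name, and POSIX `cksum` stands in for whatever stronger hash a real cache would use.

```shell
# Illustrative only: derive a cache key from a file's bytes rather than
# from its name. cksum prints "<CRC> <byte count>" when reading stdin;
# the two fields are joined with a dash to form the key.
content_key() {
  cksum < "$1" | tr ' ' '-'
}
```

With a key like this, two optimization runs that produce byte-identical output share one cache entry regardless of which user-agent triggered them, which is the partitioning concern mentioned above.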

I'm looking into contributing my changes upstream, perhaps guarded by a flag?  We are avoiding a shared metadata cache because we need our pagespeed servers to not depend on any external single point of failure (SPOF).

- Augusto

Otto van der Schaaf

Sep 11, 2017, 4:20:30 PM
to mod-pagespeed-discuss
Contributions are very welcome over at https://github.com/pagespeed/mod_pagespeed.
A pull request would be a great way to discuss your changes in more detail (and to get Travis CI to test them, if you have not done so yourself).
I'd be happy to assist where I can with landing any fixes.

Otto
