Recommended config for pagespeed reverse proxy server?

Quinn Comendant

Jul 17, 2015, 6:50:49 PM
to mod-pagesp...@googlegroups.com
Hello all,

I'd like to set up a server to use as a dedicated pagespeed reverse proxy server for several sites which cannot run pagespeed on their origin. It seems I should be able to create a VirtualHost for each site and use ModPagespeedMapProxyDomain to specify the origin server. I haven't been able to get this to work. Here's what I have so far.

* http://origin.example.com/ is the web server for all dynamic and static content. It does not run pagespeed.
* http://www.example.com/ will be the pagespeed server configured as a reverse proxy. No files are hosted here.

I've installed mod-pagespeed-stable-1.9.32.4-7251.x86_64 for CentOS and am using the default config from /etc/httpd/conf.d/pagespeed.conf except with CoreRules enabled (my config at <http://hastebin.com/jeroludahe>).

I then added a VirtualHost block:

<VirtualHost 111.222.333.444:80>
    ServerName www.example.com

    <IfModule pagespeed_module>
        ModPagespeed On
        ModPagespeedMapProxyDomain www.example.com origin.example.com
    </IfModule>
</VirtualHost>

Now any request to http://www.example.com/ results in an empty page with a "403 Forbidden" status:

[q@localhost ~] curl -I http://www.example.com/
HTTP/1.1 403 Forbidden
[…]


The request is logged to the global access_log, but there are no messages in error_log (does MPS have its own error log?). I am able to access http://www.example.com/pagespeed_global_admin but there is no indication there of errors.


If I remove the ModPagespeed* config and set up the vhost as a proxy using ProxyPassReverse / http://origin.example.com/, the site loads fine. However, of course, pagespeed doesn't take effect.


Can you see what I've done wrong? What is the recommended way to set up a pagespeed reverse proxy?


Many thanks,
Quinn

PS: Once I find the solution I'll post it on stackoverflow too, since I can't find any step-by-step examples of how to do this online. 

Jeff Kaufman

Jul 20, 2015, 9:08:07 AM
to mod-pagespeed-discuss
What you're trying to do makes a lot of sense, but ModPagespeedMapProxyDomain isn't the right configuration setting. Instead, set up mod_proxy as in a standard reverse proxying setup, and turn on PageSpeed globally.

It looks like you may have tried this, except without setting "ModPagespeed on"?
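
For reference, a minimal sketch of that setup, assuming mod_proxy and mod_proxy_http are loaded and reusing the hostnames from the original post. The `*:80` listen address is a placeholder, not something taken from this thread:

```apache
<VirtualHost *:80>
    ServerName www.example.com

    # Plain mod_proxy reverse proxy: fetch everything from the origin.
    ProxyPass        / http://origin.example.com/
    ProxyPassReverse / http://origin.example.com/

    <IfModule pagespeed_module>
        # PageSpeed then optimizes the proxied responses.
        # This can also be set globally in pagespeed.conf.
        ModPagespeed on
    </IfModule>
</VirtualHost>
```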





--
You received this message because you are subscribed to the Google Groups "mod-pagespeed-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mod-pagespeed-di...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/mod-pagespeed-discuss/a5c7b2be-387b-4107-bc3b-974b8f9174b4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Quinn Comendant

Jul 25, 2015, 3:22:00 PM
to mod-pagespeed-discuss, jef...@google.com
Hi Jeff

Of course, why would pagespeed include its own proxy logic, when it can use mod_proxy? ☺️ 

Ok, so adding the two together does work: I'm getting the X-Page-Speed: 1.9.32.4-7251 header, and compressed HTML. This is progress.

However, the URLs to resources are not being rewritten (i.e., hyperlinks still point to origin server):

<a href="http://origin.example.com/page">clickme</a>

My current vhost config is now:

<VirtualHost 111.222.333.444:80>
    ServerName www.example.com

    <IfModule pagespeed_module>
        ModPagespeed On
        ModPagespeedMapProxyDomain www.example.com origin.example.com
    </IfModule>

    ProxyPass / http://origin.example.com/
    ProxyPassReverse / http://origin.example.com/
</VirtualHost>

And pagespeed config as shown at http://hastebin.com/jeroludahe

To try getting the URLs rewritten, I've tried adding:

ModPagespeedMapProxyDomain www.example.com origin.example.com

and:

ModPagespeedMapProxyDomain http://www.example.com/ http://origin.example.com/

Both result in "403 Forbidden" errors (although nothing in Apache's error_log?).

I've also tried using ModPagespeedDomain and ModPagespeedMapRewriteDomain without success.

What is the correct way to ensure all URLs are rewritten to use the domain of the proxy host?

Thanks,
Quinn

Quinn Comendant

Jul 25, 2015, 3:25:15 PM
to mod-pagespeed-discuss, jef...@google.com, qu...@strangecode.com
On Saturday, July 25, 2015 at 2:22:00 PM UTC-5, Quinn Comendant wrote:
> To try getting the URLs rewritten, I've tried adding:
>
> ModPagespeedMapProxyDomain www.example.com origin.example.com
>
> and:
>
> ModPagespeedMapProxyDomain http://www.example.com/ http://origin.example.com/
>
> Both result in "403 Forbidden" errors (although nothing in Apache's error_log?).

Oh, but strangely, if I reverse the arguments, the "403 Forbidden" errors go away (although URLs still are not rewritten):

ModPagespeedMapProxyDomain origin.example.com www.example.com

Any idea why that might be?

Q

Jeff Kaufman

Jul 27, 2015, 7:54:05 AM
to mod-pagespeed-discuss, qu...@strangecode.com
Could you replace your ModPagespeedMapProxyDomain call with:

    ModPagespeedMapRewriteDomain www.example.com origin.example.com
    ModPagespeedEnableFilters rewrite_domains

That tells PageSpeed that when it sees origin.example.com urls it should replace them with www.example.com urls.

Quinn Comendant

Jul 27, 2015, 12:52:26 PM
to mod-pagespeed-discuss, jef...@google.com
Actually, that does something: adding `rewrite_domains` allows image src URLs to be rewritten.

However, it does nothing for <a href="…"> links. Those are all still using origin.example.com even while the img src are being rewritten.

Quinn

Jeff Kaufman

Jul 28, 2015, 10:35:41 AM
to Quinn Comendant, mod-pagespeed-discuss
What Host header is origin.example.com receiving?  What does it think its name is?  I think the problem with the reverse proxying here is that origin.example.com should simply be configured as if it is www.example.com and receive www.example.com Host headers, and then all your links will be fine.

On Mon, Jul 27, 2015 at 1:06 PM, Quinn Comendant <qu...@strangecode.com> wrote:
> It's also not rewriting included CSS URLs:
>
> <link rel="stylesheet" type="text/css" href="http://origin.example.com/ie.css">
>
> And as mentioned before also not anchors:
>
> <a href="http://origin.example.com/tag/things">things</a>
>
> 😓

Quinn Comendant

Jul 28, 2015, 3:21:59 PM
to Jeff Kaufman, mod-pagespeed-discuss
Hi Jeff,

On Tue, 28 Jul 2015 10:35:39 -0400, Jeff Kaufman wrote:
> What Host header is origin.example.com receiving?
> What does it think its name is?

It's receiving `Host: origin.example.com`. Its name is origin.example.com (for this example).

> I think the problem with the reverse proxying here is
> that origin.example.com should simply be configured as if it is
> www.example.com and receive www.example.com Host headers

Ok, that might be a solution. But the site is currently running behind Google's PS service using this configuration (pulling content from origin.example.com), so I expected it to work the same with the module.

Also, in doing this we would not be able to test, and compare results with the new setup. In fact, for now, I'm accessing this at, e.g., www-new.example.com so I can compare the results with www.example.com which is still hosted with the PS service.

Furthermore, this may not work with some origin hosts if they don't support multiple hostname aliases for a hosted site. The reverse proxy server will need to use one hostname to resolve the origin server (e.g., origin.example.com, because DNS for www.example.com would resolve to the proxy itself) and then use its equivalent of a `ProxyPreserveHost` setting to forward requests to the origin server's second hostname (e.g., www.example.com).

These seem like two good reasons the PS module should support rewriting href domains. Are you sure it doesn't?

Quinn

Jeff Kaufman

Jul 31, 2015, 12:44:51 PM
to mod-pagespeed-discuss
When PageSpeed Service contacts origin.example.com to load content for
www.example.com it sets "Host: www.example.com" and not "Host:
origin.example.com". If you configure Apache to set the Host header
the same way then it should generate the urls you want. This is
something you can configure entirely within www-new.example.com, for
testing, without touching origin.example.com.

This should work with all origins, because the change that's required
(turning on ProxyPreserveHost or similar to keep the original Host
header) is one you make on the reverse proxy server, not on the
origin.
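
As a rough sketch of what that could look like on the proxy, assuming mod_proxy and mod_proxy_http are loaded (the `*:80` address is a placeholder, not taken from this thread):

```apache
<VirtualHost *:80>
    ServerName www.example.com

    # Forward the Host header sent by the client to the origin,
    # instead of replacing it with origin.example.com.
    ProxyPreserveHost On
    ProxyPass        / http://origin.example.com/
    ProxyPassReverse / http://origin.example.com/
</VirtualHost>
```

With `ProxyPreserveHost On`, the origin receives whatever hostname the client requested from the proxy.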

Quinn Comendant

Aug 5, 2015, 12:36:22 AM
to mod-pagesp...@googlegroups.com
Yes, you can do this if www.example.com resolves to the pagespeed proxy, in which case ProxyPreserveHost will pass this to the origin (as you said). But if www.example.com points somewhere else (e.g., a legacy server) and the only domain that resolves to the pagespeed proxy is www-new.example.com (for testing), then the origin server will be hit with "Host: www-new.example.com" (or "Host: origin.example.com" if not using ProxyPreserveHost). In this scenario, links pointing to www.example.com will remain as such (not rewritten) and we miss out on being able to test the site as well as we'd like.

Is it on the roadmap for mod-pagespeed to rewrite href domains?

Thanks for your help Jeff, I really appreciate it.

Quinn

Jeff Kaufman

Aug 5, 2015, 9:36:09 AM
to mod-pagespeed-discuss
That makes sense.

This behavior is actually pretty close to something else we've been experimenting with. We can look at how much more work it would be to include this.

Getting this to cover 100% of internal links on the site can be hard because of, say, JS, but for testing, hrefs should be enough.


Jeff Kaufman

Aug 5, 2015, 10:05:47 AM
to mod-pagespeed-discuss, Maksim Orlovich
It turns out we do have this feature, it's just not documented.  If you set "ModPagespeedDomainRewriteHyperlinks on" then it should rewrite hrefs.

I'll document it.

Quinn Comendant

Aug 5, 2015, 12:17:11 PM
to mod-pagesp...@googlegroups.com
On Wed, 5 Aug 2015 10:05:45 -0400, 'Jeff Kaufman' via mod-pagespeed-discuss wrote:
> It turns out we do have this feature, it's just not documented. If
> you set "ModPagespeedDomainRewriteHyperlinks on" then it should
> rewrite hrefs.

Ahaha, great. ;)

Thanks,
Quinn

Quinn Comendant

Aug 5, 2015, 12:36:43 PM
to mod-pagesp...@googlegroups.com
On Wed, 5 Aug 2015 10:05:45 -0400, 'Jeff Kaufman' via mod-pagespeed-discuss wrote:
> It turns out we do have this feature, it's just not documented. If
> you set "ModPagespeedDomainRewriteHyperlinks on" then it should
> rewrite hrefs.

Just a note that this also requires use of:

ModPagespeedMapRewriteDomain http://dst.example.com http://src.example.com
ModPagespeedEnableFilters rewrite_domains

And if you want to rewrite links beyond the default set of attributes, you can add more (e.g., this will add <span src="…"> to the rewritten hyperlinks):

ModPagespeedUrlValuedAttribute span src Hyperlink

See documentation of ModPagespeedUrlValuedAttribute at:
https://github.com/pagespeed/mod_pagespeed/issues/437#issuecomment-90118288

Jeff, you can include this in your documentation task. =)
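
Putting the pieces from this thread together, the proxy vhost might look roughly like the sketch below. It assumes mod_proxy and mod_proxy_http are loaded and uses the www.example.com and origin.example.com hostnames from earlier in the thread; `*:80` is a placeholder address:

```apache
<VirtualHost *:80>
    ServerName www.example.com

    # Standard reverse proxy to the origin.
    ProxyPass        / http://origin.example.com/
    ProxyPassReverse / http://origin.example.com/

    <IfModule pagespeed_module>
        ModPagespeed on
        # Rewrite origin.example.com resource URLs to www.example.com.
        ModPagespeedMapRewriteDomain www.example.com origin.example.com
        ModPagespeedEnableFilters rewrite_domains
        # Also rewrite <a href="..."> links, not only resources.
        ModPagespeedDomainRewriteHyperlinks on
    </IfModule>
</VirtualHost>
```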

Quinn