Magento CSRF Form Key Handling with Varnish / Pagespeed


Guillaume Ziegler

Apr 24, 2014, 4:36:40 AM
to ngx-pagesp...@googlegroups.com
I have a problem with Pagespeed. 

To use CSRF with Magento, my request is  

PageSpeed rewrites my request to:
href="http://table-de-massage-electrique.fr/wishlist/index/add/product/61/form_key/%3Cesi:include%20src=" http:="" table-de-massage-electrique.fr="" turpentine="" esi="" getformkey="" ttl="" 3600="" method="" scope="" global="" access="" private="" "="">/"

I tried:
pagespeed Disallow "*/turpentine/esi/*";

But it doesn't work.

Do you have any idea? Thank you in advance.

Maksim Orlovich

Apr 24, 2014, 9:22:04 AM
to ngx-pagesp...@googlegroups.com
Do you have something running after ngx_pagespeed but before the
browser that interprets the <esi:include> part? That's the only way I
can see this working, since the output is essentially the correct
HTML5 parse of the input, with a bit of extraneous %-encoding (and an
extra = attribute). This is of course because the href="" attribute is
closed by the src=" right after <esi:include.

A workaround would be to use two different kinds of quotes in your
document (which may be annoying or impossible). Options won't help,
since this happens at the HTML parser level (and may be "unfixable":
I am having a hard time seeing how to parse mixtures of HTML with
arbitrary unknown syntaxes). Of course, if we could somehow get the
esi:include interpreter to run before us, that would work best.
> --
> You received this message because you are subscribed to the Google Groups
> "ngx-pagespeed-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to ngx-pagespeed-di...@googlegroups.com.
> Visit this group at http://groups.google.com/group/ngx-pagespeed-discuss.
> For more options, visit https://groups.google.com/d/optout.

Jeff Kaufman

Apr 24, 2014, 9:42:25 AM
to ngx-pagesp...@googlegroups.com
Do you have LoadFromFile on? If so you need to use LoadFromFileRule
to blacklist dynamic pages.


Guillaume Ziegler

Apr 24, 2014, 9:45:18 AM
to ngx-pagesp...@googlegroups.com
In recent versions of Magento (CE 1.8+ and EE 1.13+), CSRF protection was added to several additional forms, where previously it had only really been used on login form (in the frontend).
https://github.com/nexcess/magento-turpentine/blob/master/TECHINCAL_NOTES.md#csrf-form-key-handling

Guillaume Ziegler

Apr 24, 2014, 9:46:16 AM
to ngx-pagesp...@googlegroups.com
I don't have LoadFromFile on.

Jeff Kaufman

Apr 24, 2014, 10:05:02 AM
to ngx-pagesp...@googlegroups.com
Looking at https://github.com/nexcess/magento-turpentine#how-it-works

"""
The extension works in two parts, page caching and block (ESI/AJAX)
caching. A simplified look at how they work:

For pages, Varnish first checks whether the visitor sent a frontend
cookie. If they didn't, then Varnish will generate a new session token
for them. The page is then served from cache (or fetched from the
backend if it's not already in the cache), with any blocks with ESI
polices filled in via ESI. Note that the cookie checking is bypassed
for clients identified as crawlers (see the Crawler IP Addresses and
Crawler User Agents settings).

For blocks, the extension listens for the
core_block_abstract_to_html_before event in Magento. When this event
is triggered, the extension looks at the block attached to it and if
an ESI policy has been defined for the block then the block's template
is replaced with a simple ESI (or AJAX) template that tells Varnish to
pull the block content from a separate URL. Varnish then does another
request to that URL to get the content for that block, which can be
cached separately from the page and may differ between different
visitors/clients.
"""

This is not going to work. Your current setup is:

generate html > pagespeed > magento

Unfortunately the html you're feeding to pagespeed has ESIs that are
intended for magento, but pagespeed sees them as invalid html.

I think you can fix this by running pagespeed after your ESIs are interpreted:

generate html > magento > pagespeed





Guillaume Ziegler

Apr 24, 2014, 10:08:46 AM
to ngx-pagesp...@googlegroups.com
To use CSRF with Magento, my request is  

PageSpeed rewrites my request to:
href="http://table-de-massage-electrique.fr/wishlist/index/add/product/61/form_key/%3Cesi:include%20src=" http:="" table-de-massage-electrique.fr="" turpentine="" esi="" getformkey="" ttl="" 3600="" method="" scope="" global="" access="" private="" "="">/"

The response must be:

Magento checks the form_key in the controller that the form posts to.

Jeff Kaufman

Apr 24, 2014, 10:17:26 AM
to ngx-pagesp...@googlegroups.com
This is what's happening without pagespeed:

1. You generate html that looks like <a href="...<esi:include src="...">>
2. Magento sees that, and changes it to <a href="...QJ0yiI3LzxQKZAXH">

With pagespeed:

1. You generate html like <a href="...<esi:include src="...">>
2. PageSpeed tries to parse that as html and fails, because that's not
valid html. In failing it also mangles it into <a href="..." esi="",
...>.
3. Magento no longer sees <esi:include src="..."> and so doesn't
substitute in QJ0yiI3LzxQKZAXH.

If you make pagespeed run after magento you should get what you want:

1. You generate html like <a href="...<esi:include src="...">>
2. Magento sees that, and changes it to <a href="...QJ0yiI3LzxQKZAXH">
3. PageSpeed processes the page, sees only html, doesn't break things.
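
In a Turpentine setup the ESI tags are resolved by Varnish, so one way to get that ordering is to make nginx with ngx_pagespeed the outermost layer and have it proxy to Varnish. A minimal sketch, not a tested configuration: the Varnish address 127.0.0.1:6081, the host name example.com, and the cache path are all placeholders assumed for illustration, and none of them come from this thread.

```nginx
# Sketch only. Assumed request path:
#   browser -> nginx + ngx_pagespeed -> Varnish (ESI) -> Magento
server {
    listen 80;
    server_name example.com;                            # placeholder host name

    pagespeed on;
    pagespeed FileCachePath /var/ngx_pagespeed_cache;   # placeholder path, must be writable by nginx

    location / {
        # Varnish has already replaced each <esi:include> by the time the
        # response reaches this server, so PageSpeed only parses plain HTML.
        proxy_pass http://127.0.0.1:6081;               # assumed Varnish listen address
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # An empty value stops nginx from forwarding Accept-Encoding,
        # so the upstream returns uncompressed HTML.
        proxy_set_header Accept-Encoding "";
    }
}
```

With this layout Varnish no longer faces the browser directly, so whatever caching it did for the rewritten HTML now sits behind PageSpeed rather than in front of it.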

Guillaume Ziegler

Apr 24, 2014, 10:43:48 AM
to ngx-pagesp...@googlegroups.com

Jeff Kaufman

Apr 24, 2014, 10:55:39 AM
to ngx-pagesp...@googlegroups.com
Your situation is different from the one the documentation is
addressing because you're emitting invalid html and then fixing it
inside varnish. There are advantages to running varnish in front of
pagespeed (more efficient) but that's simply not going to work in your
case because of the ESI.


Guillaume Ziegler

Apr 24, 2014, 3:31:48 PM
to ngx-pagesp...@googlegroups.com
If PageSpeed thought it was a URL, it shouldn't rewrite it, given the instruction
pagespeed Disallow "*/turpentine/esi/*";

It rewrites:
to:
http:="" table-de-massage-electrique.fr="" turpentine="" esi="" getformkey="" ttl="" 3600="" method="" scope="" global="" access="" private="" "="

Jeff Kaufman

Apr 24, 2014, 3:49:55 PM
to ngx-pagesp...@googlegroups.com
On Thu, Apr 24, 2014 at 3:31 PM, Guillaume Ziegler
<guillaume....@gmail.com> wrote:
> If pagespeed though it was an url, it shouldn't rewrite with the instruction
> pagespeed Disallow "*/turpentine/esi/*";
>

The problem is very early on in pagespeed's processing of your file.
PageSpeed can't handle html that has things like <a <b>>. That's not
valid html, and it breaks our parser before it gets far enough along
in the process to locate a url and determine whether it's allowed for
rewriting.

If you want pagespeed to optimize your html, you first need to run it
through ESI interpretation.

Joshua Marantz

Apr 24, 2014, 3:53:39 PM
to ngx-pagesp...@googlegroups.com
FWIW the Apache module, mod_pagespeed, had to make a change some years back to run before mod_includes for exactly the same reason.

-Josh




Jeff Kaufman

Apr 24, 2014, 4:09:13 PM
to ngx-pagesp...@googlegroups.com
On Thu, Apr 24, 2014 at 3:53 PM, 'Joshua Marantz
<jmar...@google.com>' via ngx-pagespeed-discuss
<ngx-pagesp...@googlegroups.com> wrote:
> FWIW the Apache module, mod_pagespeed, had to make a change some years back
> to run before mod_includes for exactly the same reason.

After mod_includes, no?

Guillaume Ziegler

Apr 24, 2014, 4:15:11 PM
to ngx-pagesp...@googlegroups.com
I found the directive (Specifying Additional URL-Valued Attributes, https://developers.google.com/speed/pagespeed/module/domains):

pagespeed UrlValuedAttribute esi:include src hyperlink;

Jeff Kaufman

Apr 24, 2014, 4:24:18 PM
to ngx-pagesp...@googlegroups.com
On Thu, Apr 24, 2014 at 4:15 PM, Guillaume Ziegler
<guillaume....@gmail.com> wrote:
> pagespeed UrlValuedAttribute esi:include src hyperlink;

I wouldn't expect this to work. The pagespeed html parser is not
going to understand <a
Running pagespeed after magento should work, though.

Michel Brito

May 19, 2014, 2:11:22 AM
to ngx-pagesp...@googlegroups.com
Hello guys,

Is there any solution?
I have the same problem.

My request is:

But pagespeed rewrites my request to:
<a href="http://mysite.com/form_key/<esi:include src=" http: mysite.com turpentine esi getFormKey ttl 3600 method esi scope global access private " />/">

Otto van der Schaaf

May 19, 2014, 4:22:39 PM
to ngx-pagesp...@googlegroups.com
Like Jeff stated earlier, I think that the proper solution would be to get the ESI includes to be processed before pagespeed runs.

Even if you could fix this immediate problem without doing so, PageSpeed *will* function suboptimally, as it can only see the HTML in parts, not the whole.
It might even break things when it attempts to optimize small shards of HTML that get injected into the page where the original ESI tag was encountered.

I think you could have a go at getting the order of things correct with setting up a reverse proxy that does pagespeed optimization,
in front of the webserver that generates the content and processes the ESI includes.
This would probably involve adding a server{} block with pagespeed on, that uses proxy_pass to forward requests to
the webserver. 

Otto van der Schaaf

May 19, 2014, 4:27:42 PM
to ngx-pagesp...@googlegroups.com
Sorry - reading back, it looks like Varnish is actually doing the ESI processing in front of nginx+ngx_pagespeed, right?
In that case, you can't fix it the way I was suggesting.

You could try Apache Traffic Server, which is able to process ESI includes and do pagespeed optimization
in a configurable order as a reverse proxy. 

Otto

Jeff Kaufman

May 20, 2014, 2:29:18 PM
to ngx-pagesp...@googlegroups.com
A bit of searching turns up https://github.com/taf2/nginx-esi,
which might also do what you want if it runs before ngx_pagespeed.

Jeroen Vermeulen

Jun 12, 2014, 10:05:51 AM
to ngx-pagesp...@googlegroups.com
I have probably found a workaround.
I described it in the discussion on GitHub: https://github.com/nexcess/magento-turpentine/issues/479

Jeff Kaufman

Jun 12, 2014, 10:20:37 AM
to ngx-pagesp...@googlegroups.com
I'm glad you found a solution that works for you, but I should warn
you that this is fragile and could easily break. Sending non-html to
PageSpeed isn't something we can support, and we can't guarantee that
PageSpeed will always mangle your non-html the same way or even
preserve all the information.

The supported way to do this is to interpret ESIs before they get to PageSpeed.