Subtle difference in the HTML data mod_pagespeed unzips for Firefox versus Chrome


Alex Wu

Feb 2, 2015, 2:33:55 PM
to mod-pagesp...@googlegroups.com
From Firefox, the first batch of bytes comes down as:
\x1f\x8b\b\xccW\xebn\xdb6\x14\xfe\x9f\xa78\xe3\x80\xc5\x06"kk\x11\x0c\xb3\xad\x0c]\x104\xfb\x13\x14\xe8\x8c\xfe4\x18\x91\x96\x98\xd0\xa4JRV\xdc

Unzipped by mod_pagespeed:
\n  \xc2\xa0\n<html lang="en">\n<head>\n  <title>Google AdWords: Create Google Account</title>\n


From Chrome, the data comes down as:
\x1f\x8b\b\xec|i{\xa3\xb8\xb2\xf0\xf7\xfe\x15\x84\xb9\x9d\x981\xde\xed\xc4\x8eC\xf2&\xce\xbe\xef\xeb\xc9\xf4#@6\xc4\x18\b`;N'\xff\xfd\xad\x92

The unzipped data from PageSpeed:
\n<html lang="en">\n<head>\n  <title>Google AdWords: Create Google Account</title>\n  <link rel="stylesheet" href="/adwords/signup.css" type="text/css"/>\n

The extra \n  \xc2\xa0\n at the beginning of the Firefox response prevents PageSpeed's HTML parser from parsing it, so the rest of the optimization filters are not applied.
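To see what those leading bytes actually are, they can be decoded as UTF-8 and each character looked up by name. This is only a small check on the prefix copied from the dump above; the variable names are illustrative and nothing here is PageSpeed code.

```python
import codecs
import unicodedata

# Leading bytes of the unzipped Firefox response, copied from the dump above.
prefix = b"\n  \xc2\xa0\n"

# Decode as UTF-8 and name every character; control characters such as
# "\n" have no Unicode name, so fall back to their repr().
text = prefix.decode("utf-8")
names = [unicodedata.name(ch, repr(ch)) for ch in text]

# A UTF-8 byte-order mark would be the three bytes EF BB BF.
has_bom = codecs.BOM_UTF8 in prefix

print(names)
print("contains UTF-8 BOM:", has_bom)
```

The pair \xc2\xa0 is the UTF-8 encoding of U+00A0 (NO-BREAK SPACE), not a byte-order mark.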


The request is to Google AdWords:


 https://adwords.google.com/um/StartDeclineInvite?itk=TfMJmksBAAA.NSkD0IOQNnUK8AaP2BCXdrE-pi368DM2YvBWQzbkWsA.tX3OJ_H2fnBwCef6P0uqmA&hl=en_US

I am not sure who is at fault for such a subtle difference.

Alex

Joshua Marantz

Feb 2, 2015, 2:51:10 PM
to mod-pagespeed-discuss
Hi Alex,

I suppose those are byte-order markers, which PageSpeed should ignore at the beginning of an HTML stream. However, from what you pasted, it looks like they are not at the beginning of the stream but are preceded by a newline and a space. If this is legal as an HTML response, then it makes sense to file a bug to strip all leading whitespace and BOMs before HTML-sniffing.

This is all in the context of a forward proxy, right?

-Josh


Jeff Kaufman

Feb 2, 2015, 3:05:54 PM
to mod-pagespeed-discuss

Jeffrey Crowell

Feb 2, 2015, 3:12:08 PM
to mod-pagesp...@googlegroups.com

Jan-Willem Maessen

Feb 2, 2015, 3:12:15 PM
to mod-pagesp...@googlegroups.com
Everyone's friend non-breaking space strikes again!


So it sounds like we ought to strip leading nbsp as well.  I can only imagine this is the output of some nbsp-loving tool.
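A rough sketch of what stripping leading whitespace, BOMs, and nbsp before sniffing could look like. The helper names are hypothetical, and the "starts with <" test is only a stand-in assumption for whatever check PageSpeed really performs, which is not shown in this thread.

```python
import codecs

# Byte sequences to skip before sniffing: a UTF-8 BOM, a UTF-8 encoded
# non-breaking space (U+00A0), and ASCII whitespace.
# Hypothetical list for illustration, not taken from PageSpeed.
_SKIPPABLE = (codecs.BOM_UTF8, b"\xc2\xa0", b" ", b"\t", b"\r", b"\n", b"\x0c")


def strip_leading_noise(body: bytes) -> bytes:
    """Return body with any leading run of skippable sequences removed."""
    pos = 0
    while True:
        for token in _SKIPPABLE:
            if body.startswith(token, pos):
                pos += len(token)
                break
        else:
            # Nothing skippable at pos: the real content starts here.
            return body[pos:]


def looks_like_html(body: bytes) -> bool:
    """Simplified stand-in for HTML sniffing: first real byte is '<'."""
    return strip_leading_noise(body).startswith(b"<")


# Prefixes copied from the two unzipped responses earlier in the thread.
firefox = b'\n  \xc2\xa0\n<html lang="en">\n<head>'
chrome = b'\n<html lang="en">\n<head>'
print(looks_like_html(firefox), looks_like_html(chrome))
```

With the nbsp in the skip list, both the Firefox and Chrome prefixes reduce to the same bytes starting at <html, so a sniffer of this shape would treat them alike.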


Jeff Kaufman

Feb 2, 2015, 3:24:21 PM
to mod-pagespeed-discuss