Ideas so far

Eddie Jaoude

Jun 3, 2010, 2:08:20 PM
to Web Testing Framework
There are some great ideas being suggested! However, there is one
suggestion, by Pat (sorry dude), that I actually disagree with.

> In my opinion, we shouldn't even be thinking about things like capturing PageSpeed or YSlow at the browser.
There is no reason why we should/could not capture results from users'
browsers with the beaconing facility. Using HAR files as well is a good
idea too; I think both are required.

I have posted a message on HttpFox's group asking about a beaconing
facility, as it would be a great tool (similar to HttpWatch) to
integrate as well (users can beacon results, upload a HAR file, or have
an automated daily monitor).

Patrick Lightbody

Jun 3, 2010, 2:27:27 PM
to web-testin...@googlegroups.com
No worries, Steve beat you to the punch slamming that idea already ;)
No offense taken at all; I'll just try to explain why I think it's
something to consider:

Pulling performance stats from browsers is *hard* work - probably the
hardest part of the whole thing. Just look at all the work Pat Meenan
has had to do, or that WebMetrics does, or that BrowserMob does. Now
do the same for Firefox, Opera, Safari, Chrome, mobile devices, and
whatever comes out next.

PageSpeed and YSlow started their lives (and continue to do so) as
Firefox-specific tools, but the core of what they do (page performance
analysis) really isn't Firefox-specific. I want to avoid having to
create a PageSpeed/YSlow for every browser and having new proprietary
APIs to pull that data. We already have to do that to get HAR data,
let's not multiply the work by 2X or 3X.

Yes, I understand there are data inputs that they need that HAR does
not provide. But maybe we should move the goalposts around HAR and get
those data inputs (such as JS performance) in there. Or if not HAR,
some other standard data structure that sits on top of HAR. That way
we have a chance at A) getting browser vendors to provide a conduit to
that data natively (holy grail), and B) allowing innovation with
YSlow/PageSpeed-style analysis without getting bogged down in browser
internals.

So while I agree we _can_ capture PageSpeed/YSlow beacons and that we
_should_ do the analysis they provide, it strikes me as more efficient
if we focus our energy on pulling out data in a cross-browser
compatible manner and then do analysis in a way that is offline and
works for all browsers.

Hopefully that clarifies my position a bit more.

Patrick

--
We provide FREE website monitoring and load testing
http://browsermob.com

Patrick Lightbody
Founder, BrowserMob
+1 (503) 828-9003 x 101

Patrick Meenan

Jun 3, 2010, 6:33:57 PM
to web-testin...@googlegroups.com
At least from my experience, post-processing (if possible) tends to be a LOT
easier to manage. There are certain things you can only do in the browser,
but any time you need to change the logic you need to do a code update on
all of the test machines. It also means that you can't re-process old data
using the new rules. No doubt we won't be able to do everything in
post-processing, but we should try to defer as much as possible (maybe with
flexibility to allow for it to also be calculated at the edge).
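
As a toy illustration of why deferral pays off, here is a sketch of
offline rule processing (the rule, names, and file layout are invented
for the example, not anything we've settled on): a rule change just
means re-running the script over previously stored HARs, with no agent
update.

# Sketch only: rules run offline over stored raw results.
import glob
import json

def rule_total_bytes(har):
    # Example rule: flag pages with more than 1MB of response bodies.
    entries = har["log"]["entries"]
    total = sum(e["response"]["bodySize"] for e in entries
                if e["response"]["bodySize"] > 0)
    return {"rule": "total_bytes", "bytes": total, "pass": total < 1000000}

RULES = [rule_total_bytes]  # add or change rules without touching agents

for path in glob.glob("results/*.har"):  # previously captured data
    with open(path) as f:
        har = json.load(f)
    print(path, [rule(har) for rule in RULES])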

I posted a doc with the initial component thoughts and everyone currently in
the group should have permission to go in and modify it. The GUI components
may end up not being important to discuss in this context as they are just
API consumers and don't expose an API of their own.

Thanks,

-Pat

Patrick Lightbody

Jun 3, 2010, 6:56:00 PM
to web-testin...@googlegroups.com
Agreed, all great points.

I just added some comments to the doc (look for the [PL: ... ])

My gut says that if I had to prioritize the various components, I'd
personally start with:

1) Results Storage (particularly locking down the API; see the sketch
after this list)
2) Browser Automation Engine (particularly the extraction of the perf
data to pass in to the above API)
3) Processing Engine
4) Task Manager
5) Front-end/GUIs
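
For #1, here is a minimal sketch of what a locked-down results-storage
("HAR server") surface might look like. The routes, fields, and use of
Flask are placeholder choices for illustration, not a proposal:

# Hypothetical HAR server surface; in-memory storage stands in for a
# real back-end, and the routes are invented for illustration.
from flask import Flask, jsonify, request

app = Flask(__name__)
HARS = {}

@app.route("/hars", methods=["POST"])
def save_har():
    har = request.get_json()
    har_id = len(HARS) + 1
    HARS[har_id] = har
    return jsonify({"id": har_id}), 201

@app.route("/hars/<int:har_id>", methods=["GET"])
def get_har(har_id):
    return jsonify(HARS[har_id])

if __name__ == "__main__":
    app.run()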

Patrick

lennysan

Jun 3, 2010, 7:14:46 PM
to Web Testing Framework
Another couple of reasons for post-processing, which may have been
mentioned already, are reducing the observer effect (i.e. introducing
timing delays) and not hurting the scalability of running multiple
tests in parallel on one machine. I know the "Page Speed Activity"
measurement currently really hurts the browser's performance (which
I'm told will be fixed), but the less that happens in the browser the
more accurate and scalable the tests should be.

One other component that will probably be necessary is some sort of
watchdog process to restart zombied browsers or enforce timeouts that
aren't correctly honored. This could be lumped in with the task manager.

Patrick Meenan

Jun 3, 2010, 7:29:15 PM
to web-testin...@googlegroups.com
My thinking would be that the browser automation engine would be a
completely black box that included everything running on the test machine
(including managing the browser processes, watchdog, etc). I know this is a
little more than what (other) Patrick was thinking about but it eliminates
any concerns about OS or browser-specific knowledge.

I didn't want to jump the gun too much by throwing out just my thoughts, but
I'd love to see all of the interfaces be REST HTTP interfaces, with any
process-specific stuff contained within each box.

What I was proposing as the task manager would essentially be the
centralized logic that knows how to hand work out to various test agents
and schedule tests, but wouldn't do any browser control itself. An
interesting point of discussion would be whether the task manager pushes
to test agents or the test agents poll the central scheduler (though this
is probably getting a little too in the weeds already).
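
For the sake of discussion, the poll variant might look something like
this (the scheduler URL, endpoints, and payload shape are all made up
for illustration):

# Hypothetical poll-based test agent loop; every endpoint and field
# name here is invented, not a proposed interface.
import json
import time
import urllib.request

SCHEDULER = "http://scheduler.example.com"

def run_browser_and_capture(url):
    # Placeholder for the black-box browser automation engine.
    raise NotImplementedError

while True:
    with urllib.request.urlopen(SCHEDULER + "/next-task") as resp:
        task = json.load(resp)  # e.g. {"id": 1, "url": "http://..."}
    if not task:
        time.sleep(10)  # nothing queued; back off and poll again
        continue
    har = run_browser_and_capture(task["url"])
    req = urllib.request.Request(
        SCHEDULER + "/results/" + str(task["id"]),
        data=json.dumps(har).encode(),
        headers={"Content-Type": "application/json"},
        method="POST")
    urllib.request.urlopen(req)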

If I can carve out some time tomorrow I'll try to put a picture together of
what I was thinking for people to throw darts at. As a bunch of us have
some pretty large-scale deployments, I expect there are multiple ways to
solve the problem, and I don't want to push my way too hard, so please feel
free to shred it and tell me I'm an idiot :-)

Adrian Yee

Jun 3, 2010, 8:17:45 PM
to Web Testing Framework
On Jun 3, 11:27 am, Patrick Lightbody <patr...@browsermob.com> wrote:
> PageSpeed and YSlow started their lives (and continue to do so) as
> Firefox-specific tools, but the core of what they do (page performance
> analysis) really isn't Firefox-specific. I want to avoid having to
> create a PageSpeed/YSlow for every browser and having new proprietary
> APIs to pull that data. We already have to do that to get HAR data,
> let's not multiply the work by 2X or 3X.
>
> Yes, I understand there are data inputs that they need that HAR does
> not provide. But maybe we should move the goalposts around HAR and get
> those data inputs (such as JS performance) in there. Or if not HAR,
> some other standard data structure that sits on top of HAR. That way
> we have a chance at A) getting browser vendors to provide a conduit to
> that data natively (holy grail), and B) allowing innovation with
> YSlow/PageSpeed-style analysis without getting bogged down in browser
> internals.
>
> So while I agree we _can_ capture PageSpeed/YSlow beacons and that we
> _should_ do the analysis they provide, it strikes me as more efficient
> if we focus our energy on pulling out data in a cross-browser
> compatible manner and then do analysis in a way that is offline and
> works for all browsers.

I'm up for moving the performance analysis out of the browser, but if
we are only able to move some of it out, are we creating more trouble
for ourselves? Every browser would still need the code to send the
other data needed to make the analysis, and we would need to come up
with another HAR-type format (or cram it into the HAR). For certain
Page Speed/YSlow recommendations, it would mean moving code that's
probably easier to handle in the browser (e.g., checking to see if
scaled images are being served). On the other hand, we do get the
benefit of being able to get a basic performance analysis from just
the HAR file, which would be independent of which browser the HAR file
was generated from.

Adrian

Patrick Meenan

Jun 3, 2010, 8:33:26 PM
to web-testin...@googlegroups.com
I'm not familiar with all of the rules across all of the tools (well, not
enough to know where the gaps are). Maybe a good exercise would be to take an
inventory and see what checks could be done against a HAR 1.2 file (which
allows for binary image data), which could be done if we dumped a subset of
the DOM and which can't be done outside of the browser (or requiring more
custom exports).

At a minimum, we might be able to come up with a subset of the rules
that can be handled this way and leave the other rules for users of the
actual desktop plugin. Enough sites so utterly fail to even get the
basics right that we don't necessarily need to solve for 100%.

-Pat

-----Original Message-----
From: web-testin...@googlegroups.com
[mailto:web-testin...@googlegroups.com] On Behalf Of Adrian Yee
Sent: Thursday, June 03, 2010 8:18 PM
To: Web Testing Framework
Subject: Re: Ideas so far

Adrian Yee

Jun 3, 2010, 8:56:55 PM
to web-testin...@googlegroups.com
Better upgrade your hard drives, all those HAR files are going to be
huge! :)

Adrian

Bryan McQuade

Jun 4, 2010, 2:31:35 PM
to Web Testing Framework
Hi,

Patrick pointed me to this list. This is a great discussion. I'm the
lead developer on Page Speed so I can add some info from the Page
Speed side of things.

Page Speed has a browser-independent SDK, and we currently have a
command-line version of Page Speed that takes HAR as input and emits a
simple text listing of scores as output. It's a proof of concept more
than anything, but it works. You could take the core code used to build
this command-line tool and have it do other things as well, such as link
it into other binaries, have it emit JSON instead of text, and so on.
It's all pretty easy to do.

The binary is available here if you're interested:
http://page-speed.googlecode.com/files/har_to_pagespeed.exe (no Linux
or Mac builds just yet, and an updated version with a few more rules is
coming soon)
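
If you want to drive it from another process, something like the
following would do. This is a sketch only; whether the binary takes the
HAR path as an argument, and what its text output looks like exactly,
are assumptions here:

# Sketch: shelling out to har_to_pagespeed and collecting its
# plain-text score output; the argument handling is assumed.
import subprocess

def pagespeed_scores(har_path):
    result = subprocess.run(
        ["./har_to_pagespeed.exe", har_path],
        capture_output=True, text=True, check=True)
    return result.stdout  # plain-text scores; parse as needed

print(pagespeed_scores("example.har"))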

Indeed there are some Page Speed rules that do require additional
inputs. We mostly require DOM access. Our current DOM API is pretty
simple. You can browse it here: http://page-speed.googlecode.com/svn/lib/trunk/src/pagespeed/core/dom.h
(note that it was just updated today so if you've looked at it
previously, it's changed a bit). Basically we need to traverse the
DOM, look at attributes, and check the width/height of DOM elements.
Pretty simple.

Having just HAR will get you about 80% coverage of our rules. HAR+this
DOM API gets you to 95%. We still have 2 rules that need additional
data: one looks for unused JS and the other looks for unused CSS.
Both need data that we currently reach directly into Firefox for.
It's not clear how to generalize them to
other browsers but we can get there eventually.

So either adding a dump of the DOM to the HAR format, or having an
additional file that represents DOM information, will probably be
sufficient to run these kinds of tools.
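
For illustration, a DOM dump covering those needs could be as small as
this (the field names and the sidecar-file convention are invented
here, not a format anyone has specified):

# Illustrative only: one possible shape for a DOM dump carried as a
# sidecar file next to a HAR, covering traversal, attributes, and
# element width/height.
import json

dom_dump = {
    "documentUrl": "http://example.com/",
    "nodes": [
        {"tag": "IMG",
         "attributes": {"src": "/logo.png", "width": "32"},
         "clientWidth": 32, "clientHeight": 32,
         "children": []},
    ],
}

with open("example.dom.json", "w") as f:
    json.dump(dom_dump, f, indent=2)  # sits next to example.har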

To capture the DOM (and possibly generate HAR) I've been looking into
building a variant of the Chromium/WebKit DumpRenderTree, which runs a
true headless browser (no Xvfb or anything like that needed), fetches
a URL, and currently dumps a text representation of the render tree to
the console. I expect it to be a bit more stable and faster than the
Xvfb-based Firefox solutions I've used in the past.

I'm talking to the Chromium folks about extending this to dump HAR+DOM.
If we can get that working then we'll have a nice command-line tool that
we can source WebKit-based HAR+DOM dumps from. Other folks have built IE
(WebPagetest) and Firefox (ShowSlow) automation tools, so we would have
all 3 major browsers covered at that point.

I do think there is a lot of value to running Page Speed/YSlow in the
browser for iterative development, but I agree there's also a lot of
value to being able to run these tools outside of the browser. We
built the Page Speed SDK to satisfy both needs.

Patrick Lightbody

Jun 5, 2010, 9:01:14 PM
to web-testin...@googlegroups.com
Bryan,
Thanks for checking in with us and clarifying a few things.

IMO, we should try to make most of this effort work under the
assumption that we can eventually get 100% of the data we need to
process recommendations from PageSpeed and YSlow offline.

For the stuff that we can't (currently) get out, perhaps this is a
place to take advantage of HAR's custom extensions (keys that start
with _ I believe) and encode custom data/results there?
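
For example (a sketch; the "_pagespeed" key and its contents are
invented for illustration, though the spec does require custom field
names to start with an underscore):

# Sketch: stashing analysis results in a HAR custom field.
import json

with open("example.har") as f:
    har = json.load(f)

har["log"]["_pagespeed"] = {  # hypothetical custom extension
    "score": 87,
    "rules": {"unused_js_bytes": 10240},
}

with open("example.har", "w") as f:
    json.dump(har, f)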

Patrick

ed...@jaoudestudios.com

Jun 7, 2010, 1:49:51 AM
to web-testin...@googlegroups.com
Thanks Bryan, it's good to hear that it is possible for PageSpeed to be browser independent & possibly run outside the browser. As much as it pains me to say it, it would be good to have results from the various IEs as well.

@Pat
Using HAR's custom extension sounds interesting.




Sent using BlackBerry® from Orange


Bryan McQuade

Jun 7, 2010, 6:44:18 AM
to web-testin...@googlegroups.com
Yes, I agree about IE. Patrick Meenan's WebPageTest should be a good
source for this data.

BTW, when I said "3 major browsers" I meant Firefox, IE, and the
WebKit-based browsers (Chromium and Safari together).

pors

Jul 27, 2010, 3:01:09 PM
to Web Testing Framework
Hi, I just joined this group and it seems the very first message for
this topic is a response to an older thread. Is that still available
somewhere?

Thx
Mark


On Jun 3, 9:08 pm, Eddie Jaoude <ed...@jaoudestudios.com> wrote:
> There are some great ideas being suggested! However, one suggestion by
> Pat (sorry dude), I actually disagree with.
>
> > In my opinion, we shouldn't even be thinking about things like capturing PageSpeed or YSlow at the browser.

Patrick Lightbody

Jul 29, 2010, 1:50:13 AM
to web-testin...@googlegroups.com
Mark,
Not sure if I have the original thread. Probably best that we just
start some new discussions. Maybe you can kick one off introducing
yourself and why you're interested in the group. Hopefully that will
spawn some discussions from some of the other new members that joined
over the last few weeks.

Patrick

--
We provide FREE website monitoring and load testing
http://browsermob.com

Patrick Lightbody
BrowserMob
(w) +1 (503) 828-9003 x 101
(m) +1 (415) 830-5488

pors

Jul 29, 2010, 10:33:41 AM
to Web Testing Framework
Hi Patrick and the rest of this group!

I was introduced to this group by Patrick Meenan, whom I contacted
when I read about this initiative. I am the CTO of WatchMouse, a
website performance monitoring company.

As most of you probably do, I can see many advantages in
standardizing the way performance data can be imported/exported by the
different tools that exist.
I guess it comes down to specifying interfaces for both requesting and
retrieving (storing) performance data. The HAR format is a great start
here, I think; we at WatchMouse started supporting it recently in
our "Root Cause Analysis", and we plan to integrate it into more of
our services. We are willing to support a standardization effort, both
in helping to specify interfaces and in actually implementing them.
I missed the first part of this thread (with great ideas being
suggested, apparently), so if someone would like to share those, I (and
the other newcomers on this list) can get up to date for some further
discussion.

Cheers,
Mark


Patrick Meenan

Jul 29, 2010, 11:05:47 AM
to web-testin...@googlegroups.com
Sorry if it's painful to read; hopefully it comes across OK. Here is the
mail thread that started before we set up the Google group and moved the
discussion over:


On 6/3/2010 1:46 PM, Souders, Steve wrote:
[let's move this to web-testin...@googlegroups.com when everyone's
there]

It's very cool to see these tools evolving and the potential they have as
they fit together. We're really at the beginning of a new space that's going to be
big. HAR was just an idea that has flourished. It's very likely whatever we
settle on here will become pervasive and a standard within a year.

Wrt offline YSlow & Page Speed - some critical rules (% of JS executed)
aren't feasible offline.

Wrt scripting languages - we'll definitely need this eventually, but right
now most (all?) web sites don't have the basic performance metrics on their
main URLs. We need to keep scripting languages in mind, but it's
definitely a phase 2 or 3 item IMO.

The phase 1 goal for me would be that web site owners have a chart of page
load time, total page weight, and YSlow and/or Page Speed score plotted over
time. And then we can add drill-down capabilities.

-Steve


On 6/3/2010 9:31 AM, Rachitsky, Lenny wrote:
I'm 100% in on this too. The things that I see as more important are having
a consistent scripting language that you can use across services/tools (e.g.
Selenium/Webdriver), and being able to compare apples-to-apples when looking
at performance metrics amongst different services/browsers (e.g. HAR,
Firebug, Speed Tracer, PageTest). This would have the biggest impact imho,
especially in the commercial monitoring space.

Recently I've been working on adding support for exporting to HAR from our
monitoring service, which will allow us to integrate Page Speed/YSlow
recommendations and eventually integrate directly with the tools we're
talking about here.

On 6/3/10 6:46 AM, "Patrick Lightbody" <pat...@browsermob.com> wrote:
I'm absolutely behind this effort. I think all the parts can and should have
at least one open source implementation over time, but the most critical
shared components are likely:

- Browser automation (ie: Selenium)
- Automated HAR extraction (ie: PageTest for IE, Firebug+NetExport+XYZ for
Firefox, Proxy-based approach for other browsers, etc)
- "HAR Server" that provides standard set of APIs to save, organize, and
query HAR resources

In my opinion, we shouldn't even be thinking about things like capturing
PageSpeed or YSlow at the browser. We should just capture HAR and work with
those teams to get to the point where we can generate complete reports
entirely from offline HAR captures. And if HAR 1.1 or 1.2 doesn't contain
enough data to do that, then we evolve HAR as well.

I think once those three components are taken care of, all the various open
source, free, and commercial interests will sort of work themselves out and
we'll see other stuff, such as graphing or scheduling, open up naturally.

I'm attaching an incomplete slide deck that I've previously shared with Pat
Meenan and Steve Souders that captures my thinking around this. It's based
around two existing open source projects I've been tinkering with, both of
which are nascent efforts at the above three components:

- http://github.com/lightbody/browsermob-proxy
- http://github.com/lightbody/browsermob-page-perf

Pat: it would be great for you to set up a mailing list for us to pursue
this effort. The only question is what the name of the group/list should be?

--
We provide FREE website monitoring and load testing
http://browsermob.com

Patrick Lightbody
Founder, BrowserMob


On Thu, Jun 3, 2010 at 5:30 AM, Meenan, Patrick
<patrick...@corp.aol.com> wrote:
Just wanted to pull this off of the list for a bit so we don't have to spam
everyone else.

There have been at least 3 or 4 side conversations I've been having about
standardizing the interfaces for different parts of the web performance
testing stack. I think if we put our heads together we could carve up a
typical performance testing architecture into standard blocks that we all
use and come up with standard interfaces between the components. That way
we could mix and match open source components and proprietary services and
we don't all have to re-invent the wheel.

Does this sound reasonable? If so, I'll see about starting up a new group
where we can hash through some of the details. I'm willing to commit to
having Pagetest and WebPagetest use whatever standard we come to
agreement on, which at a minimum would open up IE testing for all of the
tools that are currently Firefox-only.

At a really high level, I'm thinking the sort of building blocks we'd be
talking about are:

- A browser automation engine that knows how to take tasks (pages/scripts),
run them and provide results
- A task manager that knows how to hand out jobs to the automation
engine(s), including support for one-off and recurring tasks
- Front-end components for taking in and managing work
- A back-end that can take the test results and store them (with standard
interfaces for retrieving the results)
- Processing engines that could plug into the back-end and evaluate the
results (may not make sense but post-processing performance checks may be a
good option to have)
- A GUI for presenting test results (individual and trended support)

The standardization would be around the interfaces in and out of each of
these components. That way we can share browser automation for example but
still leave room for commercial services to differentiate themselves.

Thoughts?

Thanks,

-Pat

