= The Hurdles =
1. What is the impact of Mozmill itself on performance in the first
place? I'm not even sure how to determine this.
2. How would you mitigate the noise from network transactions (cache
the pageset)?
3. How would you decide on a specific site to load, or a specific set
of sites that represents some reasonable sample of the web? Or does it
make sense to load the same site (or even about:blank) 1000 times? This
also depends on the application you're testing, obviously.
= The Decisions =
1. What are you trying to measure? Tab performance? Inbox
performance? Memory use? What does that mean?
2. How do you produce anything actionable from the results? This one is
sticky, and it pertains to all high-end load testing analysis. The
amount of data pumped through a system in a stress-testing situation is
often enormous, so how do you distinguish an adequate response by the
application from an incorrect response to that stimulus? How do you
then take your result and point developers at the issue? One way to do
this is to graph trends under steadily increasing load. I did this once
and discovered a system where the time from input to response grew
linearly with load in low-stress situations but exponentially when
under duress. That was a bug, and a similar analysis could possibly be
defined for Mozilla applications and tested using Mozmill.
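As a rough illustration of that kind of trend analysis (the numbers and
names below are hypothetical and not tied to any existing Mozmill
tooling), you could fit the measured response times against both a
linear and an exponential model and see which one tracks the data
better as load increases:

import numpy as np

# Hypothetical (load, response time in ms) pairs collected while
# steadily increasing the load on the application under test.
load = np.array([10.0, 20, 40, 80, 160, 320, 640])
response_ms = np.array([12.0, 25, 48, 110, 300, 900, 3100])

# Fit a straight line to the raw data (linear growth) and a straight
# line to the log of the response times (exponential growth), then
# compare how well each model explains the measurements.
lin_fit = np.polyfit(load, response_ms, 1)
lin_resid = response_ms - np.polyval(lin_fit, load)
lin_rms = np.sqrt(np.mean(lin_resid ** 2))

exp_fit = np.polyfit(load, np.log(response_ms), 1)
exp_resid = np.log(response_ms) - np.polyval(exp_fit, load)
exp_rms = np.sqrt(np.mean(exp_resid ** 2))

print("linear model RMS error: %.1f ms" % lin_rms)
print("exponential model RMS error (log space): %.3f" % exp_rms)
# If the exponential model starts winning once the load passes some
# knee, that knee is the data point to hand to developers.

The same kind of fit would work for memory per tab or per message,
which ties back to the "what are you trying to measure" question above.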
So, Gary's got a great idea. Neither he nor I have the bandwidth to
lead the project, though we'd both like to help. Do we have anyone
interested in taking a look at this? Are there thoughts about any of
the questions I've raised above? More questions to be raised?
Thanks,
Clint
If you saw a big jump in the differentials between two revisions, you
could also kick off a test comparing those two versions rather than
just the latest release and the current nightly.
Also, I'm saying nightly here but really you could use _any_ full
build. So if we have a build running for every revision, and we have
enough hardware to run all these tests against each build, we could
get a more granular idea of where the performance problem surfaced.
Another big advantage we have here is that mozmill is designed to run
identically on a local machine and in a continuous environment, and the
setup, even when running with Python, is pretty simple. Since the tests
run in parallel for comparisons, you could run them locally and get the
same kind of comparison that we do in continuous, which would allow for
better debugging by developers.
-Mikeal
In Windmill we do this by default, but we don't find it nearly as
useful as you would think. For one thing, the click event firing
returns once the event is done propagating, but since JavaScript is so
asynchronous, this doesn't actually measure the time it took for the
page code attached to that event to run.
I think what is more useful is the addition of manual timers. This
way, you could set your own timer before a click event and end it
after a waitForElement() with a really low eval interval.
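Stripped of the actual Windmill/Mozmill APIs (which I'm not quoting
here), the manual-timer pattern is basically the following sketch,
where do_click and element_present stand in for whatever the real test
calls are:

import time

def timed_interaction(do_click, element_present,
                      interval=0.01, timeout=10.0):
    # Start the clock ourselves, trigger the interaction, then poll
    # with a low eval interval until the page reaches the state we
    # care about.
    start = time.time()
    do_click()
    deadline = start + timeout
    while time.time() < deadline:
        if element_present():
            return time.time() - start
        time.sleep(interval)
    raise AssertionError("element never appeared within %.1fs" % timeout)

The elapsed time you get back covers the asynchronous work the page did
after the click, which is the part the raw click-event timing misses.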
I know it sounds like it would be nice to double all our functional
tests as performance tests, but what you end up finding is that you
aren't measuring what you would actually like to measure, and the
amount of data you're flooded with ends up not being very useful just
because of the sheer volume.
> To look at the larger picture of something like how long it takes to
> open the 1001th tab, or how the memory works with 1000 tabs open, we
> can craft a specific test for that.
>
> I would think that adding some basic tools into mozmill to get the #
> of threads, memory used/process, cpu time, and total time taken
> would be good. Then we could just query that from our test case and
> maybe publish a parallel set of data to the pass/fail results which
> include the raw perf metrics.
So, the way you would do this is to write some Python code that can
poll for all the local system performance information. Then in mozmill
you fire your own custom event from JavaScript whenever you want that
information logged, and you add a callback for that event in Python.
After all the tests have finished, you parse out the perf data on the
Python side.
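A minimal sketch of that Python side, assuming psutil is available for
the process polling; the event name and the listener-registration call
shown in the comments are assumptions, not the actual mozmill/jsbridge
API:

import psutil  # assumed dependency for cross-platform process stats

class PerfCollector(object):
    """Records a process snapshot each time the JS test fires a custom
    'perf.snapshot' event (hypothetical event name)."""

    def __init__(self, pid):
        self.process = psutil.Process(pid)
        self.samples = []

    def on_perf_event(self, event_data):
        # event_data is whatever the JS side attached to the event,
        # e.g. {'label': 'after-100-tabs'}.
        self.samples.append({
            'label': event_data.get('label'),
            'threads': self.process.num_threads(),
            'rss_bytes': self.process.memory_info().rss,
            'cpu_times': self.process.cpu_times(),
        })

# Hypothetical wiring -- the real registration call may differ, but the
# shape of the idea is: register a Python callback for the custom
# event, run the tests, then parse/publish the collected samples.
#
#   collector = PerfCollector(browser_pid)
#   mozmill_instance.add_listener(collector.on_perf_event,
#                                 eventType='perf.snapshot')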
This brings up a few points I think I failed to make:
1) mozmill performance tests will require being run from Python
2) each set of performance tests will probably have its own Python
script to launch and parse data, but those scripts will mostly use
tools already provided on the mozmill and jsbridge Python side.
-Mikeal