> Is there any chance we could keep track of variance? It's painful to get a
> new faster/slower number out of our infrastructure only to discover that the
> difference is noise.
Doing something like this is very important. We don't necessarily
need to track the variance in graphserver itself, so long as we have a
uniform way of computing the variance from the raw graphserver data.
Unfortunately, figuring out the variance of a test is very difficult
to do automatically.
You have outliers like the one here, with no apparent cause, plus you
have changes caused by code which is backed out. You need to
distinguish a momentary 5% bump in test scores due to a botched
checkin from a 5% bump caused by natural randomness.
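
To make the outlier half of this concrete, here is a rough sketch of a
spread estimate that a single stray run can't blow up: the median plus
the scaled median absolute deviation (MAD). This is only an
illustration, not anything graphserver does; robust_spread, is_outlier
and the threshold of 3.0 are made-up names and values. It also does
nothing to tell a backed-out checkin from random noise, which needs
the checkin history as well.

```python
import statistics


def robust_spread(scores):
    """Return (median, scaled MAD) for a list of test scores.

    The 1.4826 factor makes the MAD comparable to a standard deviation
    when the data happens to be normal; it is only a convention here.
    """
    med = statistics.median(scores)
    mad = statistics.median([abs(s - med) for s in scores])
    return med, 1.4826 * mad


def is_outlier(score, scores, threshold=3.0):
    """Flag a score that sits far from the median of the other runs.

    threshold is a hypothetical cutoff, measured in scaled-MAD units.
    """
    med, spread = robust_spread(scores)
    if spread == 0:
        # All runs are (nearly) identical, so any deviation stands out.
        return score != med
    return abs(score - med) / spread > threshold


# One stray run among otherwise stable scores.
runs = [100, 101, 99, 100, 102, 98, 100, 130]
print(robust_spread(runs))
print(is_outlier(130, runs), is_outlier(101, runs))
```

A plain standard deviation over the same runs would be inflated by the
130, which is exactly the situation where it becomes hard to tell a
real change from noise.
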
Changes in infra, even changes which aren't expected to affect
variance, sometimes have an effect as well. For example, in the bug
where I looked at Dromaeo scores [1], an infra change which altered
which machines got which jobs may have changed the variance, because
the test score depended on which object files were rebuilt! I
also saw in that bug that test scores were different on different
trees, again due to infra weirdness.
You also have to model the variance. Most tests' results aren't
normally distributed. Are they uniform in a range? Are they
something else? Some are bimodal. And the kind of distribution can
change over time; bimodality can appear or disappear as we fix or
regress things.
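
One way to sidestep the question of which distribution to assume is a
permutation test, which only asks how often a random relabeling of the
runs produces a difference as large as the one observed. The sketch
below is an assumption about how that might look, not an existing
tool; permutation_p_value, rounds and seed are hypothetical names. It
compares medians, so it can still miss a change that alters the shape
of the distribution (say, a result turning bimodal) while leaving the
median alone.

```python
import random
import statistics


def permutation_p_value(before, after, rounds=2000, seed=0):
    """Estimate how likely the observed median shift is under pure noise.

    Pools both sets of runs, reshuffles them many times, and counts how
    often a random split differs by at least the observed amount.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    observed = abs(statistics.median(after) - statistics.median(before))
    pooled = list(before) + list(after)
    n = len(before)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = abs(statistics.median(pooled[n:]) - statistics.median(pooled[:n]))
        if diff >= observed:
            hits += 1
    # The +1 terms keep the estimate away from an impossible p of 0.
    return (hits + 1) / (rounds + 1)


before = [100, 101, 99, 100, 102, 98, 101, 100]
after = [110, 111, 109, 110, 112, 108, 111, 110]
print(permutation_p_value(before, after))   # small: the shift is unlikely to be noise
print(permutation_p_value(before, before))  # 1.0: no shift at all
```
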
Since the variance can change over time, you have to somehow deal with
the points in time where there was a hard shift in the test result
distribution.
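
As a toy illustration of finding such a shift, the sketch below
brute-forces the single split point that lets two segment medians fit
the data best. The names best_split and min_segment are invented for
this example. Note the catch: on noisy data some split nearly always
fits a little better than no split, so a real system would still need
to decide whether the shift it found is large compared to the noise,
and would need to handle more than one shift.

```python
import statistics


def best_split(scores, min_segment=3):
    """Return the index where splitting scores in two fits best, or None.

    Fit is measured as the total absolute deviation of each segment
    from its own median. None means no split beat the unsplit series.
    """
    def cost(segment):
        med = statistics.median(segment)
        return sum(abs(s - med) for s in segment)

    best_index, best_cost = None, cost(scores)
    # Keep at least min_segment points on each side of the split.
    for i in range(min_segment, len(scores) - min_segment + 1):
        c = cost(scores[:i]) + cost(scores[i:])
        if c < best_cost:
            best_index, best_cost = i, c
    return best_index


# Scores hover around 100, then jump to around 110 at index 6.
history = [100, 101, 99, 100, 102, 98, 110, 111, 109, 110, 112, 108]
print(best_split(history))  # 6
```
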
And all of this has to be automatic and work without intervention,
because we have a gazillion tests and are adding more.
Anyway, I'm not saying we shouldn't do this. It is, in fact,
extremely important that we figure this out. Not only is it currently
difficult to tell how your patch affects performance, but all sorts of
changes to the distribution of benchmark results are missed by our
current baby-statistics approach to monitoring.
But I don't think the solution is going to be simple, unfortunately.
-Justin
[1] https://bugzilla.mozilla.org/show_bug.cgi?id=653961