On Monday, March 4, 2013 5:15:56 AM UTC-8, Jim Mathies wrote:
> For metrofx we’ve been working on getting omtc and apzc running in the browser. One of the things we need to be able to do is run performance tests that tell us whether or not the work we’re doing is having a positive effect on perf. We currently don’t have automated tests up and running for metrofx and talos is even farther off.
> So to work around this I’ve been putting together some basic perf tests I can use to measure performance using the mochitest framework. I’m wondering if this might be a useful answer to our perf tests problems long term.
I think this is an incredibly interesting proposal, and I'd love to see something like it go forward. Detailed reactions below.
> Putting together talos tests is a real pain. You have to write a new test using the talos framework (which is a separate repo from mc), test the test to be sure it’s working, file rel eng bugs on getting it integrated into talos test runs, populated in graph server, and tested via staging to be sure everything is working right. Overall the overhead here seems way too high.
Yup. And that's a big problem. Not only does this make your life harder, it makes people do less performance testing than they otherwise might. The JS team has found that making it incredibly easy to add new correctness tests (with *zero* overhead in the common case) really helped get more tests written and used. So I think it would be great to make it a lot easier to write perf tests.
> Maybe we should consider changing this system so devs can write performance tests that suit their needs that are integrated into our main repo? Basically:
> 1) rework graphs server to be open ended so that it can accept data from test runs within our normal test frameworks.
IIUC, something like this is a key requirement: letting any perf test feed into the reporting system. People have pointed out that the Talos tests run on selected machines, so the perf tests should probably run on them as well, rather than on the correctness test machines. But that's only a small change to the basic idea, right?
> 2) develop a test module that can be included in tests and allows test writers to post performance data to graph server.
Does that mean a mochitest module? This part seems optional, although certainly useful. Some tests will require non-mochitest frameworks.
I believe jmaher did some work to get in-browser standard JS benchmarks running automatically and reporting to graph server. I'm curious how that would fit in with this idea--would the test module help at all, or could there be some more general module, so that even things like standard benchmarks can be self-serve?
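To make the "test module" idea concrete, here's a minimal sketch of what such a helper might look like. Everything here is hypothetical--the class name, the summary fields, and the payload shape are illustrative, not an existing mochitest or graph-server API:

```javascript
// Hypothetical sketch of a perf-reporting helper that a mochitest (or any
// in-browser benchmark harness) could include. It collects timing samples,
// summarizes them, and builds a JSON payload for a graph-server-like
// endpoint. All names here are illustrative assumptions.
class PerfReporter {
  constructor(testName) {
    this.testName = testName;
    this.samples = [];
  }

  // Record one timing sample in milliseconds.
  recordSample(ms) {
    this.samples.push(ms);
  }

  // Summarize the samples; the median is less noise-sensitive than the mean,
  // which matters for perf runs on shared test machines.
  summarize() {
    const sorted = [...this.samples].sort((a, b) => a - b);
    const mid = Math.floor(sorted.length / 2);
    const median = sorted.length % 2
      ? sorted[mid]
      : (sorted[mid - 1] + sorted[mid]) / 2;
    const mean = sorted.reduce((a, b) => a + b, 0) / sorted.length;
    return { median, mean, count: sorted.length };
  }

  // Build the payload a reworked, open-ended graph server might accept.
  // Actually POSTing it is left out; the point is that any test framework
  // could produce this shape, not just Talos.
  buildPayload(meta) {
    return { test: this.testName, ...this.summarize(), ...meta };
  }
}

// Example usage inside a test:
const reporter = new PerfReporter("metrofx-scroll");
[12, 15, 11, 14, 13].forEach((ms) => reporter.recordSample(ms));
const payload = reporter.buildPayload({ branch: "mozilla-central" });
console.log(payload.median); // 13
```

The nice property of something this small is that a standard benchmark could use it just as easily as a mochitest, which is what would make point 1's open-ended graph server pay off.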
> 3) come up with a good way to manage the life cycle of active perf tests so graph server doesn’t become polluted.
:-) How about optionally listing an owner for new tests, and then removing tests when no one is looking at them (according to web server logs) and either there is no owner of record or the owner doesn't confirm the tests are still needed?
> 4) port existing talos tests over to the mochitest framework.
> 5) drop talos.
This seems like it's in the line of "fix Talos". I'm not sure if this particular 4+5 is the right way to go, but the idea has some merit.
To the extent that people don't pay attention to Talos, it seems we really don't need to do anything with it. If people are paying attention to and taking care of performance in their area, then we're covered. To take the example I happen to know best, the JS team uses AWFY to track JS performance on standard benchmarks and additional tests they've decided are useful. So Talos is not needed to track JS performance. Having all the features of the new graph server does sound pretty cool, though.
It appears that there are a few areas that are only covered by Talos for now, though. I think in that category we have warm startup time via Ts, and basic layout performance via Tp. I'm not sure about memory: we do seem to detect increases via Talos, but we also have AWSY, and I don't know whether AWSY obviates the Talos memory measurements or not.
For that kind of thing, I'm thinking maybe we should go with the same "teams take care of their own perf tests" idea. The Performance team is a natural owner for Ts. I'm not entirely sure about Tp, but it's probably Layout or DOM. Then those teams could decide whether they wanted to switch from Talos to a different framework. If everything's working properly, and the difficulty of reproducing Talos tests locally caused enough problems to warrant it, the owning teams would notice and switch.