
Benefits of PGO on Windows


Dave Mandelin

Oct 17, 2012, 9:55:57 PM
to dand...@mozilla.com, tg...@mozilla.com, lma...@mozilla.com
Following the recent discussion about PGO, I really wanted to understand what benefits PGO gives Firefox on Windows, if any--I was skeptical. Rafael (IIRC) posted some Talos numbers, but I didn't know how to interpret them. So I decided to try a few simple experiments to try to falsify the hypothesis that "PGO has user-perceivable benefits".

Experimental setup: Windows builds from http://hg.mozilla.org/mozilla-central/rev/5f4a6a474455 on a Windows 7 Xeon. I took opt and pgo builds from the tbpl links.

Experiment 1: cold startup time

I used a camera to measure time from pressing enter on
a command line until the Fx window was completely shown.

results:
opt: 3.025 seconds
pgo: 1.841

- A clear win for PGO. I'm told that there is a startup
time optimization that orders omni.ja that only runs
in PGO builds. So it's not necessarily from the PGO
itself, but at least it means the current PGO builds
really are better.

Experiment 2: JS benchmarks

I ran SunSpider and V8. I would have run Kraken too, but
it takes longer to run and I already had significant
results by then. I did 1-2 runs. Below I show the average,
rounded off to not show noise digits.

results:
              opt    pgo
SunSpider     250    200   (ms; lower is better)
V8           8900   9400   ("score"; higher is better)

- Another clear win for PGO.
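To put those in relative terms, here's a quick sketch computing the deltas from the table above (SunSpider is a total time, so lower is better; V8 is a score, so higher is better):

```python
# Relative deltas implied by the benchmark table above.
# SunSpider reports a total time (lower is better); V8 reports a
# score (higher is better), so the two improvements point opposite ways.

def improvement(opt, pgo, lower_is_better=True):
    """Fractional improvement of the PGO build over the opt build."""
    if lower_is_better:
        return (opt - pgo) / opt
    return (pgo - opt) / opt

sunspider = improvement(250, 200, lower_is_better=True)
v8 = improvement(8900, 9400, lower_is_better=False)

print(f"SunSpider: {sunspider:.1%} faster")  # 20.0% faster
print(f"V8 score:  {v8:.1%} higher")         # 5.6% higher
```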

(Side note: I've recorded startup times for myself, with my normal profile, of ~30 seconds. I assumed that was just normal, so today I looked on Telemetry and saw that only 5-8% of startup times are that long. (I wish I knew what % of cold startups that is.) Today's results were with a clean profile, so it seems like my normal profile must be busting my startups (and others') badly. It would be really nice to make startup time independent of profile.)

Dave

Taras Glek

Oct 17, 2012, 10:56:47 PM
to Dave Mandelin, dand...@mozilla.com, lma...@mozilla.com
On 10/17/2012 6:55 PM, Dave Mandelin wrote:
> Following the recent discussion about PGO, I really wanted to understand what benefits PGO gives Firefox on Windows, if any--I was skeptical. Rafael (IIRC) posted some Talos numbers, but I didn't know how to interpret them. So I decided to try a few simple experiments to try to falsify the hypothesis that "PGO has user-perceivable benefits".
>
> Experimental setup: Windows builds from http://hg.mozilla.org/mozilla-central/rev/5f4a6a474455 on a Windows 7 Xeon. I took opt and pgo builds from the tbpl links.
>
> Experiment 1: cold startup time
>
> I used a camera to measure time from pressing enter on
> a command line until the Fx window was completely shown.
>
> results:
> opt: 3.025 seconds
> pgo: 1.841
>
> - A clear win for PGO. I'm told that there is a startup
> time optimization that orders omni.ja that only runs
> in PGO builds. So it's not necessarily from the PGO
> itself, but at least it means the current PGO builds
> really are better.

That's omni.ja funkiness with the startup cache, etc. Glad to see it
still works. If the compiler were doing anything startup-smart to the
binary, your fast system could probably start up in ~1s. Compilers do
not care about startup (though there is interest among the gcc people),
so we have to do https://bugzilla.mozilla.org/show_bug.cgi?id=662397 and
something similar on every platform.

>
> Experiment 2: JS benchmarks
>
> I ran SunSpider and V8. I would have run Kraken too, but
> it takes longer to run and I already had significant
> results by then. I did 1-2 runs. Below I show the average,
> rounded off to not show noise digits.
>
> results:
>               opt    pgo
> SunSpider     250    200   (ms; lower is better)
> V8           8900   9400   ("score"; higher is better)
>
> - Another clear win for PGO.

>
> (Side note: I've recorded startup times for myself, with my normal profile, of ~30 seconds. I assumed that was just normal, so today I looked on Telemetry and saw that only 5-8% of startup times are that long. (I wish I knew what % of cold startups that is.) Today's results were with a clean profile, so it seems like my normal profile must be busting my startups (and others') badly. It would be really nice to make startup time independent of profile.)
You can measure what goes wrong in your profile with the work going on
in https://bugzilla.mozilla.org/show_bug.cgi?id=799638

It's really great that you have such a repeatably slow profile. I've
never had the pleasure of a fast machine that can start Firefox this
slowly.

Taras



Mike Hommey

Oct 18, 2012, 2:00:05 AM
to Dave Mandelin, tg...@mozilla.com, dand...@mozilla.com, dev-pl...@lists.mozilla.org, lma...@mozilla.com
On Wed, Oct 17, 2012 at 06:55:57PM -0700, Dave Mandelin wrote:
> Following the recent discussion about PGO, I really wanted to understand what benefits PGO gives Firefox on Windows, if any--I was skeptical. Rafael (IIRC) posted some Talos numbers, but I didn't know how to interpret them. So I decided to try a few simple experiments to try to falsify the hypothesis that "PGO has user-perceivable benefits".
>
> Experimental setup: Windows builds from http://hg.mozilla.org/mozilla-central/rev/5f4a6a474455 on a Windows 7 Xeon. I took opt and pgo builds from the tbpl links.
>
> Experiment 1: cold startup time
>
> I used a camera to measure time from pressing enter on
> a command line until the Fx window was completely shown.
>
> results:
> opt: 3.025 seconds
> pgo: 1.841
>
> - A clear win for PGO. I'm told that there is a startup
> time optimization that orders omni.ja that only runs
> in PGO builds. So it's not necessarily from the PGO
> itself, but at least it means the current PGO builds
> really are better.

If you copy omni.ja from the PGO build to the opt build, you'll be able
to see if everything comes from that. We're planning to make that
currently PGO-only optimization run on all builds. (bug 773171)
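A minimal sketch of that swap (the flat build-directory layout and the manual backup step here are my assumptions, not anything the packaging scripts do):

```python
import shutil
from pathlib import Path

def swap_omni(src_build: str, dst_build: str) -> None:
    """Copy omni.ja from one unpacked build directory into another,
    keeping a one-time backup of the destination's original file."""
    src = Path(src_build) / "omni.ja"
    dst = Path(dst_build) / "omni.ja"
    backup = dst.parent / "omni.ja.bak"
    if not backup.exists():       # don't clobber an earlier backup
        shutil.copy2(dst, backup)
    shutil.copy2(src, dst)

# e.g. swap_omni("firefox-pgo", "firefox-opt") before timing a cold start
```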

Mike

Ted Mielczarek

Oct 18, 2012, 7:59:02 AM
to Dave Mandelin, tg...@mozilla.com, dand...@mozilla.com, dev-pl...@lists.mozilla.org, lma...@mozilla.com, wlac...@mozilla.com
On 10/17/2012 9:55 PM, Dave Mandelin wrote:
> Following the recent discussion about PGO, I really wanted to understand what benefits PGO gives Firefox on Windows, if any--I was skeptical. Rafael (IIRC) posted some Talos numbers, but I didn't know how to interpret them. So I decided to try a few simple experiments to try to falsify the hypothesis that "PGO has user-perceivable benefits".

If you're interested in the benchmark side of things, it's fairly easy
to compare now that we build both PGO and non-PGO builds on a regular
basis. I'm having a little trouble getting graphserver to give me recent
data, but you can pick arbitrary tests that we run on Talos and graph
them side-by-side for the PGO and non-PGO cases. For example, here's Ts
and "Tp5 MozAfterPaint" for Windows 7 on both PGO and non-PGO builds
(the data ends in February for some reason):
http://graphs.mozilla.org/graph.html#tests=[[16,1,12],[115,1,12],[16,94,12],[115,94,12]]&sel=none&displayrange=365&datatype=running

You can see that there's a pretty solid 10-20% advantage to PGO in these
tests.

Here's Dromaeo (DOM) which displays a similar 20% advantage:
http://graphs.mozilla.org/graph.html#tests=[[73,94,12],[73,1,12]]&sel=none&displayrange=365&datatype=running

It's certainly hard to draw a conclusion about your hypothesis from just
benchmarks, but when almost all of our benchmarks show 10-20%
improvements on PGO builds, it seems fair to say that's likely to be
user-visible. We've spent hundreds of man-hours for perf gains far
smaller than that.

On a related note, Will Lachance has been tasked with getting our
Eideticker performance measurement framework working with Windows, so we
should be able to experimentally measure user-visible responsiveness in
the near future.

-Ted

Dave Mandelin

Oct 19, 2012, 9:32:54 PM
to dev-pl...@lists.mozilla.org
On Wednesday, October 17, 2012 11:00:13 PM UTC-7, Mike Hommey wrote:
> If you copy omni.ja from the PGO build to the opt build, you'll be able
> to see if everything comes from that. We're planning to make that
> currently PGO-only optimization run on all builds. (bug 773171)

Excellent suggestion, plus it made me repeat the experiment. The repeat turned up somewhat more confusing data that still seems to support PGO for Windows startup. I did 2-3 tests with each of 4 configurations (I botched one trial and didn't bother rebooting to test it again), and got this:

pgo build with pgo omni.ja:  1.6 - 1.7 seconds  (1.6, 1.7)
pgo build with opt omni.ja:  1.4 - 1.6 seconds  (1.4, 1.6, 1.6)
opt build with pgo omni.ja:  1.3 - 8.0 seconds  (1.3, 1.4, 8.0)
opt build with opt omni.ja:  2.9 - 8.7 seconds  (2.9, 6.2, 8.7)

The number of trials is too small to conclude very much. If we really wanted to know, either someone would have to spend some time doing this over and over, or we'd have to use Telemetry with some A/B testing.
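(For what it's worth, an exact permutation test on the opt/opt vs. pgo/pgo numbers above shows just how little these few trials can establish: with group sizes of 2 and 3, the best achievable one-sided p-value is 1/10.)

```python
from itertools import combinations
from statistics import mean

# Cold-start times (seconds) from the table above.
pgo = [1.6, 1.7]          # pgo build with pgo omni.ja
opt = [2.9, 6.2, 8.7]     # opt build with opt omni.ja

observed = mean(opt) - mean(pgo)
pooled = pgo + opt

# Exact permutation test: relabel 2 of the 5 measurements as "pgo"
# in every possible way and count how often the mean difference is
# at least as large as the one actually observed.
hits = 0
splits = list(combinations(range(len(pooled)), len(pgo)))
for group in splits:
    g_pgo = [pooled[i] for i in group]
    g_opt = [pooled[i] for i in range(len(pooled)) if i not in group]
    if mean(g_opt) - mean(g_pgo) >= observed - 1e-12:
        hits += 1

print(f"one-sided p = {hits}/{len(splits)}")  # 1/10: suggestive, not conclusive
```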

Strangely, despite the new noise, opt/opt was always slower than pgo/pgo, and by about the same amount as in my first experiment (even in the best case for opt/opt).

Dave

Dave Mandelin

Oct 19, 2012, 9:55:57 PM
to Dave Mandelin, tg...@mozilla.com, dand...@mozilla.com, dev-pl...@lists.mozilla.org, lma...@mozilla.com, wlac...@mozilla.com
On Thursday, October 18, 2012 4:59:10 AM UTC-7, Ted Mielczarek wrote:
> If you're interested in the benchmark side of things, it's fairly easy
> to compare now that we build both PGO and non-PGO builds on a regular
> basis. I'm having a little trouble getting graphserver to give me recent
> data, but you can pick arbitrary tests that we run on Talos and graph
> them side-by-side for the PGO and non-PGO cases. For example, here's Ts
> and "Tp5 MozAfterPaint" for Windows 7 on both PGO and non-PGO builds
> (the data ends in February for some reason):
>
> http://graphs.mozilla.org/graph.html#tests=[[16,1,12],[115,1,12],[16,94,12],[115,94,12]]&sel=none&displayrange=365&datatype=running
>
> You can see that there's a pretty solid 10-20% advantage to PGO in these
> tests.

Ah. That answers my question about more data.

For Ts, I see a difference of only 70ms (e.g., 520-590 at the last point). That's borderline trivial, but the differences I measure are much greater. What does Ts actually measure, anyway? Is it measuring only from main() starting to first paint, or something like that?

For Tp5, I see a difference of 80ms (330-410 and such). I'm not really sure what to make of that. By itself, it doesn't necessarily seem like it would be that noticeable, but the fraction is big enough that if it holds up for longer and bigger pages, I could see it slightly improving pageloads and probably also reducing some pauses for layout and such. From what I understand about Tp5, it's not really measuring modern pageloads (it ignores the network and isn't focused on popular sites). I wish we had something more representative so we could draw better conclusions (and not just about PGO).
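(A back-of-the-envelope on those two readings, assuming the lower number in each range is the PGO build:)

```python
# Relative differences implied by the graphserver readings quoted
# above, assuming the lower number in each range is the PGO build.
ts_opt, ts_pgo = 590, 520        # Ts (startup), ms
tp5_opt, tp5_pgo = 410, 330      # Tp5 MozAfterPaint, ms

ts_delta = (ts_opt - ts_pgo) / ts_opt
tp5_delta = (tp5_opt - tp5_pgo) / tp5_opt
print(f"Ts:  {ts_delta:.1%}")    # 11.9%
print(f"Tp5: {tp5_delta:.1%}")   # 19.5%
```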

> Here's Dromaeo (DOM) which displays a similar 20% advantage:
>
> http://graphs.mozilla.org/graph.html#tests=[[73,94,12],[73,1,12]]&sel=none&displayrange=365&datatype=running
>
> It's certainly hard to draw a conclusion about your hypothesis from just
> benchmarks, but when almost all of our benchmarks display 10-20%
> reductions on PGO builds it seems fair to say that that's likely to be
> user-visible.

It seems fair to me to say that core browser CPU-bound tasks are likely to be 10-20% faster. There is probably some of that users can notice, although I'm not sure exactly what it would be. The JS benchmarks do run faster in the PGO build, but I haven't tested other JS-based things to see if the difference is noticeable. I guess I should be testing game framerates or something like that too.

> We've spent hundreds of man-hours for perf gains far less than that.

Yes, we need to get more judicious about how we apply our perf efforts. :-)

> On a related note, Will Lachance has been tasked with getting our
> Eideticker performance measurement framework working with Windows, so we
> should be able to experimentally measure user-visible responsiveness in
> the near future.

I'm curious to see what kinds of tests it will enable.

Dave

Justin Lebar

Oct 20, 2012, 12:00:22 AM
to Dave Mandelin, tg...@mozilla.com, dand...@mozilla.com, lma...@mozilla.com, mozilla.de...@googlegroups.com, wlac...@mozilla.com, dev-pl...@lists.mozilla.org
> If we really wanted to know, either someone would have to spend some time
> doing this over and over, or we'd have to use Telemetry with some A/B testing.

This would actually be a pretty easy thing to do, to a first
approximation anyway. Just turn off PGO on Windows for one nightly
build and see how that affects all our metrics.

I'll grant that's not a proper A/B study, but it'd probably be good enough.
> _______________________________________________
> dev-platform mailing list
> dev-pl...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform