Hi Gijs,
None of those things are true, in my opinion. For what it's worth, the expected regression in CART was closer to 20% than to 40%. The number surprises me a little, and we'll look into what makes CART regress so badly (on our test servers) specifically. We haven't looked closely at CART, as we mostly looked into TART and Tsvgr while investigating.
I'll address each point individually:
a) Extremely hacky work for a 1% gain does seem like a bad idea. Depending on how hacky and how maintainable, that might have been the wrong decision :-). But 1% gains can still accumulate, so I certainly wouldn't call them useless.
b) You have to look at what the tests are meant to do. Avi knows more about the tests than I do and has already supplied some information, but there's a difference between tweaking the UI or a specific portion of rendering, and radically changing the way we draw things. The tests might not be a good reflection of the average user experience, but they help us catch situations where we unwittingly harm performance, or where we want proof that a change in some core algorithm does indeed make things run faster. That makes them very useful, just not for radical architectural changes. For what it's worth, I'm not claiming this will be a net improvement in what CART or TART measure; there are, however, other interactions which improve that are inherently linked to this one and cannot be decoupled (because it is an architectural change). Similarly, we only run our tests on one hardware configuration, which in my mind again stresses their purpose as a relative regression test rather than as something representative of perceived UX.
c) A couple of things here. First of all, we consulted with people outside of the gfx team about this, and we were in agreement with the people we talked to. When it comes to moving forward architecturally we should always be ready to accept something regressing; that has nothing to do with which team is doing the regressing, and everything to do with what we're trying to accomplish. I can guarantee you, for example, that e10s will cause at least some significant regressions in some situations, yet we may very well have to accept those regressions to offer our users big improvements in other areas, as well as to pave the way for future improvements. With OMTA, for example, several aspects of our (UI) performance can be improved in a way that no amount of TART improvements in the old architecture ever could.
Now I don't want to repeat too much of what I've already said, but I'd like to reiterate that if there are 'real' regressions in the overall user experience, we will of course attempt to address them; we are also only a pref change away from going back to on-main-thread compositing.
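(For reference, turning it off looks roughly like this in a prefs.js; I'm writing the pref name from memory here, so verify it against the tree before relying on it:

```javascript
// Fall back to on-main-thread compositing.
// Pref name quoted from memory; double-check it against the
// current source before use.
user_pref("layers.offmainthreadcomposition.enabled", false);
```

The same switch can of course be flipped in about:config on a running build.)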
The focus should, in my opinion, be on what has actually been affected by this in a way that has a strong, negative impact on user experience. Considering the scope of this change I am certain those things exist, and they will need to be fixed. Perhaps the CART regression -is- a sign of some real, unacceptable perceived performance problem; I'm not ruling that out, but we have not identified an interaction which significantly regressed. This is all extremely hardware dependent though (e.g. on most of my machines TART and CART both improve with OMTC), so if someone -has- seen this cause a significant performance regression in some sort of interaction, do let us know.
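To make the 'ASAP' compositing contention I described in my earlier mail (quoted below) a bit more concrete, here's a toy model. All the numbers in it are invented purely for illustration; the real behaviour depends on the DXGI synchronization, the drivers and the hardware:

```python
# Toy model of double-buffer lock contention under 'ASAP' compositing.
# All numbers here are invented for illustration; the real behaviour
# depends on the DXGI synchronization, the drivers and the hardware.

def simulate(composite_interval, frames=100, composite_cost=2, paint_cost=4):
    """Return how many time steps the main thread spends blocked.

    The compositor holds the shared buffer lock for `composite_cost`
    steps out of every `composite_interval` steps. Before painting a
    frame, the main thread copies the previously validated area from
    the front buffer to the back buffer, which needs that same lock
    for `paint_cost` steps.
    """
    blocked = 0
    t = 0
    for _ in range(frames):
        # Spin until the compositor releases the buffer lock.
        while t % composite_interval < composite_cost:
            blocked += 1
            t += 1
        t += paint_cost  # front-to-back copy plus rasterization
    return blocked

asap = simulate(composite_interval=3)    # compositing nearly all the time
vsync = simulate(composite_interval=16)  # compositing once per 'vsync'
assert asap > vsync  # ASAP mode blocks the main thread far more often
```

This is a cartoon, of course; in the real compositor the blocking happens inside the DXGI lock rather than a spin loop, but the shape of the problem is the same: the more often the compositor holds the buffers, the more often the main thread's rasterization has to wait.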
Best regards,
Bas
Looking on from m.d.tree-management: on Fx-Team, the merge of this
change caused a >40% CART regression, too, which wasn't listed in the
original email. Was this unforeseen, and if not, why was this
considered acceptable?
As gavin noted, considering how hard we fought for 2% improvements (one
of the Australis folks said yesterday "1% was like Christmas!"), despite
our arguments that things were really OK for some of the same reasons
you gave (e.g. running in ASAP mode isn't realistic, "TART is
complicated", ...), this hurts - it makes it seem like (a) our
(sometimes extremely hacky) work was done for no good reason, or (b) the
test is fundamentally flawed and we're better off without it, or (c)
when the gfx team decides it's OK to regress it, that's fine, but not when
it happens to other people, quite irrespective of the reasons given.
All/any of those being true would give me the sad feelings. Certainly it
feels to me like (b) is true if this is really meant to be a net
perceived improvement despite causing a 40% performance regression in
our automated tests.
~ Gijs
On 18/05/2014 19:47, Bas Schouten wrote:
> Hi Gavin,
>
> There have been several e-mails on different lists, and some communication on some bugs. Sadly, the story isn't collected anywhere in a condensed form at this point, but I will try to highlight a couple of core points; some of these will be updated further as the investigation continues. The official bug is bug 946567, but the numbers and the discussion there are far outdated (there's no 400% regression ;)):
>
> - What OMTC does to TART scores differs wildly per machine: on some machines we saw up to 10% improvements, on others up to 20% regressions. There also seems to be somewhat more of a regression on Win7 than on Win8. What the average is for our users is very hard to say; frankly, I have no idea.
> - One core cause of the regression is that we're now dealing with two D3D devices when using Direct2D, since we do D2D drawing on one thread and D3D11 composition on the other. This means we have DXGI locking overhead to synchronize the two. This is unavoidable.
> - Another cause is that we now have two surfaces in order to do double buffering, which means we need to initialize more resources when new layers come into play. This, again, is unavoidable.
> - Yet another cause is that for some tests we composite 'ASAP' to get interesting numbers, but this causes some contention scenarios which are less likely to occur in real-life usage: the double buffer may copy the area validated in the last frame from the front buffer to the back buffer in order to avoid having to redraw much more, and if the compositor is compositing all the time, that copy can block the main thread's rasterization. I have some ideas on how to improve this, but I don't know how much they'll help TART; in any case, some cost here will be unavoidable as a natural consequence of double buffering.
> - The TART numbers story is complicated; sometimes it's hard to know what exactly they do and don't measure (which might differ with and without OMTC) and how that affects practical performance. I've been told this by Avi and it matches my practical experience with the numbers. I don't know the exact reasons, and Avi is probably a better person to talk to about this than I am :-).
>
> These are the core causes we were able to identify from profiling. Other than that, the things I said in my previous e-mail still apply. We believe we're offering significant UX improvements with async video, and are enabling more significant improvements in the future. Once we've fixed the obvious problems we will continue to see if there's something that can be done, either through tiling or through other improvements; particularly for the last point I mentioned, there might be some not-'too'-complex things we can do to get a small improvement back.
>
> If we want to have a more detailed discussion we should probably pick a list to have this on and try not to spam people too much :-).
>
> Bas
>
>> but tart will regress by ~20%, and several other suites will regress as well.
>> We've investigated this extensively and we believe the majority of these
>> regressions are due to the nature of OMTC and the fact that we have to do
>> more work.
>
> Where can I read more about the TART investigations? I'd like to
> understand why it is seen as inevitable, and get some of the details
> of the regression. OMTC is important, and I'm excited to see it land
> on Windows, but the Firefox and Performance teams have just come off a
> months-long effort to make significant wins in TART, and the thought
> of taking a 20% regression (huge compared to some of the improvements
> we fought for) is pretty disheartening.
>
> Gavin
>
> On Sun, May 18, 2014 at 12:16 AM, Bas Schouten <bsch...@mozilla.com> wrote:
>> Hey all,
>>
>> After quite a lot of waiting, we've switched on OMTC on Windows by default today (bug 899785). This is a great step towards moving all our platforms onto OMTC (only Linux is left now), and will allow us to remove a lot of code that we currently duplicate. Furthermore, it puts us on track for enabling other features on desktop, like APZ, off-main-thread animations and other improvements.
>>
>> Having said that, we realize that what we've currently landed and turned on is not completely bug free. There are several bugs still open (some more serious than others) which we will be addressing in the coming weeks, hopefully before the merge to Aurora. The main reason we've switched it on now is that we want to get as much data as possible from the nightly channel and our nightly user base before the Aurora merge, as well as to prevent any new regressions from creeping in while we fix the remaining problems. This was extensively discussed, both internally in the graphics team and externally with other people, and we believe we're at a point now where things are sufficiently stabilized for our nightly audience. OMTC is enabled and disabled with a single pref, so if unforeseen, serious consequences occur we can disable it quickly at any stage. We will inevitably find new bugs in the coming weeks; please link any bugs you happen to come across to bug 899785. If anything seems very serious, please let us know; we'll attempt to come up with a short-term solution rather than disabling OMTC and reducing the amount of feedback we get.
>>
>> There are also some important notes on performance, which we expect to be reported by our automated systems:
>>
>> - Bug 1000640 is about WebGL. Currently OMTC regresses WebGL performance considerably; patches to fix this are underway and it should be fixed in the very short term.
>>
>> - Several of the Talos test suite numbers will change considerably (especially with Direct2D enabled): Tscroll, for example, will improve by ~25%, but TART will regress by ~20%, and several other suites will regress as well. We've investigated this extensively and we believe the majority of these regressions are due to the nature of OMTC and the fact that we have to do more work. We see no value in holding off OMTC because of these regressions, as we'll have to go there anyway. Once the last correctness and stability problems are solved we will go back to trying to win back some of the regressed performance. We're also planning to move to a system more like tiling on desktop, which will change the performance characteristics significantly again, so we don't want to sink too much time into optimizing the current situation.
>>
>> - Memory numbers will increase somewhat. This is unavoidable: there are several things off-main-thread compositing requires (like double buffering) which inherently use more memory.
>>
>> - On a brighter note: Async video is also enabled by these patches. This means that when the main thread is busy churning JavaScript, instead of stuttering your video should now happily continue playing!
>>
>> - Also, there are some indications of a subjective improvement in scrolling performance as well.
>>
>>
>> If you have any questions please feel free to reach out to myself or other members of the graphics team!
>>
>>
>> Bas