I managed to get in contact with a couple of devs from the Safari and Chrome projects to discuss this, so I could understand how their respective browsers currently implement requestAnimationFrame and what the relevant issues and challenges are.
Talked w/ Dean Jackson from the Safari team:
Their requestAnimationFrame is in flux right now - it's got some current issues and they're planning to overhaul it significantly in the near future.
They've had difficulty getting every aspect of rAF quite right in their current implementation; in particular whether or not the callbacks should be issued for hidden/obscured windows and tabs.
The design of rAF is an issue for them in that it locks users to a 60hz timer; they have encountered applications that really would have preferred a 10hz setTimeout but used requestAnimationFrame instead because they were told it was the right thing to do. An optimal replacement/refinement of requestAnimationFrame, for them, would allow a content author to specify the intended update rate of their content.
requestAnimationFrame in Safari is hooked up to the display refresh/composition engine instead of going through the normal DOM paint pipeline. The compositor tries to give them a callback that aligns with vsync so that they can issue requestAnimationFrame callbacks on timing that lines up well with the retrace. I believe this also means that if you repaint a canvas from within a requestAnimationFrame callback, all they have to do is recomposite, instead of invalidating the page containing the canvas and doing a full content repaint (I might be wrong on this).
From their perspective the basic model proposed in this API is sound: Separate update and render callbacks; the browser is free to drop render callbacks based on CPU load or based on content being obscured. With the addition of a configurable target rate for updates, their issues with updates being locked to 60hz would be addressed as well.
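To make the "separate update and render callbacks, with a configurable update rate" idea concrete, here is a minimal sketch. Everything in it is hypothetical (createSplitLoop, the canRender flag, the callback names); it is not the proposed API or anything Safari implements, just one way the split could behave: updates always run at the requested rate, while rendering is skipped whenever the caller (standing in for the browser) says so.

```typescript
// Hypothetical shape of the split callbacks; names are illustrative only.
interface LoopCallbacks {
  update: (stepMs: number) => void; // game logic, always run at a fixed step
  render: () => void;               // drawing, may be dropped
}

// Returns a tick function to be called once per browser frame.
// `updateHz` is the content author's requested update rate.
function createSplitLoop(callbacks: LoopCallbacks, updateHz: number) {
  const stepMs = 1000 / updateHz;
  let lastTimeMs: number | null = null;
  let pendingMs = 0; // elapsed time not yet consumed by updates

  return function tick(nowMs: number, canRender: boolean): void {
    if (lastTimeMs !== null) {
      pendingMs += nowMs - lastTimeMs;
    }
    lastTimeMs = nowMs;

    // Run as many fixed-size updates as the elapsed time covers.
    while (pendingMs >= stepMs) {
      callbacks.update(stepMs);
      pendingMs -= stepMs;
    }

    // Rendering is optional: the browser could drop it under CPU load
    // or when the content is obscured.
    if (canRender) {
      callbacks.render();
    }
  };
}
```

With updateHz set to 10, an app that only wants 10hz updates gets exactly that, even if the tick function itself is driven at 60hz.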
Dean also expressed interest in an API that allows, or even encourages, the browser vendor to buffer up 1-3 rendered frames and hand them to the compositor for presentation at the next realistic vertical sync opportunity. This aligns well with how they already render and would dovetail nicely with video playback.
If we move forward with an API like this, Dean suggested the proposal go onto a W3C list. I don't really know anything about that. :D
Talked w/ James Robinson from the Chrome team:
Chromium has an extremely robust implementation of requestAnimationFrame. The core principle is that they attempt to call rAF callbacks at the "correct" rate for a given application. They determine the optimal callback rate based on various inputs: vertical sync timer, computational complexity, etc. Their goal is to present a steady stream of frames at a solid framerate, and optimally align that stream with vertical sync. This means that, for example, if the application is unable to run at 60hz on a 60hz monitor, they would drop to 30hz, not, say, 50hz or 40hz.
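The "60hz drops to 30hz, not 50hz" behavior amounts to only using rates that divide the refresh rate evenly, so every frame stays on screen for a whole number of vsync intervals. A rough sketch of that arithmetic follows; steadyFrameRate is a made-up helper, and Chromium's real heuristic takes more inputs than a single frame cost.

```typescript
// Hypothetical helper: pick the highest rate of the form refreshHz / n
// (n = 1, 2, 3, ...) whose frame budget covers the measured frame cost.
function steadyFrameRate(refreshHz: number, frameCostMs: number): number {
  const vsyncIntervalMs = 1000 / refreshHz;
  // How many vsync intervals one frame of work spans (at least one).
  const intervalsPerFrame = Math.max(
    1,
    Math.ceil(frameCostMs / vsyncIntervalMs)
  );
  return refreshHz / intervalsPerFrame;
}
```

On a 60hz display, a frame that costs 20ms spans two vsync intervals, so this picks 30hz; a frame that costs 40ms spans three, giving 20hz.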
Chromium's model for this is based on a pipeline of producers and consumers - at one end you have the frame consumer that pulls completely rendered frames out of the pipeline and presents them at vertical sync; on the other end, the requestAnimationFrame callback runs update logic and pushes rendering calls for a frame into the pipeline. They attempt to align the timing on one end of the pipeline (the producers) with the timing on the other end (the consumers), so that eventually the requestAnimationFrame callbacks will drift into near-perfect alignment with vertical sync. This also implies that they can (and will, if necessary) buffer up entire frames, adding a slight amount of latency.
They also measure time spent blocked in various stages of the pipeline: Time blocked on Present/SwapBuffers, time blocked on draw calls, delays on calls to the window manager/compositor. They take that time into account when scheduling frames, so that if the GPU is heavily loaded or the window manager is having trouble keeping up, they will avoid overloading those parts of the system with lots of buffered frames/work. (On my machine, I can actually reproduce this - if I play fullscreen video on my secondary monitor, Chrome will sometimes drop game content down from 60hz to 30hz in response to GPU load and the Aero compositor taking longer to respond.)
At present, Chromium actually hands a 'presentation time' to requestAnimationFrame callbacks that allows them to adjust to callback scheduling variability. They've discussed actually passing a second time to the callback, which is the time at which they anticipate the frame actually being presented to the screen; it's hard to accurately determine this, though, so it would be more informative ('this is roughly how long you have to do your work before the next vsync') than precise ('this is when your next frame is').
This particular detail is inspired by CVDisplayLink, a system they hook into on OS X for presentation:
http://developer.apple.com/library/mac/#documentation/QuartzCore/Reference/CVDisplayLinkRef/Reference/reference.html
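As an illustration of how an application can use the time handed to its callback to absorb scheduling variability, here is a small sketch. makeFrameClock and the 100ms clamp are my own assumptions, not anything Chromium prescribes: the app scales its work by the elapsed time between callback timestamps rather than assuming a fixed 16.67ms per frame.

```typescript
// Hypothetical helper: turns the timestamps passed to successive
// animation callbacks into clamped per-frame deltas.
function makeFrameClock(maxDeltaMs: number = 100) {
  let previousMs: number | undefined;

  return function deltaSince(timestampMs: number): number {
    // First frame has no predecessor, so report zero elapsed time.
    const deltaMs =
      previousMs === undefined
        ? 0
        : Math.min(timestampMs - previousMs, maxDeltaMs);
    previousMs = timestampMs;
    return deltaMs;
  };
}

// Usage sketch (browser only, shown as a comment so the block runs anywhere):
//   const clock = makeFrameClock();
//   function frame(t: number) {
//     const dt = clock(t);       // ms since the previous callback
//     x += speedPerMs * dt;      // move by elapsed time, not by frame count
//     requestAnimationFrame(frame);
//   }
//   requestAnimationFrame(frame);
```

The clamp keeps a long stall (say, a backgrounded tab) from turning into one enormous simulation step.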
Since their WebAudio implementation provides very low latency and precise scheduling of audio events, they would optimally like to be in a situation where timing can be synchronized with audio timing, so that game logic, rendering, and audio are all running off the same precise clock. On OS X, CoreVideo and CoreAudio use the same clock so this is possibly easier there.
Right now Chromium's scheduling behavior has a few different modes depending on what kind of content is in a page: WebGL content, for example, will put you into a GPU-accelerated compositing mode, and there are a couple experimental modes that can be turned on in browser configuration.
James described the model they are attempting to move to for scheduling. It's complicated so I'll just paste it here:
jamesr: we basically take a target framerate and a phase factor and then use a timer to produce a set of ticks at the desired framerate (with some careful logic to take care of poor timer precision without drifting)
jamesr: and then produce frames off of that
jamesr: the target framerate is based off of the screen's parameters and the phase factor is used to make sure we don't get aliasing if we are too close to the vsync line
jamesr: then we monitor for backpressure - which may come from the SwapBuffers, from some GL call taking a long time, or from deeper in the pipeline (like the window manager)
They're only part-way there, though, so it's a work in progress for them. It sounds like it does a lot to help them reduce judder and aliasing while still presenting vertically-synced content.
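My reading of the "target framerate plus phase factor, without drifting" part can be sketched like this. This is my own guess at the general technique, not Chromium's code, and nextTickTime and its parameters are invented names. The key point is that each tick is computed from a fixed origin rather than as "now + interval", so a timer that fires late doesn't push every later tick back. Backpressure handling is not modeled here.

```typescript
// Hypothetical helper: returns the first tick strictly after `nowMs` on the
// grid originMs + phaseMs + k * intervalMs (k an integer).
function nextTickTime(
  nowMs: number,
  originMs: number,
  intervalMs: number,
  phaseMs: number
): number {
  const firstTickMs = originMs + phaseMs;
  // Index of the next grid point after `nowMs`; can be 0 or negative
  // if we have not reached the first tick yet.
  const nextIndex = Math.floor((nowMs - firstTickMs) / intervalMs) + 1;
  return firstTickMs + nextIndex * intervalMs;
}
```

For example, with a 10ms interval and a 2ms phase, ticks land at 2, 12, 22, 32 and so on. If the timer for the tick at 22 actually fires at 27, the next tick is still scheduled for 32, not 37.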
We also discussed the idea of the application being responsible for dropping rendering workloads instead of the browser - i.e. a game can somehow determine that it is overloading the system and as a result not hitting 60fps, and in response, only perform draw calls on every other frame.
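A rough sketch of what application-side dropping might look like, assuming the app can measure its own frame cost. makeRenderThrottle, the smoothing factor, and the budget are all hypothetical: the app keeps a moving average of frame cost and, while that average is over budget, only issues draw calls on every other frame.

```typescript
// Hypothetical helper: decides per frame whether the app should draw.
function makeRenderThrottle(budgetMs: number, smoothing: number = 0.1) {
  let averageCostMs = 0;
  let frameIndex = 0;

  return {
    // Feed in the measured cost of the frame that just finished.
    recordFrameCost(costMs: number): void {
      // Exponential moving average; smoothing = 1 tracks only the last frame.
      averageCostMs += (costMs - averageCostMs) * smoothing;
    },
    // Call once per frame, before drawing.
    shouldRender(): boolean {
      const overloaded = averageCostMs > budgetMs;
      // When overloaded, draw only on even-numbered frames.
      const render = !overloaded || frameIndex % 2 === 0;
      frameIndex++;
      return render;
    },
  };
}
```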
Another example he gave was a physics-based game that runs at 60hz but only performs physics updates at 30hz; the alternate frames are simply an interpolation of previous game state, to reduce CPU load. This is definitely not a case I gave much thought to when preparing my initial proposal.
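I don't know how that particular game does it, but one common way to get this effect is to render slightly behind the simulation and blend between the two most recent physics states. A minimal sketch, where BodyState, lerpState, and stateForFrame are all illustrative names:

```typescript
// Hypothetical minimal physics state for one body.
interface BodyState {
  x: number;
  y: number;
}

// Linear interpolation between two states; alpha = 0 gives `a`, 1 gives `b`.
function lerpState(a: BodyState, b: BodyState, alpha: number): BodyState {
  return {
    x: a.x + (b.x - a.x) * alpha,
    y: a.y + (b.y - a.y) * alpha,
  };
}

// Picks the state to draw for a frame at `frameTimeMs`, given the last two
// physics states. `currentTimeMs` is when `current` was computed.
function stateForFrame(
  previous: BodyState,
  current: BodyState,
  currentTimeMs: number,
  physicsStepMs: number,
  frameTimeMs: number
): BodyState {
  const raw = (frameTimeMs - currentTimeMs) / physicsStepMs;
  const alpha = Math.min(1, Math.max(0, raw)); // clamp to [0, 1]
  return lerpState(previous, current, alpha);
}
```

With physics stepping every 1000/30 ms and frames arriving at 60hz, every other frame lands at roughly alpha = 0.5 and draws the midpoint of the two states without running any physics.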
One thing James expressed interest in is the idea of providing more data to consumers of requestAnimationFrame so they can understand the constraints the browser is under and how well things are performing, so that the application can take steps to reduce the amount of work it does and get closer to 60hz.
One interesting difference from the Gecko model for rAF (as I understand it) is that in Chromium, requestAnimationFrame callbacks occur as sort of a 'last step' in the rendering pipeline - after invalidation has occurred from DOM changes or a composited layer has moved, and they're getting ready to composite, they kick off requestAnimationFrame callbacks so that canvases can get updated. This means that in the case where a game doesn't interact with the DOM at all, the main loop is essentially 'requestAnimationFrame -> draw calls -> composite -> present' without involving any DOM or invalidation machinery. (I might have some of the details wrong here)
A common theme in this discussion is the importance of trying to improve the consistency between browsers - getting Gecko, Safari, Chrome and IE all using the same basic model for scheduling requestAnimationFrame so that developers will be able to build working apps that deliver smooth framerates without having to spend tons of time understanding each browser.
In the next week or so I'll probably spend some time describing concrete use cases and how they work under the existing requestAnimationFrame model, so that we can consider how they would work under the new API proposal. I figure that will help us understand differences between each browser's version of rAF, as well.
-kg