Intent to Experiment: Align Timers (including DOM timers) at 125 Hz


Etienne Pierre-doray

Aug 3, 2022, 9:50:48 AM
to blink-dev, Francois Pierre Doray

etie...@chromium.org

Explainer

None

Specification

https://html.spec.whatwg.org/multipage/timers-and-user-prompts.html

Design docs


https://docs.google.com/document/d/1OjZoHNvn_vz6bhyww68B_KZBi6_s5arT8xMupuNEnDM/edit

Summary

Run all timers (with a few exceptions) with a non-zero delay on a regular 8ms aligned wake up (125 Hz), instead of as soon as their delay has passed. This affects DOM timers: on foreground pages, DOM timers with a non-zero delay run on a regular 8ms aligned wake up, instead of as soon as their delay has passed. On background pages, DOM timers already run on a regular 1s aligned wake up (1 Hz), or even less frequently after 5 minutes.
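As a rough illustration of the alignment rule in the summary above, the sketch below rounds a timer's deadline up to the next 8 ms boundary. The helper name `alignedRunTime` is hypothetical and is not a Chromium API. Measuring boundaries from time 0 is also an assumption: the summary does not say where the 8 ms grid is anchored, nor which timers fall under "a few exceptions".

```javascript
// Hypothetical model of the rule in the summary; not Chromium code.
const ALIGNMENT_MS = 8; // 125 Hz wake-up period, taken from the summary

function alignedRunTime(nowMs, delayMs, alignmentMs = ALIGNMENT_MS) {
  const deadline = nowMs + delayMs;
  // The intent only covers timers with a non-zero delay.
  if (delayMs <= 0) return deadline;
  // Assumption: wake-ups fall on multiples of alignmentMs measured from 0.
  return Math.ceil(deadline / alignmentMs) * alignmentMs;
}

console.log(alignedRunTime(3, 0));  // 3  (zero delay, not aligned)
console.log(alignedRunTime(3, 1));  // 8  (deadline 4 rounds up to 8)
console.log(alignedRunTime(3, 5));  // 8  (deadline already on a boundary)
console.log(alignedRunTime(3, 6));  // 16 (deadline 9 rounds up to 16)
console.log(alignedRunTime(0, 16)); // 16
```

Under this model a timer is never run early and is delayed by less than one alignment period.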



Blink component

Blink>Scheduling

TAG review



TAG review status

Not applicable

Risks



Interoperability and Compatibility

This feature changes the behavior of an existing API in a way that is spec-compliant (the spec says "Optionally, wait a further implementation-defined length of time", ref.: https://html.spec.whatwg.org/multipage/timers-and-user-prompts.html#run-steps-after-a-timeout). Content that relies on precise timing for DOM Timers may stop working properly in Chromium with this feature. The risk is mitigated by delaying DOM Timers by at most 8 ms, and by disabling the feature when WebRTC has active connections in the process. Content that cannot support an 8 ms delay would probably be better served by alternative APIs described at https://developer.chrome.com/blog/timer-throttling-in-chrome-88/#workarounds. Due to the significant battery savings that come with this feature, we expect that most browsers will decide to implement it after some time.



Gecko: No signal

WebKit: No signal

Web developers: No signals

Other signals:

WebView application risks

Does this intent deprecate or change behavior of existing APIs, such that it has potentially high risk for Android WebView-based applications?



Goals for experimentation

Gain insight on potential compatibility issues and evaluate impact on guardian metrics (page load, latency).



Reason this experiment is being extended



Ongoing technical constraints



Debuggability

This changes the behavior of an existing API. No new debugging support is added.



Will this feature be supported on all six Blink platforms (Windows, Mac, Linux, Chrome OS, Android, and Android WebView)?

Yes

Is this feature fully tested by web-platform-tests?

No

DevTrial instructions

https://github.com/eti-p-doray/align-wakeups/blob/main/HOWTO.md

Flag name

align-wakeups

Requires code in //chrome?

False

Tracking bug

https://crbug.com/1153139

Estimated milestones

OriginTrial desktop last: 105
OriginTrial desktop first: 105
DevTrial on desktop: 105
OriginTrial Android last: 105
OriginTrial Android first: 105
DevTrial on Android: 105
OriginTrial webView last: 105
OriginTrial webView first: 105
We plan to do a 1% Stable experiment for M105 stable.

Link to entry on the Chrome Platform Status

https://chromestatus.com/feature/5680188671655936

Yoav Weiss

Aug 3, 2022, 11:10:52 AM
to Etienne Pierre-doray, blink-dev, Francois Pierre Doray
What's the plan for monitoring potential breakage? Looking at incoming bugs?

On Wed, Aug 3, 2022 at 3:50 PM Etienne Pierre-doray <etie...@chromium.org> wrote:

Gecko: No signal

WebKit: No signal


Might be worthwhile to ask: https://bit.ly/blink-signals
 
--
You received this message because you are subscribed to the Google Groups "blink-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blink-dev+...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CALoDvsaQA8iqxdxNEh1PkBCzPFSsSSmZ72Jgmev-bdwenG6DrQ%40mail.gmail.com.

Lennart Grahl

Aug 3, 2022, 11:42:19 AM
to blink-dev, Etienne Pierre-doray, fdo...@chromium.org
Question: What about APIs that have no proper flow control support, e.g. WebSocket? They rely on (ab)use of setTimeout to avoid writing too much into the underlying buffer. I wouldn't consider a 1s flow disruption delay to be acceptable for this use case, not even in the background. Are there any plans to prevent such issues?
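For readers unfamiliar with the pattern described above, here is a minimal sketch of setTimeout-based flow control over a WebSocket. It relies only on the standard `send()` method and `bufferedAmount` property. The function name `sendWithBackpressure` and the options `highWaterMark`, `pollMs` and `schedule` are illustrative assumptions, not part of any API.

```javascript
// Sketch only: polls bufferedAmount with a timer because WebSocket offers
// no "drained" event. All names except send() and bufferedAmount are
// hypothetical.
function sendWithBackpressure(socket, chunks, options = {}) {
  const {
    highWaterMark = 64 * 1024, // pause once this many bytes are queued
    pollMs = 1,                // how often to re-check the buffer
    schedule = setTimeout,     // injectable so the sketch can run without real timers
  } = options;
  const queue = chunks.slice();

  function pump() {
    // Send while the socket's internal buffer has room.
    while (queue.length > 0 && socket.bufferedAmount < highWaterMark) {
      socket.send(queue.shift());
    }
    // Buffer is full: check again later. This is the timer that alignment
    // (foreground) or throttling (background) would delay.
    if (queue.length > 0) schedule(pump, pollMs);
  }

  pump();
}
```

Every pause in this loop waits on a timer, so any extra timer delay adds directly to the time the socket sits idle between bursts.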

Etienne Pierre-doray

Aug 3, 2022, 1:07:31 PM
to Lennart Grahl, blink-dev, fdo...@chromium.org
What's the plan for monitoring potential breakage? Looking at incoming bugs?
Yes, there have been a few breakages on earlier channels (all addressed as of now). This one was related to DOM timers: https://buganizer.corp.google.com/issues/220682826

Might be worthwhile to ask: https://bit.ly/blink-signals
There's anecdotal evidence that WebKit is aligning timers at 10ms; all I could find re. DOM timers is this throttling at 30Hz in low power mode.
Does https://bit.ly/blink-signals apply even if this chromestatus doesn't change the spec?

Question: What about APIs that have no proper flow control support, e.g. WebSocket? They rely on (ab)use of setTimeout to avoid writing too much into the underlying buffer. I wouldn't consider a 1s flow disruption delay to be acceptable for this use case, not even in the background. Are there any plans to prevent such issues?
Although that was meant for another proposal, this blog post suggests an alternative to some setTimeout abuses. 
Note that this proposal only concerns the 8ms delay. The 1s throttling was done in this previous chromestatus for background pages and shipped in M86.

Yoav Weiss

Aug 3, 2022, 11:27:45 PM
to Etienne Pierre-doray, Lennart Grahl, blink-dev, fdo...@chromium.org
On Wed, Aug 3, 2022 at 7:07 PM Etienne Pierre-doray <etie...@chromium.org> wrote:
What's the plan for monitoring potential breakage? Looking at incoming bugs?
Yes, there have been a few breakages on earlier channels (all addressed as of now). This one was related to DOM timers: https://buganizer.corp.google.com/issues/220682826

For folks outside of Google, the bug describes a site that relied on timer accuracy to schedule tasks, and saw a degradation in their performance metrics. The site in question then fixed it by moving away from those methods. (I hope I'm capturing that correctly. I only skimmed through the issue so please correct me if I got it wrong)

I suspect many non-Google properties run similar code, and would similarly be surprised by this change. e.g. I know that many JS driven animations used to rely on timers, and I'm not sure that's no longer the case. Similarly, performance measurements that rely on timer accuracy as a proxy for "CPU busyness" are common.


Might be worthwhile to ask: https://bit.ly/blink-signals
There's anecdotal evidence that WebKit is aligning timers at 10ms; all I could find re. DOM timers is this throttling at 30Hz in low power mode.
Does https://bit.ly/blink-signals apply even if this chromestatus doesn't change the spec?

The web-exposed behavior is changing here, with potential compatibility and interoperability implications. So even if the spec allows for that, I think it's worthwhile to ask other vendors for their opinions on this.
 

Alex Russell

Aug 4, 2022, 10:05:22 AM
to blink-dev, Yoav Weiss, Lennart Grahl, blink-dev, Francois Pierre Doray, Etienne Pierre-doray
On the animation question, 8ms coalescence should service up to 120 Hz, but high-framerate apps aren't likely to be doing timer-based animation. I'm less worried about that than I am about other, less understood impacts.

Strong +1 on trying to understand everything we can about content that might be affected, and reaching out to developers and other vendors who might have insight into workloads that rely on the current behavior.

That said, I'm also supportive of trying this in Canary/Dev ASAP (without rolling to Beta) to try to get that data.

Best,

Alex


Etienne Pierre-doray

Aug 4, 2022, 1:50:27 PM
to Alex Russell, blink-dev, Yoav Weiss, Lennart Grahl, Francois Pierre Doray
The site in question then fixed it by moving away from those methods. (I hope I'm capturing that correctly. I only skimmed through the issue so please correct me if I got it wrong)
As a solution, we ended up opting all DOM timers out of alignment whenever there's an active WebRTC connection in the process (as a proxy for video conferencing), so similar videoconferencing patterns in non-Google sites won't be affected by the experiment. That being said, there exists an alternative to setTimeout() for this use case, MediaStreamTrackProcessor().readable.
I know that many JS driven animations used to rely on timers, and I'm not sure that's no longer the case.
Like Alex said, animations driven by setTimeout() aren't significantly affected by an 8ms alignment because the typical frame interval (16ms) is aligned on the same boundary (although the well-known alternative rAF is preferred).
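To make the rAF alternative mentioned above concrete, here is a minimal sketch of an animation loop driven by requestAnimationFrame rather than setTimeout(). The names `animate`, `draw` and the injectable `raf` parameter are illustrative assumptions; only requestAnimationFrame itself and the timestamp it passes to the callback are standard.

```javascript
// Sketch only: progress is derived from the frame timestamp, so a late
// callback changes how far the animation jumps, not how long it lasts.
function animate(durationMs, draw, raf = globalThis.requestAnimationFrame) {
  let start = null;

  function frame(timestamp) {
    if (start === null) start = timestamp;
    const progress = Math.min((timestamp - start) / durationMs, 1);
    draw(progress); // caller maps progress in [0, 1] to a visual state
    if (progress < 1) raf(frame);
  }

  raf(frame);
}
```

In a page this could be called as `animate(300, (p) => { box.style.opacity = String(p); })`, where `box` is a hypothetical element. The loop never uses a timer, so timer alignment does not apply to it.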

Similarly, performance measurements that rely on timer accuracy as a proxy for "CPU busyness" are common.
Do you have a concrete example to look at? I would like to understand the use case to identify efficient alternatives similar to what we did in timer-throttling-in-chrome-88.
Considering Windows resolution is ~16ms for delays > 32ms, I wouldn't expect this to be very reliable in the first place; in fact, any setTimeout() with a delay >32ms isn't significantly affected by this proposal.

Might be worthwhile to ask: https://bit.ly/blink-signals
Issue created here

That said, I'm also supportive of trying this in Canary/Dev ASAP (without rolling to Beta) to try to get that data.
For context, we already experimented with AlignWakeUps on Canary/Dev a few times and on Beta in June (~4 weeks). Notably, the experiment showed a substantial reduction (improvement) of Chrome Energy Impact. Another issue that was surfaced (and fixed) during this experiment is crbug.com/1340677 (which is not related to DOM timers).
One main goal of experimenting on stable is to get a statistically significant and reliable signal for battery discharge. With this in mind, there's no added value for us in experimenting on pre-stable anymore (which typically doesn't produce reliable enough data) and we would like to move up to 1% stable.


Stefan Zager

Aug 5, 2022, 1:19:11 PM
to Etienne Pierre-doray, blink-dev, Francois Pierre Doray
This is a common programming pattern:

requestAnimationFrame(() => {
  setTimeout(() => {
    // At this point, it's very likely that layout is clean,
    // because we *just* completed a rendering update.
    // Queries of layout information are very unlikely to
    // trigger a forced layout.
    document.body.offsetTop;
  });
});

Aligning timers will make this strategy for avoiding forced layout less effective, because other non-timer tasks may run ahead of the setTimeout and invalidate layout.

Also: the rAF(setTimeout()) construct is used extensively in Chromium's test corpus, to schedule work ASAP after a rendering update. Many of our tests use a synchronous compositor: rendering updates happen without delay whenever there are no other tasks ready to run. In other tests, we increase the frequency of rendering updates from 60Hz to 200Hz, just because we know there won't be any long-running tasks and we want the tests to run faster.

So... I see potential issues here.


Dave Tapuska

Aug 5, 2022, 2:00:16 PM
to Stefan Zager, Etienne Pierre-doray, blink-dev, Francois Pierre Doray
Stefan, this was just for "non-zero delay" timers? Are there still potential issues there?

Stefan Zager

Aug 5, 2022, 2:17:12 PM
to Dave Tapuska, Stefan Zager, Etienne Pierre-doray, blink-dev, Francois Pierre Doray
On Fri, Aug 5, 2022 at 11:00 AM Dave Tapuska <dtap...@chromium.org> wrote:
Stefan, this was just for "non-zero delay" timers? Are there still potential issues there?

Ah, sorry, I missed that detail. In that case, I think none of my objections apply.

Yoav Weiss

Aug 11, 2022, 5:16:35 AM
to Stefan Zager, Dave Tapuska, Etienne Pierre-doray, blink-dev, Francois Pierre Doray
Do I understand correctly that you're asking for experimentation only in the 105?

We discussed this intent at the API owners meeting yesterday (Daniel, Rego, MikeT and myself), and reached a conclusion that there are two goals for this experiment, but only one of them can be achieved with 1% stable experimentation.
We believe the experiment can show the potential benefits of such a behavior change, but won't necessarily expose compat issues for sites that don't pay very close attention (as it's easy to dismiss bugs in 1% of users as flakes).
Hence, we think it's fine to run the experiment in order to figure out the potential benefits, but would need a more elaborate plan to figure out the compat implications and feasibility of shipping this.

Does that make sense?

Etienne Pierre-doray

Aug 11, 2022, 11:48:54 AM
to Yoav Weiss, Stefan Zager, Dave Tapuska, blink-dev, Francois Pierre Doray
Hi Blink API Owners,
Thanks for taking the time to look into this feature.
 
Do I understand correctly that you're asking for experimentation only in the 105?
This is correct. Although I imagined the following rollout plan, with a separate I2S once I gathered data on Stable:
- (previously) 50% on canary/dev/beta M103/M104
- 50% canary/dev/beta + 1% Stable on M105
- 100% Stable on M106

 won't necessarily expose compat issues for sites that don't pay very close attention (as it's easy to dismiss bugs in 1% of users as flakes).
What would be a suitable roll-out plan to expose compat issues? In similar performance interventions (e.g. Intensive throttling), origin trial (on 50% Beta and 1% Stable) was able to surface issues and provide necessary feedback for launch to be LGTM-ed.

 

Chris Harrelson

Aug 24, 2022, 3:15:30 PM
to Etienne Pierre-doray, Yoav Weiss, Stefan Zager, Dave Tapuska, blink-dev, Francois Pierre Doray
On Thu, Aug 11, 2022 at 8:48 AM Etienne Pierre-doray <etie...@chromium.org> wrote:
Hi Blink API Owners,
Thanks for taking the time to look into this feature.
 
Do I understand correctly that you're asking for experimentation only in the 105?
This is correct. Although I imagined the following rollout plan, with a separate I2S once I gathered data on Stable:
- (previously) 50% on canary/dev/beta M103/M104
- 50% canary/dev/beta + 1% Stable on M105
- 100% Stable on M106

Ok. So your experiment is not an OT, but rather asking permission for an A/B (finch) experiment on those channels?
 

 won't necessarily expose compat issues for sites that don't pay very close attention (as it's easy to dismiss bugs in 1% of users as flakes).
What would be a suitable roll-out plan to expose compat issues? In similar performance interventions (e.g. Intensive throttling), origin trial (on 50% Beta and 1% Stable) was able to surface issues and provide necessary feedback for launch to be LGTM-ed.

We discussed this intent at the API owners meeting today. 50% beta may not yield the feedback you want, because developers or users may conclude that an apparent breakage is a non-reproducible bug because it only reproduces some of the time or on some computers. To correct for this, and given you hope to ship it in one release, I suggest an "optimistic shipping" strategy:

1. Turn on (via finch) at 100% on canary/dev for version N
2. Continue to beta at 100% for version N assuming no bugs reported in step 1
3. After 2.5 weeks at beta with no bugs reported, send an I2S to blink-dev, which we'd approve assuming no issues were reported
4a. Assuming 3 succeeds, proceed to 100% stable when N ships
4b. Assuming it fails, turn off the experiment in beta. This will still leave 1.5 weeks of testing without the change as part of the normal release cycle

This plan is designed to avoid the not-reproducible-bug issue, and also satisfy the need for us to test what is actually shipping on the beta channel.

At present N would be 107.

Etienne Pierre-doray

Aug 24, 2022, 4:35:46 PM
to Chelbi Owre, Chris Harrelson, Dave Tapuska, Francois Pierre Doray, Stefan Zager, Yoav Weiss, blink-dev
Ok. So your experiment is not an OT, but rather asking permission for an A/B (finch) experiment on those channels?
Correct, I very recently discovered the difference between finch and OT; sorry for the confusion.

This plan is designed to avoid the not-reproducible-bug issue, and also satisfy the need for us to test what is actually shipping on the beta channel.
That sounds like a reasonable plan.
Is it still OK to run a 1% experiment on M105 stable in the meantime? Per yoavweiss@'s previous response, this is mostly aimed at evaluating performance/power benefits.


Chris Harrelson

Aug 24, 2022, 4:38:40 PM
to Etienne Pierre-doray, Chelbi Owre, Dave Tapuska, Francois Pierre Doray, Stefan Zager, Yoav Weiss, blink-dev
On Wed, Aug 24, 2022 at 1:35 PM Etienne Pierre-doray <etie...@chromium.org> wrote:
Ok. So your experiment is not an OT, but rather asking permission for an A/B (finch) experiment on those channels?
Correct, I very recently discovered the difference between finch and OT; sorry for the confusion.

This plan is designed to avoid the not-reproducible-bug issue, and also satisfy the need for us to test what is actually shipping on the beta channel.
That sounds like a reasonable plan.
Is it still OK to run a 1% experiment on M105 stable in the meantime? Per yoavweiss@'s previous response, this is mostly aimed at evaluating performance/power benefits.

I think that's fine.
 

Chris Harrelson

Aug 31, 2022, 11:35:45 AM
to Etienne Pierre-doray, Chelbi Owre, Dave Tapuska, Francois Pierre Doray, Stefan Zager, Yoav Weiss, blink-dev
LGTM to roll out the "optimistic shipping" plan up to beta. Then please come back to this thread for final shipping approval.

Morgaine (de la faye)

Sep 6, 2022, 6:26:16 PM
to blink-dev, Etienne Pierre-doray, fdo...@chromium.org
I don't love this at all. A more flexible batching/coalescing that *can* batch/coalesce when it makes sense, but which preserves the desired/asked for wake-ups when they are all alone & there's nothing to coalesce to would be far more reasonable/kind. This feels like it applies a huge cudgel to all users because some users use too many wakeups. I think most of the expected power savings can be gotten for the bad users, without such a vast & sad negative impact for the responsible users. Linux gained a similar "Tickless" kernel support capability in 2.6.6, May 2004, and this feels like a significant reversion to a state of tech far predating that basic innovation. I cannot support the sacrifices made here. Please do a little more work on this to only impact bad users/places where this might help; not everyone.

Couple other details:

Can authors still expect long term guarantees? If I request 10ms setInterval, and then write a counter that counts to a thousand, will ~10s have passed to get there? Folks doing media-timing rely on their setIntervals being wall-clock stable-ish; does this significant change threaten that?

125Hz is going to serve up a tick at least once per frame up to 120 Hz. But the amount of available execution time before each frame is going to phase in & out of sync (at almost any display refresh rate) such that there are times when there is plenty of execution time, & times when the tick happens right near where the frame tick would be. This spec seems like it might increase the probability of jank occurring at regular intervals.

In general I think this tick rate would have been a good target for 2015, but that we are hoping to build more snappy & responsive systems. 250Hz would be a better tick rate, coming with 4ms potential delays rather than 8ms.
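The phase relationship between a 125 Hz timer tick and a 120 Hz display, raised above, can be worked through with a little arithmetic. The sketch below is an idealized model, assuming both clocks start in phase at t = 0 and never drift, which real hardware will not guarantee. It shows only how the gap between a tick and the next frame boundary varies, not whether jank actually results. The function name `slackMs` is hypothetical.

```javascript
// Work in units of 1/3 ms so both periods are exact integers.
const UNITS_PER_MS = 3;
const TICK_UNITS = 24;  // 8 ms timer tick (125 Hz)
const FRAME_UNITS = 25; // 8.333... ms frame (120 Hz)

// Time from timer tick number `tickIndex` to the next frame boundary, in ms.
function slackMs(tickIndex) {
  const offsetIntoFrame = (tickIndex * TICK_UNITS) % FRAME_UNITS;
  // 0 means the tick lands exactly on a frame boundary.
  const slackUnits = (FRAME_UNITS - offsetIntoFrame) % FRAME_UNITS;
  return slackUnits / UNITS_PER_MS;
}

console.log(slackMs(0));  // 0
console.log(slackMs(3));  // 1
console.log(slackMs(24)); // 8
console.log(slackMs(25)); // 0 (pattern repeats)
```

In this model the gap grows by 1/3 ms per tick and wraps every 25 ticks, so the pattern repeats every 200 ms (25 ticks of 8 ms, or 24 frames at 120 Hz).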

Stefan Zager

Sep 6, 2022, 7:19:32 PM
to Morgaine (de la faye), blink-dev, Etienne Pierre-doray, fdo...@chromium.org
On Tue, Sep 6, 2022 at 3:26 PM Morgaine (de la faye) <rek...@gmail.com> wrote:
I don't love this at all. A more flexible batching/coalescing that *can* batch/coalesce when it makes sense, but which preserves the desired/asked for wake-ups when they are all alone & there's nothing to coalesce to would be far more reasonable/kind. This feels like it applies a huge cudgel to all users because some users use too many wakeups. I think most of the expected power savings can be gotten for the bad users, without such a vast & sad negative impact for the responsible users. Linux gained a similar "Tickless" kernel support capability in 2.6.6, May 2004, and this feels like a significant reversion to a state of tech far predating that basic innovation. I cannot support the sacrifices made here. Please do a little more work on this to only impact bad users/places where this might help; not everyone.

Can you give an example where it's important to have greater precision? Even now, timers are not guaranteed to be serviced exactly when they expire -- a browser is not a real-time OS -- so what is actually lost?

I'm not sure what "media timing" is exactly, but if you need precise elapsed time, then performance.now() is the recommended approach.

As for aligning with display refresh rate: a timer will never do better than requestAnimationFrame in this regard.
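As an illustration of the performance.now() recommendation above, the sketch below derives an elapsed tick count from the clock rather than from how many times a callback has run. The result stays tied to elapsed time even when individual callbacks run late or are skipped. The names `createTickCounter` and `now` are illustrative assumptions; only performance.now() is a standard API.

```javascript
// Sketch only: `now` is injectable so the behavior can be checked
// without waiting on a real clock.
function createTickCounter(intervalMs, now = () => performance.now()) {
  const start = now();
  return function elapsedTicks() {
    // Derive the count from elapsed time, not from callback invocations.
    return Math.floor((now() - start) / intervalMs);
  };
}

// Possible usage in a page (render is a hypothetical function):
//   const ticks = createTickCounter(10);
//   setInterval(() => render(ticks()), 10);
```

With this approach, the question of whether a thousand callbacks take about 10 s no longer affects correctness, because the count always reflects time actually elapsed.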
 