Newer timer options - high resolution waitable timers

Bruce Dawson

Oct 18, 2020, 7:45:48 PM
to scheduler-dev
Based on a commit to the Go runtime I just ran some tests using the undocumented CREATE_WAITABLE_TIMER_HIGH_RESOLUTION flag with CreateWaitableTimerEx. It allows high-resolution waits without raising the timer interrupt frequency. In some cases these waits can be much shorter than those that are possible with the other mechanisms.

My test code is available here:

https://github.com/randomascii/blogstuff/blob/main/timer_interval/waitable_timer.cpp

I verified these results, from the original Go commit message:

1. CREATE_WAITABLE_TIMER_HIGH_RESOLUTION is off - timeBeginPeriod is off
delay is 1000 us - slept for 4045 us
delay is 100 us - slept for 3915 us
delay is 10 us - slept for 3291 us
delay is 1 us - slept for 2234 us

2. CREATE_WAITABLE_TIMER_HIGH_RESOLUTION is on - timeBeginPeriod is off
delay is 1000 us - slept for 1076 us
delay is 100 us - slept for 569 us
delay is 10 us - slept for 585 us
delay is 1 us - slept for 17 us

3. CREATE_WAITABLE_TIMER_HIGH_RESOLUTION is off - timeBeginPeriod is on
delay is 1000 us - slept for 742 us
delay is 100 us - slept for 893 us
delay is 10 us - slept for 414 us
delay is 1 us - slept for 920 us

4. CREATE_WAITABLE_TIMER_HIGH_RESOLUTION is on - timeBeginPeriod is on
delay is 1000 us - slept for 1466 us
delay is 100 us - slept for 559 us
delay is 10 us - slept for 535 us
delay is 1 us - slept for 5 us

In other words, this lets a program do shorter sleeps than were previously possible without having to raise the timer interrupt frequency.
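
For readers who don't want to open the test file, here is a minimal sketch of the pattern being measured. It is not the linked test code. It assumes the value 0x00000002 for CREATE_WAITABLE_TIMER_HIGH_RESOLUTION when the SDK headers in use don't define it, and it assumes timer creation fails on Windows versions that don't recognize the flag, so it retries without it. The helper names RelativeDueTime100ns and SleepMicros are made up for illustration, and the non-Windows branch is only a stand-in so the snippet builds elsewhere.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

#ifdef _WIN32
#include <windows.h>
#ifndef CREATE_WAITABLE_TIMER_HIGH_RESOLUTION
// Assumption: value used by recent Windows SDK headers.
#define CREATE_WAITABLE_TIMER_HIGH_RESOLUTION 0x00000002
#endif
#endif

// SetWaitableTimer takes due times in 100 ns units; a negative value
// means "relative to now". (Hypothetical helper name.)
inline int64_t RelativeDueTime100ns(int64_t delay_us) {
  assert(delay_us >= 0);
  return -(delay_us * 10);
}

// Sleeps for roughly delay_us microseconds. Returns false if the timer
// could not be created, armed, or waited on. (Hypothetical helper name.)
inline bool SleepMicros(int64_t delay_us) {
#ifdef _WIN32
  // Ask for a high-resolution timer first, then fall back to a
  // regular waitable timer if that request is rejected.
  HANDLE timer = CreateWaitableTimerExW(
      nullptr, nullptr, CREATE_WAITABLE_TIMER_HIGH_RESOLUTION,
      TIMER_ALL_ACCESS);
  if (!timer)
    timer = CreateWaitableTimerExW(nullptr, nullptr, 0, TIMER_ALL_ACCESS);
  if (!timer)
    return false;

  LARGE_INTEGER due;
  due.QuadPart = RelativeDueTime100ns(delay_us);
  bool ok = SetWaitableTimer(timer, &due, 0, nullptr, nullptr, FALSE) &&
            WaitForSingleObject(timer, INFINITE) == WAIT_OBJECT_0;
  CloseHandle(timer);
  return ok;
#else
  // Stand-in so the sketch compiles off Windows; not the API under test.
  std::this_thread::sleep_for(std::chrono::microseconds(delay_us));
  return true;
#endif
}
```

Timing a call such as SleepMicros(100) with QueryPerformanceCounter, with and without the flag, is roughly how the "slept for" numbers above are obtained.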

To see the flag in action you need a machine running Windows 10 1803 or later, because I believe that is when this flag was introduced. For full results the machine also must not have the timer interrupt frequency raised (my test code checks for that). Googler machines usually fail that condition because always-running Go programs raise the timer interrupt frequency (the aforementioned Go runtime commit should allow this to be fixed).

It's an intriguing possibility to be able to get more accurate timers without raising the timer interrupt frequency. On the other hand it raises the possibility of abuse since it will desynchronize our timers and could make things consume more power.

I thought I'd open this up for some initial discussion before filing a bug.

-- 

Bruce Dawson

François Doray

Oct 19, 2020, 10:04:50 AM
to Bruce Dawson, scheduler-dev
If I understand correctly, raising the timer frequency for short waits but not for long waits (current state) results in the wake-ups that follow long waits being aligned. Using CREATE_WAITABLE_TIMER_HIGH_RESOLUTION could break the alignment and result in more power consumption. Is that correct? If that's a problem, we could adjust the wait timeouts requested by the scheduler to have wake-ups that are ~aligned across threads and processes (using base::TimeTicks::SnappedToNextTick).

On another note, olivierli@ implemented the Power.Mac.BatteryDischarge histogram to assess the battery discharge rate on Mac (unit: 1/10000 of full capacity per minute). It is a top priority for us to implement the same histogram on Windows. It can probably be used to assess the impact of using CREATE_WAITABLE_TIMER_HIGH_RESOLUTION in the wild?


Gabriel Charette

Oct 19, 2020, 11:42:08 AM
to François Doray, Bruce Dawson, scheduler-dev
On Mon, Oct 19, 2020 at 10:04 AM François Doray <fdo...@chromium.org> wrote:
> If I understand correctly, raising the timer frequency for short waits but not for long waits (current state) results in the wake-ups that follow long waits being aligned.

Which "long waits" do you mean? Other apps'? In Chrome we sadly don't make an effort to align long waits (we just don't ask for high-frequency). Say you start two timers for 30 seconds; they will naturally start a few microseconds apart (the time delta between executing the two PostDelayedTask() calls). If the OS wakes up on time for the first one, then it's possible that it goes back to sleep with microseconds to go on the second one. Ideally we wouldn't ask the OS to wake up at that precise time for a lo-res timer when we know we have another lo-res timer *right* after it. We could do this "easily" as an experiment by adding something like 3% (1s/30s) on all lo-res sleeps (then we get alignment even if the system is otherwise hi-res because of other apps).
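
To make the 3% idea concrete, here is a rough sketch of one possible reading of it: each lo-res timer may fire anywhere between its deadline and its deadline plus a slack proportional to its delay, and timers whose windows overlap share one wake-up. The struct and function names (LoResTimer, CoalescedWakeups) are invented for this illustration and are not Chrome APIs; the greedy pass is a standard interval-stabbing approach, not something already implemented in the scheduler.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of a low-resolution delayed task.
struct LoResTimer {
  int64_t start_us;  // when the timer was posted
  int64_t delay_us;  // requested delay
};

// Returns the wake-up times needed when each timer may fire anywhere in
// [deadline, deadline + slack_fraction * delay].
inline std::vector<int64_t> CoalescedWakeups(
    const std::vector<LoResTimer>& timers, double slack_fraction) {
  assert(slack_fraction >= 0.0);
  struct Window {
    int64_t earliest;
    int64_t latest;
  };
  std::vector<Window> windows;
  for (const LoResTimer& t : timers) {
    const int64_t deadline = t.start_us + t.delay_us;
    const int64_t slack = static_cast<int64_t>(t.delay_us * slack_fraction);
    windows.push_back({deadline, deadline + slack});
  }
  // Greedy interval stabbing: visit windows by latest allowed time and
  // only schedule a new wake-up when the previous one is too early.
  std::sort(windows.begin(), windows.end(),
            [](const Window& a, const Window& b) { return a.latest < b.latest; });
  std::vector<int64_t> wakeups;
  for (const Window& w : windows) {
    if (wakeups.empty() || wakeups.back() < w.earliest)
      wakeups.push_back(w.latest);
  }
  return wakeups;
}
```

With two 30-second timers posted 5 microseconds apart, zero slack needs two wake-ups, while 3% slack lets both be served by a single one.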
 
> Using CREATE_WAITABLE_TIMER_HIGH_RESOLUTION could break the alignment and result in more power consumption. Is that correct? If that's a problem, we could adjust the wait timeouts requested by the scheduler to have wake-ups that are ~aligned across threads and processes (using base::TimeTicks::SnappedToNextTick).

> On another note, olivierli@ implemented the Power.Mac.BatteryDischarge histogram to assess the battery discharge rate on Mac (unit: 1/10000 of full capacity per minute). It is a top priority for us to implement the same histogram on Windows. It can probably be used to assess the impact of using CREATE_WAITABLE_TIMER_HIGH_RESOLUTION in the wild?

Awesome! Sounds useful for A/B comparison (if precise enough).
 

François Doray

Oct 19, 2020, 12:00:37 PM
to Gabriel Charette, Bruce Dawson, scheduler-dev
On Mon, Oct 19, 2020 at 11:42 AM Gabriel Charette <g...@chromium.org> wrote:
> On Mon, Oct 19, 2020 at 10:04 AM François Doray <fdo...@chromium.org> wrote:
>> If I understand correctly, raising the timer frequency for short waits but not for long waits (current state) results in the wake-ups that follow long waits being aligned.
>
> Which "long waits" do you mean? Other apps'? In Chrome we sadly don't make an effort to align long waits (we just don't ask for high-frequency). Say you start two timers for 30 seconds; they will naturally start a few microseconds apart (the time delta between executing the two PostDelayedTask() calls). If the OS wakes up on time for the first one, then it's possible that it goes back to sleep with microseconds to go on the second one. Ideally we wouldn't ask the OS to wake up at that precise time for a lo-res timer when we know we have another lo-res timer *right* after it. We could do this "easily" as an experiment by adding something like 3% (1s/30s) on all lo-res sleeps (then we get alignment even if the system is otherwise hi-res because of other apps).

I was speculating that when we don't enable the high-resolution timer, Windows naturally aligns the wake ups that follow our long waits. I don't know if that's true. brucedawson@: What did you mean by "it will desynchronize our timers and could make things consume more power"?

Bruce Dawson

Oct 19, 2020, 12:14:18 PM
to François Doray, Gabriel Charette, scheduler-dev
When using our current mechanism (WaitForSingleObject/WaitForMultipleObjects/Sleep with a timeout), our waits are synchronized to the regular timer interrupt, so they are automatically aligned/coalesced. That is my understanding, at least. This is particularly true when running on battery, where we run the timer interrupt at a maximum of 125 Hz. There should be ~7-8 ms periods where all of our threads are sleeping, which Intel has said is crucial for reducing power draw.

If each thread (and each process) is using high-resolution waitable timers to ask the OS to wake them up in, say, 8 ms then the actual wakeup times could well end up being staggered such that the CPU never gets to sleep.

As François said, we could ask the waitable timers to wake us up at specific absolute times (instead of relative times) and quantize those to any desired precision (1, 2, 4, 8, or 16 ms). That should give us timer coalescing across processes, and arbitrarily high resolution timers, without raising the timer interrupt frequency.
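
A rough sketch of that quantization, to show the arithmetic: round each absolute wake-up time up to the next multiple of the chosen interval, so that requests made at slightly different moments land on the same instant. The function names (SnapWakeTimeUp, QuantizedDeadline) are hypothetical and this is not the implementation of base::TimeTicks::SnappedToNextTick; it also assumes every thread and process measures time in microseconds against the same clock, which is what makes the grid line up across processes.

```cpp
#include <cassert>
#include <cstdint>

// Rounds an absolute wake-up time (microseconds on a shared clock) up to
// the next multiple of interval_us. Times already on the grid are kept.
inline int64_t SnapWakeTimeUp(int64_t wake_time_us, int64_t interval_us) {
  assert(wake_time_us >= 0 && interval_us > 0);
  const int64_t remainder = wake_time_us % interval_us;
  return remainder == 0 ? wake_time_us
                        : wake_time_us + (interval_us - remainder);
}

// Absolute, quantized wake-up time for a request to sleep for delay_us
// starting at now_us.
inline int64_t QuantizedDeadline(int64_t now_us, int64_t delay_us,
                                 int64_t interval_us) {
  return SnapWakeTimeUp(now_us + delay_us, interval_us);
}
```

For example, two threads that ask for an 8 ms sleep at t = 1000123 us and t = 1003999 us both end up with a wake-up at t = 1016000 us when the interval is 8000 us. On Windows the result would still need converting to the 100 ns absolute due time that SetWaitableTimer accepts as a positive value.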

Further complicating things:
  • High resolution waitable timers only work on Windows 10 1803 and above, so most of our users have them, but far from all.
  • Microsoft's handling of the old-school timer interrupt scheduling changed in Windows 10 2004 such that raising the timer interrupt frequency is not as bad for power as it used to be, so the benefits may be lower.
I really want Microsoft or Intel to chime in on this. They must have a lot of data that drove these two timer changes and it would be great if they could share it.
--
Bruce Dawson

Olivier Li Shing Tat-Dupuis

Oct 21, 2020, 9:39:25 AM
to Bruce Dawson, François Doray, Gabriel Charette, scheduler-dev
This is all very interesting.

Following my experiments with power use on Mac, timer coalescing / task alignment is actually one of the first things I want to attack (once we are happy with the metric situation to track and justify the efforts).

I hadn't even thought of the resolution of the OS timers actually and how that influences such work.

@Bruce Dawson : I'm currently digging through documentation to get an idea of the power state dynamics on Intel CPUs. Can you please tell me where you got information from them? And maybe I could sync with someone which would be even better.

Thanks!



--

Olivier Li Shing Tat-Dupuis | Software developer | oliv...@google.com | 1-415-761-1995

Bruce Dawson

Oct 21, 2020, 11:54:47 AM
to Olivier Li Shing Tat-Dupuis, François Doray, Gabriel Charette, scheduler-dev
crbug.com/927165 has some discussion of Intel CPU power curves. The filer of that bug works for Intel and if you contact them directly you may be able to get more detailed information - ping me if you need an introduction.
--
Bruce Dawson

Sami Kyostila

Oct 22, 2020, 12:01:15 PM
to Bruce Dawson, Olivier Li Shing Tat-Dupuis, François Doray, Gabriel Charette, scheduler-dev
IIRC IE/Edge had logic to snap timers that are close to the vsync frequency to the actual vsync signal. Something like that could help reduce wake-ups here too.

- Sami

Bruce Dawson

Oct 22, 2020, 8:35:37 PM
to Sami Kyostila, Olivier Li Shing Tat-Dupuis, François Doray, Gabriel Charette, scheduler-dev
Yep, the vsync logic in IE was added around the time that I was there. stanisc@ tried to get similar logic working in Chrome but hit numerous obstacles. One of those was that waiting on vsync interrupts caused the GPU to stay "more awake", which in some scenarios actually increased power draw. The potential gains were always relatively small, and with some non-zero losses it stopped seeming so great.

I think IE did it not for power reasons but for smoothness, although I'm sure there were overlapping design constraints.
--
Bruce Dawson
