[PATCH] hpet: handle wrap-around with 32-bit counter

Waldemar Kozaczuk

unread,

Jun 28, 2019, 9:36:35 AM6/28/19

to osv...@googlegroups.com, Waldemar Kozaczuk

This patch enhances the hpet clock with 32-bit main
counter to handle wrap-arounds. It does it by maintaining
separate upper 32-bit counter and incrementing it
when wrap around is detected.

Signed-off-by: Waldemar Kozaczuk <jwkoz...@gmail.com>
---
drivers/hpet.cc | 26 +++++++++++++++++++++-----
1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/drivers/hpet.cc b/drivers/hpet.cc
index ba9ed3bc..a6696405 100644
--- a/drivers/hpet.cc
+++ b/drivers/hpet.cc
@@ -34,21 +34,37 @@ protected:

#define HPET_COUNTER 0x0f0

-//FIXME: Enhance this class to handle wrap-around
class hpet_32bit_clock : public hpetclock {
public:
hpet_32bit_clock(mmioaddr_t hpet_mmio_address) : hpetclock(hpet_mmio_address) {
- debug_early_u64("WARNING: hpet with 32-bit counter will wrap around in seconds: ",
- (_period * (1UL << 32)) / 1000000000UL);
+ _upper_32bit_counter.store(0);
}
protected:
virtual s64 time() override __attribute__((no_instrument_function)) {
- return _wall + mmio_getl(_addr + HPET_COUNTER) * _period;
+ return _wall + fetch_counter() * _period;
}

virtual s64 uptime() override __attribute__((no_instrument_function)) {
- return mmio_getl(_addr + HPET_COUNTER) * _period;
+ return fetch_counter() * _period;
+ }
+private:
+ s64 fetch_counter() {
+ auto last_valid_counter = _last_read_counter.load();
+ auto upper_counter = _upper_32bit_counter.load();
+
+ s64 counter = mmio_getl(_addr + HPET_COUNTER);
+ if (counter < last_valid_counter) { // Wrap-around
+ // In case more than one thread fetches the value and encounter
+ // the wrap-around, make sure the upper counter is only incremented
+ // once
+ _upper_32bit_counter.compare_exchange_strong(upper_counter, upper_counter + 1);
+ }
+
+ _last_read_counter.store(counter);
+ return _upper_32bit_counter.load() * (1UL << 32) + counter;
}
+
+ std::atomic<s64> _upper_32bit_counter;
};

class hpet_64bit_clock : public hpetclock {
--
2.20.1

Nadav Har'El

unread,

Jun 30, 2019, 4:31:54 AM6/30/19

to Waldemar Kozaczuk, Osv Dev

Again, can use relaxed memory ordering in this load, or we're trying to order some memory access with something else? Maybe "counter"?

+
+ s64 counter = mmio_getl(_addr + HPET_COUNTER);
+ if (counter < last_valid_counter) { // Wrap-around

+ // In case more than one thread fetches the value and encounter
+ // the wrap-around, make sure the upper counter is only incremented
+ // once
+ _upper_32bit_counter.compare_exchange_strong(upper_counter, upper_counter + 1);

This doesn't look 100% safe to me. Look at the following time sequence on two CPUs:

CPU 1 CPU2

last_counter=2^32-1 last_counter=2^32-1

upper_counter = 17

counter = 137 (<last_counter)

upper_counter = 18 (with CAS)

upper_counter = 18

counter=785 (<last_counter)

upper_counter = 19 (with CAS)

last_counter = 137

last_counter = 18

I think that the end result of this extremely-unlikely timing is that upper_counter was incremented twice (from 17 to 19) despite just one wrap-around period passing.

I wonder if it wouldn't be much simpler to have the upper_counter per-cpu instead of trying to make it global. We would need

to assume that each of the CPUs calls the clock many times per wrap-around period to be able to catch this wrap-around, but

I think that with an 8 minute period (IIRC... am I right? can you please verify?), this will always be the case.

By the way, I believe we also get an interrupt on every wrap-around, so can use that to count the wrap-around instead of

trying to guess when it happens. Can you please take a look at other HPET drivers (e.g., in Linux) to see how they handle

32-bit hpet and perhaps get ideas?

+ }
+
+ _last_read_counter.store(counter);
+ return _upper_32bit_counter.load() * (1UL << 32) + counter;

Nitpick: Instead of *(1<<32) you can (s64).. << 32, instead of hoping the optimizer will replace the multiplication by a shift.

}
+
+ std::atomic<s64> _upper_32bit_counter;
};

class hpet_64bit_clock : public hpetclock {
--
2.20.1

--
You received this message because you are subscribed to the Google Groups "OSv Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to osv-dev+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/20190628133630.7563-1-jwkozaczuk%40gmail.com.
For more options, visit https://groups.google.com/d/optout.

Waldek Kozaczuk

unread,

Jul 2, 2019, 12:03:43 AM7/2/19

to OSv Development

My previous "enable 32-bit hpet" patch you applied makes possible to run OSv on hyperkit for ~ 8 minutes. So honestly, I am just aiming to extend this 8-minutes to something longer with as little coding as possible ;-) As somebody said nobody will ever run OSv with hpet in production. But it is not to say we should not make it as correct as possible with as little effort as possible. Trying to find a compromise :-)

I will try to send a better patch but probably without implementing interrupts.

Good idea with per-CPU counter. Yes it is ~ 8 minutes.

By the way, I believe we also get an interrupt on every wrap-around, so can use that to count the wrap-around instead of
trying to guess when it happens. Can you please take a look at other HPET drivers (e.g., in Linux) to see how they handle
32-bit hpet and perhaps get ideas?

Handling interrupts sounds like a lot more work. I did look at Linux and the hpet implementation there is rather quite long (> 1K lines of code), not easy to follow and ridden with complaints about hpet design like this - https://github.com/torvalds/linux/blob/master/arch/x86/kernel/hpet.c#L363-L384.

+ }
+
+ _last_read_counter.store(counter);
+ return _upper_32bit_counter.load() * (1UL << 32) + counter;

Nitpick: Instead of *(1<<32) you can (s64).. << 32, instead of hoping the optimizer will replace the multiplication by a shift.

}
+
+ std::atomic<s64> _upper_32bit_counter;
};

class hpet_64bit_clock : public hpetclock {
--
2.20.1

--
You received this message because you are subscribed to the Google Groups "OSv Development" group.

To unsubscribe from this group and stop receiving emails from it, send an email to osv...@googlegroups.com.

Waldemar Kozaczuk

unread,

Jul 5, 2019, 11:37:25 PM7/5/19

to osv...@googlegroups.com, Waldemar Kozaczuk

This patch enhances the hpet clock with 32-bit main
counter to handle wrap-arounds. It does it by maintaining

separate upper 32-bit counter per-cpu and incrementing it

when wrap around is detected.

Signed-off-by: Waldemar Kozaczuk <jwkoz...@gmail.com>
---

drivers/hpet.cc | 53 ++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 46 insertions(+), 7 deletions(-)

diff --git a/drivers/hpet.cc b/drivers/hpet.cc
index 22afae7d..d08afa76 100644
--- a/drivers/hpet.cc
+++ b/drivers/hpet.cc
@@ -18,6 +18,7 @@ extern "C" {
#include <osv/xen.hh>
#include <osv/irqlock.hh>
#include "rtc.hh"
+#include <osv/percpu.hh>

using boost::intrusive::get_parent_from_member;

@@ -33,23 +34,61 @@ protected:

#define HPET_COUNTER 0x0f0

-//FIXME: Enhance this class to handle wrap-around

+// The hpet clocks are tricky to implement right. Ideally hpet clocks,
+// especially those with 32-bit main counter, should be avoided in
+// virtualized environments. Unfortunately some hypervisors like
+// hyperkit and other bhyve derivatives provide 32-bit hpet clock only and
+// we would like to support those hypervisors as well to let potential
+// users test OSv on OSX and FreeBSD.
+// The implementation below is a compromise that delivers monotonic
+// reads but may suffer from inaccurate time reads especially when
+// OSv guest gets suspended by host. In essence it maintains it own
+// upper 32-bit counter per cpu which gets incremented every time
+// it gets less than previously saved main counter read from the host.
+// In future we can improve it by handling wrap-around interrupts but
+// but it is not clear if it would be better and especially worth the effort.

class hpet_32bit_clock : public hpetclock {
public:

- hpet_32bit_clock(mmioaddr_t hpet_mmio_address) : hpetclock(hpet_mmio_address) {

- debug_early_u64("WARNING: hpet with 32-bit counter will wrap around in seconds: ",
- (_period * (1UL << 32)) / 1000000000UL);

- }
+ hpet_32bit_clock(mmioaddr_t hpet_mmio_address) : hpetclock(hpet_mmio_address),
+ cpu_notifier([&] { init_on_cpu(); }) {}
+

protected:
virtual s64 time() override __attribute__((no_instrument_function)) {
- return _wall + mmio_getl(_addr + HPET_COUNTER) * _period;
+ return _wall + fetch_counter() * _period;
}

virtual s64 uptime() override __attribute__((no_instrument_function)) {
- return mmio_getl(_addr + HPET_COUNTER) * _period;
+ return fetch_counter() * _period;
+ }
+private:

+ void init_on_cpu() {
+ (*_upper_32bit_counter).store(0, std::memory_order_relaxed);
+ (*_last_read_counter).store(0, std::memory_order_relaxed);
}
+
+ s64 fetch_counter() {
+ auto last_valid_counter = (*_last_read_counter).load(std::memory_order_relaxed);
+ auto upper_counter = (*_upper_32bit_counter).load(std::memory_order_relaxed);

+
+ s64 counter = mmio_getl(_addr + HPET_COUNTER);
+ if (counter < last_valid_counter) {

+ // Wrap-around - increment upper counter
+ (*_upper_32bit_counter).compare_exchange_weak(upper_counter, upper_counter + 1);
+ }
+
+ (*_last_read_counter).store(counter, std::memory_order_relaxed);
+ return ((*_upper_32bit_counter).load(std::memory_order_relaxed) << 32) + counter;
+ }
+
+ static percpu<std::atomic<s64>> _upper_32bit_counter;
+ static percpu<std::atomic<s64>> _last_read_counter;
+
+ sched::cpu::notifier cpu_notifier;
};

+PERCPU(std::atomic<s64>, hpet_32bit_clock::_upper_32bit_counter);
+PERCPU(std::atomic<s64>, hpet_32bit_clock::_last_read_counter);
+
class hpet_64bit_clock : public hpetclock {
public:
hpet_64bit_clock(mmioaddr_t hpet_mmio_address) : hpetclock(hpet_mmio_address) {}
--
2.20.1

Nadav Har'El

unread,

Aug 12, 2019, 3:59:26 AM8/12/19

to Waldemar Kozaczuk, Osv Dev

In Sat, Jul 6, 2019 at 6:37 AM Waldemar Kozaczuk <jwkoz...@gmail.com> wrote:

This patch enhances the hpet clock with 32-bit main
counter to handle wrap-arounds. It does it by maintaining
separate upper 32-bit counter per-cpu and incrementing it
when wrap around is detected.

I have a few comments below, but in general I think this direction is sort-of-acceptable in the sense that

it is better than the current situation, which doesn't handle the wrap-around at all. So I think we can

commit this (though please look at my comments below).

But in the long run, I can't say I thin this approach is very reliable. If one CPU misses a wrap-around,

it will have its clock differing from that of other CPUs and this will never be corrected. If all of them miss

the wrap-around (e.g., suspend), none of them get the up-to-date time. To fix those things, maybe we

need to query the RTC on every wrap-around, and perhaps keep the same upper-counter on all CPUs

(but if we do that, we need to do this safely and correctly).

If you want to plan a long-term solution, I suggest you take a look at what Linux does for 32-bit HPET.

I suggest this for two reasons: First, a lot of smart people contributed to Linux, and there is a lot of

experience with their code. But second, since 99% of the OSs running on these hypervisors are Linux,

the authors of these hypervisors probably tuned their implementation of HPET to work correctly on

Linux, so anything deviating from what Linux does there might not work as well.

Is this even needed? Unless I'm misremembering, per-cpu variables are zero-initialized by default, as all global variables.

}
+
+ s64 fetch_counter() {
+ auto last_valid_counter = (*_last_read_counter).load(std::memory_order_relaxed);
+ auto upper_counter = (*_upper_32bit_counter).load(std::memory_order_relaxed);
+
+ s64 counter = mmio_getl(_addr + HPET_COUNTER);
+ if (counter < last_valid_counter) {
+ // Wrap-around - increment upper counter
+ (*_upper_32bit_counter).compare_exchange_weak(upper_counter, upper_counter + 1);

compare_exchange_weak should be done in a loop, I think you wanted _strong?

I'm starting to think maybe you did these atomic operations as protection against multiple threads, on the same CPU, trying to get time concurrently?

This can be done by the cheaper (on many-core machines) preempt_lock and I think the code will also be easier to understand.

Although, to be honest, the difference will probably be negligable (the compare_exchange above happens rarely, and just the relaxed load has no performance penalty.

+ }
+
+ (*_last_read_counter).store(counter, std::memory_order_relaxed);
+ return ((*_upper_32bit_counter).load(std::memory_order_relaxed) << 32) + counter;
+ }
+
+ static percpu<std::atomic<s64>> _upper_32bit_counter;
+ static percpu<std::atomic<s64>> _last_read_counter;

+
+ sched::cpu::notifier cpu_notifier;
};

+PERCPU(std::atomic<s64>, hpet_32bit_clock::_upper_32bit_counter);
+PERCPU(std::atomic<s64>, hpet_32bit_clock::_last_read_counter);

+
class hpet_64bit_clock : public hpetclock {
public:
hpet_64bit_clock(mmioaddr_t hpet_mmio_address) : hpetclock(hpet_mmio_address) {}
--
2.20.1

--
You received this message because you are subscribed to the Google Groups "OSv Development" group.

To unsubscribe from this group and stop receiving emails from it, send an email to osv-dev+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/20190706033715.5981-1-jwkozaczuk%40gmail.com.

Commit Bot

unread,

Aug 19, 2019, 10:19:47 PM8/19/19

to osv...@googlegroups.com, Waldemar Kozaczuk

From: Waldemar Kozaczuk <jwkoz...@gmail.com>
Committer: Waldemar Kozaczuk <jwkoz...@gmail.com>
Branch: master

hpet: handle wrap-around with 32-bit counter

This patch enhances the hpet clock with 32-bit main
counter to handle wrap-arounds. It does it by maintaining
separate upper 32-bit counter per-cpu and incrementing it
when wrap around is detected.

Signed-off-by: Waldemar Kozaczuk <jwkoz...@gmail.com>

---

diff --git a/drivers/hpet.cc b/drivers/hpet.cc

--- a/drivers/hpet.cc
+++ b/drivers/hpet.cc
@@ -18,6 +18,7 @@ extern "C" {
#include <osv/xen.hh>
#include <osv/irqlock.hh>
#include "rtc.hh"
+#include <osv/percpu.hh>

using boost::intrusive::get_parent_from_member;

@@ -33,23 +34,53 @@ class hpetclock : public clock {

hpetclock(hpet_mmio_address) {}

+
protected:
virtual s64 time() override __attribute__((no_instrument_function)) {
- return _wall + mmio_getl(_addr + HPET_COUNTER) * _period;
+ return _wall + fetch_counter() * _period;
}

virtual s64 uptime() override __attribute__((no_instrument_function)) {
- return mmio_getl(_addr + HPET_COUNTER) * _period;
+ return fetch_counter() * _period;
}

+private:

+ s64 fetch_counter() {
+ auto last_valid_counter =
(*_last_read_counter).load(std::memory_order_relaxed);
+ auto upper_counter =
(*_upper_32bit_counter).load(std::memory_order_relaxed);
+
+ s64 counter = mmio_getl(_addr + HPET_COUNTER);
+ if (counter < last_valid_counter) {
+ // Wrap-around - increment upper counter

+ (*_upper_32bit_counter).compare_exchange_strong(upper_counter,
upper_counter + 1);

+ }
+
+ (*_last_read_counter).store(counter, std::memory_order_relaxed);
+ return ((*_upper_32bit_counter).load(std::memory_order_relaxed) <<
32) + counter;
+ }
+
+ static percpu<std::atomic<s64>> _upper_32bit_counter;
+ static percpu<std::atomic<s64>> _last_read_counter;

};

Waldek Kozaczuk

unread,

Aug 19, 2019, 10:26:45 PM8/19/19

to OSv Development

Thanks for the review. I have made some tweaks per your suggestions and pushed it.

On Monday, August 12, 2019 at 3:59:26 AM UTC-4, Nadav Har'El wrote:

In Sat, Jul 6, 2019 at 6:37 AM Waldemar Kozaczuk <jwkoz...@gmail.com> wrote:
This patch enhances the hpet clock with 32-bit main
counter to handle wrap-arounds. It does it by maintaining
separate upper 32-bit counter per-cpu and incrementing it
when wrap around is detected.

I have a few comments below, but in general I think this direction is sort-of-acceptable in the sense that
it is better than the current situation, which doesn't handle the wrap-around at all. So I think we can
commit this (though please look at my comments below).

But in the long run, I can't say I thin this approach is very reliable. If one CPU misses a wrap-around,
it will have its clock differing from that of other CPUs and this will never be corrected. If all of them miss
the wrap-around (e.g., suspend), none of them get the up-to-date time. To fix those things, maybe we
need to query the RTC on every wrap-around, and perhaps keep the same upper-counter on all CPUs
(but if we do that, we need to do this safely and correctly).

I think that querying RTC on every wrap-around to calculate the new value of the upper counter (instead of incrementing it)

might be the easiest thing to do.

If you want to plan a long-term solution, I suggest you take a look at what Linux does for 32-bit HPET.
I suggest this for two reasons: First, a lot of smart people contributed to Linux, and there is a lot of
experience with their code. But second, since 99% of the OSs running on these hypervisors are Linux,
the authors of these hypervisors probably tuned their implementation of HPET to work correctly on
Linux, so anything deviating from what Linux does there might not work as well.

Unfortunately, I am not smart enough nor have time to get smarter to read that code and understand the intention behind it ;-) Again this is just to make OSv run on the non-production type of hypervisor - hyperkit - on Mac. I think it is good enough until somebody complains.

To unsubscribe from this group and stop receiving emails from it, send an email to osv...@googlegroups.com.

Reply all

Reply to author

Forward