After more research, I am more inclined to say this might be a QEMU bug. In essence, in TCG mode QEMU does NOT seem to correctly emulate the CNTV_TVAL_EL0 (Virtual Timer TimerValue) and CNTV_CVAL_EL0 (Virtual Timer CompareValue) registers when one writes a tiny value to the former or increments the latter by a tiny value. In the situations where OSv seemed to stop receiving new timer interrupts, the last value written to CNTV_TVAL_EL0 was typically 3-4 ticks, which at the frequency of 62500000 Hz (16 ns per tick) means setting a timer to expire in 48 or 64 nanoseconds.
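To make the tick arithmetic concrete, here is a minimal sketch of the conversion in both directions. The helper names `ticks_to_ns` and `ns_to_ticks` are made up for illustration and are not OSv functions; the nanoseconds-to-ticks direction mirrors the `tval * freq_hz / NANO_PER_SEC` expression in the patch, and `__uint128_t` is a GCC/Clang extension.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t NANO_PER_SEC = 1000000000ULL;

// Hypothetical helper: convert counter ticks to nanoseconds.
// The 128-bit intermediate avoids overflow for large tick counts.
inline uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
    assert(freq_hz != 0);
    return static_cast<uint64_t>((static_cast<__uint128_t>(ticks) * NANO_PER_SEC) / freq_hz);
}

// Hypothetical helper: convert nanoseconds to counter ticks, rounding down.
// Same shape as the expression used in arm_clock_events::set().
inline uint64_t ns_to_ticks(uint64_t ns, uint64_t freq_hz)
{
    return static_cast<uint64_t>((static_cast<__uint128_t>(ns) * freq_hz) / NANO_PER_SEC);
}
```

With a 62500000 Hz counter, 3 ticks come out as 48 ns and 4 ticks as 64 ns. Note that any request shorter than one tick (16 ns) rounds down to a tval of 0.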
Based on the descriptions of these 4 registers taken from ARMv8 Arch Documentation:
But what should happen if, by the time we have set CNTV_TVAL_EL0 and written CNTV_CTL_EL0 to set IMASK to 0 and ENABLE to 1, the timer condition is already met because tval was really tiny, like 3 or 4? Per the spec, the timer condition is met and we should receive an interrupt. But in TCG mode QEMU does not seem to raise one.
Please also note that after we write CNTV_TVAL_EL0 followed by CNTV_CTL_EL0, a subsequent read of CNTV_CTL_EL0 should return a value with ISTATUS set if the condition is met.
Interestingly enough, if I take advantage of the above, re-read CNTV_CTL_EL0 and, on detecting that ISTATUS is set, fire the timer event on the spot and set IMASK, all the tests that kept hanging with TCG work fine across hundreds of runs (one test like tst-hub.cc would always hang in the test_timer part before the change; now it always passes).
So this patch seems to fix the issue and makes handling clock events slightly more efficient by skipping the setting of unnecessary alarms. This optimization also works in native mode with KVM (in KVM mode the fix itself seems unnecessary, as the native CPU appears able to handle tiny TVAL values).
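The re-read idea can be sketched against a toy model of the timer. Everything below is an illustrative assumption rather than OSv or QEMU code: `mock_vtimer`, its `write_latency` field and `arm_or_fire` are hypothetical names, and the model ignores details such as TVAL being a signed 32-bit value. Only the CNTV_CTL_EL0 bit positions (ENABLE bit 0, IMASK bit 1, ISTATUS bit 2) come from the architecture.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// CNTV_CTL_EL0 bits: ENABLE = bit 0, IMASK = bit 1, ISTATUS = bit 2 (read-only).
constexpr uint32_t CTL_ENABLE  = 1u << 0;
constexpr uint32_t CTL_IMASK   = 1u << 1;
constexpr uint32_t CTL_ISTATUS = 1u << 2;

// Hypothetical stand-in for the virtual timer registers.
struct mock_vtimer {
    uint64_t now = 0;            // models the counter (CNTVCT_EL0)
    uint64_t cval = 0;           // models CNTV_CVAL_EL0
    uint32_t ctl = 0;            // writable bits of CNTV_CTL_EL0
    uint64_t write_latency = 0;  // ticks that elapse per register write

    void write_tval(uint64_t tval) { cval = now + tval; now += write_latency; }
    void write_ctl(uint32_t v) { ctl = v & (CTL_ENABLE | CTL_IMASK); now += write_latency; }
    uint32_t read_ctl() const {
        uint32_t v = ctl;
        if ((ctl & CTL_ENABLE) && now >= cval) {
            v |= CTL_ISTATUS;    // timer condition met
        }
        return v;
    }
};

// Arm the timer, then re-read CTL. If the condition is already met,
// mask the interrupt and fire the event on the spot.
// Returns true when the event was fired immediately.
inline bool arm_or_fire(mock_vtimer& t, uint64_t tval, const std::function<void()>& fired)
{
    assert(static_cast<bool>(fired));
    uint32_t ctl = t.read_ctl();
    ctl |= CTL_ENABLE;   // set enable
    ctl &= ~CTL_IMASK;   // unmask timer interrupt
    t.write_tval(tval);
    t.write_ctl(ctl);

    ctl = t.read_ctl();
    if (ctl & CTL_ISTATUS) {
        t.write_ctl(ctl | CTL_IMASK);  // mask; no need to wait for the IRQ
        fired();
        return true;
    }
    return false;
}
```

In this model, a tval of 3 with two ticks elapsing per register write has already expired by the time CTL is re-read, so the callback runs immediately. A tval of 1000 leaves the timer armed and unmasked.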
diff --git a/arch/aarch64/arm-clock.cc b/arch/aarch64/arm-clock.cc
index c1f4b277..c7aafc5a 100644
--- a/arch/aarch64/arm-clock.cc
+++ b/arch/aarch64/arm-clock.cc
@@ -106,7 +106,22 @@ arm_clock_events::arm_clock_events()
res = 16 + 11; /* default PPI 11 */
}
_irq.reset(new ppi_interrupt(gic::irq_type::IRQ_TYPE_EDGE, res,
- [this] { this->_callback->fired(); }));
+ [this] {
+/* From AArch64 Programmer's Guides Generic Timer:
+ * The interrupts generated by the timer behave in a level-sensitive manner.
+ * This means that, once the timer firing condition is reached,
+ * the timer will continue to signal an interrupt until one of the following situations occurs:
+ - IMASK is set to one, which masks the interrupt.
+ - ENABLE is cleared to 0, which disables the timer.
+ - TVAL or CVAL is written, so that firing condition is no longer met.
+ When writing an interrupt handler for the timers, it is important
+ that software clears the interrupt before deactivating the interrupt in the GIC.
+ Otherwise the GIC will re-signal the same interrupt again. */
+ u32 ctl = this->read_ctl();
+ ctl |= 2; /* mask timer interrupt */
+ this->write_ctl(ctl);
+ this->_callback->fired();
+ }));
}
arm_clock_events::~arm_clock_events()
@@ -157,11 +172,20 @@ void arm_clock_events::set(std::chrono::nanoseconds nanos)
class arm_clock *c = static_cast<arm_clock *>(clock::get());
tval = ((__uint128_t)tval * c->freq_hz) / NANO_PER_SEC;
- u32 ctl = this->read_ctl();
- ctl |= 1; /* set enable */
- ctl &= ~2; /* unmask timer interrupt */
- this->write_tval(tval);
- this->write_ctl(ctl);
+ if (tval) {
+ u32 ctl = this->read_ctl();
+ ctl |= 1; /* set enable */