Stronger abstraction for Time and Alarm

Patrick Mooney

Feb 1, 2020, 8:38:49 PM

to tock...@googlegroups.com

As noted in tock#1521, regarding generalization of ticks in the
time-related HILs, the addition of RISC-V and its "machine timer" poses
a challenge for the consumers of those interfaces. I'd like to explore
a straw man of how those abstractions could be strengthened in order to
make it easier for both in-kernel and userspace consumers to deal with
time.

Initial assumption: This would be for Tock on 32-bit ISAs.

Consider the following possibilities for a system timer peripheral:
1. 24-bit ARM SysTick Timer
2. 32-bit peripheral timer
3. 64-bit RISC-V machine timer

Notably, options #1 and #2 are functionally very similar, differing
only in width. That is, they both match on strict equality (interrupt
fires on time == compare) and can be configured to interrupt/notify when
time overflows. This is fundamentally different from the RISC-V machine
timer (detailed in the specification, section 3.1.10), which matches on
greater-than-or-equal (interrupt on time >= compare) and lacks an
overflow notification (owing to its 64-bit width).

If the Time and Alarm HILs are left as they exist today, it's
practically impossible to make the machine timer act in a manner
compatible with the more traditionally compact uC-style timers. A
stronger (and more opaque) abstraction may solve that problem and bring
a more consistent experience across all of the supported hardware.

Let's establish some desires for the Time and Alarm interfaces:

1. A consistent time.now() reading width (using the full 32-bit word)
2. The ability to unambiguously schedule an alarm far (2^32 - 1 ticks)
in the future
3. For alarms scheduled "too near" in the future (so they're passed by
the time the schedule logic is reached), fire the handler immediately

With those in mind, each of the timer peripherals could be extended to
meet the requirements:

1. By counting overflows, the 24-bit SysTick timer could be extended
"manually" to 32-bits. The 64-bit machine timer would simply ignore
the upper word. (Simplifying time.now() readings in the process)
2. For alarms scheduled long in the future, the HIL could calculate the
number of overflows which must take place before looking for a match.
In the case of the machine timer, lacking an overflow interrupt, the
otherwise unused high word of the timer/comparator would suffice.
3. When scheduling "short" timers, the greater-than-equals behavior of
the machine timer would catch otherwise "missed" firings. For
strict-equals timers, an explicit comparison with now() could serve
as adequate emulation.
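
A minimal sketch of the width normalization in item 1, assuming the 24-bit value has already been converted to an up-counting reading (SysTick itself counts down) and that a software overflow count is maintained elsewhere. The function names are hypothetical and not part of any Tock HIL.

```rust
/// Number of bits implemented by the narrow hardware counter.
const SYSTICK_BITS: u32 = 24;

/// Extend a 24-bit up-counting reading to 32 bits using a software
/// overflow count. Only the low 8 bits of the count end up in the result;
/// higher bits are shifted out, which gives the expected 32-bit wraparound.
pub fn extend_24_to_32(overflows: u32, raw24: u32) -> u32 {
    let mask = (1u32 << SYSTICK_BITS) - 1;
    (overflows << SYSTICK_BITS) | (raw24 & mask)
}

/// Reduce a 64-bit machine-timer reading to 32 bits by ignoring the
/// upper word.
pub fn truncate_64_to_32(mtime: u64) -> u32 {
    mtime as u32
}
```

Either way, consumers of now() would see a value that wraps at 2^32 regardless of the underlying peripheral.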

All told, it's at least a start on what could be a Time and Alarm
interface that is simpler _and_ consistent for downstream consumers.
Further scrutiny will be required to tease out potentially nasty edge
cases and ensure that the interfaces are satisfactory.

Considering the implied modifications to the syscall interface, it's not
a change to be made lightly, and may slot in nicely to the "Tock 2.0"
effort mentioned in the "Syscall return values" thread.

-Patrick

Philip Levis

Feb 1, 2020, 11:59:09 PM

to Patrick Mooney, tock...@googlegroups.com

On Feb 1, 2020, at 5:38 PM, 'Patrick Mooney' via Tock Embedded OS Development Discussion <tock...@googlegroups.com> wrote:

> 1. By counting overflows, the 24-bit SysTick timer could be extended
> "manually" to 32-bits. The 64-bit machine timer would simply ignore
> the upper word. (Simplifying time.now() readings in the process)

Makes sense.

> 2. For alarms scheduled long in the future, the HIL could calculate the
> number of overflows which must take place before looking for a match.
> In the case of the machine timer, lacking an overflow interrupt, the
> otherwise unused high word of the timer/comparator would suffice.

Can this ever be more than 1 overflow?

I think what you’re saying is that if the current counter value (both in 32 and 64 bit) is 0xf0000000 and software requests a (32-bit) alarm of 0x00000010, then

- on a (CortexM) 32-bit counter platform the compare will be set to 0x00000010, but

- on a (RISC-V) 64-bit counter platform the compare will be set to 0x100000010 (the high register is 0x1 and the low register is 0x00000010).
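
The 64-bit case above can be sketched as follows: the 32-bit target is lifted to the first 64-bit value at or after the current time whose low word matches. This is an illustration only; `compare64` is a hypothetical helper, not an existing Tock function.

```rust
/// Lift a 32-bit alarm target into the compare space of a 64-bit counter.
/// The result is the first value at or after `now64` whose low word
/// equals `target32`, so it is never more than one 32-bit wrap ahead.
pub fn compare64(now64: u64, target32: u32) -> u64 {
    // Distance to the target in 32-bit wrapping arithmetic.
    let ticks_ahead = target32.wrapping_sub(now64 as u32);
    now64.wrapping_add(u64::from(ticks_ahead))
}
```

With the counter at 0xf0000000 and a target of 0x00000010, this yields 0x100000010, matching the high/low register values in the example.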

> 3. When scheduling "short" timers, the greater-than-equals behavior of
> the machine timer would catch otherwise "missed" firings. For
> strict-equals timers, an explicit comparison with now() could serve
> as adequate emulation.

I went back through the TinyOS source code to see how we handled this there; I didn’t track the system that much (Cory Sharp wrote it), but I recall that it never encountered weird edge conditions or uncertainty about when to fire. The Timer system was rock solid, handled all kinds of timer widths and frequencies.

The key difference was that TinyOS mostly depended on Timers, rather than Alarms. The distinction is that an Alarm is just a compare value, while a Timer is a current time and delta (plus a bit on whether it repeats every delta or is one-shot). Alarm is fine when you are certain there is extremely low latency between a call and hitting the hardware. However, when there is a lot of latency (e.g., a system call, userspace scheduling), you can’t distinguish between a very short timer that happens to now be in the past and a very long timer that was supposed to be far in the future. Because Timer tells you the current time, you can know. This has come up in

https://github.com/tock/tock/pull/1499

For example, suppose a process reads the Alarm as 0x4000 and decides it wants an Alarm to fire at 0x4300. However, this call doesn’t trap to the kernel until 0x4400. The kernel can’t tell if it should fire immediately (because the time has passed) or should schedule it for the far far future. You can start doing things like saying you can only set a timer 2^31 in the future, but this loses half of your range for that edge case and also means callers have to check this.

In contrast, if a call for a Timer comes in with current=0x4000 and delta=0x300, then the kernel can know that the timer should fire.
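
A small sketch of that difference, using 32-bit wrapping arithmetic. The helper names are hypothetical, and the (t0, dt) check assumes that fewer than 2^32 ticks pass between reading t0 and evaluating the request.

```rust
/// Bare compare value: ticks from `now` until `target`, wrapping.
/// A target that has just passed looks like one almost 2^32 ticks away.
pub fn ticks_until(now: u32, target: u32) -> u32 {
    target.wrapping_sub(now)
}

/// (t0, dt) form: the timer has expired once at least `dt` ticks have
/// elapsed since `t0`.
pub fn has_expired(now: u32, t0: u32, dt: u32) -> bool {
    now.wrapping_sub(t0) >= dt
}

/// Ticks still to wait, or 0 if the timer has already expired.
pub fn remaining(now: u32, t0: u32, dt: u32) -> u32 {
    dt.saturating_sub(now.wrapping_sub(t0))
}
```

For the example above, `ticks_until(0x4400, 0x4300)` is 0xFFFFFF00 (indistinguishable from a far-future alarm), while `has_expired(0x4400, 0x4000, 0x300)` is true.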

Where I’m going with this is that I’m starting to think we should keep Alarm as a low-level kernel interface, but userspace processes (and probably capsules) should use a Timer interface.

Phil

———————
Philip Levis (he/him)
Associate Professor, Computer Science and Electrical Engineering

Faculty Director, lab64 Maker Space
Stanford University
http://csl.stanford.edu/~pal

Patrick Mooney

Feb 3, 2020, 10:54:00 AM

to Philip Levis, tock...@googlegroups.com

On Sat, 1 Feb 2020 at 22:59, Philip Levis <p...@cs.stanford.edu> wrote:
> Can this ever be more than 1 overflow?

I think that's a good question. Is there some length of time that
consumers (capsules or otherwise) can count on being able to set as a
timer without issuing back-to-back timer setups? I don't see anything
using Freq16MHz from the time HIL, but that would yield less than 5
minutes. It could be that longer timers like that would be better
suited to be built on an even higher level RTC-ish time abstraction.

> I think what you’re saying is that if the current counter value (both
> in 32 and 64 bit) is 0xf0000000 and software requests a (32-bit) alarm
> of 0x00000010, then
> - on a (CortexM) 32-bit counter platform the compare will be set to
> 0x00000010, but
> - on a (RISC-V) 64-bit counter platform the compare will be set to
> 0x100000010 (the high register is 0x1 and the low register is
> 0x00000010).

My inclination with the RISC-V machine timer would be to use the
upper word of timercmp only to detect overflow (using a timercmp value
of 0x100000000).

> it never encountered weird edge conditions or uncertainty about when
> to fire. The Timer system was rock solid, handled all kinds of timer
> widths and frequencies.

I wasn't suggesting that there were existing edge cases, just that we'd
want to be wary in the context of building such abstractions.

> Where I’m going with this is that I’m starting to think we should keep
> Alarm as a low-level kernel interface, but userspace processes (and
> probably capsules) should use a Timer interface.

Splitting up the interfaces definitely seems like the right approach.
If the lower (Alarm) interface has overflow disambiguation (perhaps
implicitly scheduling from now() + value), then it should cover both the
existing uC timers and machine timer.

-Patrick

Brad Campbell

Feb 3, 2020, 1:21:38 PM

to Patrick Mooney, Philip Levis, Tock Embedded OS Development Discussion

For userspace we could add an alternative to the alarm capsule, but if I'm understanding correctly the question is what should hide the heterogeneity of the underlying timer hardware? If we add more constraints to the Alarm HIL then we likely lose the low level flexibility some capsules may want. We can create a new HIL, and then decide if a capsule should provide a translation or if the chip peripheral driver should implement both HILs.

- Brad

--
You received this message because you are subscribed to the Google Groups "Tock Embedded OS Development Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tock-dev+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tock-dev/CAAErXu8FWjNqDHqPCxMXCjk94nRq6%3D40v7jBiwcH6y%3DdG10PNg%40mail.gmail.com.

Philip Levis

Feb 3, 2020, 2:17:10 PM

to Patrick Mooney, tock...@googlegroups.com

On Feb 3, 2020, at 7:53 AM, Patrick Mooney <pat...@oxidecomputer.com> wrote:

> > I think what you’re saying is that if the current counter value (both
> > in 32 and 64 bit) is 0xf0000000 and software requests a (32-bit) alarm
> > of 0x00000010, then
> > - on a (CortexM) 32-bit counter platform the compare will be set to
> > 0x00000010, but
> > - on a (RISC-V) 64-bit counter platform the compare will be set to
> > 0x100000010 (the high register is 0x1 and the low register is
> > 0x00000010).
>
> My inclination with the RISC-V machine timer would be to use the
> upper word of timercmp only to detect overflow (using a timercmp value
> of 0x100000000).

I don’t see how one would do this? The semantics of an N-bit timer should be that it wraps around when it overflows, in part because this is how (2’s complement) integer arithmetic works.

Or do you mean that on a 64-bit system, it would work like this:

1) Caller sets the compare value to 0x1 0x00000010.
2) Callee sets the compare value to 0x1 0x00000000, to detect overflow.
3) On overflow, callee sets the compare value to 0x1 0x00000010.

Why use an intermediate low-bits-overflow interrupt? It would seem to me that you’re emulating the hardware in software.

> Splitting up the interfaces definitely seems like the right approach.
> If the lower (Alarm) interface has overflow disambiguation (perhaps
> implicitly scheduling from now() + value), then it should cover both the
> existing uC timers and machine timer.

This assumes that the machine timer implementation disables the interrupt on firing and re-enables it when another Alarm is set? If so, we should define the Alarm trait to have this behavior. Otherwise, a uC Alarm client may set the compare value once and expect it will fire multiple times. The current trait does (it’s a “one shot timer”) but we can specify this more precisely, i.e., calling set_alarm() will result in at most one callback to the client.
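
One way to picture the "at most one callback per set_alarm()" rule on a >= comparator is a software model that disarms itself when it fires. This is a sketch under that assumption; `OneShotAlarm` and `service` are hypothetical names, and a real driver would mask the interrupt rather than clear an `Option`.

```rust
/// Simulated one-shot alarm on a counter with greater-than-or-equal
/// matching (as on the RISC-V machine timer).
#[derive(Default)]
pub struct OneShotAlarm {
    /// `Some(compare)` while armed, `None` once fired or never set.
    compare: Option<u64>,
}

impl OneShotAlarm {
    /// Arm the alarm, replacing any previous compare value.
    pub fn set_alarm(&mut self, compare: u64) {
        self.compare = Some(compare);
    }

    pub fn is_armed(&self) -> bool {
        self.compare.is_some()
    }

    /// Model of the interrupt path: returns true when the client callback
    /// should run. Firing disarms the alarm, so a >= comparator that stays
    /// asserted cannot produce a second callback.
    pub fn service(&mut self, now: u64) -> bool {
        match self.compare {
            Some(c) if now >= c => {
                self.compare = None;
                true
            }
            _ => false,
        }
    }
}
```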

It’s too bad we didn’t realize these properties of mtimer when we redid the Timer HIL! Still, better late than never.

FWIW, looking back at the TinyOS timer library, this was the Alarm interface (trait). The async keyword meant “can run in interrupt context”. Alarm has a startAt and this is what the higher-level Timer implementations (which can’t run in interrupt context, support periodic timers and virtualization) use. Gotta love the “soon”!

interface Alarm<precision_tag, size_type> {
  /**
   * Set a single-shot alarm to some time units in the future. Replaces
   * any current alarm time. Equivalent to start(getNow(), dt). The
   * <code>fired</code> will be signaled when the alarm expires.
   *
   * @param dt Time until the alarm fires.
   */
  async command void start(size_type dt);

  /**
   * Set a single-shot alarm to time t0+dt. Replaces any current alarm
   * time. The <code>fired</code> will be signaled when the alarm expires.
   * Alarms set in the past will fire "soon".
   *
   * <p>Because the current time may wrap around, it is possible to use
   * values of t0 greater than the <code>getNow</code>'s result. These
   * values represent times in the past, i.e., the time at which getNow()
   * would last have returned that value.
   *
   * @param t0 Base time for alarm.
   * @param dt Alarm time as offset from t0.
   */
  async command void startAt(size_type t0, size_type dt);

  async command size_type getNow();
  async command size_type getAlarm();
  async command void stop();
  async command bool isRunning();

  async event void fired(); // TinyOS interfaces were bidirectional, this is a callback
}
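
For comparison, the same shape could be rendered in Rust roughly as below. This is only a sketch: the trait name `TickAlarm`, the fixed `u32` width, and the `SimAlarm` software model (which reports the `fired` event as a return value from `advance`) are assumptions for illustration, not existing Tock or TinyOS APIs.

```rust
/// Hypothetical Rust rendering of the interface, fixed to 32-bit ticks.
pub trait TickAlarm {
    fn get_now(&self) -> u32;
    /// Set a single-shot alarm for t0 + dt, replacing any current alarm.
    fn start_at(&mut self, t0: u32, dt: u32);
    fn stop(&mut self);
    fn is_running(&self) -> bool;
    fn get_alarm(&self) -> u32;

    /// Equivalent to start_at(get_now(), dt).
    fn start(&mut self, dt: u32) {
        let now = self.get_now();
        self.start_at(now, dt);
    }
}

/// Software model of a counter plus one alarm, driven by `advance`.
pub struct SimAlarm {
    now: u32,
    t0: u32,
    dt: u32,
    running: bool,
}

impl SimAlarm {
    pub fn new(now: u32) -> Self {
        SimAlarm { now, t0: now, dt: 0, running: false }
    }

    /// Advance the counter; returns true if the alarm fired.
    /// An alarm whose t0 + dt is already in the past fires on the next call.
    pub fn advance(&mut self, ticks: u32) -> bool {
        self.now = self.now.wrapping_add(ticks);
        if self.running && self.now.wrapping_sub(self.t0) >= self.dt {
            self.running = false;
            return true;
        }
        false
    }
}

impl TickAlarm for SimAlarm {
    fn get_now(&self) -> u32 { self.now }
    fn start_at(&mut self, t0: u32, dt: u32) {
        self.t0 = t0;
        self.dt = dt;
        self.running = true;
    }
    fn stop(&mut self) { self.running = false; }
    fn is_running(&self) -> bool { self.running }
    fn get_alarm(&self) -> u32 { self.t0.wrapping_add(self.dt) }
}
```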

Patrick Mooney

Feb 3, 2020, 3:44:42 PM

to Philip Levis, tock...@googlegroups.com

On Mon, 3 Feb 2020 at 12:21, Brad Campbell <bra...@gmail.com> wrote:

> For userspace we could add an alternative to the alarm capsule, but if
> I'm understanding correctly the question is what should hide the
> heterogeneity of the underlying timer hardware?

Yes, I think a consistent interface for time and timers (scheduled work)
is a desirable abstraction for a relatively general-purpose OS.

> If we add more constraints to the Alarm HIL then we likely lose the
> low level flexibility some capsules may want.

Unless it's a capsule which is driving hardware that is tightly coupled
to the timer itself, what flexibility would be lost? In the case of
such tight coupling (like driving a PWM duty cycle or something?), it's
hard to believe one could get away with anything but dedicating that
timer to the task.

> We can create a new HIL, and then decide if a capsule should provide a
> translation or if the chip peripheral driver should implement both
> HILs.

I would propose that except for capsules which deal directly with the
hardware timer HILs in order to homogenize the exposed interface, all
other kernel consumers should use that generic homogenized abstraction.

On Mon, 3 Feb 2020 at 13:17, Philip Levis <p...@cs.stanford.edu> wrote:

> I don’t see how one would do this? The semantics of an N-bit timer
> should be that it wraps around when it overflows, in part because this
> is how (2’s complement) integer arithmetic works.

Say that the consumer of the machine timer has scheduled an alarm past
the 32-bit overflow value (0x00000010+ovf in your example). In order to
emulate the function of a 32-bit timer with overflow detection, the
machine timer would configure a timercmp value of 0x100000000. When the
interrupt for that was fired, it would clear the upper words of timer
and timercmp (again for 32-bit emulation), and fire the overflow
handler. The consumer, in that handler, would configure the new
expiration time for 0x00000010.

> It’s too bad we didn’t realize these properties of mtimer when we
> redid the Timer HIL! Still, better late than never.

It's the userspace implications which worry me the most. Updating the
rest of the kernel will certainly require work (and testing!), but has a
manageable scope.

Guillaume Endignoux

Feb 5, 2020, 11:21:12 AM

to Patrick Mooney, Philip Levis, Tock Embedded OS Development Discussion

Here is my take on some of the issues mentioned in this thread.

1. Resolving ambiguity when setting alarms near in the future (e.g. now is ticks=0x4000, we want an alarm at 0x4300 but by the time we make it to the kernel the time is already 0x4400), and supporting alarms far in the future (more than 2^24 or 2^32 ticks away).

Assuming all hardware that we want to support with the timer HIL have support for:

- comparison of ticks on say >= 24 bits,

- events whenever overflow happens,

couldn't we always represent time as 64 bit ticks? (assuming 64-bit overflow never happens in any realistic application; if that's not enough, we could use 128 bits instead)

From the userspace point of view, reading now=0x4000 and setting an alarm for later=0x4300 would be unambiguous even if it only hits the kernel at 0x4400, because a time in the future would be 0x100004300 instead of 0x4300.

As far as I can see, simulating 64-bit ticks requires:

- Keeping an accurate overflow count in the kernel (in case of 24-bit ticks, this means 40 bits of overflow).

This means that the overflow interrupt must always be enabled and occur no matter what (so never happen to be disabled when it should fire).

I guess we can assume that the time to handle it and increment the overflow counter will always be faster than the delay to the next overflow.

- Support for communicating 64 bits of useful information on the userspace boundary (especially when the kernel returns from a syscall to userspace).

This could already be simulated with allow()-ing an 8-byte buffer but seems a bit overkill.

Alternatively, we could have syscalls return values in multiple registers for a Tock 2.0 API - to extend what was discussed elsewhere in the context of providing unambiguous error codes (e.g. instead of only using register 0 to return a value to userspace, we could have register 0 = status/error code, register 1 = low word, register 2 = high word).

I haven't looked much at what every hardware supports, but the nRF52x chips seem to support that.
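
The two requirements listed above can be sketched as follows, assuming a 24-bit hardware counter, an overflow count maintained by the kernel, and a two-register return convention (low word, high word). All names are hypothetical; the multi-register syscall return is a proposal in this thread, not an existing ABI.

```rust
/// Width of the hardware counter being extended.
const HW_BITS: u32 = 24;

/// Combine a kernel-maintained overflow count with the raw counter value
/// to form 64-bit ticks.
pub fn ticks64(overflows: u64, raw: u32) -> u64 {
    (overflows << HW_BITS) | u64::from(raw & ((1 << HW_BITS) - 1))
}

/// With 64-bit ticks, "already passed" and "far in the future" are
/// distinct values, so a plain comparison is enough.
pub fn is_due(now: u64, alarm: u64) -> bool {
    now >= alarm
}

/// Split a 64-bit value into (low word, high word) for return in two
/// 32-bit registers.
pub fn split_words(t: u64) -> (u32, u32) {
    (t as u32, (t >> 32) as u32)
}

/// Reassemble the value on the other side of the boundary.
pub fn join_words(low: u32, high: u32) -> u64 {
    (u64::from(high) << 32) | u64::from(low)
}
```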

2. Setting an alarm very soon in the future.

Setting an alarm for e.g. the next tick may not trigger on == comparison. For example, the nRF52840's RTC specification states that:

> If the COUNTER is N, writing N or N+1 to a CC register may not trigger a COMPARE event.

(And conversely:

> If the COUNTER is N and the current CC register value is N+1 or N+2 when a new CC value is written, a match may trigger on the previous CC value before the new value takes effect.)

See https://infocenter.nordicsemi.com/index.jsp?topic=%2Fps_nrf52840%2Frtc.html&cp=4_0_0_5_21

So how can one really set an alarm at the next clock tick?

- As was proposed in this thread, we could eagerly fire the alarm immediately if scheduled before some small threshold, e.g. within the next 10 ticks (value TBD).

However, the current capsules that I found while working on https://github.com/tock/tock/pull/1521 all have a pattern of "read now & set N ticks after that". So if N is too small (due to a very short delay set by the capsule - or a low frequency of the underlying timer), we could end up in an infinite loop where the alarm immediately fires again.

It'd be more robust to have the capsules save the time when the last alarm was scheduled for, and wait N ticks from std::max(now, last_scheduled_alarm) so that eventually we don't trigger the next alarm immediately - but that requires all capsules to migrate to such a pattern.

- Another idea could be to leverage multiple compare registers when available (the nRF52840 RTC has 4 of them), in some kind of "double buffering" of comparisons.

That is, schedule compare[0] at the tick T we wish, and also schedule compare[1] at T+2, so that if compare[0] doesn't trigger (due to the "N+1 problem" mentioned in nRF52840's specs), the compare[1] would trigger instead as a backup.

That would be a bit later but at least we could detect that we missed the previous alarm.

I'm not sure about the details (e.g. how to avoid any sort of race condition here), but this way we could probably simulate a >= comparison with multiple == comparisons.

This means using more compare registers per alarm/timer, but given that all alarms/timers can currently be multiplexed through a single virtual alarm in Tock, we otherwise only use one comparison register anyway, regardless of the number of drivers.
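
To make the first option concrete, here is a sketch of why anchoring on max(now, last_scheduled_alarm) breaks the immediate-refire loop. It assumes 64-bit ticks (so wraparound can be ignored), a hypothetical eager-fire threshold of 10 ticks, and the worst case where the counter does not advance between callbacks.

```rust
/// Hypothetical threshold: deadlines this close to `now` fire immediately.
const EAGER_THRESHOLD: u64 = 10;

pub fn fires_eagerly(now: u64, deadline: u64) -> bool {
    deadline <= now + EAGER_THRESHOLD
}

/// Current capsule pattern: read now, set the alarm `period` ticks later.
pub fn naive_next(now: u64, period: u64) -> u64 {
    now + period
}

/// Proposed pattern: schedule from whichever is later, now or the
/// previous deadline.
pub fn anchored_next(now: u64, last_deadline: u64, period: u64) -> u64 {
    std::cmp::max(now, last_deadline) + period
}

/// Count back-to-back eager firings with the counter frozen at `now`,
/// using the anchored pattern. Stops after `limit` iterations.
pub fn eager_firings_anchored(now: u64, period: u64, limit: u32) -> u32 {
    let mut deadline = anchored_next(now, now, period);
    let mut count = 0;
    while count < limit && fires_eagerly(now, deadline) {
        count += 1;
        deadline = anchored_next(now, deadline, period);
    }
    count
}
```

With now = 95 and a period of 3, the naive pattern recomputes a deadline of 98 every time and fires eagerly forever, while the anchored deadlines advance 98, 101, 104, 107 and leave the eager window after three firings.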

Guillaume

Patrick Mooney

Feb 5, 2020, 2:22:16 PM

to Guillaume Endignoux, Philip Levis, Tock Embedded OS Development Discussion

On Wed, 5 Feb 2020 at 10:21, Guillaume Endignoux <guill...@google.com> wrote:
> couldn't we always represent time as 64 bit ticks? (assuming 64-bit
> overflow never happens in any realistic application; if that's not
> enough, we could use 128 bits instead)

Considering "full-sized" OSes haven't bothered to go beyond 64 bits for
timekeeping, 128 seems like total overkill. The case for 64-bit is
perhaps trickier. There might be very few consumers which would benefit
from 64-bit width on timers. If so, it would be unfortunate to impose
the two-word cost on everything else.

It might be adequate for the time abstraction to simply keep track of
32-bit overflows in an emulated upper time register which could be
queried (via a separate interface) for software which deals in longer
time scales. That way, the primary interface for querying the time and
setting alarms/timers, would deal only with single-word values.

It's worth exploring the potential and likely-to-be-common use cases so
the solution is both flexible and efficient.

> Setting an alarm for e.g. the next tick may not trigger on ==
> comparison. For example, the nRF52840's RTC specification states that:
>
> > If the COUNTER is N, writing N or N+1 to a CC register may not
> > trigger a COMPARE event.

If Tock is going to present a consistent time interface to capsules (and
userspace), I think the underlying HIL provider would need to account
for this. If an upper level consumer is attempting to program an alarm
that might not fire due to hardware limitations, then perhaps perform
the closest approximation? (Schedule N+3 ticks ahead instead? Fire the
timer immediately? Busy spin until the expiration?)

Regardless, it would be up to the NRF52840 module(s) to work around the
limitation, rather than forcing that duty on consumers.
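
A sketch of the "closest approximation" option, as it might look inside a chip driver for a 24-bit counter. The 3-tick margin and the name `safe_compare` are assumptions for illustration; the real margin would come from the chip's documentation. Targets that are already slightly in the past are a separate problem and are not handled here.

```rust
/// 24-bit counter, as on the nRF52840 RTC.
const COUNTER_MASK: u32 = 0x00FF_FFFF;
/// Hypothetical minimum distance (in ticks) at which a compare value is
/// assumed to reliably produce a match event.
const MIN_LEAD: u32 = 3;

/// Return the compare value to program for a requested target, pushing
/// it out to `now + MIN_LEAD` when the target is too close to be reliable.
pub fn safe_compare(now: u32, target: u32) -> u32 {
    let ahead = target.wrapping_sub(now) & COUNTER_MASK;
    if ahead < MIN_LEAD {
        now.wrapping_add(MIN_LEAD) & COUNTER_MASK
    } else {
        target & COUNTER_MASK
    }
}
```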

Philip Levis

Feb 10, 2020, 1:41:22 PM

to Patrick Mooney, Guillaume Endignoux, Tock Embedded OS Development Discussion

I’d like to separate out what I see as three separate discussions here:

1) What are the Alarm and Timer traits?

2) What instantiations of the Alarm and Timer traits are generally used within the kernel?

3) What instantiation of the Alarm or Timer trait is used by the sys call driver (exported to userspace)?

It could be (and likely is) that the answers to 2+3 are the same thing.

The fact that the Alarm and Timer traits take a width parameter and frequency type means that we can separate out 1 from 2 and 3.

There are many cases when a 32-bit wraparound counter is fine. There are also cases when 64 bits is needed. Forcing everyone to use 64 bits, though, is wasted RAM. There is a tradeoff between code space and RAM, as it takes a little extra code to make a 64-bit timer 32 bits. Or 16! But ultimately we can’t talk about the width of the time unit unless we also talk about its frequency. A ms-timer can be fine for most (but not all) cases with 16 bits, but a microsecond timer might need more.

One observation: within the kernel, Alarms tend to be fast. The common case is for timeouts/delays in sensors and radios. E.g., the ISL29035 requires that software wait 420us before reading a requested sample. The two places we can expect this to change are low-power networking and wide-area networking. E.g., BLE advertisements can be slowish (sub-Hz) to save energy by keeping the radio off, and TCP timeouts can be seconds over a wide area network.

So here’s my strawman proposal, which tries to bring together the different ideas everyone has brought up:

1) We change Alarm and Timer so their start() methods take a t0 and a dt.

2) The kernel maintains a 64-bit counter at 32kHz (the standard lowest power oscillator).

- It can of course maintain other counters, but this is the one it promises and is generally used

3) The standard AlarmMux or TimerMux is 32 bits.

4) Userspace sees 32-bit 32kHz Alarms/Timers.

Platforms with a 32-bit counter use their overflow interrupt to emulate the top 32 bits in the 64-bit counter.

Platforms with a 64-bit counter use the full 64 bits on compares; having t0 and dt should make this easier.

In this model, the standard implementation for a chip provides both an Alarm<u32> and a Counter<u64>.

The syscall interface uses 32 bits for most of its calls. However, it also has a command for requesting the full 64-bit time. If we decide that libtock needs 64-bit Alarms, we can emulate them in userspace. A 32kHz counter at 32 bits is 17 bits of seconds (2^32 / 2^15 = 2^17), which is about 36 hours. So emulating 64 bits in software requires an additional interrupt every 36 hours in order for software to emulate the overflow.

Phil

Patrick Mooney

Feb 11, 2020, 4:48:24 PM

to Philip Levis, Guillaume Endignoux, Tock Embedded OS Development Discussion

On Mon, 10 Feb 2020 at 12:41, Philip Levis <p...@cs.stanford.edu> wrote:
> The fact that the Alarm and Timer traits take a width parameter and
> frequency type means that we can separate out 1 from 2 and 3.

Considering that those width parameters are effectively discarded by the
higher-level abstractions (muxes) placed on top of Alarm implementations
today, I'm not sure that applies.

> There are many cases when a 32-bit wraparound counter is fine. There
> are also cases when 64 bits is needed. Forcing everyone to use 64
> bits, though, is wasted RAM.

I agree that forcing everything to be 64-bit seems like overkill.

> But ultimately we can’t talk about the width of the time unit unless
> we also talk about its frequency. A ms-timer can be fine for most (but
> not all) cases with 16 bits, but a microsecond timer might need more.

A low-width high-frequency timer would still be usable provided that it
presented an adequate tool set (like overflow triggering) to its
consumers. (That is unless it overflowed so often that handling the
resulting interrupts became overwhelming.)

> 1) We change Alarm and Timer so their start() methods take a t0 and a
> dt.

Firstly, if we're changing the Alarm interface, does it make sense to
keep the Timer HIL as well, considering it has no implementers today?

What about cases where t0 was near 32-bit overflow, and the actual
scheduling of the timer happened after the timer rolled? Would the
logic implement some windowing to forcibly disambiguate "a little late"
from "far in the future"?

> 2) The kernel maintains a 64-bit counter at 32kHz (the standard lowest
> power oscillator).
> - It can of course maintain other counters, but this is the one it
> promises and is generally used

Are we OK foreclosing on support for any devices which do not have such
a timer (or one which cannot be scaled to that value)? The existing
chips/boards seem to have this covered, but it seems feasible that
someone in the future may come along with something that doesn't. (Even
like something with a 16MHz timer, where an integer prescale for 32768
results in ~500ppm error.)
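
The prescaler arithmetic behind that estimate can be checked with a short calculation. This sketch assumes a plain integer divider rounded to the nearest value; the function names are made up for illustration.

```rust
/// Nearest integer divider that brings `src_hz` close to `target_hz`.
pub fn nearest_divider(src_hz: u64, target_hz: u64) -> u64 {
    (src_hz + target_hz / 2) / target_hz
}

/// Frequency error, in parts per million, of the divided clock relative
/// to the target. Positive means the divided clock runs fast.
pub fn error_ppm(src_hz: u64, target_hz: u64) -> f64 {
    let div = nearest_divider(src_hz, target_hz);
    let actual = src_hz as f64 / div as f64;
    (actual - target_hz as f64) / target_hz as f64 * 1e6
}
```

For a 16MHz source and a 32768Hz target, the nearest divider is 488, which gives roughly 32786.9Hz, an error of about 576 ppm under these assumptions.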

Deciding on such a limitation is fine, but I think it would be good to be
explicit about the choice if so.

> 3) The standard AlarmMux or TimerMux is 32 bits.

Overall, this sounds good. The utility of the 64-bit timer back-end as
described above becomes a little more questionable if the "standard"
interface is 32-bit only, though.

> 4) Userspace sees 32-bit 32kHz Alarms/Timers.

Matching the standardized in-kernel interface makes sense.

> Platforms with a 64-bit counter use the full 64 bits on compares;
> having t0 and dt should make this easier.

As I mentioned, unless t0 is 64-bit (and therefore the rest of the
interface), it's not clear how to square this with the standardized
32-bit interface.

> The syscall interface uses 32 bits for most of its calls. However, it
> also has a command for requesting the full 64-bit time.

Are you thinking this would be where the upper word would be requested,
or where userspace provides a pointer argument to where it wants the
full 64-bit time copied out to?

> If we decide that libtock needs 64-bit Alarms, we can emulate them in
> userspace. A 32kHz counter at 32 bits is 17 bits of seconds (2^32 /
> 2^15 = 2^17), which is about 36 hours. So emulating 64 bits in software
> requires an additional interrupt every 36 hours in order for software
> to emulate the overflow.

I agree, as long as it's easy for userspace to trigger on that overflow
condition (which disambiguated alarm scheduling would facilitate). It's
a sensible thing to punt on for now.

-Patrick

Philip Levis

Feb 12, 2020, 9:53:03 PM

to Patrick Mooney, Guillaume Endignoux, Tock Embedded OS Development Discussion

On Feb 11, 2020, at 1:48 PM, 'Patrick Mooney' via Tock Embedded OS Development Discussion <tock...@googlegroups.com> wrote:

> On Mon, 10 Feb 2020 at 12:41, Philip Levis <p...@cs.stanford.edu> wrote:
> > The fact that the Alarm and Timer traits take a width parameter and
> > frequency type means that we can separate out 1 from 2 and 3.
>
> Considering that those width parameters are effectively discarded by the
> higher-level abstractions (muxes) placed on top of Alarm implementations
> today, I'm not sure that applies.

Do you mean

impl<A: Alarm<'a>> Time for VirtualMuxAlarm<'a, A> {
    type Frequency = A::Frequency;

    fn max_tics(&self) -> u32 {
        self.mux.alarm.max_tics()
    }

    fn now(&self) -> u32 {
        self.mux.alarm.now()
    }
}

How this is just assuming a u32? I agree, this would need to change to respect the width parameter.

Phil

Patrick Mooney

Feb 12, 2020, 10:24:45 PM

to Philip Levis, Guillaume Endignoux, Tock Embedded OS Development Discussion

On Wed, 12 Feb 2020 at 20:52, Philip Levis <p...@cs.stanford.edu> wrote:
> How this is just assuming a u32?

That apparently nothing overrides the 'W' type parameter for the Time
and Alarm HILs, particularly since all of the abstractions built on top
of them do not even bother to parameterize. (Which, granted, would
probably be a pain with integer primitive types.)

Vadim Sukhomlinov

Feb 12, 2020, 10:58:16 PM

to Patrick Mooney, Philip Levis, Guillaume Endignoux, Tock Embedded OS Development Discussion

I'd say a microsecond timer should be 64-bit. Justification: 32 bits will give you up to about 4295 seconds (2^32 us), which is a bit more than 1h. That may not be enough for various security applications which have to check the validity period of a token. Other usages: periodic daily wake-ups from sleep, etc. Sticking to a 1ms timer may not be enough due to resolution requirements for dealing with IO. There might be several timers, high resolution and low resolution, with different APIs. Also, 64 bits is not that challenging: on 32-bit ARM/RISC-V it will be returned in registers, and it can be read directly from the Timer device. Applications can reduce resolution on their own if needed.

--

Regards,

Vadim Sukhomlinov

Philip Levis

Feb 12, 2020, 11:32:02 PM

to Vadim Sukhomlinov, Patrick Mooney, Guillaume Endignoux, Tock Embedded OS Development Discussion

I think there are two issues here — what is the expected duration of a wraparound, and the actual precision.

I’m wary of specifying us because I’m wary of an Alarm ever advertising a precision it does not have. E.g., if a system drops into 32kHz for low power, then it should not be advertising us-precision Alarms, because in reality its alarms have a precision of ~30us (one tick of a 32kHz clock). So you will end up with weird cases where you’d expect Alarm A to fire after Alarm B (it’s 10 ticks later, after all), but it doesn’t.
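
A quick illustration of the quantization involved, assuming a simple truncating conversion from microseconds to ticks (the helper names are hypothetical):

```rust
/// Convert a duration in microseconds to whole ticks at `freq_hz`,
/// truncating any fraction of a tick.
pub fn us_to_ticks(us: u64, freq_hz: u64) -> u64 {
    us * freq_hz / 1_000_000
}

/// Length of one tick in microseconds.
pub fn tick_period_us(freq_hz: u64) -> f64 {
    1_000_000.0 / freq_hz as f64
}
```

At 32768Hz, requests for 100us and 110us both land on tick 3, so the backing clock cannot order two alarms that a microsecond-granularity interface says are 10 units apart.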

A given kernel can absolutely provide higher frequency counters, and services for it can assume them.

Phil

Guillaume Endignoux

unread,

Feb 17, 2020, 5:26:12 AM2/17/20

to Philip Levis, Vadim Sukhomlinov, Patrick Mooney, Tock Embedded OS Development Discussion

If you take the nRF52840's RTC, the overflow is at 2^24 for a frequency of 32KHz, so 512 seconds, i.e. barely more than 8 minutes.

So there's clearly a benefit to handling overflows.

Now, are 32 bits enough?

Again, at 32KHz that's about 36 hours - so definitely reachable in a device's lifetime.
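
The wraparound figures above can be checked with a few lines of Rust. This sketch assumes "32KHz" means a 32,768 Hz crystal; `wrap_seconds` is a hypothetical helper, not an existing Tock function.

```rust
/// Whole seconds until a free-running counter that is `bits` wide wraps
/// around, when it is incremented `hz` times per second.
/// Uses u128 so that `bits` may be as large as 64 without overflow.
fn wrap_seconds(bits: u32, hz: u64) -> u128 {
    (1u128 << bits) / hz as u128
}
```

With these inputs, `wrap_seconds(24, 32_768)` gives 512 seconds and `wrap_seconds(32, 32_768)` gives 131,072 seconds, i.e. a little over 36 hours.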

However, this makes me more concerned, not less: if the logic doesn't handle overflows well, you won't detect it during development/debugging (because 36 hours is much longer than the typical development cycle), and you then risk unexpected issues in production.

At least with an 8 minute overflow, there is a chance to catch issues during development.

With 64 bits, we can assume that overflow doesn't occur during a device's lifetime (even at 4GHz overflow takes more than 100 years), which removes the need for extra logic to detect overflow (except at the chip level, where overflow is detected to simulate the 64 bits).
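
As a rough illustration of what "simulating the 64 bits" at the chip level could look like, here is a sketch for a 24-bit hardware counter. The `ExtendedCounter` type and its methods are hypothetical, not Tock code, and the sketch deliberately ignores the race between reading the raw counter and a pending overflow interrupt, which a real driver would have to handle.

```rust
/// Width of the (assumed) hardware counter.
const HW_BITS: u32 = 24;
/// Mask selecting the valid bits of a raw hardware reading.
const HW_MASK: u64 = (1 << HW_BITS) - 1;

/// Software extension of a narrow hardware counter to 64 bits.
struct ExtendedCounter {
    /// Number of hardware overflows seen so far.
    overflows: u64,
}

impl ExtendedCounter {
    fn new() -> Self {
        ExtendedCounter { overflows: 0 }
    }

    /// To be called from the (hypothetical) overflow interrupt handler.
    fn on_overflow(&mut self) {
        self.overflows += 1;
    }

    /// Combine the overflow count with a raw hardware reading.
    fn now(&self, raw: u32) -> u64 {
        (self.overflows << HW_BITS) | ((raw as u64) & HW_MASK)
    }
}
```

Consumers above this layer only ever see the 64-bit value, so none of them need their own wraparound logic.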

The next question was whether 64-bit timers would waste RAM.

The best would be to measure this, but quite frankly, I think that shaving an extra 4 bytes per timer is premature optimization.

If one is worried about RAM usage, I think there is plenty of low-hanging fruit in Tock that would save orders of magnitude more memory.

Here are a few examples:

- Tuning which capsules are included for a board. Each capsule's `static_init` bytes are in memory. If a given application doesn't need the temperature sensor, one can make a custom board without it.

- Tuning the size of buffers for various capsules in the kernel. It's not uncommon to have buffers of 1KB, so even decreasing that by a few percent would save more RAM than the timers.

- Doing a proper "no debug" building mode, a.k.a. https://github.com/tock/tock/issues/1372. Multiple elements can be shaved off from the kernel, for example: console, process console, low-level debug, in-kernel debugging.

- Tuning memory available for apps, depending on how many apps and how much IPC is expected, e.g. https://github.com/tock/tock/issues/1532.

So what would really be the cost of 64-bit timers?

Let's consider the "worst-case" scenario where we also extend syscalls to return 64 bits (e.g. instead of returning in register 0 only, use an extra register for the most significant bits).

- In terms of syscall boundary, from the userspace point of view all the input registers are considered erased by the syscall due to the "memory" barrier (both in libtock-c and libtock-rs), so using more registers to write extra results wouldn't increase register pressure in userspace code (there are currently 4 input registers but only 1 output register).

- At the syscall boundary from the kernel's point of view, that's a few more words to write back to memory, so a few more instructions to run and a few more bytes in the binary. Syscalls that don't need it can also just not write anything back.

- All arithmetic operations on timers now include operations on two words. That's a few more instructions.

- Each timestamp stores 4 extra bytes. But how many timestamps are stored at any given time in the kernel? There's roughly 1 timestamp per client of an alarm mux, so I think that an extra 100 bytes is a realistic upper bound in the current state of Tock.
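
The extra-register idea from the "worst-case" scenario could be sketched as follows. The function names are hypothetical, and which register carries which half is an assumption (low word first here); nothing in this sketch reflects the actual Tock syscall ABI.

```rust
/// Kernel side: split a 64-bit timestamp into two 32-bit register values,
/// returned as (low word, high word).
fn split_u64(value: u64) -> (u32, u32) {
    (value as u32, (value >> 32) as u32)
}

/// Userspace side: rebuild the 64-bit timestamp from the two registers.
fn join_u64(low: u32, high: u32) -> u64 {
    ((high as u64) << 32) | (low as u64)
}
```

The split and join are a shift and a truncation or an OR each, which is the "few more instructions" scale of cost described in the list above.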

On the other side, what are the benefits of 64-bit logical timers?

- They avoid the need for a "3-way comparison" currently used to compare wrapping ticks w.r.t. some reference point. This 3-way comparison makes things more complex to understand and also involves more instructions and storage of more timers.

- 64-bit timers are reliable in all scenarios out-of-the-box, so less risk of bugs. The developer doesn't have to decide whether scenario A works fine with 32 bits whereas scenario B requires 64 bits to handle overflows.
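
For comparison, here is a sketch of the two styles of expiry check. The wrapping version is written in the spirit of the "3-way comparison" mentioned above (alarm, now, and a reference point); both function names are illustrative and the code is not copied from Tock.

```rust
/// Wrapping 32-bit check: has `alarm` been reached at time `now`, given that
/// `prev` is a reference point known to precede both? All three values are
/// needed because the raw tick values cannot be ordered on their own.
fn has_expired_wrapping(alarm: u32, now: u32, prev: u32) -> bool {
    now.wrapping_sub(prev) >= alarm.wrapping_sub(prev)
}

/// 64-bit check, assuming the counter never wraps in the device's lifetime.
fn has_expired_u64(alarm: u64, now: u64) -> bool {
    now >= alarm
}
```

With `prev = 0xFFFF_FFF0` and `alarm = 10` (i.e. just after a wrap), a naive `now >= alarm` on 32-bit values would report the alarm as expired at `now = 0xFFFF_FFF8`, whereas the wrapping check correctly says it is still pending. The 64-bit version needs no reference point at all.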

Last, if application A needs to re-implement its own overflow detection in userspace, the savings made in the kernel are offset by extra cost in userspace.

- The overflow detection logic now needs to be implemented for each application, or at the very least twice (in libtock-c as well as libtock-rs) - which means more maintenance, more risk of programming errors, as opposed to a centralized implementation in the kernel.

- If there are multiple processes running on the same board, the extra logic is copied in each of them.

- If notifying userspace that an overflow occurs requires some extra syscalls/callbacks, the cost could easily be 10x higher than the extra 4 bytes per timer saved in the kernel (and the kernel itself now needs to provide more syscalls, so more logic there as well).

If on the other hand application B doesn't need timers, and/or if the memory cost of 4 extra bytes really is prohibitive, why not re-compile the kernel without any of the timers on that board, with fewer debugging capsules, etc.?

All in all, before ruling out the possibility of using 64 bits, I'd suggest weighing the pros and cons more thoroughly.

Guillaume

Patrick Mooney

unread,

Feb 17, 2020, 3:52:55 PM2/17/20

to Guillaume Endignoux, Philip Levis, Vadim Sukhomlinov, Tock Embedded OS Development Discussion

On Mon, 17 Feb 2020 at 04:26, Guillaume Endignoux <guill...@google.com> wrote:
> Now, are 32 bits enough?
>

> However, I'm more concerned about it because if the logic doesn't
> handle overflows well, you don't detect it during
> development/debugging (because 36 hours is much longer than the
> typical development cycle), but then risk having unexpected issues in
> production.

I agree that presenting an API (both in-kernel and to userspace) which
makes it easy to do the correct thing with respect to overflow is
important here. Phil's suggestion of `set_alarm(from_when, ticks)`
appears to solve that problem rather simply.
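
A minimal sketch of why a `set_alarm(from_when, ticks)` shape helps is given here. The `Alarm` struct, the exact signature, and `is_due` are my own rendering for illustration and not the proposed HIL; the check is only valid while fewer than 2^32 ticks have passed since `from_when`.

```rust
/// An alarm stored as (reference, dt) rather than as an absolute expiry tick.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Alarm {
    reference: u32,
    dt: u32,
}

/// Schedule an alarm `ticks` after `from_when` (hypothetical signature).
fn set_alarm(from_when: u32, ticks: u32) -> Alarm {
    Alarm { reference: from_when, dt: ticks }
}

/// The alarm is due once at least `dt` ticks have elapsed since `reference`.
/// Wrapping subtraction makes this work across a counter wraparound.
fn is_due(alarm: Alarm, now: u32) -> bool {
    now.wrapping_sub(alarm.reference) >= alarm.dt
}
```

Because the reference travels with the request, an alarm whose expiry has already passed by the time it is registered (e.g. `set_alarm(1000, 5)` checked at tick 1010) is recognized as due immediately instead of being mistaken for one that is almost a full wraparound away.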

> So what would really be the cost of 64-bit timers?

I think a question to ask is: Assuming Tock provides a simple interface
for unambiguous single-word (32-bit) timer scheduling, how many things
would still need a >36hour/64-bit timer? If such cases do exist, could
we extend the interface in an opt-in manner to suit their needs?

> Let's consider the "worst-case" scenario where we also extend syscalls
> to return 64 bits (e.g. instead of returning in register 0 only, use
> an extra register for the most significant bits).

Considering the discussion about separating syscall errors from the
results, this would mean a 3-word return value, which itself departs from
most ABI conventions (where the result is typically specified as up to
two registers).

-Patrick

Philip Levis

unread,

Feb 17, 2020, 4:07:27 PM2/17/20

to Patrick Mooney, Guillaume Endignoux, Vadim Sukhomlinov, Tock Embedded OS Development Discussion

On Feb 17, 2020, at 12:52 PM, Patrick Mooney <pat...@oxidecomputer.com> wrote:

> > Let's consider the "worst-case" scenario where we also extend syscalls
> > to return 64 bits (e.g. instead of returning in register 0 only, use
> > an extra register for the most significant bits).
>
> Considering the discussion about separating syscall errors from the
> results, this would mean a 3-word return value, which itself departs from
> most ABI conventions (where the result is typically specified as up to
> two registers).

If you need a 64-bit value from the kernel you can always do an allow() with an 8-byte buffer.
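
The buffer-based approach might look roughly like this. Only the byte handling is shown; the actual allow/command plumbing is omitted, the function names are hypothetical, and little-endian byte order is an assumption.

```rust
/// Kernel side (sketch): store a 64-bit timestamp in the shared 8-byte buffer.
fn write_timestamp(buf: &mut [u8; 8], ticks: u64) {
    *buf = ticks.to_le_bytes();
}

/// Userspace side (sketch): read the timestamp back out of the buffer.
fn read_timestamp(buf: &[u8; 8]) -> u64 {
    u64::from_le_bytes(*buf)
}
```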

Phil

Guillaume Endignoux

unread,

Feb 18, 2020, 3:17:17 AM2/18/20

to Philip Levis, Patrick Mooney, Vadim Sukhomlinov, Tock Embedded OS Development Discussion

> I think a question to ask is: Assuming Tock provides a simple interface
> for unambiguous single-word (32-bit) timer scheduling, how many things
> would still need a >36hour/64-bit timer? If such cases do exist, could
> we extend the interface in an opt-in manner to suit their needs?

Scheduling an alarm more than 36 hours in the future may be a niche use case.

But having a monotonic clock that works for more than 36 hours is definitely not niche.

To take a concrete example (from OpenSK), the FIDO2 protocol specifies that some operations are only allowed in the first N seconds after powering up a security key.

With a monotonic clock, that's easy to implement: just obtain the current time and compare it with N.

But if the clock is not monotonic (due to wrapping around at 36 hours), this doesn't work because the operation would be accepted at time (2^32 + N - 1) for example.

So to be correct, we need to handle the overflow in userspace, either with some overflow-specific callback (which doesn't exist yet), or by repeatedly scheduling an alarm to count seconds for example (or any other suitable unit of time).

In any case, these workarounds are quite overkill if the kernel can handle 64-bit timestamps instead.
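
Here is a small sketch of the failure mode, assuming a 32,768 Hz tick and a 10-second window (both numbers are picked for illustration, as are the function names; this is not OpenSK code).

```rust
/// Naive "are we still within the first `window_ticks` after power-up?"
/// check on a 32-bit tick count that silently wraps.
fn within_window_u32(now_ticks: u32, window_ticks: u32) -> bool {
    now_ticks < window_ticks
}

/// The same check on a 64-bit monotonic tick count.
fn within_window_u64(now_ticks: u64, window_ticks: u64) -> bool {
    now_ticks < window_ticks
}
```

Take a true uptime of 2^32 + 100 ticks, i.e. roughly 36 hours after power-up. Truncated to 32 bits this reads as 100, so the naive 32-bit check accepts the operation, while the 64-bit check rejects it as intended.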

Instead of asking each application to implement its own overflow logic (or rather, not asking, and leaving developers to either figure it out themselves or end up with a bug in their code), why not do it once and for all in the kernel?

Again, I think that the supposed overhead of 64-bit timestamps is negligible compared to many other optimization opportunities in Tock, and that the cost of re-doing overflow detection for every application would likely be at least an order of magnitude higher (by cost I mean not only memory, but additional code, implementation bugs, etc.).

> If you need a 64-bit value from the kernel you can always do an allow() with an 8-byte buffer.

That would be an overkill way of "saving 4 bytes" by adding extra code for an additional allow/command pair, both in the kernel and userspace, thereby wasting much more than 4 bytes.

> Considering the discussion about separating syscall errors from the
> results, this would mean a 3-word return value, which itself departs from
> most ABI conventions (where the result is typically specified as up to
> two registers).

In principle I don't see how departing from most existing ABI conventions is an issue in itself, but if Tock aims at following existing conventions it would be nice to have it documented somewhere.

Guillaume Endignoux

unread,

Feb 18, 2020, 3:21:47 AM2/18/20

to Philip Levis, Patrick Mooney, Vadim Sukhomlinov, Tock Embedded OS Development Discussion

> > How this is just assuming a u32?
>
> That apparently nothing overrides the 'W' type parameter for the Time
> and Alarm HILs, particularly since all of the abstractions built on top
> of them do not even bother to parameterize. (Which, granted, would
> probably be a pain with integer primitive types.)

Just noticed this part of the discussion.

Yes, even though the current HIL has a W parameter, in practice it only works for u32 at the moment.

See https://github.com/tock/tock/pull/1521 for a draft of how to support 24-bit counters for example - this requires changing the HIL so that the bit width is propagated where necessary.

That draft predates the RISC-V discussion, though, so the HIL probably requires changes beyond that to support RISC-V as well.

Philip Levis

unread,

Mar 5, 2020, 12:18:29 PM3/5/20

to Guillaume Endignoux, Patrick Mooney, Vadim Sukhomlinov, Tock Embedded OS Development Discussion

I’d like to get this discussion going again. With #1651 it’s becoming more pressing. How about I make a branch for a proposed redesign, send a detailed description of it to this list, then we can comment/discuss here? I’ll try to have it ready by early next week.

Phil

Amit Levy

unread,

Mar 10, 2020, 10:33:54 AM3/10/20

to tock...@googlegroups.com

Following up to say that this is a good plan.
