In article <r1h1lj$i5f$1...@dont-email.me>, pozz <pozz...@gmail.com> wrote:
>I need a timestamp in millisecond in linux epoch. It is a number that
>doesn't fit in a 32-bits number.
>
>I'm using a 32-bit MCU (STM32L4R9...) so I don't have a 64-bits hw
>counter. I need to create a mixed sw/hw 64-bits counter. It's very
>simple, I configure a 32-bits hw timer to run at 1kHz and increment an
>uint32_t variable in timer overflow ISR.
>
>Now I need to implement a GetTick() function that returns a uint64_t. I
>know it could be difficult, because of race conditions. One solutions is
>to disable interrupts, but I remember another solution.
This is actually a very tricky problem. I believe it is not possible to
solve it with the constraints you have laid out above. David Brown's solution
in his GetTick() function is correct, but it doesn't discuss why.
If you have a valid 64-bit counter which you can only read 32 bits at
a time (I'll use the functions read_high32() and read_low32() for the two
halves, but these could be hardware registers, volatile globals, or real
functions), then an algorithm to read it reliably is basically your original
algorithm:

    uint64_t
    GetTick(void)
    {
        uint32_t old_high32, new_high32, low32;

        old_high32 = read_high32();
        while(1) {
            low32 = read_low32();
            new_high32 = read_high32();
            /* High word unchanged: low32 belongs to this high word. */
            if(new_high32 == old_high32) {
                return ((uint64_t)new_high32 << 32) | low32;
            }
            old_high32 = new_high32;
        }
    }

This code does not need to mask interrupts, and it works on multiple CPUs.
This works even if interrupts occur at any point for any duration, even
if the code is interrupted for more than 49 days.
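
To see the loop handle a carry, here is a small host-side sketch. Everything in it apart from the GetTick() loop is an assumption made up for illustration: sim_counter stands in for a true 64-bit counter, and each read advances it by one tick to mimic time passing between the two 32-bit reads. check_carry() is a hypothetical helper that starts two reads below the carry into the high word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a real 64-bit counter. Every read advances
   it by one, so the value changes between the 32-bit reads. */
static uint64_t sim_counter;

static uint32_t read_high32(void)
{
    sim_counter++;
    return (uint32_t)(sim_counter >> 32);
}

static uint32_t read_low32(void)
{
    sim_counter++;
    return (uint32_t)sim_counter;
}

uint64_t GetTick(void)
{
    uint32_t old_high32 = read_high32();

    for (;;) {
        uint32_t low32 = read_low32();
        uint32_t new_high32 = read_high32();

        if (new_high32 == old_high32)
            return ((uint64_t)new_high32 << 32) | low32;
        old_high32 = new_high32;   /* carry seen: go round again */
    }
}

/* Start just below the carry and check that the result lies inside the
   interval during which GetTick() was running. */
void check_carry(void)
{
    sim_counter = 0xFFFFFFFEu;
    uint64_t before = sim_counter;
    uint64_t t = GetTick();

    assert(t > before && t <= sim_counter);
}
```

In this run the first pass sees the high word change from 0 to 1, so the loop goes round once more and returns a value read entirely after the carry.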
However, you don't have a valid 64-bit counter you can only read 32-bits at a
time. You have a free-running hardware counter which read_low32() returns.
It counts up every 1ms, and eventually wraps from 0xffff_ffff to 0x0000_0000
and causes an interrupt (which lots of people have helpfully calculated at
about 49 days). Let's assume that interrupt calls this handler:

    volatile uint32_t ticks_high = 0;

    void
    timer_wrap_interrupt()
    {
        ticks_high++;
    }

where by convention only this code will write to ticks_high (this is a very
important limitation). And so my function read_high32() is simply:
{ return ticks_high; }.
Unfortunately, with this design, I believe it is not possible to implement
a GetTick() function which does not sometimes fail to return a correct time.
There is a fundamental race between the interrupt and the timer value rolling
to 0 which software cannot account for.
The problem is it's possible for software to read the HW counter and see it
has rolled over from 0xffff_ffff to 0 BEFORE the interrupt occurs which
increments ticks_high. This is an inherent race: the timer wraps to 0, and
signals an interrupt. It's possible, even if for only a few cycles, to
read the register and see the zero before the interrupt is taken.
Shown more explicitly, the following are all valid states (let's assume
ticks_high is 0, read_low32() just ticked to 0xffff_fffe):

    Time         read_low32()    ticks_high
    ---------------------------------------
    0            0xffff_fffe     0
    1ms          0xffff_ffff     0
    1.99999ms    0xffff_ffff     0
    2ms          0x0000_0000     0
                 (interrupt is sent and is now pending)
    2ms+delta    0x0000_0000     1

The issue is: what is "delta", and can other code (including your GetTick()
function) run between "2ms" and "2ms+delta"? And the answer is almost
assuredly "yes". This is a problem.
The GetTick() routine above can read ticks_high==0, read_low32()==0, and then
ticks_high==0 again at around time 2ms+small_amount, and return 0, even though
a cycle or two ago, read_low32() returned 0xffff_ffff. So time appears to
jump backwards 49 days when this happens.
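
The window can be reproduced on a host with a simulated timer. This is only a sketch: hw_counter is a plain variable standing in for the timer register, the "interrupt" is an ordinary function call, and show_race() is a hypothetical helper that walks through the states in the table.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t hw_counter;            /* stand-in for the 1 kHz timer register */
static volatile uint32_t ticks_high;   /* written only by the wrap handler */

static uint32_t read_low32(void)  { return hw_counter; }
static uint32_t read_high32(void) { return ticks_high; }

static void timer_wrap_interrupt(void) { ticks_high++; }

static uint64_t GetTick(void)
{
    uint32_t old_high32 = read_high32();

    for (;;) {
        uint32_t low32 = read_low32();
        uint32_t new_high32 = read_high32();

        if (new_high32 == old_high32)
            return ((uint64_t)new_high32 << 32) | low32;
        old_high32 = new_high32;
    }
}

/* One read before the wrap, one while the interrupt is still pending,
   one after the handler has run. */
void show_race(uint64_t out[3])
{
    hw_counter = 0xFFFFFFFFu;
    ticks_high = 0;
    out[0] = GetTick();        /* 0x0000_0000_ffff_ffff */

    hw_counter = 0;            /* counter wraps; handler has not run yet */
    out[1] = GetTick();        /* 0: about 49 days in the past */

    timer_wrap_interrupt();    /* handler finally runs */
    out[2] = GetTick();        /* 0x0000_0001_0000_0000 */

    assert(out[1] < out[0]);   /* time went backwards inside the window */
}
```

The loop itself behaves exactly as designed here. Both reads of ticks_high agree, so it has no way to tell that the low word already belongs to the next epoch.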
There are a variety of solutions to this problem, but they all involve
extra work and ignoring the 32-bit rollover interrupt. So, remove
timer_wrap_interrupt(), and then do:
1) Have a single GetTick() routine, which is single-tasking (by
disabling interrupts, or a mutex if there are multiple processors).
This requires something to call GetTick() at least once every 49 days
(worst case). This is basically the Rich C./David Brown solution, but
they don't mention that you need to remove the interrupt on 32-bit overflow.
2) Use a higher interrupt rate. For instance, if we can take the interrupt
when read_low32() has carry from bit 28 to bit 29, then we can piece together
code which can work as long as GetTick() isn't delayed by more than 3-4 days.
This requires changing GetTick() to use the code given under #4 below.
3) Forget the hardware counter: just take an interrupt every 1ms, and
increment a global variable uint64_t ticks64 on each interrupt, and then
GetTick just returns ticks64. This only works if the CPU hardware supports
atomic 64-bit accesses. It's not generally possible to write C code for a
32-bit processor which can guarantee 64-bit atomic ops, so it's best to have
the interrupt handler deal with two 32-bit variables ticks_low and
ticks_high, and then you still need GetTick() to have a while loop to
read the two variables.
4) Use an existing regular interrupt which occurs at any rate, as long as its
period is well over 1ms and well under 49 days. Let's assume you have a 1-second
interrupt. This can be asynchronous to the 1ms timer. In that interrupt
handler, you sample the 32-bit hardware counter, and if you notice it
wrapping (previous read value > new value), increment ticks_high.
You need to update the global volatile variable ticks_low as well as the
current hw count. And this interrupt handler needs to be the only code
changing ticks_low and ticks_high. Then, GetTick() does the following:

    uint32_t local_ticks_low, local_ticks_high;

    [ while loop to read valid ticks_low and ticks_high value into the
      local_* variables ]

    uint64_t ticks64 = ((uint64_t)local_ticks_high << 32) | local_ticks_low;
    ticks64 += (int32_t)(read_low32() - local_ticks_low);
    return ticks64;

Basically, we return the ticks64 from the last regular interrupt, which could
be 1 second ago, and we add in the small delta from reading the hw counter.
Again, this requires the 1-second interrupt to be guaranteed to happen before
we get close to 49 days since the last 1-second interrupt. If it's really
a 1-second interrupt, it easily meets that criterion; if you try to pick
something irregular, like a keypress interrupt, that won't work. It
does not depend on the exact rate of the interrupt at all.
I wrote it above with extra safety: it subtracts two 32-bit unsigned variables,
gets a 32-bit unsigned result, treats that as a 32-bit signed result, and adds
that to the 64-bit unsigned ticks count. It's not strictly necessary to do
the 32-bit signed result cast: it just makes the code more robust in case
the HW timer moves backwards slightly. Imagine some code tries to adjust the
current timer value by setting it backwards slightly (say, some code trying
to calibrate the timer with the RTC or something). Without the cast to
32-bit signed int, this slight backwards move would result in ticks64
jumping ahead 49 days, which would be bad. In C, this is pretty easy, but it
should be carefully commented so no one removes any important casts.
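
Putting option 4 together, a minimal host-side sketch might look like the following. Several things here are assumptions, not part of the description above: hw_counter is a plain variable standing in for the timer register, periodic_interrupt() is a hypothetical name for the 1-second handler, demo() is a made-up self-check, and the consistency loop assumes a single core where the handler runs to completion once it starts.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t hw_counter;            /* stand-in for the 1 kHz timer register */
static volatile uint32_t ticks_low;    /* hw count sampled by the handler */
static volatile uint32_t ticks_high;   /* wraps the handler has noticed */

static uint32_t read_low32(void) { return hw_counter; }

/* The regular interrupt. By convention it is the only code that writes
   ticks_low and ticks_high. */
void periodic_interrupt(void)
{
    uint32_t now = read_low32();

    if (now < ticks_low)       /* previous sample > new sample: wrapped */
        ticks_high++;
    ticks_low = now;
}

uint64_t GetTick(void)
{
    uint32_t local_ticks_low, local_ticks_high;

    /* Retry if the handler ran and bumped ticks_high between the reads. */
    do {
        local_ticks_high = ticks_high;
        local_ticks_low = ticks_low;
    } while (local_ticks_high != ticks_high);

    uint64_t ticks64 = ((uint64_t)local_ticks_high << 32) | local_ticks_low;

    /* IMPORTANT: keep the int32_t cast. It turns a small backward step of
       the hw counter into a small negative delta instead of almost 2^32. */
    ticks64 += (int32_t)(read_low32() - local_ticks_low);
    return ticks64;
}

void demo(void)
{
    hw_counter = 0xFFFFFFF0u;
    ticks_low = 0;
    ticks_high = 0;
    periodic_interrupt();
    assert(GetTick() == 0xFFFFFFF0ULL);

    hw_counter = 0x10u;                    /* wrapped, handler not run yet */
    assert(GetTick() == 0x100000010ULL);   /* the delta carries into bit 32 */

    periodic_interrupt();                  /* handler notices the wrap */
    assert(GetTick() == 0x100000010ULL);

    hw_counter = 0x0Bu;                    /* timer nudged back by 5 ticks */
    assert(GetTick() == 0x10000000BULL);   /* small step back, no 49-day jump */
}
```

Note that the handler's wrap test, as described, would itself take a backward step for a wrap if it happened to sample the counter while it was set back, so the sketch only exercises the cast inside GetTick().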
Kent