Hi
We have discovered a bug of RTEMS relating to the clock. Even though we only tested it for the ERC32 board (for now), we believe it is common throughout all the targets.
The test consists in checking if the time given by rtems_clock_get is always increasing. The code is as follows:
rtems_task Init(rtems_task_argument argument) {
struct timespec old;
struct timespec new;
rtems_clock_get_uptime(&old);
while (1) {
rtems_clock_get_uptime(&new);
if (new.tv_sec < old.tv_sec)
failedTestExit(TEST_CASE, 10, "seconds decrease");
if (new.tv_sec == old.tv_sec) {
if (new.tv_nsec < old.tv_nsec) {
/* this is the error given */
printk(" new = %d %d\n", new.tv_sec, new.tv_nsec);
printk(" old = %d %d\n", old.tv_sec, old.tv_nsec);
/* commented just to continue the test */
/*failedTestExit(TEST_CASE, 10, "second equals but nanoseconds decreased");*/
}
}
old = new;
/* stop after 10 seconds */
if (new.tv_sec > 10)
sucessfullTestExit(TEST_CASE);
}
sucessfullTestExit(TEST_CASE);
}
This is the output we got using the SIS simulator:
new = 0 50002000
old = 0 59990000
new = 0 90002000
old = 0 99990000
new = 0 100003000
old = 0 109991000
new = 0 120003000
old = 0 129991000
new = 0 160002000
old = 0 169989000
new = 0 210002000
old = 0 219989000
new = 0 260002000
old = 0 269990000
new = 0 300001000
old = 0 309989000
new = 0 320003000
old = 0 329991000
new = 0 360002000
old = 0 369990000
new = 0 370002000
old = 0 379989000
new = 0 380001000
old = 0 389989000
new = 0 420002000
old = 0 429990000
new = 0 470002000
old = 0 479989000
new = 0 500004000
old = 0 509991000
new = 0 520001000
old = 0 529989000
new = 0 550003000
old = 0 559991000
new = 0 600002000
old = 0 609990000
new = 0 650002000
old = 0 659989000
new = 0 670001000
old = 0 679988000
new = 0 690002000
old = 0 699990000
new = 0 730002000
old = 0 739989000
new = 0 770001000
old = 0 779989000
new = 0 790004000
old = 0 799992000
new = 0 800001000
old = 0 809989000
new = 0 840001000
old = 0 849989000
new = 0 860004000
old = 0 869991000
new = 0 870001000
old = 0 879989000
new = 0 890004000
old = 0 899991000
new = 0 900001000
old = 0 909989000
new = 0 920004000
old = 0 929991000
new = 0 930001000
old = 0 939989000
new = 0 950004000
old = 0 959992000
new = 0 960001000
old = 0 969989000
new = 1 40002000
old = 1 49990000
new = 1 60002000
As you can see, the problem is raised when a clock tick occurs: the time read in “new” is older than “old”. This cannot happen since “new” is updated AFTER “old”! This problem occurs if a clock interrupt is triggered after the number of clock ticks have been read but BEFORE the nanosecond field has been read. The resulting number of clock ticks is “small” (should be plus one).
We can use the interrupt pending CPU registers info to determine if, while we are reading the number of clock ticks that occurred and the nanoseconds, a clock interrupt could be pending.
We do not have enough knowledge to say that this solution can be used for all boards (determining if an interrupt is pending). In the SPARC architecture, at least, it is possible. If it is possible in all architectures, then a solution to RTEMS would be to change the code of the _TOD_Get_uptime function to:
void _TOD_Get_uptime(struct timespec *uptime) {
ISR_Level level;
struct timespec offset;
volatile uint32_t pending;
/* assume uptime checked by caller */
offset.tv_sec = 0;
offset.tv_nsec = 0;
_ISR_Disable(level);
*uptime = _TOD_Uptime;
if (_Watchdog_Nanoseconds_since_tick_handler)
offset.tv_nsec = (*_Watchdog_Nanoseconds_since_tick_handler)();
/* code added: */
pending = isClockInterruptPending();
_ISR_Enable(level);
/* code added */
/* if an interrupt occurred while interrupts were disabled and the nanoseconds is too little */
/* means that a clock interrupt occurred BEFORE the nanoseconds were read */
if (pending && offset.tv_nsec < rtems_configuration_get_microseconds_per_tick() / 2) {
struct timespec clockTick = {0, rtems_configuration_get_microseconds_per_tick()*1000};
_Timespec_Add_to(&offset, &clockTick); /* so add one clock tick to the offset */
}
/* else, the clock tick occurred AFTER the nanoseconds were read, so no problem */
_Timespec_Add_to(uptime, &offset);
}
At least, with these modifications, the test passes :)!
Kind regards
Manuel Coutinho
No you have discovered a bug in the ERC32's get nanoseconds since last
tick handler.
Apparently it doesn't recognize that is has been longer than a tick
since the last clock
tick. It should determine that is has overflowed and return 1 clock
tick + partial clock
tick.
It is returning the wrong answer.
>
>
> The test consists in checking if the time given by rtems_clock_get is
> always increasing. The code is as follows:
>
>
>
> .....
....
--
Joel Sherrill, Ph.D. Director of Research & Development
joel.s...@OARcorp.com On-Line Applications Research
Ask me about RTEMS: a free RTOS Huntsville AL 35805
Support Available (256) 722-9985
_______________________________________________
rtems-users mailing list
rtems...@rtems.com
http://rtems.rtems.org/mailman/listinfo/rtems-users
Hi. Thanks for testing in leon2…was just about to do that :)!
Basically, the test aims to see that the clock is always increasing. But the result is that, sometimes, it decreases.
This happens when a clock tick occurs.
The RTEMS algorithm to read the time consists of:
1. Disabling interrupts
2. Read how many clock interrupts occurred
3. Read nanoseconds elapsed
4. Enable interrupts
The problem occurs if an interrupt occurs between the 1 and 3 steps: -> the number of clock ticks is NOT updated (because interrupts are disabled) and the nanoseconds field (which is read directly from the CPU registers) is restarted.
For example, the clock is 2 clock ticks and 99999 us (a clock tick is just about to occur). Suppose you disable interrupts, read 2 clock ticks, then the CPU clock counter overflows and the nanoseconds is reset to 0 and you read 0 nanoseconds.
You end up reading 2 clock ticks and 0 nanoseconds when in fact you should read 3 clock ticks and 0 nanoseconds - you cannot stop the clock counter (nor do you wish to!).
Hope this explains the problem a little bit better :)
By the way, by glancing at the code of the remaining BSPs, specifically at the Clock_driver_nanoseconds_since_last_tick macro, I don’t believe this is a problem just with SPARC. I don’t see this problem address in any board. But, of course, I might be wrong since it was just a very brief look.
Kind regards
Manuel Coutinho
I’m not sure that the isClockInterruptPending() is a function that can be available to all BSPs. Some CPU might not support this. I don’t know :(.
In either case, here’s the function for the ERC32 board (I know, the names can be improved):
#define INTERRUPT_PENDING_ADDRESS 0x01F80048
/**
* RTC interrupt level
**/
#define RTC_ISR_INTERRUPT_LEVEL 13
#define RTC_INTERRUPT_PENDING_MASK (1 << RTC_ISR_INTERRUPT_LEVEL)
uint32_t isClockInterruptPending() {
uint32_t *pointer_pending = (uint32_t *) INTERRUPT_PENDING_ADDRESS;
return pending (*pointer_pending ) & RTC_INTERRUPT_PENDING_MASK;
/* code added */
/* if an interrupt occurred while interrupts were disabled and the nanoseconds is too little */
/* means that a clock interrupt occurred BEFORE the nanoseconds were read */
if (pending && offset.tv_nsec < rtems_configuration_get_microseconds_per_tick() / 2) {
struct timespec clockTick = {0, rtems_configuration_get_microseconds_per_tick()*1000};
_Timespec_Add_to(&offset, &clockTick); /* so add one clock tick to the offset */
}
I see that _TOD_Uptime is updated every clock tick in cpukit/score/src/coretodtickle.c (see bellow) with microseconds precision. It is not enough with the next patch?
=== modified file 'cpukit/score/src/coretodgetuptime.c'
--- cpukit/score/src/coretodgetuptime.c 2009-03-31 07:56:38 +0000
+++ cpukit/score/src/coretodgetuptime.c 2009-03-31 09:49:25 +0000
@@ -20,6 +20,7 @@
#include <rtems/score/isr.h>
#include <rtems/score/timespec.h>
#include <rtems/score/tod.h>
+#include <rtems/config.h>
/*
* _TOD_Get_uptime
@@ -46,8 +47,8 @@
_ISR_Disable( level );
*uptime = _TOD_Uptime;
- if ( _Watchdog_Nanoseconds_since_tick_handler )
- offset.tv_nsec = (*_Watchdog_Nanoseconds_since_tick_handler)();
+// if ( _Watchdog_Nanoseconds_since_tick_handler )
+// offset.tv_nsec = (*_Watchdog_Nanoseconds_since_tick_handler)();
_ISR_Enable( level );
This solution is actually very robust. Only if the number of microseconds per tick is very small (~10 us) should you get problems. This is, if two clock interrupts can occur while interrupts are disabled. But we consider this to be impossible.
Remember that the clock ticks and nanoseconds are read while interrupts are disabled. So after the disabling of interrupts at most only one clock tick could occur.
The _TOD_Tickle_ticks only states that one more clock tick occurred. The nanoseconds field is not updated. Haven’t tested, but don’t believe that just by commenting those lines the problem will be solved.
Any code modifications outside the device driver are not correct and
will not address all cases of using this extension.
This might be a common issue with get nanoseconds handlers but it will
NOT be addressed in this way. Check the is pending bit or that the
counter overflowed in the get nanoseconds handler and return the correct
number.
--joel
> ------------------------------------------------------------------------
>
> *From:* ait...@googlemail.com <mailto:ait...@googlemail.com>
> [mailto:ait...@googlemail.com <mailto:ait...@googlemail.com>]
> *On Behalf Of *Aitor Viana
> *Sent:* Tuesday, March 31, 2009 8:31 AM
> *To:* Joel Sherrill
> *Cc:* Manuel Coutinho; rtems...@rtems.com
> <mailto:rtems...@rtems.com>
> *Subject:* Re: Bug with clock nanoseconds
> rtems...@rtems.com <mailto:rtems...@rtems.com>
> http://rtems.rtems.org/mailman/listinfo/rtems-users
> -----Original Message-----
> From: Joel Sherrill [mailto:joel.s...@oarcorp.com]
> Sent: Tuesday, March 31, 2009 11:40 AM
> To: Aitor Viana
> Cc: Manuel Coutinho; rtems...@rtems.com
> > Hi. Thanks for testing in leon2.was just about to do that :)!
ERC32:
uint32_t bsp_clock_nanoseconds_since_last_tick(void) {
uint32_t clicks;
clicks = ERC32_MEC.Real_Time_Clock_Counter;
/* Down counter */
clicks = (uint32_t) (BSP_Configuration.microseconds_per_tick - clicks) *
1000;
/* if an interrupt occurred while interrupts were disabled and the
nanoseconds is too little */
/* means that a clock interrupt occurred BEFORE the nanoseconds were
read */
if (ERC32_Is_interrupt_pending(ERC32_INTERRUPT_REAL_TIME_CLOCK) &&
clicks < BSP_Configuration.microseconds_per_tick / 2)
clicks += BSP_Configuration.microseconds_per_tick * 1000;
return clicks;
}
LEON2:
uint32_t bsp_clock_nanoseconds_since_last_tick(void) {
uint32_t clicks;
clicks = LEON_REG.Timer_Counter_1;
/* Down counter */
clicks = (uint32_t) (BSP_Configuration.microseconds_per_tick - clicks) *
1000;
/* if an interrupt occurred while interrupts were disabled and the
nanoseconds is too little */
/* means that a clock interrupt occurred BEFORE the nanoseconds were
read */
if (LEON_Is_interrupt_pending(LEON_INTERRUPT_TIMER1) && clicks <
BSP_Configuration.microseconds_per_tick / 2)
clicks += BSP_Configuration.microseconds_per_tick * 1000;
return clicks;
}
I've tested and the result seams ok :)
Kind regards
Manuel Coutinho
"How many nanoseconds has it been since you last called rtems_clock_tick?"
The driver was answering this question with the "number of nanoseconds
since the last counter hit zero" not since it last called rtems_clock_tick.
Most of the time the answers are the same but there is a small window
where it is not and this case highlights that.
> Moreover, CONFIGURE_MICROSECONDS_PER_TICK defines the RTEMS time
> granularity so I dont see the point making this ad-hoc modification in
> bsp_clock_nanoseconds_since_last_tick() function to make
> _TOD_Get_Uptime() work properly when we can do it applying this simple
> patch
>
You are losing the ability to have the timestamp include "partial ticks".
If you call get uptime 10 times in a row, you will likely get only 1 or two
(if a tick occurs) distinct values. Getting nanosecond accurate timestamps
is the entire functional goal of the handler being called.
> === modified file 'cpukit/score/src/coretodgetuptime.c'
> --- cpukit/score/src/coretodgetuptime.c 2009-03-31 07:56:38 +0000
> +++ cpukit/score/src/coretodgetuptime.c 2009-03-31 11:58:51 +0000
> @@ -46,8 +46,8 @@
>
> _ISR_Disable( level );
> *uptime = _TOD_Uptime;
> - if ( _Watchdog_Nanoseconds_since_tick_handler )
> - offset.tv_nsec = (*_Watchdog_Nanoseconds_since_tick_handler)();
> +// if ( _Watchdog_Nanoseconds_since_tick_handler )
> +// offset.tv_nsec = (*_Watchdog_Nanoseconds_since_tick_handler)();
>
> _ISR_Enable( level );
>
> loosing no resolution.
>
> BTW this patch is tested and working.
>
>
--
Joel Sherrill, Ph.D. Director of Research & Development
joel.s...@OARcorp.com On-Line Applications Research
Ask me about RTEMS: a free RTOS Huntsville AL 35805
Support Available (256) 722-9985
_______________________________________________
rtems-users mailing list
rtems...@rtems.com
--joel