A timer driver for Cortex-M0+... it rarely doesn't work

pozz

Apr 26, 2017, 6:55:56 PM

I don't know if it is an issue specific to Atmel SAMC21 Cortex-M0+
devices or not.

I wrote a simple timer driver: a 32-bit hw counter clocked at 875kHz
(14MHz/16) that triggers an interrupt on overflow (every 1h 21'). In
the interrupt I increment the 32-bit global variable _ticks_high.
The 64-bit number composed of _ticks_high (upper 32 bits) and the hw
counter (lower 32 bits) is my system tick. This 64-bit software
counter, incremented at 875kHz, will never overflow during my lifetime, so
it's good :-)
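
The figures above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming only the 875 kHz tick rate described; the helper names overflow_period_seconds() and wrap_years_64() are illustrative and not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

#define TIMER_FREQ 875000u   /* ticks per second */

static_assert(TIMER_FREQ == 14000000u / 16u, "875 kHz is 14 MHz / 16");

/* Whole seconds until the 32-bit hardware counter wraps (2^32 ticks). */
static uint32_t overflow_period_seconds(void) {
    return (uint32_t)((UINT64_C(1) << 32) / TIMER_FREQ);
}

/* Whole years until the 64-bit software counter wraps (2^64 ticks). */
static uint64_t wrap_years_64(void) {
    const uint64_t seconds_per_year = UINT64_C(365) * 24 * 3600;
    return (UINT64_MAX / TIMER_FREQ) / seconds_per_year;
}
```

The 32-bit period comes out at 4908 s, i.e. 1h 21' 48", and the 64-bit counter lasts for several hundred thousand years.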

In timer.h I have:
--- Start of timer.h ---
#include <stdint.h>
#include <io.h> // Atmel specific

#define TIMER_FREQ 875000

typedef uint64_t Timer;
extern uint32_t _ticks_high;

#define volatileAccess(v) *((volatile typeof((v)) *) &(v))

static inline uint64_t ticks(void) {
    uint32_t h1 = volatileAccess(_ticks_high);
    TC0->COUNT32.CTRLBSET.reg =
        TC_CTRLBSET_CMD(TC_CTRLBSET_CMD_READSYNC_Val);
    uint32_t l1 = TC0->COUNT32.COUNT.reg;
    uint32_t h2 = volatileAccess(_ticks_high);
    if (h2 != h1) return ((uint64_t)h2 << 32) + 0;
    else return ((uint64_t)h1 << 32) + l1;
}

static inline void TimerSet(Timer *tmr, uint64_t delay) {
    /* delay is in ms */
    *tmr = ticks() + delay * TIMER_FREQ / 1000;
}

static inline int TimerExpired(Timer *tmr) {
    return ticks() >= *tmr;
}
--- End of timer.h ---

In timer.c I have the ISR:
--- Start of timer.c ---
...
void TC0_Handler(void) {
    if (TC0->COUNT32.INTFLAG.reg & TC_INTFLAG_OVF) {
        ++_ticks_high;
        TC0->COUNT32.INTFLAG.reg = TC_INTFLAG_OVF;
    }
}
...
--- End of timer.c ---

The idea is simple (and stolen from a post that appeared on this newsgroup).
Normally, the 64-bit software counter would have to be calculated with
interrupts disabled, because if a timer interrupt triggers during the
calculation, the overall software counter could be wrong.
By reading _ticks_high before and after reading the hw counter, we can avoid
disabling the interrupts.

Now I have a code that uses timers. It's a simple state-machine that
manages communication over a bus.
--- Start of bus.c ---
...
int bus_task(void) {
    switch(bus.state) {
    case BUS_IDLE:
        if (TimerExpired(&bus.tmr_answer)) {
            /* Send new request on the bus */
            ...
            TimerSet(&bus.tmr_answer, timeout_answer);
            bus.state = BUS_WAITING_ANSWER;
        }
        break;

    case BUS_WAITING_ANSWER:
        if (TimerExpired(&bus.tmr_answer)) {
            /* No reply */
            bus.state = BUS_IDLE;
            TimerSet(&bus.tmr_answer, 0);
        } else {
            if (reply_received() == true) {
                /* Analyze the reply */
                bus.state = BUS_IDLE;
                TimerSet(&bus.tmr_answer, 0);
            }
        }
        break;
    }
    return 0;
}
...
--- End of bus.c ---

I don't think I need to explain the code in bus.c. The only thing to
specify is that bus_task() is called continuously in the main loop.

99% of the time this code works well. Unfortunately I have seen some
strange events. Rarely (very rarely, about once a week) the bus seems
frozen for a while. After that it magically resumes normal activity.
One thing relates those strange events to the timer driver:
the bus stall lasts exactly 1h 21', the overflow period of the hw counter.

I suspect there's a problem in my low-level driver and sometimes, maybe
near the overflow, the code doesn't work as I expect.
Maybe the TimerSet() function sometimes sets a wrong value to the
uint64_t timer, maybe a tick value that will happen only at the next
overflow of the hw counter.
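
The suspicion above is at least consistent with the arithmetic. The sketch below assumes, purely as a hypothesis, that one tick reading comes out with its high word one too large at the moment a deadline is set; it does not say how that could happen. timer_expired_at() is an illustrative stand-in for TimerExpired() that takes the current tick value as a parameter instead of reading the hardware, and faulty_deadline() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TIMER_FREQ     875000u
#define HIGH_WORD_STEP (UINT64_C(1) << 32)   /* one increment of the high word */

/* Stand-in for TimerExpired(): the current tick count is passed in. */
static bool timer_expired_at(uint64_t now, uint64_t deadline) {
    return now >= deadline;
}

/* Whole seconds until the deadline expires, given the true tick count. */
static uint64_t seconds_until_expiry(uint64_t now, uint64_t deadline) {
    if (timer_expired_at(now, deadline)) return 0;
    return (deadline - now) / TIMER_FREQ;
}

/* Deadline that a zero-delay TimerSet() would store if the tick reading
   it used had a high word that is one too large (hypothetical fault). */
static uint64_t faulty_deadline(uint64_t true_now) {
    assert(true_now <= UINT64_MAX - HIGH_WORD_STEP);  /* no 64-bit wrap here */
    return true_now + HIGH_WORD_STEP;
}
```

With such a deadline the timer stays unexpired for 2^32 ticks, which is 4908 s or 1h 21' 48", matching the observed stall.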

Do you see where the problem is?

Simon Clubley

Apr 26, 2017, 7:27:17 PM

On 2017-04-26, pozz <pozz...@gmail.com> wrote:
> I don't know if it is an issue specific to Atmel SAMC21 Cortex-M0+
> devices or not.
>

[snip]

>
> 99% of the time this code works well. Unfortunately I have seen some
> strange events. Rarely (very rarely, one time in a week) the bus seems
> frozen for a time. After that it restarts the normal activity magically.
> There's a thing that relates those strange events to driver of timers:
> the bus stall time lasts exactly 1h 21', the overflow period of hw counter.
>
> I suspect there's a problem in my low-level driver and sometimes, maybe
> near the overflow, the code doesn't work as I expect.
> Maybe the TimerSet() function sometimes sets a wrong value to the
> uint64_t timer, maybe a tick value that will happen only at the next
> overflow of the hw counter.
>
> Do you see where is the problem?
>

Does changing the optimisation level change the problem ?

Does a review of the generated code using objdump match what you
would expect ?

Have you tried placing some kind of debug marker in TC0_Handler()
(maybe turning on an LED) to see if TC0_Handler() is called without
the interrupt flag being set ?

This last one is in case there's some kind of rare timing issue
which causes the overflow handler to be called without TC_INTFLAG_OVF
being set yet when INTFLAG is examined.

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world

John Speth

Apr 26, 2017, 8:17:31 PM

First you need to make ticks_high volatile. It doesn't make any sense
to me why you'd cast it volatile in one execution context but not the
other. If it's shared between two execution contexts, the compiler
needs to know that.

It sounds like you have a concurrent access problem that you need to
protect against. If it was me, I'd disable interrupts instead of your
trick.

JJS

pozz

Apr 27, 2017, 3:32:17 AM

Il 27/04/2017 01:24, Simon Clubley ha scritto:
> On 2017-04-26, pozz <pozz...@gmail.com> wrote:
>> I don't know if it is an issue specific to Atmel SAMC21 Cortex-M0+
>> devices or not.
>>
>
> [snip]
>
>>
>> 99% of the time this code works well. Unfortunately I have seen some
>> strange events. Rarely (very rarely, one time in a week) the bus seems
>> frozen for a time. After that it restarts the normal activity magically.
>> There's a thing that relates those strange events to driver of timers:
>> the bus stall time lasts exactly 1h 21', the overflow period of hw counter.
>>
>> I suspect there's a problem in my low-level driver and sometimes, maybe
>> near the overflow, the code doesn't work as I expect.
>> Maybe the TimerSet() function sometimes sets a wrong value to the
>> uint64_t timer, maybe a tick value that will happen only at the next
>> overflow of the hw counter.
>>
>> Do you see where is the problem?
>>
>
> Does changing the optimisation level change the problem ?

As you can imagine, it's very difficult to run a test, because I
have to wait for the rare event that might happen only after a week.

Anyway I think yes, disabling optimisation should solve it, but it's not a
solution.

> Does a review of the generated code using objdump match what you
> would expect ?

Do you mean reading the output listing with the assembler instructions? I'm
not an expert in ARM assembler, but I tried to read it and it seems
correct.

> Have you tried placing come kind of debug marker in TC0_Handler()
> (maybe turning on an LED) to see if TC0_Handler() is called without
> the interrupt flag being set ?

No, it would be very strange. TC0_Handler() function address is stored
only in the vector table, so it is called only when TC0 peripheral
requests an interrupt.
Anyway, if the interrupt flag is not set, the if in TC0_Handler() means
the increment of _ticks_high isn't done.

> This last one is in case there's some kind of rare timing issue
> which causes the overflow handler to be called without TC_INTFLAG_OVF
> being set yet when INTFLAG is examined.

There's an if and the increment of _ticks_high wouldn't be done.

pozz

Apr 27, 2017, 3:37:21 AM

Il 27/04/2017 02:17, John Speth ha scritto:
> First you need to make ticks_high volatile. It doesn't make any sense
> to me why you'd cast it volatile in one execution context but not the
> other. If it's shared between two execution contexts, the compiler
> needs to know that.

_ticks_high is accessed in TimerSet(), TimerExpired() and the ISR
TC0_Handler(). TimerSet() and TimerExpired() read _ticks_high by
calling the ticks() function. So _ticks_high is accessed only in two
places: ticks() and TC0_Handler().

ticks() is called during normal background flow (not interrupt), so the
access to _ticks_high must be volatile.
TC0_Handler() is the ISR, so I think a volatile access to _ticks_high is
not necessary (the ISR can't be interrupted).

> It sounds like you have a concurrent access problem that you need to
> protect against. If it was me, I'd disable interrupts instead of your
> trick.

I liked this trick because it avoids disabling interrupts. TimerExpired()
is called often.

I found the post (from Wouter van Ooijen) where this trick is suggested:
https://groups.google.com/d/msg/comp.arch.embedded/9d8I5FFbmX4/6yL33UR92F8J

Don Y suggested other improvements in the same thread.

David Brown

Apr 27, 2017, 4:01:14 AM

On 27/04/17 00:55, pozz wrote:
> I don't know if it is an issue specific to Atmel SAMC21 Cortex-M0+
> devices or not.
>
> I wrote a simple timer driver: a 32-bits hw counter clocked at 875kHz
> (14MHz/16) that triggers an interrupt on overflow (every 1h 21'). In
> the interrupt I increment the 32-bits global variable _ticks_high.
> The 64-bits number composed by _ticks_high (upper 32-bits) and the hw
> counter (lower 32-bits) is my system tick. This 64-bits software
> counter, incremented at 875kHz, will never overflow during my life, so
> it's good :-)
>
> In timer.h I have:
> --- Start of timer.h ---
> #include <stdint.h>
> #include <io.h> // Atmel specific
>
> #define TIMER_FREQ 875000
>
> typedef uint64_t Timer;
> extern uint32_t _ticks_high;

Identifiers that begin with an underscore are reserved for the
implementation at file scope - you are not supposed to use them for your
own "extern" variables. I would be extremely surprised to find that a
compiler treated them in any special way, but them's the rules (§7.1.3
in C11 N1570.)

>
> #define volatileAccess(v) *((volatile typeof((v)) *) &(v))

That looks familiar :-)

>
> static inline uint64_t ticks(void) {
> uint32_t h1 = volatileAccess(_ticks_high);
> TC0->COUNT32.CTRLBSET.reg =
> TC_CTRLBSET_CMD(TC_CTRLBSET_CMD_READSYNC_Val);
> uint32_t l1 = TC0->COUNT32.COUNT.reg;
> uint32_t h2 = volatileAccess(_ticks_high);
> if (h2 != h1) return ((uint64_t)h2 << 32) + 0;
> else return ((uint64_t)h1 << 32) + l1;
> }

(I don't know what the CTRLBSET stuff is for - I am not familiar with
Atmel's chips here.)

If the low part of the counter rolls over and the interrupt has not run
(maybe interrupts are disabled, or you are already in a higher priority
interrupt, or interrupts take a few cycles to work through the system)
then this will be wrong - l1 will have rolled over, but h2 will not show
an updated value yet.

So if the high parts don't match, you need to re-read the low part and
re-check the high parts. It is perhaps easiest to express in a loop:

static inline uint64_t ticks(void) {
    uint32_t h1 = volatileAccess(_ticks_high);

    while (true) {
        uint32_t l1 = TC0->COUNT32.COUNT.reg;
        uint32_t h2 = volatileAccess(_ticks_high);

        if (h1 == h2) return ((uint64_t) h2 << 32) | l1;
        h1 = h2;
    }
}

You can also reasonably note that this loop is not going to be re-run
more than once, unless you have interrupt functions that last for an
hour and a half, or something similarly bad - and then the failure of
the ticks() function is the least of your problems! Then you can write:

static inline uint64_t ticks(void) {
    uint32_t h1 = volatileAccess(_ticks_high);
    uint32_t l1 = TC0->COUNT32.COUNT.reg;
    uint32_t h2 = volatileAccess(_ticks_high);

    if (h1 != h2) l1 = TC0->COUNT32.COUNT.reg;
    return ((uint64_t) h2 << 32) | l1;
}

>
> static inline void TimerSet(Timer *tmr, uint64_t delay) {
> /* delay is in ms */
> *tmr = ticks() + delay * TIMER_FREQ / 1000;
> }
>
> static inline int TimerExpired(Timer *tmr) {
> return ticks() >= *tmr;
> }

When you want a boolean result, return "bool", not "int". (This is not
your problem, of course - it's just good habit.)

An alternative method here is to just use the low 32-bit part (with
delays limited to 2^31 ticks), but be sure that you have considered
wraparound. Keep everything in unsigned at first, to make overflows
work as modulo arithmetic:

typedef uint32_t Timer32;

static uint32_t ticks32(void) {
    return TC0->COUNT32.COUNT.reg;
}

static inline void TimerSet32(Timer32 *tmr, uint32_t delay) {
    /* delay is in ms; widen to 64 bits so delay * TIMER_FREQ cannot overflow */
    *tmr = ticks32() + (uint32_t)((uint64_t)delay * TIMER_FREQ / 1000);
}

static inline bool TimerExpired32(Timer32 *tmr) {
    int32_t d = ticks32() - *tmr;
    return (d >= 0);
}

Note that doing the subtraction as unsigned, then converting to signed
int, gives you the wraparound behaviour you need. Converting an
unsigned int to a signed int when the result is out of range is
implementation dependent (§6.3.1.3), but gcc and many other compilers
define it as modulo arithmetic:
<https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html>
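
A self-contained sketch of the wraparound comparison, with the counter value passed in as a parameter so it can be run on a PC. It assumes, as noted above, that converting an out-of-range unsigned value to int32_t wraps modulo 2^32 (true for gcc). The names deadline32(), expired32() and MAX_DELAY_MS are illustrative. With a uint32_t delay in ms, the product delay * TIMER_FREQ would exceed 32 bits for delays above about 4.9 s, so the sketch widens to 64 bits before multiplying.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TIMER_FREQ 875000u

/* Largest delay whose tick count stays below 2^31: about 40 minutes. */
#define MAX_DELAY_MS \
    ((uint32_t)(((UINT64_C(1) << 31) - 1) * 1000u / TIMER_FREQ))

/* Deadline = now + delay, computed modulo 2^32. */
static uint32_t deadline32(uint32_t now, uint32_t delay_ms) {
    assert(delay_ms <= MAX_DELAY_MS);
    /* Widen first: delay_ms * TIMER_FREQ does not fit in 32 bits. */
    uint32_t delay_ticks = (uint32_t)((uint64_t)delay_ms * TIMER_FREQ / 1000u);
    return now + delay_ticks;
}

/* True once "now" has reached or passed "deadline", even across a wrap. */
static bool expired32(uint32_t now, uint32_t deadline) {
    int32_t d = (int32_t)(now - deadline);  /* unsigned subtract, then reinterpret */
    return d >= 0;
}
```

The comparison only gives the right answer while the deadline is less than 2^31 ticks away from the current counter value, which is why the delay is capped.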

> --- End of timer.h ---
>
> In timer.c I have the ISR:
> --- Start of timer.c ---
> ...
> void TC0_Handler(void) {
> if (TC0->COUNT32.INTFLAG.reg & TC_INTFLAG_OVF) {
> ++_ticks_high;
> TC0->COUNT32.INTFLAG.reg = TC_INTFLAG_OVF;
> }
> }
> ...
> --- End of timer.c ---
>
> The idea is simple (and stolen from a post appeared on this newsgroup).

"Good artists copy, great artists steal", as someone famous once said.

It sounds like you are missing a tick interrupt - maybe you are somehow
blocking the overflow interrupt on occasion. Maybe the CTRLBSET line is
the problem (since I don't know what it does!) But it could be the
problem I noted above in ticks().

Diagnosing problems like this is a real pain - and demonstrating that
you have fixed them is even worse. The key is to find some way to speed
up the hardware timer so that you can provoke the problem regularly -
then you can be sure when you have fixed it. Are you able to make this
timer overflow at a lower bit count - say, 12 bits rather than 32 bits?
If not, then try this for your hardware interrupt function:

void TC0_Handler(void) {
    if (TC0->COUNT32.INTFLAG.reg & TC_INTFLAG_OVF) {
        ++_ticks_high;
        TC0->COUNT32.INTFLAG.reg = TC_INTFLAG_OVF;

        TC0->COUNT32.COUNT.reg = 0xfffff000;
    }
}

That will force overflows and _ticks_high counts to run much faster.
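
To get a feel for the speed-up, here is the arithmetic as a sketch: with the counter reloaded to 0xfffff000 after each overflow, only 0x1000 ticks remain until the next one. overflow_period_us() is an illustrative helper, and the 875 kHz rate is taken from the original post.

```c
#include <assert.h>
#include <stdint.h>

#define TIMER_FREQ 875000u

static_assert(TIMER_FREQ == 14000000u / 16u, "875 kHz is 14 MHz / 16");

/* Microseconds between overflows when the counter is reloaded to `preload`
   in the overflow interrupt (ignoring the few ticks that pass between the
   overflow and the reload). */
static uint64_t overflow_period_us(uint32_t preload) {
    uint64_t ticks_to_wrap = (UINT64_C(1) << 32) - preload;
    return ticks_to_wrap * 1000000u / TIMER_FREQ;
}
```

That is roughly one overflow every 4.7 ms instead of one every 1h 21', so a week of waiting shrinks to well under a second of run time per overflow.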

Dimiter_Popoff

Apr 27, 2017, 4:02:06 AM

Hoping that the compiler will know how to fix this is not a very
efficient approach.
It is just a few lines of assembly, no matter how inconvenient the ARM
assembly may be it will still save you weeks (months) of blind trial
and error.
You just need to _know_ how this is done, typing blindly this or that
is unlikely to ever work.

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/

David Brown

Apr 27, 2017, 4:13:40 AM

On 27/04/17 09:37, pozz wrote:
> Il 27/04/2017 02:17, John Speth ha scritto:
>> First you need to make ticks_high volatile. It doesn't make any sense
>> to me why you'd cast it volatile in one execution context but not the
>> other. If it's shared between two execution contexts, the compiler
>> needs to know that.

"volatile" does not mean that a variable is shared between two execution
contexts - it is neither necessary nor sufficient to make such sharing work.

To get shared accesses right, you need to be sure of what is accessed at
what times - it is the /accesses/ to the variable that need to be
volatile. Marking a variable "volatile" is simply a shortcut for saying
that /all/ accesses to it must be volatile accesses.
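
The difference between a volatile variable and a volatile access can be sketched as follows. VOLATILE_ACCESS is a variant of the volatileAccess() macro from timer.h, spelled with __typeof__ (a gcc/clang extension, assumed here); counter, counter_isr() and counter_read() are illustrative names.

```c
#include <assert.h>
#include <stdint.h>

/* Force one volatile access to an object that is not declared volatile. */
#define VOLATILE_ACCESS(v) (*(volatile __typeof__(v) *)&(v))

static uint32_t counter;   /* plain object: ordinary accesses may be optimised */

/* "ISR" side: an ordinary read-modify-write of the plain object. */
static void counter_isr(void) {
    ++counter;
}

/* Background side: every call performs a real load from memory. */
static uint32_t counter_read(void) {
    return VOLATILE_ACCESS(counter);
}
```

Declaring counter itself as volatile would have the same effect on counter_read(), and would additionally make the increment in counter_isr() a volatile read and write.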

>
> _ticks_high is accessed in TimerSet(), TimerExpired() and the ISR
> TC0_Handler(). Timerset() and TimerExpired() reads _ticks_high by
> calling ticks() function. So _ticks_high is accessed only in two
> points: ticks() and TC0_Handler().
>
> ticks() is called during normal background flow (not interrupt), so the
> access to _ticks_high must be volatile.
> TC0_Handler() is the ISR, so I think a volatile access to _ticks_high is
> not necessary (the ISR can't be interrupted).
>

Yes, the volatile accesses here are fine. Omitting the volatile for
_ticks_high does not give any benefits or disadvantages, because you are
only doing a single read and write of the variable anyway - there is
very little room for the compiler to optimise. The compiler can move
the non-volatile accesses to _ticks_high to after the clearing of the
overflow flag, maybe saving a cycle or two if it is a Cortex M7 that can
benefit from more sophisticated instruction scheduling. But otherwise,
it does not matter one way or the other.

Jack

Apr 27, 2017, 4:15:13 AM

Il giorno giovedì 27 aprile 2017 09:32:17 UTC+2, pozz ha scritto:

> As you can think, it's very difficult make a test and try, because I
> have to wait for the rare event that could happen after a week.

change the timer clock so it overflows faster than 1h 21'...

Bye Jack

David Brown

Apr 27, 2017, 4:27:41 AM

Looking over these again, I can see that they will have the same
potential problem. They are fine for when the high part of the counter
is also in hardware - but /not/ if it is dependent on interrupts which
could be delayed. If you have a situation like this:

_ticks_high is 1
timer reg is 0xffff'fff0
interrupts are disabled
timer reg rolls over, and is now 0x0000'0003
the ticks() function will return 0x0000'0001'0000'0003 instead of
0x0000'0002'0000'0003

You will need to check the hardware overflow flag to make this work.

static inline uint64_t ticks(void) {
    uint32_t h1 = volatileAccess(_ticks_high);
    uint32_t l1 = TC0->COUNT32.COUNT.reg;
    uint32_t h2 = volatileAccess(_ticks_high);

    if (h1 != h2) {
        // We just had an interrupt, so we know there is
        // no pending overflow
        l1 = TC0->COUNT32.COUNT.reg;
        // Or l1 = 0 for slightly faster code

        return ((uint64_t) h2 << 32) | l1;
    }

    if (TC0->COUNT32.INTFLAG.reg & TC_INTFLAG_OVF) {
        // There has been an overflow, and it was not handled
        l1 = TC0->COUNT32.COUNT.reg;
        return ((uint64_t) (h2 + 1) << 32) | l1;
    }
    // If there has been an overflow or an interrupt, it happened
    // after the first reads - so these are consistent and safe
    return ((uint64_t) h1 << 32) | l1;
}
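
The scenario listed above can be replayed on a PC with a mock of the timer state. This sketch is not the driver: MockTimer, ticks_naive() and ticks_flag_aware() are hypothetical names, and the mock is frozen (nothing changes between reads), which models the case where the overflow has happened but its interrupt has not yet run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Frozen snapshot of the state that ticks() would observe. */
typedef struct {
    uint32_t ticks_high;   /* software high word, updated only by the ISR */
    uint32_t count;        /* hardware counter value */
    bool     ovf_pending;  /* overflow flag set, ISR not yet run */
} MockTimer;

/* Combine high and low words without looking at the overflow flag. */
static uint64_t ticks_naive(const MockTimer *t) {
    assert(t != NULL);
    return ((uint64_t)t->ticks_high << 32) | t->count;
}

/* As above, but account for an overflow the ISR has not handled yet. */
static uint64_t ticks_flag_aware(const MockTimer *t) {
    assert(t != NULL);
    uint32_t high = t->ticks_high;
    if (t->ovf_pending) high += 1;
    return ((uint64_t)high << 32) | t->count;
}
```

On real hardware the flag and the counter are read at different instants, which is why the function above re-reads COUNT after it has seen the flag; the frozen mock sidesteps that.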

If possible, change to the 32 bit version :-) Failing that, if your
chip supports chaining of timers in hardware, use that.

Thanks for this thread, Pozz - it's a good question, makes people think,
and can hopefully help other developers.

David Brown

Apr 27, 2017, 5:06:54 AM

Why would you think it is easier to write this in assembly than C? The
issue is to be sure that the /algorithm/ is entirely safe regardless of
any ordering of interrupts, interrupt disables, timer overflows, etc.
Exactly the same issues are relevant in assembly.

Another poster suggested fiddling with optimisation levels, which is
equally odd here. There are enough "volatile" accesses here to ensure
that the order of accesses in the C code will be kept when the code is
compiled to assembly - writing it in assembly or changing optimisation
levels will not affect that. At most, it will change the timing -
making the problem appear more or less often. The problem is to ensure
that the ordering of actions specified by the programmer and the
algorithm is correct - not that it is being compiled as expected.

pozz

Apr 27, 2017, 5:39:01 AM

Il 27/04/2017 10:01, David Brown ha scritto:

>> #define volatileAccess(v) *((volatile typeof((v)) *) &(v))
>
> That looks familiar :-)

;-)

>> static inline uint64_t ticks(void) {
>> uint32_t h1 = volatileAccess(_ticks_high);
>> TC0->COUNT32.CTRLBSET.reg =
>> TC_CTRLBSET_CMD(TC_CTRLBSET_CMD_READSYNC_Val);
>> uint32_t l1 = TC0->COUNT32.COUNT.reg;
>> uint32_t h2 = volatileAccess(_ticks_high);
>> if (h2 != h1) return ((uint64_t)h2 << 32) + 0;
>> else return ((uint64_t)h1 << 32) + l1;
>> }
>
> (I don't know what the CTRLBSET stuff is for - I am not familiar with
> Atmel's chips here.)

Atmel says you *need* to write CTRLB register before reading COUNT
register from TC0 peripheral. I don't know why exactly, but if you
don't, you can't read the correct COUNT value.

> If the low part of the counter rolls over and the interrupt has not run
> (maybe interrupts are disabled, or you are already in a higher priority
> interrupt, or interrupts take a few cycles to work through the system)
> then this will be wrong - l1 will have rolled over, but h2 will not show
> an updated value yet.

Yes, you are right and I thought about this possibility... but IMHO it's
impossible.

ticks() is called only by TimerSet() and TimerExpired() and those two
functions are called only in normal background (not interrupt) code.
This means ticks() always runs at a lower priority than the TC0 ISR.
Moreover, I never disable interrupts in any part of the code.

I don't understand what you mean by "[...] or interrupts take a few
cycles to work through the system". Is it possible to read a rolled-over
hw counter value (0x00000003) and a not yet incremented _ticks_high?

uint32_t l1 = TC0->COUNT32.COUNT.reg;
uint32_t h2 = volatileAccess(_ticks_high);

COUNT register is read before _ticks_high second read. If COUNT has
rolled over and the rolled value is in l1, the interrupt was fired for
sure... IMHO. So h2 should contain the incremented value.

> So if the high parts don't match, you need to re-read the low part and
> re-check the high parts. It is perhaps easiest to express in a loop:
>
> static inline uint64_t ticks(void) {
> uint32_t h1 = volatileAccess(_ticks_high);
> while (true) {
> uint32_t l1 = TC0->COUNT32.COUNT.reg;
> uint32_t h2 = volatileAccess(_ticks_high);
> if (h1 == h2) return ((uint64_t) h2 << 32) | l1;
> h1 = h2;
> }
> }
>
> You can also reasonably note that this loop is not going to be re-run
> more than once, unless you have interrupt functions that last for an
> hour and a half, or something similarly bad - and then the failure of
> the ticks() function is the least of your problems! Then you can write:
>
> static inline uint64_t ticks(void) {
> uint32_t h1 = volatileAccess(_ticks_high);
> uint32_t l1 = TC0->COUNT32.COUNT.reg;
> uint32_t h2 = volatileAccess(_ticks_high);
> if (h1 != h2) l1 = TC0->COUNT32.COUNT.reg;
> return ((uint64_t) h2 << 32) | l1;
> }

This doesn't solve the problem you described.

_ticks_high = 1
90 <- h1 = 1
91
...
99
0 <- read l1
1
2 <- h2 = 1 (interrupt has not fired, your assumption)
3

ticks() could be {1,0}, which is completely wrong.

>> static inline void TimerSet(Timer *tmr, uint64_t delay) {
>> /* delay is in ms */
>> *tmr = ticks() + delay * TIMER_FREQ / 1000;
>> }
>>
>> static inline int TimerExpired(Timer *tmr) {
>> return ticks() >= *tmr;
>> }
>
> When you want a boolean result, return "bool", not "int". (This is not
> your problem, of course - it's just good habit.)
>
> An alternative method here is to just use the low 32-bit part (with
> delays limited to 2^31 ticks), but be sure that you have considered
> wraparound. Keep everything in unsigned at first, to make overflows
> work as modulo arithmetic:
>
> static uint32_t ticks32(void) {
> return TC0->COUNT32.COUNT.reg;
> }
>
> static inline void TimerSet32(Timer32 *tmr, uint32_t delay) {
> *tmr = ticks32() + delay * TIMER_FREQ / 1000;
> }
>
> static inline bool TimerExpired32(Timer32 *tmr) {
> int32_t d = ticks32() - *tmr;
> return (d >= 0);
> }
>
> Note that doing the subtraction as unsigned, then converting to signed
> int, gives you the wraparound behaviour you need. Converting an
> unsigned int to a signed int when the result is out of range is
> implementation dependent (§6.3.1.3), but gcc and many other compilers
> define it as modulo arithmetic:
> <https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html>

Yes, I know this method and I used it many times in the past. However it
has two drawbacks:
- the maximum delay is 2^31 ticks (in my case, around 40 minutes, but I
have delays of 1h)
- many times you need an additional "timer running/active" flag

The second point is more important. If you want to switch on a LED after
10 minutes when a button is pressed:

bool tmr_led_active;
Timer32 tmr_led;
void button_pressed_callback(void) {
    TimerSet32(&tmr_led, 10 * 60 * 1000);
    tmr_led_active = true;
}
void main_loop(void) {
    ...
    if (tmr_led_active && TimerExpired32(&tmr_led)) {
        switch_on_led();
        tmr_led_active = false;
    }
}

If you don't check the timer flag, the led will switch on at random times.
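
One way to keep the flag and the deadline together is to bundle them in a small struct. This is only a sketch: SoftTimer32 and its helpers are hypothetical names, and the current counter value is passed in as now so the code can run without the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t deadline;  /* tick value at which the timer fires */
    bool     active;    /* false: timer is stopped and never fires */
} SoftTimer32;

/* Arm the timer to fire delay_ticks after now (delay must be below 2^31). */
static void soft_timer_start(SoftTimer32 *t, uint32_t now, uint32_t delay_ticks) {
    assert(delay_ticks < UINT32_C(0x80000000));
    t->deadline = now + delay_ticks;   /* modulo 2^32 */
    t->active = true;
}

/* Returns true at most once per start, when the deadline has passed. */
static bool soft_timer_fired(SoftTimer32 *t, uint32_t now) {
    if (!t->active) return false;
    if ((int32_t)(now - t->deadline) < 0) return false;
    t->active = false;   /* one-shot: disarm so a later wrap cannot retrigger */
    return true;
}
```

The caller then polls soft_timer_fired() in the main loop and never has to touch the flag directly.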

Yes, I will try your ideas.

pozz

Apr 27, 2017, 5:53:03 AM

Il 27/04/2017 10:27, David Brown ha scritto:
>> [...]

>> static inline uint64_t ticks(void) {
>> uint32_t h1 = volatileAccess(_ticks_high);
>> uint32_t l1 = TC0->COUNT32.COUNT.reg;
>> uint32_t h2 = volatileAccess(_ticks_high);
>> if (h1 != h2) l1 = TC0->COUNT32.COUNT.reg;
>> return ((uint64_t) h2 << 32) | l1;
>> }
>>
>
> Looking over these again, I can see that they will have the same
> potential problem. They are fine for when the high part of the counter
> is also in hardware - but /not/ if it is dependent on interrupts which
> could be delayed. If you have a situation like this:
>
> _ticks_high is 1
> timer reg is 0xffff'fff0
> interrupts are disabled
> timer reg rolls over, and is now 0x0000'0003
> the ticks() function will return 0x0000'0001'0000'0003 instead of
> 0x0000'0002'0000'0003
>
> You will need to check the hardware overflow flag to make this work.

Yes, but this situation is impossible in my case. Who could disable
interrupts here?

>
> static inline uint64_t ticks(void) {
> uint32_t h1 = volatileAccess(_ticks_high);
> uint32_t l1 = TC0->COUNT32.COUNT.reg;
> uint32_t h2 = volatileAccess(_ticks_high);
> if (h1 != h2) {
> // We just had an interrupt, so we know there is
> // no pending overflow
> l1 = TC0->COUNT32.COUNT.reg;
> // Or l1 = 0 for slightly faster code
> return ((uint64_t) h2 << 32) | l1;
> }
> if (TC0->COUNT32.INTFLAG.reg & TC_INTFLAG_OVF) {
> // There has been an overflow, and it was not handled
> l1 = TC0->COUNT32.COUNT.reg;
> return ((uint64_t) (h2 + 1) << 32) | l1;
> }
> // If there has been an overflow or an interrupt, it happened
> // after the first reads - so these are consistent and safe
> return ((uint64_t) h1 << 32) | l1;
> }
>
> If possible, change to the 32 bit version :-) Failing that, if your
> chip supports chaining of timers in hardware, use that.

Atmel SAM TCx peripherals are 16-bit counters/timers, but they can be
chained in pairs to form a 32-bit counter/timer. I already paired
TC0 with TC1 to get a 32-bit hw counter. I can't chain TC0/TC1 with
TC2/TC3 to get a hardware 64-bit counter/timer.

> Thanks for this thread, Pozz - it's a good question, makes people think,
> and can hopefully help other developers.

Yes, those problems are fascinating :-)

David Brown

Apr 27, 2017, 6:54:06 AM

You are sure you are not calling ticks() from within another interrupt
function?

>
> ticks() are called only by TimerSet() and TimerExpired() and those two
> functions are called only in normal background (not interrupt) code.
> This means ticks() always runs in a lower priority than TC0 ISR.
> Moreover, I never disable interrupts in any part of the code.
>
> I don't understand what do you mean with "[...] or interrupts take a few
> cycles to work through the system". Is it possible to read a rolled-over
> hw counter value (0x00000003) and a not incremented _ticks_high?

There is always a delay between an overflow in the timer, and the actual
interrupt function being started. Depending on details of the chip, it
is possible there will be several cpu cycles of the current instruction
stream executed before the interrupt function is run. If I remember
rightly, the M3/M4 has a 3-stage pipeline (and the M0+ a 2-stage one).
When the interrupt is registered by the core, it effectively means that
a "jump to interrupt vector" instruction is squeezed into the
instruction stream - but instructions already in flight may still
complete before the interrupt call takes effect. You will also have a
cycle or two delay in the NVIC, and perhaps a cycle or two delay getting
the signal out of the timer block (especially if the timer does not run
at full core speed). Cortex M interrupts are handled quickly - but not
immediately.

Yes - hence my follow-up post.

In such cases, you rarely need the same level of accuracy. Use your
accurate ticks32() counter to let you track a seconds counter in the
main loop of your code - this can be read freely without worrying about
synchronisation.
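
A sketch of that idea, assuming the main loop runs much more often than once per 32-bit wrap of the counter. SecondsClock and seconds_update() are hypothetical names, and now stands for the value returned by ticks32().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TIMER_FREQ 875000u

typedef struct {
    uint32_t last;      /* tick value up to which whole seconds are accounted */
    uint32_t seconds;   /* seconds elapsed; only touched by the main loop */
} SecondsClock;

/* Call from the main loop with the current 32-bit tick value. */
static void seconds_update(SecondsClock *c, uint32_t now) {
    assert(c != NULL);
    /* Unsigned subtraction handles wraparound of the hardware counter. */
    while ((uint32_t)(now - c->last) >= TIMER_FREQ) {
        c->last += TIMER_FREQ;
        c->seconds++;
    }
}
```

Because the seconds field is only written and read in the main loop, long delays such as one hour can be compared against it without any interrupt-safety concerns.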

Another method is to make the timer overflow interrupt occur more often
- say, every millisecond. This increments a single millisecond counter
(either 32-bit, or 64-bit, or split lo/hi 32-bit - whatever suits).
Then reading that is easy, because the change to this counter is always
done in the same context as an atomic operation - there is no
complication of independent updates of the low and high halves. And
most of the time, it is enough to just use the low 32-bit part.
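
A sketch of this variant, assuming a timer interrupt that fires once per millisecond. ms_tick_isr(), millis() and elapsed_ms() are hypothetical names. On a 32-bit Cortex-M an aligned 32-bit load is a single access, so the reader cannot observe a half-updated value.

```c
#include <assert.h>
#include <stdint.h>

static volatile uint32_t ms_counter;   /* written only by the ISR */

/* To be called from the 1 kHz timer interrupt. */
static void ms_tick_isr(void) {
    ms_counter = ms_counter + 1u;
}

/* Current time in milliseconds: one aligned 32-bit load. */
static uint32_t millis(void) {
    return ms_counter;
}

/* Milliseconds since `start`, correct across wraparound of the counter. */
static uint32_t elapsed_ms(uint32_t start) {
    return millis() - start;
}
```

A 32-bit millisecond counter wraps after about 49.7 days, so elapsed_ms() is only meaningful for intervals shorter than that.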

> - many times you need an additional "timer running/active" flag
>
> The second point is more important. If you want to switch on a LED after
> 10 minutes when a button is pressed:
>
> bool tmr_led_active;
> Timer32 tmr_led;
> void button_pressed_callback(void) {
> TimerSet32(&tmr_led, 10 * 60 * 1000);
> tmr_led_active = true;
> }
> void main_loop(void) {
> ...
> if (tmr_led_active && TimerExpired32(&tmr_led)) {
> switch_on_led();
> tmr_led_active = false;
> }
> }
>
> If you don't check the timer flag, the led will switch on at random times.

So you have to add a flag - big deal. You are not using any more
memory, your code is smaller, and it is simpler to be sure that
everything is correct.

<snip>

>
> Yes, I will try your ideas.

Good luck - it is not an easy task. Don't forget to let us know the
source of the problem, and the solution!

Dimiter_Popoff

Apr 27, 2017, 7:20:03 AM

On 27.4.2017 г. 12:06, David Brown wrote:
> On 27/04/17 10:02, Dimiter_Popoff wrote:
>> On 27.4.2017 г. 01:55, pozz wrote:
>
>>> I suspect there's a problem in my low-level driver and sometimes, maybe
>>> near the overflow, the code doesn't work as I expect.
>>> Maybe the TimerSet() function sometimes sets a wrong value to the
>>> uint64_t timer, maybe a tick value that will happen only at the next
>>> overflow of the hw counter.
>>>
>>> Do you see where is the problem?
>>>
>>
>> Hoping that the compiler will know how to fix this is not a very
>> efficient approach.
>> It is just a few lines of assembly, no matter how inconvenient the ARM
>> assembly may be it will still save you weeks (months) of blind trial
>> and error.
>> You just need to _know_ how this is done, typing blindly this or that
>> is unlikely to ever work.
>>
>
> Why would you think it is easier to write this in assembly than C? The
> issue is to be sure that the /algorithm/ is entirely safe regardless of
> any ordering of interrupts, interrupt disables, timer overflows, etc.
> Exactly the same issues are relevant in assembly.

Perhaps not "easier", just taking a few minutes rather than a few
months.
Doing an lwarx/stwcx. like thing on a timer in a high level language
is simply a waste of someone's time - but then someone might find
it easier to waste time than to write the < 10 lines, keep on
wondering why today's compiler output does not work while yesterday's
did, what is actually happening etc.
The obvious thing to do here is:

-lock the 64 bit entity (test a flag and set it, if found set
try again),
-read the 64 bit counter lwarx/stwcx.-ish,
-unlock the timer.
- ensure this is done more frequently than the lower 32 bits
overflow.

And you want to write this in C (or whatever)?! Well, it is your time.

>
> Another poster suggested fiddling with optimisation levels, which is
> equally odd here. There are enough "volatile" accesses here to ensure
> that the order of accesses in the C code will be kept when the code is
> compiled to assembly - writing it in assembly or changing optimisation
> levels will not affect that.

Well good, if your C compiler behaves in such a predictable way go
ahead. And don't come crying a year later when you discover it did
so indeed - almost :-).

The mere fact that I see so _many_ and _lengthy_ posts lately, all
related to wrestling the compiler into doing what one wants, this
often taking _months_ of people's time instead of a few minutes,
speaks for itself.

pozz

Apr 27, 2017, 7:52:00 AM

Il 27/04/2017 12:54, David Brown ha scritto:

> You are sure you are not calling ticks() from within another interrupt
> function?

Yes, sure. I am very careful when writing interrupt code, even if that
wasn't sufficient in this case.

>> ticks() are called only by TimerSet() and TimerExpired() and those two
>> functions are called only in normal background (not interrupt) code.
>> This means ticks() always runs in a lower priority than TC0 ISR.
>> Moreover, I never disable interrupts in any part of the code.
>>
>> I don't understand what do you mean with "[...] or interrupts take a few
>> cycles to work through the system". Is it possible to read a rolled-over
>> hw counter value (0x00000003) and a not incremented _ticks_high?
>
> There is always a delay between an overflow in the timer, and the actual
> interrupt function being started. Depending on details of the chip, it
> is possible there will be several cpu cycles of the current instruction
> stream executed before the interrupt function is run. If I remember
> rightly, the M3/M4 has a 5 stage pipeline. When the interrupt is
> registered by the core, it effectively means that a "jump to interrupt
> vector" instruction is squeezed into the instruction stream - but these
> current 5 instructions must be completed before the interrupt call takes
> effect. You will also have a cycle or two delay in the NVIC, and
> perhaps a cycle or two delay getting the signal out of the timer block
> (especially if the timer does not run at full core speed). Cortex M
> interrupts are handled quickly - but not immediately.

Really? So, the core is able to read the register of a peripheral
(TC0->COUNT32.COUNT register, in my case) with a *new* (i.e., rolled)
value, but the ISR hasn't run yet? In other words, a bus access can be
done (TC0 is on the bus) while an interrupt request is pending?

If this is the case, it is a mess :-(

>> - many times you need an additional "timer running/active" flag
>>
>> The second point is more important. If you want to switch on a LED after
>> 10 minutes when a button is pressed:
>>
>> bool tmr_led_active;
>> Timer32 tmr_led;
>> void button_pressed_callback(void) {
>> TimerSet32(&tmr_led, 10 * 60 * 1000);
>> tmr_led_active = true;
>> }
>> void main_loop(void) {
>> ...
>> if (tmr_led_active && TimerExpired32(&tmr_led)) {
>> switch_on_led();
>> tmr_led_active = false;
>> }
>> }
>>
>> If you don't check the timer flag, the led will switch on at random times.
>
> So you have to add a flag - big deal. You are not using any more
> memory, your code is smaller, and it is simpler to be sure that
> everything is correct.

Yes, of course. My point here is that you have to **remember** that the
timers you are using can roll-over at any time in the future, so they
can change from "not expired" to "expired".

After calling TimerSet32() and seeing the timer expire, you might expect
it to stay "expired" until you arm it again with TimerSet32(). This is
not true, because TimerExpired32() could return "false" again at some
time in the future.

This is not a problem with timers that are repetitive (armed again as
soon as they expire), but with one-shot timers.
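
To make the roll-over effect concrete, here is a minimal sketch. It assumes
that the 32-bit "expired" check is done with a wrap-safe signed difference;
the real TimerSet32()/TimerExpired32() are not shown in this thread, so
timer32_set() and timer32_expired() are hypothetical names, and the current
tick is passed in as a parameter so the code can run on a PC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Timer32;

/* Hypothetical helper: arm the timer 'delay' ticks after 'now'. */
static inline void timer32_set(Timer32 *tmr, uint32_t now, uint32_t delay) {
    assert(delay <= 0x7FFFFFFFu);  /* longer delays are ambiguous after a wrap */
    *tmr = now + delay;            /* unsigned arithmetic wraps modulo 2^32 */
}

/* Hypothetical helper: wrap-safe "deadline reached" check.
   The cast to int32_t relies on the usual two's complement conversion. */
static inline bool timer32_expired(const Timer32 *tmr, uint32_t now) {
    return (int32_t)(now - *tmr) >= 0;
}
```

With a deadline of 1000, the check is true from now = 1000 up to
now = 1000 + 0x7FFFFFFF, and false again from now = 1000 + 0x80000000. A
one-shot timer that is never re-armed therefore reads as "not expired" again
half a counter period later.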

>> Yes, I will try your ideas.
>
> Good luck - it is not an easy task. Don't forget to let us know the
> source of the problem, and the solution!

I configured TC0 in 16-bit mode, so now TC0_Handler() fires every
75ms. I changed ticks() accordingly, to build the 64-bit value by
shifting _ticks_high by 16 bits:

if (h2 != h1) return ((uint64_t)h2 << 16) + 0;
else return ((uint64_t)h1 << 16) + l1;

I was lucky because I can now reproduce the problem more often... and
incredibly the problem is the opposite of what we suspected.

Remember my state-machine in bus.c:

switch(bus.state) {
    case BUS_IDLE:
        if (TimerExpired(&bus.tmr_answer)) {
            /* Send new request on the bus */
            ...
            TimerSet(&bus.tmr_answer, timeout_answer);
            bus.state = BUS_WAITING_ANSWER;
        }
        break;

    case BUS_WAITING_ANSWER:
        if (TimerExpired(&bus.tmr_answer)) {
            /* No reply */
            bus.state = BUS_IDLE;
            TimerSet(&bus.tmr_answer, 0);
        } else {
            if (reply_received() == true) {
                /* Analyze the reply */
                bus.state = BUS_IDLE;
                TimerSet(&bus.tmr_answer, 0);   /* [*] */
            }
        }
        break;
}

When the problem occurs (the bus blocks), bus.state is BUS_IDLE. The
problem is with instruction [*]. That instruction should arm the
bus.tmr_answer timer so that it expires immediately and a new request
is sent on the bus (in the future, it will be simple to introduce a
delay between the reply and the next request).

Sometimes the timer doesn't expire immediately, but only after the hw
counter rolls over again. Indeed, I see the bus blocked for 75ms.

We were thinking that _ticks_high hasn't been incremented yet in ticks()
when the hw counter value is read. But this would have produced an old,
already expired time, not a future, not-yet-expired time. Here the
problem is a wrong time in the *future*.
In other words, when the problem occurs, ticks() reads a new, correctly
incremented value for _ticks_high, but an old, not-yet-rolled hw counter
value.
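
The arithmetic behind this can be checked on a PC. The sketch below assumes
the 16-bit test configuration (high part shifted by 16 bits) and uses
made-up sample values; compose16(), stale_low_error() and ticks_to_ms() are
hypothetical helpers for illustration, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

#define TIMER_FREQ 875000u  /* Hz, as in timer.h */

/* Hypothetical helper: combine the software high part with the 16-bit
   hw counter value. */
static inline uint64_t compose16(uint32_t high, uint32_t low) {
    assert(low <= 0xFFFFu);            /* low must fit the 16-bit counter */
    return ((uint64_t)high << 16) + low;
}

/* Error, in ticks, when a stale (pre-rollover) low value is paired with
   the already incremented high value. */
static inline uint64_t stale_low_error(uint32_t new_high,
                                       uint32_t stale_low,
                                       uint32_t true_low) {
    return compose16(new_high, stale_low) - compose16(new_high, true_low);
}

/* Convert ticks to whole milliseconds. */
static inline uint64_t ticks_to_ms(uint64_t t) {
    return t * 1000u / TIMER_FREQ;
}
```

For example, with the high part already at 5, a stale low value of 0xFFF0
instead of the true 0x0003 gives a timestamp 65517 ticks (just under 75ms)
in the future, which matches the stall seen on the bus.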

Why is this? I think it's the usual problem with register synchronization
in Atmel SAM devices. Before reading or writing certain registers, you
need to check whether the peripheral is still syncing. I don't know
exactly what this means, but it relates to the presence of different
asynchronous clocks. But I use only one reference clock (an external
crystal) that is routed to the Cortex-M core and all peripherals... so I
thought synchronization wasn't necessary.

In this case, it seems synchronization solves my problem, so my ticks()
function is now:

static inline uint64_t ticks(void) {
    uint32_t h1 = volatileAccess(_ticks_high);
    TC0->COUNT32.CTRLBSET.reg =
        TC_CTRLBSET_CMD(TC_CTRLBSET_CMD_READSYNC_Val);
    /* Wait until the READSYNC command has been synchronized */
    while (TC0->COUNT32.SYNCBUSY.reg) {
    }
    uint32_t l1 = TC0->COUNT32.COUNT.reg;
    uint32_t h2 = volatileAccess(_ticks_high);
    if (h2 != h1) return ((uint64_t)h2 << 32) + 0;
    else return ((uint64_t)h1 << 32) + l1;
}

The datasheet says you need to write the CMD bits of the TC0->CTRLB
register with a known value (TC_CTRLBSET_CMD_READSYNC_Val) before reading
the COUNT register. Why? I don't know. TC0->CTRLB is a
**Write-Synchronized** register, i.e. the value you write only takes
effect after the sync time. Maybe the sync loop I added waits for the
time the CTRLB command needs to execute... otherwise the COUNT value you
read immediately after could be wrong (IMHO!)

After this... I think my code is still affected by the original problem
if, as you explained, the TC0 interrupt is delayed while I'm reading the
hw counter in ticks(). I don't have the expertise to know whether this
can really happen, so it's better to find another method.

I liked the idea to use a 64-bits counter for ticks that will never
roll-over during the entire lifetime of the device.

Simon Clubley

Apr 27, 2017, 8:06:59 AM

On 2017-04-27, David Brown <david...@hesbynett.no> wrote:
>
> Another poster suggested fiddling with optimisation levels, which is
> equally odd here.

Not really. I suggested changing the optimisation level because if it
works reliably with lower optimisation levels then that strongly implies
an issue within the code itself instead of in the hardware.

David Brown

Apr 27, 2017, 9:39:47 AM

I cannot say for sure that this can happen. But unless I can say for
sure that it /cannot/ happen, I prefer to assume the worst is possible.

> If this is the case, it is a mess :-(
>
>
>>> - many times you need an additional "timer running/active" flag
>>>
>>> The second point is more important. If you want to switch on a LED after
>>> 10 minutes when a button is pressed:
>>>
>>> bool tmr_led_active;
>>> Timer32 tmr_led;
>>> void button_pressed_callback(void) {
>>> TimerSet32(&tmr_led, 10 * 60 * 1000);
>>> tmr_led_active = true;
>>> }
>>> void main_loop(void) {
>>> ...
>>> if (tmr_led_active && TimerExpired32(&tmr_led)) {
>>> switch_on_led();
>>> tmr_led_active = false;
>>> }
>>> }
>>>
>>> If you don't check the timer flag, the led will switch on at random
>>> times.
>>
>> So you have to add a flag - big deal. You are not using any more
>> memory, your code is smaller, and it is simpler to be sure that
>> everything is correct.
>
> Yes, of course. My point here is that you have to **remember** that the
> timers you are using can roll-over at any time in the future, so they
> can change from "not expired" to "expired".

True. But perhaps that can be baked into your "Set" and "Expired"
functions, or otherwise made "unforgettable". The aim is to simplify
the code that /may/ have rare race conditions into something that cannot
possibly have such problems - even if it means other code is bigger or
less efficient.
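
As a rough sketch of what "baked in" could look like, assuming a 32-bit tick
and a wrap-safe comparison: the "active" flag lives inside the timer object,
so the caller cannot forget it. OneShot32, oneshot_set() and oneshot_fired()
are hypothetical names, and the current tick is passed in so the code can be
tried on a PC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical one-shot timer: deadline plus an "armed" flag. */
typedef struct {
    uint32_t deadline;
    bool     armed;
} OneShot32;

static inline void oneshot_set(OneShot32 *t, uint32_t now, uint32_t delay) {
    assert(delay <= 0x7FFFFFFFu);  /* longer delays are ambiguous after a wrap */
    t->deadline = now + delay;
    t->armed = true;
}

/* Returns true exactly once per oneshot_set(), on the first poll at or
   after the deadline. An unarmed timer never fires. */
static inline bool oneshot_fired(OneShot32 *t, uint32_t now) {
    if (t->armed && (int32_t)(now - t->deadline) >= 0) {
        t->armed = false;
        return true;
    }
    return false;
}
```

The remaining requirement is that an armed timer is polled at least once
within 2^31 ticks after its deadline; in a main loop that is normally easy
to guarantee.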

If you can make your hardware timer function run every millisecond (or
whatever accuracy you need), then use this:

#include <stdbool.h>
#include <stdint.h>

extern volatile uint64_t tickCounter_;

/* Re-read until two consecutive reads agree, so a read torn by the
   timer interrupt is never returned. */
static inline uint64_t ticks(void) {
    uint64_t a = tickCounter_;
    while (true) {
        uint64_t b = tickCounter_;
        if (a == b) return a;
        a = b;
    }
}

void TC0_Handler(void) {
    if (TC0->COUNT32.INTFLAG.reg & TC_INTFLAG_OVF) {
        tickCounter_++;
        TC0->COUNT32.INTFLAG.reg = TC_INTFLAG_OVF;
    }
}

If ticks never accesses the timer hardware register, there cannot be a
problem with synchronisation. There is no need to use the timer
peripheral's complicated synchronisation and locking mechanism, nor any
concern about interrupt delays. Re-reading the 64-bit value until you
have two identical reads is sufficient to ensure that you have a
consistent value even if a timer interrupt occurs in the middle of the
64-bit read.

David Brown

Apr 27, 2017, 10:16:32 AM

On 27/04/17 13:20, Dimiter_Popoff wrote:
> On 27.4.2017 г. 12:06, David Brown wrote:
>> On 27/04/17 10:02, Dimiter_Popoff wrote:
>>> On 27.4.2017 г. 01:55, pozz wrote:
>>
>>>> I suspect there's a problem in my low-level driver and sometimes, maybe
>>>> near the overflow, the code doesn't work as I expect.
>>>> Maybe the TimerSet() function sometimes sets a wrong value to the
>>>> uint64_t timer, maybe a tick value that will happen only at the next
>>>> overflow of the hw counter.
>>>>
>>>> Do you see where is the problem?
>>>>
>>>
>>> Hoping that the compiler will know how to fix this is not a very
>>> efficient approach.
>>> It is just a few lines of assembly, no matter how inconvenient the ARM
>>> assembly may be it will still save you weeks (months) of blind trial
>>> and error.
>>> You just need to _know_ how this is done, typing blindly this or that
>>> is unlikely to ever work.
>>>
>>
>> Why would you think it is easier to write this in assembly than C? The
>> issue is to be sure that the /algorithm/ is entirely safe regardless of
>> any ordering of interrupts, interrupt disables, timer overflows, etc.
>> Exactly the same issues are relevant in assembly.
>
> Perhaps not "easier", just taking a few minutes rather than a few
> months.

You don't think you are exaggerating just a /little/ ?

> Doing an lwarx/stwcx. like thing on a timer in a high level language
> is simply a waste of someone's time - but then someone might find
> it easier to waste time than to write the < 10 lines, keep on
> wondering why todays compiler output does not work while yesterdays
> did, what is actually happening etc.
> The obvious thing to do here is:
>
> -lock the 64 bit entity (test a flag and set it, if found set
> try again),
> -read the 64 bit counter lwarx/stwcx.-ish,
> -unlock the timer.
> - ensure this is done more frequently than the lower 32 bits
> overflow.
>
> And you want to write this in C (or whatever)?! Well, it is your time.

The problem has nothing to do with making an atomic 64 bit read of data
- that is /easy/ in C. The problem is the locking and synchronisation
system for reading this timer peripheral in this particular
microcontroller, which is inconvenient no matter what language is used.
The other considerations are synchronisation between updates of the
data in the interrupt function, the calling function, and the hardware
timer - again, language independent.

A lwarx/stwcx can be useful for doing read/write/modify operations on an
object - it is not necessary for atomically reading data which could be
changed underway. The simplest way to handle that, in C /or/ in
assembly, is to read it more than once:

volatile uint64_t count;
uint64_t readCount(void) {
    uint64_t a = count;
    while (true) {
        uint64_t b = count;
        if (a == b) return a;
        a = b;
    }
}

Alternatively, using C11, you handle it like this:

_Atomic uint64_t a_count;
uint64_t readCount(void) {
    return a_count;
}

I appreciate that you are very familiar with assembly and may write it
faster than C, but the above function did not take me a month to figure out.

Using lwarx/stwcx (or ldrex/strex on ARM) will not be shorter or faster,
and it will be more complex. Basically, you need to use these to create
a lock for the access to the non-atomic type. They are always harder to
use than something like C11's atomic access support, regardless of the
language. And if you want to program in C, and don't have C11 atomics
(or other help from your compiler), you can easily write inline assembly
wrappers for those two instructions and write everything else in C.
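
For reference, a slightly fuller sketch of the C11 variant using
<stdatomic.h>. The function names are hypothetical. On a core without
64-bit atomic instructions, such as the Cortex-M0+, the toolchain may need
library support (or a lock) to implement these operations; treat that as an
assumption to check for your compiler.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint64_t a_count;   /* shared 64-bit tick counter */

/* Would be called from the timer ISR (hypothetical name). */
static inline void count_tick(void) {
    atomic_fetch_add(&a_count, 1);
}

/* Atomic 64-bit read. 'previous' is the caller's last reading and is
   only used for a debug-time monotonicity check. */
static inline uint64_t read_count_after(uint64_t previous) {
    uint64_t now = atomic_load(&a_count);
    assert(now >= previous);   /* a tick counter must never go backwards */
    return now;
}
```

The reader never sees a half-updated value, and there is no retry loop to
get wrong.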

>
>>
>> Another poster suggested fiddling with optimisation levels, which is
>> equally odd here. There are enough "volatile" accesses here to ensure
>> that the order of accesses in the C code will be kept when the code is
>> compiled to assembly - writing it in assembly or changing optimisation
>> levels will not affect that.
>
> Well good, if your C compiler behaves in such a predictable way go
> ahead. And don't come crying a year later when you discover it did
> so indeed - almost :-).

My C compiler(s) behave in a predictable manner - because I know how C
works. People who don't know all the fine details can get it wrong, but
that applies to any language.

>
> The mere fact that I see so _many_ and _lengthy_ posts lately all
> related to wrestling the compiler to do what one wants it to do, this
> taking often _months_ of peoples time instead of a few minutes
> speaks for itself.
>

I follow most threads in this newsgroup, but I can't figure out what you
are referring to here. There are occasional posts where people make
mistakes due to misunderstandings about how C works and how their
compiler works - but they are not /that/ common. And by what possible
justification could you claim that writing the code in assembly would
result in fewer bugs or shorter development times? You might reasonably
say that the programmer would not make those particular mistakes - but
they could easily make many, many more errors in their assembly code
that they would not have made with C.

David Brown

Apr 27, 2017, 10:18:48 AM

On 27/04/17 14:03, Simon Clubley wrote:
> On 2017-04-27, David Brown <david...@hesbynett.no> wrote:
>>
>> Another poster suggested fiddling with optimisation levels, which is
>> equally odd here.
>
> Not really. I suggested changing the optimisation level because if it
> works reliably with lower optimisation levels then that strongly implies
> an issue within the code itself instead of in the hardware.
>

As in, "it is worth giving it a try to see if it affects behaviour, and
therefore gives a clue to help debugging" ? Yes, I can agree with that.
Once the OP has made changes that speed up the error rate to something
workable, that kind of thing can be worth trying.

John Speth

Apr 27, 2017, 1:24:05 PM

>>> First you need to make ticks_high volatile. It doesn't make any sense
>>> to me why you'd cast it volatile in one execution context but not the
>>> other. If it's shared between two execution contexts, the compiler
>>> needs to know that.
>
> "volatile" does not mean that a variable is shared between two execution
> contexts - it is neither necessary nor sufficient to make such sharing work.

So true, David. I noticed that a variable shared between two
asynchronous execution contexts can benefit from a volatile declaration. I
would instinctively declare it volatile because it's the right thing to
do in this case. It won't solve the OP's problem though.

JJS

David Brown

Apr 27, 2017, 4:04:53 PM

You say you agree with me - then it looks like you completely /disagree/.

"Instinctively declaring it volatile" is the /wrong/ thing to do when
you share a variable between two contexts. It is wrong, because it is
often not needed, but hinders optimisation. It is wrong, because it is
often not enough to make it volatile. And it is wrong, because
"instinctively" suggests you make it volatile without thought, rather
than properly considering the situation.

It is certainly the case that making a shared variable volatile can
often be part of the solution - but no more than that.

pozz

Apr 28, 2017, 3:24:21 AM

Il 27/04/2017 15:39, David Brown ha scritto:
[...]

> I cannot say for sure that this can happen. But unless I can say for
> sure that it /cannot/ happen, I prefer to assume the worst is possible.

Yes, you're right. One of my colleagues would have said: "put on metal
underwear, just to be sure" :-)

[...]

>> Yes, of course. My point here is that you have to **remember** that the
>> timers you are using can roll-over at any time in the future, so they
>> can change from "not expired" to "expired".
>
> True. But perhaps that can be baked into your "Set" and "Expired"
> functions, or otherwise made "unforgettable". The aim is to simplify
> the code that /may/ have rare race conditions into something that cannot
> possibly have such problems - even if it means other code is bigger or
> less efficient.

Yes, I know. Indeed, I will abandon my first approach of combining the hw
counter with a sw counter that is incremented in the ISR.
It is a technique I learned from this newsgroup... it's a pity the
original author isn't reading (I remember Don Y added some personal
ideas to this approach, and he was reading this ng in recent days).

[...]

>> I liked the idea to use a 64-bits counter for ticks that will never
>> roll-over during the entire lifetime of the device.
>
> If you can make your hardware timer function run every millisecond (or
> whatever accuracy you need), then use this:
>
> extern volatile uint64_t tickCounter_;
>
> static inline uint64_t ticks(void) {
> uint64_t a = tickCounter_;
> while (true) {
> uint64_t b = tickCounter_;
> if (a == b) return a;
> a = b;
> }
> }
>
> void TC0_Handler(void) {
> if (TC0->COUNT32.INTFLAG.reg & TC_INTFLAG_OVF) {
> tickCounter_++;
> TC0->COUNT32.INTFLAG.reg = TC_INTFLAG_OVF;
> }
> }
>
> If ticks never accesses the timer hardware register, there cannot be a
> problem with synchronisation. There is no need to use the timer
> peripherals complicated synchronisation and locking mechanism, nor any
> concern about interrupt delays. Re-reading the 64-bit value until you
> have two identical reads is sufficient to ensure that you have a
> consistent value even if a timer interrupt occurs in the middle of the
> 64-bit read.

Yes, it is a solution. There's a small drawback: you have a frequent
interrupt (1ms).

Maybe there's another way to fix the first approach. The problem
was that the hw counter can roll over and the "rolled" value can be read
while the sw counter (my _ticks_high) is still at the "old" (not
incremented) value.
The idea is to configure the timer to stop when it reaches the TOP value
0xFFFFFFFF (one-shot timer). It can be restarted in the ISR, together
with incrementing _ticks_high.

This has a small drawback too. The hw counter is the clock
of the machine. When it reaches the TOP value, it stops for a short
time, so the system time appears frozen for that short time.
However, this happens only every 2^32 / Counter_Freq seconds (in my case,
every 1h and 21').

From this story, I learned another important thing. Why did I miss
this bug? Because it could appear only every 1h and 21'. It /could/
appear, because it is random, so it might appear only after 1000 times
1h21' (i.e. after 2 months!!!!)

In the future I will avoid using such long periods. In my case, I don't
really need the full 64 bits. If I use a smaller 16-bit hw counter and
the full 32-bit sw counter, I will have a 48-bit system tick (in my
case, a period of about 10 years).
In this case, a potential bug is tied to the shorter period of the
16-bit hw counter, only 75ms. There is a much greater chance of seeing
the problem in my lab during testing and not in the user's hands.
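
The periods quoted above follow directly from the 875kHz tick rate. A small
sketch to check them on a PC; wrap_period_ms() is a hypothetical helper
written only for this calculation.

```c
#include <assert.h>
#include <stdint.h>

#define TIMER_FREQ 875000u  /* Hz: 14MHz / 16 */

/* Whole milliseconds until a free-running counter of 'bits' width wraps. */
static inline uint64_t wrap_period_ms(unsigned bits) {
    assert(bits >= 1 && bits <= 48);  /* keeps the *1000 below within 64 bits */
    return ((uint64_t)1 << bits) * 1000u / TIMER_FREQ;
}
```

wrap_period_ms(16) gives 74 (just under 75ms), wrap_period_ms(32) gives
4908534 (about 1h 21' 48"), and wrap_period_ms(48) corresponds to 3723
days, a little over 10 years.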

David Brown

Apr 28, 2017, 5:05:22 AM

On 28/04/17 09:24, pozz wrote:
> Il 27/04/2017 15:39, David Brown ha scritto:
> [...]
>> I cannot say for sure that this can happen. But unless I can say for
>> sure that it /cannot/ happen, I prefer to assume the worst is possible.
>
> Yes, you're right. One of my colleague would have said: "put on metal
> underwear, just to be sure" :-)
>
> [...]
>>> Yes, of course. My point here is that you have to **remember** that the
>>> timers you are using can roll-over at any time in the future, so they
>>> can change from "not expired" to "expired".
>>
>> True. But perhaps that can be baked into your "Set" and "Expired"
>> functions, or otherwise made "unforgettable". The aim is to simplify
>> the code that /may/ have rare race conditions into something that cannot
>> possibly have such problems - even if it means other code is bigger or
>> less efficient.
>
> Yes, I know. Indeed I will abandon my first approach to put together hw
> and sw counter, joint in ISR code.
> It is a technique learned from this newsgroup... it's a pity the
> original author isn't reading (I remember Don Y added some personal
> ideas to this approach and he read this ng in the past days).
>

Don Y is around and reading and contributing to this group. I expect he
has read this thread too, and will post if he has something to say.

> Yes, it is a solution. There's a small drawback: you have a frequent
> interrupt (1ms).

In my experience, that is not much of a drawback unless you are making a
very low power system that spends a long time sleeping. I usually have
lots of little tasks hanging off a 1 ms software timer.

>
>
> Maybe there's another solution to fix the first approach.

Yes, there is - I posted it earlier. You can check the overflow flag in
ticks().
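
A sketch of one way such a check could look, not necessarily the version
posted earlier. The three sim_ variables are simulated stand-ins so the
logic can be run on a PC; on the SAM C21 they would be _ticks_high,
TC0->COUNT32.COUNT.reg (after read synchronisation) and
TC0->COUNT32.INTFLAG.reg & TC_INTFLAG_OVF. It assumes ticks() runs at lower
priority than the timer ISR and that the sampling window is far shorter
than half the counter period.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated stand-ins (assumptions) for the shared state. */
static volatile uint32_t sim_ticks_high;   /* incremented by the ISR */
static volatile uint32_t sim_hw_count;     /* hw counter value */
static volatile bool     sim_ovf_pending;  /* overflow flag, cleared by ISR */

static uint64_t ticks_checked(void) {
    uint32_t h1, h2, low;
    bool ovf;

    do {
        h1  = sim_ticks_high;
        low = sim_hw_count;
        ovf = sim_ovf_pending;
        h2  = sim_ticks_high;
    } while (h1 != h2);   /* the ISR ran in between: sample again */

    /* Flag pending and counter already wrapped: the ISR has not counted
       this overflow yet, so account for it here. A pending flag with a
       large 'low' means the wrap happened after 'low' was read. */
    if (ovf && low < 0x80000000u) {
        h1++;
    }
    return ((uint64_t)h1 << 32) | low;
}

/* Host-side check of the three interesting cases. */
static void ticks_checked_demo(void) {
    /* Counter wrapped, ISR still pending. */
    sim_ticks_high = 4; sim_hw_count = 3; sim_ovf_pending = true;
    assert(ticks_checked() == (((uint64_t)5 << 32) | 3u));

    /* Wrap happened just after the counter was read. */
    sim_hw_count = 0xFFFFFFF0u;
    assert(ticks_checked() == (((uint64_t)4 << 32) | 0xFFFFFFF0u));

    /* ISR already ran: flag cleared, high part incremented. */
    sim_ticks_high = 5; sim_hw_count = 3; sim_ovf_pending = false;
    assert(ticks_checked() == (((uint64_t)5 << 32) | 3u));
}
```

The first and third cases return the same timestamp, which is the point:
the result does not depend on whether the ISR has run yet.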

> The problem
> was that hw counter can roll over and the "rolled" value can be read,
> while the sw counter (my _ticks_high) is at the "old" (not incremented)
> value yet.
> The idea is to configure the timer to stop when it reaches TOP
> 0xFFFFFFFF value (one-shot timer). It can be restarted in ISR, together
> with incrementing _ticks_high.

You are going to get inaccuracies that build up over time if you do
that. Maybe that's fine for your application, of course - in which case
it is a perfectly workable idea.

>
> There's another drawback, a small drawback. The hw counter is the clock
> of the machine. When it reaches the TOP value, it stops for a short
> time. So the system time appears frozen for this short time.
> However this happens every 2^32 / Counter_Freq seconds (in my case, every 1h and
> 21').
>
>
> From this story, I learned another important thing. Why did I missed
> this bug? Because it could appear only every 1h and 21'. It /could/
> appear, because it is random, so it could appear after 1000 times 1h21'
> (i.e. after 2 months!!!!)

Yes. You can use testing to show the presence of bugs - but you cannot
use testing to show their absence. You have to think these things
through very carefully.

Or switch to a chip family like the Kinetis that have multiple 32-bit
timers that can be chained together in hardware :-)

pozz

Apr 28, 2017, 6:18:04 AM

Il 28/04/2017 11:05, David Brown ha scritto:
[...]

>> From this story, I learned another important thing. Why did I missed
>> this bug? Because it could appear only every 1h and 21'. It /could/
>> appear, because it is random, so it could appear after 1000 times 1h21'
>> (i.e. after 2 months!!!!)
>
> Yes. You can use testing to show the presence of bugs - but you cannot
> use testing to show their absence. You have to think these things
> through very carefully.
>
> Or switch to a chip family like the Kinetis that have multiple 32-bit
> timers that can be chained together in hardware :-)

Yes, ideally you would try all the chips to select the best one for your
needs. But no one has enough time to test all chips.

I am a fan of the 8-bit AVRs from Atmel (especially when compared with PICs
from Microchip... it's funny to think that they are now the same vendor),
so I naturally started with Cortex-M SAM devices.
Apart from the big monster named ASF (Atmel Software Framework), the
libraries written by Atmel folks to help beginners start writing
software with minimal effort, SAM devices are good for me.
I initially invested some time to understand the datasheet, abandon ASF
and write my own low-level drivers. I'm happy with this approach now.

What I don't like about SAM devices is the register synchronization mess.
You always have to write sync waiting loops before/after
reading/writing some peripheral registers, even when you simply want
to read the value of a hw counter (as my story explained).

They have Cortex-M devices that work at 5V and this is a big plus.
Moreover they are mostly pin-to-pin compatible and they have a good pin
multiplexing scheme (you have a UART on almost every pin).
Atmel Studio is slow, but it works well. It sometimes crashes, mainly
during debug, but you can live with that. I don't like Eclipse too much.

They have a nice Event System peripheral that connects an output event
of a peripheral with an input event of another peripheral. For example,
you can *automatically* start an ADC conversion when a timer overflows.

You can also connect the overflow event of a 32-bits timer to a count
event of another 32-bits timer to have a 64-bits timer/counter. However
I don't know if this event mechanism introduces some delays, so I don't
want to use it to solve my original problem.

David Brown

Apr 28, 2017, 7:10:05 AM

On 28/04/17 12:18, pozz wrote:
> Il 28/04/2017 11:05, David Brown ha scritto:
> [...]
>>> From this story, I learned another important thing. Why did I missed
>>> this bug? Because it could appear only every 1h and 21'. It /could/
>>> appear, because it is random, so it could appear after 1000 times 1h21'
>>> (i.e. after 2 months!!!!)
>>
>> Yes. You can use testing to show the presence of bugs - but you cannot
>> use testing to show their absence. You have to think these things
>> through very carefully.
>>
>> Or switch to a chip family like the Kinetis that have multiple 32-bit
>> timers that can be chained together in hardware :-)
>
> Yes, you should use all the chips to select the best for your needs. But
> noone as so long time to test all chips.

My comment was not particularly serious - there are a great many reasons
for picking a particular microcontroller, and there are /always/ things
you dislike about them.

>
> I am a fan of 8-bits AVR from Atmel (mostly when compared with PICs from
> Microchip... it's funny to think that now they are the same vendor), so
> I naturally started with Cortex-M SAM devices.
> Apart the big monster named ASF (Atmel Software Framework), the
> libraries written by Atmel folks to help beginners start writing
> software with minimal efforts, SAM devices are good to me.
> I initially invested some time to understand datasheet and abandon ASF
> and write my own low-level drivers. I'm happy with this approach now.

<off-topic-rant>

Why is it that vendors write such poor quality software for these sorts
of frameworks or SDKs? I have seen a great many in my years, and /all/
of them are full of poor code. They are typically bloated, lasagne
programming (i.e., it takes 6 layers of functions calling other
functions to do something that requires a single assembly instruction),
break when you change optimisation settings, have dozens of nested
conditional compilation sections to handle devices that went out of
production decades ago, spit piles of warnings when "-Wall" is enabled,
and so on.

</off-topic-rant>

>
> What I don't like of SAM devices is the register syncronization mess.
> You have to write always sync waiting loops, before/after
> reading/writing some peripherals registers. Even when you simply want
> to read the value of a hw counter (as my story explained).
>

Sounds messy - this sort of thing can be hidden in the hardware even
when peripheral clocks are asynchronous.

> They have Cortex-M devices that works at 5V and this is a big plus.

So do the Kinetis family.

> Moreover they are mostly pin-to-pin compatible and they have a good pin
> multiplexing scheme (you have an UART almost on every pin).
> Atmel Studio is slow, but it works well. It sometimes crashes, mainly
> during debug, but you can live with them. I don't like Eclipse too much.

These things are a matter of taste, which is often a matter of what you
are used to. I don't like MSVS at all, and therefore dislike Atmel
Studio. (I wonder if they will migrate to a Netbeans IDE, which is what
Microchip uses?). Of course, I use Linux for most of my development
work, which makes me biased against Windows-only tools!

>
> They have a nice Event System peripheral that connects an output event
> of a peripheral with an input event of another peripheral. For example,
> you can start *automatically* an ADC conversion when a timer oveflows.
>

Kinetis devices have some of that, but it is not as advanced as the
event system in the newer AVRs, if the Atmel ARM devices are similar.

> You can also connect the overflow event of a 32-bits timer to a count
> event of another 32-bits timer to have a 64-bits timer/counter. However
> I don't know if this event mechanism introduces some delays, so I don't
> want to use it to solve my original problem.

In this particular case, the Kinetis has 4 programmable interrupt timers
at 32-bits each, with configurable top counts. So on a 120 MHz core I
set the first to count to 120 and trigger the second on overflow, with
the second counting to 0xffffffff and triggering the third on overflow.
This means I have a nice regular microsecond counter at 32-bit or
64-bit as needed, with easy synchronisation.

Stef

May 1, 2017, 4:14:29 AM

On 2017-04-27 pozz wrote in comp.arch.embedded:
>
> Atmel SAM TCx peripherals are 16-bits counters/timers, but they can be
> chained in a couple to have a 32-bits counter/timer. I already coupled
> TC0 with TC1 to have a 32-bits hw counter. I can't chain TC0/TC1 with
> TC2/TC3 to have a hardware 64-bits counter/timer.

Just out of curiosity, I had a look at the SAM C21 Family datasheet. It's
been a long time since I used Atmel ARM controllers (SAM7).

In the description of the TC, I see no fixed 16 bit width and coupling of
timers. Only that any TC channel can be configured in 8, 16 or 32 bit mode.
Am I looking at the wrong datasheet or section?

If the timers are indeed 8, 16 or 32 bit configurable, that could be a way
to speed up your testing. Just set your timer to 8 or 16 bit (and add some
code to set the other bits valid) and speed up overflows by a factor of
2^24 or 2^16.

--
Stef (remove caps, dashes and .invalid from e-mail address to reply by mail)

Beer -- it's not just for breakfast anymore.

pozz

May 2, 2017, 3:03:11 AM

Il 01/05/2017 10:09, Stef ha scritto:
> On 2017-04-27 pozz wrote in comp.arch.embedded:
>>
>> Atmel SAM TCx peripherals are 16-bits counters/timers, but they can be
>> chained in a couple to have a 32-bits counter/timer. I already coupled
>> TC0 with TC1 to have a 32-bits hw counter. I can't chain TC0/TC1 with
>> TC2/TC3 to have a hardware 64-bits counter/timer.
>
> Just out of curiosity, I had a look at the SAM C21 Family datasheet. It's
> been a long time since I used Atmel ARM controllers (SAM7).
>
> In the description of the TC, I see no fixed 16 bit width and coupling of
> timers. Only that any TC channel can be configured in 8, 16 or 32 bit mode.
> Am I looking at the wrong datasheet or section?

When you use a TC in 8- or 16-bit mode, you are using a single TC
peripheral. When you configure TC0 in 32-bit mode, you are automatically
using TC1 too, which works in "slave" mode:

The counter mode is selected by the Mode bit group in the Control A
register (CTRLA.MODE). By default, the counter is enabled in the
16-bit counter resolution. Three counter resolutions are available:
[...]
• COUNT32: This mode is achieved by pairing two 16-bit TC
peripherals. TC0 is paired with TC1, and TC2 is paired with TC3.
TC4 does not support 32-bit resolution.
[...]

IMHO this means the TC is a 16-bit counter.

> If the timers are indeed 8, 16 or 32 bit configurable, that could be a way
> to speed up your testing. Just set your timer to 8 or 16 bit (and add some
> code to set the other bits valid) and speed up overflows with a factor of
> 2^24 or 2^16.

Oh yes, if you read one of my previous posts, I did exactly this to
make the bug show up sooner. I discovered it was due to the lack of a
sync wait loop after writing the read command to the CTRLB register.
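
A minimal sketch of why the wait loop matters, using a made-up `FakeTimer` model rather than the real SAMC21 register map. On real hardware the loop would poll whichever synchronization-busy flag the datasheet specifies for the read command; every name below is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a timer whose COUNT is only valid to read
 * after a read-sync command completes. */
typedef struct {
    uint32_t live_count;  /* counter value in the timer clock domain */
    uint32_t count_reg;   /* what the CPU sees when it reads COUNT */
    unsigned busy_polls;  /* busy-flag polls left until sync completes */
} FakeTimer;

/* Stand-in for writing the read command to CTRLB. */
static void fake_cmd_readsync(FakeTimer *t) {
    t->busy_polls = 3u;
}

/* Stand-in for polling the synchronization-busy flag. */
static int fake_sync_busy(FakeTimer *t) {
    if (t->busy_polls == 0u) {
        return 0;
    }
    if (--t->busy_polls == 0u) {
        t->count_reg = t->live_count;  /* sync done: COUNT is up to date */
    }
    return 1;
}

/* Buggy pattern: issue the command and read COUNT straight away. */
static uint32_t read_count_no_wait(FakeTimer *t) {
    assert(t);
    fake_cmd_readsync(t);
    return t->count_reg;
}

/* Fixed pattern: wait until the sync is reported complete. */
static uint32_t read_count_synced(FakeTimer *t) {
    assert(t);
    fake_cmd_readsync(t);
    while (fake_sync_busy(t)) {
        /* busy-wait */
    }
    return t->count_reg;
}
```

In this model the unsynchronized read returns whatever stale value was last latched, which is the kind of rare wrong reading the thread started with.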

pozz

May 2, 2017, 3:14:17 AM

Il 28/04/2017 13:10, David Brown ha scritto:
>> [...]

>> Moreover they are mostly pin-to-pin compatible and they have a good pin
>> multiplexing scheme (you have an UART almost on every pin).
>> Atmel Studio is slow, but it works well. It sometimes crashes, mainly
>> during debug, but you can live with them. I don't like Eclipse too much.
>
> These things are a matter of taste, which is often a matter of what you
> are used to. I don't like MSVS at all, and therefore dislike Atmel
> Studio. (I wonder if they will migrate to a Netbeans IDE, which is what
> Microchip uses?). Of course, I use Linux for most of my development
> work, which makes me biased against Windows-only tools!

Don't think I'm a M$ fan. However, I think it's much faster to develop
under Windows, because you have all the tools already configured and
working under M$.

Atmel Studio is not that bad as a self-contained IDE, except it is very
slow. Of course, if you usually create your own Makefile to manage your
build process, it's another story.

In the past I tried to create/use an ARM toolchain with a custom
Makefile and using whatever text-editor to change source code. Of
course it worked... but the debug was a problem.

So I want to ask you a question: how do you debug your projects under
Linux? Maybe the Kinetis IDE is Eclipse-based (I don't know), so I think it
works well under Linux, from coding to debugging. Are the ARM
debuggers/probes (J-Link, manufacturer-specific devices) good under Linux?

In the past I tried to configure the Code::Blocks IDE, which I find very
nice and fast, for Atmel ARM devices. It runs under Windows and Linux,
because it is wxWidgets-based. Unfortunately the problem is always the
same: debugging.
I can't imagine developing an application without being able to break
the program, watch variable values, step to the next instruction and so on.

Moreover, a manufacturer's IDE usually offers other features. For
example, Atmel Studio lets you see core registers,
peripherals' registers (well organized), Flash and RAM content and so
on. I think you lose all that information with a "neutral" IDE during debugging.

David Brown

May 2, 2017, 5:34:59 AM

On 02/05/17 09:14, pozz wrote:
> Il 28/04/2017 13:10, David Brown ha scritto:
>>> [...]
>>> Moreover they are mostly pin-to-pin compatible and they have a good pin
>>> multiplexing scheme (you have an UART almost on every pin).
>>> Atmel Studio is slow, but it works well. It sometimes crashes, mainly
>>> during debug, but you can live with them. I don't like Eclipse too much.
>>
>> These things are a matter of taste, which is often a matter of what you
>> are used to. I don't like MSVS at all, and therefore dislike Atmel
>> Studio. (I wonder if they will migrate to a Netbeans IDE, which is what
>> Microchip uses?). Of course, I use Linux for most of my development
>> work, which makes me biased against Windows-only tools!
>
> Don't think I'm a M$ fan. However I think it's much more fast to develop
> under Windows, because you have all the tools already configured and
> working under M$.
>

I'd say it is faster to develop under Linux, because you have all the
tools ready and working - and many of them work much faster on Linux
than Windows.

But of course, that depends on the tools you want to use :-)

> Atmel Studio is not that bad as a self-contained IDE, except it is very
> slow. Of course, if you usually create your own Makefile to manage your
> build process, it's another story.
>
> In the past I tried to create/use an ARM toolchain with a custom
> Makefile and using whatever text-editor to change source code. Of
> course it worked... but the debug was a problem.
>
> So I want to ask you a question: how do you debug your projects under
> Linux? Maybe Kinetis IDE are Eclipse-based (I don't know), so I think it
> works well under Linux, from coding to debugging. Is the ARM
> debuggers/probes (J-Link, manufacturer specific devices) good under Linux?
>

For serious projects, I /always/ use my own Makefiles. But I often use
the manufacturer's IDE, precisely because in many cases it makes
debugging easier. So for programming on the Kinetis, I use the "Kinetis
Design Studio" IDE. It is a perfectly reasonable Eclipse IDE (assuming
you are happy with Eclipse), with the plugins and stuff for debugging.
I use my own Makefile, but run it from within the IDE. I use a slightly
newer version of gcc (from GNU Arm Embedded) than the version that comes
with KDS. But I do my debugging directly from within the IDE.

I have the same setup on Windows /and/ Linux, and can use either. That
means I need some msys2/mingw-64 stuff installed on Windows to make it
look like a real OS with standard utilities (make, sed, cp, mv, etc.),
but that's a one-time job when you configure a new Windows system. The
build process is significantly faster on Linux than Windows on
comparable hardware, but the key point for me is that it all works and
is system independent.

For debugging, there are basically two ways to interact with hardware.
You can use OpenOCD, which is open source, or you can use proprietary
devices and software. P&E Micro, for example, is usually handled by
proprietary software - but tools like KDS support it on Linux as well as
Windows. Segger J-Link works fine on Windows and Linux. And OpenOCD
works fine on Windows, but even better on Linux, and supports a vast range
of hardware devices from high-end debuggers with Ethernet and trace, to
home-made devices with an FTDI chip and a couple of passive components.

The only Atmel devices I have used are AVRs, and I haven't had much use
of them for a long time. It is even longer since I have used a debugger
with them. But I have happily used Eclipse and an Atmel JTAG ICE
debugger on Linux - though you need to do a little reading on the net to
see how to set it up.

> In the past I tried to configure Code::Blocks IDE, that it's very nice
> and fast for me, for Atmel ARM devices. It runs under Windows and Linux,
> because it is wxWidgets-based. Unfortunately the problem is always the
> same: debugging.
> I can't think to develop an application without the plus to break the
> program, watches variables values, run the next instruction and so on.
>

I agree - usually debugging is handy, especially early on in a project.
Later on it can get impractical except perhaps for post-mortem
debugging. You don't really want your motor driver to keep stopping at
breakpoints...

> Moreover, manufacturer IDE usually gives other functionalities. For
> example, Atmel Studio gives the possibility to see core registers,
> peripherals' registers (well organized), Flash and RAM content and so
> on. I think you lost all those info with a "neutral" IDE during debugging.
>

Some manufacturer IDEs give a lot of useful extra features, others are
less useful. And sometimes you can get much of the effect from a
generic IDE. If the device headers for a chip define an array of
structs "TIMER[4]" with good struct definitions for the timers, then you
can just add a generic "watch" expression for TIMER[0] and expand it, to
view the contents of the TIMER[0] registers. That may screw up
registers that have volatile effects on read, but it usually works quite
well - often as good as the manufacturers' own add-ons.
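
A sketch of what such a header might look like, assuming a made-up four-register timer layout. The register names are hypothetical, and the block is backed by RAM here so it runs on a host; a real device header would map `TIMER` onto the peripheral's base address instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register layout for one timer instance. */
typedef struct {
    volatile uint32_t CTRL;
    volatile uint32_t COUNT;
    volatile uint32_t RELOAD;
    volatile uint32_t STATUS;
} TimerRegs;

#define TIMER_INSTANCES 4u

/* RAM-backed stand-in for the memory-mapped peripheral block. */
static TimerRegs timer_block[TIMER_INSTANCES];
#define TIMER timer_block

/* Read the current count of timer n through the struct view. */
static uint32_t timer_read_count(unsigned n) {
    assert(n < TIMER_INSTANCES);
    return TIMER[n].COUNT;
}
```

Because the fields are named struct members, a watch expression such as `TIMER[0]` expands in a generic debugger into CTRL, COUNT, RELOAD and STATUS without any vendor plug-in.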

For the ARM, however, there is a large project:

<http://gnuarmeclipse.github.io/>

Most ARM microcontroller manufacturers, with Atmel being the one notable
exception, make their IDEs from Eclipse with the extensions from this
project - possibly with their own small modifications and additional
extensions. You can put together a neutral IDE with off-the-shelf
Eclipse and these extensions that gives you pretty much everything you
get from a manufacturer's IDE, except for their Wizards, Project
Generators, Chip Configuration Tools, etc.

pozz

May 3, 2017, 3:04:19 AM

Il 02/05/2017 11:34, David Brown ha scritto:
> On 02/05/17 09:14, pozz wrote:
>> Il 28/04/2017 13:10, David Brown ha scritto:
>>>> [...]
>>>> Moreover they are mostly pin-to-pin compatible and they have a good pin
>>>> multiplexing scheme (you have an UART almost on every pin).
>>>> Atmel Studio is slow, but it works well. It sometimes crashes, mainly
>>>> during debug, but you can live with them. I don't like Eclipse too much.
>>>
>>> These things are a matter of taste, which is often a matter of what you
>>> are used to. I don't like MSVS at all, and therefore dislike Atmel
>>> Studio. (I wonder if they will migrate to a Netbeans IDE, which is what
>>> Microchip uses?). Of course, I use Linux for most of my development
>>> work, which makes me biased against Windows-only tools!
>>
>> Don't think I'm a M$ fan. However I think it's much more fast to develop
>> under Windows, because you have all the tools already configured and
>> working under M$.
>>
>
> I'd say it is faster to develop under Linux, because you have all the
> tools ready and working - and many of them work much faster on Linux
> than Windows.
>
> But of course, that depends on the tools you want to use :-)

Of course, we are talking about tools related to ARM Cortex-M MCUs from
different manufacturers.
I don't know what other silicon vendors offer, but Atmel gives a
ready-to-use solution only under Windows (a single setup that installs
Atmel Studio IDE, GNU ARM toolchain, examples, headers, and so on). For
Linux there's a separate GNU ARM toolchain setup, but I think it will be
much more difficult/long to arrange a full system (

If you are a Linux guru and know exactly which tools you need and how to
install them, maybe it takes you one hour. The single Windows setup process
is much simpler and faster.
However I agree with you, this is a one-time only process and it's worth
it if the development under Linux is better.

I think the migration from Atmel to Microchip is proceeding in the worst
way possible. Now I can't find the Atmel Software Framework for Linux
anymore, only for Windows. I remember ASF was also available as a tgz file
for Linux installation.

>> Atmel Studio is not that bad as a self-contained IDE, except it is very
>> slow. Of course, if you usually create your own Makefile to manage your
>> build process, it's another story.
>>
>> In the past I tried to create/use an ARM toolchain with a custom
>> Makefile and using whatever text-editor to change source code. Of
>> course it worked... but the debug was a problem.
>>
>> So I want to ask you a question: how do you debug your projects under
>> Linux? Maybe Kinetis IDE are Eclipse-based (I don't know), so I think it
>> works well under Linux, from coding to debugging. Is the ARM
>> debuggers/probes (J-Link, manufacturer specific devices) good under Linux?
>>
>
> For serious projects, I /always/ use my own Makefiles. But I often use
> the manufacturer's IDE, precisely because in many cases it makes
> debugging easier.

Ok, so you need to use a manufacturer that has a Linux-friendly IDE.

> So for programming on the Kinetis, I use the "Kinetis
> Design Studio" IDE. It is a perfectly reasonable Eclipse IDE (assuming
> you are happy with Eclipse), with the plugins and stuff for debugging.
> I use my own Makefile, but run it from within the IDE. I use a slightly
> newer version of gcc (from GNU Arm Embedded) than the version that comes
> with KDS. But I do my debugging directly from within the IDE.

In this case, it shouldn't be so difficult. The IDE released by the
manufacturer is Linux ready.

> I have the same setup on Windows /and/ Linux, and can use either. That
> means I need some msys2/mingw-64 stuff installed on Windows to make it
> look like a real OS with standard utilities (make, sed, cp, mv, etc.),
> but that's a one-time job when you configure a new Windows system. The
> build process is significantly faster on Linux than Windows on
> comparable hardware, but the key point for me is that it all works and
> is system independent.

I am curious: if you use the same setup (same toolchain), are you able
to create the /same/ binary file on both development systems?

Thank you for your suggestions. Maybe in the future I'll try to arrange
a Linux development system for MCUs.

David Brown

May 3, 2017, 4:59:16 AM

The way I like to work, there can often be other more general software
tools involved - make, sed, cp, mv, etc., in Makefiles, subversion or
git, Python for scripts, pre-processors, post-processors (making CRC's
or other manipulation of the linked binary), and so on. These things
are mostly cross platform - but usually already available on any Linux
installation while you need to find them and install them on a Windows
system. (As noted before, this is typically a one-time job with each
new Windows installation.)
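
As an illustration of the CRC post-processing step, here is a small sketch that computes a CRC over an image and appends it. The post does not say which CRC or tool is used; the standard CRC-32 (reflected polynomial 0xEDB88320), the little-endian placement and the function names are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (IEEE 802.3, reflected), as used by zip and Ethernet. */
static uint32_t crc32_ieee(const uint8_t *data, size_t len) {
    assert(data != NULL || len == 0u);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            if (crc & 1u) {
                crc = (crc >> 1) ^ 0xEDB88320u;
            } else {
                crc >>= 1;
            }
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Store the CRC of image[0..len) little-endian in the 4 bytes that
 * follow it. The caller must provide room for len + 4 bytes. */
static void image_append_crc(uint8_t *image, size_t len) {
    uint32_t crc = crc32_ieee(image, len);
    for (size_t i = 0; i < 4u; ++i) {
        image[len + i] = (uint8_t)(crc >> (8u * i));
    }
}
```

A build script would run something like this over the linked binary after the linker step, and the firmware can recompute the same CRC at start-up to check its own image.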

Some of my projects are simple enough to be handled by an IDE (plus
toolchain), but many are not.

> I don't know what others silicon vendors offer, but Atmel gives a
> ready-to-use solution only under Windows (a single setup that installs
> Atmel Studio IDE, GNU ARM toolchain, examples, headers, and so on). For
> Linux there's a separate GNU ARM toolchain setup, but I think it will be
> much more difficult/long to arrange a full system (

Pretty much every major ARM microcontroller manufacturer offers an
equally ready-to-use solution, but they offer it on Windows /and/ Linux,
and maybe also MacOS. Atmel is /way/ behind the times in thinking
Windows-only.

Also, by using MSVS as their basis for their IDE, Atmel severely limit
the usage on /Windows/. While Eclipse or Netbeans based IDEs will run
on a wide range of Windows systems (different versions, different
service packs, different choices of updates, etc.), MSVS is much
fussier. It wants to take over parts of your system and install stuff
in Windows directories, it has requirements about particular versions of
updates, and it does not play well with multiple installed versions of
the tools (that applied to the old Atmel Studio too).

Atmel's way is fine if you always run the latest version of Windows,
always apply the latest updates and service packs, and always want only
the latest version of the IDE and toolchain. But that is absolutely
/not/ compatible with the way /I/ work. My current Windows machine is
Win7, with all updates disabled. I have had too many bad experiences
with MS deciding that /my/ machine needs changes that break software or
drivers - I will not have automatic updates on a Windows system. And
when I want to get a new version of a toolset, I want it installed in
addition to existing versions, in its own directory. When I go back to
an old project, I need to have the tools (toolchain and library - the
IDE and debugger don't matter) for that project - not a newer version.

>
> If you are a Linux guru and know exactly which tools you need and how to
> install them, maybe you take one hour. The single Windows setup process
> is much more simple and fast.

Again, this is an Atmel problem - other vendors make equally
easy-to-install toolchains for Linux. The poor (in my eyes) development
tools for Atmel are a big reason why I would not consider their ARM
microcontrollers. And when I use their AVR microcontrollers, I do so
using their command-line tool builds only (combined with Eclipse, open
source AVR debugger tools, etc.).

> However I agree with you, this is a one-time only process and it's worth
> it if the development under Linux is better.
>
> I think the migration from Atmel to Microchip is processing in the worst
> way possible. Now I don't find Atmel Software Framework for Linux
> anymore, only for Windows. I remember ASF was available as a tgz file
> too for Linux installation.
>

I haven't looked at this, so I can't really comment. In general, I find
Microchip to be excellent in some ways, and terrible in other ways.

>
>>> Atmel Studio is not that bad as a self-contained IDE, except it is very
>>> slow. Of course, if you usually create your own Makefile to manage your
>>> build process, it's another story.
>>>
>>> In the past I tried to create/use an ARM toolchain with a custom
>>> Makefile and using whatever text-editor to change source code. Of
>>> course it worked... but the debug was a problem.
>>>
>>> So I want to ask you a question: how do you debug your projects under
>>> Linux? Maybe Kinetis IDE are Eclipse-based (I don't know), so I think it
>>> works well under Linux, from coding to debugging. Is the ARM
>>> debuggers/probes (J-Link, manufacturer specific devices) good under
>>> Linux?
>>>
>>
>> For serious projects, I /always/ use my own Makefiles. But I often use
>> the manufacturer's IDE, precisely because in many cases it makes
>> debugging easier.
>
> Ok, so you need to use a manufacturer that has a Linux-friendly IDE.
>

I could live without it - I /could/ get a purely open IDE working fine
for Atmel's ARMs. But it is a lot easier when the manufacturer provides
the tools - as almost every ARM microcontroller manufacturer except
Atmel does.

>
>> So for programming on the Kinetis, I use the "Kinetis
>> Design Studio" IDE. It is a perfectly reasonable Eclipse IDE (assuming
>> you are happy with Eclipse), with the plugins and stuff for debugging.
>> I use my own Makefile, but run it from within the IDE. I use a slightly
>> newer version of gcc (from GNU Arm Embedded) than the version that comes
>> with KDS. But I do my debugging directly from within the IDE.
>
> In this case, it shouldn't be so difficult. The IDE released from the
> manufacturer is Linux ready.
>
>
>> I have the same setup on Windows /and/ Linux, and can use either. That
>> means I need some msys2/mingw-64 stuff installed on Windows to make it
>> look like a real OS with standard utilities (make, sed, cp, mv, etc.),
>> but that's a one-time job when you configure a new Windows system. The
>> build process is significantly faster on Linux than Windows on
>> comparable hardware, but the key point for me is that it all works and
>> is system independent.
>
> I am curious: if you use the same setup (same toolchain), are you able
> to create the /same/ binary file on both development systems?
>

Yes - bit-perfect. For serious projects, I am not happy with my tools
until I can generate bit-perfect identical binaries on at least two
computers, preferably under two different OS's.

Stef

May 4, 2017, 4:14:43 AM

On 2017-05-02 pozz wrote in comp.arch.embedded:
> Il 01/05/2017 10:09, Stef ha scritto:
>> On 2017-04-27 pozz wrote in comp.arch.embedded:
>>>
>>> Atmel SAM TCx peripherals are 16-bits counters/timers, but they can be
>>> chained in a couple to have a 32-bits counter/timer. I already coupled
>>> TC0 with TC1 to have a 32-bits hw counter. I can't chain TC0/TC1 with
>>> TC2/TC3 to have a hardware 64-bits counter/timer.
>>
>> Just out of curiosity, I had a look at the SAM C21 Family datasheet. It's
>> been a long time since I used Atmel ARM controllers (SAM7).
>>
>> In the description of the TC, I see no fixed 16 bit width and coupling of
>> timers. Only that any TC channel can be configured in 8, 16 or 32 bit mode.
>> Am I looking at the wrong datasheet or section?
>
> When you use a TC in 8- or 16-bits, you are using a single TC
> peripheral. When you configured TC0 in 32-bits, you are automatically
> using TC1 too, that works in "slave" mode:
>
> The counter mode is selected by the Mode bit group in the Control A
> register (CTRLA.MODE). By default, the counter is enabled in the
> 16-bit counter resolution. Three counter resolutions are available:
> [...]
> • COUNT32: This mode is achieved by pairing two 16-bit TC
> peripherals. TC0 is paired with TC1, and TC2 is paired with TC3.
> TC4 does not support 32-bit resolution.
> [...]
>
> IMHO this means TC is a 16-bits counter.

Yes, you are right, found it now. Earlier I didn't dive deep enough into
the 1000+ page datasheet to see this 'detail', sorry.

>> If the timers are indeed 8, 16 or 32 bit configurable, that could be a way
>> to speed up your testing. Just set your timer to 8 or 16 bit (and add some
>> code to set the other bits valid) and speed up overflows with a factor of
>> 2^24 or 2^16.
>
> Oh yes, if you read one of my previous post, I made exactly this to
> speed-up the raise of the bug. I discovered it was due to the lack of a
> sync wait loop after writing the read command to CTRLB register.

Ah, may have missed that, I started reading the thread a bit late and there
were a lot of posts. ;-)

Good you found the cause.

--
Stef (remove caps, dashes and .invalid from e-mail address to reply by mail)

He who has but four and spends five has no need for a wallet.