Scalability of VxWorks watchdog timer (wdLib) implementation

Walter Zimmer

Oct 29, 2001, 3:32:06 AM

Hi!

We have a rather theoretical question:
How well does the VxWorks watchdog timer implementation scale?
Is there a limit on how many timers should be used concurrently?
Is timer interrupt latency affected by having many timers?

The reason behind the question is that we have an application
with multiple (200) instances. We are now deciding between an
implementation using 200 timers (which is fairly easy to implement)
and the alternative of using one timer for all 200 instances (which
would mean more coding overhead).

Therefore, we hope that many watchdog timers do not affect
interrupt latency.

Does the timer implementation use any resources other than
memory?

Any hints and pointers would be greatly appreciated.

Walter Zimmer
--
Fraunhofer-Einrichtung
Systeme der Kommunikationstechnik

Walter Zimmer Hansastraße 32
Dipl.-Inf. D-80686 München
Telefon: +49(0)89-547088-344
E-Mail: walter...@esk.fhg.de Telefax: +49(0)89-547088-225

Manuel van den Berg

Oct 29, 2001, 4:32:39 AM

Hi,

Unfortunately for you, watchdog timers are not so very scalable.
They are checked at every timer interrupt (as far as I know).
I suggest you just try it to find out.

Manuel

"Walter Zimmer" <walter...@esk.fhg.de> wrote in message
news:3BDD1406...@esk.fhg.de...

John

Oct 29, 2001, 11:19:54 AM

Hello,

"Manuel van den Berg" <qm...@oce.nl> wrote in message news:<3bdd2206$0$28887$4d4e...@oce.news.eu.uu.net>...

> Hi,
>
> Unfortunately for you, watchdog timers are not so very scalable.
> They are checked at every timer interrupt (as far as I know).
> I suggest you just try it to find out.

No, they are not *all* checked every clock tick. What happens is that
all the delay related items (timeouts, taskDelay calls and watchdogs)
are maintained in a sorted queue (much like the scheduler queue). At
each tick, the head of the queue is checked to see if it matches the
new tick count. A match is pulled off and processed, then the new head
is checked to see if it matches. When the head no longer matches the
current tick count, or the queue is empty, the checking stops.

So, only if all 200 timers expire on the same tick will all 200 need
to be checked. And that will only happen for the one tick value that
they expire on.
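
The head-only check described above can be modeled in a few lines of portable C. This is only a sketch of the idea, not the wdLib source: `TimerQueue` and `tick_announce` are hypothetical names, and a pre-sorted array stands in for the kernel's queue.

```c
#include <stddef.h>

/* Hypothetical model of a sorted timeout queue; not taken from VxWorks. */
typedef struct {
    const unsigned *expiry;  /* expiry ticks, sorted in ascending order */
    size_t count;            /* number of entries in expiry[] */
    size_t head;             /* index of the earliest pending entry */
} TimerQueue;

/* Process one clock tick. Returns how many timers fired on this tick;
 * *checks receives the number of head comparisons that were needed. */
size_t tick_announce(TimerQueue *q, unsigned now, size_t *checks)
{
    size_t fired = 0;

    *checks = 0;
    while (q->head < q->count) {
        ++*checks;
        if (q->expiry[q->head] != now)
            break;      /* head is not due: everything behind it is later */
        ++q->head;      /* pull the expired entry off the queue */
        ++fired;
    }
    return fired;
}
```

With expiry ticks {5, 5, 5, 9, 12}, a tick at 4 costs one comparison, while the tick at 5 costs four: three matches plus the one that stops the loop.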

HTH,

John...

Raghu

Oct 29, 2001, 11:46:26 AM

As far as I remember, the maximum number of timers is 256. Any two timers
(with the same value) expiring at the same time might have serious effects
on the system.

John <john_...@yahoo.com> wrote in message
news:488e459a.01102...@posting.google.com...

Leonid Rosenboim

Oct 29, 2001, 1:14:02 PM

Raghu, you are wrong, and should not argue with people who know the VxWorks
source code by heart.

There is no limit on the number of outstanding watchdog requests, except
your memory pool, because each wd object costs you memory, and the objects
are all part of a linked list.

When more than one timer expires at the same tick, they all need to be
processed, so this increases the duration of the system clock ISR, but if
you follow the coding rules for watchdog routines, this effect should be
acceptable. In other words, keep these functions extremely short and
conservative with respect to system resources.

Leonid

"Raghu" <raghuram...@roke.co.uk> wrote in message
news:DJfD7.14$1F5.800@psinet-eu-nl...

David Laight

Oct 30, 2001, 11:24:41 AM

> > Unfortunately for you, watchdog timers are not so very scalable.
> > They are checked at every timer interrupt (as far as I know).
> > I suggest you just try it to find out.
>
> No, they are not *all* checked every clock tick. What happens is that
> all the delay related items (timeouts, taskDelay calls and watchdogs)
> are maintained in a sorted queue (much like the scheduler queue). At
> each tick, the head of the queue is checked to see if it matches the
> new tick count. A match is pulled off and processed, then the new head
> is checked to see if it matches. When the head no longer matches the
> current tick count, or the queue is empty, the checking stops.

However the list is linear, so that wdStart() is linear (actually in the
number of timers that expire before the one being created).
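
To illustrate the linear cost, here is a minimal sketch of a sorted insert into a singly linked list. It is a model only: `TimerNode` and `timer_insert` are hypothetical names, and placing a new entry after existing entries with the same expiry is an assumption, not something confirmed for wdLib.

```c
#include <stddef.h>

/* Hypothetical timer entry; not the VxWorks watchdog structure. */
typedef struct TimerNode {
    unsigned expiry;            /* absolute tick at which the timer fires */
    struct TimerNode *next;
} TimerNode;

/* Insert node into a list kept sorted by expiry (after equal entries).
 * Returns the number of existing nodes that had to be walked past,
 * i.e. the timers that expire no later than the new one. */
size_t timer_insert(TimerNode **head, TimerNode *node)
{
    size_t walked = 0;
    TimerNode **link = head;

    while (*link != NULL && (*link)->expiry <= node->expiry) {
        link = &(*link)->next;
        ++walked;
    }
    node->next = *link;         /* splice the new node in front of *link */
    *link = node;
    return walked;
}
```

In this model, starting a timer that expires after all 200 others walks past 200 nodes, while one that expires first walks past none.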

As with many systems, the 'churning' of timeouts that protocol stacks
generate is not handled well.

NB - it is the use of the generic 'priority Q' that causes vxWorks to kill
vxTicks whenever the list might wrap - zapping any attempt at a real time
clock.

I also think I've spotted a bug: the timeout function and argument are
saved in the timeout structure while a wdStart is deferred to the kernel
work Q. Now if the (old) timeout expires before the deferred wdStart is
taken from the work Q, then the new timeout routine is likely to be called
immediately. The following might show this 'feature':

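#include <vxWorks.h>
#include <logLib.h>
#include <wdLib.h>
/* t1 and t2 are assumed to be WDOG_IDs previously returned by wdCreate() */
void f2( int x ) { logMsg( "f2 %x\n", x, 0, 0, 0, 0, 0 ); }
void f1( int x ) { logMsg( "f1 %x\n", x, 0, 0, 0, 0, 0 );
                   wdStart( (WDOG_ID)x, 1000, (FUNCPTR)f2, x ); }
...
wdStart( t1, 100, (FUNCPTR)f1, (int)t2 );
wdStart( t2, 100, (FUNCPTR)f1, (int)t1 );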

Just work out for yourself what the possible valid outputs from this test
are - however it should take 11 seconds (100Hz tick) for any output from
f2().

David

Raymond Yeung

Oct 30, 2001, 5:37:02 PM

It may depend on your application's requirements. If your
application doesn't require the resolution of a watchdog (one
tick is about 16-17 msec on the boards that I have worked with so far),
you may be better off setting up the timer at a longer interval and
delegating the application timer processing to a high-priority task.

For example, suppose you have protocol timers on the order of seconds,
but also something like DSP packetization occurring at 10-30 msec
intervals that must be on time. In such an application, using watchdogs
directly could expose you to bursts of ISR work caused by slight
misalignment of the timers. These bursts could disturb the schedule of
other, more time-critical work.
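
The coarse-timer approach could look roughly like the sketch below: one periodic tick (for example a single watchdog that wakes a task) drives a table of per-instance countdowns, and all bookkeeping runs at task level. The names `SoftTimers`, `soft_timer_start` and `soft_timer_tick` are hypothetical, and the table size of 200 is taken from the original question.

```c
#include <stddef.h>

#define NUM_INSTANCES 200   /* from the thread: 200 application instances */

/* Hypothetical table of software timers, serviced from task context. */
typedef struct {
    unsigned remaining[NUM_INSTANCES];  /* coarse ticks left; 0 = stopped */
} SoftTimers;

/* Arm (or re-arm) the timer of one instance. */
void soft_timer_start(SoftTimers *t, size_t id, unsigned ticks)
{
    if (id < NUM_INSTANCES)
        t->remaining[id] = ticks;
}

/* Call once per coarse tick. Writes the ids of timers that expired on
 * this tick into expired[] (capacity NUM_INSTANCES) and returns how
 * many there were. */
size_t soft_timer_tick(SoftTimers *t, size_t *expired)
{
    size_t i;
    size_t n = 0;

    for (i = 0; i < NUM_INSTANCES; ++i) {
        if (t->remaining[i] != 0 && --t->remaining[i] == 0)
            expired[n++] = i;
    }
    return n;
}
```

The scan is linear in the number of instances on every coarse tick, but it can be pre-empted by higher-priority tasks, which ISR-level watchdog routines cannot.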

Walter Zimmer

Oct 31, 2001, 3:24:15 AM

Hi!

First, thanks to everyone for responding. I'm glad VxWorks implemented
the wdLib in the right way :)

Raymond Yeung wrote:
> It may depend on your application's requirements. If your
> application doesn't require the resolution of a watchdog (one
> tick is about 16-17 msec on the boards that I have worked with so far),
> you may be better off setting up the timer at a longer interval and
> delegating the application timer processing to a high-priority task.

That's what we're currently doing in the timer routine: it only inserts
a message into a queue and exits.

> For example, suppose you have protocol timers on the order of seconds,
> but also something like DSP packetization occurring at 10-30 msec
> intervals that must be on time. In such an application, using watchdogs
> directly could expose you to bursts of ISR work caused by slight
> misalignment of the timers. These bursts could disturb the schedule of
> other, more time-critical work.

I understand the point that it's not good if my 200 timers all expire at
once, but I fear that the longer the system runs, the more likely it is
that the timers synchronize with each other and produce bursts.

Now, what is the alternative?
1.
Using one timer and doing the work of all 200 routines at once is
worse in my opinion, since it _always_ produces a burst, especially
for the task that gets the 200 messages.

2.
Use a task that suspends itself periodically instead of a timer
routine. This would be optimal from a load-balancing point of view. Are
there any performance penalties or other disadvantages to this
approach? For a protocol timer which uses times on the order
of seconds, this could be a feasible alternative.

3.
Avoid self-synchronization of the 200 timers. This would be the ultimate
solution, since it balances the load and consumes fewer resources
than task switching (as in 2.)
Any ideas? I don't have any clue how to accomplish this with the
current wdLib API. I'm not aware of any means to tell the API to
e.g. insert the timer routine at an expiration time 5 ticks away
from any other expiration time.

Has anyone experienced this self-synchronizing effect, or is it
extremely unlikely? Any links to papers?
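
One way to approach option 3 with nothing more than the delay argument of wdStart() is to give each instance a different initial delay, so that the first expirations are spread over one period. The helper below is a sketch under that assumption; `staggered_delay` is a hypothetical name, and whether the offsets survive over time depends on how each timer is restarted afterwards.

```c
/* Initial delay, in ticks, for instance `index` out of `count` timers
 * that share the same period. Instance 0 gets exactly one period; the
 * others are shifted by an even fraction of the period. */
unsigned staggered_delay(unsigned period_ticks, unsigned index, unsigned count)
{
    unsigned long offset;

    if (count == 0)
        return period_ticks;    /* nothing to spread over */

    offset = ((unsigned long)period_ticks * (index % count)) / count;
    return period_ticks + (unsigned)offset;
}
```

With a period of 100 ticks and 200 instances, at most two timers share a tick at start-up. If the period is shorter than the number of timers, some sharing is unavoidable.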

Thanks again,
Walter

Walter Zimmer

Oct 31, 2001, 3:35:53 AM

Hi!

David Laight wrote:

> However the list is linear, so that wdStart() is linear (actually in the
> number of timers that expire before the one being created).

Thanks for this excellent point. Now I know which call is potentially
time-consuming for my 200 timers. I'm not sure how we can measure the time
the kernel spends in wdStart(), but I guess we will try.

[...]

> I also think I've spotted a bug, the timeout function and argument are

Did you tell the VxWorks people?

Thanks,
Walter

Raymond Yeung

Oct 31, 2001, 2:18:33 PM

It may not be that bad to have the "burst" if you allow the job
to be pre-empted by a higher-priority task. This is especially so
if your application needing the 200 timers would suffer no grave
consequences if they're delayed. In this case, your option #1 may
be effective.

At a minimum, you've now moved the work away from the ISR to task level.
So your timer interrupt would not lock out other lower-priority interrupts
and the high-priority tasks for a "long" time. Do you know how high the
priority of your timer interrupt is?

Walter Zimmer <walter...@esk.fhg.de> wrote in message news:<3BDFB52F...@esk.fhg.de>...

Walter Zimmer

Nov 5, 2001, 8:41:53 AM

Hi!

Raymond Yeung wrote:

> It may not be that bad to have the "burst" if you allow the job
> to be pre-empted by a higher-priority task. This is especially so
> if your application needing the 200 timers would suffer no grave
> consequences if they're delayed. In this case, your option #1 may
> be effective.

Agreed.

> At a minimum, you've now moved the work away from the ISR to task level.
> So your timer interrupt would not lock out other lower-priority interrupts
> and the high-priority tasks for a "long" time. Do you know how high the
> priority of your timer interrupt is?

No. But I think I have now done what I can to keep the ISR short, so the
timer interrupt should not be the problem.

However, the task that receives the messages is the netTask, so it will
get very busy when bursts occur.
My options are to implement the desired functionality in another
low-priority task or to make sure no bursts occur. Since I hope that
avoiding the bursts is much easier than setting up another task and
moving the functionality there, I was hoping someone has a ready-made
recipe for burst avoidance :)

Perhaps I will first watch how badly the bursts accumulate.