Thread safety proposal


Steve Weinrich

Dec 21, 2017, 1:50:00 AM
to Developers
I have read some very interesting topics and thoughts about multi-threading (I use the term thread, but it could just as easily be task, fiber, etc.).  I think the consensus is that it should be supported.  Exactly how is, of course, still undecided.  Nonetheless, I believe that if the final solution is preemptive, the existing libraries should be made thread-safe.

Without doing a survey of the libraries, I assume that every library will fall into one of four categories:
1) It is already thread-safe.
2) It is relatively simple to make it thread-safe.
3) It is a little difficult to make it thread-safe.
4) It is very hard to make it thread-safe.

In my experience, dealing with libraries of types 3 and 4 forces the introduction of a Mutex.  For the type 3 case, the Mutex is a short-term solution, pending further effort.  For the type 4 case, it is likely to be a permanent solution.

I would like to propose three projects:

A) The introduction of a Mutex class into the Arduino library.  The appropriate virtual methods would default to empty methods.
B) A project be started to examine/modify the entire C RTL to make sure that it is thread-safe.  The use of Mutex is not allowed.
C) A project be started to examine/modify the entire Arduino library to make sure that it is thread-safe.  The use of Mutex is allowed, but only when not using it would require a significant amount of work.


Thomas Roell

Dec 21, 2017, 7:46:36 AM
to Arduino Developers
I would advise against a Mutex class (ok, that sounds really funny).

A Mutex is an object that if locked is owned by a thread. An ISR cannot lock a mutex. There are a lot of cases where you'd want to lock something from an ISR. 

Perhaps call the class "Lock", with a ::lock(), ::unlock(), ::trylock() set of operations. An ISR then could use ::trylock() instead of the blocking ::lock(). I am thinking right now about the problem that an SPI instance is used by a thread, but an ISR would want to quickly send/receive data if possible.

Implementation-wise, such locks typically use atomics as the first level and then mutexes for wait resolution (priority inheritance).
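
A minimal sketch of the interface suggested above, assuming a C++11 toolchain that provides `<atomic>`. The class name `Lock` and its three methods follow the post; everything else is illustrative. Only the atomic first level is shown: the spin in `lock()` is a placeholder for the mutex-based wait resolution with priority inheritance, which depends on the scheduler.

```cpp
#include <atomic>

// Hypothetical lock whose trylock() never blocks, so it can be used from an ISR.
class Lock {
public:
    // Try to take the lock without waiting; returns true on success.
    bool trylock() {
        // test_and_set returns the previous value: false means we now own it.
        return !flag_.test_and_set(std::memory_order_acquire);
    }

    // Blocking variant, for thread context only. A real implementation would
    // hand over to the scheduler here instead of spinning.
    void lock() {
        while (!trylock()) {
            // placeholder for yield / wait resolution
        }
    }

    void unlock() {
        flag_.clear(std::memory_order_release);
    }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};
```

An ISR would call `trylock()` and back out if it returns false; only threads would ever call `lock()`.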

In general perhaps a classification of APIs that can be called from an ISR would be great, too. I suppose the classic mutex approach around malloc()/free() prevents a "new" operator in an ISR ... 


--
You received this message because you are subscribed to the Google Groups "Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to developers+unsubscribe@arduino.cc.

Steve Weinrich

Dec 21, 2017, 11:18:01 AM
to Developers
I need to make a correction to the Mutex proposal.  It should not use C++ virtual methods.  Instead, it should be optional, perhaps using a macro.  The exact implementation is not important now and can be decided independently.
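
One way the macro-based option could look, as a sketch only. `ARDUINO_THREADS` is a hypothetical build flag, not an existing Arduino macro, and the spinning `lock()` merely stands in for whatever the eventual scheduler would provide. `nextId()` is an invented example of library code.

```cpp
#include <atomic>

// ARDUINO_THREADS is a hypothetical build flag used for illustration.
#ifdef ARDUINO_THREADS
class Mutex {
public:
    void lock()   { while (flag_.test_and_set(std::memory_order_acquire)) { } }
    void unlock() { flag_.clear(std::memory_order_release); }
private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};
#else
// Single-threaded sketches: both operations are empty inline functions,
// so the compiler can remove the calls entirely.
class Mutex {
public:
    void lock()   {}
    void unlock() {}
};
#endif

// Library code is written once and compiles in both configurations.
static Mutex counterMutex;
static int counter = 0;

int nextId() {
    counterMutex.lock();
    int id = ++counter;
    counterMutex.unlock();
    return id;
}
```

With the flag undefined, a single-threaded sketch pays nothing for the locking calls.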

Steve Weinrich

Dec 21, 2017, 12:01:31 PM
to Developers
Thomas,
Thank you so much for your suggestions.
I believe that an ISR is, by definition, not thread-safe.  Obviously, if the Mutex is optional for the entire sketch (as I just corrected in my proposal), the sketch is free to be single-threaded.

In the scenario you mention, the SPI code, via the usingInterrupt method, prevents an ISR from being invoked in between the beginTransaction and endTransaction methods.  IMHO, ISRs in a multi-threaded environment should not use shared resources.  They need to set 'flags' that can then be queried by a thread to perform an action.  However, there may be other solutions that I hope our contributors will reveal!

Regards!

Thomas Roell

Dec 21, 2017, 12:26:20 PM
to Arduino Developers
Steve,

This is where it gets interesting. SPI.beginTransaction() and SPI.endTransaction() could implicitly lock, whereby ::beginTransaction() could return "false" if called from within an ISR while a thread already holds the lock on the SPI peripheral.

"usingInterrupt()" is not good enough. It shields against a GPIO-triggered event, but not against something coming from, say, a timer (RTC) or such.

SPI is often shared. Say a SDCARD and a TFT display. Or even a Radio. 

Right now in the SW stack for STM32L0/STM32L4 I have this stupid problem where USB/MSC can access the SDCARD from an ISR (or LoRaWAN can pick up RX data). There are 2 ways to solve that. One is to move those parts of the stack into a separate thread and use mutexes. The other is to use a scheme where you check for a lock and, if it is locked, recheck after a timeout. The latter solution (backoff on contention) is something I have seen used successfully in flight controller software (TauLabs). The solution via a thread is not that attractive for other reasons. If you now have a USB/MSC task, which priority should it have relative to the rest of the system? Is it exposed to the user somehow? Remember, mutexes need priority inheritance, so you had better lay out the priority scheme properly or you end up doing unexpected things.

Perhaps this problem is really just one of shared HW resources, and the others could be handled via the classical event/semaphore/mutex approaches. In any case this fits your classification of the harder-to-resolve part.

For a Mutex class, I'd suggest a priority inheritance based definition that cannot be locked. One of the biggies with Mutex design is lock/unlock order. I'd suggest adding semantics that enforce that the order for unlocking is exactly the reverse of the one for locking. This way a user can detect mistakes early on.

In commercial projects we do use a mutex hierarchy, which means we disallow certain combinations of locks to avoid deadlocks (T0 owns A, now waits for B, while T1 owns B, but waits for A).
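
A sketch of how such a hierarchy could be checked at run time, assuming C++11 `<atomic>` and `thread_local`. The name `RankedMutex` and the rank scheme are invented for illustration: every mutex gets a rank greater than zero, and a thread may only take a mutex whose rank is higher than anything it already holds. The sketch also assumes locks are released in reverse order of acquisition.

```cpp
#include <atomic>

// Hypothetical mutex with a fixed position in a lock hierarchy.
class RankedMutex {
public:
    // Ranks must be greater than 0; rank 0 means "nothing held".
    explicit RankedMutex(int rank) : rank_(rank) {}

    // Returns false, without locking, if taking this mutex would violate
    // the hierarchy (its rank is not above the highest rank already held).
    bool lock() {
        if (rank_ <= highestHeld()) return false;
        while (flag_.test_and_set(std::memory_order_acquire)) { }
        previous_ = highestHeld();
        highestHeld() = rank_;
        return true;
    }

    // Assumes reverse-order release, so restoring the saved rank is enough.
    void unlock() {
        highestHeld() = previous_;
        flag_.clear(std::memory_order_release);
    }

private:
    // Highest rank currently held by the calling thread.
    static int& highestHeld() {
        thread_local int rank = 0;
        return rank;
    }

    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
    const int rank_;
    int previous_ = 0;
};
```

With A at rank 1 and B at rank 2, the T1 side of the deadlock above (own B, then ask for A) is rejected immediately instead of hanging.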

- Thomas


Dennis Lee Bieber

Dec 21, 2017, 2:24:22 PM
to devel...@arduino.cc
On Thu, 21 Dec 2017 10:26:15 -0700, Thomas Roell
<grumpyo...@gmail.com> declaimed the
following:

>For a Mutex class, I'd suggest a priority inheritance based definition that
>cannot be locked. One of the biggies with Mutex design is lock/unlock
>order. I'd suggest adding semantics that enforce that the order for
>unlocking is exactly the reverse of the one for locking. This way a user
>can detect mistakes early on.
>
>In commercial projects we do use a mutex hierarchy, which means we disallow
>certain combinations of locks to avoid deadlocks (T0 owns A, now waits for
>B, while T1 owns B, but waits for A).
>

Unfortunately, deadlock avoidance is something that the programmer must
be able to analyze... Any software support will likely be so bloated it
makes the device unusable; or it will make use of more than one
lock/semaphore/mutex meaningless (if all attempts to lock C requires first
locking B and A, why bother?)

http://greenteapress.com/wp/semaphores/


(Consolidating from the last day or so) ISR should have nothing that
could involve locks -- what use is a "trylock()" going to be? If the ISR
couldn't gain the lock what will it do -- totally ignore the event that
triggered the ISR and return? Set a missed-interrupt flag and return
(basically the same as ignore the event, but defers notification to a
non-ISR routine)? Busy-wait -- in an ISR that should do minimal action and
return?

For an input device interrupt, the actions should probably be: test for
circular buffer full (if full, drop the data and set an overrun event
flag), if not full, insert data to next location and increment "next
pointer", set data-available flag.

Non-ISR thread reading available data tests for buffer empty (do
nothing) or pulls next data and increments read pointer. No locks really
needed as the ISR and reader never modify the same control values; the
data-waiting flag may be the only risk -- it would be used to block the
reader from busy-looping when the reader has emptied the circular buffer.
But the event-flag code itself should probably disable interrupts for the
short period needed to set/clear the event flag, so again no overlapping
shared access.


Output device processing would be similar -- ISR (on device ready)
would pull data from circular buffer; non-ISR would fill buffer
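
The input path described above can be sketched as a single-producer, single-consumer ring buffer. This assumes a C++11 toolchain with `<atomic>`; on a core without it, `volatile` indices plus short interrupt-disabled sections would play the same role. The name `RingBuffer` and its methods are illustrative, and the overrun flag here is a plain boolean rather than a blocking event flag.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Single-producer (ISR) / single-consumer (thread) byte queue.
// One slot is left unused to tell "full" from "empty", so capacity is N - 1.
template <std::size_t N>
class RingBuffer {
public:
    // Called from the ISR. Drops the byte and records an overrun if full.
    bool push(std::uint8_t value) {
        std::size_t head = head_.load(std::memory_order_relaxed);
        std::size_t next = (head + 1) % N;
        if (next == tail_.load(std::memory_order_acquire)) {
            overrun_.store(true, std::memory_order_relaxed);
            return false;
        }
        data_[head] = value;
        head_.store(next, std::memory_order_release);
        return true;
    }

    // Called from the reader thread. Returns false if the buffer is empty.
    bool pop(std::uint8_t& out) {
        std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return false;
        out = data_[tail];
        tail_.store((tail + 1) % N, std::memory_order_release);
        return true;
    }

    bool overrun() const { return overrun_.load(std::memory_order_relaxed); }

private:
    std::uint8_t data_[N] = {};
    std::atomic<std::size_t> head_{0};  // written only by the producer
    std::atomic<std::size_t> tail_{0};  // written only by the consumer
    std::atomic<bool> overrun_{false};
};
```

Because each index has exactly one writer, neither side needs a lock. The output path is the mirror image: the thread calls `push()` and the device-ready ISR calls `pop()`.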



--
Wulfraed Dennis Lee Bieber AF6VN
wlf...@ix.netcom.com HTTP://wlfraed.home.netcom.com/

Thomas Roell

Dec 21, 2017, 2:36:30 PM
to Arduino Developers
Comments embedded.

- Thomas

On Thu, Dec 21, 2017 at 12:24 PM, Dennis Lee Bieber <wlf...@ix.netcom.com> wrote:
> On Thu, 21 Dec 2017 10:26:15 -0700, Thomas Roell
> <grumpyo...@gmail.com> declaimed the
> following:
>
> >For a Mutex class, I'd suggest a priority inheritance based definition that
> >cannot be locked. One of the biggies with Mutex design is lock/unlock
> >order. I'd suggest adding semantics that enforce that the order for
> >unlocking is exactly the reverse of the one for locking. This way a user
> >can detect mistakes early on.
> >
> >In commercial projects we do use a mutex hierarchy, which means we disallow
> >certain combinations of locks to avoid deadlocks (T0 owns A, now waits for
> >B, while T1 owns B, but waits for A).
> >
>
> Unfortunately, deadlock avoidance is something that the programmer must
> be able to analyze... Any software support will likely be so bloated it
> makes the device unusable; or it will make use of more than one
> lock/semaphore/mutex meaningless (if all attempts to lock C requires first
> locking B and A, why bother?)

Actually it's not that tricky to do, but it is something that should be kept in mind. The easier it is for users not to screw up, the better.

Enforcing the lock/unlock sequence: you simply keep a per-thread list of locked mutexes (which you may need anyway for thread termination). When you lock a mutex, it gets prepended to the list; when you unlock, only the head of the list can be unlocked. Avoids a lot of mutex ordering issues ...
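
The per-thread list can be sketched in a few lines, assuming C++11 `<atomic>` and `thread_local`. `OrderedMutex` is an invented name, the spin stands in for real wait resolution, and recursive locking is not handled.

```cpp
#include <atomic>

// Hypothetical mutex that can only be released in reverse locking order.
class OrderedMutex {
public:
    void lock() {
        while (flag_.test_and_set(std::memory_order_acquire)) { }
        next_ = head();   // prepend to the calling thread's list
        head() = this;
    }

    // Returns false, and leaves the mutex locked, unless this mutex is the
    // most recently locked one held by the calling thread.
    bool unlock() {
        if (head() != this) return false;
        head() = next_;
        next_ = nullptr;
        flag_.clear(std::memory_order_release);
        return true;
    }

private:
    // Head of the calling thread's list of held mutexes.
    static OrderedMutex*& head() {
        thread_local OrderedMutex* h = nullptr;
        return h;
    }

    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
    OrderedMutex* next_ = nullptr;
};
```

The bookkeeping costs one pointer per mutex and one per thread, and the same list could be walked to release everything when a thread terminates.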
 

> http://greenteapress.com/wp/semaphores/
>
>
> (Consolidating from the last day or so) ISR should have nothing that
> could involve locks -- what use is a "trylock()" going to be? If the ISR
> couldn't gain the lock what will it do -- totally ignore the event that
> triggered the ISR and return? Set a missed-interrupt flag and return
> (basically the same as ignore the event, but defers notification to a
> non-ISR routine)? Busy-wait -- in an ISR that should do minimal action and
> return?

If the trylock() fails the ISR could for example schedule a timer callback in say 5ms and check again.
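
A sketch of that retry pattern, assuming C++11 `<atomic>`. All names are hypothetical: `spiLocked` stands for the SPI lock, `onRadioIrq()` for the interrupt handler, and `scheduleRetry()` for a platform one-shot timer, which is only counted here rather than armed.

```cpp
#include <atomic>

std::atomic_flag spiLocked = ATOMIC_FLAG_INIT;  // hypothetical SPI bus lock
int transfersDone = 0;
int retriesScheduled = 0;

// Stand-in for a platform timer: a real core would arm a one-shot timer
// that calls onRadioIrq() again after delayMs milliseconds.
void scheduleRetry(unsigned delayMs) {
    (void)delayMs;
    ++retriesScheduled;
}

// Hypothetical interrupt handler that needs the SPI bus briefly.
void onRadioIrq() {
    if (spiLocked.test_and_set(std::memory_order_acquire)) {
        // A thread owns the bus: back off instead of waiting in the ISR.
        scheduleRetry(5);
        return;
    }
    ++transfersDone;  // placeholder for the short SPI transfer
    spiLocked.clear(std::memory_order_release);
}
```

The ISR never blocks: it either does its short transfer or leaves a note to come back.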

If you add locks to SPI you need to be able to at least allow an ISR to detect a "::isLocked()" state ... (which is more or less the same as the trylock ...)
 
> For an input device interrupt, the actions should probably be: test for
> circular buffer full (if full, drop the data and set an overrun event
> flag), if not full, insert data to next location and increment "next
> pointer", set data-available flag.

Yup. So you propose to add the typical event-flags and semaphores to the core concept suggested initially just to do mutexes ;-) And of course a queue object. So we are now again at a full RTOS ?

> Non-ISR thread reading available data tests for buffer empty (do
> nothing) or pulls next data and increments read pointer. No locks really
> needed as the ISR and reader never modify the same control values; the
> data-waiting flag may be the only risk -- it would be used to block the
> reader from busy-looping when the reader has emptied the circular buffer.
> But the event-flag code itself should probably disable interrupts for the
> short period needed to set/clear the event flag, so again no overlapping
> shared access.
>
>
> Output device processing would be similar -- ISR (on device ready)
> would pull data from circular buffer; non-ISR would fill buffer
>
>
>
> --
> Wulfraed Dennis Lee Bieber AF6VN
> wlf...@ix.netcom.com HTTP://wlfraed.home.netcom.com/

cfo64nc

Dec 21, 2017, 2:43:26 PM
to devel...@arduino.cc
Please remove me from this thread/email group/ etc. etc.  Not sure why I keep getting these when I stopped following the Arduino.CC community.



William Westfield

Dec 21, 2017, 7:16:18 PM
to devel...@arduino.cc

> B) A project be started to examine/modify the entire C RTL to make sure that it is thread-safe.

Well, THAT sounds like a never-ending task, fraught with much peril. Which C (and C++, presumably, those parts that might exist) RTL are you talking about? AVR-Libc? gcclib? the compiler internals? (Don’t forget that on AVR “PORTB |= 4;” is probably atomic, but “PORTH |= 4;” isn’t…) The obsolete libsam used on Due? STM SPL/Cube/HAL or whatever is being used on the ST parts? Ditto for the Nordic and Espressif parts? newlib-nano and newlib?

Ugh.
BillW/WestfW

Steve Weinrich

Dec 21, 2017, 8:18:46 PM
to Developers
I was referring to avr-libc.  If the other libraries are applicable (I hope other folk would know better than me), they should be examined as well.  Obviously, like many projects, not every library for every chip has to be completed before one or more chips are fully supported.  The compiler does not have to be thread-safe. As for PORTB, etc., preemptive thread safety does not mean atomic.
Thank you for your input.