Re: [Developers] Six projects for the near future of the Arduino open source project - Data Protection


Matthew Ford

unread,
Jun 13, 2017, 8:41:40 AM6/13/17
to Thomas Roell, vbextreme vbextreme, Carmine Spizuoco, devel...@arduino.cc

On 13/06/2017 10:17 PM, Thomas Roell wrote:
> as then the consumer needs to know about all the special cases
No, only the support library's "free()" method needs to know how to
free/empty the special cases. The user's method does not free the data
arg; that is done by the scheduler on return.
To the user the data just looks like an arg to their processing method.
They would not expect to free an arg, nor expect it to survive past
the end of the processing method.

The idea is to emphasize the message passing and include the data in a
way that makes it easy for the novice user to write simple code that
just works and avoids globals and locks.

As much as possible, all sources of messages+data should appear the same
to the user and be able to be handled by the same style of user processing
code.




Thomas Roell

unread,
Jun 13, 2017, 8:53:34 AM6/13/17
to matthe...@forward.com.au, vbextreme vbextreme, Carmine Spizuoco, devel...@arduino.cc
Not sure whether the discussion about all those details is productive. I guess we first need to answer the core question of what the goal of supporting concurrency should be.

It seems there are 2 fundamentally different approaches on the table:

(1) A classic cooperative scheduler with locks and a per thread stack, that solves the problem of "delay()", "yield()" and blocking IO.

(2) A classic event-based system, where each event triggers one or more stackless threads to be executed in a one-shot manner. The default loop could be the only thread that has a stack, and can use "delay()", "yield()" and blocking IO.


The attractiveness of (2) is that each of those event handlers does not need an extra stack, because only one handler can execute at one point in time. So this model is better suited for smaller devices (say 16k or 32k of SRAM). The downside is that somewhere in the system there is a queue to buffer all the pending events, which needs to be properly sized.

Any deviation from the core one-shot thread model in (2) will more or less lead to (1). Every implementation I had the chance to look at that claims to do (2) with features from (1), but without inheriting the flaws of (1), turned out to be conceptually flawed, or only worked under an undesirably narrow, domain-specific set of circumstances.

If one adds one-shot threads to (1), then (2) can always be implemented on top of (1).

In reality two competing systems use the two approaches:

MicroPython/Python 3: simple cooperative scheduler with locks
Node.js: event-driven dispatcher with IO abstracted behind the application API.

- Thomas
 

On Tue, Jun 13, 2017 at 6:17 AM, Thomas Roell <grumpyo...@gmail.com> wrote:
Matthew,

you described the message-passing process you had in mind: the producer allocates the container for the data via malloc(), and the consumer frees that container again via free() when the data has been processed.

Coming up with a special case for each instance where an ISR wants to be the producer of an event is not acceptable, as then the consumer needs to know about all the special cases, and where an event comes from. Even worse, each time a user wants to code a producer that is triggered via a callback from "attachInterrupt()", they need to special-case that code.

At the end of the day, malloc() / free() does not work for that purpose. 

What you could do is have an event that is a tuple of { uint32_t eventID, void *data } and then let the user decide what "data" is, and how it is allocated and freed, or whether it is unused. But in reality that will lead to a queue or ring-buffer type class implementation for each type of data you are passing around. So you also need to supply a generic queue implementation that is preemption-safe (as this is for the ISR producer, which cannot lock or block). In practice that is no better for the user than having one global queue that data is copied into.
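For illustration, a minimal sketch of that tuple plus a fixed-size, preemption-safe single-producer/single-consumer queue could look like the following. The names Event and EventQueue are placeholders, not an existing API, and it assumes one ISR producer and one loop() consumer on a single core:

    #include <stdint.h>

    struct Event {
      uint32_t eventID;
      void    *data;   // meaning, allocation and lifetime decided by the user
    };

    // Single-producer / single-consumer ring buffer: head is written only by
    // the producer (ISR), tail only by the consumer (loop), so no lock is needed.
    template <typename T, uint8_t N>      // N must be a power of two
    class EventQueue {
      T items[N];
      volatile uint8_t head = 0;
      volatile uint8_t tail = 0;
    public:
      bool push(const T &item) {          // called from the ISR, never blocks
        uint8_t next = (head + 1) & (N - 1);
        if (next == tail) return false;   // full: the event is dropped
        items[head] = item;
        head = next;
        return true;
      }
      bool pop(T &item) {                 // called from loop(), never blocks
        if (tail == head) return false;   // empty
        item = items[tail];
        tail = (tail + 1) & (N - 1);
        return true;
      }
    };

    EventQueue<Event, 8> isrEvents;       // one queue per producer/consumer pair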

Another solution could be to simply give up on passing around data, and just live with some "eventID" being passed around. 

In any case, if the goal is to have one producer and one consumer, in a classic RTOS this problem is solved by a queue object, where the writer is always non-blocking (ISR, task), and the reader may be blocking/waiting (task), or non-blocking (ISR). If the size of the data item is zero, this type of queue simply degenerates into a semaphore. So why reinvent the wheel?

- Thomas

On Tue, Jun 13, 2017 at 1:45 AM, Matthew Ford <Matthe...@forward.com.au> wrote:

For Serial I was thinking of having 2 pre-allocated buffers which are never freed.  At the end of the message processing the call to "free" just marks the buffer as empty.

So: receive a char, put it in buffer 1, and post a msg with buffer 1.

Switch to buffer 2 to hold subsequent received chars while waiting for buffer 1 to be processed and 'freed', i.e. marked as empty.

When buffer 1 is 'freed', post a msg with buffer 2 and switch back to filling buffer 1, etc.
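Sketched out, with a hypothetical postMessage() standing in for whatever the scheduler provides and the "free" just clearing a busy flag, that idea might look roughly like this:

    struct RxBuffer {
      char data[64];
      uint8_t len = 0;
      volatile bool busy = false;    // set when posted, cleared by the "free" call
    };

    RxBuffer buffers[2];
    uint8_t  filling = 0;            // index of the buffer currently being filled

    void postMessage(RxBuffer *buf); // hypothetical scheduler call (assumption)

    void onSerialRxChar(char c) {    // e.g. called from the receive interrupt
      RxBuffer &cur = buffers[filling];
      if (cur.len < sizeof(cur.data)) cur.data[cur.len++] = c;

      if (!buffers[filling ^ 1].busy) {   // the other buffer has been 'freed'
        cur.busy = true;
        postMessage(&cur);                // hand this buffer to the consumer
        filling ^= 1;                     // keep collecting into the other buffer
        buffers[filling].len = 0;
      }
    }

    // The support library's "free" never releases memory; it just marks the
    // buffer as empty so the receiver can switch back to it.
    void freeRxBuffer(RxBuffer *b) { b->busy = false; }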


On 13/06/2017 2:49 PM, Thomas Roell wrote:
Ad iii). The implied use of malloc for an event producer poses a problem with system generated events from the ISR level. I'd assume we'd want to have events like "SERIAL_DATA_AVAILABLE".  That also means that any kind of queueing operation needs to work without the use of malloc() (no list entries via new(), no queue resizing).

Multiple consumers are typically used for barriers.

On Mon, Jun 12, 2017 at 8:58 PM, Matthew Ford <Matthe...@forward.com.au> wrote:
It seems to me that most RTOS projects concentrate on scheduling and largely ignore data consistency.

i.e. it is left to the user to work out the locking etc.

I have a couple of suggestions to assist novice users to avoid corrupted data without the complication of locks, so avoiding deadlocks.

i) Non-preemptive tasks (on a single CPU) ensure data consistency between blocking operations, without locks.

ii) Static locals can be used instead of globals in a lot of cases, for better data security without locks.

iii) Message passing which includes data can be used to protect data in a simple manner, from the user's point of view.
The idea for passing data via msgs is that the producer calls new() and passes that ptr with the message, BUT does not keep a copy of it.
The consumer ALWAYS frees the pointer on exit.
The consumer should be able to call a support method to find the size of the data, since the malloc() management must know this.
Of course, being C, the user can always shoot themselves in the foot, but it requires extra code, i.e. explicitly saving the new() pointer in a global variable.

This should work in many cases.
The exception is where there are multiple consumers.
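A sketch of that ownership-transfer convention, with postMessage() and READING_EVENT as hypothetical placeholders rather than an existing API, might look like:

    struct Reading { uint32_t timestampMs; int value; };

    enum : uint32_t { READING_EVENT = 1 };              // hypothetical event id
    void postMessage(uint32_t eventID, Reading *data);  // hypothetical scheduler call

    // Producer: allocates the container, posts it, and keeps no copy of the pointer.
    void produceReading(int value) {
      Reading *r = new Reading{ (uint32_t)millis(), value };
      postMessage(READING_EVENT, r);    // ownership passes with the message
    }

    // Consumer: the data arrives as an ordinary argument and is ALWAYS freed on exit.
    void onReading(uint32_t eventID, Reading *r) {
      Serial.println(r->value);         // use the data, do not save the pointer
      delete r;                         // consumer frees; nothing survives the call
    }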

Question: How many common use cases are there where there are multiple consumers of complex data, i.e. data bigger than an int (which can be carried with the msg)?

micro:bit's ManagedStrings (https://lancaster-university.github.io/microbit-docs/data-types/string/#managedstring),
which are immutable, would also be useful.






Andrew Kroll

unread,
Jun 13, 2017, 9:31:50 AM6/13/17
to matthe...@forward.com.au, vbextreme vbextreme, Carmine Spizuoco, devel...@arduino.cc
Here, I just wrote your scheduled tasking; it just needs something like priorities, but... it really isn't that hard to write it yourself.

/* Simple, super light dumb 'tasks' example.
 *  By Andrew J. Kroll
 */


#define MAX_TASKS 10

int Task_ID = 0;

void (*Tasks[MAX_TASKS])(void);

void setup() {
        Serial.begin(9600);
        // Init tasks to NULL.
        for(int i=0; i < MAX_TASKS; i++) {
                Tasks[i] = NULL;
        }
       
        Tasks[0] = &task_A;
        Tasks[1] = &task_B;
}

void loop() {
        if (Tasks[Task_ID] != NULL) Tasks[Task_ID]();
        Task_ID++;
        if(Task_ID == MAX_TASKS) Task_ID = 0;
}

// Simple non-blocking task 'A'
void task_A() {
        // if we have enough room, send id for task A
        if(Serial.availableForWrite()> 4) Serial.println(Task_ID);
}

// Simple non-blocking task 'B'
void task_B() {
        // if we have enough room, send id for task B
        if(Serial.availableForWrite()> 4) Serial.println(Task_ID);
}
--
Visit my github for awesome Arduino code @ https://github.com/xxxajk

Thomas Roell

unread,
Jun 13, 2017, 9:42:11 AM6/13/17
to Andrew Kroll, matthe...@forward.com.au, vbextreme vbextreme, Carmine Spizuoco, devel...@arduino.cc
Andrew,

this looks all nice and cool, but it does not solve the problem of blocking IO, "delay()" or "yield()".

- Thomas

Andrew Kroll

unread,
Jun 13, 2017, 9:47:49 AM6/13/17
to Thomas Roell, matthe...@forward.com.au, Carmine Spizuoco, vbextreme vbextreme, devel...@arduino.cc
Neither does the original loop() function solve these problems... a non-blocking loop() is easy to accomplish by using math on millis(). If anything, this won't break anything out in the wild currently. Sure, it doesn't do a lot, but then there isn't much to it.
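For instance, the usual "math on millis()" pattern slots straight into one of the task functions above (pin and interval are arbitrary, and pinMode() is assumed to have been set in setup()):

    // Blink without delay(): a non-blocking task body built on millis() arithmetic.
    void task_blink() {
            static unsigned long last = 0;
            static bool ledOn = false;
            unsigned long now = millis();
            if (now - last >= 500) {   // unsigned subtraction copes with rollover
                    last = now;
                    ledOn = !ledOn;
                    digitalWrite(LED_BUILTIN, ledOn ? HIGH : LOW);
            }
    }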

Steve

unread,
Jun 13, 2017, 9:51:28 AM6/13/17
to Andrew Kroll, Thomas Roell, matthe...@forward.com.au, Carmine Spizuoco, vbextreme vbextreme, devel...@arduino.cc
Sorry all - I am getting these emails by mistake. The unsubscribe link fails and tells me I am not a group member.

Can one of the admins unsubscribe me?

Cheers,

Steve e
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Paul Stoffregen

unread,
Jun 13, 2017, 10:03:17 AM6/13/17
to devel...@arduino.cc
My hope is to create a simple API which is basically option (2) "classic event based system" you've described.  My main goal is to offer a solution to many (but not all) concurrency needs with an API that has the sort of simplicity you'd expect from Arduino.

While my main interest is to integrate with traditional (single threaded with interrupt handlers) Arduino code, it should be generic enough to be implemented together with a cooperative scheduler or preemptive RTOS with zero modification to libraries and little or no changes in sketches.

API design can be done top-down (focusing on what the user sees) or bottom-up (considering low level details).  Good bottom-up thinking is critical for efficiency & performance.  I'm really glad Thomas and others here are so good at thinking this way.  But for just a moment I'd like to describe this from a top-down perspective.

Perhaps a typical usage might look like this:

    EventScheduler myevent;

    void setup()
    {
      pinMode(13, OUTPUT);
      myevent.attach(dosomething);
    }

    void loop()
    {
      Wire.requestFrom(112, 2, myevent);
      digitalWrite(13, HIGH);
      delay(250);
      digitalWrite(13, LOW);
      delay(250);
    }

    void dosomething(EventSchedulerRef event)
    {
      int num = event.getStatus();  // the final return status of Wire.requestFrom
      if (num == 2) {
        byte b1 = Wire.read();
        byte b2 = Wire.read();
        // do something with the data
        // default attach means we get called
        // from non-interrupt context, so we
        // can use Serial.print, String and
        // non-reentrant Arduino libs.
      }
    }

The main API addition is EventScheduler (for lack of a better name at this point).  The idea is users can attach their function to it.  More details on that in a moment... keeping this top-down.  Libraries like Wire, SPI, Bounce, timers and anything that generates "events" can take an EventScheduler object as a parameter.  When the event occurs, the library or other code calls an EventScheduler method to emit the event.

A key API design point is libraries can accept an EventScheduler reference and call a well defined method when they wish to trigger the event.  All the low-level code that deals with the many thorny low-level details already discussed on this thread lives within the EventScheduler in the core library (or perhaps partially within the API library after the upcoming "chainsaw" project).  Libraries like Wire can be written once to this simple API, which allows them to automatically adapt to whatever concurrency approach is actually used.

Zero code changes to libraries like Wire for differing concurrency implementations is a primary goal of this API proposal.

The one part of this API that probably does change a bit is myevent.attach(function).  At least a few different attach() methods should be defined.  The simplest default "attach" should avoid calling the user's code from interrupt context.  How exactly the user's code is called is a bottom-up detail to be discussed.  Different boards might implement the default using different approaches, depending on their resources and how they provide concurrency in other ways.

Perhaps an attachImmediate() method could cause EventScheduler to instantly call the user's code, even in interrupt context.  As we've seen from serialEvent, some people really do need this rapid response.  Maybe another attachLoopEnd() method might give the lowest, least-overhead performance.  There should probably also be some sort of attachThread() method that allows specifying a thread context, perhaps with a fallback taking a void or template parameter so sketches written with it at least compile and fall back to the default when compiled on non-RTOS Arduino boards.  Whatever set of attachXYZ methods are "officially" defined by the API, each core library can choose to actually implement the ones it doesn't support by just calling back to the default.  The point is user code must always compile and use the default if a special method isn't supported.
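As a rough illustration only (none of this is an existing core API; the names follow the proposal above), the default/immediate split with a fallback might be shaped something like:

    class EventScheduler {
    public:
      typedef void (*handler_t)(EventScheduler &);

      // Default: the handler runs later, from non-interrupt context.
      void attach(handler_t fn)          { handler = fn; immediate = false; }
      // Optional variant: run the handler right away, possibly from an ISR.
      // A core that does not support it can simply map it onto attach().
      void attachImmediate(handler_t fn) { handler = fn; immediate = true; }

      // Called by a library (e.g. Wire) when the event occurs.
      void emit(int result) {
        status = result;
        if (immediate && handler) handler(*this);
        else pending = true;             // picked up later by dispatch()
      }
      // Called by the core from yield() / the end of loop().
      void dispatch() {
        if (pending && handler) { pending = false; handler(*this); }
      }
      int getStatus() const { return status; }

    private:
      handler_t handler = nullptr;
      volatile bool pending = false;
      bool immediate = false;
      int status = 0;
    };

With a shape like this, the Wire.requestFrom(112, 2, myevent) call in the example above would simply invoke myevent.emit(count) when the transfer completes, and the core would run the attached handler at the next dispatch point.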

I've intentionally glossed over many important details.  The point of this message was to look at this from a top-down perspective.  Hopefully this makes some sense?


Thomas Roell

unread,
Jun 13, 2017, 11:07:46 AM6/13/17
to Paul Stoffregen, Arduino Developers
Thanx Paul.

I was about to write almost the very same e-mail ;-)

Here's a quick paper / slide set that is worth reading: https://web.stanford.edu/~ouster/cgi-bin/papers/threads.pdf

The point that John makes is that while threads (and preemptive scheduling) have their place and are probably more generic, a pure event model is simpler to understand and work with starting off. With threads all the complexity hits you right up front.

Paul brought up a very good example with an async Wire transfer. Simply add an API that takes some event argument so that the system knows what event to generate, pass by reference, avoid any fancy lookup.

Another key idea that needs to be reiterated is that you want 2 types of event handlers. One is "queued" where you wait till the main loop gives up control, and one is "unqueued", where the event handler gets called right away, perhaps from within an interrupt context. "myEvent.attach(callback, bool queued = true)".

Consequence of this would be that "serialEvent()" is a queued callback to a serial-data-event-ready, implicitly.

To be backwards compatible the current "loop()" code needs to pass control to the event-dispatcher only at defined points:

     blocking IO (aka Wire.endTransmission() or Wire.requestFrom())
     delay()
     yield()
     after exiting loop()

I'd suggest having the "myEvent" somehow include a number of entries to be queued, with the default being 1. It's tricky to size a global queue, but a programmer will have a reasonable understanding of how many events need to be queued to not overflow the queue for a given event. This could also be the place where one plugs in how much data per event is transferred. Since in the classic event model event handlers finish in sequence, the queue for each event is really just a plain FIFO, which is a simple problem to solve.

Besides a "myEvent" object being passed to the event handler, and perhaps a data/size pair, it would make sense also to include a timestamp of some sorts. Perhaps "micros()" a the point of time when the event was posted. 

I do believe in this model there is no locking required for "loop()" giving up control.


There are some downsides, and it's fair to list them:

- an event handler cannot use blocking IO, only async IO where later on an event is generated upon completion

- an event handler cannot use "delay()" as this would recursively call the event dispatcher

- an event handler cannot call "yield()" as this would recursively call the event dispatcher

- for complex problems a sequence of event handlers will be needed, realizing a state machine ... at that point classic threads may be easier to deal with

The model does have room to grow. A cooperative scheduler can be plugged in by viewing queued event handlers as lightweight tasks.

(trying not to include too many implementation details; it seems that, keeping to that scope, all issues can be solved consistently)



Now why do I recommend this model over a pure cooperative scheduler (or a preemptive one)?

Really it boils down to 3 main ideas besides "too much rope to hang yourself". On a classic RTOS one of the critical problems is how you get data out of an ISR to a thread. A queue? A semaphore? An event group? What do you do about race conditions (lost events)? The problem is utterly non-trivial. A consistent event publish/subscribe model solves that.

The other one is a detail that Paul added sneakily. If an event can trigger an immediate callback at the interrupt context level, that means there is one level of preemption. IO in general needs to be responded to very quickly. Pure cooperative scheduling cannot do that down to the timeframe required. But this single level of preemption allows for that. In research papers it's called a hybrid event model, and the immediate callbacks are often called fibers.

Lastly, by having a "loop()" that can call "yield()", an application can solve the "too long running event handler" problem by putting exactly this code into the main "loop()" and splitting the runtime via "yield()" into smaller segments.

- Thomas 





 


Paul Stoffregen

unread,
Jun 13, 2017, 11:34:34 AM6/13/17
to Thomas Roell, Arduino Developers
On 06/13/2017 08:07 AM, Thomas Roell wrote:
> - a event handler cannot use blocking IO, only async IO where later on
> a event is generated upon completion
> - a event handler cannot use "delay()" as this would recursively call
> the event dispatcher
> - a event handler cannot call "yield()" as this would recursively call
> the event dispatcher

I was imagining the implementation for single-threaded systems would use a static boolean to prevent recursive event handler calls. A check for available stack memory might also be done on boards with very small memory size, to prevent calling handlers when yield() is called from deeply nested functions.
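A minimal sketch of that guard, assuming a hypothetical runQueuedEventHandlers() provided by the core (not an existing Arduino function):

    void runQueuedEventHandlers();          // assumed core hook that drains the queue

    void yield() {
      static bool inEventHandler = false;   // single-threaded: one flag is enough
      if (inEventHandler) return;           // delay()/yield() inside a handler: no recursion
      inEventHandler = true;
      runQueuedEventHandlers();             // dispatch pending events from loop context
      inEventHandler = false;
    }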

Of course users still should avoid these things within their event
handlers. But some almost certainly will. When they do, I'm confident
we can craft an implementation that limits the damage to only poor
performance that's easy to understand based on their own code.
Detecting and preventing infinite recursion is something we can
implement pretty easily.


(and you folks posting about "Free Arduino" drama on this developer list
are not helping Arduino... you're only causing disruption to important
conversations that hopefully will lead to improving Arduino for everyone)

Álvaro Lopes

unread,
Jun 13, 2017, 11:49:52 AM6/13/17
to Thomas Roell, Paul Stoffregen, Arduino Developers
My two cents, perhaps a bit sparse.

First of all, most Arduino and embedded systems in general have memory limitations. Memory is a scarce resource, and must be smartly managed.
This raises a couple of issues.

First issue is that most microcontrollers do not ship with an MPU (Memory Protection Unit), so we cannot detect any kind of memory overrun, and we cannot protect
the memory areas of one "thread", "process" or similar concept from interference by another.

Second issue is that for threads/threadlets/processes/whatever a stack is often needed, and typical microcontrollers do not have any "smart" way to manage stack
- you either allocate it manually (with a fixed size, fixed location) or dynamically (with a fixed/variable size, but variable location). Overwriting other
thread stacks is the most usual source of problems, and it's hard to debug, hard to understand, and sometimes very hard to fix.

Those two issues need to be addressed in software, somehow.

Now, concurrency.

In the safety-critical area (where I work) all couplings need to be understood. This involves control coupling, data coupling, and external coupling at least.
From the coupling analysis we can infer when a certain variable or memory area (let's call it a resource) can be accessed "simultaneously" by two threads
(living in the same CPU or not, it's not relevant), by an interrupt routine, an exception handler, and so on.

Now, Massimo brought my attention to the arduino-preprocessor using clang+llvm. This is not only excellent news, but it can also provide us some more
information on the user code, which we can eventually use to minimize issues with concurrent memory accesses - we just need to extract all data+control coupling
from the clang+llvm core, extract the callgraphs, identify whether there are any shared resources between them, and change their type to "atomic". Whether this is easy, I
am not sure - nor whether it is worth the effort, but it is technically possible.

Scheduling details are, well, details. In fact the scheduler itself is not that important.

Since llvm+clang is to be used, I think we can extract much more information from it to help the user code to run flawlessly, and even optimize async/sync IO.
And to automatically size the "thread" stacks, if we are to use those - AFAIK we do not support dynamic stack changes (like asprintf and friends), so each
function should have a fixed stack size, and the callgraph can (to some extent) tell you how much the thread stack can grow.

I am sure other benefits will also arise from using LLVM.

Alvie

Thomas Roell

unread,
Jun 13, 2017, 11:50:14 AM6/13/17
to Paul Stoffregen, Arduino Developers
Good questions.

Suppose "delay()" in a event handler could degenerate into a busy wait. So yes, guards could be in place. On the other hand it's perhaps time to prevent things that in any user application need to be fixed anyway, like blocking IO in an ISR. Adding this event based concurrency model is perhaps a good point in time document those fundamentally bad things, and simply don't allow them going forward.

"Detecting and preventing infinite recursion is something we can implement pretty easily."

Things of that nature could be detected in the implementation. But how do you feed this back to the programmer ? 

- Thomas

vbextreme vbextreme

unread,
Jun 13, 2017, 11:50:24 AM6/13/17
to Paul Stoffregen, devel...@arduino.cc, Thomas Roell
Checking available stack memory is not very simple on all architectures.
It is safer to run the primary task on the stack and to place the secondary tasks that help it in the heap or in static memory.
128 bytes for a secondary task works fine, but resizing it at runtime is very complicated.
You need to discard the original stack/heap and implement a new memory model, and you lose the standard malloc.
And if you use the heap for a task, when its stack runs out and you try to realloc, but malloc changes the original address, how do you resolve the pointer addresses?

Matthew Ford

unread,
Jun 13, 2017, 2:03:49 PM6/13/17
to Thomas Roell, Arduino Developers
On 14/06/2017 1:07 AM, Thomas Roell wrote:
There are some downsides, and it's fair to list them:

- an event handler cannot use blocking IO, only async IO where later on an event is generated upon completion

- an event handler cannot use "delay()" as this would recursively call the event dispatcher

- an event handler cannot call "yield()" as this would recursively call the event dispatcher

- for complex problems a sequence of event handlers will be needed, realizing a state machine ... at that point classic threads may be easier to deal with

I don't agree that these downsides exist from the user's point of view.

Blocking IO is OK -- post a message with the data to send and sleep this method (a fiber in micro:bit terms) until the output is sent (the fiber is woken up again).

delay()/sleep() does not seem to be recursive at all in the micro:bit code. It just puts the method to sleep. Same for yield() == sleep(0).

State machines are not implicit in the basic design but, as you say, may need to be implemented by the user for complex problems.

Here are the permissible options for event handlers in micro:bit.

Seems to cover most cases and is selectable by the user at coding time, per listener.

There are four permissible modes for event handlers. These are:

MESSAGE_BUS_LISTENER_IMMEDIATE - Handler is called directly from the code raising the event. The event handler is not permitted to block.
MESSAGE_BUS_LISTENER_DROP_IF_BUSY - Handler is executed through its own fiber. If another event arrives whilst the previous event is still being processed, the new event is silently dropped.
MESSAGE_BUS_LISTENER_QUEUE_IF_BUSY - Handler is executed through its own fiber. If another event arrives, it is queued, and the event handler will immediately be called again once processing is complete. (default)
MESSAGE_BUS_LISTENER_REENTRANT - Every event is executed in its own fiber. If another event arrives, it is handled concurrently in its own fiber.
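For reference, subscribing with one of these modes in the micro:bit runtime looks roughly like this (recalled from the lancaster-university docs linked above; check there for the exact signatures):

    #include "MicroBit.h"

    MicroBit uBit;

    // Runs in its own fiber under the default QUEUE_IF_BUSY mode, so blocking
    // calls such as sleep() are allowed inside the handler.
    void onButtonA(MicroBitEvent e) {
        uBit.display.scroll("A");
    }

    int main() {
        uBit.init();
        // The last argument selects the threading mode from the table above.
        uBit.messageBus.listen(MICROBIT_ID_BUTTON_A, MICROBIT_BUTTON_EVT_CLICK,
                               onButtonA, MESSAGE_BUS_LISTENER_QUEUE_IF_BUSY);
        release_fiber();   // hand this fiber back to the scheduler
    }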

Thomas Roell

unread,
Jun 13, 2017, 2:08:44 PM6/13/17
to Paul Stoffregen, Arduino Developers
Paul, one more question where your insight might be helpful.

A pure event-model-based implementation requires the use of non-blocking IO to be remotely efficient. That means that at the end of the day, having the main loop doing blocking IO will not offer a real opportunity for overlapping waiting for IO with other useful work. In all reality the main "loop()" will degenerate into one big "delay(forever)", as it has no real purpose anymore. Is this asking too much of a user?

Sorry to raise this again, but the whole issue is so multifaceted that there does not seem to be a simple obvious good solution.

- Thomas


Matthew Ford

unread,
Jun 13, 2017, 9:05:50 PM6/13/17
to Thomas Roell, Arduino Developers
On 14/06/2017 4:24 AM, Thomas Roell wrote:
> The implementation in micro:bit allocates a stack-swap area behind the
> application's back resulting in unbound memory consumption
You obviously don't think much of micro:bit's design, but 1 million
school children are successfully using it, so in my eyes it serves its
purpose. But that does not mean we cannot do it better.

Ignore the implementation details for now and let's just decide what the
user sees, and work from there.

Below is my sample user code for blink.
There are other possible extensions that would allow for parameterized tasks
and setup and pulldown methods, but for now this is what I think a basic
method should look like.
We really need to agree on something regarding the user-facing code/API first,
before delving into the limitations/compromises of real code.
Your comments about the micro:bit implementation are not the point. The
point is what features we want, and only then how best to implement them.

On 14/06/2017 4:24 AM, Thomas Roell wrote:
> the implementation is set up to randomly crash if it's out of memory.
It is not hard to get Arduino to run out of memory either. Once there
is agreement on what the user code looks like, we can go on to
considering how to protect them from themselves or at least notify them
when something blows up.

On 14/06/2017 4:24 AM, Thomas Roell wrote:
> It would be way more useful if the code that tries to send an event
> would understand that it did not get send
OK, what does that look like to the user?


////////////////////////////////// example blink code //////////////////////////////////////////
#include <ArduinoTasks.h>

void setup() {
  pinMode(LED_BUILTIN, OUTPUT);

  // put your setup code here, to run once:
  ArduinoTask blinkTask = ArduinoTask.create(blinkHandler); // default mode: non-reentrant, blocking allowed
  // if blinkSetup is used, then call ArduinoTask.create(blinkHandler, blinkSetup, intArg)

  ArduinoTask.start(blinkTask);
  // if blinkSetup has args, add them here
  // i.e. ArduinoTask.start(blinkTask, taskArg);
  // Task arg is just an int for simplicity
  // if an int is not enough then can always index a global array of blinkTask args structs using the int
  // but in most cases an int will do. (could have void* instead but...)
}

// if loop is defined it is run repeatedly
// when cpu not otherwise occupied
// defining loop will prevent auto-sleep
//void loop() {
//}

// this task never exits
void blinkHandler(TaskEvent *evt) {
  while (1) {
    digitalWrite(LED_BUILTIN, HIGH); // turn the LED on (HIGH is the voltage level)
    sleep(1000);                     // wait 1 sec and release the cpu for other tasks
    digitalWrite(LED_BUILTIN, LOW);  // turn the LED off by making the voltage LOW
    sleep(1000);                     // wait 1 sec and release the cpu for other tasks
  }
}

Thomas Roell

unread,
Jun 13, 2017, 10:50:36 PM6/13/17
to matthe...@forward.com.au, Arduino Developers
Matthew,

It's not only the implementation that is the problem, it's the concept that is fundamentally flawed. You cannot do lazy stack allocation for one-shot threads at context switch time on a memory-constrained device. It may be interesting on larger-scale devices with an MMU where you can plug memory into a virtual address range after the fact. On small devices like the Arduino targets you are fighting OOM as well as internal fragmentation of the heap. 

This may work on a few small-scale examples that have enough memory left over, or have a cyclic pattern where by chance the memory allocations end up in a static state. But in general there is no guarantee that you won't hit an OOM (and crash) after a while. Again, the concept of micro:bit is fundamentally flawed.

The example that you provided is actually a normal cooperative scheduler, not an event-driven system. That implies private thread stacks and at least a set of locks. So you end up with an API along the lines of "_thread" or "threading" from Python 3. 

- Thomas

Matthew Ford

unread,
Jun 14, 2017, 3:34:04 AM6/14/17
to Arduino Developers

What I want from this multi-tasking API/Library is

i) A non-preemptive task method that may block.  This lets me update globals without using locks and lets me intuitively use delay() as and when I wish, and to access slow IO from within the task.

ii) Tasks driven by messages that pass the associated data as an argument to the task, or within the message instance (i.e. not using globals or locks).  As with normal method arguments, I don't want to be bothered with memory recovery or disposal of these args when the task method exits.  The library should do whatever is necessary behind the scenes.

iii) IO that is available in both blocking and async (i.e. send it a message) forms, so I can choose which one suits my task.  IO includes Serial, Wire, SPI, external memory, etc.

iv) Don't force me to even know about the existence of locks, semaphores, synchronization etc just to code a simple blink task and a task that handles IO.   Sure I may need these when I start doing  fancy complex sketches, but I should be able to code most tasks without explicit access to, or knowledge of, them.

The code example previously posted, below, is still an example of what I would like to use. 

For use on the Arduino UNO, I don't care if I can only run a couple of tasks and if there is a strict limit to the number of pending messages.  I already have libraries and sketches that won't run on UNO.  That's just life.

The progress in uC development is much faster than the speed at which we can sort out what to do here and get this API/library coded. 
We should not be too worried about uC memory limitations as they will quickly disappear.
What we really need to do is devise a multi-tasking coding style that is simple, straightforward, elegant and as intuitive as possible for non-programmers to use.  Lots of Arduino users are non-programmers.

I think this makes my position clear and I am bowing out of this discussion now.

I look forward to those "on high" making a decision on the way forward.



Paul Stoffregen

unread,
Jun 14, 2017, 7:43:10 AM6/14/17
to devel...@arduino.cc
On 06/14/2017 12:34 AM, Matthew Ford wrote:

What I want from this multi-tasking API/Library is

i) A non-preemptive task method that may block.  This lets me update globals without using locks and lets me intuitively use delay() as and when I wish, and to access slow IO from within the task.

ii) Tasks driven by messages that pass the associated data as an argument to the task, or within the message instance (i.e. not using globals or locks).  As with normal method arguments, I don't want to be bothered with memory recovery or disposal of these args when the task method exits.  The library should do whatever is necessary behind the scenes.

iii) IO that is available in both blocking and async (i.e. send it a message) forms, so I can choose which one suits my task.  IO includes Serial, Wire, SPI, external memory, etc.

iv) Don't force me to even know about the existence of locks, semaphores, synchronization etc just to code a simple blink task and a task that handles IO.   Sure I may need these when I start doing  fancy complex sketches, but I should be able to code most tasks without explicit access to, or knowledge of, them.

The code example previously posted, below, is still an example of what I would like to use. 


Maybe you could explain with a more substantial example?  As nearly as I can tell, the example below only creates 1 thread which merely blinks a LED.

I read it again just now.  I didn't see updating of globals without locks (i).  I didn't see any message passing stuff (ii).  I didn't see any use of blocking or async I/O, not to mention a mix of the two to suit different needs (iii).  I did see blinking a LED, but not together with any task handling I/O (iv).  On this 4th point, I believe I have a good mental picture.  But I'm struggling to see how you envision the first 3 working.


For use on the Arduino UNO, I don't care if I can only run a couple of tasks and if there is a strict limit to the number of pending messages.  I already have libraries and sketches that won't run on UNO.  That's just life.

The progress in uC development is much faster than the speed at which we can sort out what to do here and get this API/library coded. 
We should not be too worried about uC memory limitations as they will quickly disappear.
What we really need to do is devise a multi-tasking coding style that is simple, straightforward, elegant and as intuitive as possible for non-programmers to use.  Lots of Arduino users are non-programmers.


I absolutely agree with this sentiment.  I only wish I could envision this simple multi-tasking coding style you have in mind and how it works in Arduino's already very well established ecosystem.


I think this makes my position clear and I am bowing out of this discussion now.


I've re-read every message you've written on this thread.  I even spent some time looking through the nearly 2000 lines of C++ code you shared for serial message parsing.

Your position is crystal clear regarding esteem for Microbit.  I've looked at their documentation.  I might even pick up a board and play with it.  They do seem to have some good ideas.  But I do share Thomas's concern about memory usage.

Your position about message passing is also very clear, at least in an abstract sense.  But how that could work on Arduino, integrating with the already well established APIs, isn't clear at all, at least not to me.

Paul Stoffregen

unread,
Jun 14, 2017, 8:47:31 AM6/14/17
to Thomas Roell, Arduino Developers
What I've proposed probably shouldn't be considered a "pure" event model.  Perhaps it could be used that way in some glorious future where all libraries gain non-blocking event-based functions (covering all needed usage cases).  If someone were to craft a 100% event driven program, then indeed they would have an empty loop() function.

Typical event usage probably won't look so "pure".  It's almost certain to be mixed together with the existing blocking and non-blocking APIs we already have.  In fact, I'd imagine many users will probably never use events directly, but will reap their benefits by using libraries that do.

For example, someone might build a project using both a GPS and an IMU.  Maybe they find some blocking GPS example code.  Maybe they install a motion sensing library that uses events to keep up with a rapid pace from the IMU, to feed low-jitter gyro data into a Kalman filter.  Today the GPS example that works fine without other stuff would tend to block servicing the IMU.  But if the Wire library gets event-based functions, maybe Adafruit or someone else will make a great IMU library that leverages events to "just work" even when used together with that blocking GPS, of course assuming it blocks in some way where yield() gets called.  The user probably never knows about events or the finer points of blocking vs non-blocking I/O.  They just combine two examples they find and both magically work together the same as each does alone.

My end goal is exactly the opposite of asking too much of users.  Events should add one more option.  They're not meant to be mandatory.  They're not meant to become the dominant API on Arduino.  Usage will rarely be a "pure" event model where loop() has no real purpose.

I'm guessing the most compelling use cases will be within libraries, to make them far more interoperable with delay-based code.

Todd Krein

unread,
Jun 14, 2017, 9:02:13 AM6/14/17
to Paul Stoffregen, Thomas Roell, Arduino Developers

+1

 

I like the idea that libraries can be “fixed” w/o the user having to know anything; it just works better.


Thomas Roell

unread,
Jun 14, 2017, 9:03:02 AM6/14/17
to matthe...@forward.com.au, Arduino Developers
Some more technical followup to the micro:bit model.

First off, I do find it intriguing as it seems to be rather user-friendly to start off with. My real grief is the implementation and some of the conceptual consequences. But then again, one should never give up early.

micro:bit uses a lightweight thread model, in that only one real stack is used. If a thread blocks it allocates a swap area and copies the current stack contents into this swap area. When it is unblocked again, the saved contents are copied back to the stack and execution can continue. The key problem to solve is really "how to allocate this swap area". Normal malloc is undesirable, due to the fact that it tries to allocate a linear chunk, which leads to heap fragmentation over time. However, what if this allocation is done over smaller chunks of, say, 256 bytes? If the allocation/deallocation is done via fixed blocks it does not fragment. An implementation like this would require that some space is set aside for the swap space. So that would need to be made configurable somehow.
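For illustration, the fixed-block pool could be as simple as the following sketch. Block size and count are the configurable values mentioned; a larger stack image would be saved across several blocks (e.g. a linked chain), and none of this is an existing implementation:

    #include <stdint.h>
    #include <stddef.h>

    #define SWAP_BLOCK_SIZE  256
    #define SWAP_BLOCK_COUNT 8            // configurable swap space: 8 x 256 bytes

    static uint8_t swapPool[SWAP_BLOCK_COUNT][SWAP_BLOCK_SIZE];
    static uint8_t swapUsed[SWAP_BLOCK_COUNT];    // 0 = free, 1 = in use

    void *swapBlockAlloc(void) {
      for (uint8_t i = 0; i < SWAP_BLOCK_COUNT; i++) {
        if (!swapUsed[i]) { swapUsed[i] = 1; return swapPool[i]; }
      }
      return NULL;                        // out of swap space: report it, don't crash
    }

    void swapBlockFree(void *p) {
      for (uint8_t i = 0; i < SWAP_BLOCK_COUNT; i++) {
        if (p == swapPool[i]) { swapUsed[i] = 0; return; }
      }
    }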

micro:bit simply crashes on OOM. I think one can do better. In reality there are only a handful of APIs that block and trigger the allocation of the swap area. As Paul pointed out, "yield()" could simply not yield and "delay()" could do busy waiting, as could every blocking IO. However there would need to be some way to tell the application that it ran out of swap space ...

Next to the event distribution. micro:bit uses the idea of launching a new thread (out of the thread pool) for every subscribed event. If a thread then subsequently blocks, you run the risk of an unbound swap space requirement. A way around that could be an event queue, where threads are only launched for an event if the previous invocation of the thread has terminated. The problem with that queue however is that events get retired out of order (because previous invocations of the threads can terminate out of order). That means that each entry within that queue has to have a fixed size, because otherwise the queue would fragment. micro:bit uses a 3x uint32_t event body, so for that scheme (which does not pass around message data) this queue would work. An implementation like this would require that some space is set aside for the event queue. So that would need to be made configurable somehow.

So a concept like micro:bit could be implemented in a way that avoids many of the pitfalls the current real micro:bit implementation has. However it requires that the application configures the size of the swap space and the size of the event queue (defaults might be in place though). It should also be understood that because of the use of swapping the stack contents, rather than swapping the stack pointer, this scheme is very slow when it comes to context switching. 

- Thomas


vbextreme vbextreme

unread,
Jun 14, 2017, 9:51:54 AM6/14/17
to Thomas Roell, matthe...@forward.com.au, devel...@arduino.cc
Swap is extremely slow.

John Plocher

unread,
Jun 14, 2017, 1:34:48 PM6/14/17
to Arduino Developers
Is the elephant in the room the fact that most of these ideas are impractical or impossible on "low end" chips, but more and more "required" on the high end?

One size fits all is a good concept, unless it can't.  On a 'tiny45, and even an UNO, none of this discussion about threads and RTOS and stacks matters because, well, space.  As a user, I need to focus intently on the design tradeoffs required to make *my* code fit on these low end systems; layers of abstractions and glue are explicitly unwelcome...

On the other hand, with a multi-core, multi-MB ESP32 or ARM...

The types of problems we (users) solve with low end devices are significantly different from the ones we solve at the high end, so it follows that the environments we use could or maybe even should be different.

If we acknowledge the fact that some chips can't support "arduinoAPI.next", and others can, we can focus efforts on the latter, and not cripple the new design by forcing it to be used where it can't really work, isn't really needed and probably isn't wanted... 

"Use Arduino 1.x for [list of devices based on 8 bit AVR/PIC cores], use Arduino 2.x for more powerful ones" is not a bad statement to make, especially if it is made intentionally and with a plan...

  -John

bob

unread,
Jun 14, 2017, 2:00:53 PM6/14/17
to Developers
And the fact is, it is quite easy to write multi-tasking applications on tiny45 without any RTOS (coding without delay()!). All that is really required to make life easier for users is to ensure no library calls can block.

I have been following the thread with interest. I am comfortable with Comp sci concepts (I is one), but I am wondering how to marry simplicity of use with simplicity of implementation; I suspect there is an inverse relationship. Allowing blocking calls inevitably means context switching, allocating task stacks, locks, etc, and then you are on the slippery slope to a full-fledged RTOS.

I don't think users will ever be able to write multi-tasking code, without some level of understanding of multi-tasking. We can't hide it completely from the user.

Thomas Roell

unread,
Jun 14, 2017, 2:32:00 PM6/14/17
to bob, Developers
Bob,

... the slippery slope ... that's kind of the real biggy ;-)

It seems that there are 2 groups of users of such a facility in general.

One is the one who implements a library and needs to make use of concurrency in some form behind the main sketch execution. USB comes to mind, or a non-blocking GNSS library, or a LoRaWAN library.

The other one is the one where a sketch wants to hide the waiting time in blocking IO or "delay()" and let something useful be done in that time. This scenario may not need user visible locks if APIs and interactions are properly designed. However there may need to be synchronization primitives (messages, events, or semaphores ....) between threads/ISRs.

Then of course there is the ugly hybrid case, BLE. There, in pretty much every implementation, you register a callback for an event that gets called somehow, either in an interrupt context or within the main "loop()". 

So the real tricky part is how to address all of that. The library case would prefer a pure event based system, while the user case probably would want to use cooperatively scheduled threads.

- Thomas 
  

Andrew Kroll

unread,
Jun 14, 2017, 2:38:48 PM6/14/17
to Thomas Roell, bob, Developers
Yes, it can be a  slippery slope. Perhaps the correct way to do this is in layers.

Layer 1: You have a preemptive scheduler with all its power and caveats as your 'base', for use with libraries and power users.
Layer 2: handles any major pitfalls
Layer 3: for doing simple things, without diving into the details and pitfalls, handling things such as atomic reads/writes via a macro or method call.

So basically, 3 libraries.


Thomas Roell

unread,
Jun 14, 2017, 3:02:12 PM6/14/17
to Andrew Kroll, bob, Developers
Layering is a good idea. Perhaps a variant thereof:

Layer 1: Access to SWI or PendSV based fibers for libraries. This is what USB Host 3 is using, this is what the STM32L4 core is using for USB/MSC. What is needed in both cases is some sort of timeout facility to have time triggered fibers.

Layer 2: Have a cooperative thread scheduler with a minimal API.

Layer 3: Have an event-queue-based system on top of layers 1 and 2 that makes concurrency accessible without core knowledge of the other layers.


If I'd code a library then I'd want to make sure that any cooperative scheduling at the "loop()" level does not screw around with my functionality (latency requirements and such). While conversely as a user sketch, I would not want to deal with obscure side-effects that a preemptive scheduler may have for parts of the system that I am not aware of.

- Thomas
  

Mauro lesto

unread,
Jun 15, 2017, 1:51:07 AM6/15/17
to Thomas Roell, Andrew Kroll, bob, Developers

As John said, an OS/scheduler makes no sense or is just impractical for low-end micros; so can we first draw a line on what is supposed to be supported? And how will low-power stuff be handled? That is really important for understanding the possible solutions. (I guess it would be nice to keep full support for the ATmega328.)

Also, nobody is talking about the issue of data synchronization and starvation between tasks, which is one of the most common errors I see from people using threads without knowing what they are doing.

If we ditch low-end MCUs, then my guess is to use an existing RTOS like FreeRTOS or ChibiOS or whatever is the most complete (and open) out there, so both projects can help each other.

Someone proposed an event-driven approach; this is extremely interesting in my opinion, as we approach zero-cost abstraction (for hw interrupts) and we "just" have to implement the missing events in software.
Of course NO blocking functions, as everything should be handled by events, and this also requires library updates (and is a good excuse to clean up the API standard).






