Hi developers,
Inspired by the BBBMini project I spend some time in the last months to get the ArduCopter running on a standard RPI A+ and would like to share my experiences now.
My hardware setup is as follow:
Raspberry PI A+ running a Linux PREEMPT_RT kernel, Sensors are the MPU9250\AK8963 and MS5611 connected to the SPI bus, GPS on serial /dev/ttyAPM.
After some source code configurations to compile the ArduCopter project for this setup, I realized many overruns in the 400 Hz main loop by getting performance messages like:
PERF: 57/4000 15373 685
Because of this (heavy) overruns I searched for the reasons and found two:
1. The SPI bus
SPI sensors will be read by ArduPilot's highest prio 1ms timer-thread. This thread then calls the standard Linux SPI driver synchronously. The Linux driver however use it's own kernel thread to perform SPI operations, but the priority of this thread is less than the priority of the 1ms timer thread.
My Conclusion: Using the Linux driver leads to a priority inversion which blocks the whole ArduCopter for occasionally long times. There were two options I thought about to improve the real-time capability:
Using the RPI SPI hardware directly
Using a software bit-banging driver
I focused on bit-banging:
Sensors like the MPU9250 can be clocked up to 20Mhz. Reachable with software bit-banging and hal.gpio methods to get\set gpio pins is between 1.5Mhz and 2.0Mhz on the RPI. With RPI specific gpio access macros, a clock rate between 6.5 Mhz and 7.5 Mhz could be reached. Sofware bit-banging is (as expected) slower than using SPI hardware, but keep in mine only few bytes have to be read each 1ms. First tests shows: bit-baning only with 1.5Mhz (to read the 14 bytes of the MPU9250's sensor) is faster (more than double) as using the Linux SPI driver and (more importend) with a predictable real-time behavior. This was confirmed by performance messages I got with bit-banging:
PERF: 15/4000 3298 1721
What do you think about an integration of a bit-banging SPI driver into ArduPilot?
2. The main Loop priority
The ArduCopter's main loop isn't the highest prior thread and can be interrupted (for example) by the 1ms timer thread and by backround Linux IRQ-threads which handles USB stuff. The main loop however contains a time critical part, which is waiting for the next loop time, reading AHRS values and executing the rate controllers. In detail:
void Copter::loop(){
// wait for an INS sample
ins.wait_for_sample();
...
// IMU DCM Algorithm
// --------------------
read_AHRS();
// run low level rate controllers that only require IMU data
attitude_control.rate_controller_run();
...
I changed the code by increasing the priority temporary as follow:
void Copter::loop(){
→ set the priority to highest possible, go sleeping but return immediately when time is elapsed
// wait for an INS sample
ins.wait_for_sample();
...
// IMU DCM Algorithm
// --------------------
read_AHRS();
// run low level rate controllers that only require IMU data
attitude_control.rate_controller_run();
→ set the priority back to default
With this changes only I
got worst case performance messages (with video broadcasting in the
backround) like this:
PERF: 0/4000 2791 2392
What do you think about temporary increasing the priority of the main-loop thread?
Markus
Hi Tridge,
I can't explain the reasons in detail (I'am not a linux expert), but found a discussion that may be does https://www.raspberrypi.org/forums/viewtopic.php?f=44&t=19489&start=142.
In recent kernel version the SPI driver has obviously been improved, but when reading the whole discussion, I have some doubts that using this driver in real time applications like ArduPilot is the right way. I think the Linux SPI stack was designed with other use cases in mine.
>>I'd like Lucas to look at this as well, but I'm not opposed to having bit-banging as an option.
In this case I will polish my current bit-banging implementation and make a pull request within the next week.
>> this should definately go in, but we should use the hal.scheduler->delay_microseconds_boost() method.
I looked for this function and how it is implemented in the PX4 HAL. A little bit 'magic' for me is the constant “APM_MAIN_PRIORITY_BOOST_USEC”. How was it determined? Includes this time also the execution of the rate controllers on a PX4 with boost priority?
Hi Luca,
thank
you for your answer and sorry for the late response. Last week I
invested the problem more in detail and want to share my findings with
you.
To your questions:
>>which version?
The version I'm using is „3.18.9-rt5+“
>>I've
seen on some rpi-based hardware a huge workload on irq thread
>>when
using PREEMPT_RT. Here the patch set is causing more harm than
>>it's helping with the latency
I checked the workload, but found no aberrations.
>>It
shouldn't use a separate kernel thread. At least not in upstream
>>kernel. It should just configure the spi
controller and be done with
>>that.
I can't verify this behaviour. Please see my scheduler tracing results below.
In
any case, Linux has built-in support for priority inversion,
>>so
even if the RPI-based kernel used a kernel thread, an inversion in
>>the priority should cause the low priority thread to be
boosted
>>automatically.
Yes, but only when using semaphores, if I understand your linked articles correctlly.
>>Besides
that, kernel thread usually run within the 50-99 rt
>>priorities.
And our threads are in the 1-15 range, which means our
>>thread
will never have higher priority than the kernel threads.
>>What does the following command return to you?
>> ps -eL -o tid,comm,class,rtprio| grep -ve "-$"
The command shows:
TID COMMAND CLS RTPRIO
3 ksoftirqd/0 FF 1
10 rcub/0 FF 1
11 rcuc/0 FF 1
13 posixcputmr/0 FF 99
20 irq/65-ARM Mail FF 50
37 irq/16-bcm2708_ FF 50
41 irq/66-VCHIQ do FF 50
44 irq/32-dwc_otg FF 50
45 irq/32-dwc_otg_ FF 50
46 irq/32-dwc_otg_ FF 50
48 irq/24-DMA IRQ FF 50
49 irq/25-DMA IRQ FF 50
50 irq/84-mmc0 FF 50
252 irq/79-bcm2708_ FF 50
3261 irq/80-bcm2708_ FF 50
3385 ArduCopter.elf FF 12
3386 sched-timer FF 15
3387 sched-uart FF 14
3388 sched-rcin FF 13
3389 sched-tonealarm FF 11
3390 sched-io FF 10
3391 irq/83-uart-pl0 FF 50
Not shown by this command are the kworker-threads, which are used in the SPI-Driver.
ps -eL -o tid,comm,class,pri,rtprio shows for example:
3394 kworker/0:0 TS 19 -
3395 kworker/u2:4 TS 19 -
3403 kworker/u2:3 TS 19 -
3411 kworker/u2:1 TS 19 -
3415 kworker/u2:2 TS 19 -
3418 ps TS 19 -
>>As
mentioned above this should really not occur. I think what you are
>>seeing is not due to PI. In order to check a simple
thing to do would
>>be to change the priority of the
kernel thread. This can be as simple
>>as using the chrt
utility. See this example for changing the priority
To find out what's happens exactly, I used the linux kernel tracing capabilities to trace schedule-switches together with the tool kernelshark for visualization.
For this test, a recent 3.4dev version of ArduCopter compiled for „pfxmini“ (release version) was used. Target was a Rasberry PI A+ running at 900Mhz. As before, Sensors are the MPU9250\AK8963 and MS5611 connected to the SPI bus.
The first picture shows the reading sequence of the MPU9250's gyro and acceleration values.
The sequence in detail:
1. The thread 'sched-timer' starts
2. ArduCopter calls ioctrl() of the SPIBus to read sensor data. The message is then queued in the SPI driver, and the sched-timer thread goes sleeping.
3. The kworker-thread, which belongs to the SPI driver queue starts, enqueues the message and starts the transmission together with IRQ (irq/80-bcm2708)
4. At point 4 the thread 'rcin' gets ready and starts immediately, because it has a higher priority as the SPIs kworker-thread and as a result, the SPI communication gets stuck.
5. ioctrl() calls returns
6. sched-timer finished.
Because kworker-threads are running with priorities below the real time threads, all ArduCopter threads may interrupt the SPI communication. Shown below is a worst case picture, where the timer-thread blocks for more then 14ms.
I don't know whether there are any faults in my considerations, but it exactly described my observations.
Markus