Dummy questions from a newbie regarding how to use DMA

pozz

Aug 22, 2016, 4:30:16 AM

Until now I have never used a microcontroller with an embedded DMA
controller, so I have always used the CPU to move data from memory to memory,
from memory to peripheral, and from peripheral to memory.

These micros usually have only a two-byte hardware FIFO (the byte currently
shifting in and the previous, completely received byte).
So I often implement a software FIFO buffer for receiving data from the UART. In
the receive ISR, I move data from the peripheral (UART) to
memory (FIFO buffer).
I implement a uart_getchar() function that checks whether a new byte is
present in the FIFO and returns that byte or EOF. This function is
called from the application (background), not from the foreground.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>   /* EOF */

#define ROLLOVER(x, max) (((x) + 1) >= (max) ? 0 : ((x) + 1))

uint8_t rx_buff[32];
size_t rx_size = sizeof(rx_buff);
volatile size_t rx_in;
size_t rx_out;

/* ISR of a new received character */
void isr_rx(void) {
    uint8_t c = UART_DATA;  // received byte; UART_DATA stands for the device's Rx data register
    size_t i = ROLLOVER(rx_in, rx_size);
    if (i == rx_out) return; // Rx FIFO full, discard character
    rx_buff[rx_in] = c;
    rx_in = i;
}

/* uart_getchar() function */
int uart_getchar(void) {
    if (rx_out == rx_in) return EOF; // FIFO empty
    unsigned char data;
    data = rx_buff[rx_out];
    rx_out = ROLLOVER(rx_out, rx_size);
    return data;
}

Now I'm starting to use new microcontrollers that embed a DMA engine, for
example the SAM C21 Cortex-M0+ from Atmel (I think many Cortex-M0+ micros
out there integrate a DMA engine).

So I'm asking whether it is possible to avoid the ISR completely and use the
DMA to move received characters into the FIFO buffer.

I read the datasheet, but I couldn't find a solution to my problem.

First of all, the destination memory address can be fixed or
auto-incrementing, but there isn't a mechanism to wrap around (the FIFO
buffer is a *circular* array, so after pushing N bytes, the address
should start again from the beginning).
Maybe I have to configure the DMA for a transaction with a byte count
equal to the size of the FIFO buffer. When the last byte is received,
the transaction ends and an interrupt can be raised (if correctly
configured). In the relevant ISR, a new DMA transaction can be
started (with the destination memory address equal to the beginning of
the FIFO buffer).

Another issue is how the application (background) could know how many
bytes are present in the FIFO buffer so it can pop and process them as
it wants.

raimond....@gmail.com

Aug 22, 2016, 5:50:00 AM

You can link descriptors in a circular manner.
Using two, pointing to each other, is the simplest.
You can know how many bytes are there by address arithmetic,
you have access to the dma descriptors addresses.
For interrupts, use terminal count interrupts.
Better granularity if you use many "small" descriptors...
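
The two-descriptor ring described above can be modelled on a host PC. The sketch below is only an illustration under stated assumptions: dma_desc_t, dma_sim_t and the dma_* functions are hypothetical names, the descriptor is a simplified model and not the register layout of any real controller, and the model assumes the engine works on a private copy of each descriptor so that the linked descriptors keep their count. Whether a given DMA controller behaves that way has to be confirmed in its datasheet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HALF 4u   /* bytes per descriptor; the FIFO is two halves */

/* Simplified, hypothetical transfer descriptor. */
typedef struct dma_desc {
    uint16_t btcnt;               /* bytes in this block */
    uint8_t *dst;                 /* first destination byte of the block */
    const struct dma_desc *next;  /* descriptor fetched when this block ends */
} dma_desc_t;

/* Model of the engine's private working copy of the active descriptor. */
typedef struct {
    const dma_desc_t *cur;
    uint16_t remaining;
    uint8_t *wr;
} dma_sim_t;

uint8_t fifo[2u * HALF];
dma_desc_t desc[2];

/* Link the two descriptors so that each one points to the other. */
void dma_link_ring(void)
{
    desc[0] = (dma_desc_t){ HALF, &fifo[0],    &desc[1] };
    desc[1] = (dma_desc_t){ HALF, &fifo[HALF], &desc[0] };
}

/* Load a descriptor into the working copy. */
void dma_sim_start(dma_sim_t *s, const dma_desc_t *first)
{
    assert(first != NULL && first->btcnt > 0);
    s->cur = first;
    s->remaining = first->btcnt;
    s->wr = first->dst;
}

/* One received byte: store it, and fetch the linked descriptor at block end. */
void dma_sim_byte(dma_sim_t *s, uint8_t c)
{
    *s->wr++ = c;
    if (--s->remaining == 0) {
        dma_sim_start(s, s->cur->next);
    }
}
```

In this model the write position wraps from the end of the second half back to the start of the first without any CPU involvement, and smaller descriptors would simply give more frequent block-end events.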

pozz

Aug 22, 2016, 6:15:56 AM

On 22/08/2016 11:49, raimond....@gmail.com wrote:
> You can link descriptors in a circular manner.
> Using two, pointing to each other, is the simplest.

Should the two transfer descriptors (linked together in a circular
manner) be completely identical?

The transfer descriptor includes the total number of data (bytes) of the
block transfer (the BTCNT register). This value is automatically
decremented at each new data transfer. At the end, BTCNT will be zero,
so it can't be used *as is* for the next linked descriptor.

I should "re-arm" the BTCNT register when a transfer descriptor ends,
shouldn't I?

> You can know how many bytes are there by address arithmetic,
> you have access to the dma descriptors addresses.

I think I have to check the BTCNT register in SRAM (transfer descriptor),
which initially stores the total number of data for that "block
transfer" but is automatically decremented as new data are transferred.

The application should keep count of the number of bytes already
popped from the FIFO.

while((sizeof(FIFObuf) - BTCNT -
number_of_bytes_already_popped_from_FIFO) > 0) {

new_byte = FIFObuf[number_of_bytes_already_popped_from_FIFO++];
// process new_byte
}

The only problem I see here is when one transfer is just finished and
the new transfer (identical to the previous and linked from it) is just
started.

The BTCNT used in the arithmetic above shouldn't come from the transfer
descriptor currently in use, but from the previous one, until all the bytes
in the FIFO have been popped by the application.
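
One way around the restart problem, sketched here under assumptions, is to keep a read index that wraps modulo the FIFO size instead of a running total of popped bytes. The function name fifo_available() and its parameters are hypothetical; "remaining" stands for a snapshot of a down-counter such as BTCNT, and the sketch assumes the application always drains the FIFO before the DMA has written a full buffer's worth of new bytes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * size      : FIFO size in bytes
 * remaining : snapshot of the transfer's down-counter (0..size)
 * rd        : application read index (0..size-1)
 * Returns the number of bytes written by the DMA and not yet popped.
 */
size_t fifo_available(size_t size, size_t remaining, size_t rd)
{
    assert(size > 0 && remaining <= size && rd < size);

    /* Position the DMA will write next; a count of 0 or size both mean index 0. */
    size_t wr = (size - remaining) % size;

    /* Distance from read index to write index, going around the ring. */
    return (wr + size - rd) % size;
}
```

Because both indices wrap, the result stays correct across the moment when one transfer ends and the identical linked transfer starts. The price is that a completely full FIFO is indistinguishable from an empty one, so the application has to poll often enough.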

> For interrupts, use terminal count interrupts.

In my barebone OS, I usually "poll" the presence of new bytes in the
FIFO through the uart_getchar(), so I don't use interrupts.

> Better granularity if you use many "small" descriptors...

Could you explain?

rickman

Aug 22, 2016, 6:27:30 AM

On 8/22/2016 6:15 AM, pozz wrote:
> On 22/08/2016 11:49, raimond....@gmail.com wrote:
>> You can link descriptors in a circular manner.
>> Using two, pointing to each other, is the simplest.
>
> Should the two transfer descriptors (linked together in a circular
> manner) completely identical?

That's up to you and your messages. Think of it as a ping-pong buffer.
You don't need to even fill the buffer. There should be a time out
somewhere so if no more chars are received, an interrupt is generated to
say "a message is waiting". Then the descriptor pointer is changed and
the next buffer is used for the next message.

Otherwise you need to fill the buffer which may be part of a message or
more than one message. If you have a fixed message size, Bob's your uncle.

--

Rick C

pozz

Aug 22, 2016, 7:03:55 AM

On 22/08/2016 12:27, rickman wrote:
> On 8/22/2016 6:15 AM, pozz wrote:
>> On 22/08/2016 11:49, raimond....@gmail.com wrote:
>>> You can link descriptors in a circular manner.
>>> Using two, pointing to each other, is the simplest.
>>
>> Should the two transfer descriptors (linked together in a circular
>> manner) completely identical?
>
> That's up to you and your messages. Think of it as a ping-pong buffer.
> You don't need to even fill the buffer. There should be a time out
> somewhere so if no more chars are received, an interrupt is generated to
> say "a message is waiting". Then the descriptor pointer is changed and
> the next buffer is used for the next message.

I'd like to use the DMA to manage the transfer from UART to a temporary
FIFO buffer. In this process, the details about the protocol (message
size, fixed or not, preamble, start of frame, end of frame, ...) aren't
known.

The idea is simply: a byte has been received, put it here so that the
application (background) can read it and move it to its final destination
(pop it from the FIFO buffer) when it has free time.

> Otherwise you need to fill the buffer which may be part of a message or
> more than one message. If you have a fixed message size, Bob's your uncle.

No, I don't have a fixed message size.

David Brown

Aug 22, 2016, 7:29:31 AM

Technically, rx_buff and rx_out should be volatile too, but it is
unlikely for it to be a problem (your compiler would have to be inlining
multiple copies of the uart_getchar function - possible with link-time
optimisation - and do some legal but unlikely re-ordering).

And having rx_size as a variable is going to be inefficient compared to
using a compile-time constant and making ROLLOVER a mask.
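
A minimal sketch of the compile-time constant and mask idea, assuming a power-of-two buffer size. RX_SIZE, RX_MASK, fifo_put() and fifo_get() are names introduced here for illustration; fifo_put() stands in for what the receive ISR would do with the byte it has read.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>   /* EOF */

#define RX_SIZE 32u                /* must be a power of two */
#define RX_MASK (RX_SIZE - 1u)

static_assert((RX_SIZE & RX_MASK) == 0, "RX_SIZE must be a power of two");

volatile uint8_t rx_buff[RX_SIZE];
volatile uint32_t rx_in;    /* written only by the producer (ISR) */
volatile uint32_t rx_out;   /* written only by the consumer (application) */

/* Producer side: returns 0 on success, -1 if the FIFO is full. */
int fifo_put(uint8_t c)
{
    uint32_t next = (rx_in + 1u) & RX_MASK;   /* wrap with a single AND */
    if (next == rx_out) return -1;            /* full, discard character */
    rx_buff[rx_in] = c;
    rx_in = next;
    return 0;
}

/* Consumer side: same contract as uart_getchar(). */
int fifo_get(void)
{
    if (rx_out == rx_in) return EOF;          /* FIFO empty */
    uint8_t data = rx_buff[rx_out];
    rx_out = (rx_out + 1u) & RX_MASK;
    return data;
}
```

As in the original code, one slot is always left unused so that a full FIFO can be told apart from an empty one; a 32-byte buffer therefore holds at most 31 bytes.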

>
> Now I'm starting using new microcontrollers that embed a DMA engine, for
> example SAM C21 Cortex-M0+ from Atmel (I think many Cortex-M0+ micros
> out there integrate a DMA engine).
>
> So I'm asking if it is possible to avoid completely the ISR and use the
> DMA to move received character in the FIFO buffer.
>
> I read the datasheet, but I couldn't found a solution to my problem.
>
> First of all, the destination memory address could be fixed or
> auto-incrementing, but there isn't a mechanism to wrap-around (FIFO
> buffer is a *circular* array, so after pushing N bytes, the address
> should start again from the beginning).

My experience is only with Freescale's DMA engines (on Kinetis ARMs and
MPCs), which have support for circular buffers for precisely this
reason. It seems a strange omission for Atmel to have no similar support.

> Maybe I have to configure the DMA for a transaction with a byte count
> equals to the size of the FIFO buffer. When the last byte is received,
> the transaction ends and an interrupt could be raises (if correctly
> configured). In the relevant ISR, a new DMA transaction could be
> started (with the destination memory address equals to the beginning of
> the FIFO buffer).

That sounds about right.

Depending on the type of communication you are expecting, and the
resources you have, it might be possible to simply have a large enough
buffer to support all legal incoming packets.

>
> Another issue is how the application (background) could know how many
> bytes are present in the FIFO buffer so it can pop and process them as
> it wants.

Again, I don't know Atmel's DMA engine, but usually there are values you
can read (byte count, current destination pointer, etc.) that will help
here.

And you also want to check if you really /need/ the DMA here. If the
processor is not overloaded, an ISR can be simpler and easier - and
often it does not matter if that "wastes" a few percent of your cpu
capacity. DMA on transmit is often very simple, but for receive the
complications of timings and timeouts can make the DMA less of a win.

Dave Nadler

Aug 22, 2016, 7:45:09 AM

On Monday, August 22, 2016 at 7:03:55 AM UTC-4, pozz wrote:
> ...No, I don't have a fixed message size.

A variable sized message means you have to periodically poll
the results, which means stopping the DMA and likely restarting
DMA with a different buffer.

Even for fixed size messages, you'll need to do this in case
a character gets lost!

The fun starts when the DMA behavior is not guaranteed if you
start/stop during transfers: I've worked with chips where the
DMA was unusable because characters would be dropped whilst
switching (ie old NEC V25).

Calculate whether the classic ISR technique's overhead is
such that using DMA is really required. It should not be, but it can
be if ISR latency is large and the FIFO is small, so
that a classic ISR may drop characters at high speed.
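
The calculation can be sketched as follows. The function names are hypothetical, 8N1 framing (10 bits on the wire per byte) is assumed, and the ISR cost in cycles is something you would have to measure on your own target.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per second on the wire, assuming 8N1 framing (10 bits per byte). */
uint32_t uart_bytes_per_second(uint32_t baud)
{
    return baud / 10u;
}

/* CPU share spent in the receive ISR, in hundredths of a percent. */
uint32_t isr_load_centipercent(uint32_t baud, uint32_t cycles_per_isr,
                               uint32_t f_cpu_hz)
{
    assert(f_cpu_hz > 0);
    uint64_t cycles_per_s =
        (uint64_t)uart_bytes_per_second(baud) * cycles_per_isr;
    return (uint32_t)(cycles_per_s * 10000u / f_cpu_hz);
}

/* Time between two consecutive bytes, in nanoseconds (8N1 assumed). */
uint32_t uart_byte_time_ns(uint32_t baud)
{
    assert(baud > 0);
    return (uint32_t)(10ull * 1000000000ull / baud);
}
```

With example figures of 115200 baud, 100 cycles per interrupt and a 48 MHz clock, the load comes out at 240 hundredths of a percent, i.e. 2.4% of the CPU, and a new byte completes roughly every 86.8 microseconds. That byte time is also a rough bound on the tolerable worst-case ISR latency when the hardware holds only one completed byte.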

Hope that helps!
Best Regards, Dave

pozz

Aug 22, 2016, 8:42:32 AM

On 22/08/2016 13:45, Dave Nadler wrote:
> On Monday, August 22, 2016 at 7:03:55 AM UTC-4, pozz wrote:
>> ...No, I don't have a fixed message size.
>
> A variable sized message means you have to periodically poll
> the results, which means stopping the DMA and likely restarting
> DMA with a different buffer.

Why stop the DMA? The DMA should silently work in the background, even if
I'm checking its results.

pozz

Aug 22, 2016, 8:49:42 AM

I'm not sure, maybe it is possible to automatically resume the transfer
descriptor as soon as the transfer is complete... in this way you will
have a circular buffer managed by DMA without any CPU intervention.

>> Maybe I have to configure the DMA for a transaction with a byte count
>> equals to the size of the FIFO buffer. When the last byte is received,
>> the transaction ends and an interrupt could be raises (if correctly
>> configured). In the relevant ISR, a new DMA transaction could be
>> started (with the destination memory address equals to the beginning of
>> the FIFO buffer).
>
> That sounds about right.
>
> Depending on the type of communication you are expecting, and the
> resources you have, it might be possible to simply have a large enough
> buffer to support all legal incoming packets.
>
>>
>> Another issue is how the application (background) could know how many
>> bytes are present in the FIFO buffer so it can pop and process them as
>> it wants.
>
> Again, I don't know Atmel's DMA engine, but usually there are values you
> can read (byte count, current destination pointer, etc.) that will help
> here.

I see.

> And you also want to check if you really /need/ the DMA here. If the
> processor is not overloaded, an ISR can be simpler and easier - and
> often it does not matter if that "wastes" a few percent of your cpu
> capacity. DMA on transmit is often very simple, but for receive the
> complications of timings and timeouts can make the DMA less of a win.

What do you mean by "timings" and "timeouts"? I think those
complications must be handled even with the "simple" ISR-only approach.

David Brown

Aug 22, 2016, 9:24:16 AM

Indeed they do need to be implemented in the ISR-only approach - but
then you have fewer bits that need to interact.

For example, if you have an incoming telegram and you want to react
quickly when it has been completely received, an ISR can give you that
easily. When the interrupt for the final character arrives, you trigger
your handling action. But if you are using a DMA then it will just be
one more character in the buffer - you don't get an interrupt until the
buffer is full. So you need additional methods, such as timer
interrupts, to regularly poll the DMA buffer to see if the final
character has come in.
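
A minimal sketch of such timer-based polling, under assumptions: rx_idle_t and rx_idle_poll() are hypothetical names, the function is meant to be called from a periodic timer tick, and "wr" is the DMA write index derived from whatever counter or pointer the controller exposes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    size_t   last_wr;     /* write index seen at the previous tick */
    unsigned idle_ticks;  /* consecutive ticks without new data */
    bool     pending;     /* data arrived since the last reported gap */
} rx_idle_t;

/*
 * Returns true exactly once when data has arrived and the line has then
 * been quiet for idle_limit consecutive ticks (a likely end of message).
 */
bool rx_idle_poll(rx_idle_t *st, size_t wr, unsigned idle_limit)
{
    assert(st != NULL && idle_limit > 0);

    if (wr != st->last_wr) {          /* the DMA stored new bytes */
        st->last_wr = wr;
        st->idle_ticks = 0;
        st->pending = true;
        return false;
    }
    if (st->pending && ++st->idle_ticks >= idle_limit) {
        st->pending = false;          /* report each gap only once */
        st->idle_ticks = 0;
        return true;
    }
    return false;
}
```

The reaction time is bounded by the tick period times idle_limit, which is the extra latency compared with an ISR that sees the final character directly. The sketch also cannot notice an exact multiple of the buffer size arriving between two ticks, since the write index then looks unchanged.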

So a DMA on receive is good for some things, but adds complications for
other tasks.

DMA on transmit, however, is usually very simple because you know
exactly how much you are sending. It gets "fun" if you want to add to
the DMA buffer while the DMA is running - synchronising between DMA and
the processor is not always a simple task. It is really easy to make
something that works fine most of the time (and during all your
testing), yet fails if the DMA triggers half-way through the buffer add
function.

pozz

Aug 22, 2016, 10:36:52 AM

On 22/08/2016 15:24, David Brown wrote:
> On 22/08/16 14:49, pozz wrote:
>> On 22/08/2016 13:29, David Brown wrote:
>
>>> And you also want to check if you really /need/ the DMA here. If the
>>> processor is not overloaded, an ISR can be simpler and easier - and
>>> often it does not matter if that "wastes" a few percent of your cpu
>>> capacity. DMA on transmit is often very simple, but for receive the
>>> complications of timings and timeouts can make the DMA less of a win.
>>
>> What do you mean with "timings" and "timeouts"? I think those
>> complications must be implemented even with the "simple" ISR-only approach.
>>
>
> Indeed they do need to be implemented in the ISR-only approach - but
> then you have fewer bits that need to interact.
>
> For example, if you have an incoming telegram and you want to react
> quickly when it has been completely received, an ISR can give you that
> easily. When the interrupt for the final character arrives, you trigger
> your handling action.

Oh, I get your point now. However, I have never needed to trigger an
action quickly after receiving a message from the UART. I usually use
the UART for non-real-time tasks.
Anyway, the processing of an incoming message can take some time (CRC
check, destination address, ...).

> But if you are using a DMA then it will just be
> one more character in the buffer - you don't get an interrupt until the
> buffer is full. So you need additional methods, such as timer
> interrupts, to regularly poll the DMA buffer to see if the final
> character has come in.

In my ISR-only implementation, I already use this approach. In the main
task I poll the FIFO buffer (continuously or with a timer) to check for
new incoming data.

> So a DMA on receive is good for some things, but adds complications for
> other tasks.
>
> DMA on transmit, however, is usually very simple because you know
> exactly how much you are sending. It gets "fun" if you want to add to
> the DMA buffer while the DMA is running - synchronising between DMA and
> the processor is not always a simple task. It is really easy to make
> something that works fine most of the time (and during all your
> testing), yet fails if the DMA triggers half-way through the buffer add
> function.

I see.

Tim Wescott

Aug 22, 2016, 1:40:16 PM

On Mon, 22 Aug 2016 14:42:29 +0200, pozz wrote:

> On 22/08/2016 13:45, Dave Nadler wrote:
>> On Monday, August 22, 2016 at 7:03:55 AM UTC-4, pozz wrote:
>>> ...No, I don't have a fixed message size.
>>
>> A variable sized message means you have to periodically poll the
>> results, which means stopping the DMA and likely restarting DMA with a
>> different buffer.
>
> Why stopping the DMA? DMA should silently work in background, even if
> I'm checking for its results.

Well, if you mean "should" as in "is morally obliged to", yes.

But if you mean "should" as in "can be reasonably expected to" --
unfortunately, no.

Bottom line -- you can't count on your assumed behavior for a part to be
the designer's assumed behavior for the part, or for the part's actual
behavior to match either of the assumed behaviors. You just have to read
the data sheet (or "user's guide") closely, and then hope that you
haven't missed anything.

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

I'm looking for work -- see my website!

pozz

Aug 23, 2016, 4:21:38 AM

On 22/08/2016 19:40, Tim Wescott wrote:
> On Mon, 22 Aug 2016 14:42:29 +0200, pozz wrote:
>
>> On 22/08/2016 13:45, Dave Nadler wrote:
>>> On Monday, August 22, 2016 at 7:03:55 AM UTC-4, pozz wrote:
>>>> ...No, I don't have a fixed message size.
>>>
>>> A variable sized message means you have to periodically poll the
>>> results, which means stopping the DMA and likely restarting DMA with a
>>> different buffer.
>>
>> Why stopping the DMA? DMA should silently work in background, even if
>> I'm checking for its results.
>
> Well, if you mean "should" as in "is morally obliged to", yes.
>
> But if you mean "should" as in "can be reasonably expected to" --
> unfortunately, no.

I think you don't have to stop the DMA to check how many bytes have already
been transferred into the FIFO buffer.
If there are any, the application can pop those bytes from the FIFO buffer
into an application-level buffer where the protocol is able to work out
whether the message is complete, corrupted, and so on.
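
The popping step could look like the sketch below, under assumptions: fifo_drain() is a hypothetical helper, "wr" is a snapshot of the DMA write index taken once before the call, and the DMA is assumed to keep running and never to lap the read index between polls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy the bytes between *rd and wr out of the circular FIFO into a linear
 * application buffer, advancing *rd. Returns the number of bytes copied.
 */
size_t fifo_drain(const volatile uint8_t *fifo, size_t size, size_t wr,
                  size_t *rd, uint8_t *out, size_t out_max)
{
    assert(fifo != NULL && rd != NULL && out != NULL);
    assert(size > 0 && wr < size && *rd < size);

    size_t n = 0;
    while (*rd != wr && n < out_max) {
        out[n++] = fifo[*rd];
        *rd = (*rd + 1) % size;   /* wrap around the end of the FIFO */
    }
    return n;
}
```

Only the read index is modified, and only by the application, so nothing here needs the DMA to be paused. Bytes that arrive after the snapshot are simply picked up by the next call.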