Kernel driver accessing PRU shared RAM


Mark A. Yoder

Jul 29, 2016, 3:40:21 PM
to BeagleBoard
How do I access the PRUs' shared RAM from within a kernel driver?  Is there an equivalent to mmap()?

Can you point me to some tutorials?

--Mark

ZeekHuge

Jul 30, 2016, 10:44:06 AM
to BeagleBoard

You can probably declare it with an attribute, something like this:


__far __attribute__((cregister("PRU_SHAREDMEM", near))) volatile uint32_t variable_1;
__far __attribute__((cregister("PRU_SHAREDMEM", near))) volatile uint32_t variable_2;

// Somewhere in the code:
variable_1 = data;
variable_2 = _2_data;


So basically, the compiler allocates memory for these variables somewhere in the shared mem, and you have no control over where they get allocated. So if you wish to share data between the two PRUs, you can probably define the variables in a common file and then use those exact same variables in the two source files.

I haven't actually tried this way. Please let us know if this works.

William Hermans

Jul 30, 2016, 1:58:11 PM
to beagl...@googlegroups.com

I do not think so . . . because somehow you have to "tap into" the kernel's virtual memory pool, *or*, if the peripheral in question is not already represented in virtual memory, you need to use something like ioremap(). Read the first answer here: http://unix.stackexchange.com/questions/239205/whats-the-difference-between-ioremap-and-file-operation-mmap

A few search hits I found yesterday on the subject mention reading LDD3 chapters 9 and 15 . . . but I saw no examples, so I did not mention anything. Plus, I have no personal hands-on experience . . . but I do find it an interesting question.

Charles Steinkuehler

Jul 30, 2016, 2:04:59 PM
to beagl...@googlegroups.com
On 7/29/2016 2:40 PM, Mark A. Yoder wrote:
> How do I access the PRUs' shared RAM from within a kernel driver? Is there an
> equivalent to mmap()?

http://stackoverflow.com/questions/7894160/accessing-physical-memory-from-linux-kernel

> Can you point me to some tutorials?

Chapter 15 of LDD3:

https://lwn.net/Kernel/LDD3/
https://lwn.net/images/pdf/LDD3/ch15.pdf

The BeagleLogic code probably does what you're wanting to do
(high-speed kernel mode access to the PRU):

https://github.com/abhishek-kakkar/BeagleLogic

--
Charles Steinkuehler
cha...@steinkuehler.net

William Hermans

Jul 30, 2016, 3:48:08 PM
to beagl...@googlegroups.com
DMA in my mind almost seems the way to go, but there are several key points that I'd need to understand to make that determination. At the moment I'm imagining all kinds of cool possibilities . . . too bad I do not have the time to invest in looking into this right now.

Charles Steinkuehler

Jul 30, 2016, 4:54:43 PM
to beagl...@googlegroups.com
On 7/30/2016 2:47 PM, William Hermans wrote:
>
> DMA in my mind almost seems the way to go. But there are several key points that
> I'd need to understand to make that determination.But at the moment I'm
> imagining all kinds of cool possibilities. . . too bad I do not have the time to
> invest into looking into this right now.

DMA is not really necessary, as the PRU can read/write to the ARM
system DRAM and the ARM can read/write to the PRU memories. There are
some ways DMA could improve performance of a high-performance
application using both the ARM and the PRU heavily, but it's not a
clear win in all cases.

However, any kernel-level physical memory access for talking to the
PRU is going to have a lot in common with doing DMA. You need to map
physical addresses into logical memory space, issue fence instructions
to guarantee memory coherency, etc. Basically, the PRU can be
considered a "custom" DMA controller, in that it is something other
than the application processor that is accessing and changing main
memory contents. The usage semantics for talking to the PRU in kernel
space are very similar to using DMA. Just 's/DMA/PRU/g' and you won't
go too far wrong! ;-)

--
Charles Steinkuehler
cha...@steinkuehler.net

William Hermans

Jul 30, 2016, 5:13:10 PM
to beagl...@googlegroups.com
On Sat, Jul 30, 2016 at 1:54 PM, Charles Steinkuehler <cha...@steinkuehler.net> wrote:
DMA is not really necessary, as the PRU can read/write to the ARM
system DRAM and the ARM can read/write to the PRU memories.  There are
some ways DMA could improve performance of a high-performance
application using both the ARM and the PRU heavily, but it's not a
clear win in all cases.

However, any kernel-level physical memory access for talking to the
PRU is going to have a lot in common with doing DMA.  You need to map
physical addresses into logical memory space, issue fence instructions
to guarantee memory coherency, etc.  Basically, the PRU can be
considered a "custom" DMA controller, in that it is something other
than the application processor that is accessing and changing main
memory contents.  The usage semantics for talking to the PRU in kernel
space are very similar to using DMA.  Just 's/DMA/PRU/g' and you won't
go too far wrong!  ;-)

--
Charles Steinkuehler
cha...@steinkuehler.net

Well, what I was thinking (and perhaps this would have more of a home on the BeagleBoard X15) is that you have a PRU reading data in some fashion from an external peripheral at high speed, and then you want to get that data out of the PRU shared memory as fast as possible to some external storage.

With the Beaglebones, you're going to be limited by your fastest block device or interface. In the case of the Beaglebone, that would probably be USB, which (stock) is not really much faster than Ethernet in terms of real-world performance.

But on the X15, where you have dual GbE, PCIe, USB 3.0, and SATA, you have much faster external "storage". In this case, you might want a DMA buffer in kernel space that blasts this data directly onto the storage peripheral, all while keeping the CPU load as low as possible for other potential duties.

But as I said, I have no practical hands-on experience here; the theory just seems possible at first glance.

William Hermans

Jul 30, 2016, 5:17:52 PM
to beagl...@googlegroups.com
Please do also realize that I left out MMC media on purpose in the context of high-speed "permanent" storage. Then the GPMC connected to *something* could possibly be of use too.

ZeekHuge

Jul 31, 2016, 12:58:43 AM
to BeagleBoard

Oops! Sorry. Somehow I just read "PRUs' shared RAM" and didn't read the "within a kernel driver" part of the question, so I thought he was asking about using the PRUs' shared memory to share data between the two PRUs. My answer was about using shared mem for PRU-to-PRU communication.

Mark A. Yoder

Aug 1, 2016, 11:14:46 AM
to BeagleBoard
It turns out the answer is on page 250, in Chapter 9 of LDD3 (https://lwn.net/Kernel/LDD3/).

#include <asm/io.h>
void *ioremap(unsigned long phys_addr, unsigned long size);
void iounmap(void * addr);

unsigned int ioread32(void *addr);

So I used:

#include <asm/io.h>
#define PRU_ADDR          0x4A300000  // Start of PRU memory, page 184 of the AM335x TRM
#define PRU_SHAREDMEM     0x10000     // Offset to shared memory
#define PRU_SHAREDMEM_LEN 0x3000      // Length of shared memory
void *shared_mem;

shared_mem = ioremap(PRU_ADDR + PRU_SHAREDMEM, PRU_SHAREDMEM_LEN);

and then to read the 0th word of shared memory:

ioread32(shared_mem + 0);

I'll soon post the simple kernel driver I have working...

--Mark