The mechanism for generating an interrupt from a PRU to the A8 (host) is well-documented. Is there a way to send an interrupt (one of the 64 system interrupt events documented in the PRU-ICSS literature) from userspace?
From reading the TI documentation, the only candidates seem to be the two "mailbox" interrupts. I recall reading that some version of the remoteproc (or RPMsg, or virtio) drivers used these mailboxes but ultimately abandoned them because they are not available on all platforms (that may be incorrect).
Setting a flag in PRU DRAM or shared RAM is clearly a method that will work. However, it appears that polling DRAM or shared RAM is a multi-clock task; if a PRU system interrupt can be generated, it can be polled in one clock by examining R31 bits 30/31 (if configured correctly). Is this possible?
--
For more options, visit http://beagleboard.org/discuss
---
You received this message because you are subscribed to the Google Groups "BeagleBoard" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beagleboard+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beagleboard/f49b6ac0-7cc8-467c-8fcf-3060b65dec05%40googlegroups.com.
For the purpose of this discussion with ags, I do not think the actual definition of what an interrupt is, is quite so important, as much as how to achieve an end goal. On a single threaded "system", I also do not think asynchronous is really ever a factor. But I usually do tend to view interrupts as prioritized, and preemptive.
Correct - to preserve deterministic execution, the PRU cannot be asynchronously interrupted. Polling (of some form) is required.
Back to the OP: there is a way to register a (non-async) interrupt with the PRU. One can force a system interrupt (any one of the 64 that the PRUSS recognizes) by setting a bit in the Interrupt Status Register. From userspace it looks just like writing to the PRU DRAM, since it's just writing a value to an mmap()'d physical address. The advantage over what's been discussed here is that, depending on how it's set up, it could be faster than polling DRAM. I will have to implement it to provide actual measurements.
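A minimal sketch of what that register write could look like from userspace, in Python for brevity. The addresses are assumptions taken from my reading of the AM335x memory map (PRU-ICSS at 0x4A300000, INTC at offset 0x20000, raw status/set registers SRSR0/SRSR1 at 0x200/0x204) and should be checked against the TRM. The function names are made up for illustration. It also assumes the INTC has already been configured so that the chosen event is enabled and routed to host 0 or 1 (R31 bits 30/31); without that, setting the raw status bit will not show up in R31.

```python
import ctypes
import mmap
import os

# Assumed AM335x values; verify against the TRM for your device.
PRUSS_BASE = 0x4A300000   # PRU-ICSS base address as seen by the ARM core
INTC_OFFSET = 0x20000     # PRU INTC inside the PRU-ICSS
SRSR0 = 0x200             # raw status/set register, system events 0..31
SRSR1 = 0x204             # raw status/set register, system events 32..63


def srsr_write(event):
    """Return (register offset inside the INTC, 32-bit mask) for one event."""
    if not 0 <= event < 64:
        raise ValueError("the PRU-ICSS has 64 system events (0..63)")
    reg = SRSR0 if event < 32 else SRSR1
    return reg, 1 << (event % 32)


def set_word(buf, offset, value):
    """Do one aligned 32-bit store into a writable buffer (mmap, bytearray)."""
    word = ctypes.c_uint32.from_buffer(buf, offset)
    word.value = value
    del word  # release the buffer export so an mmap can be closed afterwards


def force_system_event(event, devmem="/dev/mem"):
    """Set one system event. Needs root and real hardware; not run here."""
    reg, mask = srsr_write(event)
    fd = os.open(devmem, os.O_RDWR | os.O_SYNC)
    try:
        with mmap.mmap(fd, mmap.PAGESIZE, mmap.MAP_SHARED,
                       mmap.PROT_READ | mmap.PROT_WRITE,
                       offset=PRUSS_BASE + INTC_OFFSET) as regs:
            set_word(regs, reg, mask)
    finally:
        os.close(fd)
```

On a board this would be called as `force_system_event(21)` or similar; which event numbers are free to use depends on how the INTC is set up by whatever loaded the PRU firmware. The ctypes store is used instead of slice assignment so that the register sees a single 32-bit write.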
I've had a hard time getting any definitive responses to questions on the subject of memory access & latency. It is true that the PRU cores have faster access to DRAM that is part of the PRU-ICSS (through the 32-bit interconnect SCR) - though not single-cycle - than to system DDR. However, the ARM core accesses DDR through the L3 fabric, but the PRU-ICSS through L4FAST, so I'm thinking that it can access DDR faster than PRU-ICSS memory.
I've also asked about differences in latency/throughput/contention comparing the PRU-ICSS 12KB shared RAM v the 8KB data RAM. No response. Since both 8KB data RAMs are accessible to both PRU cores, I'm not sure what the benefit of the 12KB shared RAM is (though I imagine there is one, I just can't figure it out).
Lastly - and even more importantly - I am in total agreement that you have to be careful about accessing any memory correctly. I have posted several times asking about the am335x_pru_package examples (using UIO). In at least one (https://github.com/beagleboard/am335x_pru_package/blob/master/pru_sw/example_apps/PRU_PRUtoPRU_Interrupt/PRU_PRUtoPRU_Interrupt.c), there is hardcoded use of the first 8 bytes of physical memory at 0x8000_0000. I don't see how that can be OK. It may be that I don't know some secrets of Linux internals, but from a theoretical perspective, I just don't know how one can make the assumption that any part of main memory is not in use by another process unless it is guaranteed by the kernel.
OK, according to some documentation I was able to find quickly, address 0x80000000 is the base address of the DDR memory on the TI EVM board, which is very similar to the BeagleBone in memory layout.
On Fri, Mar 10, 2017 at 7:38 PM, William Hermans <yyr...@gmail.com> wrote:
Thinking on it for a little longer, I almost want to say that address 0x80000000 is actually the start of Linux's virtual memory map. But I'm not 100% sure.
I'm doing my own research for a paying project, so can't really dive into documentation for something else right now . . .
On Fri, Mar 10, 2017 at 7:24 PM, William Hermans <yyr...@gmail.com> wrote:
So here is what I meant. Of course, I have no personal hands-on experience with this, and am looking at things from 35k feet. I *know* that writing directly to the PRU shared memory from userspace, through /dev/mem, would be, performance-wise, just as fast as writing to the 512M of system DDR. On the PRU side, however, the PRUs would have single-cycle access to their own memory. So the tricky part for me here would not be making sure we're writing to the right memory location, but knowing it's possible to begin with, because I have not attempted this personally. In fact, my hands-on experience with the PRU is limited to setting up a couple of examples and proving to myself that they would work with a 4.x kernel.
So my only real "concern" is whether it really is possible to mmap() the physical address of the PRU's shared memory, and whether that could be done "safely". But I do know that if it is possible, it would be faster from the PRU side than reading and writing the system's 512M DDR, because of the fabric latency. Not only that: from what I've read in the past, accessing devices or memory through that fabric can add a little bit of non-deterministic latency. So my thinking here is that "we'd" gain back the little bit of determinism that we lost using DDR.
After that, I have no idea how important what I'm talking about is to you, with your given project. Address 0x80000000, though, I seem to recall is possibly related to the kernel, or perhaps the initrd. Another thing that I do not pretend to know 100% is how Linux virtual memory works. When we say we're accessing "physical memory" through mmap(), we're actually accessing the device modules or external memory through virtual memory. It could very well be that the person who wrote the UIO PRU examples knew this going in, and that it's not by accident at all, but rather by design. I'd have to look further into the gory details of everything before I could make this determination.
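For reference, a rough sketch of mapping the PRU shared RAM through /dev/mem and writing a flag word into it. The base address 0x4A300000 and the shared RAM offset 0x10000 are assumptions from the AM335x memory map, and the helper names are hypothetical. It needs root, and as far as I know the PRU-ICSS has to be clocked and out of reset first, otherwise the access will typically fault.

```python
import mmap
import os
import struct

# Assumed AM335x values; verify against the TRM for your device.
PRUSS_BASE = 0x4A300000       # PRU-ICSS base address as seen by the ARM core
SHARED_RAM_OFFSET = 0x10000   # 12 KB shared RAM inside the PRU-ICSS
SHARED_RAM_SIZE = 12 * 1024


def write_flag(mem, offset, value):
    """Store a 32-bit little-endian flag at a word-aligned offset."""
    if offset % 4 or not 0 <= offset <= len(mem) - 4:
        raise ValueError("offset must be word-aligned and inside the mapping")
    mem[offset:offset + 4] = struct.pack("<I", value)


def read_flag(mem, offset):
    """Read back a 32-bit little-endian flag."""
    return struct.unpack_from("<I", mem, offset)[0]


def open_shared_ram(devmem="/dev/mem"):
    """Map the shared RAM. Needs root and real hardware; not run here."""
    fd = os.open(devmem, os.O_RDWR | os.O_SYNC)
    try:
        # Python's mmap keeps its own duplicate of the descriptor,
        # so closing fd afterwards does not tear down the mapping.
        return mmap.mmap(fd, SHARED_RAM_SIZE, mmap.MAP_SHARED,
                         mmap.PROT_READ | mmap.PROT_WRITE,
                         offset=PRUSS_BASE + SHARED_RAM_OFFSET)
    finally:
        os.close(fd)
```

On the board, `ram = open_shared_ram()` followed by `write_flag(ram, 0, 1)` would leave a word at the start of shared RAM for the PRU to poll. The helpers accept any writable buffer, so they can be tried on a `bytearray` without hardware.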
On Mar 13, 2017, at 9:24 PM, ags <alfred.g...@gmail.com> wrote:
@William Hermans like you, I won't be able to dig into the gory details of loading Linux. This is an interesting read (albeit high-level and prompting more questions). I think I can say a few things without understanding all the details:
- It is correct (from detailed reading of the TI TRM) that 0x80000000 is the physical memory address of the L3 DDR.
- If Linux is leaving any physical memory unmapped and unused, that's a shame: a wasted precious resource.
- The PRUSS UIO driver allocates memory and exposes the physical address in userspace. If this is not used, it is also a wasted precious resource.
Now comes the subjective stuff:
- I'm going to presume that Linux isn't stupid, and not count on it leaving permanently-allocated and undocumented physical memory addresses available for those that know the secret handshake.
- I will use the memory allocated by the PRUSS UIO driver to communicate between userspace and the PRU-ICSS.
If someone from TI/BeagleBoard.org responds with clarification on where I'm incorrect, I'll adjust my position. As of now, for over two years I've been asking this same question and gotten no definitive response. Anyone know who came up with the am335x_pru_package examples?
Thanks for your input and replies. Much appreciated.
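A small sketch of how the driver-allocated region could be located instead of hardcoding 0x80000000. It relies on the standard UIO sysfs layout (/sys/class/uio/uioN/maps/mapM/addr and size) and on the UIO convention that map M is selected with an mmap offset of M pages. Which map index holds the DDR buffer for uio_pruss is an assumption here (map1 on the setups I am aware of); check the maps on your own board. The function names are illustrative only.

```python
import mmap
import os
from pathlib import Path


def uio_map(index, uio="uio0", sysfs="/sys/class/uio"):
    """Return (physical address, size in bytes) of one UIO memory map."""
    base = Path(sysfs) / uio / "maps" / f"map{index}"
    # sysfs reports both values as hex strings such as "0x9e400000".
    addr = int((base / "addr").read_text().strip(), 16)
    size = int((base / "size").read_text().strip(), 16)
    return addr, size


def mmap_uio(index, size, dev="/dev/uio0"):
    """Map UIO region `index`. Needs the uio_pruss driver; not run here."""
    fd = os.open(dev, os.O_RDWR | os.O_SYNC)
    try:
        # UIO convention: offset = map index * page size selects the map.
        return mmap.mmap(fd, size, mmap.MAP_SHARED,
                         mmap.PROT_READ | mmap.PROT_WRITE,
                         offset=index * mmap.PAGESIZE)
    finally:
        os.close(fd)
```

The physical address returned by `uio_map(1)` is the value the PRU firmware would need in order to reach the same buffer, while userspace works through the mapping returned by `mmap_uio(1, size)`. That keeps both sides inside memory the kernel has actually set aside.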
On Friday, March 10, 2017 at 7:30:25 PM UTC-8, William Hermans wrote:
Here is another link that should explain it clearly enough: http://processors.wiki.ti.com/index.php/HOWTO_Change_the_Linux_Kernel_Start_Address#Modifying_memory.h
So I would say that it is not by accident that the base address of 0x80000000 works. In fact, if you think about it a little bit: read the opening paragraph labeled "Purpose" and replace "DSP" with "PRU", for all intents and purposes of this discussion.