It begins with details about the out-of-box demo provided in the Processor SDK Linux filesystem, followed by rebuilding the demo code and running the built images. (This covers the use case with the host running Linux and the slave cores running an RTOS.)
The IPC package and its examples are delivered in the RTOS Processor SDK, but can be built from the Linux Processor SDK. To build the IPC examples, both the Linux and RTOS Processor SDKs need to be installed. They can be downloaded from the SDK download page.
Once the Linux and RTOS Processor SDKs are installed at their default locations, the IPC Linux library, which is not included in the Linux Processor SDK, can be built on the Linux host machine with the following commands:
For AM57x platforms, modify the symbolic links in /lib/firmware so that the default image names point to the built binaries. The images that the symbolic links point to are downloaded to the corresponding processors and started by remoteproc when the Linux kernel boots.
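The symbolic-link update can be sketched in a few lines of Python. This is a minimal illustration, not the SDK's own procedure: the helper name `repoint_symlink` and both file names in the demo are assumptions, and the demo runs in a scratch directory rather than the real /lib/firmware.

```python
import os
import tempfile


def repoint_symlink(link_path, target_path):
    """Make link_path a symlink to target_path, replacing any old link."""
    tmp_path = link_path + ".new"
    # Remove a stale temporary link left by an interrupted earlier run.
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    # Create the new link under a temporary name, then rename it over the
    # old one so the default image name never disappears in between.
    os.symlink(target_path, tmp_path)
    os.replace(tmp_path, link_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as fw_dir:
        # Hypothetical built image and assumed default link name.
        built = os.path.join(fw_dir, "server_dsp1.xe66")
        open(built, "wb").close()
        link = os.path.join(fw_dir, "dra7-dsp1-fw.xe66")
        repoint_symlink(link, built)
        print(link, "->", os.readlink(link))
```

On a real target the same effect is normally achieved with `ln -sf`; the names of the default links should be checked against the contents of /lib/firmware on the board.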
This article is geared toward AM57xx users that are running Linux on the Cortex A15. The goal is to help users understand how to gain entitlement to the DSP (c66x) and IPU (Cortex M4) subsystems of the AM57xx.
The AM572x device has two IPU subsystems (IPUSS), each of which has two cores. IPU2 is used as a controller in multimedia applications, so if you have Processor SDK Linux running, chances are that IPU2 already has firmware loaded. However, IPU1 is open for general-purpose programming to offload tasks from the ARM.
In order to set up IPC on the slave cores, we provide some pre-built examples in the IPC package that can be run from ARM Linux. The subsequent sections describe how to build and run these examples and use them as a starting point for this effort.
This shows that we have defined a CMEM block at a physical base address of 0xA0000000 with a total size of 0xc000000 (192 MB). This block contains a buffer pool consisting of one buffer. Each buffer in the pool (only one in this case) is defined to have a size of 0xc000000 (192 MB).
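The size arithmetic can be checked with a short Python sketch. The base address and buffer size come from the text above; the function name `block_summary` is only illustrative.

```python
CMEM_BASE = 0xA0000000       # physical base address of the CMEM block
BUFFER_SIZES = [0xC000000]   # one pool holding a single buffer


def block_summary(base, buffer_sizes):
    """Return the total size and address range covered by the buffers."""
    total = sum(buffer_sizes)
    return {
        "base": base,
        "last": base + total - 1,          # last byte inside the block
        "size_bytes": total,
        "size_mib": total / (1 << 20),     # 1 MiB = 2**20 bytes
    }


summary = block_summary(CMEM_BASE, BUFFER_SIZES)
print("0x%08X-0x%08X, %.0f MiB"
      % (summary["base"], summary["last"], summary["size_mib"]))
# prints: 0xA0000000-0xABFFFFFF, 192 MiB
```

The block therefore ends just below 0xAC000000, which is the range that must not overlap any other reserved region.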
First, it is important to understand that there is a pair of Memory Management Units (MMUs) that sit between the DSP subsystems and the L3 interconnect. One of these MMUs is for the DSP core and the other is for its local EDMA. They both serve the same purpose of translating virtual addresses (i.e. the addresses as viewed by the DSP subsystem) into physical addresses (i.e. addresses as viewed from the L3 interconnect).
The physical location where the DSP code/data will actually reside is defined by the CMA carveout. To change this location, you must change the definition of the carveout. The DSP carveouts are defined in the Linux dts file. For example, for the AM57xx EVM:
You must ensure that the sizes of your sections are consistent with the corresponding definitions in the resource table. You should create your own resource table in order to modify the memory map. This is described in the page IPC Resource customTable. You can look at an existing resource table inside IPC:
The physical location where the M4 code/data will actually reside is defined by the CMA carveout. To change this location, you must change the definition of the carveout. The M4 carveouts are defined in the Linux dts file. For example, for the AM57xx EVM:
The three entries above from the resource table all come from the associated IPU CMA pool (i.e. as dictated by the TYPE_CARVEOUT). The second parameter represents the virtual address (i.e. the input address to the IOMMU). These addresses must be consistent with both the AMMU mapping and the linker command file. The ex02_messageq example from IPC defines these memory sections in the file examples/DRA7XX_linux_elf/ex02_messageq/shared/config.bld.
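One way to catch an inconsistency early is to compare the two sets of ranges offline. The following Python sketch checks that every linker memory section lies completely inside one carveout's virtual range. All names, addresses and sizes in it are hypothetical placeholders; substitute the values from your own resource table and config.bld.

```python
from collections import namedtuple

# A contiguous range in the remote core's virtual address space.
Region = namedtuple("Region", "name base size")


def contains(outer, inner):
    """True if inner lies completely inside outer."""
    return (outer.base <= inner.base
            and inner.base + inner.size <= outer.base + outer.size)


def sections_outside(carveouts, sections):
    """Return the names of sections that fit inside no carveout."""
    return [s.name for s in sections
            if not any(contains(c, s) for c in carveouts)]


# Hypothetical carveout entries (virtual address, size).
carveouts = [
    Region("TEXT_CARVEOUT", 0x00000000, 0x00100000),
    Region("DATA_CARVEOUT", 0x80000000, 0x03000000),
]
# Hypothetical linker sections; EXT_HEAP deliberately overruns the carveout.
sections = [
    Region("EXT_CODE", 0x00004000, 0x000FC000),
    Region("EXT_DATA", 0x80000000, 0x00200000),
    Region("EXT_HEAP", 0x82E00000, 0x00300000),
]

print(sections_outside(carveouts, sections))  # prints: ['EXT_HEAP']
```

A check like this does not replace inspection of the AMMU configuration, which must agree with the same addresses.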
The IPUs and DSPs auto-idle by default. This can prevent you from being able to connect to the device using JTAG or from accessing local memory via devmem2. There are some options sprinkled throughout sysfs that are needed in order to force these subsystems on, as is sometimes needed for development and debug purposes.
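As an illustration, the sketch below assumes the relevant control is the standard Linux runtime power management attribute `power/control`, which accepts `on` (keep the device powered) and `auto` (allow it to idle). Whether this is the option needed on a given SDK release, and the exact device directory (for example a platform device entry under /sys/bus/platform/devices), are assumptions to verify on the target. The helper name `set_runtime_pm` is hypothetical.

```python
import os


def set_runtime_pm(device_dir, mode):
    """Write 'on' or 'auto' to <device_dir>/power/control and read it back."""
    if mode not in ("on", "auto"):
        raise ValueError("mode must be 'on' or 'auto'")
    control = os.path.join(device_dir, "power", "control")
    with open(control, "w") as f:
        f.write(mode)
    # Read the attribute back so the caller can confirm the setting.
    with open(control) as f:
        return f.read().strip()
```

On the target this would be called as root with the sysfs directory of the DSP or IPU device, and with `auto` afterwards to restore the default behaviour.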
A common thing people want to do is take an existing DSP application and add IPC to it. This is common when migrating from a DSP-only solution to a heterogeneous SoC with an Arm plus a DSP. This is the focus of this section.
Now we want to copy configuration and source files from the ex02_messageq IPC example into our project. The IPC example is located at C:\ti\ipc_3_43_02_04\examples\DRA7XX_linux_elf\ex02_messageq. To copy files into your CCS project, you can simply select the files you want in Windows Explorer and then drag and drop them into your project in CCS.
Comment out the line that calls Board_init(boardCfg). This call is in the original example because it assumes TI-RTOS is running on the Arm, but in our case we are running Linux and this call is destructive, so we comment it out.
A common thing people want to do is take an existing IPU application that may be controlling serial or control interfaces and add IPC to it so that the firmware can be loaded from the ARM. This is common when migrating from an IPU-only solution to a heterogeneous SoC with an MPUSS (ARM) and IPUSS. This is the focus of this section.
Connect to the ARM core and make sure the GEL file runs the multicore initialization and brings the IPUSS out of reset. Connect to IPU2 core0, then load and run the M4 UART example. When you run the code, you should see the following log on the serial IO console:
The Linux kernel enables all SoC hardware modules that are required for its configuration. The appropriate drivers configure the required clocks and initialize the hardware registers. Clocks are not configured for unused IPs.
Now we want to copy configuration and source files from the ex02_messageq IPC example into our project. The IPC example is located at C:\ti\ipc_3_xx_xx_xx\examples\DRA7XX_linux_elf\ex02_messageq. To copy files into your CCS project, you can simply select the files you want in Windows Explorer and then drag and drop them into your project in CCS.
Comment out the line that calls Board_init(boardCfg). This call is in the original example because it assumes TI-RTOS is running on the Arm, but in our case we are running Linux and this call is destructive, so we comment it out. The board init call performs all pinmux configuration, module clock setup and UART peripheral initialization.
In order to run the UART example on the M4, you need to disable the UART in the Linux DTB file and interact with the Linux kernel using Telnet (this will be described later in the article). Since Linux will be running, U-Boot performs the pinmux configuration, but the clock and UART stdio setup needs to be performed by the M4.
There are two MMUs inside each of the IPU1 and IPU2 subsystems: the L1 MMU, which is referred to as IPU_UNICACHE_MMU or AMMU, and the L2 MMU. How these are configured in IPC-remoteproc is described in the section Changing_Cortex_M4_IPU_Memory_Map. IPC handles the L1 and L2 MMUs differently from how the PDK driver examples set up memory access through these MMUs, and users need to manage this difference when integrating the components. This difference is highlighted below:
Therefore, after integrating IPC with PDK drivers, it is recommended that the alias addresses are used to access peripherals and PRCM registers. This requires changes to the addresses used by PDK drivers and in application code.
DRA7xx/AM57xx SoCs have multiple processor cores: Cortex A15, C66x DSPs and ARM M4 cores. The A15 typically runs an HLOS such as Linux or Android, and the remote cores (DSPs and M4s) run an RTOS. In normal operation, the bootloader (U-Boot/SPL) boots and loads the A15 with the HLOS. The A15 then boots the DSP and M4 cores. In this sequence, the interval between power-on reset and the remote cores (i.e. the DSPs and the M4s) executing depends on the HLOS initialization time. This delay may not be suitable for some use cases with tight time constraints, e.g. a rear view camera.
Early Boot/Late Attach functionality is supported for IPUs and is enabled by default for the IPU1 remote processor on the TI SDKs for all TI DRA7xx/AM57xx platforms. The functionality relies on matching configuration/code between SPL and the Linux kernel in terms of the memory and timers used by the firmwares, and on matching firmwares in the boot media (used by SPL) and in the rootfs in the /lib/firmware folder (used by the kernel).
The reserved memory nodes above should match the reserved-memory node region definitions in the corresponding dts board file in the kernel. For example, see the defined reserved-memory nodes in the arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi file used for all AM57xx EVM boards:
If the allocations do not match, the MLO execution may fail when trying to allocate memory for the carveouts. Further, the kernel can overwrite the memory being used by the firmwares, which can result in crashes.
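A simple offline comparison can flag such a mismatch before booting. The sketch below compares two tables of (base, size) pairs keyed by region name. The region names and values are hypothetical placeholders, not the SDK defaults; fill them in from your SPL configuration and kernel dts.

```python
def find_mismatches(spl_regions, kernel_regions):
    """Return (name, spl_entry, kernel_entry) for every region that differs.

    Both arguments map a region name to a (base, size) tuple. A region
    missing on one side is reported with None for that side.
    """
    problems = []
    for name in sorted(set(spl_regions) | set(kernel_regions)):
        spl = spl_regions.get(name)
        kern = kernel_regions.get(name)
        if spl != kern:
            problems.append((name, spl, kern))
    return problems


# Hypothetical values: the DSP pool sizes deliberately disagree.
spl = {"ipu1_pool": (0x9D000000, 0x2000000),
       "dsp1_pool": (0x99000000, 0x4000000)}
kernel = {"ipu1_pool": (0x9D000000, 0x2000000),
          "dsp1_pool": (0x99000000, 0x3000000)}

for name, s, k in find_mismatches(spl, kernel):
    print(name, "SPL:", s, "kernel:", k)
```

With the placeholder values above, only `dsp1_pool` is reported.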
The Early Boot code in U-Boot does the necessary configuration to bring up a remote core. This includes the timers and the MMUs. It does not configure any other peripherals by default. Some use cases may require additional peripheral configuration before running the remote core. U-Boot includes placeholder functions that can be populated for this purpose. These can be found in the file drivers/remoteproc/ipu_rproc.c.
In DRA7xx/AM57xx, the boot ROM copies the first-stage boot loader (MLO/SPL) from QSPI flash at a conservative speed of 48 MHz. For certain use cases, the time spent in the ROM copy is a significant part of the overall time budget. To speed up the copy of the first-stage boot loader, we use a umlo (micro MLO).
The umlo configures the DRA7xx/AM57xx to operate at the maximum QSPI interface speed, which is a 76.8 MHz interface frequency, Quad mode, and Mode 0 operation. The umlo copies the MLO to the execution address in OCMC and jumps to it. With this, the time taken to enter a 168 KB MLO is reduced significantly, to approximately 5.5 ms.
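As a rough sanity check on that figure, the raw transfer time can be estimated from the interface settings. The sketch below assumes 168 KB means 168 x 1024 bytes and that each of the four data lines carries one bit per clock. It ignores command, address and dummy cycles as well as the umlo's own start-up, so it gives a lower bound rather than a measurement.

```python
def raw_transfer_ms(size_bytes, clock_hz, data_lines):
    """Ideal time in ms to clock size_bytes over a serial flash interface."""
    bits = size_bytes * 8
    bits_per_second = clock_hz * data_lines  # one bit per line per clock
    return bits / bits_per_second * 1000.0


mlo_size = 168 * 1024  # assumed interpretation of "168 KB"
estimate = raw_transfer_ms(mlo_size, 76.8e6, 4)
print("%.2f ms" % estimate)  # prints: 4.48 ms
```

The ideal 4.48 ms is in line with the quoted figure of about 5.5 ms once protocol overhead and code execution are added.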
The umlo source can be cloned from the link below. The tool can be compiled with any bare-metal compiler that supports the Cortex A15. Ensure that the toolchain is installed and that arm-none-eabi-gcc is in the PATH. Follow README.md from the repo to compile and build the umlo binary.