The latest Debian images ship with the TI OpenCL tools, including the examples under /usr/share/ti/examples/opencl.
Those examples cover running code on the DSPs, but you can run code on the M4 processors the same way.
Or you can simply load M4 firmware with remoteproc, the same way we do for the PRUs.
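For the remoteproc route, here is a rough sketch of what the sysfs writes could look like from C++. The `firmware` and `state` attributes are the standard Linux remoteproc sysfs interface, but which remoteprocN instance is an M4 on your board and what the firmware file is called are assumptions you would need to check locally. `write_attr` and `load_and_start` are just illustrative helper names, not part of any TI library.

```cpp
#include <fstream>
#include <string>

// Write one value to a sysfs-style attribute file.
// Returns false if the file cannot be opened or the write fails.
bool write_attr(const std::string& path, const std::string& value) {
    std::ofstream f(path);
    if (!f) return false;
    f << value;
    f.flush();
    return f.good();
}

// Point a remoteproc instance at a firmware image and start it.
// rproc_dir: e.g. /sys/class/remoteproc/remoteproc0 (the index is an assumption).
// firmware:  a file name the kernel resolves under /lib/firmware (hypothetical here).
bool load_and_start(const std::string& rproc_dir, const std::string& firmware) {
    // Best-effort stop first: the firmware name cannot be changed while the
    // core is running. The result is ignored because stopping an already
    // stopped core is reported as an error.
    write_attr(rproc_dir + "/state", "stop");

    return write_attr(rproc_dir + "/firmware", firmware) &&
           write_attr(rproc_dir + "/state", "start");
}
```

You would call it as root with something like `load_and_start("/sys/class/remoteproc/remoteproc0", "my-m4-firmware.xem4")`, where both arguments are placeholders for whatever your image actually uses.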
That said, I haven't quite figured out how to make this work outside of TIDL. Below is where I am with TIOpenCL; I'll forward this to the relevant folks at TI to find out what I'm missing.
clplat.cpp:
#include <CL/TI/cl.hpp>
#include <iostream>
#include <string>
#include <vector>

int main() {
    std::vector<cl::Platform> platforms;
    std::vector<cl::Device> devices;
    std::string str;

    // Enumerate the available OpenCL platforms.
    cl::Platform::get(&platforms);
    if (platforms.empty()) {
        std::cerr << "No OpenCL platforms found\n";
        return 1;
    }

    platforms[0].getInfo(CL_PLATFORM_VERSION, &str);
    std::cout << str << "\n";
    std::cout << platforms.size() << "\n";
    std::cout << platforms[0].getInfo<CL_PLATFORM_NAME>() << "\n";

    // Enumerate every device on the first platform.
    platforms[0].getDevices(CL_DEVICE_TYPE_ALL, &devices);
    std::cout << devices.size() << "\n";
    if (!devices.empty())
        std::cout << devices[0].getInfo<CL_DEVICE_NAME>() << "\n";
    return 0;
}
debian@beaglebone:/var/lib/cloud9$ g++ clplat.cpp -lTIOpenCL && sudo ./a.out
OpenCL 1.1 TI product version 01.01.17.01 (Jan 3 2019 15:42:47)
1
TI AM57x
1
TI Multicore C66 DSP