bbx15: v4.4.x + OpenCL


Robert Nelson

Oct 27, 2016, 4:40:37 PM
to Beagle Board, beagleb...@googlegroups.com
Hey Everyone,

Finally got OpenCL working again on v4.4.x. There were lots of fun "dkms"
module issues, so I ripped all of those out.

mkbian@beaglebone:/usr/share/ti/examples/opencl/float_compute$ sudo modprobe cmemk
This example computes y[i] = M[i] * x[i] + C on single precision
floating point arrays of size 2097152
- Computation on the ARM is parallelized across the A15s using OpenMP.
- Computation on the DSP is performed by dispatching an OpenCL NDRange
kernel across the compute units (C66x cores) in the compute device.

Running.....

Average across 5 runs:
ARM (2 OpenMP threads) : 0.008669 secs
DSP (OpenCL NDRange kernel) : 0.007781 secs
OpenCL-DSP speedup : 1.114124

For more information on:

This Sunday's lxqt image will have all the fun bits..

For older images do this:

sudo apt-get update
sudo apt-get upgrade

sudo apt-get remove dkms --purge #get rid of dkms/etc..

cd /opt/scripts/tools/
git pull
sudo ./update_kernel.sh
sudo reboot

cd /usr/share/ti/examples/opencl/float_compute/
sudo make
sudo modprobe cmemk
sudo ./float_compute

Regards,

--
Robert Nelson
https://rcn-ee.com/

Christopher Hansen

Oct 29, 2016, 12:40:51 PM
to beagleboard-x15, beagl...@googlegroups.com
I get this error:

modprobe: FATAL: Module cmemk not found in directory /lib/modules/4.4.23-ti-r51

Any suggestions?

Thanks.

Chris

Robert Nelson

Oct 29, 2016, 12:54:28 PM
to Beagle Board, beagleboard-x15
On Sat, Oct 29, 2016 at 11:40 AM, Christopher Hansen <hans...@gmail.com> wrote:
> I get this error:
>
> modprobe: FATAL: Module cmemk not found in directory /lib/modules/4.4.23-ti-r51
>
> Any suggestions?

Well, that's expected on 4.4.23-ti-r51 ;)

Like i mentioned:

cd /opt/scripts/tools/
git pull
sudo ./update_kernel.sh
sudo reboot
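
After the reboot, a quick way to confirm the new kernel actually ships the module before calling modprobe is to look for it under /lib/modules. This is only a rough sketch: `have_module` is a made-up helper name, and it assumes the module file is named cmemk.ko (possibly with a compression suffix) somewhere under the directory for the running kernel.

```shell
#!/bin/sh
# Hypothetical helper (not part of /opt/scripts): succeeds if a file named
# <module>.ko* exists under <modules_root>/<kernel_release>.
# Defaults: running kernel from `uname -r`, modules root /lib/modules.
have_module() {
    mod=$1
    kver=${2:-$(uname -r)}
    root=${3:-/lib/modules}
    [ -n "$(find "$root/$kver" -name "$mod.ko*" 2>/dev/null | head -n 1)" ]
}

if have_module cmemk; then
    echo "cmemk is available for $(uname -r)"
else
    echo "cmemk not found for $(uname -r); run update_kernel.sh and reboot first"
fi
```

If it still reports "not found" after update_kernel.sh and a reboot, compare the output of `uname -r` with what the script installed; the board may still be booting the old kernel.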

Robert Nelson

Oct 29, 2016, 2:08:55 PM
to Beagle Board, beagleboard-x15
On Sat, Oct 29, 2016 at 11:53 AM, Robert Nelson <robert...@gmail.com> wrote:
> On Sat, Oct 29, 2016 at 11:40 AM, Christopher Hansen <hans...@gmail.com> wrote:
>> I get this error:
>>
>> modprobe: FATAL: Module cmemk not found in directory /lib/modules/4.4.23-ti-r51
>>
>> Any suggestions?
>
> Well, that's expected on 4.4.23-ti-r51 ;)
>
> Like i mentioned:
>
> cd /opt/scripts/tools/
> git pull
> sudo ./update_kernel.sh
> sudo reboot

For background, turns out the cmemk module as written doesn't like to
be loaded on a kernel built for THUMB2..

That was one of the big changes i made last week..
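
To see whether the kernel you are running was built for THUMB2, one option is to grep its config for CONFIG_THUMB2_KERNEL. A minimal sketch, assuming the image installs the kernel config as /boot/config-$(uname -r) (the usual Debian layout); `kernel_is_thumb2` is just an illustrative name:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given kernel config file enables
# CONFIG_THUMB2_KERNEL. Defaults to the config of the running kernel,
# assuming it lives at /boot/config-<release>.
kernel_is_thumb2() {
    cfg=${1:-/boot/config-$(uname -r)}
    grep -q '^CONFIG_THUMB2_KERNEL=y' "$cfg" 2>/dev/null
}

if kernel_is_thumb2; then
    echo "$(uname -r): built with CONFIG_THUMB2_KERNEL=y"
else
    echo "$(uname -r): CONFIG_THUMB2_KERNEL not enabled (or config file not found)"
fi
```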

Christopher Hansen

Oct 29, 2016, 3:45:23 PM
to beagleboard-x15, beagl...@googlegroups.com
I followed those steps, but I am still on the 4.4.23-ti-r51 kernel. What kernel are you using? How do I get there?

Christopher Hansen

Oct 29, 2016, 6:43:31 PM
to beagleboard-x15, beagl...@googlegroups.com
OK, I found my problem and fixed it. I had a "bad PPA" that I needed to remove in order for the update_kernel.sh script to complete properly. Here's what I get from the example code:

This example computes y[i] = M[i] * x[i] + C on single precision floating point arrays of size 2097152
- Computation on the ARM is parallelized across the A15s using OpenMP.
- Computation on the DSP is performed by dispatching an OpenCL NDRange kernel across the compute units (C66x cores) in the compute device.

Running.....

Average across 5 runs:
ARM (2 OpenMP threads) : 0.007877 secs
DSP (OpenCL NDRange kernel) : 0.007614 secs
OpenCL-DSP speedup : 1.034475


Is that the expected result?

Chris

Robert Nelson

Oct 29, 2016, 7:50:01 PM
to Beagle Board, beagleboard-x15
On Sat, Oct 29, 2016 at 5:43 PM, Christopher Hansen <hans...@gmail.com> wrote:
> OK, I found my problem and I fixed it. I had a "bad PPA" I needed to remove in order for the update_kernel.sh script to complete properly. Here's what I get from the example code:
>
> This example computes y[i] = M[i] * x[i] + C on single precision floating point arrays of size 2097152
> - Computation on the ARM is parallelized across the A15s using OpenMP.
> - Computation on the DSP is performed by dispatching an OpenCL NDRange kernel across the compute units (C66x cores) in the compute device.
>
> Running.....
>
> Average across 5 runs:
> ARM (2 OpenMP threads) : 0.007877 secs
> DSP (OpenCL NDRange kernel) : 0.007614 secs
> OpenCL-DSP speedup : 1.034475
>
>
> Is that the expected result?

Yeah, I was getting around 1.1x on v4.4.x.

When I last tried TI's SDK (v4.4.x based, on the alpha X15; no support
for the rev B yet), I was getting around a 0.7x/0.8x "speedup"...

Back in v4.1.x (about a year ago, with the alpha X15), I thought it was
around a 3x/4x speedup.

So there's definitely a speed regression (maybe we are in a slow
clock state for the DSP?).

But at least it's working again... ;)
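
For anyone comparing their own numbers: the "OpenCL-DSP speedup" line appears to be simply the ARM time divided by the DSP time, so anything above 1.0 means the DSP run was faster and 0.7/0.8 means it was slower than the A15s. A small sketch to recompute it from the two averages the example prints (`speedup` is an illustrative helper, not something shipped with the TI examples):

```shell
#!/bin/sh
# Hypothetical helper: speedup = ARM seconds / DSP seconds, to two decimals.
speedup() {
    awk -v arm="$1" -v dsp="$2" 'BEGIN { printf "%.2f\n", arm / dsp }'
}

# Averages reported earlier in this thread.
speedup 0.008669 0.007781   # first run posted in this thread
speedup 0.007877 0.007614   # Chris's run after update_kernel.sh
```

Both runs come out only slightly above 1x, which lines up with the roughly 1.1x mentioned above for v4.4.x.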