Hello people,
I compiled and tested Beignet, Intel's OpenCL SDK and runtime for
IVB, HSW, and Broadwell+ GPUs on Linux, on a work-issued HP ProBook 4340s
with Intel HD 4000 Graphics (IVB). The laptop is running Ubuntu
14.04, with the latest Mesa stack installed from Oibaf's PPA.
For the last few months, I've been following Beignet's active
development, and yesterday they released a new version with major
bug fixes and full support for the OpenCL 1.2 spec:
Link:
http://lists.freedesktop.org/archives/beignet/2014-June/003492.html
Some of the new features introduced in this earth-shattering release:
* Added 4th Generation Intel Core Processors support.
* Added Intel "Bay Trail" platform with Intel HD Graphics support.
* Significant performance improvement compared to 0.8. For Luxmark benchmark
and some OpenCV performance test cases, we measured 10x-20x performance
boost.
* Compile speed up about 30% compared to 0.8.
* Support OpenCL spec 1.2. Support printf in GPU kernel side which is very
helpful for kernel debugging. Support both clLinkProgram and clCompileProgram
which allow application to compile and link the opencl binaries at runtime
and is faster than rebuilding everything.
* Support runtime library separate from the compiler backend. For mobile
systems which don't need to compile kernels dynamically, Beignet can be
stripped down to less than 2MB, which is very suitable for small-footprint
products.
* Update documents including how to optimize kernels and how to
cross-compile, etc.
Notes on compiling this release on Ubuntu:
1. Ensure that the "Universe" repo is enabled.
2. Run the latest Mesa stack from Oibaf's PPA. Google-fu works wonders
here, and shortens this email.
3. Read the README.md and set the mentioned variables correctly under
/etc/profile, then source it.
4. Kill many midgets.
5. Reboot after install so the i915 driver loads with the new variables.
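Before running clinfo, it can help to confirm that the runtime is actually
registered with the OpenCL ICD loader. Here is a minimal sketch in Python,
assuming the usual Linux loader convention of one *.icd file per vendor under
/etc/OpenCL/vendors, each holding the name or path of a vendor library on its
first line. The helper names list_icds and missing_libraries are my own, and
the file name Beignet installs is not stated above, so the script just lists
whatever it finds:

```python
import glob
import os

# Assumption: the standard directory read by the OpenCL ICD loader on Linux.
ICD_DIR = "/etc/OpenCL/vendors"


def list_icds(icd_dir=ICD_DIR):
    """Return {icd_file_name: library_entry} for every *.icd file found."""
    found = {}
    for path in sorted(glob.glob(os.path.join(icd_dir, "*.icd"))):
        with open(path) as fh:
            # The first line names the vendor library (bare name or full path).
            found[os.path.basename(path)] = fh.readline().strip()
    return found


def missing_libraries(icds):
    """Return entries given as absolute paths that do not exist on disk.

    Bare library names are skipped, since the dynamic linker resolves those.
    """
    return {name: lib for name, lib in icds.items()
            if os.path.isabs(lib) and not os.path.exists(lib)}


if __name__ == "__main__":
    icds = list_icds()
    if not icds:
        print("No ICD files found in", ICD_DIR)
    for name, lib in icds.items():
        print(name, "->", lib)
    for name, lib in missing_libraries(icds).items():
        print("WARNING:", name, "points at a missing library:", lib)
```

If nothing is listed, or a listed library is missing, clinfo will most likely
report zero platforms, and the install step is the place to look.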
Results of your hard work:
Run clinfo and see the results. You can drool over mine below:
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 1.2 beignet 0.9.0
Platform Name: Intel Gen OCL Driver
Platform Vendor: Intel
Platform Extensions:
cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics
cl_khr_byte_addressable_store cl_khr_icd
Platform Name: Intel Gen OCL Driver
Number of devices: 1
Device Type: CL_DEVICE_TYPE_GPU
Device ID: 358
Max compute units: 16
Max work items dimensions: 3
Max work items[0]: 512
Max work items[1]: 512
Max work items[2]: 512
Max work group size: 1024
Preferred vector width char: 16
Preferred vector width short: 16
Preferred vector width int: 16
Preferred vector width long: 16
Preferred vector width float: 16
Preferred vector width double: 0
Native vector width char: 16
Native vector width short: 16
Native vector width int: 16
Native vector width long: 16
Native vector width float: 16
Native vector width double: 16
Max clock frequency: 1000Mhz
Address bits: 32
Max memory allocation: 268435456
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 8192
Max image 2D height: 8192
Max image 3D width: 8192
Max image 3D height: 8192
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 1024
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: No
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: No
Round to +ve and infinity: No
IEEE754-2008 fused multiply-add: No
Cache type: Read/Write
Cache line size: 128
Cache size: 8192
Global memory size: 1073741824
Constant buffer size: 524288
Max number of constant args: 8
Local memory type: Global
Local memory size: 65536
Error correction support: 0
Unified memory for Host and Device: 0
Profiling timer resolution: 80
Device endianess: Little
Available: Yes
Compiler available: No
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: Yes
Queue properties:
Out-of-Order: No
Profiling : Yes
Platform ID: 0x7f047f3a46a0
Name: Intel(R) HD Graphics IvyBridge M GT2
Vendor: Intel
Device OpenCL C version: OpenCL C 1.2 beignet 0.9.0
Driver version: 0.9.0
Profile: EMBEDDED_PROFILE
Version: OpenCL 1.2 beignet 0.9.0
Extensions: cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store
cl_khr_icd
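The raw byte counts in that dump are hard to eyeball, so here is a small
Python sketch that parses "Key: Value" lines like the ones above and prints
the memory sizes in binary units. The function names (parse_clinfo,
human_size, summarize) are made up for this example, and it only assumes the
flat "Key: Value" layout shown in this mail; other clinfo builds may format
their output differently.

```python
SIZE_KEYS = (
    "Max memory allocation",
    "Global memory size",
    "Constant buffer size",
    "Local memory size",
)


def parse_clinfo(text):
    """Collect 'Key: Value' lines into {key: [values...]}.

    Keys can repeat (e.g. 'Platform Name'), so each key maps to a list.
    Lines without a value after the colon are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields.setdefault(key.strip(), []).append(value.strip())
    return fields


def human_size(num_bytes):
    """Format a byte count using binary units, up to GiB."""
    for unit in ("B", "KiB", "MiB", "GiB"):
        if num_bytes < 1024 or unit == "GiB":
            return f"{num_bytes:g} {unit}"
        num_bytes /= 1024


def summarize(text):
    """Return {key: human-readable size} for the size fields present."""
    fields = parse_clinfo(text)
    return {k: human_size(int(fields[k][0])) for k in SIZE_KEYS if k in fields}


# A few lines copied from the output above.
SAMPLE = """\
Max memory allocation: 268435456
Global memory size: 1073741824
Constant buffer size: 524288
Local memory size: 65536
"""

if __name__ == "__main__":
    for key, size in summarize(SAMPLE).items():
        print(f"{key}: {size}")
    # Max memory allocation: 256 MiB
    # Global memory size: 1 GiB
    # Constant buffer size: 512 KiB
    # Local memory size: 64 KiB
```

To run it against a live system, pipe the output of clinfo into a file and
pass the file's contents to summarize() instead of SAMPLE.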
Text file with results attached below.
Party on. Party hard. And burn the midnight oil.
Finally, back to benchmarking its performance.