A couple of users reported OpenCL problems:
Wojowu: "Failed to open file. Error:
OpenCL_RD::ReloadKernelIfNeeded : expecting CL_KERNEL_WORK_GROUP_SIZE
to be a power of 2"
and
Hektor: "An error occurred when running the simulation:
OpenCL_RD::Update : clEnqueueNDRangeKernel failed: Invalid work group size"
I think both of these are to do with bad assumptions I was making in
choosing the local work group size - the number of cells that are
processed in parallel. I've just discovered (r722) that we actually
don't need to specify the local work group size at all - we can pass
NULL instead and let the OpenCL system choose. On my system this also
improves the speed by 25% on some patterns. I've also improved the
OpenCL diagnostics function to return more details (r720) which might
have helped with diagnosing these problems.
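
For anyone following along, here is a minimal sketch of the constraints an explicit local work group size has to meet in OpenCL 1.x: it must divide the global size evenly and must not exceed the device or kernel limit, otherwise clEnqueueNDRangeKernel reports an invalid work group size. The helper names below are hypothetical and are not taken from Ready's source; passing NULL for the local size sidesteps these checks by letting the driver pick.

```cpp
#include <cstddef>

// Hypothetical helper (not from Ready's source): checks the rules that an
// explicit one-dimensional local work group size must satisfy in OpenCL 1.x.
bool IsValidLocalSize(std::size_t global_size,
                      std::size_t local_size,
                      std::size_t max_work_group_size)
{
    if (local_size == 0) return false;                   // must be positive
    if (local_size > max_work_group_size) return false;  // exceeds the device/kernel limit
    return global_size % local_size == 0;                // must divide the global size evenly
}

// Largest local size that passes the checks above. For a non-empty range this
// always succeeds, because a local size of 1 divides every global size.
std::size_t LargestValidLocalSize(std::size_t global_size,
                                  std::size_t max_work_group_size)
{
    for (std::size_t n = max_work_group_size; n > 1; --n)
        if (IsValidLocalSize(global_size, n, max_work_group_size)) return n;
    return 1;
}
```

On a device that reports CL_DEVICE_MAX_WORK_GROUP_SIZE of 1, the only size that passes is 1, so any hard-coded power of two would be rejected there.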
I'm 80% confident that this will fix the problems reported, so if you
can test it for me on your machines that would be great. I'm
interested in:
a) whether everything still works.
b) whether the speed is faster, slower, or about the same.
If it's all OK then we can deliver a new build.
--
Tim Hutton - http://www.sq3.org.uk - http://profiles.google.com/tim.hutton/
The runtime_error issue I've fixed. (I think it's because MSVC's
implementation of STL is a bit different.)
The CL_* things are due to the Apple build using whichever OpenCL.h is
installed instead of our OpenCL_Dyn_Load.h. If you can give me the
full list of errors I'll exclude those. It's just for reporting, so it
doesn't matter if some are missing.
--
> ...
> I'm 80% confident that this will fix the problems reported, so if you
> can test it for me on your machines that would be great. I'm
> interested in:
> a) whether everything still works.
I had to add some defines to OpenCL_utils.hpp that are missing from
Apple's OpenCL headers. (But I just saw your reply to Robert's report
so feel free to remove my additions if you think it's better simply
not to use them in the diagnostics. I've appended my diagnostic
output in case it reveals anything interesting.)
> b) whether the speed is faster, slower, or about the same.
A bit slower unfortunately. Mostly around 10% slower, but pulsate.vti
dropped from 1800 mcgs to 1200 mcgs. Still fast enough for me, so
I'm quite happy to live with this if it means Mac users no longer
see any errors about work group size.
Andrew
------------------------
Found 1 platform(s):
Platform 1:
CL_PLATFORM_PROFILE : FULL_PROFILE
CL_PLATFORM_VERSION : OpenCL 1.0 (Jan 2 2011 18:00:11)
CL_PLATFORM_NAME : Apple
CL_PLATFORM_VENDOR : Apple
CL_PLATFORM_EXTENSIONS :
Found 2 device(s) on this platform.
Device 1:
CL_DEVICE_EXTENSIONS : cl_APPLE_gl_sharing
CL_DEVICE_NAME : ATI Radeon HD 6970M
CL_DEVICE_PROFILE : FULL_PROFILE
CL_DEVICE_VENDOR : AMD
CL_DEVICE_VERSION : OpenCL 1.0
CL_DRIVER_VERSION : 1.0
CL_DEVICE_ADDRESS_BITS : 32
CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE : 0
CL_DEVICE_MAX_CLOCK_FREQUENCY : 149
CL_DEVICE_MAX_COMPUTE_UNITS : 10
CL_DEVICE_MAX_CONSTANT_ARGS : 8
CL_DEVICE_MAX_READ_IMAGE_ARGS : 0
CL_DEVICE_MAX_SAMPLERS : 128
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS : 3
CL_DEVICE_MAX_WRITE_IMAGE_ARGS : 0
CL_DEVICE_MEM_BASE_ADDR_ALIGN : 32768
CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE : 128
CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT : 4
CL_DEVICE_VENDOR_ID : 16915200
CL_DEVICE_AVAILABLE : yes
CL_DEVICE_COMPILER_AVAILABLE : yes
CL_DEVICE_ENDIAN_LITTLE : no
CL_DEVICE_ERROR_CORRECTION_SUPPORT : no
CL_DEVICE_IMAGE_SUPPORT : no
CL_DEVICE_GLOBAL_MEM_CACHE_SIZE : 0
CL_DEVICE_GLOBAL_MEM_SIZE : 536870912
CL_DEVICE_LOCAL_MEM_SIZE : 32768
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE : 65536
CL_DEVICE_MAX_MEM_ALLOC_SIZE : 134217728
CL_DEVICE_IMAGE2D_MAX_HEIGHT : 8192
CL_DEVICE_IMAGE2D_MAX_WIDTH : 8192
CL_DEVICE_IMAGE3D_MAX_DEPTH : 0
CL_DEVICE_IMAGE3D_MAX_HEIGHT : 0
CL_DEVICE_IMAGE3D_MAX_WIDTH : 0
CL_DEVICE_MAX_PARAMETER_SIZE : 1024
CL_DEVICE_MAX_WORK_GROUP_SIZE : 1024
CL_DEVICE_PROFILING_TIMER_RESOLUTION : 37
CL_DEVICE_MAX_WORK_ITEM_SIZES : 1024, 1024, 1024
CL_DEVICE_TYPE : GPU
Device 2:
CL_DEVICE_EXTENSIONS : cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_APPLE_gl_sharing cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions
CL_DEVICE_NAME : Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz
CL_DEVICE_PROFILE : FULL_PROFILE
CL_DEVICE_VENDOR : Intel
CL_DEVICE_VERSION : OpenCL 1.0
CL_DRIVER_VERSION : 1.0
CL_DEVICE_ADDRESS_BITS : 64
CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE : 64
CL_DEVICE_MAX_CLOCK_FREQUENCY : 3100
CL_DEVICE_MAX_COMPUTE_UNITS : 4
CL_DEVICE_MAX_CONSTANT_ARGS : 8
CL_DEVICE_MAX_READ_IMAGE_ARGS : 128
CL_DEVICE_MAX_SAMPLERS : 16
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS : 3
CL_DEVICE_MAX_WRITE_IMAGE_ARGS : 8
CL_DEVICE_MEM_BASE_ADDR_ALIGN : 1024
CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE : 128
CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT : 4
CL_DEVICE_VENDOR_ID : 16909312
CL_DEVICE_AVAILABLE : yes
CL_DEVICE_COMPILER_AVAILABLE : yes
CL_DEVICE_ENDIAN_LITTLE : yes
CL_DEVICE_ERROR_CORRECTION_SUPPORT : no
CL_DEVICE_IMAGE_SUPPORT : yes
CL_DEVICE_GLOBAL_MEM_CACHE_SIZE : 6291456
CL_DEVICE_GLOBAL_MEM_SIZE : 12884901888
CL_DEVICE_LOCAL_MEM_SIZE : 16384
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE : 65536
CL_DEVICE_MAX_MEM_ALLOC_SIZE : 3221225472
CL_DEVICE_IMAGE2D_MAX_HEIGHT : 8192
CL_DEVICE_IMAGE2D_MAX_WIDTH : 8192
CL_DEVICE_IMAGE3D_MAX_DEPTH : 2048
CL_DEVICE_IMAGE3D_MAX_HEIGHT : 2048
CL_DEVICE_IMAGE3D_MAX_WIDTH : 2048
CL_DEVICE_MAX_PARAMETER_SIZE : 4096
CL_DEVICE_MAX_WORK_GROUP_SIZE : 1
CL_DEVICE_PROFILING_TIMER_RESOLUTION : 1
CL_DEVICE_MAX_WORK_ITEM_SIZES : 1, 1, 1
CL_DEVICE_TYPE : CPU
> a) whether everything still works.
> b) whether the speed is faster, slower, or about the same.
> What should I try (in other words, define "everything" :-) ?
>
> I clicked on all the patterns, and hit Space, Run (return key), Reset
> (command-R), Step By N (tab), and Run Slower (-) a few times, and it
> did what I expected.
That sounds fine. I'd be curious to see the output from
Action > Show OpenCL Diagnostics.
> How can I measure the speed ...
> More importantly, how do I get and build an older version to compare it to?
Just download Ready-0.2.1-Mac.zip from the Ready website and compare
its speed with your latest build. Run a few patterns for a few secs
and compare the status bar numbers showing fps and mcgs.
Andrew
Found 1 platform(s):
Platform 1:
CL_PLATFORM_PROFILE : FULL_PROFILE
CL_PLATFORM_VERSION : OpenCL 1.0 (Dec 23 2010 17:30:26)
CL_PLATFORM_NAME : Apple
CL_PLATFORM_VENDOR : Apple
CL_PLATFORM_EXTENSIONS :
Found 3 device(s) on this platform.
Device 1:
CL_DEVICE_EXTENSIONS : cl_APPLE_gl_sharing
CL_DEVICE_NAME : ATI Radeon HD 5770
CL_DEVICE_PROFILE : FULL_PROFILE
CL_DEVICE_VENDOR : AMD
CL_DEVICE_VERSION : OpenCL 1.0
CL_DRIVER_VERSION : 1.0
CL_DEVICE_ADDRESS_BITS : 32
CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE : 0
CL_DEVICE_MAX_CLOCK_FREQUENCY : 1195
CL_DEVICE_EXTENSIONS : cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_APPLE_gl_sharing cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions
CL_DEVICE_NAME : GeForce GT 120
CL_DEVICE_PROFILE : FULL_PROFILE
CL_DEVICE_VENDOR : NVIDIA
CL_DEVICE_VERSION : OpenCL 1.0
CL_DRIVER_VERSION : CLH 1.0
CL_DEVICE_ADDRESS_BITS : 32
CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE : 0
CL_DEVICE_MAX_CLOCK_FREQUENCY : 1400
CL_DEVICE_MAX_COMPUTE_UNITS : 4
CL_DEVICE_MAX_CONSTANT_ARGS : 9
CL_DEVICE_MAX_READ_IMAGE_ARGS : 128
CL_DEVICE_MAX_SAMPLERS : 16
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS : 3
CL_DEVICE_MAX_WRITE_IMAGE_ARGS : 8
CL_DEVICE_MEM_BASE_ADDR_ALIGN : 1024
CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE : 128
CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT : 1
CL_DEVICE_VENDOR_ID : 16918016
CL_DEVICE_AVAILABLE : yes
CL_DEVICE_COMPILER_AVAILABLE : yes
CL_DEVICE_ENDIAN_LITTLE : yes
CL_DEVICE_ERROR_CORRECTION_SUPPORT : no
CL_DEVICE_IMAGE_SUPPORT : yes
CL_DEVICE_GLOBAL_MEM_CACHE_SIZE : 0
CL_DEVICE_GLOBAL_MEM_SIZE : 536870912
CL_DEVICE_LOCAL_MEM_SIZE : 16384
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE : 65536
CL_DEVICE_MAX_MEM_ALLOC_SIZE : 134217728
CL_DEVICE_IMAGE2D_MAX_HEIGHT : 4096
CL_DEVICE_IMAGE2D_MAX_WIDTH : 4096
CL_DEVICE_IMAGE3D_MAX_DEPTH : 2048
CL_DEVICE_IMAGE3D_MAX_HEIGHT : 2048
CL_DEVICE_IMAGE3D_MAX_WIDTH : 2048
CL_DEVICE_MAX_PARAMETER_SIZE : 4352
CL_DEVICE_MAX_WORK_GROUP_SIZE : 512
CL_DEVICE_PROFILING_TIMER_RESOLUTION : 1000
CL_DEVICE_MAX_WORK_ITEM_SIZES : 512, 512, 64
CL_DEVICE_TYPE : GPU
Device 3:
CL_DEVICE_EXTENSIONS : cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_APPLE_gl_sharing cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions
CL_DEVICE_NAME : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
CL_DEVICE_PROFILE : FULL_PROFILE
CL_DEVICE_VENDOR : Intel
CL_DEVICE_VERSION : OpenCL 1.0
CL_DRIVER_VERSION : 1.0
CL_DEVICE_ADDRESS_BITS : 64
CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE : 64
CL_DEVICE_MAX_CLOCK_FREQUENCY : 2260
CL_DEVICE_MAX_COMPUTE_UNITS : 16
CL_DEVICE_MAX_CONSTANT_ARGS : 8
CL_DEVICE_MAX_READ_IMAGE_ARGS : 128
CL_DEVICE_MAX_SAMPLERS : 16
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS : 3
CL_DEVICE_MAX_WRITE_IMAGE_ARGS : 8
CL_DEVICE_MEM_BASE_ADDR_ALIGN : 1024
CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE : 128
CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT : 4
CL_DEVICE_VENDOR_ID : 16909312
CL_DEVICE_AVAILABLE : yes
CL_DEVICE_COMPILER_AVAILABLE : yes
CL_DEVICE_ENDIAN_LITTLE : yes
CL_DEVICE_ERROR_CORRECTION_SUPPORT : no
CL_DEVICE_IMAGE_SUPPORT : yes
CL_DEVICE_GLOBAL_MEM_CACHE_SIZE : 8388608
CL_DEVICE_GLOBAL_MEM_SIZE : 20937965568
CL_DEVICE_LOCAL_MEM_SIZE : 16384
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE : 65536
CL_DEVICE_MAX_MEM_ALLOC_SIZE : 5234491392
CL_DEVICE_IMAGE2D_MAX_HEIGHT : 8192
CL_DEVICE_IMAGE2D_MAX_WIDTH : 8192
CL_DEVICE_IMAGE3D_MAX_DEPTH : 2048
CL_DEVICE_IMAGE3D_MAX_HEIGHT : 2048
CL_DEVICE_IMAGE3D_MAX_WIDTH : 2048
CL_DEVICE_MAX_PARAMETER_SIZE : 4096
CL_DEVICE_MAX_WORK_GROUP_SIZE : 1
CL_DEVICE_PROFILING_TIMER_RESOLUTION : 1
CL_DEVICE_MAX_WORK_ITEM_SIZES : 1, 1, 1
CL_DEVICE_TYPE : CPU
Failed to open file. Error:
OpenCL_RD::ReloadContextIfNeeded : Failed to create command queue: Invalid value
Found 1 platform(s):
Platform 1:
CL_PLATFORM_PROFILE : FULL_PROFILE
CL_PLATFORM_VERSION : OpenCL 1.0 (Dec 26 2010 12:52:21)
CL_PLATFORM_NAME : Apple
CL_PLATFORM_VENDOR : Apple
CL_PLATFORM_EXTENSIONS :
Found 2 device(s) on this platform.
Device 1:
CL_DEVICE_EXTENSIONS : cl_APPLE_gl_sharing
CL_DEVICE_NAME : ATI Radeon HD 6750M
CL_DEVICE_PROFILE : FULL_PROFILE
CL_DEVICE_VENDOR : AMD
CL_DEVICE_VERSION : OpenCL 1.0
CL_DRIVER_VERSION : 1.0
CL_DEVICE_ADDRESS_BITS : 32
CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE : 0
CL_DEVICE_MAX_CLOCK_FREQUENCY : 150
CL_DEVICE_MAX_COMPUTE_UNITS : 5
CL_DEVICE_EXTENSIONS : cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_APPLE_gl_sharing cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions
CL_DEVICE_NAME : Intel(R) Core(TM) i7-2720QM CPU @ 2.20GHz
CL_DEVICE_PROFILE : FULL_PROFILE
CL_DEVICE_VENDOR : Intel
CL_DEVICE_VERSION : OpenCL 1.0
CL_DRIVER_VERSION : 1.0
CL_DEVICE_ADDRESS_BITS : 64
CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE : 64
CL_DEVICE_MAX_CLOCK_FREQUENCY : 2200
CL_DEVICE_MAX_COMPUTE_UNITS : 8
CL_DEVICE_MAX_CONSTANT_ARGS : 8
CL_DEVICE_MAX_READ_IMAGE_ARGS : 128
CL_DEVICE_MAX_SAMPLERS : 16
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS : 3
CL_DEVICE_MAX_WRITE_IMAGE_ARGS : 8
CL_DEVICE_MEM_BASE_ADDR_ALIGN : 1024
CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE : 128
CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT : 4
CL_DEVICE_VENDOR_ID : 16909312
CL_DEVICE_AVAILABLE : yes
CL_DEVICE_COMPILER_AVAILABLE : yes
CL_DEVICE_ENDIAN_LITTLE : yes
CL_DEVICE_ERROR_CORRECTION_SUPPORT : no
CL_DEVICE_IMAGE_SUPPORT : yes
CL_DEVICE_GLOBAL_MEM_CACHE_SIZE : 6291456
CL_DEVICE_GLOBAL_MEM_SIZE : 3221225472
CL_DEVICE_LOCAL_MEM_SIZE : 16384
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE : 65536
CL_DEVICE_MAX_MEM_ALLOC_SIZE : 1073741824
CL_DEVICE_IMAGE2D_MAX_HEIGHT : 8192
CL_DEVICE_IMAGE2D_MAX_WIDTH : 8192
CL_DEVICE_IMAGE3D_MAX_DEPTH : 2048
CL_DEVICE_IMAGE3D_MAX_HEIGHT : 2048
CL_DEVICE_IMAGE3D_MAX_WIDTH : 2048
CL_DEVICE_MAX_PARAMETER_SIZE : 4096
CL_DEVICE_MAX_WORK_GROUP_SIZE : 1
CL_DEVICE_PROFILING_TIMER_RESOLUTION : 1
CL_DEVICE_MAX_WORK_ITEM_SIZES : 1, 1, 1
CL_DEVICE_TYPE : CPU
To remove the effects of non-OpenCL code, I tend to do this:
- use a representative pattern like tip-splitting.vti
- set the dimensions to something large, e.g. 512x512x1
- set the timesteps_per_render (running speed) to something large
enough that the number of steps far outweighs the render time, such
that rendering happens once a second or less.
>
> --
> Robert Munafo -- mrob.com
> Follow me at: gplus.to/mrob - fb.com/mrob27 - twitter.com/mrob_27 -
> mrob27.wordpress.com - youtube.com/user/mrob143 - rilybot.blogspot.com
>
--
pulsate.vti is slower for me too in the new Ready, but only at smaller
sizes. At e.g. 512x128x32 the new one is faster again.
With this approach the work group size allocation is done by the
OpenCL driver, so we probably will see strange variations. But yes I
think for now we should go with this new approach. In future we can
revisit the issue by looking at performance tuning algorithms (search
the parameter space for the fastest way to process).
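
A very rough sketch of what such a tuning pass might look like: time each candidate local work group size and keep the fastest. Everything here is hypothetical, including the function name and the idea that the caller supplies the timing callback; it is one possible shape for the search, not a plan for Ready.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical tuning helper: calls the caller-supplied `seconds_for` once per
// candidate local work group size and returns the candidate with the lowest time.
// Returns 0 when there are no candidates, standing for "let the driver choose".
std::size_t FastestLocalSize(const std::vector<std::size_t>& candidates,
                             const std::function<double(std::size_t)>& seconds_for)
{
    bool found = false;
    std::size_t best = 0;
    double best_time = 0.0;
    for (std::size_t c : candidates) {
        const double t = seconds_for(c);   // e.g. run a fixed number of steps and time them
        if (!found || t < best_time) {
            found = true;
            best = c;
            best_time = t;
        }
    }
    return best;
}
```

The candidates would need to be filtered to sizes the device accepts before timing them, and the result could differ per pattern and per grid size.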
--
> Failed to open file. Error:
>
> OpenCL_RD::ReloadContextIfNeeded : Failed to create command queue: Invalid value
Someone here
https://discussions.apple.com/thread/3695124?start=0&tstart=0
reported they got the above error when connecting to their Mac remotely
via ssh. Were you doing something like that?
Ready looks like it is calling clCreateCommandQueue() correctly so
I suspect the problem is in your GPU or is due to an Apple-specific bug.
Andrew
The documentation says that clCreateCommandQueue returns
CL_INVALID_VALUE when the properties value isn't supported:
http://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/clCreateCommandQueue.html
We pass in 0, which should be fine.
Do you get the same error on the CPU on that platform?
It's very odd that the 8-thread i7-2720QM should report
CL_DEVICE_MAX_WORK_GROUP_SIZE of 1. That's why it's only using one
core. Is there any way to update your OpenCL drivers?
--
There's a function on the Action menu: Select OpenCL Device... that
should show the two devices and allow you to select which one to use.
Did you try that?
Have you tried turning it on and then running Ready?
>
> Apple's system software is very aggressive about using the discrete GPU even
> when it is absolute overkill, and even after all client apps have been shut
> down.
What is a 'discrete GPU'?
>
> Even something as trivial as the Twitter dashboard widget (really!) causes
> the OS to turn on the Radeon GPU. The battery life is about 2.5 times longer
> if you disable this behaviour. See:
>
> http://www.anandtech.com/show/4205/the-macbook-pro-review-13-and-15-inch-2011-brings-sandy-bridge/9
> http://reviews.cnet.com/8301-13727_7-20042341-263.html
> http://www.thepunditreport.com/2011/08/revealed-mac-os-x-lion-battery-drain.html
> http://tidbits.com/article/11982
> http://www.macworld.com/article/1163166/gfxcardstatus_2_1.html
> http://mihail.stoynov.com/2011/08/18/dramatically-increase-battery-life-on-a-macbook-pro-with-2-gpus/
>
> In the rare event that I want to play a 3D game I then turn it on manually
> using the gfx menubar icon.
>
> - Robert
>
>
>> It's very odd that the 8-threads i7-2720QM should report
>> CL_DEVICE_MAX_WORK_GROUP_SIZE of 1. That's why it's only using one
>> core. Is there any way to update your OpenCL drivers?
--
> I do indeed have an ATI Radeon HD 6750 M GPU in this computer, but I keep
> it turned off to save battery life, using the excellent and very
> popular gfxCardStatus utility.
Tim:
> Have you tried turning it on and then running Ready?
Robert:
> I wouldn't do that. If it requires having half the battery life, or less,
> then I'll just not use the application.
I think Tim just wanted to know if turning on the GPU prevents the
error messages Ready was reporting (when the GPU is selected as your
OpenCL device of course). Once you've done that test you can select
the CPU as your OpenCL device and Ready will remember that setting
in your ReadyPrefs file.
Tim, I'm not sure if this problem is common enough to worry about,
but maybe the code in OpenCL_RD::ReloadContextIfNeeded() should
automatically switch to another device (if there is one, and with
a suitable warning) if either clCreateContext or clCreateCommandQueue
return an error.
Andrew
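
A minimal sketch of the fallback idea suggested above, kept independent of the OpenCL API. The callback try_open is a stand-in for whatever creates the context and command queue on a given device and reports success; the function name and signature are hypothetical and do not come from Ready's source.

```cpp
#include <functional>

// Hypothetical sketch: tries the preferred device first, then the remaining
// devices in order. `try_open(i)` is assumed to attempt context and command
// queue creation on device i and return true on success.
// Returns the index of the first device that worked, or -1 if none did.
int PickWorkingDevice(int num_devices, int preferred,
                      const std::function<bool(int)>& try_open)
{
    if (preferred >= 0 && preferred < num_devices && try_open(preferred))
        return preferred;
    for (int i = 0; i < num_devices; ++i) {
        if (i == preferred) continue;  // already tried above
        if (try_open(i)) return i;     // the caller should warn the user about the switch
    }
    return -1;
}
```

If the returned index differs from the preferred one, that is the point at which to show the warning.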
I cannot use "Apple : ATI Radeon HD 6750 M GPU" to open the file "schlogl.vtl".
You might wish to try selecting a different OpenCL Device or to open a different pattern file.
Details of error:
OpenCL_RD::ReloadContextIfNeeded : Failed to create command queue: Invalid value
Nope, looks good to me.
Andrew