Hi Stephan,
I think our documentation for the GPU support is badly lacking. My apologies for
not doing this carefully. Using the GPU support requires keeping a few points in
mind, and I somehow missed mentioning them anywhere. Here are a few pointers on
how to use the GPU support better.
Look at the tests here:
https://github.com/pypr/pysph/blob/master/pysph/sph/tests/test_acceleration_eval.py
The tests are pretty extensive: we test every feature that is available on both
CPU and GPU backends, as you can see from the sheer length of that file. They
also show the minor changes that are needed for GPU support. Search for "gpu"
and you will notice a .gpu attribute. For example, see this test:
https://github.com/pypr/pysph/blob/master/pysph/sph/tests/test_acceleration_eval.py#L684
You will see a pa.gpu.pull('u') call there.
Similarly, see this:
https://github.com/pypr/pysph/blob/master/pysph/sph/tests/test_acceleration_eval.py#L910
You will see a pa.gpu.push('u') call there.
This is one major point that we have not documented properly. Basically, if you
choose a GPU option, the particle array has a ".gpu" attribute that manages the
data going between the host and the device. For CPU backends .gpu is None, so
you can use that as a simple check.
When you change a value on the host, e.g. with
pa.u[1] = 123.0
you must follow it up with a pa.gpu.push('u'). This pushes just that attribute,
and you can pass multiple arguments, as in pa.gpu.push('u', 'v'). If you just
call pa.gpu.push() it will push all the arrays, which can be a large amount of
data to transfer and may not be what you want.
Conversely, when the data changes on the GPU you need to pull it back to the
host, which explains the pa.gpu.pull calls. Once you get this right, the rest is
pretty straightforward.
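To make the push/pull pattern concrete, here is a minimal sketch. Please treat
it as an illustration only: the helper names set_on_host and read_on_host are
made up for this mail and are not part of PySPH, and the two Fake classes are
hypothetical stand-ins for a real particle array so that the snippet runs
without PySPH or a GPU. The only things it assumes about a real particle array
are the ones described above: .gpu is None on CPU backends, and otherwise it has
push and pull methods that take property names.

```python
# Sketch only: FakeDeviceHelper and FakeParticleArray are hypothetical
# stand-ins so this runs without PySPH or a GPU. With PySPH you would pass a
# real particle array instead.


class FakeDeviceHelper:
    """Stand-in for pa.gpu that records which properties were transferred."""

    def __init__(self):
        self.pushed = []
        self.pulled = []

    def push(self, *names):
        self.pushed.extend(names)

    def pull(self, *names):
        self.pulled.extend(names)


class FakeParticleArray:
    """Stand-in for a particle array with a single property, u."""

    def __init__(self, gpu=None):
        self.u = [0.0, 0.0, 0.0]
        self.gpu = gpu  # None on CPU backends


def set_on_host(pa, name, index, value):
    """Change one value on the host, then push that property if on a GPU."""
    getattr(pa, name)[index] = value
    if pa.gpu is not None:
        pa.gpu.push(name)


def read_on_host(pa, name):
    """Pull a property back from the device (if any) before reading it."""
    if pa.gpu is not None:
        pa.gpu.pull(name)
    return getattr(pa, name)


cpu_pa = FakeParticleArray()                        # CPU backend: .gpu is None
gpu_pa = FakeParticleArray(gpu=FakeDeviceHelper())  # GPU backend

set_on_host(cpu_pa, 'u', 1, 123.0)  # no transfer needed
set_on_host(gpu_pa, 'u', 1, 123.0)  # also calls gpu_pa.gpu.push('u')
u = read_on_host(gpu_pa, 'u')       # calls gpu_pa.gpu.pull('u') first
```

Guarding on "pa.gpu is not None" lets the same code run unchanged on both CPU
and GPU backends, and passing the property name keeps the transfer limited to
the one array that actually changed.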
Apart from this change, you need to be careful when writing reductions, again
keeping these points in mind. The tests check all of these and show examples for
each of them, so they are a good resource to read. The tests are also fairly
easy to read (I hope) and show how you can use the framework to do things.
Outside of pull/push, we try to make all the methods of the particle array class
do the right thing. If you find specific bugs, please do let us know.
I hope this clarifies things and makes it easier to use the GPU support.
cheers,
Prabhu
> * python square_droplet.py --opencl
> o simulation worked fine but gave some warnings
> * python square_droplet.py --scheme morris --opencl
> o simulation worked same as previous one which makes sense because
> morris scheme is defined as the default option in square_droplet.py
> * python square_droplet.py --scheme morris --opencl --use-double
> o simulation gives same results as previous one but took little bit
> longer than the case without --use-double option which is obvious
> * python square_droplet.py --scheme adami --opencl
> o simulation finishes with some new warnings not seen before and takes
> a very long time
> * python square_droplet.py --scheme adami --opencl --use-double
> o simulation crashes with a TypeError