Hardware recommendations for SPH

marina.k...@gmail.com

Feb 4, 2019, 7:10:48 AM

to pysph-users

I hope this question is not too far off topic. For my new SPH project I have an opportunity to buy new hardware. I am fairly new to this field and wonder whether there are any particular recommendations for hardware that is especially good for SPH, e.g. particular graphics cards, processors, etc. How much RAM should be available?

Thanks a lot in advance,

Marina

alom...@gmail.com

Feb 4, 2019, 11:12:20 AM

to pysph-users

Hi Mariana,

I am no specialist in hardware (or software, for that matter), but I work with large-scale SPH simulations, and at some point parallelism becomes a must, so if your simulations are large enough you might consider running them on an HPC cluster, if you have access to one. Nonetheless, if you want to buy a new machine for your own use, I would recommend a desktop with at least an 8-core processor (Xeon?) and a minimum of 32 GB of RAM. If you need to generate lots of output, disk space will be an issue, and you might consider more than 2-3 TB of hard-disk storage. As for GPUs, if you can get a top card with CUDA/OpenCL support, it might be helpful in the future. Finally, get a Linux-based system or set up a Linux boot partition on your dedicated machine; it will help you much more than a Windows-based one.
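To put the RAM and disk figures above in context, here is a rough sizing sketch. The particle count, the number of per-particle arrays, and the number of output dumps are all illustrative assumptions, not values taken from PySPH; substitute your own.

```python
# Rough sizing sketch: memory and disk needed for per-particle arrays.
# All counts used under __main__ are illustrative assumptions.

def array_bytes(n_particles: int, n_properties: int, bytes_per_value: int = 8) -> int:
    """Bytes needed to hold n_properties scalar arrays for n_particles."""
    return n_particles * n_properties * bytes_per_value


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 1024**3


if __name__ == "__main__":
    n = 10_000_000          # assumed particle count
    props_in_memory = 40    # assumed number of per-particle arrays held in RAM
    props_saved = 10        # assumed number of arrays written per output file
    n_outputs = 500         # assumed number of output dumps

    ram = array_bytes(n, props_in_memory)
    disk = array_bytes(n, props_saved) * n_outputs
    print(f"RAM for particle arrays: {to_gib(ram):.1f} GiB")
    print(f"Disk for uncompressed output: {to_gib(disk):.0f} GiB")
```

Neighbour lists and solver work arrays add to the in-memory figure, so treat the result as a lower bound.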

I hope this helps a little, and maybe other people with more experience can give more details (names, models) of specific processors, etc.

Good luck,

Alomir

alom...@gmail.com

Feb 4, 2019, 11:16:06 AM

to pysph-users

*Marina (Sorry for the typo)

marina.k...@gmail.com

Feb 4, 2019, 4:45:49 PM

to pysph-users

Thank you Alomir! The numbers you give are a good guideline for component selection!

Prabhu Ramachandran

Feb 5, 2019, 2:21:01 AM

to alom...@gmail.com, pysph-users

Hi Marina,

To add to the excellent points above, with GPUs, it all depends on your budget and requirements. You can get much cheaper gaming cards like the GeForce 1080Ti for a fraction of the price of a P100 but the difficulty is that they do not perform nearly as well on double precision calculations. Their single precision performance can be equivalent to the much more expensive Tesla cards. So it really depends on your budget and requirements. You can see the specs here: https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units

cheers,

Prabhu

marina.k...@gmail.com

Apr 9, 2019, 2:38:59 PM

to pysph-users

It's been a while since I last asked, but I am still unsure about the GPU.

Does it then make sense to get a gaming GPU with something like 300 GFLOPS if a Tesla with TFLOPS-level performance is beyond the budget? Is double-precision processing power the only feature that matters? In that case AMD GPUs have a much better price/performance ratio...

Prabhu Ramachandran

Apr 11, 2019, 3:51:58 AM

to pysph...@googlegroups.com

We recently bought a few gaming GPUs, specifically 1050 and 1070 Ti cards. They are old, but my budget is limited and our goal is to make sure that PySPH performs reasonably on this hardware. They are quite fast though (the equivalent of a 40-50 core CPU for a large problem). With this hardware, double precision slows things down by a factor of 1.7 or so, which is not too bad considering that the cards are cheap. The performance is comparable to a P100; I think a 1080 Ti is very close in performance to a P100. I am sure people with hand-tuned codes may be able to extract more performance from these. My numbers are all based on some simple tests with PySPH. It is true that the gaming GPUs are in theory much worse at double precision than the Tesla ones; however, most of our CFD problems are limited not by compute but by memory bandwidth, so a simplistic comparison is not enough. You also need to make sure you have enough particles to feed the GPU; with too few particles it may not give you any speed-up.
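The point about memory bandwidth can be illustrated with a roofline-style estimate, in which attainable performance is the smaller of the compute peak and bandwidth times arithmetic intensity. This is only a sketch: the hardware figures are approximate public specifications, and the intensity of 0.5 FLOPs per byte is an assumed value, not a measurement of any PySPH kernel.

```python
# Roofline-style estimate: is a kernel limited by compute or by memory bandwidth?
# Hardware numbers are approximate public specs used for illustration only;
# the arithmetic intensity is an assumption, not a PySPH measurement.

def attainable_gflops(peak_gflops: float, bandwidth_gb_s: float, flops_per_byte: float) -> float:
    """Roofline model: performance is capped by the lower of the two limits."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


cards = {
    # name: (approx. FP64 peak in GFLOPS, approx. memory bandwidth in GB/s)
    "GTX 1080 Ti": (350.0, 484.0),
    "Tesla P100": (4700.0, 732.0),
}

intensity = 0.5  # assumed FLOPs performed per byte moved

for name, (peak, bandwidth) in cards.items():
    estimate = attainable_gflops(peak, bandwidth, intensity)
    print(f"{name}: ~{estimate:.0f} GFLOPS attainable (FP64 peak {peak:.0f})")
```

Under these assumptions both cards are capped by bandwidth rather than by their double-precision peak, so the gap between them shrinks to the ratio of their memory bandwidths.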

We do not yet support multiple GPUs with PySPH but hope to support that in the future. Unfortunately, things are a bit tricky when finding the right hardware. Your best bet may be to test your own code and then decide if it works well enough for you.

cheers,

Prabhu

marina.k...@gmail.com

Apr 11, 2019, 5:11:14 AM

to pysph-users

Thanks Prabhu, your experience is very helpful! Is there a particular reason why you prefer Nvidia graphics cards? I am thinking about the AMD Radeon VII, which has an interesting performance/price ratio. It supports OpenCL 2.0, so it should also work with the PyOpenCL libraries, right?

Cheers,

Marina

Prabhu Ramachandran

Apr 11, 2019, 9:21:29 AM

to marina.k...@gmail.com, pysph-users

No particular reason; I think these Nvidia cards were easily available. I haven't tried the Radeon VII. I have a Radeon 560 in my MacBook; it is not very fast, but it does work. I would need to test the Radeon VII, but yes, if it supports OpenCL it should work.

Regards,

Prabhu

marina.k...@gmail.com

Apr 11, 2019, 3:01:49 PM

to pysph-users

Well, in that case I will give it a try :) If you have a standardized test example and are interested, I could run a test once I get the system up and running.

Prabhu Ramachandran

Apr 12, 2019, 2:04:17 AM

to marina.k...@gmail.com, pysph-users

Sure, once you install pyopencl you should install compyle from master and pysph also from master. Here are some quick instructions, assuming you have a suitable Python environment; a miniconda env with the latest Python 3.7 works very well, for example.

pip install cyarray pyopencl

git clone https://github.com/pypr/pysph

cd pysph

pip install -r requirements.txt

python setup.py develop

Once this is all set up you should be able to run the following:

pysph run cube --opencl --np 1e6 --tf 2e-3 --disable-output

This will take 20 timesteps with 1 million particles and not dump any output. It's a silly test, but useful because it lets you assess just the raw performance.

You can compare with a CPU by looking at the numbers from, for instance:

pysph run cube --openmp --np 1e6 --tf 2e-3 --disable-output
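The GPU/CPU comparison above can be scripted. The sketch below assumes that each run prints a line of the form "Run took: X secs" (as seen in the timings posted in this thread) and that the `pysph` command is on the PATH; the helper names are hypothetical and not part of PySPH.

```python
import re
import shutil
import subprocess

# Matches lines such as "Run took: 136.84368 secs".
RUN_TOOK = re.compile(r"Run took:\s*([0-9.]+)\s*secs")


def parse_run_time(output: str) -> float:
    """Extract the seconds from a 'Run took: X secs' line."""
    match = RUN_TOOK.search(output)
    if match is None:
        raise ValueError("no 'Run took' line found in output")
    return float(match.group(1))


def time_command(args) -> float:
    """Run a command and return the run time it reports."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    # Search both streams, since the line may be logged to either one.
    return parse_run_time(result.stdout + result.stderr)


if __name__ == "__main__":
    if shutil.which("pysph") is None:
        print("pysph not found on PATH; install it before running the benchmark")
    else:
        common = ["--np", "1e6", "--tf", "2e-3", "--disable-output"]
        gpu = time_command(["pysph", "run", "cube", "--opencl", *common])
        cpu = time_command(["pysph", "run", "cube", "--openmp", *common])
        print(f"GPU: {gpu:.2f} s, CPU: {cpu:.2f} s, speed-up: {cpu / gpu:.1f}x")
```

The first run on a fresh install may include kernel compilation, so it is probably worth running each case twice and keeping the second timing.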

You could run a more realistic case if you want, for example:

pysph run dam_break_3d --opencl --tf 0.5

or

pysph run sphysics.dam_break --opencl --tf 0.5

The 3D benchmarks perform much better. We are still optimizing the GPU performance, but it does work. You can change the --tf option to suit your needs. The new progress bar is pretty handy for getting a quick sense of the performance. The default on the GPU is single (float) precision. For double precision you can do:

pysph run sphysics.dam_break --opencl --tf 0.5 --use-double

HTH.

cheers,

Prabhu

Marina Kauffeldt

Sep 18, 2019, 6:50:49 AM

to pysph-users

Hi Prabhu,

My last post was a while ago, but I finally got the new hardware I wanted to test. However, when I try to run any of your benchmark codes with the --opencl option, I get an error:

  File "/usr/local/lib/python3.7/site-packages/pyopencl/__init__.py", line 1385, in create_some_context
    platforms = get_platforms()
pyopencl._cl.LogicError: clGetPlatformIDs failed: PLATFORM_NOT_FOUND_KHR

I have followed the installation procedure above, installed the drivers provided by the manufacturer, and I am using a miniconda environment.

My GPU is an AMD Radeon VII, and the manufacturer claims that it supports OpenCL 2.0.

I also tried to google this error but have not found any useful suggestions so far.

Do you have any clue what is going wrong there?

Thanks a lot for your help in advance,

Marina
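PLATFORM_NOT_FOUND_KHR is what the OpenCL loader reports when no vendor platform (driver/ICD) is registered on the system. A quick way to check what PyOpenCL can see, independently of PySPH, is to list the platforms and devices. This is a minimal sketch; `supports_double` and `list_devices` are hypothetical helper names, and only the standard `cl_khr_fp64` extension is checked.

```python
def supports_double(extensions: str) -> bool:
    """True if a device's extension string advertises double precision."""
    return "cl_khr_fp64" in extensions.split()


def list_devices() -> None:
    """Print every OpenCL platform and device that PyOpenCL can find."""
    try:
        import pyopencl as cl
    except ImportError:
        print("pyopencl is not installed")
        return
    try:
        platforms = cl.get_platforms()
    except cl.Error as exc:
        # Raised, for example, as PLATFORM_NOT_FOUND_KHR when no driver/ICD is present.
        print(f"No OpenCL platform available: {exc}")
        return
    for platform in platforms:
        print(f"Platform: {platform.name} ({platform.version})")
        for device in platform.get_devices():
            mem_gib = device.global_mem_size / 1024**3
            fp64 = supports_double(device.extensions)
            print(f"  {device.name}: {mem_gib:.1f} GiB, double precision: {fp64}")


if __name__ == "__main__":
    list_devices()
```

If no platform is listed here, the problem lies with the driver installation rather than with PySPH or compyle.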

Marina Kauffeldt

Sep 20, 2019, 7:26:29 AM

to pysph-users

Hi Prabhu,

I have now fixed my initial error (missing driver). Of the four OpenCL benchmark tests, one runs:

pysph run sphysics.dam_break --opencl --tf 0.5

Run took: 136.84368 secs

How does that compare to what you usually get?

pysph run sphysics.dam_break --opencl --tf 0.5 --use-double

gives me: Memory access error (memory dumped)

The others give me zero division and double conversion errors. I put the summary in the log file.

Any ideas what I can do to make them run?

Thanks,

Marina

log_benchmark_error

Prabhu Ramachandran

Sep 21, 2019, 3:15:55 AM

to Marina Kauffeldt, pysph-users

Hi,

On a 1070Ti I get this:

pysph run sphysics.dam_break --opencl --tf 0.5 --pfreq 1000

Run took: 166.82990 secs

I've increased the pfreq because dumping the files out takes a long while due to the host-device transfers. I would imagine that if you did the same you would get even better numbers. Our cards have 8 GB of RAM, but this example takes less than 0.5 GB, so I am not sure why your double precision case is crashing.

I get this:

$ pysph run sphysics.dam_break --opencl --tf 0.1 --pfreq 1000 --use-double

Run took: 46.54599 secs

and

$ pysph run sphysics.dam_break --opencl --tf 0.1 --pfreq 1000

Run took: 28.66729 secs
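As a small worked calculation using only the timings posted in this thread, the ratios come out as follows. Note that the Radeon VII run used the default output frequency while the 1070 Ti run used --pfreq 1000, so the second ratio is not a like-for-like comparison.

```python
# Timings reported in this thread (seconds), all for sphysics.dam_break with --opencl.
double_1070ti = 46.54599    # --tf 0.1 --pfreq 1000 --use-double
single_1070ti = 28.66729    # --tf 0.1 --pfreq 1000
single_1070ti_long = 166.82990   # --tf 0.5 --pfreq 1000
single_radeon_vii = 136.84368    # --tf 0.5, default output frequency

slowdown = double_1070ti / single_1070ti
relative = single_1070ti_long / single_radeon_vii

print(f"Double precision slowdown on the 1070 Ti: {slowdown:.2f}x")   # 1.62x
print(f"1070 Ti time relative to the Radeon VII: {relative:.2f}x")    # 1.22x
```

The first ratio is consistent with the factor of roughly 1.7 quoted earlier in the thread for double precision on gaming cards.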

As regards the other errors, I seem to be getting them too! We will investigate this and fix them before we push a long overdue release. A few months ago we added some preliminary CUDA support and it is possible that some of these examples broke at that point. I cannot promise anything for the coming week but thereafter I hope to be able to spend some time fixing these and pushing a release. Thanks for your patience. We have been busy with other things, our main GPU contributor Aditya has graduated and I am unable to find sustained development time during the semester.

Regards,

Prabhu

Stephan

Oct 21, 2021, 12:32:02 PM

to pysph-users

Dear all,

I actually have the same objective as initially posed by Marina: I have the opportunity to assemble a new system. We were not sure whether to go for a compute node with, say, 28 cores, or a system with a dedicated GPU. Based on the information here, which was very helpful, a GPU seems to be the clear winner and easily exceeds the performance of a multi-CPU system (please correct me if I'm wrong).

I have some additional questions specifically related to the GPU selection. I've read that the compyle package is not a silver bullet and performance is still very dependent on the system: "Performance optimization can be hard and is platform specific. What works on the CPU may not work on the GPU and vice-versa. Compyle does not do anything to make this aspect easier. All the issues with memory bandwidth, cache, false sharing etc. still remain. Differences between memory architectures of CPUs and GPUs are not avoided at all – you still have to deal with it. But you can do so from the comfort of one simple programming language, Python."

I just use PySPH as in the examples in the documentation and here in the group. I write my own equations and time steppers while sticking to the conventions in the manual. Still, I'm not sure if this comment about the compyle package is a concern to deal with or if it has already been taken care of in the PySPH package itself. I would like to get optimal performance while just simply running my scripts with:

$ python pysph_script.py --cuda --use-double

instead of

$ python pysph_script.py

So my first question actually is: which GPU architecture is the optimal choice to avoid the mentioned optimization issues, so that I can simply run my code as efficiently as possible without having to dig into parallelization theory?

My second question is related to the possibility of buying either 1 really good GPU or spending the same amount on 2 or even 3 lesser GPUs to run different simulations at the same time. Is there an optimum in performance/price ratio? Then I could compare, for example, 3 of the best performance/price GPUs with the best GPU within the budget range. Does anyone have experience with running PySPH on a system with multiple GPUs?
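For the scenario of running different simulations at the same time, one common approach on NVIDIA hardware is to give each process its own device through the CUDA_VISIBLE_DEVICES environment variable. The sketch below assumes this is sufficient for scripts started with --cuda, which has not been verified with PySPH here; the helper names and script names are hypothetical.

```python
import os
import subprocess
import sys


def env_for_gpu(gpu_index: int, base_env=None) -> dict:
    """Copy the environment and restrict the process to a single CUDA device."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def launch(scripts):
    """Start one simulation per GPU and wait for all of them to finish."""
    procs = [
        subprocess.Popen(
            [sys.executable, script, "--cuda", "--use-double"],
            env=env_for_gpu(index),
        )
        for index, script in enumerate(scripts)
    ]
    # Return the exit code of each simulation, in launch order.
    return [proc.wait() for proc in procs]
```

A call such as `launch(["case_a.py", "case_b.py"])` would then start two independent runs on devices 0 and 1. With the OpenCL backend, PyOpenCL's PYOPENCL_CTX variable plays a similar role in selecting a device, again as an assumption to test on your own system.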

I saw above that double precision slows the gaming GPUs down by a factor of about 1.7, so I'm aiming for a system like this:

Intel Xeon 8 core processor
Minimum of 32 GB RAM
3 TB HD
1 or 2 NVIDIA GPUs with double precision and OpenCL + CUDA support

If there are any other suggestions for my setup please share them with me.

Thanks in advance,