Running Multiple Chrono Simulations in Parallel

Jared Long-Fox

May 29, 2024, 7:16:12 PM

to ProjectChrono

Hello all! I am working on a project in which I need to calibrate a model to experimental data. To do this, I need to run a parameter estimation algorithm such as Monte Carlo or a grid search to sweep across parameter values to determine which model parameters give the best fit to the data.

However, such calibrations take hundreds to thousands of model runs. There are ways to reduce this, but I was originally planning on running multiple DEM models in parallel instead of in serial to speed the process up. Is it possible to run more than one Chrono model at once on a single machine such as my desktop workstation (assuming sufficient hardware, e.g., 2x NVIDIA 3090 GPUs)? If so, how much performance is sacrificed (if any)?

Thank you all for the time, consideration, and insights on this!

All the best,

Jared Long-Fox

Ruochun Zhang

May 31, 2024, 5:22:54 PM

to Jared Long-Fox, ProjectChrono

Hi,

Since you mentioned the GPUs on your desktop, I assume this is about the GPU-based Chrono DEM tools. You should be able to simply run a couple of instances in parallel, but truthfully, we have never tested the throughput impact.

My guess would be that on consumer GPUs, running 2 or 3 instances together may slightly improve the throughput. On data center GPUs, running a small number of instances may moderately improve the throughput. Again, this is a guess, and I am more than interested in knowing your findings should you test it out.

Thank you,

Ruochun


Ruochun Zhang

Jun 4, 2024, 7:24:53 AM

to Jared Long-Fox, ProjectChrono

Hi Jared,

I am not an expert on that. The hard part is still that this is a throughput question: although it is more straightforward to run multiple instances on a CPU (like launching a series of jobs on a server CPU, each using one core and a given amount of RAM), is that better than running the jobs in serial, each with many cores?

I think it's quite hardware-dependent. I guess the former is more scalable on server CPUs, assuming you don't have that many reads and writes, and could have better throughput than the latter. You just have to deal with the fact that you'll wait a long time before you get each batch of results, considering terramechanics problems can be large. This is worth testing out too.

Thank you,

Ruochun
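
One way to test the throughput question raised above is to time the same batch of jobs under both strategies and compare jobs completed per second. The following is only a minimal sketch, not something from the thread: `measure_throughput` and `compare_strategies` are hypothetical helper names, and each `run_job` callable is assumed to launch one simulation as an external process (for example through `subprocess.run`) with the core count you want to test.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_throughput(run_job, jobs, max_concurrent):
    """Run all jobs with at most max_concurrent in flight; return jobs per second.

    run_job is assumed to block until one simulation finishes. Threads are
    used only to wait on external processes, so the simulations themselves
    run outside the Python interpreter.
    """
    jobs = list(jobs)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        list(pool.map(run_job, jobs))  # consume results so every job completes
    elapsed = time.perf_counter() - start
    return len(jobs) / elapsed


def compare_strategies(run_single_core, run_many_cores, jobs, num_cores):
    """Compare many concurrent single-core jobs against serial many-core jobs."""
    jobs = list(jobs)
    return {
        # Strategy 1: one job per core, num_cores jobs at a time.
        "concurrent_single_core": measure_throughput(run_single_core, jobs, num_cores),
        # Strategy 2: one job at a time, each allowed to use many cores.
        "serial_many_cores": measure_throughput(run_many_cores, jobs, 1),
    }
```

Use a batch that is representative of the real problem size, since the result is expected to depend on the hardware and on how much each run reads and writes.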


Jared Long-Fox

Jun 4, 2024, 1:56:23 PM

to Ruochun Zhang, ProjectChrono

Hi Ruochun,

Thank you very much for the speedy and informative response. If/when I end up trying this, I will post any findings here. Apologies for another question, but has there been any parallelization testing on CPUs? If so, what were the results? Thanks in advance!

All the best,

Jared


Radu Serban

Jun 14, 2024, 10:55:43 AM

to ProjectChrono

Jared,

A couple of quick observations.

What you are asking for is not directly related to Chrono. This type of parallelization is what’s called HTC (high throughput computing) as opposed to HPC (high performance computing). The way you set up the multiple, independent parallel runs (multi-processing) to cover the “design space” is first and foremost dependent on the hardware you have available. The actual simulation software you use for these parallel runs (Chrono in this case) comes into play only insofar as you need to understand the resource requirements (cores, GPUs, memory, etc.) of each instance and how to best distribute the runs over the existing hardware so that you minimize contention for the same resource.

For Chrono-based simulations, the answer to the question above will depend on which Chrono module(s) you are using and the particular type of problem you are solving. We have done this in the past, with both CPU- and GPU-based Chrono modules, typically by running this on clusters (there’s only so much one can do with the hardware resources available on a desktop).
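
The multi-processing setup described above could be sketched as follows for a desktop with two GPUs. This is an illustration under stated assumptions, not a Chrono API: `build_grid` and `run_sweep` are hypothetical helpers, the parameter names and the `--name=value` argument format are made up, and `base_cmd` stands for whatever executable or script runs a single simulation. The only CUDA-specific piece is the standard `CUDA_VISIBLE_DEVICES` environment variable, which restricts each process to one GPU.

```python
import itertools
import os
import queue
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_grid(param_values):
    """Return one dict per combination of the given parameter values."""
    names = list(param_values)
    combos = itertools.product(*(param_values[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def run_sweep(grid, base_cmd, num_gpus=2, runs_per_gpu=1):
    """Run one independent process per parameter set, limiting runs per GPU."""
    # Each queue entry is a free "slot" on a GPU. A run takes a slot before
    # starting and returns it when done, so a GPU never hosts more than
    # runs_per_gpu simulations at the same time.
    slots = queue.Queue()
    for gpu_id in range(num_gpus):
        for _ in range(runs_per_gpu):
            slots.put(gpu_id)

    def run_one(params):
        gpu_id = slots.get()
        try:
            # Make only the assigned GPU visible to the child process.
            env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
            # Hypothetical command-line convention: --name=value per parameter.
            args = [f"--{name}={value}" for name, value in params.items()]
            proc = subprocess.run(
                base_cmd + args, env=env, capture_output=True, text=True
            )
            return params, gpu_id, proc.returncode, proc.stdout
        finally:
            slots.put(gpu_id)

    # Threads only wait on the child processes; the simulations run outside Python.
    with ThreadPoolExecutor(max_workers=num_gpus * runs_per_gpu) as pool:
        return list(pool.map(run_one, grid))
```

For example, `run_sweep(build_grid({"friction": [0.3, 0.5]}), ["./my_dem_sim"])` would launch two runs, one per GPU, where `./my_dem_sim` is a placeholder for your own simulation executable. Raising `runs_per_gpu` to 2 or 3 is one way to try several instances per GPU and measure the effect on throughput.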