Pointers for installing and running cantera in parallel


Abhinav Verma

May 4, 2016, 3:32:23 PM
to Cantera Users' Group
Dear Users and Developers, 
 
 I am relatively new to Cantera and I really appreciate the effort behind this fantastic code. I had no problems (well, a few, but manageable) installing and running Cantera with all its dependencies. What I would like to know is whether there is any resource that can help me run the code in parallel, or whether it is inherently serial. I have come across posts in this forum and elsewhere that lead me to believe that the compute-intensive parts of Cantera can be run using MPI, so I would appreciate some directions on that.

1. Do I need to install all the dependent libraries (Sundials, Boost) with MPI as well?
2. How do I submit such a multi-processor job with a job scheduler? Do I use mpirun?
3. I am used to writing Python scripts to run Cantera. Would I need to use the C interface instead?

Hoping to get some pointers...
Best, 
Abhi

Ray Speth

May 4, 2016, 5:30:50 PM
to Cantera Users' Group
Abhi,

Cantera itself is essentially serial, but it can be used in parallel in one of two ways. The first is to run many separate Cantera simulations (e.g. reactor network integrations) in parallel. This is frequently done in CFD applications where each grid cell is treated as a homogeneous reactor, or when you have a large parameter space of simulations. In this case, the method of implementing the parallelism is up to you -- you could use MPI or any number of different Python libraries. The other case is for moderately large reaction mechanisms, where most of the computational time is taken up by factorizing and solving linear systems; in that case you can link Cantera to a parallel BLAS/LAPACK implementation such as the Intel MKL.
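For the first route, the orchestration can be as simple as Python's built-in multiprocessing module. Here is a minimal sketch; `simulate` is a hypothetical stand-in for a function that builds and integrates one Cantera reactor network, since the real per-job work depends on your problem:

```python
import math
from multiprocessing import Pool

def simulate(temperature):
    """Hypothetical stand-in for one independent Cantera simulation.

    A real version would build a Solution, wrap it in a reactor and a
    ReactorNet, advance it, and return a quantity of interest such as
    an ignition delay.
    """
    # Placeholder "result" with Arrhenius-like temperature dependence.
    return math.exp(15000.0 / temperature)

if __name__ == "__main__":
    # Each initial temperature is an independent job; Pool.map farms
    # them out across worker processes with no shared state.
    temps = [1000.0, 1100.0, 1200.0, 1300.0]
    with Pool(processes=4) as pool:
        results = pool.map(simulate, temps)
    for T, r in zip(temps, results):
        print(T, r)
```

Because each worker process builds its own objects, nothing is shared between jobs and they scale independently of one another.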

Regards,
Ray

Abhinav Verma

May 4, 2016, 10:41:21 PM
to Cantera Users' Group
Thanks Ray, 
 That is good news. In my case I am more interested in the second option, dealing with moderately large reaction mechanisms. In that case (if I understand correctly), I just need to link against the parallel MKL libraries. However, I am still a bit confused about how to submit such a Python script with a job scheduler; can you help me with that?
thanks and regards, 
Abhi

Kyle B

May 4, 2016, 10:50:14 PM
to Cantera Users' Group
Abhi:

I had to deal with this recently.  You should check out this thread, as it will likely answer the majority of your questions.

Kyle

Bryan W. Weber

May 5, 2016, 8:38:02 AM
to Cantera Users' Group
Abhi,

You didn't mention what OS you're using. If you're really limited by the large mechanism size, then running multiple reactors in parallel won't help too much, and you're right, you should link to MKL. The posting that Kyle linked to is mostly a discussion of how to enable multiple reactors in parallel, rather than running 1 reactor on multiple threads. If you're on Linux, you can use the instructions on my website to build and install the stable version of Cantera, 2.2.1: http://bryanwweber.com/writing/personal/2014/01/08/installing-cantera-on-ubuntu-12.04.3-from-scratch-source-with-Intel-compilers/ or the developer's version: http://bryanwweber.com/writing/personal/2016/01/01/how-to-install-cantera-on-ubuntu-updated/

If you're on Windows, unfortunately, I don't have any experience with that.

Regards,
Bryan

Abhinav Verma

May 5, 2016, 11:35:57 PM
to Cantera Users' Group
Hi Bryan, 
 thanks a lot. Sorry for leaving out that important detail: I am on Linux. I have my own installed versions, but I will follow your instructions for a clean install (making sure that I am using MKL properly and not mixing things up). And you are right, I did check the thread (thanks to Kyle), but it is not what I am concerned with at the moment. I do run multiple reactors, but those I can submit separately using my good old scripting skills :).

 Could you share a script that will help me run a large mechanism in parallel (a single reactor) with a Cantera build compiled against MKL? And could you also explain how to launch such a simulation (using a job scheduler such as qsub, after allocating several processors)?

Thanks, 
Abhi

Alex Fridlyand

May 23, 2016, 1:25:43 PM
to Cantera Users' Group
Hi Bryan,

I just noticed this discussion, and your comment regarding large mechanisms is of interest to me. I work with large models frequently, ones where a single integration takes 10-20 minutes using Cantera on Windows. Running batch jobs with multiprocessing (on 4- and 12-core CPUs) hardly provides any improvement (~10-20%). Even doing it in the crudest way possible, launching two identical Python scripts from separate directories, I do not see much improvement over just running them serially. In other words, if I run one simulation it takes 10 minutes; if I run two in parallel, it takes ~18 minutes. I find this behavior strange, but maybe I'm missing something about how Cantera works. I have done the same with Chemkin before and got significant performance improvements.

Why exactly won't batch jobs with multiprocessing improve performance with large mechanisms?

Thanks in advance,

Alex

Bryan W. Weber

May 31, 2016, 10:00:19 AM
to Cantera Users' Group
Dear Alex,

If you are saturating the number of threads with one process already, adding more processes won't make everything go faster. By default, MKL will use all of the threads on the machine. You can control this by setting the environment variable MKL_NUM_THREADS. You might see a speed improvement by using, say, half the threads on each of two processes, but you'll have to test to be sure.
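For example, to pin a process to two MKL threads (the value 2 here is only an illustration), you can set the variable from Python before any MKL-linked library is loaded:

```python
import os

# MKL reads MKL_NUM_THREADS when the library is first loaded, so set it
# at the very top of the script, before importing cantera or numpy in
# this process.
os.environ["MKL_NUM_THREADS"] = "2"
```

Equivalently, you can export the variable in your shell or job script before launching Python, which amounts to the same thing.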

Regards,
Bryan

Alex Fridlyand

May 31, 2016, 12:16:23 PM
to Cantera Users' Group
Hi Bryan,

Thanks for the response. Just to clarify, I did not compile Cantera with Intel MKL; I just installed the 2.2 binary. So even without the optimized Intel library, is Cantera saturating the number of threads? I am trying to run Cantera in an embarrassingly parallel way (parametric studies) and am not seeing a performance improvement between running the jobs serially or in parallel on a 12-core Windows 7 64-bit workstation with 32 GB of RAM.

>If you're really limited by the large mechanism size, then running multiple reactors in parallel won't help too much

That is the portion of your previous response that caught my eye. If I am interpreting your comment correctly, is it not possible to get a performance improvement with batch simulations using large mechanisms? Or at least, not without compiling against Intel MKL?

Thanks again,

Alex

Bryan W. Weber

May 31, 2016, 2:06:02 PM
to Cantera Users' Group
Hi Alex,

Well, that's very odd then. Without MKL (and on Windows, Sundials can't be compiled with MKL unless you buy the Intel Fortran compiler), Cantera should only use one thread per process. When you run with multiprocessing, can you confirm in the Task Manager that multiple Python processes are actually spawned? And can you show the code that you're using (and the CTI file, if it's not one that ships with Cantera)?

To be honest, I can't remember what I meant by that comment :-) In hindsight, probably the best way to interpret it is to take it out of context and assume it refers to saturating the threads, as I mentioned above. For embarrassingly parallel single-threaded simulations, you should see a speed-up factor roughly equal to the number of processes you run in parallel. Another way to interpret it is that you might get *more* speed-up by running a few multi-threaded simulations than by running many single-threaded ones.

Best,
Bryan

Alex Fridlyand

May 31, 2016, 4:17:31 PM
to canter...@googlegroups.com
Thanks Bryan,

I'll start a separate thread for my problem, then, and put together some dummy code that reproduces it.

Alex

--
You received this message because you are subscribed to a topic in the Google Groups "Cantera Users' Group" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/cantera-users/q_eUU6r0j_M/unsubscribe.
To unsubscribe from this group and all its topics, send an email to cantera-user...@googlegroups.com.
To post to this group, send email to canter...@googlegroups.com.
Visit this group at https://groups.google.com/group/cantera-users.
For more options, visit https://groups.google.com/d/optout.




Ray Speth

Jun 2, 2016, 5:26:56 PM
to Cantera Users' Group
Alex,

I spent a bit of time testing this and was surprised by the results. With large mechanisms (I ran the LLNL gasoline surrogate with 1389 species), where the computation time is completely dominated by the LU factorization of the Jacobian, the computation time does indeed scale badly across independent processes, depending on the BLAS/LAPACK implementation used. Using the LU factor/solve implementation included with Sundials (which is what you will be using if you use the Windows binaries for Cantera), I get the following computation times on a quad-core machine:

1 process: 12:22
2 processes: 23:45
3 processes: 34:25

My suspicion is that the straightforward implementation of the LU factorization ends up being constrained by memory bandwidth rather than raw processing speed. What's interesting is that for optimized BLAS/LAPACK implementations, which take into account the sizes of the various levels of processor cache, the performance scales much better across multiple processes. Using ATLAS:

1 process: 4:06
3 processes: 5:50

Using MKL:

1 process: 2:10
3 processes: 2:41

I think the takeaway here is that if you want good performance, you really need to use an optimized BLAS/LAPACK library, both for the improvement in single-core performance, as well as for better scaling to multiple processors.
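One rough way to check what is available on a machine from the Python side is to ask NumPy which BLAS/LAPACK it was built against. Note that this reflects NumPy's own linkage, not necessarily what Cantera's solvers use, so treat it only as a hint:

```python
import numpy as np

# Prints the BLAS/LAPACK configuration NumPy was compiled against.
# If MKL, ATLAS, or OpenBLAS appears in the output, an optimized
# implementation is at least installed on this machine.
np.show_config()
```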

Regards,
Ray

Alex Fridlyand

Jun 3, 2016, 11:01:00 AM
to Cantera Users' Group
Thanks Ray,

I'm happy to see that my observations have been confirmed and I'm not crazy. I'll have to try Cantera with one of the optimized BLAS/LAPACK libraries. 

Alex