adda_mpi on Ubuntu


Kevin Aptowicz

Apr 18, 2024, 1:34:35 AM
to ADDA questions and answers
We have successfully used ADDA in sequential mode and now want to use it in parallel mode.

The simulations are being run on Ubuntu 22.04.4 LTS.  The computer is a virtual workstation with 56 CPUs.

However, when we run ... 

mpiexec -n 4 ./adda_mpi

it appears to run four instances of the same code, creating four run folders:

run096_sphere_g16_m1.5
run097_sphere_g16_m1.5
run098_sphere_g16_m1.5
run099_sphere_g16_m1.5


The log file in each folder states it is running on a single core:

Generated by ADDA v.1.5.0-alpha
The program was run on: 1 processors (cores)
command: './adda_mpi '
lambda: 6.283185307
shape: sphere; diameter: 6.734551818
box dimensions: 16x16x16
refractive index: 1.5+0i
Dipole size: 0.418879 (cubical)
Dipoles/lambda: 15
(Volume correction used)
Required relative residual norm: 1e-05
Total number of occupied dipoles: 2176
Volume-equivalent size parameter: 3.367275909

The mueller files in each of the folders appear to be the same; the only difference between the log files is the timing.

Based on Section 6.6 of the manual, something seems amiss: the particle is not being partitioned over different processors. Perhaps we are misunderstanding the use of adda_mpi. Does the MPI functionality work for a single particle in a fixed orientation? We are not sure whether our understanding is incorrect or there is an issue with the code (perhaps on the MPI side).

Any thoughts would be greatly appreciated.

Thanks - Kevin
West Chester University

Maxim Yurkin

Apr 18, 2024, 2:56:17 AM
to adda-d...@googlegroups.com
Dear Kevin,

You correctly identified the problem from the second line of the output: the MPI implementation executes 4 instances of ADDA independently, instead of linking them together (a proper parallel run would report all 4 cores on that line). And, yes, adda_mpi can definitely handle large particles in a fixed orientation, slicing the computational domain across several processors - it was explicitly designed for this.

So, most probably, the problem can be solved by changing the execution line (or some additional setting of the MPI implementation). Some hints can be found at https://github.com/adda-team/adda/wiki/InstallingMPI , but they are necessarily vague, since the details of MPI implementations and their installation differ between systems. Potentially, you may even need to use a queue system (some examples are mentioned in Section 3.2 "Parallel mode" of the manual). However, this is typically needed only for a distributed-memory cluster, while your virtual workstation seems to have shared memory.
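For illustration only, in case a queue system does turn out to be involved, a submission script could look roughly like the following (SLURM is assumed here purely as an example; the actual directives depend on your resource manager):

#!/bin/bash
#SBATCH --ntasks=4        # request 4 MPI processes (hypothetical values)
#SBATCH --time=00:10:00
mpiexec ./adda_mpi        # the launcher picks up the allocation from SLURM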

It is easier with a centrally managed cluster (or supercomputer), since then you usually have an engineer to consult with, or at least test examples of running some sample MPI program; replacing that program's executable with adda_mpi in the execution line should then work. In your case, it seems that you manage the workstation yourself. But even then, you have somehow installed an MPI implementation - it should provide some trivial run example, which you can probably reuse.
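As a quick sanity check of the launcher itself, independent of ADDA, something like the following should print one line per process (using the standard 'hostname' utility; on a shared-memory workstation all four lines will show the same name):

mpiexec -n 4 hostname

If this produces a single line or an error, the launcher itself is misconfigured, and adda_mpi has no chance of running in parallel.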

If aiming for a quick fix, I can offer a few wild guesses from my experience of running adda_mpi on various systems:
- try 'mpirun -np 4' instead of 'mpiexec -n 4' (although the latter is the standard one)
- you may need to manually specify '-hostfile ...' with a list of nodes on which the run should be executed (see, e.g., https://docs.open-mpi.org/en/v5.0.x/man-openmpi/man1/mpirun.1.html#label-schizo-ompi-hostfile , and the sketch after this list). However, again, this is probably relevant only for a distributed-memory cluster and/or some queue system (resource manager).
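To make the hostfile hint concrete, a minimal sketch (the file name is arbitrary and the syntax follows Open MPI conventions; on a single shared-memory machine this should normally not be needed):

echo "localhost slots=4" > myhosts   # hypothetical one-line hostfile
mpirun -np 4 -hostfile myhosts ./adda_mpi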

Another idea is to run 'adda_mpi -V' and compare the MPI implementation used during ADDA compilation to the one used for running. Sometimes a mismatch here can cause problems, but this is probably relevant only for large clusters where several MPI implementations are installed (as modules).
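Concretely, the comparison could look something like this (the exact output format differs between MPI implementations):

./adda_mpi -V         # shows, among other things, the MPI used at compilation
mpiexec --version     # the launcher actually found in $PATH
which mpiexec mpicc   # where these commands resolve from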

If all else fails, please provide more details about the MPI implementation, test MPI runs (with other programs), etc. Conversely, if you manage to run adda_mpi properly, please also share your success, since it will benefit other ADDA users.

Maxim.


Kevin Aptowicz

Apr 25, 2024, 1:08:06 PM
to ADDA questions and answers

Thanks, Maxim, for your advice. We figured it out. We had three different MPI packages installed on the workstation: the default Ubuntu package, one that comes with the Intel oneAPI Fortran compiler, and one that came with the Anaconda installation. We determined this by searching the system for files named mpiexec. Using the 'which' command in Linux, we could see that the Anaconda version was the one being called when running adda_mpi. Once we added the default version's folder (/usr/bin) to the top of the $PATH variable, it worked as expected. It did not work with either the Anaconda or the Intel version of mpiexec. I'm guessing that when we compiled ADDA, it used the default version, and thus we needed to use the same version when running the code.
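In short, the diagnosis and fix looked roughly like this (paths are from our system and may differ on yours):

which mpiexec                 # initially resolved to the Anaconda copy
export PATH=/usr/bin:$PATH    # put the default Ubuntu package first
which mpiexec                 # now /usr/bin/mpiexec
mpiexec -n 4 ./adda_mpi       # runs as a single 4-process job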

Maxim Yurkin

Apr 25, 2024, 4:45:24 PM
to adda-d...@googlegroups.com
Great, thanks for your update. Indeed, a mismatch between the MPI packages (implementations) used at compilation and at runtime is a common issue. When a module system is used, it is more or less predictable (at least, you can easily switch the complete environment from one package to another). Otherwise, when several packages are available simultaneously (at different paths), the packages used at compile time and at runtime are determined by various environment variables (and wrappers).

At runtime, the choice is mostly determined by $PATH (i.e., which `mpiexec` is used), as you explained. At compile time, by contrast, the ADDA makefile (specifically `mpi/Makefile`) uses $MPICC or the `mpicc` wrapper, if one is available (wrappers are not necessarily provided by all packages). Thus, `mpiexec` will be taken from the package at the top of $PATH, while `mpicc` may be available only from another package (lower in $PATH). Setting $MPICC should make the ADDA compilation use any MPI package you want (in case you want to continue testing). However, there is usually no significant difference in performance between MPI packages, as long as ADDA compiles and runs properly.
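For example, pointing the compilation at a particular package could look roughly like this (the wrapper path is hypothetical, and I assume the usual `make mpi` invocation from ADDA's src directory):

export MPICC=/usr/bin/mpicc   # wrapper of the desired MPI package
make mpi                      # rebuild adda_mpi against that package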

Maxim.