parallel runs of basilisk


felixh...@hispeed.ch

Nov 4, 2015, 7:48:07 AM
to basilisk-fr
Hi all

I noticed in my first runs of the Basilisk examples that they use all the processors of the machine I run them on (compiled with -fopenmp). How can I run them on my small cluster, which has 2 machines with 4 cores each, organized as a Beowulf cluster? Gerris works fine with the mpirun -n 8 -hostfile xxx command.

cheers felix

Stephane Popinet

Nov 4, 2015, 8:21:07 AM
to basil...@googlegroups.com
Hi Felix,

If you use makefiles, you can try:

% CC='mpicc -D_MPI=8' make mycode.tst

to run with 8 MPI processes.

Otherwise you can use something like:

% CC99='mpicc -std=c99' qcc -Wall -O2 -D_MPI=1 mycode.c -o mycode -lm
% mpirun -np 8 ./mycode
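To spread the 8 processes across both machines of a Beowulf cluster, mpirun also accepts a hostfile. A minimal sketch, assuming Open MPI (the hostnames node1/node2 are placeholders for your own machines):

```
# hosts.txt -- one line per machine, 4 slots (cores) each
node1 slots=4
node2 slots=4
```

and then something like: % mpirun -np 8 -hostfile hosts.txt ./mycode. The exact hostfile syntax depends on your MPI implementation (the slots= form is Open MPI's).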

cheers

Stephane

felixh...@hispeed.ch

Nov 5, 2015, 8:33:27 AM
to basilisk-fr
Thank you very much, Stephane!

In the meantime I did some different tests and found that it works for the tutorial example bump.c, i.e. the Saint-Venant equations, but not for karman or sphere. Those produce a lot of segmentation faults and "address not mapped" errors on my systems. Is it a question of the solver used?
And by the way, what does -D_MPI=x really do? I looked through the docs but did not find anything about it. Does it do some kind of domain decomposition (as Gerris does) or something else?

Thanks a lot
cheers felix

Stephane Popinet

Nov 5, 2015, 10:06:10 AM
to basil...@googlegroups.com
> In the meantime I did some different tests and found that it works for
> the tutorial example bump.c, i.e. the Saint-Venant equations, but not
> for karman or sphere. Those produce a lot of segmentation faults and
> "address not mapped" errors on my systems. Is it a question of the
> solver used?

Please see:

http://basilisk.fr/src/COMPATIBILITY

and you will see that 'solids' do not work together with MPI yet.

> And by the way, what does -D_MPI=x really do? I looked through the
> docs but did not find anything about it. Does it do some kind of domain
> decomposition (as Gerris does) or something else?

Yes, it is domain decomposition, but much more flexible than in Gerris.

cheers

Stephane

Antoon van Hooft

Mar 15, 2016, 4:15:22 AM
to basilisk-fr
Hello, 

I am currently using the compiler flag '-fopenmp' for my multithreaded simulations. If I want to change the number of threads (to 5, for example) I type (as shell commands):
% export OMP_DYNAMIC=FALSE
% export OMP_NUM_THREADS=5

This seems to work fine. In the task manager (top) I see that my executable uses 500% CPU and my code runs much faster compared to a single thread.

However, when I try the suggested method I get an error message:

antoon@waal:~/basilisk/src/examples> CC99='mpicc -std=c99' qcc -Wall -O2 -D_MPI=1 buildingGABLS.c -o dip3 -lm
antoon@waal:~/basilisk/src/examples> mpirun -np 5 ./dip3
[waal:20727] *** Process received signal ***
[waal:20727] Signal: Segmentation fault (11)
[waal:20727] Signal code: Address not mapped (1)
[waal:20727] Failing at address: 0x68
[waal:20727] [ 0] /lib64/libpthread.so.0(+0xf1f0) [0x7f2ff5e401f0]
[waal:20727] [ 1] ./dip3() [0x40f8ce]
[waal:20727] [ 2] ./dip3() [0x411a1a]
[waal:20727] [ 3] ./dip3() [0x419c02]
[waal:20727] [ 4] ./dip3() [0x4172f9]
[waal:20727] [ 5] ./dip3() [0x42ee4d]
[waal:20727] [ 6] ./dip3() [0x401e37]
[waal:20727] [ 7] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2ff5aa5a15]
[waal:20727] [ 8] ./dip3() [0x401e71]
[waal:20727] *** End of error message ***
[waal:20725] *** Process received signal ***
[waal:20725] Signal: Segmentation fault (11)
[waal:20725] Signal code: Address not mapped (1)
[waal:20725] Failing at address: 0x20
[waal:20725] [ 0] /lib64/libpthread.so.0(+0xf1f0) [0x7f99288761f0]
[waal:20725] [ 1] ./dip3() [0x40f8b1]
[waal:20725] [ 2] ./dip3() [0x411a1a]
[waal:20725] [ 3] ./dip3() [0x419c02]
[waal:20725] [ 4] ./dip3() [0x4172f9]
[waal:20725] [ 5] ./dip3() [0x42ee4d]
[waal:20725] [ 6] ./dip3() [0x401e37]
[waal:20725] [ 7] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f99284dba15]
[waal:20725] [ 8] ./dip3() [0x401e71]
[waal:20725] *** End of error message ***
[waal:20726] *** Process received signal ***
[waal:20726] Signal: Segmentation fault (11)
[waal:20726] Signal code: Address not mapped (1)
[waal:20726] Failing at address: (nil)
[waal:20726] [ 0] /lib64/libpthread.so.0(+0xf1f0) [0x7f7837cba1f0]
[waal:20726] [ 1] ./dip3() [0x40f943]
[waal:20726] [ 2] ./dip3() [0x411a1a]
[waal:20726] [ 3] ./dip3() [0x419c02]
[waal:20726] [ 4] ./dip3() [0x4172f9]
[waal:20726] [ 5] ./dip3() [0x42ee4d]
[waal:20726] [ 6] ./dip3() [0x401e37]
[waal:20726] [ 7] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f783791fa15]
[waal:20726] [ 8] ./dip3() [0x401e71]
[waal:20726] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 4 with PID 20727 on node waal exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------



I also noticed that before the crash (before the first timestep) the executable was listed 5 times in the task manager (at 100% each). What am I doing wrong?

Stephane Popinet

Mar 15, 2016, 5:04:40 AM
to basil...@googlegroups.com
Hi Antoon,

> However, when I try the suggested method I get an error message:

What do you mean by "the suggested method"? The first part of your
message seems to refer to OpenMP parallelism, whereas the second part
uses MPI parallelism. These are two very different methods.

You should check in particular that MPI parallelisation is compatible
with the other features of Basilisk you are using (masking, periodic
boundaries, etc.). See:

http://basilisk.fr/Features

cheers,

Stephane

Antoon van Hooft

Mar 15, 2016, 6:15:56 AM
to basilisk-fr, pop...@basilisk.fr
Hello Stephane,

Sorry for being unclear. I meant the method you mentioned in your first reply (MPI). However, you are absolutely right: I should add the "Features" link to my favorites, since I was indeed trying to do something incompatible again.

felixh...@hispeed.ch

Mar 15, 2016, 10:03:41 AM
to basilisk-fr
Hello Antoon

As far as I know, OpenMP is good if you have some cores which all share the same memory. In such a configuration you may use MPI as well; it depends on the problem which one is faster. If you have a cluster of different machines (which may each have multiple cores) you are bound to use MPI. I did some experiments with -D_MPI=1 and -D_MPI=n (n = number of cores in the system) and found a slight speed advantage for the latter.

cheers felix