Cannot compile in parallel after update to Ubuntu 22.04.3

alienor...@gmail.com

Dec 7, 2023, 11:13:52 AM
to basilisk-fr
Dear Basilisk Users,

I recently updated my Ubuntu to 22.04.3. Now, I can't compile in parallel anymore.

Here is what I have:

1. I reinstalled Basilisk (I ended up removing the whole folder and redoing every step from the beginning). The code compiles but I get the same warnings as here:

"./qcc -std=c99 -D_XOPEN_SOURCE=700 -O2 -g -Wall -pipe -D_FORTIFY_SOURCE=2 -autolink bview.c -o bview2D -lfb_tiny -lm
.qcc2CeENS/bview.c:1:7: warning: line number out of range
/home/riviere/Documents/basilisk/src/.qcc2CeENS//:1:7: warning: line number out of range
<built-in>: warning: line number out of range
/usr/include/stdc-predef.h:1:7: warning: line number out of range
./qcc -std=c99 -D_XOPEN_SOURCE=700 -O2 -g -Wall -pipe -D_FORTIFY_SOURCE=2 -autolink -grid=octree bview.c -o bview3D -lfb_tiny -lm
.qccIQjRZk/bview.c:1:7: warning: line number out of range
/home/riviere/Documents/basilisk/src/.qccIQjRZk//:1:7: warning: line number out of range
<built-in>: warning: line number out of range
/usr/include/stdc-predef.h:1:7: warning: line number out of range"

2. Compilation in serial works but with a few warnings (the same as above).
 
For instance, compiling the first version of the tutorial (namely, just include St Venant and then run) with:
make bump.tst
gives:
.qccWEblOB/bump.c:1:7: warning: line number out of range
bump-cpp.c: warning: line number out of range
<built-in>: warning: line number out of range
/usr/include/stdc-predef.h:1:7: warning: line number out of range

But the code seems to run fine.

3. Finally, if I try to compile the same code in parallel (either using the standard makefile: CC='mpicc -D_MPI=4' make bump.tst or by hand), it does not work.

Here is a subset of the warnings and errors I get:
"/home/riviere/Documents/basilisk/src/grid/tree-mpi.h: In function ‘mpi_boundary_refine’:
/home/riviere/Documents/basilisk/src/grid/tree-mpi.h:1071:5: warning: ‘MPI_Waitall’ accessing 1 byte in a region of size 0 [-Wstringop-overflow=]
/home/riviere/Documents/basilisk/src/grid/tree-mpi.h:1071:5: note: referencing argument 3 of type ‘struct MPI_Status *’
In file included from /usr/include/x86_64-linux-gnu/mpich/mpi.h:977,
                 from /home/riviere/Documents/basilisk/src/common.h:21:
/usr/include/x86_64-linux-gnu/mpich/mpi_proto.h:592:5: note: in a call to function ‘MPI_Waitall’
/home/riviere/Documents/basilisk/src/grid/tree-mpi.h: In function ‘mpi_boundary_refine’:
/home/riviere/Documents/basilisk/src/grid/tree-mpi.h:1071:5: warning: ‘MPI_Waitall’ accessing 1 byte in a region of size 0 [-Wstringop-overflow=]
/home/riviere/Documents/basilisk/src/grid/tree-mpi.h:1071:5: note: referencing argument 3 of type ‘struct MPI_Status *’
/usr/include/x86_64-linux-gnu/mpich/mpi_proto.h:592:5: note: in a call to function ‘MPI_Waitall’
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libmpich.a(lib_libmpich_la-irecv.o): in function `MPIDI_OFI_do_irecv.constprop.0':
(.text+0x1803): undefined reference to `fi_strerror'
/usr/bin/ld: (.text+0x1878): undefined reference to `fi_strerror'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libmpich.a(lib_libmpich_la-irecv.o): in function `MPIDIG_do_irecv.constprop.0':
(.text+0x33e3): undefined reference to `fi_strerror'
/usr/bin/ld: (.text+0x3509): undefined reference to `fi_strerror'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libmpich.a(lib_libmpich_la-isend.o): in function `MPIDI_OFI_do_am_isend_rdma_read.constprop.0':
(.text+0x161d): undefined reference to `fi_strerror'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libmpich.a(lib_libmpich_la-isend.o):(.text+0x1add): more undefined references to `fi_strerror' follow"
etc....

I get the same result when trying to compile any of the Basilisk examples, so it's not related to St Venant.

Has any of you experienced the same problem and managed to solve it?

qcc --version: 11.4.0
mpicc --version: gcc 11.4.0
mpirun --version: 4.0

Thank you for your help,

Aliénor

Stephane Popinet

Dec 7, 2023, 11:28:14 AM
to basil...@googlegroups.com
Hi Alienor,

It looks like the problem comes from gcc-11 being pickier than
earlier versions, or from the C preprocessor working differently (I am
using gcc 10.2.1), so a workaround is to use an older version of gcc.
Older versions are easy to install; just do something like:

sudo apt install gcc-10

then make sure (using the PATH and/or CC environment variables) that you
are using gcc-10 and not gcc-11. You can probably also remove gcc-11
entirely using:

sudo apt remove gcc-11
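
As a minimal sketch of the "PATH and/or CC" step above: the snippet below prefers gcc-10 when it is installed and exports it through CC. The helper name pick_cc is hypothetical. MPICH_CC and OMPI_CC are the environment variables that the MPICH and Open MPI mpicc wrappers read to choose their underlying compiler; whether your particular build honours them is an assumption to check.

```shell
#!/bin/sh
# Hypothetical helper: prefer gcc-10 when it is installed, else fall back to gcc.
pick_cc() {
    if command -v gcc-10 >/dev/null 2>&1; then
        echo gcc-10
    else
        echo gcc
    fi
}

CC="$(pick_cc)"
export CC

# The mpicc wrappers choose their underlying compiler from these variables:
# MPICH reads MPICH_CC, Open MPI reads OMPI_CC.
export MPICH_CC="$CC"
export OMPI_CC="$CC"

echo "C compiler selected: $CC"
```

With MPICH, running mpicc -show afterwards prints the command line the wrapper would execute, which is a quick way to confirm that gcc-10 is actually picked up.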

The problem should still be fixed within Basilisk though.

Stephane

PS: note that this is a problem with Ubuntu, it is based on the
"testing" branch of Debian which, as its name indicates, is not
considered "stable" (unlike the "stable" branch of Debian which I am
using). This can cause this type of issue...

Wojciech (Voitek) Aniszewski

Dec 7, 2023, 5:03:33 PM
to Stephane Popinet, basil...@googlegroups.com
I would indeed be very interested to know whether that suggestion
(gcc-10) worked, because it looks like an ld issue in a call to
fi_strerror(), which is part of libfabric, required by their MPICH
installation.
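
A small sketch for this kind of diagnosis: it lists the distinct symbols that the linker reports as undefined, so one can see at a glance which library is missing. The function name undefined_symbols and the sample lines are illustrative, and the pattern assumes English ld messages (for instance from a build run with LC_ALL=C).

```shell
#!/bin/sh
# Print each distinct symbol named in "undefined reference to `sym'" lines.
# Assumes English linker messages (e.g. obtained by building with LC_ALL=C).
undefined_symbols() {
    sed -n "s/.*undefined reference to \`\([^']*\)'.*/\1/p" | sort -u
}

# Sample lines modelled on the errors in this thread (English wording assumed).
sample_log="(.text+0x1803): undefined reference to \`fi_strerror'
(.text+0x1878): undefined reference to \`fi_strerror'"

missing="$(printf '%s\n' "$sample_log" | undefined_symbols)"
echo "missing symbols: $missing"
```

In practice one would pipe the real build output (stderr included) into undefined_symbols. A result consisting only of fi_strerror points at libfabric rather than at the compiler version.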


Our supercomputer runs gcc 11.2.0 with mpicc supplied via MPICH
(/opt/cray/pe/mpich/8.1.20/ofi/gnu/9.1/bin/mpicc) and I haven't seen
this (but I never ran the test suite).

PS. Stephane, you're no longer running stable then, but oldstable.
Bookworm premiered in the summer, where gcc is version 12.2.0-3.

Testing and sid nowadays have 13.2 and rest assured Basilisk has zero
issues with that.

cheers
voitek

alienor...@gmail.com

Dec 13, 2023, 4:29:54 AM
to basilisk-fr
Dear Stéphane and Voitek,

Thank you for your help! Before getting your answers, with the help of Laurent Duchemin, we managed to make the code compile by completely reinstalling libmpich.
This, however, does not remove the warnings (points 1 and 2 of my first email).
I will try your solution to see if it works better.

Thanks a lot,

Aliénor

Aliénor Rivière

Dec 13, 2023, 2:15:32 PM
to basilisk-fr
Dear all, 

For the record, I tried using gcc-10 instead of gcc-11. This removes all the warnings. Now, everything compiles and runs like a charm.

Thanks for your help,

Aliénor


Bruno Deremble

Dec 13, 2023, 2:15:39 PM
to alienor...@gmail.com, basil...@googlegroups.com

I've learned to live (dangerously) with these warnings ;)

Andrés Castillo-Castellanos

Dec 13, 2023, 2:36:39 PM
to basilisk-fr
Hello,

Just a technical question. Any reason to use MPICH over OpenMPI?

Cheers,

Wojciech (Voitek) Aniszewski

Dec 13, 2023, 6:35:38 PM
to Andrés Castillo-Castellanos, basilisk-fr

A mixture of historical, technical and factual reasons.

MPICH is historically better supported on Cray/Blue Gene type machines, and many centres that deploy those prefer it and preinstall it, furthering its popularity. Certain features were supported only by MPICH (e.g. MPI_THREAD_MULTIPLE), and some MPI standards first got full support in MPICH, with OpenMPI catching up later. So, as often in history, many centres deployed MPICH as it seemed 'more mature'. Finally, remember that what is commonly known as a 'supercomputer' is a giga-hamburger composed of thousands of individual computers connected via (specific kinds of) network: MPICH covered more network protocols first, but then OpenMPI caught up as well.
 
check out the excellent coverage here:
 
Personally, I think it's a great idea to perform comparative calculations, also with Basilisk, with both implementations (recompilation needed).
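
Before comparing the two, it helps to know which implementation a given mpirun belongs to, especially on machines where both are installed. The following is only a sketch: the function name mpi_flavour is hypothetical, and it assumes that Open MPI's mpirun --version output contains "Open MPI" while MPICH's Hydra launcher mentions "HYDRA" (or "MPICH").

```shell
#!/bin/sh
# Classify an MPI launcher from the text printed by `mpirun --version`.
# Assumption: Open MPI prints "Open MPI"; MPICH's Hydra launcher prints "HYDRA".
mpi_flavour() {
    case "$1" in
        *"Open MPI"*) echo openmpi ;;
        *HYDRA*|*MPICH*) echo mpich ;;
        *) echo unknown ;;
    esac
}

# Query the local launcher only if one is installed.
if command -v mpirun >/dev/null 2>&1; then
    version_text="$(mpirun --version 2>&1)"
else
    version_text=""
fi

echo "detected MPI: $(mpi_flavour "$version_text")"
```

If mpicc and mpirun turn out to come from different implementations, the binary should be rebuilt with the wrapper that matches the launcher.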
 
cheers
v