On our HPC cluster, I built a Singularity image file for Ubuntu 19.04, with no IB drivers installed inside the container image.
After the following setup:
export SINGULARITY_BINDPATH=/opt/slurm,/etc/libibverbs.d,/usr/include/infiniband,/usr/include/rdma
export SINGULARITY_BINDPATH=$SINGULARITY_BINDPATH,$(echo /usr/bin/ib*_* | sed -e 's/ /,/g')
export SINGULARITY_CONTAINLIBS=$(echo /usr/lib64/libmlx*.so* /usr/lib64/libi40iw-rdmav2.so* /lib64/libib*.so* /usr/lib64/libnl.so* | xargs | sed -e 's/ /,/g')
I can build OpenMPI 3.1.3 successfully with IB and SLURM enabled inside the container.
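One caveat with the comma-joined lists above: if a glob matches nothing on a given node, the shell passes the pattern through literally, so an entry such as /usr/lib64/libmlx*.so* could end up in the variable. A minimal sketch of a helper that skips such entries follows; the function name join_existing is hypothetical, not part of Singularity or SLURM.

```shell
#!/bin/bash
# Hypothetical helper: join the paths that actually exist with commas.
# Unmatched globs arrive as literal patterns and are filtered out by -e.
# The body runs in a subshell so its variables do not leak to the caller.
join_existing() (
    out=""
    for p in "$@"; do
        [ -e "$p" ] || continue
        out="${out:+$out,}$p"
    done
    printf '%s\n' "$out"
)
```

It could then be used as, for example, export SINGULARITY_CONTAINLIBS=$(join_existing /usr/lib64/libmlx*.so* /usr/lib64/libi40iw-rdmav2.so* /lib64/libib*.so* /usr/lib64/libnl.so*), assuming the same library locations as on our nodes.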
For the OSU bandwidth test (pt2pt/osu_bw), I get similar IB bandwidth on the host and inside the Singularity container. In the output below, the first table is the host run and the second is the container run:
[wang@c10-01 mpi-singularity]$ cat run-benchmarks.sh
#!/bin/bash
module purge
export LD_LIBRARY_PATH=/opt/slurm/lib64
img=/beegfs/work/public/singularity/ubuntu-19.04.sif
export SINGULARITY_BINDPATH=/opt/slurm,/etc/libibverbs.d,/usr/include/infiniband,/usr/include/rdma
export SINGULARITY_BINDPATH=$SINGULARITY_BINDPATH,$(echo /usr/bin/ib*_* | sed -e 's/ /,/g')
export SINGULARITY_CONTAINLIBS=$(echo /usr/lib64/libmlx*.so* /usr/lib64/libi40iw-rdmav2.so* /lib64/libib*.so* /usr/lib64/libnl.so* | xargs | sed -e 's/ /,/g')
exe=pt2pt/osu_bw
srun --mpi=pmi2 \
/home/wang/mpi-singularity/host/osu-local/libexec/osu-micro-benchmarks/mpi/$exe
srun --mpi=pmi2 \
singularity exec $img \
/home/wang/mpi-singularity/singularity/osu-local/libexec/osu-micro-benchmarks/mpi/$exe
[wang@c10-01 mpi-singularity]$ sh run-benchmarks.sh
srun: error: spank: x11.so: Plugin file not found
# OSU MPI Bandwidth Test v5.6.1
# Size Bandwidth (MB/s)
1 3.79
2 7.55
4 15.09
8 30.09
16 59.88
32 117.54
64 235.17
128 410.05
256 792.83
512 1296.26
1024 2240.31
2048 3941.70
4096 5834.29
8192 7806.05
16384 10099.20
32768 11436.05
65536 11781.49
131072 11968.60
262144 12065.43
524288 12077.23
1048576 12133.04
2097152 12115.81
4194304 12114.35
srun: error: spank: x11.so: Plugin file not found
# OSU MPI Bandwidth Test v5.6.1
# Size Bandwidth (MB/s)
1 4.22
2 8.44
4 16.83
8 33.49
16 66.87
32 131.68
64 259.69
128 446.83
256 880.66
512 1449.26
1024 2675.45
2048 4752.75
4096 7268.75
8192 9895.56
16384 9653.57
32768 11418.04
65536 11785.02
131072 11969.93
262144 12064.00
524288 12114.16
1048576 12134.41
2097152 12116.23
4194304 12114.46
[wang@c10-01 mpi-singularity]$
Without SINGULARITY_CONTAINLIBS set, OpenMPI inside the container runs with much lower bandwidth (again host first, then container):
[wang@c10-01 mpi-singularity]$ sh run-benchmarks.sh
srun: error: spank: x11.so: Plugin file not found
# OSU MPI Bandwidth Test v5.6.1
# Size Bandwidth (MB/s)
1 3.83
2 7.60
4 15.16
8 30.11
16 60.26
32 118.11
64 235.40
128 411.47
256 788.84
512 1271.36
1024 2295.44
2048 3850.66
4096 5665.37
8192 7812.25
16384 10185.56
32768 11438.84
65536 11787.29
131072 11968.12
262144 12066.93
524288 12114.29
1048576 12128.00
2097152 12114.70
4194304 12113.99
srun: error: spank: x11.so: Plugin file not found
# OSU MPI Bandwidth Test v5.6.1
# Size Bandwidth (MB/s)
1 0.47
2 0.94
4 2.01
8 4.00
16 7.78
32 12.53
64 27.81
128 46.53
256 100.01
512 138.82
1024 391.96
2048 489.30
4096 628.20
8192 787.51
16384 937.60
32768 1078.95
65536 2351.52
131072 2926.45
262144 3178.66
524288 3411.78
1048576 3640.92
2097152 3908.76
4194304 3741.52
[wang@c10-01 mpi-singularity]$
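To put a number on the gap between two runs, the per-size ratio can be computed with awk. This is only a sketch: it assumes each srun output was redirected to its own file, and the function name bw_ratio and the file names host.txt and container.txt are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: per-message-size bandwidth ratio of two OSU outputs.
# $1 = reference output (e.g. host run), $2 = output to compare (e.g. container run).
bw_ratio() {
    awk '
        NF != 2 || $1 !~ /^[0-9]+$/ { next }   # skip headers, srun errors, prompts
        NR == FNR { ref[$1] = $2; next }       # first file: remember bandwidth per size
        ($1 in ref) && $2 > 0 { printf "%s %.2f\n", $1, ref[$1] / $2 }
    ' "$1" "$2"
}
```

Usage would be bw_ratio host.txt container.txt, which prints one line per message size with the reference bandwidth divided by the compared bandwidth.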
Best,
Shenglong