INLA on HPC server causes core dumps

David Dayi Li

Jul 7, 2021, 8:09:56 PM
to R-inla discussion group
Hi,

I am running LGCP simulations with inlabru and SPDE on an HPC server, but I keep getting a core dump for every simulation that runs. I contacted the support staff for the HPC server about this, and they said the problem comes from INLA. Below is what I got when I tried to inspect one of the core files:

$ gdb core core.224843

[New LWP 224843]
Core was generated by `grep ^export'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000564fdff492a8 in ?? ()

The support staff do not really know what is going on, and neither do I. The code I am running produces the results I expect, so the program itself does not seem to be crashing.

Although I can ignore these core dumps for now, they become a problem if I want to run tens of thousands of simulations: having them written to my directory on the remote server is likely to cause storage and access problems. Does anyone know what is happening here?

Thanks,
David
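
Until the cause is found, one stopgap for the storage concern is to stop core files from being written at all, by lowering the core-size limit in the job script before R starts. This is a minimal sketch, assuming a bash job script and a cluster that lets users lower the limit. The names cleanup_cores and simulate.R are hypothetical, and the core.<pid> naming is taken from the file name shown above (the actual pattern is set by the system).

```shell
#!/bin/bash
# Sketch of a job-script preamble; the names used here are illustrative.

# Set the soft core-file size limit to 0 so that crashing processes
# started from this shell write no core file.
ulimit -S -c 0

# Hypothetical helper: delete leftover core files named core.<pid> from
# one directory, without descending into subdirectories.
cleanup_cores() {
    find "${1:-.}" -maxdepth 1 -type f -name 'core.[0-9]*' -delete
}

# Rscript simulate.R     # hypothetical simulation script
# cleanup_cores "$PWD"   # remove any core files that were still written
```

The limit is inherited by child processes, so it also applies to helper programs started by the INLA wrapper script. It only hides the symptom: the segmentation fault itself still occurs.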

Helpdesk

Jul 8, 2021, 3:21:07 AM
to David Dayi Li, R-inla discussion group


Can you run

Sys.setenv(INLA_DEBUG=1)
inla(y~1,data=data.frame(y=0), verbose=TRUE,debug=TRUE)

and send me the output? Hopefully it will be clear from this what is going on.

--
Håvard Rue
he...@r-inla.org

David Dayi Li

Jul 8, 2021, 1:37:38 PM
to R-inla discussion group
Hi Håvard,

I tried your code and found the part where it reports a segmentation fault. The output is here:

Run inla...*** /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl.run *** LD_PRELOAD=/scratch/dli346/R_locals/INLA/bin/linux/64bit/first/libjemalloc.so.2
*** /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl.run *** LD_LIBRARY_PATH=/scratch/dli346/R_locals/INLA/bin/linux/64bit/first:/scratch/dli346/R_locals/INLA/bin/linux/64bit:/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.0.5/lib64/R/lib:/lib64:/usr/lib64:/lib:/usr/lib:/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.0.5/lib64/R/lib:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/lib:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/imkl/2020.1.217/lib:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/imkl/2020.1.217/lib:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/imkl/2020.1.217/mkl/lib/intel64:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/imkl/2020.1.217/lib/intel64:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/gcccore/9.3.0/lib64:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/gcccore/9.3.0/lib:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/imkl/2020.1.217/compilers_and_libraries_2020.1.217/linux/compiler/lib:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/imkl/2020.1.217/compilers_and_libraries_2020.1.217/linux/mkl/lib/intel64_lin:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/imkl/2020.1.217/compilers_and_libraries_2020.1.217/linux/compiler/lib/intel64_lin:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/lib/server:/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.0.5/lib64/R/lib:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/lib:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/imkl/2020.1.217/lib:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/imkl/2020.1.217/lib:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/imkl/2020.1.217/mkl/lib/intel64:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/imkl/2020.1.217/lib/intel64:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/gcccore/9.3.0/lib64:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/gcccore/9.3.0/lib:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/imkl/2020.1.217/compilers_and_libraries_2020.1.217/linux/compiler/lib:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/imkl/2020.1.217/compilers_and_libraries_2020.1.217/linux/mkl/lib/intel64_lin:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/imkl/2020.1.217/compilers_and_libraries_2020.1.217/linux/compiler/lib/intel64_lin:/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/lib/server
/scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl.run: line 48: 24897 Segmentation fault      (core dumped) ldd -r "$DIR/$prog"
 *** /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl.run *** exec /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl -b -s -v -t32:1 -B0 /tmp/RtmpJ66xAR/file60a3525beaf8/Model.ini

Is there any fix for this?

Thanks,
David
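
The debug output above shows the wrapper setting LD_PRELOAD to its bundled libjemalloc.so.2 before it calls ldd -r. An exported LD_PRELOAD is inherited by every child process, so helper commands such as ldd start with that library preloaded as well. Whether this is what crashes ldd here is only a guess that the thread does not confirm; one way to check is to run the same command with and without the variable. A minimal sketch, where without_preload is a hypothetical helper and the path is copied from the output above:

```shell
# Hypothetical helper: run a command with LD_PRELOAD removed from its
# environment, leaving the calling shell's environment untouched.
without_preload() {
    env -u LD_PRELOAD "$@"
}

# Compare the two runs from a shell where LD_PRELOAD is exported:
# INLA_BIN=/scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl
# ldd -r "$INLA_BIN"                  # inherits the exported LD_PRELOAD
# without_preload ldd -r "$INLA_BIN"  # same command, nothing preloaded
```

If only the first call crashes, the preloaded library is the likely trigger; if both crash, the cause lies elsewhere.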

INLA help

Jul 8, 2021, 2:22:00 PM
to R-inla discussion group, David Dayi Li
And this is the CentOS 7 build? And it’s supposed to be the correct one?

Haavard Rue
HelpDesk
help@r-inla.org

David Dayi Li

Jul 8, 2021, 2:39:17 PM
to R-inla discussion group
Yes, this is the CentOS 7 build, and I checked my HPC system version: it is CentOS 7. The HPC support staff just emailed me to say that the error is a relocation error, but it does not stop the program from running. Hope this helps.

David
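
Because the choice of binary depends on the operating system, it can help to record exactly what the node running the jobs reports about itself. A minimal sketch, assuming the node provides /etc/os-release; os_id is a hypothetical helper name.

```shell
# Hypothetical helper: print the distribution ID and version from an
# os-release style file (default: /etc/os-release).
os_id() {
    # Source the file in a subshell so its variables do not leak out.
    ( . "${1:-/etc/os-release}" && printf '%s %s\n' "$ID" "$VERSION_ID" )
}

# Run on the node that actually executes the jobs:
# os_id                      # a CentOS 7 node reports: centos 7
# ldd --version | head -n 1  # first line reports the C library version
```

Running this inside a job, rather than on the login node, shows what the INLA binary actually sees at run time.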

Helpdesk

Jul 9, 2021, 12:58:49 PM
to David Dayi Li, R-inla discussion group

And I guess this fails as well (run it in a shell):

export INLA_DEBUG=1
bash -vx /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl.run

???

David Dayi Li

Jul 9, 2021, 1:03:53 PM
to R-inla discussion group
Hi Håvard,

This is the output I got:

#!/bin/bash
# -*- shell-script -*-
#
########################################################################
# Start Lmod BASHENV
########################################################################

if [ -z "${LMOD_SH_DBG_ON+x}" ]; then
   case "$-" in
     *v*x*) __lmod_vx='vx' ;;
     *v*)   __lmod_vx='v'  ;;
     *x*)   __lmod_vx='x'  ;;
   esac;
fi
+ '[' -z '' ']'
+ case "$-" in
+ __lmod_vx=vx

[ -n "${__lmod_vx:-}" ] && set +$__lmod_vx
+ '[' -n vx ']'
+ set +vx
Shell debugging temporarily silenced: export LMOD_SH_DBG_ON=1 for this output (/cvmfs/soft.computecanada.ca/custom/software/lmod/lmod/init/bash)
Shell debugging restarted
+ unset __lmod_vx

########################################################################
# End Lmod BASHENV
########################################################################
#
# Local Variables:
# mode: shell-script
# indent-tabs-mode: nil
# End:
#!/bin/bash

cmd=$(readlink -e "$0")
++ readlink -e /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl.run
+ cmd=/scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.run
DIR=$(readlink -e $(dirname "$cmd"))
+++ dirname /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.run
++ readlink -e /scratch/dli346/R_locals/INLA/bin/linux/64bit
+ DIR=/scratch/dli346/R_locals/INLA/bin/linux/64bit
tmp=$(basename "$0")
++ basename /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl.run
+ tmp=inla.mkl.run
prog=${tmp%%.run}
+ prog=inla.mkl

D=""
+ D=
if [ ! -z ${R_HOME+x} ]; then
    d="$R_HOME/lib"
    if [ -d "$d" ]; then
        D=$d
    fi
fi
+ '[' '!' -z ']'

for d in {,/usr}/lib64 /usr/lib64/R/lib {,/usr}/lib/x86_64-linux-gnu {,/usr}/lib; do
    if [ -d "$d" ]; then
        if [ -z "$D" ]; then
            D="$d"
        else
            D="$D:$d"
        fi
    fi
done
+ for d in {,/usr}/lib64 /usr/lib64/R/lib {,/usr}/lib/x86_64-linux-gnu {,/usr}/lib
+ '[' -d /lib64 ']'
+ '[' -z '' ']'
+ D=/lib64
+ for d in {,/usr}/lib64 /usr/lib64/R/lib {,/usr}/lib/x86_64-linux-gnu {,/usr}/lib
+ '[' -d /usr/lib64 ']'
+ '[' -z /lib64 ']'
+ D=/lib64:/usr/lib64
+ for d in {,/usr}/lib64 /usr/lib64/R/lib {,/usr}/lib/x86_64-linux-gnu {,/usr}/lib
+ '[' -d /usr/lib64/R/lib ']'
+ for d in {,/usr}/lib64 /usr/lib64/R/lib {,/usr}/lib/x86_64-linux-gnu {,/usr}/lib
+ '[' -d /lib/x86_64-linux-gnu ']'
+ for d in {,/usr}/lib64 /usr/lib64/R/lib {,/usr}/lib/x86_64-linux-gnu {,/usr}/lib
+ '[' -d /usr/lib/x86_64-linux-gnu ']'
+ for d in {,/usr}/lib64 /usr/lib64/R/lib {,/usr}/lib/x86_64-linux-gnu {,/usr}/lib
+ '[' -d /lib ']'
+ '[' -z /lib64:/usr/lib64 ']'
+ D=/lib64:/usr/lib64:/lib
+ for d in {,/usr}/lib64 /usr/lib64/R/lib {,/usr}/lib/x86_64-linux-gnu {,/usr}/lib
+ '[' -d /usr/lib ']'
+ '[' -z /lib64:/usr/lib64:/lib ']'
+ D=/lib64:/usr/lib64:/lib:/usr/lib

for f in $DIR/first/lib*.so*; do
    case "$f" in
        $DIR/first/libjemalloc.so*)
            export LD_PRELOAD="$f";;
    esac
done
+ for f in $DIR/first/lib*.so*
+ case "$f" in
+ for f in $DIR/first/lib*.so*
+ case "$f" in
+ export LD_PRELOAD=/scratch/dli346/R_locals/INLA/bin/linux/64bit/first/libjemalloc.so.2
+ LD_PRELOAD=/scratch/dli346/R_locals/INLA/bin/linux/64bit/first/libjemalloc.so.2
+ for f in $DIR/first/lib*.so*
+ case "$f" in
+ for f in $DIR/first/lib*.so*
+ case "$f" in
+ for f in $DIR/first/lib*.so*
+ case "$f" in

if [ -n "${INLA_NATIVE_LD_LIBRARY_PATH}" ]; then
    ## so we can revert back to old behaviour
    export LD_LIBRARY_PATH="$DIR/first:$D:$DIR:$LD_LIBRARY_PATH"
else
    ## this is the new default, is that we use the libs used when we
    ## build
    export LD_LIBRARY_PATH="$DIR/first:$DIR:$D:$LD_LIBRARY_PATH"
fi
+ '[' -n '' ']'
+ export LD_LIBRARY_PATH=/scratch/dli346/R_locals/INLA/bin/linux/64bit/first:/scratch/dli346/R_locals/INLA/bin/linux/64bit:/lib64:/usr/lib64:/lib:/usr/lib:
+ LD_LIBRARY_PATH=/scratch/dli346/R_locals/INLA/bin/linux/64bit/first:/scratch/dli346/R_locals/INLA/bin/linux/64bit:/lib64:/usr/lib64:/lib:/usr/lib:
export PARDISOLICMESSAGE=1
+ export PARDISOLICMESSAGE=1
+ PARDISOLICMESSAGE=1
export OMP_NESTED=TRUE
+ export OMP_NESTED=TRUE
+ OMP_NESTED=TRUE

if [ -n "${INLA_DEBUG}" ]; then
    echo "*** $0 *** LD_PRELOAD=$LD_PRELOAD"
    echo "*** $0 *** LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
    ldd -r "$DIR/$prog"
fi
+ '[' -n 1 ']'
+ echo '*** /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl.run *** LD_PRELOAD=/scratch/dli346/R_locals/INLA/bin/linux/64bit/first/libjemalloc.so.2'
*** /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl.run *** LD_PRELOAD=/scratch/dli346/R_locals/INLA/bin/linux/64bit/first/libjemalloc.so.2
+ echo '*** /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl.run *** LD_LIBRARY_PATH=/scratch/dli346/R_locals/INLA/bin/linux/64bit/first:/scratch/dli346/R_locals/INLA/bin/linux/64bit:/lib64:/usr/lib64:/lib:/usr/lib:'
*** /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl.run *** LD_LIBRARY_PATH=/scratch/dli346/R_locals/INLA/bin/linux/64bit/first:/scratch/dli346/R_locals/INLA/bin/linux/64bit:/lib64:/usr/lib64:/lib:/usr/lib:
+ ldd -r /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl
/scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl.run: line 48: 38721 Segmentation fault      ldd -r "$DIR/$prog"

## set default levels for nested openmp
nt="-t0:1"
+ nt=-t0:1
for arg in "$@"; do
    case "$arg" in
        -t*) nt="$arg";;
    esac
done
eval $("$DIR/$prog" $nt -mopenmp | grep ^export)
++ /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl -t0:1 -mopenmp
++ grep '^export'
+ eval
if [ -n "${INLA_DEBUG}" ]; then
    set | grep ^OMP_ | while read v; do echo "*** $0 *** $v"; done
    echo " *** $0 *** exec $DIR/$prog $@"
fi
+ '[' -n 1 ']'
+ set
+ grep '^OMP_'
+ read v
+ echo ' *** /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl.run *** exec /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl '
 *** /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl.run *** exec /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl

exec "$DIR/$prog" "$@"
+ exec /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl


***************************************************************************
CONTAINS Runtime Modules of Parallel Sparse Linear Solver PARDISO Vers. 7.0
CUSTOMIZED FOR THE R-INLA PACKAGE WHICH SOLVES A LARGE CLASS OF STATISTICAL
MODELS USING THE INLA APPROACH.
Copyright Universita della Svizzera italiana 2000-2020 All Rights Reserved.

No PARDISO license file found.  Please see `http://www.pardiso-project.org/r-inla
where to place the license file pardiso.lic
***************************************************************************

        1f6a39183ef43d8ef33f10ff3f04fd13f8432758 - Mon Feb 22 21:27:50 2021 +0300

*** Error: Expected argument after options.

Usage: /scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl [-v] [-V] [-h] [-f] [-e var=value] [-t MAX_THREADS] [-m MODE] FILE.INI

David
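
One detail in this trace lines up with the gdb output in the first message, which said the core was generated by grep ^export. The wrapper runs exactly that command in the line eval $("$DIR/$prog" $nt -mopenmp | grep ^export), and that line is not guarded by INLA_DEBUG, so it runs on every call. That could account for one core file per simulation even without debugging, although the thread does not confirm it. A minimal sketch for checking which program produced a given core file; core_origin is a hypothetical helper, and the gdb calls are left as comments because they need a real core file.

```shell
# Hypothetical helper: extract the command line from gdb's
# "Core was generated by `...'." line, read on standard input.
core_origin() {
    sed -n "s/^Core was generated by \`\(.*\)'\.\$/\1/p"
}

# With a real core file (file name from the first message):
# gdb -batch -c core.224843 2>/dev/null | core_origin
# Once the program is known, pass it to gdb ahead of the core file so
# that gdb can match addresses to that binary, for example:
# gdb "$(command -v grep)" core.224843
```

Note that gdb expects the executable first and the core file second, which is why the earlier call with core as the first argument could only show ?? for the crashing frame.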

Helpdesk

Jul 9, 2021, 1:17:28 PM
to David Dayi Li, R-inla discussion group
Good, so this runs fine!

What about this in R?

inla(y~1,data=data.frame(y=0), verbose=TRUE,
inla.call="/scratch/dli346/R_locals/INLA/bin/linux/64bit/inla.mkl.run")

David Dayi Li

Jul 9, 2021, 1:21:04 PM
to R-inla discussion group
Just tried it. It still produces the same segmentation fault as before.

Jeff Lee

Oct 28, 2023, 1:44:07 AM
to R-inla discussion group
Hi David,
I have exactly the same issue as you described above. I am running my jobs on an HPC server and get core files for every simulation I run. The output seems to be fine, without any errors. Have you solved this issue?
Many thanks,
Jeff

Helpdesk (Haavard Rue)

Oct 29, 2023, 2:21:03 PM
to Jeff Lee, R-inla discussion group
I guess you're using the correct binary, installed with inla.binary.install()?