error while loading shared libraries: libcudart.so.7.5

Chris Reidy

May 1, 2018, 4:23:55 PM
to singularity
Hi 
I'm trying to help my user who is getting this message:
probtrackx2_gpu: error while loading shared libraries: libcudart.so.7.5: cannot open shared object file: No such file or directory...

She created a Singularity container from a Dockerfile, which I could attach if needed.
The Dockerfile calls the cuda4singularity code.

She submits it as a PBS job including "module load cuda75" which sets the library path to:

echo $LD_LIBRARY_PATH

/cm/shared/apps/cuda75/toolkit/7.5.18/extras/CUPTI/lib64:/cm/local/apps/cuda/libs/current/lib64:/cm/shared/apps/cuda75/toolkit/7.5.18/lib64

And so:

find /cm/shared/apps/cuda75 -name libcudart.so.7.5

/cm/shared/apps/cuda75/toolkit/7.5.18/lib64/libcudart.so.7.5

/cm/shared/apps/cuda75/toolkit/7.5.18/lib/libcudart.so.7.5


Thanks in advance for help
Chris

Kandes, Martin

May 1, 2018, 4:36:30 PM
to singu...@lbl.gov
Hi Chris,

Are these paths visible from within the container?
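
One way to check this is a small shell helper that reports which directories on a search path actually hold the library. This is only a sketch: `find_lib_in_path` is a made-up name, not part of Singularity, and it assumes a POSIX shell.

```shell
# Hypothetical helper (not a Singularity command): print every copy of a
# library found in a colon-separated search path such as LD_LIBRARY_PATH.
# The body runs in a subshell so the IFS and globbing changes stay local.
find_lib_in_path() (
    lib=$1
    IFS=:
    set -f
    for dir in $2; do
        if [ -n "$dir" ] && [ -e "$dir/$lib" ]; then
            printf '%s\n' "$dir/$lib"
        fi
    done
)

# On the host, after "module load cuda75":
#   find_lib_in_path libcudart.so.7.5 "$LD_LIBRARY_PATH"
```

Inside the container, a plain listing such as `singularity exec ${WORK}/bipbids_gpu.simg ls /cm/shared/apps/cuda75/toolkit/7.5.18/lib64` answers the same question: if the listing fails, the path is not visible from within the container.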

Marty



--
You received this message because you are subscribed to the Google Groups "singularity" group.
To unsubscribe from this group and stop receiving emails from it, send an email to singularity...@lbl.gov.

Chris Reidy

May 1, 2018, 5:05:26 PM
to singularity, mka...@sdsc.edu
Hi Marty

That is interesting.  I wonder if it is expecting to find that library somewhere like /usr/local/cuda, where it does not exist.
Can I put something like this in the container recipe:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/cm/shared/apps/cuda75/toolkit/7.5.18/lib64
And then I suppose I would also need to add a bind for /cm/shared.

Chris

Jason Stover

May 1, 2018, 5:13:07 PM
to singu...@lbl.gov
If the /cm/shared location is bind mounted and available in the
container, then in the job script after loading the module, you may
want to try setting the following:

SINGULARITYENV_LD_LIBRARY_PATH=${LD_LIBRARY_PATH}
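
Note that the variable has to be exported for the singularity process to see it. A minimal sketch of the relevant job-script lines, assuming the module has already set LD_LIBRARY_PATH; the function name is only illustrative:

```shell
# Illustrative wrapper: copy the host's LD_LIBRARY_PATH into the variable
# that Singularity passes into the container as LD_LIBRARY_PATH.
# "export" matters: a plain assignment stays local to the job script's shell.
forward_ld_library_path() {
    export SINGULARITYENV_LD_LIBRARY_PATH="${LD_LIBRARY_PATH:-}"
}

forward_ld_library_path
```

The forwarded paths only help if the directories they name exist inside the container, so /cm/shared still has to be bind mounted.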

There's also the --nv option, which tries pulling in the libraries
listed in ${sysconfdir}/singularity/nvliblist.conf ... It uses
ldconfig -p output to try and figure out where the library is, and
bind mounts them into the container.
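
Since --nv relies on the loader cache, it may be worth checking whether the host's ldconfig -p output lists the library at all. Libraries that are only reachable through LD_LIBRARY_PATH (as set by a module) are normally not in that cache. A rough sketch, with a hypothetical helper name, that filters ldconfig -p style text from stdin:

```shell
# Hypothetical helper: read "ldconfig -p" output on stdin and print the
# resolved path of each entry whose library name matches exactly.
# Entry format: "<name> (<flags>) => <path>"
lib_paths_from_ldconfig() {
    awk -v lib="$1" '$1 == lib { print $NF }'
}

# On the host (ldconfig often lives in /sbin):
#   /sbin/ldconfig -p | lib_paths_from_ldconfig libcudart.so.7.5
```

No output would suggest that --nv has no way of locating the library on this host.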

-J

Chris Reidy

May 1, 2018, 6:30:51 PM
to singularity
Hi Jason

We do not include /cm/shared in singularity.conf (this is a CentOS 6 system, so we have to specify the bind directories).  So I will add libcudart.so.7.5 (and, I guess, the newer ones for anyone using a newer CUDA) to nvliblist.conf.  She uses --nv, so that should be compatible.  I will give it a whirl.

Thanks
Chris

Kandes, Martin

May 1, 2018, 6:40:35 PM
to singu...@lbl.gov
Chris,

Jason's suggestions make sense.

Another potential problem, and something to double-check, is which NVIDIA driver version she installed with cuda4singularity [1] within the container. It has to match the driver version on the underlying system. However, the --nv option might sidestep this issue altogether. I don't have a lot of experience with it, since we started using Singularity before the --nv option existed. And since we don't often change driver versions, I still bake them into the containers we support for users [2].

Marty

[1] https://github.com/NIH-HPC/gpu4singularity

[2] https://github.com/mkandes/naked-singularity/blob/master/definition-files/us/ucsd/sdsc/comet/ubuntu/ubuntu-cuda.def


Chris Reidy

May 1, 2018, 6:42:51 PM
to singularity
Hmm well, I tried adding the library to nvliblist.conf and reloaded the module.  I also tried setting SINGULARITYENV_LD_LIBRARY_PATH.  And I get the same error when running:

singularity run --nv ${WORK}/bipbids_gpu.simg ${WORK}/Data ${WORK}/Data/derivatives participant --participant_label 327 --stages bip --tract arc_r --gpu yes --skip_bids_validator


It is looking more like I will have to add /cm/shared as a bind location in singularity.conf and have her include that in her recipe.  Thoughts?
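
To see whether a directory is already listed as a bind path, something like the following could be run against the configuration file. It is a sketch only: the helper name is invented, and the file location in the usage comment is an assumption (it depends on ${sysconfdir}).

```shell
# Hypothetical helper: succeed if the given singularity.conf has an
# uncommented "bind path = <dir>" line for exactly that directory.
has_bind_path() {
    grep -E '^[[:space:]]*bind path[[:space:]]*=' "$1" |
        sed -e 's/^[^=]*=[[:space:]]*//' -e 's/[[:space:]]*$//' |
        grep -Fxq -- "$2"
}

# Assumed location; adjust to the site's ${sysconfdir}:
#   has_bind_path /etc/singularity/singularity.conf /cm/shared && echo "already bound"
```
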
Thanks for the help
Chris


David Godlove

May 2, 2018, 10:11:50 AM
to singu...@lbl.gov
Hi Chris,

Just my $0.02.  You should note that cuda4singularity is deprecated.  It's trying to solve the same problem as --nv but in a non-portable way.  If the container was built using cuda4singularity, there is a good chance that it contains the wrong version of the NVIDIA driver and will not be compatible with the host system where you are trying to run it.  If you have access to the def file, I would remove the cuda4singularity bits, rebuild, and try to run using just --nv.
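
After a rebuild, a quick way to see what the loader still cannot resolve is to run ldd on the binary inside the container and keep only the unresolved entries. A small sketch; the helper name and the binary path in the usage comment are placeholders:

```shell
# Hypothetical helper: read ldd output on stdin and print the names of
# libraries reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Placeholder path to the binary inside the image:
#   singularity exec --nv "${WORK}/bipbids_gpu.simg" ldd /path/to/probtrackx2_gpu | missing_libs
```

If libcudart.so.7.5 still shows up, bear in mind that it ships with the CUDA toolkit rather than with the NVIDIA driver, so it would need to be installed in the image or made visible through a bind.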

Dave


David Godlove

May 2, 2018, 10:15:01 AM
to singu...@lbl.gov
Oh sorry!  I got confused about the name of my own code!  It is gpu4singularity that is deprecated.  I can't speak to cuda4singularity because I don't really know what that is.  Sorry for the confusion.

Dave



Kandes, Martin

May 2, 2018, 2:05:48 PM
to singu...@lbl.gov
Dave,

I think the confusion may have come from an earlier version of gpu4singularity, which may have been called cuda4singularity [1].

Marty

[1] wget ftp://helix.nih.gov/CUDA/cuda4singularity




David Godlove

May 2, 2018, 5:21:00 PM
to singu...@lbl.gov
Right you are Marty!  I'll take any excuse I can get for my poor memory.  

David Godlove

May 2, 2018, 5:22:33 PM
to singu...@lbl.gov
That name is a misnomer btw and I really regret adding to the considerable confusion that already exists about the difference between the Nvidia driver and CUDA.  

Dave