no gpu detected

Jesse Hansen

Mar 26, 2024, 3:22:15 AM
to IsoNet
hi,

I am running IsoNet on a cluster. In the past it worked without issue, but recently I have begun encountering the error below and cannot figure out why, or what might have changed. The deconv/mask prep etc. do not seem to be the issue, since they run fine in the GUI while I'm logged in interactively on a GPU. However, when I begin the refine process I get the error below saying no GPU is available. Note that this HAS worked in the past; I'm not sure why it no longer works. I have tried multiple GPUs and multiple versions of IsoNet (including 0.2, 0.2.1, and the latest available). I have made sure that the GPUs (e.g. 0 1 2 3) are available, and it still says there are no GPUs. Confusing to me.

thanks again for the awesome program!  Looking forward to spISONET ;)   

here is the submission script:

#!/bin/bash

#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
#SBATCH --time=96:00:00
#SBATCH --mem-per-cpu=10G
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --export=NONE

cd $SLURM_SUBMIT_DIR

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
unset SLURM_EXPORT_ENV
module load isonet/0.2.1

srun --cpu_bind=verbose  isonet.py refine subtomo.star --gpuID 0 --iterations 50 --noise_start_iter 10,15,20,25 --noise_level 0.05,0.1,0.15,0.2


and here is the output:

######Isonet starts refining######

2024-03-26 08:13:26.476967: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-03-26 08:13:26.477115: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-03-26 08:13:26.565722: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
03-26 08:13:32, INFO     Note: NumExpr detected 56 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
03-26 08:13:33, ERROR    No GPU detected, Please check your CUDA version and installation
03-26 08:13:33, ERROR    Traceback (most recent call last):
  File "/mnt/nfs/clustersw/shared/isonet/0.2.1/IsoNet/bin/refine.py", line 43, in run
    check_gpu(args)
  File "/mnt/nfs/clustersw/shared/isonet/0.2.1/IsoNet/bin/refine.py", line 227, in check_gpu
    raise RuntimeError('No GPU detected, Please check your CUDA version and installation')
RuntimeError: No GPU detected, Please check your CUDA version and installation



best,

Jesse Hansen

Yuntao Liu

Mar 26, 2024, 3:25:58 PM
to Jesse Hansen, IsoNet
Hi Jesse,

I have no experience with this particular error, but I guess it is due to a compatibility problem.
Please check whether the versions of TensorFlow, CUDA, and cuDNN match. If those have not changed, downgrading Python to <=3.10 may help.
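
One way to check whether the versions line up is to print what TensorFlow was built against and compare that with the CUDA and cuDNN modules loaded on the node. A minimal sketch, assuming TensorFlow 2.3 or newer (where tf.sysconfig.get_build_info() is available); report_versions is just an illustrative name, not part of IsoNet:

```python
import platform


def report_versions():
    """Collect the Python, TensorFlow, CUDA and cuDNN versions seen by this environment."""
    info = {
        "python": platform.python_version(),
        "tensorflow": None,
        "cuda_build": None,
        "cudnn_build": None,
    }
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return info

    info["tensorflow"] = tf.__version__
    # get_build_info() describes what the TensorFlow package was compiled
    # against, not what is installed on the node. CPU-only builds may not
    # have these keys, hence .get().
    build = tf.sysconfig.get_build_info()
    info["cuda_build"] = build.get("cuda_version")
    info["cudnn_build"] = build.get("cudnn_version")
    return info


if __name__ == "__main__":
    for name, value in report_versions().items():
        print(f"{name:12s} {value}")
```

If cuda_build comes back as None, the installed TensorFlow is probably a CPU-only build, which on its own would explain a "No GPU detected" error.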
--
You received this message because you are subscribed to the Google Groups "IsoNet" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isonet+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/isonet/5a86ab65-615e-4f0a-b840-af23e98c4110n%40googlegroups.com.


--
Yuntao Liu,  Postdoc.

California NanoSystem Institute
University of California Los Angeles

Tanvir Shaikh

Mar 26, 2024, 5:23:37 PM
to IsoNet
I didn't experience the exact same problem, but I once had a different no-GPU error.

A minimal test was to run the following from IsoNet's Python environment:
>>> import tensorflow as tf
>>> tf.config.list_physical_devices('GPU')

If your GPUs aren't detected, you'll see it here, and hopefully the error message will be informative.
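
Since the original report says things work interactively but fail under sbatch, it may be worth running the same test inside the batch job itself. Below is a rough sketch, not an IsoNet tool: gpu_visibility and the file name check_gpu_env.py are made up for illustration. It gathers the CUDA_VISIBLE_DEVICES variable (which Slurm normally sets for jobs requesting --gres=gpu), the output of nvidia-smi -L, and the TensorFlow device list in one place.

```python
import os
import shutil
import subprocess


def gpu_visibility():
    """Gather what the current process can see: env variable, driver, TensorFlow."""
    report = {"CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES")}

    # nvidia-smi -L lists the GPUs the driver exposes to this process.
    smi = shutil.which("nvidia-smi")
    if smi is None:
        report["nvidia_smi"] = "nvidia-smi not found on PATH"
    else:
        result = subprocess.run([smi, "-L"], capture_output=True, text=True)
        report["nvidia_smi"] = (result.stdout or result.stderr).strip()

    # Same check as the interactive test above, but run from within the job.
    try:
        import tensorflow as tf
        report["tensorflow_gpus"] = [
            d.name for d in tf.config.list_physical_devices("GPU")
        ]
    except ImportError:
        report["tensorflow_gpus"] = "tensorflow not importable"
    return report


if __name__ == "__main__":
    for key, value in gpu_visibility().items():
        print(f"{key}: {value}")
```

Saved as, say, check_gpu_env.py, it could be launched with "srun python check_gpu_env.py" in place of the isonet.py line of the submission script. If nvidia-smi lists a GPU but TensorFlow returns an empty list, the problem is more likely in the CUDA/cuDNN libraries the module loads than in the Slurm allocation.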

-Tapu Shaikh

Rangana Warshamanage

Apr 1, 2024, 3:24:40 PM
to Jesse Hansen, IsoNet
Hi,

I recently had a very similar issue. It was TensorFlow not communicating with CUDA.
To test this, try the following:
(isonet_env) bash-4.4$ python
>>> import tensorflow as tf
>>> tf.test.gpu_device_name()

The last step should print a GPU device name (e.g. '/device:GPU:0') if TensorFlow can find one; an empty string means no GPU was found.

While installing, I prepared the attached document for my future reference. You're welcome to use it if that helps.

Best,
Rangana




Isonet-install-instructions.docx