WE DID NOT GET A CUDA GPU!!! ( forward compatibility was attempted on non supported HW )

Junaedi Fahmi

Oct 17, 2019, 3:11:03 AM
to kaldi-help
Hi, I have been learning Kaldi for about a month now and have reached the stage of learning deep neural networks. I have an error with the CUDA integration. As far as I can check, Kaldi was installed with CUDA = true, nvcc is installed on my Docker machine, and nvidia-smi is also installed correctly. But when I try to run steps/nnet/pretrain_dbn.sh, it gives me an error saying it did not get a CUDA GPU. The error in the log is the one in the title; here are the details:

# INFO
./steps/nnet/pretrain_dbn.sh : Pre-training Deep Belief Network as a stack of RBMs
         dir       : ./exp/nnet/dbn
         Train-set : ./data/train '73163'

LOG ([5.5.513~1-b5f4cf]:main():cuda-gpu-available.cc:60)

### IS CUDA GPU AVAILABLE? 'e08411810e87' ###
ERROR ([5.5.513~1-b5f4cf]:SelectGpuId():cu-device.cc:166) No CUDA GPU detected!, diagnostics: cudaError_t 804 : "forward compatibility was attempted on non supported HW", in cu-device.cc:166

[ Stack-Trace: ]
/root/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x82c) [0x7f7d095e72ca]
/root/kaldi/src/lib/libkaldi-cudamatrix.so(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x21) [0x7f7d09825969]
/root/kaldi/src/lib/libkaldi-cudamatrix.so(kaldi::CuDevice::SelectGpuId(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)+0x3c8) [0x7f7d098249b8]
cuda-gpu-available(main+0x1c6) [0x4016a9]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f7d08a82830]
cuda-gpu-available(_start+0x29) [0x401379]

kaldi::KaldiFatalError

LOG ([5.5.513~1-b5f4cf]:main():cuda-gpu-available.cc:96) ...

### WE DID NOT GET A CUDA GPU!!! ###
### If your system has a 'free' CUDA GPU, try re-installing latest 'CUDA toolkit' from NVidia (this updates GPU drivers too).
### Otherwise 'nvidia-smi' shows the status of GPUs:
### - The versions should match ('NVIDIA-SMI' and 'Driver Version'), otherwise reboot or reload kernel module,
### - The GPU should be unused (no 'process' in list, low 'memory-usage' (<100MB), low 'gpu-fan' (<30%)),
### - You should see your GPU (burnt GPUs may disappear from the list until reboot),

For your information, I have checked that the CPU has free memory, since nobody else is using the machine.

Does this have anything to do with the Docker installation? How can I overcome this error? Is the GPU not compatible with this script?

Thanks in advance,
Junaedi Fahmi.

 

Daniel Povey

Oct 17, 2019, 1:13:19 PM
to kaldi-help
I suspect it is a mismatch involving the hardware, driver version and CUDA toolkit version.
The output of `nvidia-smi` would help.
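
As a rough illustration of what to look for in that output, here is a minimal Python sketch that pulls the two version numbers out of the nvidia-smi banner line so they can be compared. The function name `parse_nvidia_smi_header` and the sample banner text are assumptions made for this example (the banner layout can differ between driver releases); none of this is part of Kaldi.

```python
import re

def parse_nvidia_smi_header(text):
    """Return (smi_version, driver_version) from nvidia-smi banner text, or None.

    Hypothetical helper: it assumes the banner contains
    'NVIDIA-SMI <version>' followed by 'Driver Version: <version>'.
    """
    match = re.search(r"NVIDIA-SMI\s+([\d.]+)\s+Driver Version:\s*([\d.]+)", text)
    if match is None:
        return None
    return match.group(1), match.group(2)

# Illustrative banner line only, not a transcript of a real machine.
sample = "| NVIDIA-SMI 384.130                Driver Version: 384.130                   |"

versions = parse_nvidia_smi_header(sample)
if versions is None:
    print("Could not find version information in the banner")
else:
    smi_version, driver_version = versions
    print("NVIDIA-SMI:", smi_version, "Driver:", driver_version)
    print("Versions match:", smi_version == driver_version)
```

If the two numbers differ, that points to the reboot or kernel-module reload mentioned in the Kaldi diagnostic message; if they match, the driver version itself still has to be checked against what the CUDA toolkit requires.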

--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/e126f4ae-6946-43e3-ada0-ddce33b07a2a%40googlegroups.com.

Justin Luitjens

Oct 17, 2019, 4:14:18 PM
to kaldi...@googlegroups.com
What does nvidia-smi return?

Junaedi Fahmi

Oct 17, 2019, 9:48:14 PM
to kaldi-help


cuda.png


Hi, thanks for the reply. This is the output of the command.

Daniel Povey

Oct 17, 2019, 11:06:07 PM
to kaldi-help
Your driver version is too old for the CUDA 10.1 toolkit. According to
https://docs.nvidia.com/deploy/cuda-compatibility/index.html
CUDA 10.1 needs at least driver 418.39; you have 384.130.
I think it's best not to use Docker, since it probably won't let you install drivers.
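
To make the comparison concrete, here is a small Python sketch that checks an installed driver version against a required minimum by comparing the dotted components as integers. The helper names `parse_version` and `driver_is_new_enough` are made up for this example; the two version numbers are the ones quoted above.

```python
def parse_version(version):
    """Turn a dotted version string such as '418.39' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))

def driver_is_new_enough(installed, required):
    """Return True if the installed driver version is at least the required one.

    Hypothetical helper: components are compared numerically, so
    '384.130' counts as older than '418.39'.
    """
    return parse_version(installed) >= parse_version(required)

installed_driver = "384.130"  # driver version reported in this thread
required_driver = "418.39"    # minimum quoted above for CUDA 10.1

if driver_is_new_enough(installed_driver, required_driver):
    print("Driver is new enough for the toolkit")
else:
    print("Driver is too old: update the driver or use an older CUDA toolkit")
```

Comparing integer tuples rather than raw strings or floats avoids mistakes such as treating 384.130 as smaller than 384.81 only because of how the text sorts.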



Junaedi Fahmi

Oct 21, 2019, 6:26:49 AM
to kaldi-help
Alright, I will try updating the NVIDIA driver. Thank you.

