Hi, I have been learning Kaldi for about a month and have now reached the deep neural network stage. I am getting an error with the CUDA integration. As far as I can check, Kaldi was configured with CUDA = true, nvcc is installed on my Docker machine, and nvidia-smi is also installed correctly. But when I try to run
steps/nnet/pretrain_dbn.sh, it fails with an error saying it did not get a CUDA GPU. The error is the one in the title; here is the full log:
# INFO
./steps/nnet/pretrain_dbn.sh : Pre-training Deep Belief Network as a stack of RBMs
dir : ./exp/nnet/dbn
Train-set : ./data/train '73163'
LOG ([5.5.513~1-b5f4cf]:main():cuda-gpu-available.cc:60)
### IS CUDA GPU AVAILABLE? 'e08411810e87' ###
ERROR ([5.5.513~1-b5f4cf]:SelectGpuId():cu-device.cc:166) No CUDA GPU detected!, diagnostics: cudaError_t 804 : "forward compatibility was attempted on non supported HW", in cu-device.cc:166
[ Stack-Trace: ]
/root/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x82c) [0x7f7d095e72ca]
/root/kaldi/src/lib/libkaldi-cudamatrix.so(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x21) [0x7f7d09825969]
/root/kaldi/src/lib/libkaldi-cudamatrix.so(kaldi::CuDevice::SelectGpuId(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)+0x3c8) [0x7f7d098249b8]
cuda-gpu-available(main+0x1c6) [0x4016a9]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f7d08a82830]
cuda-gpu-available(_start+0x29) [0x401379]
kaldi::KaldiFatalError
LOG ([5.5.513~1-b5f4cf]:main():cuda-gpu-available.cc:96) ...
### WE DID NOT GET A CUDA GPU!!! ###
### If your system has a 'free' CUDA GPU, try re-installing latest 'CUDA toolkit' from NVidia (this updates GPU drivers too).
### Otherwise 'nvidia-smi' shows the status of GPUs:
### - The versions should match ('NVIDIA-SMI' and 'Driver Version'), otherwise reboot or reload kernel module,
### - The GPU should be unused (no 'process' in list, low 'memory-usage' (<100MB), low 'gpu-fan' (<30%)),
### - You should see your GPU (burnt GPUs may disappear from the list until reboot),
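The checks the log suggests (is the GPU visible, and which driver version is reported) can be sketched as a small script to run inside the Docker container. This is only a minimal sketch: the helper names `parse_gpu_rows` and `query_gpus` are made up for illustration, and it assumes `nvidia-smi` is on the PATH and supports the `--query-gpu` option.

```python
import shutil
import subprocess


def parse_gpu_rows(csv_text):
    """Parse 'name, driver_version' CSV rows into a list of dicts."""
    rows = []
    for line in csv_text.strip().splitlines():
        if "," not in line:
            continue  # skip blank or unexpected lines
        # Split on the last comma so the driver version is always the final field.
        name, driver = line.rsplit(",", 1)
        rows.append({"name": name.strip(), "driver_version": driver.strip()})
    return rows


def query_gpus():
    """Return the GPUs visible in this environment, or [] if none can be queried."""
    if shutil.which("nvidia-smi") is None:
        return []  # nvidia-smi is not installed or not on PATH
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []  # driver not reachable from inside the container
    return parse_gpu_rows(result.stdout)


if __name__ == "__main__":
    gpus = query_gpus()
    if not gpus:
        print("No GPU visible from inside this environment")
    for gpu in gpus:
        print(f"{gpu['name']}: driver {gpu['driver_version']}")
```

Running it both on the host and inside the container shows whether the two report the same GPU and driver version.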
For information, I have checked that the CPU has free memory, since nobody else is using the machine.
Does this have anything to do with the Docker installation? How can I overcome this error? Or is the GPU not compatible with this script?
Thanks in advance,
Junaedi Fahmi.