CUDA fails after every reboot

sp...@ymail.com

Dec 18, 2015, 6:51:26 PM
to Caffe Users
Hi All,

I am using Caffe with GPU support (NVIDIA GTX 590). I installed NVIDIA driver 352 (bundled with the CUDA 7.5 runfile) without the OpenGL libraries, and Caffe works fine.

However, every time I start a new Caffe session after turning my computer on, I get the following error:

WARNING: Logging before InitGoogleLogging() is written to STDERR
E1219 10:42:22.132685 23546 common.cpp:104] Cannot create Cublas handle. Cublas won't be available.
E1219 10:42:22.159312 23546 common.cpp:111] Cannot create Curand generator. Curand won't be available.
F1219 10:42:22.176038 23546 common.cpp:142] Check failed: error == cudaSuccess (30 vs. 0)  unknown error

and I have to re-make Caffe every time. Do you know a permanent solution for this? It would be of great help. Thanks in advance.



Alex Sokolov

Feb 7, 2016, 3:54:39 PM
to Caffe Users
I ran into the same issue today, and it took several hours before I realized that I had seen the same problem with CUDA for Theano. It has nothing to do with rebuilding Caffe or BLAS, so don't waste your time on that. The problem is that Caffe can't use CUDA until some CUDA-related program has been run with sudo. I couldn't find the original Stack Overflow thread, but I think it said that you should run nvidia-x-server, nvidia-smi, or the CUDA sample test (deviceQuery) with sudo. I am using the last one and haven't tested the first two this time.

The permanent solution that worked for me was a startup file that calls deviceQuery with sudo on every reboot. I created a file 'cuda_startup' in /etc/init.d with the following content: 'sudo /usr/local/cuda/samples/1_Utilities/deviceQuery/deviceQuery' and linked it into /etc/rc2.d with the terminal command: 'sudo ln -s /etc/init.d/cuda_startup /etc/rc2.d/S99cuda_startup'.
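
For reference, a slightly fuller version of that startup file might look like the sketch below. It assumes the sample was built in the default location given above. The shebang, the DEVICE_QUERY variable, the run_device_query helper, and the warning message are illustrative additions, not part of the original one-line file. Init scripts already run as root at boot, so sudo is left out inside the script.

```shell
#!/bin/sh
# Sketch of /etc/init.d/cuda_startup: run one CUDA program as root at boot
# so that later non-root CUDA sessions (e.g. Caffe) can use the GPU.

# Assumed default build location of the sample; override via the environment.
DEVICE_QUERY="${DEVICE_QUERY:-/usr/local/cuda/samples/1_Utilities/deviceQuery/deviceQuery}"

run_device_query() {
    if [ -x "$DEVICE_QUERY" ]; then
        # Output is discarded; only the side effect of initializing CUDA matters.
        "$DEVICE_QUERY" > /dev/null 2>&1
    else
        # Warn but do not fail, so a missing sample never blocks the boot.
        echo "cuda_startup: $DEVICE_QUERY not found (was it built with make?)" >&2
    fi
}

# Init scripts are invoked with an action argument; only "start" does work here.
case "${1:-start}" in
    start) run_device_query ;;
    *) : ;;
esac
```

The file also has to be executable (chmod +x /etc/init.d/cuda_startup) before init will run it through the rc2.d link.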

Felix Abecassis

Feb 8, 2016, 1:11:24 AM
to Caffe Users
You can also try "nvidia-modprobe -u -c=0" (without sudo); it might work instead of running deviceQuery with sudo.

Enes Deumić

Feb 8, 2016, 3:10:42 AM
to Caffe Users
After a restart, /dev/nvidia-uvm is probably missing (check with ls /dev/nvidia*). You need to run deviceQuery with sudo (probably in /usr/local/cuda/samples/1_Utilities/deviceQuery; don't forget to run make first) or:
sudo modprobe nvidia-uvm
sudo mknod -m 666 /dev/nvidia-uvm c 250 0
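
The major number 250 in the mknod line is the value from one particular machine and may differ on another, since the kernel lists the number it assigned in /proc/devices. A hedged sketch that reads it from there instead of hard-coding it follows. The helper names uvm_major and create_uvm_node and the optional file argument are made up for illustration.

```shell
#!/bin/sh
# Sketch: create /dev/nvidia-uvm using the major number the kernel assigned.

# Print the major number registered for nvidia-uvm. Reads a file in
# /proc/devices format; the path defaults to /proc/devices itself.
uvm_major() {
    awk '$2 == "nvidia-uvm" { print $1; exit }' "${1:-/proc/devices}"
}

# Load the module and create the device node if it is missing.
# Must be called as root.
create_uvm_node() {
    modprobe nvidia-uvm || return 1
    major="$(uvm_major)"
    if [ -z "$major" ]; then
        echo "nvidia-uvm is not listed in /proc/devices" >&2
        return 1
    fi
    [ -e /dev/nvidia-uvm ] || mknod -m 666 /dev/nvidia-uvm c "$major" 0
}
```

The block only defines the two functions; calling create_uvm_node from a root shell or a boot script performs the same two steps as the commands above.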

Samitha

Feb 14, 2016, 5:32:38 PM
to Caffe Users
Thanks :) running deviceQuery works...

Samitha

Feb 14, 2016, 5:33:31 PM
to Caffe Users
Thanks this worked :) ...