Error running classification example: "Cannot use GPU in CPU-only Caffe: check mode."

Claude Phan

Mar 29, 2016, 4:29:05 AM
to DIGITS Users
Hello,

Wondering if anyone could help me with an issue I am having.

I've installed DIGITS following the instructions provided at https://github.com/NVIDIA/DIGITS/blob/master/docs/BuildDigits.md

All is well. I've also followed the instructions at http://caffe.berkeleyvision.org/installation.html to build Caffe in CPU_ONLY mode. The DIGITS server boots up fine and I've trained a model.

I have now downloaded my model and am trying to use it outside of the DIGITS GUI by following these instructions: https://github.com/NVIDIA/DIGITS/tree/v2.0.0-preview/examples/classification

However, I am getting an error while executing:

:~/PythonTest$ ./use_archive.py 20160321-135522-5ad9_epoch_30.0.tar.gz 8.jpg
Unknown file: train_val.prototxt
Unknown file: solver.prototxt
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0329 01:14:16.736501 18372 conv_layer.cpp:78] Cannot use GPU in CPU-only Caffe: check mode.
*** Check failure stack trace: ***
Aborted (core dumped)


I must be missing some check or setting, but Caffe clearly thinks that I am in GPU mode. In terms of hardware I can't be in GPU mode, because my gfx card isn't supported. Below are some fixes I have tried after researching the issue; still no dice.

When I run "cmake .." in the build folder it says
--   BUILD_SHARED_LIBS :   ON
--   BUILD_python      :   ON
--   BUILD_matlab      :   OFF
--   BUILD_docs        :   ON
--   CPU_ONLY          :   ON
--   USE_OPENCV        :   ON
--   USE_LEVELDB       :   ON
--   USE_LMDB          :   ON
--   ALLOW_LMDB_NOLOCK :   OFF




I also made sure the "Makefile.config" file had the CPU_ONLY line uncommented:
# CPU-only switch (uncomment to build without GPU support).
CPU_ONLY := 1


Also, in the model I downloaded, the "solver.prototxt" file has these lines:
snapshot_prefix: "snapshot"
solver_mode: CPU
net: "train_val.prototxt"
solver_type: SGD

I've also tried removing or commenting out the section of code in question in the "conv_layer.cpp" file, but then I just get errors when trying to build:
//#ifdef CPU_ONLY
//STUB_GPU(ConvolutionLayer);
//#endif


leads to

[ 83%] Built target caffe
Linking CXX executable caffe
../lib/libcaffe-nv.so.0.14.2: undefined reference to `caffe::ConvolutionLayer<float>::Backward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
../lib/libcaffe-nv.so.0.14.2: undefined reference to `caffe::ConvolutionLayer<double>::Backward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<bool, std::allocator<bool> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
../lib/libcaffe-nv.so.0.14.2: undefined reference to `caffe::ConvolutionLayer<float>::Forward_gpu(std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&, std::vector<caffe::Blob<float>*, std::allocator<caffe::Blob<float>*> > const&)'
../lib/libcaffe-nv.so.0.14.2: undefined reference to `caffe::ConvolutionLayer<double>::Forward_gpu(std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&, std::vector<caffe::Blob<double>*, std::allocator<caffe::Blob<double>*> > const&)'
collect2: error: ld returned 1 exit status
make[2]: *** [tools/caffe] Error 1
make[1]: *** [tools/CMakeFiles/caffe.bin.dir/all] Error 2
make: *** [all] Error 2



I'm at the point where I'm about to buy a new gfx card just so I don't have to deal with the CPU/GPU issues. However, this brings up another question. If I'm getting these errors when trying to integrate the model into other applications (currently Python), does that mean other machines trying to use this model need to have Caffe installed and configured as well? If so, doesn't this defeat the point of training a weighted network and distributing it?

My end goal is to integrate the model via embedded Python code into a GUI application developed in Qt. Thanks for any assistance!

Greg Heinrich

Mar 29, 2016, 6:44:00 PM
to DIGITS Users
Hello Claude,
Your Caffe binary looks OK. You need to add '--nogpu' to your use_archive.py command line.
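
For illustration, here is a minimal sketch of what such a flag typically does. It is an assumption about the general shape of the script, not a copy of use_archive.py: the flag is parsed from the command line and, when present, pycaffe is switched to CPU mode before the network is built. caffe.set_mode_cpu() and caffe.set_mode_gpu() are pycaffe calls; the helper names parse_args and select_mode are hypothetical, and the Caffe module is passed in as a parameter so the logic can be exercised without Caffe installed.

```python
import argparse


def parse_args(argv=None):
    """Parse a command line shaped like the one in this thread (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Classify an image with a DIGITS model archive")
    parser.add_argument("archive", help="path to the downloaded model archive")
    parser.add_argument("image", help="path to the image to classify")
    parser.add_argument("--nogpu", action="store_true",
                        help="run Caffe in CPU mode")
    return parser.parse_args(argv)


def select_mode(caffe_module, use_gpu):
    """Switch pycaffe to GPU or CPU mode and return the chosen mode name.

    caffe_module is expected to be the imported `caffe` package
    (hypothetical helper, not part of DIGITS or Caffe).
    """
    if use_gpu:
        caffe_module.set_mode_gpu()
        return "GPU"
    caffe_module.set_mode_cpu()
    return "CPU"
```

With this shape, the command from the original post would become `./use_archive.py --nogpu 20160321-135522-5ad9_epoch_30.0.tar.gz 8.jpg`, and the script would call something like `select_mode(caffe, use_gpu=not args.nogpu)` before loading the network.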

And yes, Caffe needs to be installed on any machine you need to deploy your neural network onto.
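
As a rough sketch of what that means for embedded Python code (assuming Python 3 and a pycaffe build on the target machine), an application could check that the caffe module is importable before loading the network in CPU mode. The helper names pycaffe_available and load_net_cpu are hypothetical, and the file paths passed in are whatever the extracted model archive contains.

```python
import importlib.util


def pycaffe_available():
    """Return True if the `caffe` Python module can be found on this machine."""
    return importlib.util.find_spec("caffe") is not None


def load_net_cpu(deploy_file, weights_file):
    """Load a trained network for inference in CPU mode (hypothetical helper).

    deploy_file:  path to the network definition (.prototxt)
    weights_file: path to the trained weights (.caffemodel)
    """
    if not pycaffe_available():
        raise RuntimeError("pycaffe is not installed or not on PYTHONPATH")
    import caffe  # imported lazily so the check above can fail with a clear message
    caffe.set_mode_cpu()  # must happen before any layer tries to use the GPU
    return caffe.Net(deploy_file, weights_file, caffe.TEST)
```

A GUI application could call `pycaffe_available()` at startup and show a clear message instead of aborting if Caffe is missing.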

Regards.

Claude Phan

Mar 29, 2016, 11:57:30 PM
to DIGITS Users
Greg,

Thanks for your reply. I'm so careless, it was right there in the example. Must have been up too late yesterday.

Appreciate your time. Thanks!

- Claude