GPU mode does not seem to work in caffe-based C++ project


Rafal Kapela

Aug 17, 2017, 4:29:32 AM
to Caffe Users
Hi,
Apologies in advance if this duplicates an existing report; I couldn't find a similar problem.

Platform: Ubuntu 16.04
caffe commit: 4ba654f5c88c36ee8ba53964b7faf25c6d7010b4
graphics: GeForce GTX 1070

Caffe is configured for GPU (#CPU_ONLY := 1, i.e. left commented out, and USE_CUDNN := 1 in Makefile.config), and I have the CUDA libraries installed. The NVIDIA driver works fine: when I train Caffe models, nvidia-smi shows 80-99% GPU usage. The same holds for dlib, by the way.

I have a Caffe-based project, and there seems to be an issue with GPU mode: it brings no speedup and no visible GPU usage in the nvidia-smi output.
Questions:
1. When I toggle between Caffe::set_mode(Caffe::GPU) and Caffe::set_mode(Caffe::CPU), the _net->Forward() times are pretty much the same, and nvidia-smi shows 0% GPU usage in both cases. Is there any way to check whether Forward() actually dispatches the network computation to the GPU?
2. I implemented GPU-based input preprocessing like here:
https://github.com/NVIDIA/gpu-rest-engine/blob/master/caffe/classification.cpp

so I have now replaced mutable_cpu_data() with mutable_gpu_data().
This does not seem to change the processing speed, but nvidia-smi now shows 2-5% GPU usage.

This looks odd: the _net->Forward() times of my models are about 100-500 ms, so in my opinion GPU usage should be well above 0% if this work were really running on the GPU. Moreover, top shows all CPU cores loaded at 40-90% by the application, where I would expect far less since I run only two parallel threads.
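
A minimal sketch of how such a Forward() timing comparison could be set up, to make the measurement concrete. The helper name average_ms and the warm-up and run counts are hypothetical, not taken from the project or from Caffe. The Caffe calls appear only in comments, so the block compiles without Caffe.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not part of Caffe): average wall-clock time of `fn`
// in milliseconds over `runs` calls. `warmup` untimed calls run first so
// that one-off setup cost (allocations, first-use initialization) is
// excluded from the measurement.
double average_ms(const std::function<void()>& fn, int warmup, int runs) {
  assert(runs > 0);
  for (int i = 0; i < warmup; ++i) fn();
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < runs; ++i) fn();
  const auto stop = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::milli> elapsed = stop - start;
  return elapsed.count() / runs;
}

// Intended use inside the project (assumption, needs Caffe to compile):
//   Caffe::set_mode(Caffe::GPU);
//   double gpu_ms = average_ms([&] { _net->Forward(); }, 2, 20);
//   Caffe::set_mode(Caffe::CPU);
//   double cpu_ms = average_ms([&] { _net->Forward(); }, 2, 20);
```

Averaging over several runs after a warm-up keeps one-off initialization out of the comparison between the two modes.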



nvidia-smi output
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 0000:01:00.0      On |                  N/A |
|  0%   57C    P2    42W / 230W |    589MiB /  8106MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
                                                                              
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1139    G   /usr/lib/xorg/Xorg                              66MiB |
|    0      2061    G   kwin_x11                                        15MiB |
|    0      2073    G   /usr/bin/krunner                                10MiB |
|    0      2075    G   /usr/bin/plasmashell                            38MiB |
|    0      3075    G   /usr/lib/firefox/firefox                         1MiB |
|    0      7322    C   ./analyzer                                     454MiB |
+-----------------------------------------------------------------------------+



top output
%Cpu0  : 57,5 us,  0,7 sy,  0,0 ni, 61,2 id,  0,7 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu1  : 52,5 us,  2,0 sy,  0,0 ni, 55,5 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu2  : 84,5 us,  1,4 sy,  0,0 ni, 14,2 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu3  : 59,3 us,  0,7 sy,  0,0 ni, 59,7 id,  0,3 wa,  0,0 hi,  0,0 si,  0,0 st
KiB Mem :  8165796 total,  1456812 free,  3226004 used,  3482980 buff/cache
KiB Swap:  8377340 total,  8370476 free,     6864 used.  4521444 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                        
 7367 rafal     20   0 23,747g 0,994g 384208 S 201,7 12,8   0:26.44 analyzer 

