Hi,
Apologies in advance if this duplicates an existing issue, but I couldn't find a similar report.
Platform: Ubuntu 16.04
caffe commit: 4ba654f5c88c36ee8ba53964b7faf25c6d7010b4
graphics: GeForce GTX 1070
Caffe is configured for GPU (#CPU_ONLY := 1 is left commented out and USE_CUDNN := 1 is set in Makefile.config), and the CUDA libraries are installed. The NVIDIA driver works fine: when I train Caffe models, nvidia-smi shows 80-99% GPU usage. The same goes for dlib, by the way.
I have a Caffe-based project, and there seems to be an issue with GPU mode: it gives no speedup, and nvidia-smi shows no GPU usage at all.
Questions:
1. When I toggle between Caffe::set_mode(Caffe::GPU) and Caffe::set_mode(Caffe::CPU), the _net->Forward() times are pretty much the same, and nvidia-smi shows 0% GPU usage in both cases. Is there any way to check whether Forward() actually dispatches the network computation to the GPU?
2. I implemented GPU-based input preprocessing as in this example:
https://github.com/NVIDIA/gpu-rest-engine/blob/master/caffe/classification.cpp
so I have now replaced mutable_cpu_data() with mutable_gpu_data().
This does not seem to change the processing speed, but nvidia-smi now shows 2-5% GPU usage.
This looks odd: the _net->Forward() calls for my models take about 100-500 ms each, so I would expect nvidia-smi to show noticeably more than 0% GPU usage if that work were really running on the GPU. Moreover, top shows the application loading all CPU cores at 40-90%, whereas I would expect far less because I only run two parallel threads.
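A minimal sketch of how the Forward() timing from question 1 could be compared between the two modes, using only std::chrono. The helper name mean_time_ms is hypothetical and not part of Caffe; the Caffe calls appear only in comments, and the remark about the mode being a per-thread setting is an assumption worth checking against the Caffe sources for this commit.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not part of Caffe): returns the mean wall-clock time
// of `fn` in milliseconds over `iters` timed calls. `warmup` untimed calls
// run first, because the first GPU call usually includes one-off setup cost.
template <typename Fn>
double mean_time_ms(Fn&& fn, int warmup, int iters) {
    assert(warmup >= 0 && iters > 0);
    for (int i = 0; i < warmup; ++i) {
        fn();
    }
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / iters;
}

// Intended use inside EACH worker thread (shown as comments, not compiled):
//
//   Caffe::set_mode(Caffe::GPU);   // assumption: the mode is stored per
//                                  // thread, so setting it only in main()
//                                  // may leave worker threads in CPU mode
//   double gpu_ms = mean_time_ms([&] { _net->Forward(); }, 3, 20);
//
//   Caffe::set_mode(Caffe::CPU);
//   double cpu_ms = mean_time_ms([&] { _net->Forward(); }, 3, 20);
//
// Checking Caffe::mode() right before Forward() in the same thread would
// show which mode that thread is actually using.
```

If gpu_ms and cpu_ms come out nearly identical when measured inside the thread that calls Forward(), that would suggest the thread is still running in CPU mode.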
nvidia-smi output
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 0000:01:00.0      On |                  N/A |
|  0%   57C    P2    42W / 230W |    589MiB /  8106MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1139    G   /usr/lib/xorg/Xorg                              66MiB |
|    0      2061    G   kwin_x11                                        15MiB |
|    0      2073    G   /usr/bin/krunner                                10MiB |
|    0      2075    G   /usr/bin/plasmashell                            38MiB |
|    0      3075    G   /usr/lib/firefox/firefox                         1MiB |
|    0      7322    C   ./analyzer                                     454MiB |
+-----------------------------------------------------------------------------+
top output
%Cpu0 : 57,5 us, 0,7 sy, 0,0 ni, 61,2 id, 0,7 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu1 : 52,5 us, 2,0 sy, 0,0 ni, 55,5 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu2 : 84,5 us, 1,4 sy, 0,0 ni, 14,2 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu3 : 59,3 us, 0,7 sy, 0,0 ni, 59,7 id, 0,3 wa, 0,0 hi, 0,0 si, 0,0 st
KiB Mem : 8165796 total, 1456812 free, 3226004 used, 3482980 buff/cache
KiB Swap: 8377340 total, 8370476 free, 6864 used. 4521444 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7367 rafal 20 0 23,747g 0,994g 384208 S 201,7 12,8 0:26.44 analyzer