Hi,
A typical ResNet model's forward pass takes 31 ms for 0.2 data points on my 6 GB home PC, but the same forward pass takes 51 ms for the same data on a GCP K80 (12 GB). I am confused; am I missing something?
These are the stats while the model is training on the K80 GPU:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.51                 Driver Version: 375.51                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           On   | 0000:00:04.0     Off |                    0 |
| N/A   40C    P0   107W / 149W |    644MiB / 11439MiB |     85%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      3957    C   /usr/local/torch/install/bin/luajit            642MiB |
+-----------------------------------------------------------------------------+
It's hardly using any memory. Any thoughts?
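To put rough numbers on this, here is a small Python sketch of the arithmetic, using only the figures quoted in this post. The variable names are mine, purely for illustration, and nothing here is measured by the script itself.

```python
# Figures copied from the post; this script does not measure anything.
home_forward_ms = 31.0   # forward pass time on the home PC
k80_forward_ms = 51.0    # forward pass time on the GCP K80
used_mib = 644           # Memory-Usage reported by nvidia-smi
total_mib = 11439        # total K80 memory reported by nvidia-smi

# How much slower the K80 forward pass is relative to the home PC.
slowdown = k80_forward_ms / home_forward_ms

# Fraction of the K80's memory that is actually in use.
memory_fraction = used_mib / total_mib

print(f"K80 forward pass takes {slowdown:.2f}x the home PC time")
print(f"GPU memory in use: {memory_fraction:.1%}")
```

So the K80 is well under 10% full on memory while reporting 85% GPU utilization, yet the forward pass is noticeably slower than on my home machine.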
Thanks