Caffe gpu mode behavior in unified memory architecture


Pankaj Randhe

Apr 15, 2016, 3:42:09 PM
to Caffe Users
Hi,

I am running Caffe on an NVIDIA Jetson TK1, a platform for embedded system development, using Pycaffe in GPU mode. I am concerned about the speed of my Caffe model: although the Jetson's Kepler GPU has 192 cores, a forward pass takes approx. 5 seconds, compared with 2.07 seconds on a GeForce GPU with 48 cores.
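
For anyone wanting to reproduce the comparison, here is a minimal sketch of how a forward-pass time could be measured. It is not necessarily how the numbers above were obtained. The helper name `time_forward`, the warm-up and run counts, and the file names in the commented usage are all assumptions made for illustration. The warm-up calls are there because the first GPU calls typically include one-time initialization cost.

```python
from timeit import default_timer  # portable wall-clock timer


def time_forward(forward_fn, warmup=2, runs=10):
    """Return (mean, best) wall-clock seconds per call of forward_fn.

    forward_fn is any zero-argument callable, e.g. a wrapper around
    net.forward(). The warm-up calls are executed but not timed.
    """
    for _ in range(warmup):
        forward_fn()

    samples = []
    for _ in range(runs):
        start = default_timer()
        forward_fn()
        samples.append(default_timer() - start)

    return sum(samples) / len(samples), min(samples)


# Hypothetical usage with pycaffe (file names are placeholders):
#   import caffe
#   caffe.set_mode_gpu()
#   net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)
#   mean_s, best_s = time_forward(lambda: net.forward())
#   print('mean %.3f s, best %.3f s' % (mean_s, best_s))
```

Running the same script on both machines, with the same model and input size, would show whether the gap holds once warm-up is excluded.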

The Jetson's Kepler GPU uses a unified memory architecture, in contrast to the dedicated device memory used by GeForce GPUs. Could this be the reason for the higher forward pass time, perhaps because the GPU version of Caffe does not handle unified memory well?

Any thoughts or hints on this behaviour would be really helpful!

Thank you.