Caffe RC3 slower with cuDNN v4 vs. without?!

ma...@markable.us

Jul 28, 2016, 11:58:59 PM
to Caffe Users
The Caffe docs say, "cuDNN Caffe: for fastest operation Caffe is accelerated by drop-in integration of NVIDIA cuDNN. To speed up your Caffe models, install cuDNN then uncomment the USE_CUDNN := 1 flag in Makefile.config when installing Caffe." However, when I tried this, I found that Caffe (PyCaffe specifically) actually ran slower with cuDNN than without it. Has anyone else experienced this? Or does anything look suspect about my builds?

Below are per-iteration operation times from running the same feature extraction code in a loop, with the same Caffe model, on servers with different Caffe RC3 builds.
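For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of a timing loop that prints lines in the same "optime" format. It is not the code used for the numbers in this post: `time_operation` and `dummy_op` are hypothetical names, and the placeholder workload stands in for the actual PyCaffe feature extraction call.

```python
import datetime
import time


def time_operation(op, iterations=9, warmup=2):
    """Call op() repeatedly and return one timedelta per timed iteration."""
    # Untimed warm-up calls, so one-time setup costs (for example memory
    # allocation on the first call) are not counted in the timings.
    for _ in range(warmup):
        op()

    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        op()
        elapsed = time.perf_counter() - start
        timings.append(datetime.timedelta(seconds=elapsed))
    return timings


if __name__ == "__main__":
    # Placeholder workload (assumption): replace with the real
    # feature extraction call for the model being measured.
    def dummy_op():
        sum(i * i for i in range(100000))

    for t in time_operation(dummy_op):
        print("optime:", t)
```

Running the same script unchanged on each build keeps the comparison limited to the build itself.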

ATLAS with CUDA 7.5, without cuDNN

optime: 0:00:00.141123
optime: 0:00:00.141044
optime: 0:00:00.140796
optime: 0:00:00.140881
optime: 0:00:00.140706
optime: 0:00:00.141275
optime: 0:00:00.141032
optime: 0:00:00.141049
optime: 0:00:00.140871

ATLAS with CUDA 7.0 and cuDNN v4

optime: 0:00:00.157828
optime: 0:00:00.157653
optime: 0:00:00.157314
optime: 0:00:00.156893
optime: 0:00:00.155795
optime: 0:00:00.157192
optime: 0:00:00.155587
optime: 0:00:00.155364
optime: 0:00:00.155914

OpenBLAS with CUDA 7.5, without cuDNN

optime: 0:00:00.150775
optime: 0:00:00.152572
optime: 0:00:00.152900
optime: 0:00:00.154615
optime: 0:00:00.151565
optime: 0:00:00.153476
optime: 0:00:00.151332
optime: 0:00:00.151705
optime: 0:00:00.153208

OpenBLAS with CUDA 7.0 and cuDNN v4

optime: 0:00:00.162633
optime: 0:00:00.161286
optime: 0:00:00.161574
optime: 0:00:00.162428
optime: 0:00:00.159456
optime: 0:00:00.160352
optime: 0:00:00.161450
optime: 0:00:00.162175
optime: 0:00:00.160275
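
To make the four runs easier to compare, the following sketch averages the times listed above and reports the relative slowdown of each cuDNN build against the matching non-cuDNN build. The values are copied from the "optime" lines in this post, converted to seconds; the helper names `summarize` and `slowdown_percent` are only illustrative.

```python
from statistics import mean

# Per-iteration times in seconds, copied from the "optime" lines above.
timings = {
    "ATLAS, CUDA 7.5, no cuDNN": [
        0.141123, 0.141044, 0.140796, 0.140881, 0.140706,
        0.141275, 0.141032, 0.141049, 0.140871,
    ],
    "ATLAS, CUDA 7.0, cuDNN v4": [
        0.157828, 0.157653, 0.157314, 0.156893, 0.155795,
        0.157192, 0.155587, 0.155364, 0.155914,
    ],
    "OpenBLAS, CUDA 7.5, no cuDNN": [
        0.150775, 0.152572, 0.152900, 0.154615, 0.151565,
        0.153476, 0.151332, 0.151705, 0.153208,
    ],
    "OpenBLAS, CUDA 7.0, cuDNN v4": [
        0.162633, 0.161286, 0.161574, 0.162428, 0.159456,
        0.160352, 0.161450, 0.162175, 0.160275,
    ],
}


def summarize(samples):
    """Return the mean time in seconds for each named run."""
    return {name: mean(values) for name, values in samples.items()}


def slowdown_percent(baseline, candidate):
    """Percentage by which candidate is slower than baseline."""
    return (candidate - baseline) / baseline * 100.0


if __name__ == "__main__":
    means = summarize(timings)
    for name, value in means.items():
        print(f"{name}: {value * 1000:.2f} ms")

    for blas in ("ATLAS", "OpenBLAS"):
        base = means[f"{blas}, CUDA 7.5, no cuDNN"]
        cudnn = means[f"{blas}, CUDA 7.0, cuDNN v4"]
        print(f"{blas}: cuDNN build slower by {slowdown_percent(base, cudnn):.1f}%")
```

On these numbers the cuDNN builds come out roughly 11% slower with ATLAS and roughly 6% slower with OpenBLAS. Note that the builds in each pair also differ in CUDA version (7.5 vs. 7.0), so the percentages reflect both differences, not cuDNN alone.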