The Caffe docs say: "cuDNN Caffe: for fastest operation Caffe is accelerated by drop-in integration of NVIDIA cuDNN. To speed up your Caffe models, install cuDNN then uncomment the USE_CUDNN := 1 flag in Makefile.config when installing Caffe." However, when I tried this, I found that Caffe (PyCaffe specifically) actually ran slower with cuDNN than without it. Has anyone else experienced this? Or does anyone see anything suspect in my builds?
Below are per-operation times from running the same feature extraction code in a loop, with the same Caffe model, on servers with different Caffe RC3 builds.
ATLAS with CUDA 7.5, without cuDNN
optime: 0:00:00.141123
optime: 0:00:00.141044
optime: 0:00:00.140796
optime: 0:00:00.140881
optime: 0:00:00.140706
optime: 0:00:00.141275
optime: 0:00:00.141032
optime: 0:00:00.141049
optime: 0:00:00.140871
ATLAS with CUDA 7.0 and cuDNN v4
optime: 0:00:00.157828
optime: 0:00:00.157653
optime: 0:00:00.157314
optime: 0:00:00.156893
optime: 0:00:00.155795
optime: 0:00:00.157192
optime: 0:00:00.155587
optime: 0:00:00.155364
optime: 0:00:00.155914
OpenBLAS with CUDA 7.5, without cuDNN
optime: 0:00:00.150775
optime: 0:00:00.152572
optime: 0:00:00.152900
optime: 0:00:00.154615
optime: 0:00:00.151565
optime: 0:00:00.153476
optime: 0:00:00.151332
optime: 0:00:00.151705
optime: 0:00:00.153208
OpenBLAS with CUDA 7.0 and cuDNN v4
optime: 0:00:00.162633
optime: 0:00:00.161286
optime: 0:00:00.161574
optime: 0:00:00.162428
optime: 0:00:00.159456
optime: 0:00:00.160352
optime: 0:00:00.161450
optime: 0:00:00.162175
optime: 0:00:00.160275
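For reference, the optime lines above come from a timing loop like the following minimal sketch. The actual feature-extraction call is replaced by a placeholder here; in the real script it would be the PyCaffe forward pass (e.g. `net.forward(...)`), and the variable names are illustrative, not from my actual code:

```python
import datetime

def time_op(op, n_iters=9):
    """Run `op` n_iters times, printing wall-clock time per call
    in the same 'optime:' format as the logs above."""
    times = []
    for _ in range(n_iters):
        start = datetime.datetime.now()
        op()
        elapsed = datetime.datetime.now() - start
        times.append(elapsed)
        print("optime:", elapsed)
    return times

# In the real benchmark, `op` wraps the PyCaffe forward pass, e.g.:
#   times = time_op(lambda: net.forward(data=batch))
# Here a no-op stands in so the sketch is self-contained:
times = time_op(lambda: None, n_iters=3)
```

Note that this measures wall-clock time around the whole Python call; since CUDA launches are asynchronous, any timing that matters should happen around a call that blocks until the GPU result is ready (PyCaffe's `forward` returns the output blobs, so it does synchronize).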