I am running Caffe on an AWS g2.2xlarge GPU instance, with cuDNN enabled and ECC off.
I have the Hybrid CNN model (MIT) from the model zoo.
The best speed I am getting is around 650 ms per classification / recognition. I think the GPU in the AWS instance is a K10.
The standard Caffe demo (ImageNet) runs in 70-80 ms, but when we put in the bigger models it settles around 600-700 ms.
But is that all we can get?
Has anyone done better and are there any specific tweaks that can be done to get this to better levels - say <100ms?
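
To make the figures above easier to compare, here is a minimal sketch of how per-image latency might be timed. It is an illustration rather than the exact procedure behind the numbers quoted: `time_forward` and `fake_forward` are hypothetical names, and the stand-in callable would be swapped for the real forward pass (for example `net.forward()` in pycaffe). It discards a few warm-up calls, since the first passes usually include one-off setup costs, and reports the median of the remaining runs.

```python
import statistics
import time


def time_forward(forward, warmup=3, runs=10):
    """Return the median wall-clock latency of forward() in milliseconds."""
    # Warm-up calls are discarded: the first passes often include one-off
    # costs such as memory allocation and library initialisation.
    for _ in range(warmup):
        forward()

    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        forward()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples_ms)


def fake_forward():
    # Hypothetical stand-in for the real call (e.g. net.forward() in pycaffe).
    time.sleep(0.005)


if __name__ == "__main__":
    print("median latency: %.1f ms" % time_forward(fake_forward))
```

Using the median rather than the mean keeps a single slow run from skewing the result.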