I have essentially the same question as the original poster: what modifications are needed to make the classification example run on a GPU? I have tried setting the mode and the device, but my GPU remains idle and everything clearly runs on the CPU. I have CUDA 8 installed, and Caffe was built with CUDA support.
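For reference, this is roughly how I'm selecting the GPU (a minimal sketch; device ID 0 is just an assumption for my single-GPU machine):

```cpp
#include <caffe/caffe.hpp>

// Called once at startup, before the Classifier / Net is constructed,
// so that subsequent layer setup happens in GPU mode.
// (Device 0 assumes a single-GPU machine.)
void UseGpu()
{
  caffe::Caffe::SetDevice(0);
  caffe::Caffe::set_mode(caffe::Caffe::GPU);
}
```

As far as I can tell from the headers, these calls should be enough; the net itself is only constructed after this runs.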
The WrapInputLayer member function specifically calls mutable_cpu_data(); is this correct, or does it need to change for GPU execution (see below)?
void Classifier::WrapInputLayer(std::vector<cv::Mat>* input_channels)
{
  Blob<float>* input_layer = net_->input_blobs()[0];
  int width = input_layer->width();
  int height = input_layer->height();
  // Host-side pointer into the input blob; each cv::Mat below aliases
  // one channel of this buffer rather than owning its own data.
  float* input_data = input_layer->mutable_cpu_data();
  for (int i = 0; i < input_layer->channels(); ++i)
  {
    cv::Mat channel(height, width, CV_32FC1, input_data);
    input_channels->push_back(channel);
    input_data += width * height;
  }
}
There doesn't appear to be any useful debug output explaining why the GPU is not being used, unless I'm missing something. It seems to me that if I explicitly set the mode to GPU, it should either run on the GPU or fail with a useful error message.
-Dave