Bad performance with pthread and GPU

Francisco José Sánchez Sánchez

Dec 23, 2015, 6:46:54 AM
to Caffe Users
Hi all,

I'm developing a server for classifying images. I have one thread for classifying and another for receiving the image. For classification I use the MemoryDataLayer and cv::Mat. I only time the ForwardPrefilled call, and I get ~0.05s without threads and ~0.8s with threads. Do you have any idea what could be causing this? (A rough sketch of how the classify thread is run is included after the Predict code below.)

    caffe::Caffe::set_mode(caffe::Caffe::GPU);
    caffe::Caffe::SetDevice(0);

    m_net = new caffe::Net<float>(MODEL_FILE, caffe::TEST);
    m_net->CopyTrainedLayersFrom(TRAINED_FILE);

----------------------------------
float* Predict(const cv::Mat *image)
{
    float loss = 0.0;
    std::vector<cv::Mat> patches;
    boost::shared_ptr<caffe::MemoryDataLayer<float> > memory_data_layer;
    std::vector<int> labels(1);

    // set the patch for testing
    patches.push_back(*image);

    // push vector<Mat> to data layer
    memory_data_layer = boost::static_pointer_cast<caffe::MemoryDataLayer<float> >(m_net->layer_by_name("data"));
    memory_data_layer->AddMatVector(patches, labels);

    // Net forward
    const std::vector<caffe::Blob<float>*>& results = m_net->ForwardPrefilled(&loss);

    float *output = results[1]->mutable_cpu_data();

    return output;
}
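
The classify thread is launched with pthreads and timed roughly like this. This is just a simplified sketch: the real server loops over a queue of received images, in my measurements the timer wraps only the ForwardPrefilled call inside Predict, and names like ClassifyThread are placeholders.

#include <chrono>
#include <iostream>
#include <pthread.h>

// Entry point of the classify thread. Times one call to Predict(), which
// runs ForwardPrefilled internally.
void* ClassifyThread(void* arg)
{
    cv::Mat* image = static_cast<cv::Mat*>(arg);

    auto start = std::chrono::steady_clock::now();
    float* output = Predict(image);
    auto stop = std::chrono::steady_clock::now();

    std::cout << "forward took "
              << std::chrono::duration<double>(stop - start).count()
              << " s (first score " << output[0] << ")" << std::endl;
    return NULL;
}

// Launched from the main thread with:
//   pthread_t tid;
//   pthread_create(&tid, NULL, ClassifyThread, &image);
//   pthread_join(tid, NULL);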



Thanks

Francisco José Sánchez Sánchez

Dec 23, 2015, 7:59:36 AM
to Caffe Users
I solved my problem. For some reason, when I run with threads, Caffe switches to CPU mode. My solution is to set GPU mode before each prediction.
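
Concretely, the workaround is roughly this; only the first lines of Predict change, the rest stays as in my previous post:

float* Predict(const cv::Mat *image)
{
    // Re-select the GPU in the calling thread; without this the forward
    // pass runs in CPU mode when Predict is called from the worker thread.
    caffe::Caffe::set_mode(caffe::Caffe::GPU);
    caffe::Caffe::SetDevice(0);

    // ... rest of Predict unchanged ...
}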

Felix Abecassis

Dec 24, 2015, 5:26:41 AM
to Caffe Users
This is because the Caffe context is kept in thread-local storage:
https://github.com/BVLC/caffe/blob/master/src/caffe/common.cpp#L13
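
Each thread gets its own Caffe singleton, and a fresh singleton defaults to CPU mode. So instead of setting the mode before every prediction, it should be enough to set it once at the start of each thread that uses the net, something like this (untested sketch, the function name is just an example):

// Run once at the top of every thread that calls into the network.
void* ClassifyThreadMain(void* arg)
{
    caffe::Caffe::set_mode(caffe::Caffe::GPU);
    caffe::Caffe::SetDevice(0);

    // ... classification loop calling Predict() ...
    return NULL;
}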