Hi all,
I am new to boost::thread, and I am trying to understand how Caffe uses multithreading to do data prefetching.
My understanding of prefetching is that a thread is started at the beginning to fetch data into a queue. Afterwards, during each step, the data layer only needs to read data from the queue.
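To make that mental model concrete, here is a minimal sketch of the queue-based scheme I have in mind. It uses std::thread instead of boost::thread for brevity, and Batch, BatchQueue, and RunPrefetchDemo are hypothetical names for illustration, not anything from Caffe.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for one mini-batch read from disk.
using Batch = std::vector<int>;

// Bounded queue: the producer blocks when it is full,
// the consumer blocks when it is empty.
class BatchQueue {
 public:
  explicit BatchQueue(std::size_t capacity) : capacity_(capacity) {}

  void Push(Batch b) {
    std::unique_lock<std::mutex> lock(mu_);
    not_full_.wait(lock, [this] { return q_.size() < capacity_; });
    q_.push(std::move(b));
    not_empty_.notify_one();
  }

  Batch Pop() {
    std::unique_lock<std::mutex> lock(mu_);
    not_empty_.wait(lock, [this] { return !q_.empty(); });
    Batch b = std::move(q_.front());
    q_.pop();
    not_full_.notify_one();
    return b;
  }

 private:
  std::size_t capacity_;
  std::mutex mu_;
  std::condition_variable not_full_;
  std::condition_variable not_empty_;
  std::queue<Batch> q_;
};

// Starts a background thread that "loads" num_batches batches, consumes them
// on the calling thread, and returns the sum of all values read.
int RunPrefetchDemo(int num_batches, int batch_size) {
  BatchQueue queue(2);  // at most two batches are buffered ahead
  std::thread producer([&] {
    for (int i = 0; i < num_batches; ++i) {
      queue.Push(Batch(batch_size, i));  // pretend this is a disk read
    }
  });
  int total = 0;
  for (int i = 0; i < num_batches; ++i) {
    Batch b = queue.Pop();  // the "data layer" only reads from the queue
    for (int v : b) total += v;
  }
  producer.join();
  return total;
}
```

Here the consumer never touches the disk itself; it only blocks in Pop() when the producer has fallen behind.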
But I am confused after reading base_data_layer.cpp:
template <typename Dtype>
void BasePrefetchingDataLayer<Dtype>::Forward_gpu(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
  // First, join the thread
  JoinPrefetchThread();
  // Reshape to loaded data.
  top[0]->ReshapeLike(this->prefetch_data_);
  // Copy the data
  caffe_copy(prefetch_data_.count(), prefetch_data_.cpu_data(),
             top[0]->mutable_gpu_data());
  if (this->output_labels_) {
    // Reshape to loaded labels.
    top[1]->ReshapeLike(prefetch_label_);
    // Copy the labels.
    caffe_copy(prefetch_label_.count(), prefetch_label_.cpu_data(),
               top[1]->mutable_gpu_data());
  }
#ifdef USE_MPI
  // advance (all_rank - (my_rank+1)) mini-batches to be ready for next run
  BaseDataLayer<Dtype>::OffsetCursor(top[0]->num() * (Caffe::MPI_all_rank() - 1));
#endif
  // Start a new prefetch thread
  CreatePrefetchThread();
}
It seems that on every call to net.forward, the data layer runs JoinPrefetchThread(), which blocks the main thread and reads a batch of data from disk.
If this is the case, how can this be considered "prefetching"? I think we should read from a queue that is simultaneously being fed by another thread.
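For comparison, here is a minimal sketch of the join-then-restart pattern as I read it from the listing above, again with std::thread and hypothetical names (OneBatchPrefetcher and a fake disk read). It assumes the first load is started once at setup, which the snippet does not show. In this sketch the load for the next batch runs in the background between two Forward() calls, so the join only waits for whatever part of that load has not finished yet.

```cpp
#include <thread>
#include <vector>

// Hypothetical single-buffer prefetcher mirroring the join/create pattern.
class OneBatchPrefetcher {
 public:
  // Assumption: the first load is kicked off once at setup time.
  OneBatchPrefetcher() { CreatePrefetchThread(); }
  ~OneBatchPrefetcher() { JoinPrefetchThread(); }

  // Analogue of Forward_gpu: wait for the pending load, copy it out,
  // then start loading the next batch in the background.
  std::vector<int> Forward() {
    JoinPrefetchThread();
    std::vector<int> top = prefetch_data_;
    CreatePrefetchThread();
    return top;
  }

 private:
  void CreatePrefetchThread() {
    thread_ = std::thread([this] {
      prefetch_data_.assign(4, next_batch_id_);  // pretend this is a disk read
      ++next_batch_id_;
    });
  }

  void JoinPrefetchThread() {
    if (thread_.joinable()) thread_.join();
  }

  int next_batch_id_ = 0;
  std::vector<int> prefetch_data_;
  std::thread thread_;
};
```

Compared with a queue, this keeps only one batch in flight, so the buffer is never read and written at the same time and needs no lock.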
Can anyone shed some light on where my understanding is wrong? I would appreciate it a lot!