How does Caffe transform the 4-D weights in blobs_[0] to 2-D in base_conv_layer.cpp?


Lazio

Jun 20, 2018, 4:30:24 AM
to Caffe Users
Hello everyone!

I'm studying the Caffe source code, in particular how Caffe performs convolution, so I have been reading base_conv_layer.cpp, base_conv_layer.hpp, conv_layer.cpp and conv_layer.hpp. I have learned that Caffe implements convolution using im2col and col2im. im2col transforms the 3-D input data into a 2-D matrix, and caffe_cpu_gemm, which is built on the BLAS gemm routine, then carries out the convolution as a matrix-matrix product. During the forward pass, two 2-D matrices are passed to caffe_cpu_gemm to compute the output of the convolution layer: one is the im2col-transformed input data from bottom, and the other is the weights of this layer. Both have to be 2-D matrices, because that is what BLAS gemm expects.
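To make the im2col-then-GEMM idea above concrete, here is a minimal sketch for a single image. It is not Caffe's code: `im2col_simple`, `gemm_naive` and `conv_via_gemm` are hypothetical names, the sketch assumes stride 1 with no padding, dilation or groups, and a plain triple loop stands in for the BLAS call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified im2col: stride 1, no padding, no dilation.
// Input  img: [C, H, W] row-major.
// Output col: [C*KH*KW, OH*OW] row-major, one column per output pixel.
std::vector<float> im2col_simple(const std::vector<float>& img,
                                 int C, int H, int W, int KH, int KW) {
  assert(img.size() == static_cast<std::size_t>(C) * H * W);
  const int OH = H - KH + 1, OW = W - KW + 1;
  std::vector<float> col;
  col.reserve(static_cast<std::size_t>(C) * KH * KW * OH * OW);
  for (int c = 0; c < C; ++c)
    for (int kh = 0; kh < KH; ++kh)
      for (int kw = 0; kw < KW; ++kw)      // one row per (c, kh, kw)
        for (int oh = 0; oh < OH; ++oh)
          for (int ow = 0; ow < OW; ++ow)  // one column per (oh, ow)
            col.push_back(
                img[(static_cast<std::size_t>(c) * H + oh + kh) * W + ow + kw]);
  return col;
}

// Naive row-major GEMM standing in for BLAS: out = A (M x K) * B (K x N).
std::vector<float> gemm_naive(const float* A, const float* B,
                              int M, int K, int N) {
  std::vector<float> out(static_cast<std::size_t>(M) * N, 0.0f);
  for (int m = 0; m < M; ++m)
    for (int k = 0; k < K; ++k)
      for (int n = 0; n < N; ++n)
        out[m * N + n] += A[m * K + k] * B[k * N + n];
  return out;
}

// Convolution (cross-correlation, no kernel flip) of one image as one product.
// weights_2d: [O, C*KH*KW] row-major. Returns [O, OH*OW] row-major.
std::vector<float> conv_via_gemm(const std::vector<float>& img,
                                 const std::vector<float>& weights_2d,
                                 int O, int C, int H, int W, int KH, int KW) {
  const int K = C * KH * KW;
  const int N = (H - KH + 1) * (W - KW + 1);
  assert(weights_2d.size() == static_cast<std::size_t>(O) * K);
  const std::vector<float> col = im2col_simple(img, C, H, W, KH, KW);
  return gemm_naive(weights_2d.data(), col.data(), O, K, N);
}
```

For example, calling `conv_via_gemm` on a 1x3x3 image holding 1..9 with a single 2x2 filter of ones yields the four 2x2 window sums.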

In base_conv_layer.cpp, blobs_[0] holds the filter weights, and this->blobs_[0].reset(new Blob<Dtype>(weight_shape)) initializes them. weight_shape is 4-D:
vector<int> weight_shape(2);  
  weight_shape[0] = conv_out_channels_;
  weight_shape[1] = conv_in_channels_ / group_;
  for (int i = 0; i < num_spatial_axes_; ++i) {  
    weight_shape.push_back(kernel_shape_data[i]);  
  }

During the forward pass, Caffe proceeds like this:
template <typename Dtype>  
void ConvolutionLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,  
      const vector<Blob<Dtype>*>& top) {  
  const Dtype* weight = this->blobs_[0]->cpu_data();
  for (int i = 0; i < bottom.size(); ++i) {  
    const Dtype* bottom_data = bottom[i]->cpu_data();
    Dtype* top_data = top[i]->mutable_cpu_data();  
    for (int n = 0; n < this->num_; ++n) {
      this->forward_cpu_gemm(bottom_data + n * this->bottom_dim_, weight,  
          top_data + n * this->top_dim_);  
      if (this->bias_term_) {
        const Dtype* bias = this->blobs_[1]->cpu_data();  
        this->forward_cpu_bias(top_data + n * this->top_dim_, bias);
      }  
    }  
  }  
}  

Caffe appears to use the 4-D weight directly, and it ends up being passed to caffe_cpu_gemm. bottom_data is transformed from 3-D to 2-D by the im2col function, but I cannot find where the weight is transformed from a 4-D blob into a 2-D matrix. I cannot find any code that reshapes the 4-D weight.

Please help me! This question has bothered me so much that I'm not even in the mood to watch the World Cup.

Thank you!

Przemek D

Aug 22, 2018, 12:01:20 PM
to Caffe Users
If you're trying to understand how Caffe does the convolution, I think you'll find Yangqing Jia's note on this helpful, as he originally implemented it.
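
On the specific question of where the 4-D weight becomes 2-D: as far as I can tell, there is no explicit transform. A Blob keeps its data in one contiguous row-major buffer, and forward_cpu_gemm in base_conv_layer.cpp hands that raw pointer to caffe_cpu_gemm together with the matrix dimensions, so the same memory is simply read as a matrix with conv_out_channels_ / group_ rows and kernel_dim_ columns per group. The sketch below checks the index arithmetic behind that claim; `offset4d`, `offset2d` and `views_agree` are hypothetical helper names, not Caffe functions, and it assumes group_ = 1 and two spatial axes.

```cpp
#include <cassert>
#include <cstddef>

// Row-major offset of element (o, c, kh, kw) in a 4-D array of shape
// [O, C, KH, KW]. The extent O is not needed to compute the offset.
std::size_t offset4d(int o, int c, int kh, int kw, int C, int KH, int KW) {
  return ((static_cast<std::size_t>(o) * C + c) * KH + kh) * KW + kw;
}

// Row-major offset of element (row, col) in a 2-D matrix with K columns.
std::size_t offset2d(int row, int col, int K) {
  return static_cast<std::size_t>(row) * K + col;
}

// Returns true if every 4-D element (o, c, kh, kw) sits at the same offset
// as 2-D element (o, (c*KH + kh)*KW + kw) of an O x (C*KH*KW) matrix,
// i.e. the 4-D buffer can be read as that matrix without copying.
bool views_agree(int O, int C, int KH, int KW) {
  assert(O > 0 && C > 0 && KH > 0 && KW > 0);
  const int K = C * KH * KW;  // plays the role of kernel_dim_
  for (int o = 0; o < O; ++o)
    for (int c = 0; c < C; ++c)
      for (int kh = 0; kh < KH; ++kh)
        for (int kw = 0; kw < KW; ++kw) {
          const int col = (c * KH + kh) * KW + kw;
          if (offset4d(o, c, kh, kw, C, KH, KW) != offset2d(o, col, K))
            return false;
        }
  return true;
}
```

Each output channel is therefore one row of the weight matrix, and the (channel, kernel row, kernel column) ordering of that row matches the row ordering of the im2col buffer, which is why the matrix product gives the convolution result.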