Explanation of how FC choose the features from Convolutional Layer


Antonio Paes

May 12, 2016, 9:21:22 PM5/12/16
to Caffe Users
Hi guys,

I'm trying to "go deeper" on CNNs. I understand how the filters are computed and the purpose of the pooling, normalization, ReLU, and dropout layers. I have worked through a forward pass and backpropagation by hand, step by step, on a classic fully connected layer, and that's fine. But one thing I haven't understood yet: how does the FC layer choose which features to use from the convolutional layer?

Let me try to formulate it more specifically:

Suppose I feed my network one RGB image of size 86x86 for training. For simplicity I'm using a mini-batch of 1 image. After the convolution and pooling layers comes the FC layer, and here is my doubt:

The top shape of my convolution layer is 1 256 3 3, and I feed the first FC layer with this; the num_output of the FC layer is 160. I know that this convolution output will be flattened into an array of size 1*256*3*3 = 2304. So I don't understand how the FC layer chooses features inside an array of size 2304 to produce an FC output of size 160.

And my next FC layer has this same output size; what happens there?

Finally, my last FC layer has a num_output equal to the number of classes, which is then fed to the Softmax.

Can somebody help me?

Jan

May 13, 2016, 4:17:50 AM5/13/16
to Caffe Users
I am not sure what you mean by the FC layer "choosing features". An FC layer basically multiplies the input vector (e.g. of size 2304) with a matrix (in this case of size 160x2304) and adds a bias to produce an output vector (here of size 160). During backpropagation the weights (the values in the matrix and the bias) are adjusted by gradient descent to better produce the desired output. There is no "choosing" of features; it just works with everything it gets.

The next layer then has a matrix of size 160x160, if its output is also set to 160.

If that doesn't answer your question maybe you can clarify?
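A minimal NumPy sketch of the computation described above (array names and the random weights are illustrative only, not Caffe's internals):

```python
import numpy as np

# Illustrative sizes from the thread: conv output 1x256x3x3, fc num_output 160.
x = np.random.randn(1, 256, 3, 3)      # conv layer output (mini-batch of 1)
x_flat = x.reshape(1, -1)              # flatten to 1x2304

W = np.random.randn(160, 2304) * 0.01  # weight matrix (num_output x input_dim)
b = np.zeros(160)                      # bias vector

y = x_flat @ W.T + b                   # fc forward pass: every input feeds every output
print(y.shape)                         # (1, 160)
```

Note that every one of the 2304 inputs contributes to every one of the 160 outputs; the relative importance of each input is encoded in the learned weights, not in any selection step.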

Jan

Antonio Paes

May 13, 2016, 8:22:48 AM5/13/16
to Caffe Users

That's almost it. When you say that an FC layer multiplies the input vector (2304) with a matrix (160x2304) to produce my desired output of size 160, how is this calculation done?

And suppose it is the first pass of backprop: what values are in the matrix (160x2304) for that first multiplication? Are they random values?


Thanks Jan.

Jan

May 13, 2016, 9:34:11 AM5/13/16
to Caffe Users
Yes, usually the weights of the layers (for an fc layer, the values in the matrix and the biases) are initialized with small random values. There are different kinds of randomness (e.g. uniform, Gaussian), so there are options in the prototxt for how to initialize those parameters. The backprop passes will then shape these values so that, in the best case, the network predicts the (provided) label for every training image.

Since this training/shaping of weights has mathematical pitfalls (local minima, slow convergence) as well as practical issues (what is a good learning rate? How do momentum, regularization, etc. affect learning? Weighing generalization vs. precision, ...), there is no guarantee whatsoever that a specific network with specific learning parameters and training set actually learns anything. Actually, one should look at it the other way: it is amazing that it actually seems to work in so many instances (as shown in research papers).
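A rough sketch of the two random fillers mentioned above (the std and range values here are arbitrary illustrations; in Caffe the actual filler type and parameters are set in the prototxt):

```python
import numpy as np

fan_in = 2304      # input size of the fc layer from the thread
num_output = 160   # fc num_output from the thread

# "gaussian" filler: small zero-mean normal values (std chosen for illustration)
W_gauss = np.random.normal(0.0, 0.01, size=(num_output, fan_in))

# "uniform" filler: small symmetric range (bounds chosen for illustration)
W_unif = np.random.uniform(-0.05, 0.05, size=(num_output, fan_in))

# Biases are usually initialized to a constant, often zero
b = np.zeros(num_output)
```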

Jan

Antonio Paes

May 13, 2016, 9:51:53 AM5/13/16
to Caffe Users
Hey Jan, I think I understood: it is simply a multiplication between matrices, right?

Conv = 256x3x3 = 1x2304
FC = 160

So I have (1x2304) × (2304x160) = 1x160, right?
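This shape bookkeeping can be checked in NumPy (weights arranged here as 2304x160 to match the post; internally the matrix may be stored transposed):

```python
import numpy as np

x = np.ones((1, 2304))     # flattened conv output: 1x2304
W = np.ones((2304, 160))   # fc weights arranged as 2304x160
y = x @ W                  # matrix product: (1x2304)(2304x160) = 1x160
print(y.shape)             # (1, 160)
```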

Jan

May 13, 2016, 11:14:21 AM5/13/16
to Caffe Users
Yes, that is correct.

Jan

Antonio Paes

May 13, 2016, 1:42:58 PM5/13/16
to Caffe Users
Thanks Jan.