Why does the ResNet prototxt set all layers (first part) below the layer conv1?

john1...@gmail.com
Feb 13, 2017, 7:45:06 AM
to Caffe Users
Hello all, I am reading the ResNet architecture, and I am confused about why they set all layers below the layer conv1. If I understand correctly, these layers are stacked, so the bottom of the pooling layer (line 42) should be bottom: "conv1_relu" instead of bottom: "conv1". I found that giving the same name to the bottom and top blobs (in-place operations) saves some memory. But they used conv1 as the bottom name everywhere, instead of bottom: "bn_conv1", top: "bn_conv1" (lines 25-26) and bottom: "scale_conv1", top: "scale_conv1" (lines 35-36). Why did they do it? Is there any benefit?

Reference:

 1  layer {
 2      bottom: "data"
 3      top: "conv1"
 4      name: "conv1"
 5      type: "Convolution"
 6      convolution_param {
 7          num_output: 64
 8          kernel_size: 7
 9          pad: 3
10          stride: 2
11      }
12  }
13
14  layer {
15      bottom: "conv1"
16      top: "conv1"
17      name: "bn_conv1"
18      type: "BatchNorm"
19      batch_norm_param {
20          use_global_stats: true
21      }
22  }
23
24  layer {
25      bottom: "conv1"
26      top: "conv1"
27      name: "scale_conv1"
28      type: "Scale"
29      scale_param {
30          bias_term: true
31      }
32  }
33
34  layer {
35      bottom: "conv1"
36      top: "conv1"
37      name: "conv1_relu"
38      type: "ReLU"
39  }
40
41  layer {
42      bottom: "conv1"
43      top: "pool1"
44      name: "pool1"
45      type: "Pooling"
46      pooling_param {
47          kernel_size: 3
48          stride: 2
49          pool: MAX
50      }
51  }

Przemek D
Feb 14, 2017, 3:28:17 AM
to Caffe Users
The names that layers have are pretty much only used to bind weights to them; otherwise you don't use them in a prototxt. Blobs have names that are crucial in your prototxt, as they are used to control data flow through the network.
You connect layers with blobs, not with other layers. By writing:
layer {
  name: "does_not_matter_anyway"
  bottom: "A"
  top: "B"
  ...
}
you say: make me a layer that takes blob A below and produces blob B. If A == B, this means: make me a layer that takes blob A, does something with its data, and does not produce any new blob but leaves its output in the same blob.
What name you choose for this layer is irrelevant. You don't have to know what the layers below are called or which layers will come above.
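To make the in-place case concrete, here is a minimal sketch (the blob name "A" and the layer name are placeholders, not taken from any real net):

layer {
  name: "relu_on_A"
  type: "ReLU"
  bottom: "A"   # read from blob A...
  top: "A"      # ...and write the result back into the same blob (in-place)
}

ReLU, BatchNorm and Scale are the usual candidates for this, because they can overwrite their input without anything downstream needing the original data.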

The benefit shows (for example) when you transfer some weights from a pretrained model and initialize others yourself: you only have to change the names of the respective layers - it's just one edit per layer. If layers were connected to other layers, you'd have to look through the entire network definition and replace every relevant name.
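For instance, a minimal fine-tuning sketch (the layer name fc8_new and the num_output value are invented for illustration): you rename only the layer and keep the blob names, so Caffe re-initializes just that layer because no pretrained weights match the new name:

layer {
  name: "fc8_new"      # was "fc8" in the pretrained model, so its weights are not copied
  type: "InnerProduct"
  bottom: "fc7"        # blob names unchanged, so the data flow is unchanged
  top: "fc8"
  inner_product_param {
    num_output: 10     # e.g. a new task with 10 classes
  }
}

Nothing downstream needs editing, because downstream layers reference the blob "fc8", not the layer.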

john1...@gmail.com
Feb 14, 2017, 3:44:12 AM
to Caffe Users
Thanks. Let's look at lines 41-51:
layer {
    bottom: "conv1"
    top: "pool1"
    name: "pool1"
    type: "Pooling"
    pooling_param {
        kernel_size: 3
        stride: 2
        pool: MAX
    }
}
The `pool1` layer is below `conv1`, which means its input is the `conv1` output. Is that right? However, when we look at the visualization, the input of the pooling layer comes from `conv1_relu` (line 37). Why do they not write it as
layer {
    bottom: "conv1_relu"
    top: "pool1"
    name: "pool1"
    type: "Pooling"
    pooling_param {
        kernel_size: 3
        stride: 2
        pool: MAX
    }
}

Przemek D
Feb 14, 2017, 4:04:21 AM
to Caffe Users
The `pool1` layer is below `conv1`, which means its input is the `conv1` output. Is that right?
Yes it is true in this case, but your reasoning behind it is not right. The pool1 layer's bottom is a blob called conv1. It doesn't know which layer produced it - it's just a naming coincidence that a layer conv1 produced a blob conv1.
 
Why do they not write it as (...)
Because if you write:
  bottom: "conv1_relu"
you're saying "this layer inputs a blob conv1_relu", and there is no such blob. There is a layer of that name, but as I said, layers connect to blobs, not to other layers.
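As a sketch of what would happen (pooling parameters copied from the ResNet stem above):

layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1_relu"   # no blob of this name exists at this point
  top: "pool1"
  pooling_param {
    kernel_size: 3
    stride: 2
    pool: MAX
  }
}

Caffe should refuse to construct this net, failing with an error along the lines of "Unknown bottom blob 'conv1_relu'", because only blob names count when the graph is wired up.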

john1...@gmail.com
Feb 14, 2017, 5:01:54 AM
to Caffe Users
Sorry, I did not really understand your point. Let's look at the LeNet prototxt in examples/mnist:

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  ...
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  ...
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  ...
}
LeNet is clearly `conv1->pool1->conv2`. Hence, they write bottom: "conv1" in the pool1 layer and bottom: "pool1" in the conv2 layer.
If I followed the same style as ResNet, I would write bottom: "conv1" in the pool1 layer and bottom: "conv1" in the conv2 layer. So the new prototxt would be:
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  ...
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  ...
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "conv1"
  top: "conv2"
  ...
}

Is it the same as the original LeNet?

Przemek D
Feb 14, 2017, 6:17:17 AM
to Caffe Users
My point was just that a layer is not a blob. A layer might input or output a blob, and might even have the same name as a blob, but it is not a blob. The LeNet example could be written as:
layer {
  name: "my_convolution_layer"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  ...
}
layer {
  name: "pooling_is_awesome"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  ...
}
layer {
  name: "another_conv_yeah"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  ...
}
and it would work the same.


I don't understand what your point is here, but if you write:

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  ...
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  ...
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "conv1"
  top: "conv2"
  ...
}
then you first do data->conv1, then conv1->pool1 and then conv1->conv2. It doesn't seem to make any sense because blob pool1 seems to be unused (unless you use it somewhere else in the network), but it's technically correct.
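To visualize (a sketch; [brackets] are layers, bare names are blobs):

data --> [conv1] --> conv1 --+--> [pool1] --> pool1   (a dead end, unless used elsewhere)
                             +--> [conv2] --> conv2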

john1...@gmail.com
Feb 14, 2017, 6:38:54 AM
to Caffe Users
Yes, I wrote all the bottoms as "conv1" because I wanted to mirror the ResNet above. As you can see, the ResNet prototxt also writes:
layer {
  bottom: "data"
  top: "conv1"
  name: "conv1"
  type: "Convolution"
  ...
}
layer {
  bottom: "conv1"
  top: "conv1"
  name: "bn_conv1"
  ...
}
layer {
  bottom: "conv1"
  top: "conv1"
  name: "scale_conv1"
  ...
}
layer {
  bottom: "conv1"
  top: "conv1"
  name: "conv1_relu"
  type: "ReLU"
}
layer {
  bottom: "conv1"
  top: "pool1"
  name: "pool1"
  type: "Pooling"
  ...
}

It does not follow the LeNet style.

john1...@gmail.com
Feb 14, 2017, 7:20:42 AM
to Caffe Users
After reading your answer more carefully, I think we may be talking past each other. Let me make my issue clearer.
Yes, I was wrong when calling a layer a blob. It must be a blob. A blob has an input (defined by the bottom tag) and an output (defined by the top tag).
Looking at my ResNet excerpt above, you can see that all of them have the same input, conv1 (highlighted). Is that wrong?

Przemek D
Feb 14, 2017, 8:42:30 AM
to Caffe Users
What you wrote above is correct. These are typical in-place operations, just as you mentioned in the first post. I assumed your confusion stemmed from a misunderstanding of layer and blob names, which hopefully are clear to you now. But I don't understand what you mean by "following LeNet style". ResNet and LeNet are two dramatically different architectures, and obviously they will not link layers in the same way.
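To spell out the equivalence, here is a sketch of a non-in-place rewrite of the ResNet stem (the *_out blob names are invented for illustration); it computes exactly the same thing but keeps every intermediate result in its own blob, costing extra memory:

layer { bottom: "data"       top: "conv1_out"  name: "conv1"       type: "Convolution" ... }
layer { bottom: "conv1_out"  top: "bn_out"     name: "bn_conv1"    type: "BatchNorm" ... }
layer { bottom: "bn_out"     top: "scale_out"  name: "scale_conv1" type: "Scale" ... }
layer { bottom: "scale_out"  top: "relu_out"   name: "conv1_relu"  type: "ReLU" }
layer { bottom: "relu_out"   top: "pool1"      name: "pool1"       type: "Pooling" ... }

The published prototxt simply reuses the single blob conv1 for all of these intermediate results, which is why every bottom in the stem reads "conv1".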


Yes, I was wrong when calling a layer a blob. It must be a blob. A blob has an input (defined by the bottom tag) and an output (defined by the top tag).
I assume this is just a typo, and you meant: A layer has an input (...) and an output (...).

john1...@gmail.com
Feb 14, 2017, 10:01:20 AM
to Caffe Users
Yes, I am only considering the first stacked part, without a branch (before the pool1 layer). In this first part, the structure is `data->conv1->bn_conv1->scale_conv1->conv1_relu`. In LeNet, the bottom usually indicates the connection between blobs. Hence, I think the bottom value of the conv1_relu layer should be bottom: "scale_conv1" instead of bottom: "conv1". You can see the visualization of ResNet using Netscope. Sorry, I cannot post it here due to space.