I read the descriptions of the "Scale" layer in "caffe.proto" and "scale_layer.hpp".
From caffe.proto:
// For example, if bottom[0] is 4D with shape 100x3x40x60, the output
// top[0] will have the same shape, and bottom[1] may have any of the
// following shapes (for the given value of axis):
// (axis == 0 == -4) 100; 100x3; 100x3x40; 100x3x40x60
// (axis == 1 == -3) 3; 3x40; 3x40x60
// (axis == 2 == -2) 40; 40x60
// (axis == 3 == -1) 60
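As I understand the comment above, bottom[1]'s shape must match a contiguous run of bottom[0]'s axes starting at `axis` (it is not numpy-style broadcasting over singleton dimensions). A rough check of that rule (my own sketch, `scale_shape_ok` is my name, not Caffe code):

```python
def scale_shape_ok(bottom_shape, scale_shape, axis):
    """Per my reading of the caffe.proto comment: scale_shape must
    equal a contiguous run of bottom_shape's axes starting at `axis`
    (negative axes count from the end, as in Caffe)."""
    if axis < 0:
        axis += len(bottom_shape)
    return tuple(bottom_shape[axis:axis + len(scale_shape)]) == tuple(scale_shape)

# The documented examples for a 100x3x40x60 bottom[0]:
print(scale_shape_ok((100, 3, 40, 60), (3, 40), 1))    # axis == 1  -> True
print(scale_shape_ok((100, 3, 40, 60), (40, 60), -2))  # axis == -2 -> True
print(scale_shape_ok((100, 3, 40, 60), (3, 60), 1))    # -> False
```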
From scale_layer.hpp:
* @brief Computes the elementwise product of two input Blobs, with the shape of
* the latter Blob "broadcast" to match the shape of the former.
* Equivalent to tiling the latter Blob, then computing the elementwise
* product. Note: for efficiency and convenience, this layer can
* additionally perform a "broadcast" sum too when `bias_term: true`
* is set.
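If I read the comment right, the forward pass is equivalent to the following numpy sketch (my own illustration, not Caffe code): pad bottom[1] with singleton axes so it aligns at `axis`, then let broadcasting do the tiling.

```python
import numpy as np

def scale_forward(bottom0, bottom1, axis):
    """Elementwise product with bottom1 'broadcast' over bottom0:
    insert singleton axes before and after bottom1's shape so it
    lines up at `axis`, then multiply (numpy tiles implicitly)."""
    if axis < 0:
        axis += bottom0.ndim
    shape = [1] * axis + list(bottom1.shape)
    shape += [1] * (bottom0.ndim - len(shape))
    return bottom0 * bottom1.reshape(shape)

x = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
s = np.array([1.0, 2.0, 3.0])        # shape (3,), matches x at axis 1
y = scale_forward(x, s, axis=1)
print(y.shape)                       # (2, 3, 4)
```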
Then I tried to compute the elementwise product of bottom[0] with shape (1, 2048, 72, 72) and bottom[1] with shape (1, 1, 72, 72).
The configuration of the "Scale" layer is as follows:
layer {
  name: "inception_attention_scale"
  type: "Scale"
  bottom: "inception_c2_concat"
  bottom: "inception_c1_attention"
  top: "inception_attention_scale"
  scale_param {
    axis: 0
    bias_term: false
    num_axes: -1
  }
}
Network prototxt is included in the attachment.
When I start training, Caffe aborts with the following check failure:
F20211208 18:22:14.814239 571 scale_layer.cpp:85] Check failed: bottom[0]->shape(axis_ + i) == scale->shape(i) (2048 vs. 1) dimension mismatch between bottom[0]->shape(1) and scale->shape(1)
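The failing check seems to compare bottom[0]'s axes one by one against bottom[1]'s, starting at `axis`; replaying that comparison on my shapes (my own sketch mirroring my reading of scale_layer.cpp, not Caffe code) reproduces the reported mismatch:

```python
bottom0 = (1, 2048, 72, 72)
bottom1 = (1, 1, 72, 72)   # num_axes == -1, so all 4 axes are compared
axis = 0

for i, dim in enumerate(bottom1):
    ok = bottom0[axis + i] == dim
    print(f"bottom[0]->shape({axis + i}) == scale->shape({i}): "
          f"{bottom0[axis + i]} vs. {dim} -> {ok}")
# i == 1 compares 2048 vs. 1, which is exactly the reported failure
```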
Is it possible to broadcast bottom[1] from (1, 1, 72, 72) to match bottom[0]'s shape (1, 2048, 72, 72) using only the "Scale" layer?
Or is there a problem with my "Scale" layer configuration?
Any help is appreciated.
Thanks!