Error while training a network for de-noising (network output is an image)

SRP

Apr 2, 2016, 3:26:08 AM

to Caffe Users

Hello everyone,

I am trying to use Caffe to build a network for de-noising. Unlike the (classification) examples provided in the Caffe GitHub repository and documentation, my network takes an image as input and outputs another image, not a single integer label.

After going through the code, the issues, and the mailing list, I could see that this is indeed possible in Caffe.

I have prepared my dataset based on the code by @shelhamer mentioned in this issue: https://github.com/BVLC/caffe/issues/1698#issuecomment-70211045 using 50x50, 3-channel (RGB) PNG image files.

Here is a dummy network I am working with (see here for an easy visualization):

layer {
  type: "Data"
  top: "data"
  include {
    phase: TRAIN
  }
  data_param {
    source: "./new_50_train"
    batch_size: 1
    backend: LMDB
  }
}
layer {
  type: "Data"
  top: "res"
  include {
    phase: TRAIN
  }
  data_param {
    source: "./new_42_train"
    batch_size: 1
    backend: LMDB
  }
}
layer {
  type: "Data"
  top: "data"
  include {
    phase: TEST
  }
  data_param {
    source: "./new_50_test"
    batch_size: 1
    backend: LMDB
  }
}
layer {
  type: "Data"
  top: "res"
  include {
    phase: TEST
  }
  data_param {
    source: "./new_42_test"
    batch_size: 1
    backend: LMDB
  }
}
layer {
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 3
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  type: "Convolution"
  bottom: "conv1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 3
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  type: "Accuracy"
  bottom: "conv2"
  bottom: "res"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  type: "SoftmaxWithLoss"
  bottom: "conv2"
  bottom: "res"
  top: "loss"
}

When I run the above model, I am getting the following error:

I0402 02:11:52.000156 27783 layer_factory.hpp:77] Creating layer loss
I0402 02:11:52.000206 27783 net.cpp:91] Creating Layer loss
I0402 02:11:52.000231 27783 net.cpp:425] loss <- conv2
I0402 02:11:52.000264 27783 net.cpp:425] loss <- res
I0402 02:11:52.000301 27783 net.cpp:399] loss -> loss
I0402 02:11:52.000358 27783 layer_factory.hpp:77] Creating layer loss
F0402 02:11:52.000705 27783 softmax_loss_layer.cpp:47] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (1764 vs. 5292) Number of labels must match number of predictions; e.g., if softmax axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.
*** Check failure stack trace: ***
@ 0x2b0c90171daa (unknown)
@ 0x2b0c90171ce4 (unknown)
@ 0x2b0c901716e6 (unknown)
@ 0x2b0c90174687 (unknown)
@ 0x2b0c8f53c7a7 caffe::SoftmaxWithLossLayer<>::Reshape()
@ 0x2b0c8f4b0507 caffe::Layer<>::SetUp()
@ 0x2b0c8f49c581 caffe::Net<>::Init()
@ 0x2b0c8f49a917 caffe::Net<>::Net()
@ 0x2b0c8f4c5b9b caffe::Solver<>::InitTrainNet()
@ 0x2b0c8f4c53be caffe::Solver<>::Init()
@ 0x2b0c8f4c4e5a caffe::Solver<>::Solver()
@ 0x2b0c8f472bab caffe::SGDSolver<>::SGDSolver()
@ 0x2b0c8f4831fb caffe::Creator_SGDSolver<>()
@ 0x41b0cf caffe::SolverRegistry<>::CreateSolver()
@ 0x41676c train()
@ 0x418c01 main
@ 0x2b0c914a9ec5 (unknown)
@ 0x4155b9 (unknown)
@ (nil) (unknown)
make: *** [new] Aborted (core dumped)

I have hit a dead end: I can neither understand the error nor fix it. Any pointers or suggestions that help me resolve this would be highly appreciated.

Since the documentation for this is very sparse, if I get this working, I would love to spend some time to contribute back by writing a tutorial or updating the docs so that others working on a similar problem can get started easily.

Jan

Apr 15, 2016, 7:31:12 AM

to Caffe Users

What are you trying to achieve? Are you doing classification at the pixel level? In that case you should set the softmax axis. Or are you predicting the image at a different resolution? Then this is a regression task, and you should use a Euclidean loss rather than softmax.
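To illustrate the regression option, here is a rough pure-Python sketch of what a Euclidean loss computes: the sum of squared differences between prediction and target, divided by 2N (the formula Caffe documents for its "EuclideanLoss" layer). The function name euclidean_loss and the sample values are made up for illustration only; this is not Caffe code.

```python
def euclidean_loss(pred, target, batch_size):
    """Sum of squared differences divided by 2 * batch_size.

    pred and target are flat lists of floats (hypothetical stand-ins for
    the flattened prediction and target blobs); they must have equal length.
    """
    if len(pred) != len(target):
        raise ValueError("prediction and target must have the same count")
    squared_error = sum((p - t) ** 2 for p, t in zip(pred, target))
    return squared_error / (2.0 * batch_size)


# Made-up pixel values: a 4-value "image" and its clean target.
pred = [0.2, 0.5, 0.9, 0.1]
target = [0.0, 0.5, 1.0, 0.0]
loss = euclidean_loss(pred, target, batch_size=1)
print(loss)
```

Unlike softmax, this compares the two blobs value by value, so the target can be a real-valued image with the same shape as the prediction instead of one integer label per pixel.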

About your mismatch error: have you noticed that 1764 * 3 == 5292? You probably have 3 channels on one side and only one on the other. What do the network setup messages in the log say about the blob sizes of res and conv2?
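The two numbers in the check can be reproduced with a little arithmetic. The sketch below assumes the values from the posted network (50x50 input, two 5x5 convolutions with stride 1 and no padding, batch size 1) and assumes that res is a 3-channel 42x42 image, which is a guess based on the LMDB name and the 5292 in the error. The helper conv_output_size is a name I made up for this example.

```python
def conv_output_size(size, kernel, stride=1, pad=0):
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * pad - kernel) // stride + 1


# Values from the posted network definition.
n, channels = 1, 3
side = 50
for _ in range(2):  # conv1, then conv2
    side = conv_output_size(side, kernel=5)

# With softmax axis 1, SoftmaxWithLoss expects one integer label per pixel:
# N * H * W values in total.
expected_labels = n * side * side

# Assumption: the target blob "res" holds a full 3-channel image,
# i.e. N * C * H * W values.
supplied_values = n * channels * side * side

print(side, expected_labels, supplied_values)  # 42 1764 5292
```

So the softmax loss wants 1764 labels (one class index per pixel), while the target blob supplies 5292 values (one per channel per pixel).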

Jan
