Study - learning a Gaussian filter with FCN


Hieu Do Trung

Oct 25, 2016, 10:50:57 PM
to Caffe Users
I am testing a Fully Convolutional Network in its simplest form: only one convolution layer.
The input is an image; the ground truth (label) is the same image blurred by a Gaussian filter.
I created two lmdbs for this purpose using convert_imageset with shuffling turned off.
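In case it helps, here is roughly the shape of the net. Layer names, lmdb paths, kernel size, and fillers below are only illustrative; the exact definition is in the attached train_val prototxt:

  # Sketch only - two parallel lmdbs feed the image and its blurred version,
  # one convolution tries to learn the Gaussian kernel, and a EuclideanLoss
  # compares the convolution output to the blurred label image.
  layer {
    name: "image"
    type: "Data"
    top: "image"
    include { phase: TRAIN }
    data_param { source: "image_lmdb" backend: LMDB batch_size: 1 }
  }
  layer {
    name: "label"
    type: "Data"
    top: "label"
    include { phase: TRAIN }
    data_param { source: "blurred_lmdb" backend: LMDB batch_size: 1 }
  }
  layer {
    name: "conv1"
    type: "Convolution"
    bottom: "image"
    top: "conv1"
    convolution_param {
      num_output: 3        # same channel count as the input image
      kernel_size: 5       # roughly the support of the Gaussian kernel
      pad: 2               # keep the output the same size as the input
      weight_filler { type: "gaussian" std: 0.01 }
    }
  }
  layer {
    name: "loss"
    type: "EuclideanLoss"
    bottom: "conv1"
    bottom: "label"
    top: "loss"
  }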

When training starts, the loss is very high and keeps exploding.

I1026 09:39:27.398694  7588 solver.cpp:280] Learning Rate Policy: step
I1026 09:39:27.400056  7588 solver.cpp:228] Iteration 0, loss = 1.5865e+08
I1026 09:39:27.400552  7588 solver.cpp:244]     Train net output #0: label-image = 0
I1026 09:39:27.400607  7588 solver.cpp:244]     Train net output #1: label-label = 0
I1026 09:39:27.400626  7588 solver.cpp:244]     Train net output #2: loss = 1.5865e+08 (* 1 = 1.5865e+08 loss)
I1026 09:39:27.400636  7588 sgd_solver.cpp:106] Iteration 0, lr = 0.0001
I1026 09:39:27.401329  7588 solver.cpp:228] Iteration 1, loss = 7.30608e+19
I1026 09:39:27.401844  7588 solver.cpp:244]     Train net output #0: label-image = 0
I1026 09:39:27.401867  7588 solver.cpp:244]     Train net output #1: label-label = 0
I1026 09:39:27.401885  7588 solver.cpp:244]     Train net output #2: loss = 7.30608e+19 (* 1 = 7.30608e+19 loss)
I1026 09:39:27.401890  7588 sgd_solver.cpp:106] Iteration 1, lr = 0.0001
I1026 09:39:27.402525  7588 solver.cpp:228] Iteration 2, loss = 2.07349e+31
I1026 09:39:27.403029  7588 solver.cpp:244]     Train net output #0: label-image = 0
I1026 09:39:27.403059  7588 solver.cpp:244]     Train net output #1: label-label = 0
I1026 09:39:27.403076  7588 solver.cpp:244]     Train net output #2: loss = 2.07349e+31 (* 1 = 2.07349e+31 loss)


If I use transform_param to scale the input down:

transform_param {
    scale: 0.00390625
  }

(0.00390625 is 1/256, so pixel values are mapped into roughly [0, 1]), the loss still explodes, only slightly more slowly:

I1026 09:49:33.776852  7730 solver.cpp:228] Iteration 0, loss = 3285.46
I1026 09:49:33.777329  7730 solver.cpp:244]     Train net output #0: label-image = 0
I1026 09:49:33.777371  7730 solver.cpp:244]     Train net output #1: label-label = 0
I1026 09:49:33.777389  7730 solver.cpp:244]     Train net output #2: loss = 3285.46 (* 1 = 3285.46 loss)
I1026 09:49:33.777411  7730 sgd_solver.cpp:106] Iteration 0, lr = 0.0001
I1026 09:49:33.778067  7730 solver.cpp:228] Iteration 1, loss = 297129
I1026 09:49:33.778553  7730 solver.cpp:244]     Train net output #0: label-image = 0
I1026 09:49:33.778580  7730 solver.cpp:244]     Train net output #1: label-label = 0
I1026 09:49:33.778589  7730 solver.cpp:244]     Train net output #2: loss = 297129 (* 1 = 297129 loss)
I1026 09:49:33.778604  7730 sgd_solver.cpp:106] Iteration 1, lr = 0.0001
I1026 09:49:33.779294  7730 solver.cpp:228] Iteration 2, loss = 1.36426e+07
I1026 09:49:33.779826  7730 solver.cpp:244]     Train net output #0: label-image = 0
I1026 09:49:33.779848  7730 solver.cpp:244]     Train net output #1: label-label = 0
I1026 09:49:33.779856  7730 solver.cpp:244]     Train net output #2: loss = 1.36426e+07 (* 1 = 1.36426e+07 loss)
I1026 09:49:33.779862  7730 sgd_solver.cpp:106] Iteration 2, lr = 0.0001
I1026 09:49:33.780575  7730 solver.cpp:228] Iteration 3, loss = 4.50344e+08
I1026 09:49:33.781025  7730 solver.cpp:244]     Train net output #0: label-image = 0
I1026 09:49:33.781045  7730 solver.cpp:244]     Train net output #1: label-label = 0
I1026 09:49:33.781054  7730 solver.cpp:244]     Train net output #2: loss = 4.50344e+08 (* 1 = 4.50344e+08 loss)


Any ideas why that is?
Attached are the train_val and solver prototxt files for this setup.

Thanks in advance.
solver.prototxt
train_val-.prototxt

Hieu Do Trung

Oct 25, 2016, 11:22:27 PM
to Caffe Users
I found out why.
Base learning rate and weight decay are way too high.
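For anyone hitting the same thing: with EuclideanLoss on image data the per-pixel errors (and therefore the gradients) are large, so the solver has to be much more conservative than typical classification defaults. The values below are only an illustration of the kind of change that helps, not the exact contents of the attached solver.prototxt:

  # Illustrative solver settings only (not the attached solver.prototxt).
  net: "train_val.prototxt"   # adjust to the actual net definition file
  base_lr: 1e-7               # much smaller than the 1e-4 shown in the logs above
  lr_policy: "step"
  gamma: 0.1
  stepsize: 10000
  momentum: 0.9
  weight_decay: 0             # little or no weight decay for this toy regression
  display: 20
  max_iter: 10000
  snapshot: 5000
  snapshot_prefix: "gaussian_fcn"
  solver_mode: GPU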