Hello. I'm not very experienced with Caffe and it seems I need some help.
I want to retrain GoogLeNet to reproduce the Deep Dream example.
So here is what I did: I collected 2893 examples, converted them all to 3-channel 256x256 .jpg files, and put each one into its own directory, so that every image is its own class.
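In case it matters, this is roughly how I laid out the directories (a minimal sketch; `prepare_layout` and the directory names are my own, made up for illustration):

```python
import os
import shutil
import tempfile

def prepare_layout(src_dir, dst_dir):
    """Give every .jpg in src_dir its own class directory under dst_dir,
    so that each image becomes a separate class."""
    names = sorted(f for f in os.listdir(src_dir) if f.endswith(".jpg"))
    for i, name in enumerate(names):
        class_dir = os.path.join(dst_dir, f"class_{i:04d}")
        os.makedirs(class_dir)
        shutil.copy(os.path.join(src_dir, name), class_dir)
    return len(names)

# demo on a throwaway directory with three dummy files
src = tempfile.mkdtemp()
dst = tempfile.mkdtemp()
for n in ("a.jpg", "b.jpg", "c.jpg"):
    open(os.path.join(src, n), "wb").close()
n_classes = prepare_layout(src, dst)
print(n_classes)  # 3 classes, one image each
```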
Then I downloaded Caffe, changed the three num_output values in train_val.prototxt from 1000 to 2893, and made the same 1000 → 2893 change in deploy.prototxt.
As far as I understand Caffe, that is everything needed for retraining (I also renamed the classification layers so I could load weights from the ImageNet-trained GoogLeNet).
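Concretely, one of the three edited classifier layers looks like this (layer names as in the stock BVLC GoogLeNet train_val.prototxt; the "-retrain" suffix is my own renaming):

```
# train_val.prototxt -- the final classifier (the loss1 and loss2 branches are edited the same way)
layer {
  name: "loss3/classifier-retrain"   # renamed so the 1000-way ImageNet weights are not loaded here
  type: "InnerProduct"
  bottom: "pool5/7x7_s1"
  top: "loss3/classifier"
  inner_product_param {
    num_output: 2893   # was 1000
  }
}
```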
But I ran into a strange issue: with a learning rate of 0.01 it converges in fewer than 50 iterations to impossibly small loss values, and then the loss even goes negative!
Example:
I0404 13:00:21.652196 16349 solver.cpp:337] Iteration 55, Testing net (#0)
I0404 13:01:12.943830 16349 solver.cpp:404] Test net output #0: loss1/loss1 = 0 (* 0.3 = 0 loss)
I0404 13:01:12.943946 16349 solver.cpp:404] Test net output #1: loss1/top-1 = 1
I0404 13:01:12.943969 16349 solver.cpp:404] Test net output #2: loss1/top-5 = 1
I0404 13:01:12.943979 16349 solver.cpp:404] Test net output #3: loss2/loss2 = 0 (* 0.3 = 0 loss)
I0404 13:01:12.943985 16349 solver.cpp:404] Test net output #4: loss2/top-1 = 1
I0404 13:01:12.943991 16349 solver.cpp:404] Test net output #5: loss2/top-5 = 1
I0404 13:01:12.944000 16349 solver.cpp:404] Test net output #6: loss3/loss3 = 0 (* 1 = 0 loss)
I0404 13:01:12.944006 16349 solver.cpp:404] Test net output #7: loss3/top-1 = 1
I0404 13:01:12.944012 16349 solver.cpp:404] Test net output #8: loss3/top-5 = 1
I0404 13:01:13.739944 16349 solver.cpp:228] Iteration 55, loss = 5.54323e-08
I0404 13:01:13.740008 16349 solver.cpp:244] Train net output #0: loss1/loss1 = 0 (* 0.3 = 0 loss)
I0404 13:01:13.740020 16349 solver.cpp:244] Train net output #1: loss2/loss2 = 0 (* 0.3 = 0 loss)
I0404 13:01:13.740030 16349 solver.cpp:244] Train net output #2: loss3/loss3 = 0 (* 1 = 0 loss)
I0404 13:01:13.740038 16349 sgd_solver.cpp:106] Iteration 55, lr = 0.01
This is the output when training from scratch; in this run I did not load the ImageNet weights.
Only with a learning rate <= 1e-8 does the network do something that looks like normal training.
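For what it's worth, with K = 2893 classes and random initialization I would expect the very first softmax cross-entropy loss to sit near ln(K), which is nowhere near zero:

```python
import math

# With K classes and roughly uniform initial predictions, softmax
# cross-entropy should start near -ln(1/K) = ln(K), not near zero.
K = 2893
print(round(math.log(K), 2))  # 7.97
```

So seeing the loss hit ~5e-08 (and the top-1 accuracy hit 1) after 55 iterations is what puzzles me.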
What am I doing wrong? Please help me understand the network's behavior.
I've attached all files.