finetune_flickr_style does not converge

Saurabh B

Sep 21, 2015, 2:55:40 PM
to Caffe Users
Hi there,

I am using an AWS instance (initialized from the Torch AMI) with cuDNN and OpenBLAS.

I am trying to fine-tune the ImageNet-trained model on my own dataset using the example given. The loss does not seem to be going down and the accuracy is 0. In other words, the network is not learning anything. I have 6220 classes to predict, and many of them have only a few examples.

I had to drop my learning rate to 0.0001 because the loss was exploding with 0.001. Other than that and the prescribed change to the last layer, all settings are from the example.

Here is what I see:

I0921 16:09:25.740398 29041 solver.cpp:294] Solving dressNetCaffeNet
I0921 16:09:25.740409 29041 solver.cpp:295] Learning Rate Policy: step
I0921 16:09:25.742091 29041 solver.cpp:347] Iteration 0, Testing net (#0)
I0921 16:09:42.157130 29041 solver.cpp:415]     Test net output #0: accuracy = 0
--
I0921 16:16:47.859333 29041 solver.cpp:259]     Train net output #0: loss = 8.80968 (* 1 = 8.80968 loss)
I0921 16:16:47.859359 29041 solver.cpp:590] Iteration 980, lr = 0.0005
I0921 16:16:56.108192 29041 solver.cpp:347] Iteration 1000, Testing net (#0)
I0921 16:17:12.895596 29041 solver.cpp:415]     Test net output #0: accuracy = 0
--
I0921 16:24:18.454080 29041 solver.cpp:259]     Train net output #0: loss = 8.81259 (* 1 = 8.81259 loss)
I0921 16:24:18.454097 29041 solver.cpp:590] Iteration 1980, lr = 0.0005
I0921 16:24:26.704149 29041 solver.cpp:347] Iteration 2000, Testing net (#0)
I0921 16:24:43.515666 29041 solver.cpp:415]     Test net output #0: accuracy = 0
--
I0921 16:31:48.977753 29041 solver.cpp:259]     Train net output #0: loss = 8.67613 (* 1 = 8.67613 loss)
I0921 16:31:48.977771 29041 solver.cpp:590] Iteration 2980, lr = 0.0005
I0921 16:31:57.229128 29041 solver.cpp:347] Iteration 3000, Testing net (#0)
I0921 16:32:14.039060 29041 solver.cpp:415]     Test net output #0: accuracy = 0
--
I0921 16:39:19.486088 29041 solver.cpp:259]     Train net output #0: loss = 9.15371 (* 1 = 9.15371 loss)
I0921 16:39:19.486121 29041 solver.cpp:590] Iteration 3980, lr = 0.0005
I0921 16:39:27.726157 29041 solver.cpp:347] Iteration 4000, Testing net (#0)
I0921 16:39:44.515908 29041 solver.cpp:415]     Test net output #0: accuracy = 0.0044
--
I0921 16:46:49.864162 29041 solver.cpp:259]     Train net output #0: loss = 8.21185 (* 1 = 8.21185 loss)
I0921 16:46:49.864181 29041 solver.cpp:590] Iteration 4980, lr = 0.0005
I0921 16:46:58.108023 29041 solver.cpp:347] Iteration 5000, Testing net (#0)
I0921 16:47:14.932720 29041 solver.cpp:415]     Test net output #0: accuracy = 0
--
I0921 16:54:20.469153 29041 solver.cpp:259]     Train net output #0: loss = 8.77958 (* 1 = 8.77958 loss)
I0921 16:54:20.469172 29041 solver.cpp:590] Iteration 5980, lr = 0.0005
I0921 16:54:28.724863 29041 solver.cpp:347] Iteration 6000, Testing net (#0)
I0921 16:54:45.515247 29041 solver.cpp:415]     Test net output #0: accuracy = 0
--
I0921 17:01:50.919855 29041 solver.cpp:590] Iteration 6980, lr = 0.0005
I0921 17:01:59.164470 29041 solver.cpp:347] Iteration 7000, Testing net (#0)
I0921 17:02:00.270820 29041 blocking_queue.cpp:50] Data layer prefetch queue empty
I0921 17:02:18.438571 29041 solver.cpp:415]     Test net output #0: accuracy = 0
--
I0921 17:09:23.762995 29041 solver.cpp:259]     Train net output #0: loss = 7.55561 (* 1 = 7.55561 loss)
I0921 17:09:23.763013 29041 solver.cpp:590] Iteration 7980, lr = 0.0005
I0921 17:09:32.005370 29041 solver.cpp:347] Iteration 8000, Testing net (#0)
I0921 17:09:48.804910 29041 solver.cpp:415]     Test net output #0: accuracy = 0
--
I0921 17:16:54.173910 29041 solver.cpp:259]     Train net output #0: loss = 8.87601 (* 1 = 8.87601 loss)
I0921 17:16:54.173930 29041 solver.cpp:590] Iteration 8980, lr = 0.0005
I0921 17:17:02.415092 29041 solver.cpp:347] Iteration 9000, Testing net (#0)
I0921 17:17:19.173540 29041 solver.cpp:415]     Test net output #0: accuracy = 0
--
I0921 17:24:32.835979 29041 solver.cpp:468] Snapshotting to binary proto file finetune_flickrStyle_style_iter_10000.caffemodel
I0921 17:24:35.249092 29041 solver.cpp:753] Snapshotting solver state to binary proto file finetune_flickrStyle_style_iter_10000.solverstate
I0921 17:24:35.896628 29041 solver.cpp:347] Iteration 10000, Testing net (#0)
I0921 17:24:52.284111 29041 solver.cpp:415]     Test net output #0: accuracy = 0
--
I0921 17:31:57.645282 29041 solver.cpp:259]     Train net output #0: loss = 8.62406 (* 1 = 8.62406 loss)
I0921 17:31:57.645301 29041 solver.cpp:590] Iteration 10980, lr = 0.0005
I0921 17:32:05.886790 29041 solver.cpp:347] Iteration 11000, Testing net (#0)
I0921 17:32:22.671668 29041 solver.cpp:415]     Test net output #0: accuracy = 0
--
I0921 17:39:28.018497 29041 solver.cpp:259]     Train net output #0: loss = 8.7107 (* 1 = 8.7107 loss)
I0921 17:39:28.018537 29041 solver.cpp:590] Iteration 11980, lr = 0.0005
I0921 17:39:36.260735 29041 solver.cpp:347] Iteration 12000, Testing net (#0)
I0921 17:39:53.007030 29041 solver.cpp:415]     Test net output #0: accuracy = 0
--
I0921 17:46:58.403879 29041 solver.cpp:259]     Train net output #0: loss = 8.89754 (* 1 = 8.89754 loss)
I0921 17:46:58.403898 29041 solver.cpp:590] Iteration 12980, lr = 0.0005
I0921 17:47:06.653604 29041 solver.cpp:347] Iteration 13000, Testing net (#0)
I0921 17:47:23.442884 29041 solver.cpp:415]     Test net output #0: accuracy = 0
--
I0921 17:54:28.886819 29041 solver.cpp:259]     Train net output #0: loss = 8.6877 (* 1 = 8.6877 loss)
I0921 17:54:28.886837 29041 solver.cpp:590] Iteration 13980, lr = 0.0005
I0921 17:54:37.126163 29041 solver.cpp:347] Iteration 14000, Testing net (#0)
I0921 17:54:53.907323 29041 solver.cpp:415]     Test net output #0: accuracy = 0.0366
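
As a sanity check on the numbers in the log, here is a small sketch of what a softmax loss looks like when the network predicts every class with equal probability. The only input taken from the post is the class count (6220); the variable names are just for illustration. The result is about 8.74, and most of the logged training losses sit close to that, which is consistent with the network still guessing at chance level.

```python
import math

num_classes = 6220  # class count stated in the post

# Softmax cross-entropy when every class gets the same probability 1/num_classes
chance_loss = -math.log(1.0 / num_classes)

# Expected top-1 accuracy when guessing uniformly at random
chance_accuracy = 1.0 / num_classes

print(f"chance-level loss:     {chance_loss:.4f}")
print(f"chance-level accuracy: {chance_accuracy:.6f}")
```

A loss that stays near this value for thousands of iterations usually points at the data feed or the labels rather than at fine details of the solver settings.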



Any tips on what I should look into next?


Saurabh B

Sep 22, 2015, 10:57:00 AM
to Caffe Users
Figured it out: I had to shuffle the training data. There is an option for it in train_val.prototxt. I don't know why it's not turned on by default.
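
For anyone hitting the same thing: in Caffe's ImageData layer the option is shuffle under image_data_param, and it defaults to false. The sketch below is only an illustration of why that can matter. It assumes the image list was ordered by class, which the post does not state, and it uses a made-up 10 images per class and a batch size of 50; only the class count comes from the thread. The helper name is hypothetical.

```python
import random


def distinct_labels_per_batch(labels, batch_size):
    """Return the number of distinct labels in each consecutive batch."""
    return [
        len(set(labels[i:i + batch_size]))
        for i in range(0, len(labels), batch_size)
    ]


# Hypothetical list file: 6220 classes (from the post), 10 images each
# (an assumption), sorted by class (also an assumption).
num_classes, per_class, batch_size = 6220, 10, 50
labels = [c for c in range(num_classes) for _ in range(per_class)]

ordered = distinct_labels_per_batch(labels, batch_size)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
shuffled_labels = labels[:]
rng.shuffle(shuffled_labels)
shuffled = distinct_labels_per_batch(shuffled_labels, batch_size)

print("ordered list:  max distinct labels per batch =", max(ordered))
print("shuffled list: min distinct labels per batch =", min(shuffled))
```

Under these assumptions, every batch read from the ordered list covers only a handful of classes, so each update pulls the new last layer toward those few labels, while shuffled batches mix many classes together.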