Different outputs on the same FCN network using pycaffe and caffe

Filip K

Jan 25, 2017, 9:28:30 AM
to Caffe Users
I am trying to train the FCN-8s network (actually fine-tune it), but I am confused by the difference in the outputs between the two methods below.



Method 1:

From terminal I run
"C:\Users\CGVU IF Edinburgh\Downloads\New folder\caffe\build\tools\Release\caffe.exe" train --solver "C:\Users\CGVU IF Edinburgh\Downloads\New folder\caffe\python\pascalcontext-fcn8s\solver.prototxt" -weights "C:\Users\CGVU IF Edinburgh\Downloads\New folder\caffe\python\pascalcontext-fcn16s\pascalcontext-fcn16s-heavy.caffemodel"

And here is the output that I am getting for first few iterations:
I0125 13:57:38.505415   800 solver.cpp:228] Iteration 0, loss = 570848
I0125 13:57:38.505415   800 solver.cpp:244]     Train net output #0: loss = 570848 (* 1 = 570848 loss)
I0125 13:57:38.506947   800 sgd_solver.cpp:106] Iteration 0, lr = 1e-10
I0125 13:57:46.938699   800 solver.cpp:228] Iteration 20, loss = 507598
I0125 13:57:46.939194   800 solver.cpp:244]     Train net output #0: loss = 569326 (* 1 = 569326 loss)
I0125 13:57:46.939697   800 sgd_solver.cpp:106] Iteration 20, lr = 1e-10
I0125 13:57:55.310066   800 solver.cpp:228] Iteration 40, loss = 556082
I0125 13:57:55.310694   800 solver.cpp:244]     Train net output #0: loss = 570848 (* 1 = 570848 loss)
I0125 13:57:55.311794   800 sgd_solver.cpp:106] Iteration 40, lr = 1e-10
I0125 13:58:02.932865   800 solver.cpp:228] Iteration 60, loss = 561638
I0125 13:58:02.933344   800 solver.cpp:244]     Train net output #0: loss = 570848 (* 1 = 570848 loss)
I0125 13:58:02.935354   800 sgd_solver.cpp:106] Iteration 60, lr = 1e-10
I0125 13:58:10.563884   800 solver.cpp:228] Iteration 80, loss = 561410
I0125 13:58:10.563884   800 solver.cpp:244]     Train net output #0: loss = 570848 (* 1 = 570848 loss)
I0125 13:58:10.565420   800 sgd_solver.cpp:106] Iteration 80, lr = 1e-10
I0125 13:58:17.987812   800 solver.cpp:228] Iteration 100, loss = 535013
I0125 13:58:17.988140   800 solver.cpp:244]     Train net output #0: loss = 543447 (* 1 = 543447 loss)
I0125 13:58:17.989145   800 sgd_solver.cpp:106] Iteration 100, lr = 1e-10
I0125 13:58:25.198559   800 solver.cpp:228] Iteration 120, loss = 506304
I0125 13:58:25.198559   800 solver.cpp:244]     Train net output #0: loss = 506913 (* 1 = 506913 loss)
I0125 13:58:25.200048   800 sgd_solver.cpp:106] Iteration 120, lr = 1e-10
I0125 13:58:33.086098   800 solver.cpp:228] Iteration 140, loss = 532868
I0125 13:58:33.086582   800 solver.cpp:244]     Train net output #0: loss = 506913 (* 1 = 506913 loss)
I0125 13:58:33.087570   800 sgd_solver.cpp:106] Iteration 140, lr = 1e-10
I0125 13:58:40.654434   800 solver.cpp:228] Iteration 160, loss = 554636
I0125 13:58:40.654434   800 solver.cpp:244]     Train net output #0: loss = 570848 (* 1 = 570848 loss)
I0125 13:58:40.655437   800 sgd_solver.cpp:106] Iteration 160, lr = 1e-10
I0125 13:58:48.207823   800 solver.cpp:228] Iteration 180, loss = 548775
I0125 13:58:48.207823   800 solver.cpp:244]     Train net output #0: loss = 506913 (* 1 = 506913 loss)
I0125 13:58:48.208794   800 sgd_solver.cpp:106] Iteration 180, lr = 1e-10
I0125 13:58:55.930694   800 solver.cpp:228] Iteration 200, loss = 569326
I0125 13:58:55.930694   800 solver.cpp:244]     Train net output #0: loss = 570848 (* 1 = 570848 loss)
I0125 13:58:55.932205   800 sgd_solver.cpp:106] Iteration 200, lr = 1e-10
I0125 13:59:03.667673   800 solver.cpp:228] Iteration 220, loss = 569670
I0125 13:59:03.667673   800 solver.cpp:244]     Train net output #0: loss = 570848 (* 1 = 570848 loss)
I0125 13:59:03.669677   800 sgd_solver.cpp:106] Iteration 220, lr = 1e-10
I0125 13:59:11.029958   800 solver.cpp:228] Iteration 240, loss = 518598
I0125 13:59:11.029958   800 solver.cpp:244]     Train net output #0: loss = 207515 (* 1 = 207515 loss)
I0125 13:59:11.031476   800 sgd_solver.cpp:106] Iteration 240, lr = 1e-10
I0125 13:59:18.376581   800 solver.cpp:228] Iteration 260, loss = 535090
I0125 13:59:18.376581   800 solver.cpp:244]     Train net output #0: loss = 508435 (* 1 = 508435 loss)
I0125 13:59:18.377609   800 sgd_solver.cpp:106] Iteration 260, lr = 1e-10
I0125 13:59:26.114292   800 solver.cpp:228] Iteration 280, loss = 570612
I0125 13:59:26.114292   800 solver.cpp:244]     Train net output #0: loss = 570848 (* 1 = 570848 loss)
I0125 13:59:26.115797   800 sgd_solver.cpp:106] Iteration 280, lr = 1e-10

Overall, the loss doesn't seem to decrease or to go below 500,000. Moreover, "Train net output #0: loss = 570848 (* 1 = 570848 loss)" recurs every few iterations.


Method 2:
python pascalcontext-fcn8s/solve.py 0

My solve.py is attached. There is also net.py (also attached), but it never seems to be executed (see Line 16 and Line 89 of net.py; they never execute).
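
For reference, solve.py is essentially the reference pascalcontext-fcn8s solve script from fcn.berkeleyvision.org. The sketch below shows roughly what it does (not the attached file verbatim; the paths are placeholders, not my exact ones):

import sys
import numpy as np
import caffe
import surgery, score  # helper modules shipped with the reference FCN code

weights = '../pascalcontext-fcn16s/pascalcontext-fcn16s-heavy.caffemodel'

# init: the GPU id comes from the command line (the trailing "0" above)
caffe.set_device(int(sys.argv[1]))
caffe.set_mode_gpu()

solver = caffe.SGDSolver('solver.prototxt')
solver.net.copy_from(weights)

# surgeries: replace the zero-initialized Deconvolution ("up*") weights with bilinear kernels
interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
surgery.interp(solver.net, interp_layers)

# train, scoring the validation split periodically
val = np.loadtxt('../data/pascal-context/val.txt', dtype=str)
for _ in range(25):
    solver.step(4000)
    score.seg_tests(solver, False, val, layer='score')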

I have also attached the log, logPython.txt, where you can verify that those lines are never executed.

Here is the iteration part of the log:
I0125 14:20:33.657279 12060 solver.cpp:228] Iteration 0, loss = 569757
I0125 14:20:33.657279 12060 solver.cpp:244]     Train net output #0: loss = 569757 (* 1 = 569757 loss)
I0125 14:20:33.658788 12060 sgd_solver.cpp:106] Iteration 0, lr = 1e-10
I0125 14:20:41.960067 12060 solver.cpp:228] Iteration 20, loss = 1.61534e+06
I0125 14:20:41.960067 12060 solver.cpp:244]     Train net output #0: loss = 1.6679e+06 (* 1 = 1.6679e+06 loss)
I0125 14:20:41.961041 12060 sgd_solver.cpp:106] Iteration 20, lr = 1e-10
I0125 14:20:50.214399 12060 solver.cpp:228] Iteration 40, loss = 2.60565e+06
I0125 14:20:50.214900 12060 solver.cpp:244]     Train net output #0: loss = 759418 (* 1 = 759418 loss)
I0125 14:20:50.215903 12060 sgd_solver.cpp:106] Iteration 40, lr = 1e-10
I0125 14:20:57.798151 12060 solver.cpp:228] Iteration 60, loss = 1.22813e+06
I0125 14:20:57.798151 12060 solver.cpp:244]     Train net output #0: loss = 425897 (* 1 = 425897 loss)
I0125 14:20:57.798682 12060 sgd_solver.cpp:106] Iteration 60, lr = 1e-10
I0125 14:21:05.405422 12060 solver.cpp:228] Iteration 80, loss = 480972
I0125 14:21:05.405890 12060 solver.cpp:244]     Train net output #0: loss = 604538 (* 1 = 604538 loss)
I0125 14:21:05.406391 12060 sgd_solver.cpp:106] Iteration 80, lr = 1e-10
I0125 14:21:12.815894 12060 solver.cpp:228] Iteration 100, loss = 389368
I0125 14:21:12.815894 12060 solver.cpp:244]     Train net output #0: loss = 59256.6 (* 1 = 59256.6 loss)
I0125 14:21:12.816365 12060 sgd_solver.cpp:106] Iteration 100, lr = 1e-10
I0125 14:21:20.003808 12060 solver.cpp:228] Iteration 120, loss = 277830
I0125 14:21:20.003808 12060 solver.cpp:244]     Train net output #0: loss = 522212 (* 1 = 522212 loss)
I0125 14:21:20.004814 12060 sgd_solver.cpp:106] Iteration 120, lr = 1e-10
I0125 14:21:27.834399 12060 solver.cpp:228] Iteration 140, loss = 311317
I0125 14:21:27.834399 12060 solver.cpp:244]     Train net output #0: loss = 516067 (* 1 = 516067 loss)
I0125 14:21:27.835372 12060 sgd_solver.cpp:106] Iteration 140, lr = 1e-10
I0125 14:21:35.378269 12060 solver.cpp:228] Iteration 160, loss = 2.29701e+06
I0125 14:21:35.378269 12060 solver.cpp:244]     Train net output #0: loss = 8.10771e+06 (* 1 = 8.10771e+06 loss)
I0125 14:21:35.379292 12060 sgd_solver.cpp:106] Iteration 160, lr = 1e-10
I0125 14:21:42.941155 12060 solver.cpp:228] Iteration 180, loss = 1.01302e+06
I0125 14:21:42.941155 12060 solver.cpp:244]     Train net output #0: loss = 952015 (* 1 = 952015 loss)
I0125 14:21:42.942157 12060 sgd_solver.cpp:106] Iteration 180, lr = 1e-10
I0125 14:21:50.657610 12060 solver.cpp:228] Iteration 200, loss = 419933
I0125 14:21:50.658051 12060 solver.cpp:244]     Train net output #0: loss = 239800 (* 1 = 239800 loss)
I0125 14:21:50.658282 12060 sgd_solver.cpp:106] Iteration 200, lr = 1e-10
I0125 14:21:58.384461 12060 solver.cpp:228] Iteration 220, loss = 491792
I0125 14:21:58.384932 12060 solver.cpp:244]     Train net output #0: loss = 524246 (* 1 = 524246 loss)
I0125 14:21:58.385432 12060 sgd_solver.cpp:106] Iteration 220, lr = 1e-10
I0125 14:22:05.747962 12060 solver.cpp:228] Iteration 240, loss = 338026
I0125 14:22:05.747962 12060 solver.cpp:244]     Train net output #0: loss = 68730.8 (* 1 = 68730.8 loss)
I0125 14:22:05.748965 12060 sgd_solver.cpp:106] Iteration 240, lr = 1e-10
I0125 14:22:13.078788 12060 solver.cpp:228] Iteration 260, loss = 337921
I0125 14:22:13.078788 12060 solver.cpp:244]     Train net output #0: loss = 63947.7 (* 1 = 63947.7 loss)
I0125 14:22:13.079268 12060 sgd_solver.cpp:106] Iteration 260, lr = 1e-10
I0125 14:22:20.806751 12060 solver.cpp:228] Iteration 280, loss = 247474
I0125 14:22:20.806751 12060 solver.cpp:244]     Train net output #0: loss = 127147 (* 1 = 127147 loss)
I0125 14:22:20.807729 12060 sgd_solver.cpp:106] Iteration 280, lr = 1e-10
I0125 14:22:28.357250 12060 solver.cpp:228] Iteration 300, loss = 323460
I0125 14:22:28.357250 12060 solver.cpp:244]     Train net output #0: loss = 87453.3 (* 1 = 87453.3 loss)
I0125 14:22:28.358253 12060 sgd_solver.cpp:106] Iteration 300, lr = 1e-10
I0125 14:22:35.708989 12060 solver.cpp:228] Iteration 320, loss = 224735
I0125 14:22:35.708989 12060 solver.cpp:244]     Train net output #0: loss = 248656 (* 1 = 248656 loss)
I0125 14:22:35.709993 12060 sgd_solver.cpp:106] Iteration 320, lr = 1e-10
I0125 14:22:43.251322 12060 solver.cpp:228] Iteration 340, loss = 331050
I0125 14:22:43.251796 12060 solver.cpp:244]     Train net output #0: loss = 393993 (* 1 = 393993 loss)
I0125 14:22:43.252802 12060 sgd_solver.cpp:106] Iteration 340, lr = 1e-10
I0125 14:22:50.609666 12060 solver.cpp:228] Iteration 360, loss = 211849
I0125 14:22:50.609897 12060 solver.cpp:244]     Train net output #0: loss = 157259 (* 1 = 157259 loss)
I0125 14:22:50.610617 12060 sgd_solver.cpp:106] Iteration 360, lr = 1e-10
I0125 14:22:57.874409 12060 solver.cpp:228] Iteration 380, loss = 289918
I0125 14:22:57.874649 12060 solver.cpp:244]     Train net output #0: loss = 706549 (* 1 = 706549 loss)
I0125 14:22:57.875429 12060 sgd_solver.cpp:106] Iteration 380, lr = 1e-10
I0125 14:23:05.316555 12060 solver.cpp:228] Iteration 400, loss = 230567
I0125 14:23:05.316555 12060 solver.cpp:244]     Train net output #0: loss = 33562.8 (* 1 = 33562.8 loss)
I0125 14:23:05.317550 12060 sgd_solver.cpp:106] Iteration 400, lr = 1e-10

As you can see, not only does the loss change and appear to decrease, it also goes under 500,000, which Method 1 almost never achieves. In addition, the "Train net output #0" loss is constantly changing and the values do not repeat.


Am I doing something wrong in my code, or is there a bug somewhere? I assumed that the values would be similar.


solve.py
logPython.txt

Ilya Zhenin

Jan 25, 2017, 10:41:44 AM
to Caffe Users
Look, I told you earlier :)

According to your .prototxt you have renamed layers, and the weights of renamed layers won't be initialized from the .caffemodel.
There is also a big difference between running Method 2 and Method 1 - in solve.py:


# surgeries
interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
surgery.interp(solver.net, interp_layers)

In Method 1 you do not run this code, so the weights of your Deconvolution layers (the layers with "up" in their names) won't be initialized.

And actually, in Method 1 you use the fcn8s architecture and initialize it with the weights of fcn16s. There is a difference only in the last layer, I believe, so for me it sums up to your last Deconvolution layer's weights being zeros.
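
For what it's worth, surgery.interp fills each Deconvolution ("up*") layer with a fixed bilinear upsampling kernel, roughly like this (a sketch from memory of the reference FCN surgery.py, not necessarily identical to your copy):

import numpy as np

def upsample_filt(size):
    # 2D bilinear kernel of the given size
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)

def interp(net, layers):
    # write a bilinear kernel into each named Deconvolution layer's weight blob
    for l in layers:
        m, k, h, w = net.params[l][0].data.shape  # expects m == k and h == w
        filt = upsample_filt(h)
        net.params[l][0].data[range(m), range(k), :, :] = filt

Without this step those filters stay at zero, so the final upsampled scores are all zeros, which would fit the nearly constant loss you see in Method 1.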


On Wednesday, January 25, 2017 at 17:28:30 UTC+3, Filip K wrote:

Filip K

Jan 25, 2017, 10:51:52 AM
to Caffe Users
Yeah, but I have also initialized the weights in the prototxt file, so doesn't that set the weights for those layers?