Training loss discrepancy?

Alex Ter-Sarkisov
Sep 18, 2017, 11:57:24 AM
to Caffe Users
I use a batch size of 32 during training, with average_loss = 1000 and an InfogainLoss matrix H = [2, 0; 0, 10]. While the solver runs, the loss is reported as

Iteration 32600 (2.12984 iter/s, 46.952s/100 iters), loss = 0.16175
Train net output #0: loss = 0.137508 (* 1 = 0.137508 loss)

The first number is the loss smoothed over the last 1000 iterations (average_loss); the "Train net output #0" line is the unsmoothed loss of the current batch.
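
For completeness, I generated infogain.binaryproto roughly like this (a sketch; the 1x1x2x2 reshape is the usual blob convention):

import numpy as np
import caffe

# Class-weighting matrix: penalize class-0 mistakes by 2, class-1 by 10
H = np.array([[2.0, 0.0],
              [0.0, 10.0]], dtype=np.float32)

# Serialize as a 1x1x2x2 blob, the shape InfogainLoss expects
blob = caffe.io.array_to_blobproto(H.reshape(1, 1, 2, 2))
with open('infogain.binaryproto', 'wb') as f:
    f.write(blob.SerializeToString())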

Yet when I load the saved snapshots in pycaffe and read out the loss blob, the reported loss is much higher:

import caffe

nets = range(1000, 31000, 1000)
for iters in nets:
    net_mcn = caffe.Net("train.prototxt", "network" + str(iters) + ".caffemodel", caffe.TRAIN)
    net_mcn.forward()  # run a forward pass so the 'loss' blob holds fresh data
    print iters, net_mcn.blobs['loss'].data

I got 

1000 80.195098877
2000 60.3577766418
3000 53.1454048157
4000 73.8062515259
5000 83.9467010498
6000 51.0115280151
7000 63.0284347534
8000 43.6017570496
9000 48.1773223877
10000 44.3014030457
11000 46.4531173706
12000 41.0288658142
13000 56.1040344238
14000 35.1473693848
15000 36.48487854
16000 38.3397903442
17000 44.2434501648
18000 30.4674892426
19000 42.7231369019
20000 36.1248321533
21000 42.5675506592
22000 34.4444198608
23000 38.0397491455
24000 42.3689651489
25000 50.9166297913
26000 43.2592735291
27000 36.7138328552
28000 41.9512062073
29000 44.1816444397
30000 33.7083320618
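
To rule out single-batch noise, the pycaffe number can be smoothed the same way average_loss smooths the log, along these lines (a sketch; the 100-batch count is arbitrary):

import caffe

# Average the loss over many forward passes, mimicking average_loss
net_mcn = caffe.Net("train.prototxt", "network30000.caffemodel", caffe.TRAIN)
n_batches = 100  # assumed; any reasonably large number of batches
total = 0.0
for _ in range(n_batches):
    total += float(net_mcn.forward()['loss'])
print 30000, total / n_batches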

The loss layer is defined as:

layer {
  name: "loss"
  type: "InfogainLoss"
  bottom: "score_final"
  bottom: "label"
  top: "loss"
  infogain_loss_param {
    source: "infogain.binaryproto"
    axis: 1
  }
}
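
For reference, this is the formula I assume InfogainLoss implements (a numpy sketch with made-up probabilities; the normalization may differ between Caffe versions):

import numpy as np

# Sketch of the InfogainLoss formula as I understand it:
#   E = -1/N * sum_n sum_k H[label_n, k] * log(prob_n[k])
# so with H = [2,0;0,10] every class-1 sample contributes 10x its log-loss.
def infogain_loss(prob, labels, H):
    eps = 1e-20  # guard against log(0), as the layer does internally
    losses = [-np.dot(H[l], np.log(np.maximum(p, eps)))
              for p, l in zip(prob, labels)]
    return np.mean(losses)

# Toy check: two samples, two classes
H = np.array([[2.0, 0.0], [0.0, 10.0]])
prob = np.array([[0.9, 0.1], [0.2, 0.8]])
labels = [0, 1]
print infogain_loss(prob, labels, H)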

The situation on the validation set is about the same. A gap this large can't be explained by averaging alone, so what does explain it?

 