Validation loss increases while training loss decreases

Alex Orloff

Jan 12, 2016, 7:13:22 AM
to Caffe Users
Does this mean overfitting?
Could there be any other reasons? I use quite a big training set (~300k images).
Thanks

Giulia

Jan 13, 2016, 3:21:28 AM
to Caffe Users
I have the same problem. If I increase the regularization (lower learning rate, dropout), the trend is alleviated: the validation loss stops increasing, although it plateaus after a few epochs, and the training accuracy decreases (instead of reaching 100% it stops at around 90%). So I concluded that I was indeed overfitting.

That said, I also observed that the validation accuracy does not improve much and is more or less the same in both the overfitting and the regularized cases... I am still investigating whether this means I have already reached the maximum accuracy achievable with my data, or whether I am not fine-tuning in the best way in either case.
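In case it is useful: the dropout I mention looks roughly like this in the train prototxt. This is only a minimal sketch; the layer and blob names (drop6, fc6) are placeholders for whichever layer you want to regularize.

    # Hypothetical Dropout layer; fc6 is a placeholder blob name.
    layer {
      name: "drop6"
      type: "Dropout"
      bottom: "fc6"
      top: "fc6"
      dropout_param {
        dropout_ratio: 0.5   # probability of zeroing each activation during training
      }
    }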

Alex Orloff

Jan 23, 2016, 7:01:35 PM
to Caffe Users
Dear all,
I'm fine-tuning a previously trained network.
Now I see that the validation loss starts to increase while the training loss constantly decreases.
I know that this probably means overfitting, but the validation loss starts to increase right after the first epoch ends.
I use a batch size of 24 and a training set of 500k images, so 1 epoch ≈ 20,000 iterations.
Can overfitting really start that soon? It seems to me that there must be some other reason.

Thanks

Jan C Peters

Jan 25, 2016, 8:55:36 AM
to Caffe Users
Try different/lower learning rates (decreasing policies) and maybe adaptive solvers (AdaGrad, Adam, AdaDelta; use a constant lr of 1.0 for those to take real effect), and see if the problem persists.
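A minimal solver.prototxt sketch along those lines, assuming AdaDelta (the net path, test settings, and iteration counts are placeholders, not values from this thread):

    # Hypothetical solver config; type/base_lr/lr_policy are the point here.
    net: "models/finetune/train_val.prototxt"
    type: "AdaDelta"            # or "Adam", "AdaGrad"
    base_lr: 1.0                # constant lr, as suggested above
    lr_policy: "fixed"
    momentum: 0.95
    delta: 1e-6
    weight_decay: 0.0005
    test_iter: 100              # validation batches per test pass
    test_interval: 1000         # check validation loss every 1000 iterations
    max_iter: 100000
    snapshot: 10000
    snapshot_prefix: "models/finetune/snap"
    solver_mode: GPU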

Jan

Ayşe Şerbetçi

Mar 4, 2016, 4:17:24 AM
to Caffe Users
Alex,

Have you solved the problem? I have the same problem and have tried many things to fix it, including different learning rates, different weight initializations (they say that problems at the very beginning of training may be caused by weight initialization), different solver types,
and a smaller network. The training loss keeps decreasing, but the validation loss either starts to increase or stops decreasing after two or three epochs.
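For reference, the weight initialization is set per layer in the net prototxt, roughly like this (a minimal sketch; the layer names and sizes are made up, and "xavier"/"msra" are the fillers people usually suggest):

    # Hypothetical Convolution layer; names and sizes are placeholders.
    layer {
      name: "conv1"
      type: "Convolution"
      bottom: "data"
      top: "conv1"
      convolution_param {
        num_output: 64
        kernel_size: 3
        weight_filler { type: "xavier" }   # or "msra", or "gaussian" with a std
        bias_filler { type: "constant" value: 0 }
      }
    }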

Thank you

yalcine...@gmail.com

Mar 26, 2017, 4:46:13 PM
to Caffe Users
Thank you.
It is too late now, but it means your data has noise. Try converting your data to a sine-wave representation, training your network on that, and then converting the sine-wave output back to your data format; you will get better results.