Training ImageNet, why do I get a loss of -NaN?

Carlo Alessi

Dec 23, 2016, 11:43:18 AM
to Caffe Users
I am training the ImageNet network on my own pictures. This is a line from the terminal output:

> Train net output #0: loss = -nan (* 1 = -nan loss)

What does it mean, and how can I fix it?

Hamed Aghdam

Dec 23, 2016, 11:57:03 AM
to Caffe Users
First, if by "ImageNet network" you mean AlexNet, note that it is a large network and needs many images to train.

Regarding your question, the base_lr in your solver may be too large. A learning rate that is too high can make the weights diverge, which shows up as a NaN loss. Try reducing it: for example, if it is currently set to 0.001, lower it to 0.0001. If you still get NaN, reduce it further.
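
As a rough sketch, the change would go in the solver definition. The file name solver.prototxt is an assumption here, and the values simply mirror the example above rather than a tuned setting:

```prototxt
# solver.prototxt (fragment; the file name is assumed)
# Previous setting: base_lr: 0.001
base_lr: 0.0001   # ten times smaller; lower it again if the loss is still NaN
```

The other solver fields can stay as they are while you test whether the smaller learning rate removes the NaN.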

You may also consider adding scale: 0.0039215 (roughly 1/255, which maps 8-bit pixel values into the range [0, 1]) to the transform_param of your ImageData layer. But try decreasing the base_lr first.
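
A minimal sketch of where the scale setting could go, assuming the aim is to rescale 8-bit pixel values from [0, 255] into [0, 1], for which the factor is 1/255, about 0.00392157. The layer name, the list file train.txt, and the batch size are hypothetical placeholders, not values from your setup:

```prototxt
layer {
  name: "data"              # hypothetical layer name
  type: "ImageData"
  top: "data"
  top: "label"
  transform_param {
    scale: 0.00392157       # about 1/255: multiplies every pixel value
  }
  image_data_param {
    source: "train.txt"     # hypothetical list of "image_path label" lines
    batch_size: 32          # placeholder value
  }
}
```

If your layer already has a transform_param block (for mean subtraction or cropping, say), the scale line would go inside that existing block rather than in a second one.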

I hope that helps.