Why I need very low learning rate in training SSD?

Nyan Naing

Mar 7, 2018, 2:41:46 AM

to Caffe Users

I am using SSD for object detection.

I trained on 720 images, each 720 x 720 pixels.
The resize is set to 300 x 300.
I trained for two classes only: object and background.
I have to set a very low learning rate; otherwise I get mbox_loss = 0 (* 1 = 0 loss) or NaN, and the loss is NaN.
My parameters are:

use_batchnorm = False 
base_lr = 0.00000001 
solver_param = {
 # Train parameters
 'base_lr': base_lr,
 'weight_decay': 0.00005,
 'lr_policy': "multistep",
 'stepvalue': [4000, 8000, 10000],
 'gamma': 0.1,
 'momentum': 0.75,
 'iter_size': iter_size,
 'max_iter': 10000,
 'snapshot': 2500,
 'display': 50,
 'average_loss': 10,
 'type': "SGD",
 'solver_mode': solver_mode,
 'device_id': device_id,
 'debug_info': False,
 'snapshot_after_train': True,
 # Test parameters
 'test_iter': [test_iter],
 'test_interval': 500,
 'eval_type': "detection",
 'ap_version': "11point",
 'test_initialization': False,
 }

The training loss starts at 80, reaches 8 after 4,000 iterations, and does not go any lower. I trained for 10,000 iterations; the loss stays around 8 from iteration 4,000 to 10,000 with no further improvement. I trained on about 700 such images.
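
For reference, here is a minimal sketch of how the multistep policy scales the learning rate with the settings above. It assumes Caffe's rule that the rate is base_lr multiplied by gamma once for every stepvalue already reached. The helper name multistep_lr is hypothetical and not part of Caffe; its default values are copied from my solver parameters.

```python
import bisect


def multistep_lr(iteration, base_lr=1e-8, gamma=0.1,
                 stepvalues=(4000, 8000, 10000)):
    """Learning rate at `iteration` under a multistep policy (illustrative helper)."""
    # Count the step boundaries already reached; a boundary counts
    # as soon as iteration >= stepvalue.
    steps_passed = bisect.bisect_right(stepvalues, iteration)
    return base_lr * gamma ** steps_passed


# Print the rate just before and at each boundary used in my run.
for it in (0, 3999, 4000, 7999, 8000):
    print(it, multistep_lr(it))
```

With these settings the rate is 1e-8 up to iteration 3,999 and roughly 1e-9 from iteration 4,000 to 7,999.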

What could be wrong with this training?
The black squares in the attached image are the objects I am trying to detect:
https://www.dropbox.com/s/9b6o3ylb8v3i1yg/S21603_1335_1.jpg?dl=0
