Why do I need a very low learning rate when training SSD?


Nyan Naing

Mar 7, 2018, 2:41:46 AM
to Caffe Users

I am using SSD for object detection.

I trained on 720 images, each 720 x 720 pixels.
The resize is set to 300 x 300.
I trained for two classes only: object and background.
I have to set a very low learning rate; otherwise I get mbox_loss = 0 (* 1 = 0 loss) or NaN, and the total loss is NaN.
My parameters are:


use_batchnorm = False
base_lr = 0.00000001
solver_param = {
    # Train parameters
    'base_lr': base_lr,
    'weight_decay': 0.00005,
    'lr_policy': "multistep",
    'stepvalue': [4000, 8000, 10000],
    'gamma': 0.1,
    'momentum': 0.75,
    'iter_size': iter_size,
    'max_iter': 10000,
    'snapshot': 2500,
    'display': 50,
    'average_loss': 10,
    'type': "SGD",
    'solver_mode': solver_mode,
    'device_id': device_id,
    'debug_info': False,
    'snapshot_after_train': True,
    # Test parameters
    'test_iter': [test_iter],
    'test_interval': 500,
    'eval_type': "detection",
    'ap_version': "11point",
    'test_initialization': False,
}
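
For reference, here is a rough sketch of the learning-rate schedule these settings imply. It assumes the usual behaviour of Caffe's "multistep" policy, where the rate is multiplied by gamma each time the iteration count reaches one of the stepvalues. The helper name `multistep_lr` is made up for this illustration and is not part of Caffe.

```python
def multistep_lr(base_lr, gamma, stepvalues, iteration):
    """Learning rate at `iteration` under a Caffe-style "multistep" policy.

    Hypothetical helper for illustration only: the rate is
    base_lr * gamma ** k, where k is the number of stepvalues
    that the iteration count has already reached.
    """
    steps_passed = sum(1 for s in stepvalues if iteration >= s)
    return base_lr * gamma ** steps_passed


# Values copied from the solver parameters in this post.
base_lr = 0.00000001
gamma = 0.1
stepvalues = [4000, 8000, 10000]

# Print the effective learning rate at a few points in training.
for it in (0, 3999, 4000, 7999, 8000, 9999):
    print(it, multistep_lr(base_lr, gamma, stepvalues, it))
```

Under that assumption, the rate is 1e-8 up to iteration 3999, about 1e-9 from 4000 to 7999, and about 1e-10 from 8000 onward. The last stepvalue (10000) equals max_iter, so the third drop never takes effect during training. The first drop falls at the same iteration where the loss stops improving, although this sketch alone does not show that the two are related.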

The training loss starts at about 80, drops to about 8 after 4,000 iterations, and does not go any lower. I trained for 10,000 iterations, and the loss stays around 8 from iteration 4,000 to 10,000 with no further improvement. The training set is about 700 images like the one linked below.

What could be wrong with this training?
The black squares in the attached image are the objects I am trying to detect.
https://www.dropbox.com/s/9b6o3ylb8v3i1yg/S21603_1335_1.jpg?dl=0
