solver type Adam worse than SGD


Feng Mao

Nov 9, 2016, 3:34:47 AM
to Caffe Users

I am using:
Dataset: CIFAR-10
Model: GoogLeNet (Inception v1)
Learning rate: fixed, 0.001 (iterations 0~60000), 0.0001 (iterations 60000~65000), 0.00001 (iterations 65000~70000)

I ran two experiments with this setup:
1) SGD with momentum: 0.9, weight_decay: 0.004
2) Adam with momentum: 0.9, momentum2: 0.999
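
For reference, here is a rough sketch of the setup above in plain Python. It is not Caffe's solver code: the function names are made up for illustration, the Adam step is the textbook form with an assumed epsilon of 1e-8, weight decay is assumed to be L2 and is only shown for SGD because the post lists it only there, and the schedule boundaries (60000 and 65000 belonging to the lower rate) are a guess.

```python
import math

def learning_rate(iteration):
    """Piecewise-constant schedule from the post (boundary handling is assumed)."""
    if iteration < 60000:
        return 1e-3
    if iteration < 65000:
        return 1e-4
    return 1e-5

def sgd_momentum_step(w, grad, velocity, lr, momentum=0.9, weight_decay=0.004):
    """One SGD-with-momentum update on a scalar weight."""
    # Assumed L2 regularization: weight decay adds weight_decay * w to the gradient.
    g = grad + weight_decay * w
    velocity = momentum * velocity - lr * g
    return w + velocity, velocity

def adam_step(w, grad, m, v, t, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """One textbook Adam update on a scalar weight; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate (momentum)
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate (momentum2)
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v

# One step of each from the same starting point.
w_sgd, vel = sgd_momentum_step(1.0, 2.0, 0.0, learning_rate(0))
w_adam, m1, v1 = adam_step(1.0, 2.0, 0.0, 0.0, 1, learning_rate(0))
```

Note that the first Adam step has a size of roughly the learning rate whatever the gradient magnitude, while the SGD step scales with the gradient.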

I found that Adam did better than SGD in the early iterations but worse in the later ones, as shown below (red for SGD, green for Adam).



Is there any way to improve the Adam results? Thanks!

