net: '/data/imagenet/caffe_dat/train_val.prototxt'
test_iter: 1000
test_interval: 1000
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 100000
display: 50
max_iter: 450000
momentum: 0.9
weight_decay: 0.0005
snapshot: 4000
snapshot_prefix: "/data/imagenet/caffe_dat/net_stages_trainin$
solver_mode: GPU
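For reference, Caffe's `"step"` policy multiplies the learning rate by `gamma` every `stepsize` iterations. A minimal sketch of the schedule this solver produces, using the values from the config above:

```python
def step_lr(iteration, base_lr=0.01, gamma=0.1, stepsize=100000):
    """Caffe "step" policy: lr = base_lr * gamma ** floor(iter / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)

# With this config the rate drops 10x every 100k iterations,
# so over max_iter: 450000 it decays from 0.01 down to 1e-6.
print(step_lr(0))  # 0.01
```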
Does anyone have ideas about what might be causing this divergence in performance between such similar networks? I'll probably restart with everything set exactly to the current Caffe defaults, to make sure those two differences aren't what's causing the problem; they just seemed unlikely culprits, and experimentation is costly on an AWS instance. :)
Thanks in advance,
Dean
layer {
  name: "fixlabel"
  type: "Power"
  bottom: "label"
  top: "label"
  power_param {
    shift: -1
  }
}
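For context on the snippet above: Caffe's Power layer computes `(shift + scale * x) ^ power` elementwise, and `scale` and `power` default to 1, so `shift: -1` simply subtracts 1 from each label (e.g. remapping 1-indexed labels into the 0-indexed range the loss layer expects). A minimal NumPy sketch of the same transform, with hypothetical label values:

```python
import numpy as np

def power_layer(x, power=1.0, scale=1.0, shift=0.0):
    """Elementwise (shift + scale * x) ** power, as in Caffe's Power layer."""
    return (shift + scale * x) ** power

# With shift=-1 and the other parameters left at their defaults,
# labels 1..5 are shifted down to 0..4:
labels = np.array([1, 2, 3, 4, 5], dtype=np.float32)
print(power_layer(labels, shift=-1))  # [0. 1. 2. 3. 4.]
```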
Thanks in advance for any advice.
Best,
Dean