How to train two sub-networks alternately in Caffe


Qian Yang

Aug 21, 2017, 4:08:18 AM
to Caffe Users
I need to train a neural network that consists of two sub-networks with different functions: one is a classic VGG-19 network, and the other is an attention network that takes the feature map of the last convolution layer as input. My training strategy is to optimize the parameters alternately: first freeze the parameters of the attention network and update VGG, then fix VGG and update the weights of the attention network.

Do I have to write a script that switches between different train_val.prototxt files with different lr_mult and decay_mult values, or is there another way?
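
If the two train_val.prototxt files differ only in lr_mult and decay_mult, one option is to generate the variants rather than maintain them by hand. The following is a minimal plain-text sketch, not a Caffe API: it assumes one field per line, layer blocks that open with a `layer {` line, and learnable layers that already list explicit lr_mult/decay_mult entries (layers without them would keep their defaults). The helper name `freeze_layers` and the `att_` prefix used below are hypothetical.

```python
import re


def freeze_layers(prototxt_text, frozen_prefixes):
    """Return prototxt text with lr_mult/decay_mult set to 0 in every layer
    whose name starts with one of frozen_prefixes.

    Assumes one field per line and blocks opened by a 'layer {' line
    (an assumption about the file layout, not something Caffe requires).
    """
    out = []
    frozen = False          # are we inside a layer that should be frozen?
    awaiting_name = False   # have we entered a layer but not yet seen its name?
    for line in prototxt_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(("layer {", "layer{")):
            frozen = False
            awaiting_name = True
        m = re.match(r'name:\s*"([^"]+)"', stripped)
        if m and awaiting_name:
            # Only the first name after 'layer {' is the layer name;
            # later ones (e.g. inside param blocks) are ignored.
            frozen = m.group(1).startswith(tuple(frozen_prefixes))
            awaiting_name = False
        if frozen:
            line = re.sub(r'\b(lr_mult|decay_mult):\s*[-+0-9.eE]+', r'\1: 0', line)
        out.append(line)
    return "\n".join(out)
```

Calling `freeze_layers(text, ["att_"])` would give the "update VGG only" variant if the attention layers were all named with an `att_` prefix; a second call with the VGG layer prefixes would give the other variant.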

Abhishek Maheshwari

Mar 24, 2018, 2:12:34 PM
to Caffe Users
Did you get this working? Have you written a script yet?

Przemek D

Mar 26, 2018, 3:33:16 AM
to Caffe Users
This would be easier to accomplish by having a separate solver for each network and performing the optimization semi-manually: instead of calling Solver::step, you'd call Solver::Net::forward, Solver::Net::backward, and Solver::ApplyUpdate exactly when you want the parameters to be updated. Unfortunately, this isn't currently possible because the ApplyUpdate method isn't exposed to Python; something like PR #6238 is needed for that to work.
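
The forward / backward / update split can be illustrated without Caffe. The toy below is only a framework-free sketch of the control flow, not pycaffe code: two scalar parameters stand in for the two sub-networks, gradients are computed for both on every iteration, and the update is applied to only one of them depending on whose turn it is. All names (`alternate_train`, `w_vgg`, `w_att`) are made up for illustration.

```python
def alternate_train(x, y, w_vgg=0.5, w_att=0.5, lr=0.05, iters=200):
    """Fit y ~ w_att * w_vgg * x, updating only one parameter per iteration."""
    losses = []
    for it in range(iters):
        # "forward": compute the prediction and the squared error
        pred = w_att * w_vgg * x
        err = pred - y
        losses.append(err * err)
        # "backward": gradients of the squared error w.r.t. both parameters
        g_vgg = 2.0 * err * w_att * x
        g_att = 2.0 * err * w_vgg * x
        # "apply update": only the parameter whose turn it is gets changed
        if it % 2 == 0:
            w_vgg -= lr * g_vgg   # the attention stand-in stays frozen
        else:
            w_att -= lr * g_att   # the VGG stand-in stays frozen
    return w_vgg, w_att, losses
```

The point of the sketch is that freezing is a decision made at update time, which is why access to a separate update call matters.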

Abhishek Maheshwari

Mar 29, 2018, 9:16:09 AM
to Caffe Users
I was thinking of something more like this.
Consider two nets that differ only in their parameters: net_1.prototxt and net_2.prototxt

import caffe

turn = 0
net_1 = caffe.Net('net_1.prototxt', caffe.TRAIN)
net_2 = caffe.Net('net_2.prototxt', caffe.TRAIN)

solver = caffe.get_solver(solver_net_1)    # solver_net_1: path to the solver prototxt for net_1
if turn == 0:
    solver.net = net_1
    solver.step(1)    # I am unsure: will this step(1) also change the network input data (load the next batch) for the next step?
    x = solver.net.blobs['data'].data.copy()    # save a copy of the input data for the second network's step
if turn == 1:
    solver.net = net_2
    solver.net.blobs['data'].data[...] = x    # run the second net on the same data as the first net
    solver.step(1)
# change the turn
turn = (turn + 1) % 2

Please suggest whether this will work or not. It will run for 2*max_iter iterations this way.
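
A variant that does not rely on swapping solver.net is to keep two solvers and let their nets share weights. The sketch below assumes the two solver objects are created elsewhere, for example with caffe.get_solver(...) on two solver files whose net prototxts differ only in lr_mult/decay_mult, and it uses pycaffe's Net.share_with and Solver.step. Each solver's net has its own data layer, so whether both see the same batch depends on how those data layers are set up. The function name `train_alternately` is hypothetical, and the solvers are passed in as arguments so the loop itself does not import caffe.

```python
def train_alternately(solver_a, solver_b, rounds, steps_per_turn=1):
    """Alternate between two solvers whose nets share the same weights.

    solver_a / solver_b are assumed to be pycaffe solvers (or any objects
    that provide .net.share_with(other_net) and .step(n)).
    """
    # Point solver_b's net at solver_a's parameter blobs (matched by layer
    # name), so an update made through either solver is seen by the other.
    solver_b.net.share_with(solver_a.net)
    for _ in range(rounds):
        solver_a.step(steps_per_turn)  # e.g. attention frozen via lr_mult: 0
        solver_b.step(steps_per_turn)  # e.g. VGG frozen via lr_mult: 0
```

With steps_per_turn=1 this performs 2 * rounds solver steps in total, which matches the 2*max_iter count mentioned above.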