Combining Networks and Selective Updating

Jordan

Oct 1, 2015, 5:11:38 PM
to torch7
Hi,

I have two somewhat related questions. First, suppose I have a few layers of a network that are already trained (call that network A), and now I want to attach a couple of untrained layers (call that network B). How can I glue the two networks together in Torch?

My weird follow-up question is: once I glue the two networks together, what if I only want to update the weights from network B? In other words, once A and B are glued together, I don't want to update the weights that came from network A at all. Is that possible?

I'm kind of new to Torch7, so any help is appreciated. Thanks!

- J

Tushar N

Oct 1, 2015, 5:36:09 PM
to torch7
You could add both networks to another nn.Sequential() module:

require 'nn'

local networkA = torch.load('netA.t7')       -- pretrained network
local networkB = nn.Sequential():add(...)    -- your new, untrained layers

local combined = nn.Sequential()
	:add(networkA)
	:add(networkB)

combined:forward(input) will push your input through the whole network.

As for only training networkB: if you're using optim, you can pass only networkB's parameters to it and call :forward() and :backward() normally.
Alternatively, if you prefer to use updateParameters(), you can call the method only for networkB:
combined.modules[2]:updateParameters(alpha) -- pick the second module of combined, and update its parameters
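
For example, a rough sketch of the optim route (assuming require 'optim' plus a criterion, an input tensor and a target tensor, none of which are defined in this thread):

local paramsB, gradParamsB = networkB:getParameters()  -- only B's weights

local feval = function(x)
    gradParamsB:zero()
    local output = combined:forward(input)
    local loss = criterion:forward(output, target)
    -- gradients still flow back through networkA, but since only paramsB
    -- is handed to the optimizer, A's weights are never changed
    combined:backward(input, criterion:backward(output, target))
    return loss, gradParamsB
end

optim.sgd(feval, paramsB, {learningRate = 0.01})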


PS: Wait for confirmation from someone else, I'm still new to this :)

Vislab

Oct 1, 2015, 8:01:51 PM
to torch7
There are several ways to train a model with different learning rates. If you look closely at optim.sgd's code https://github.com/torch/optim/blob/master/sgd.lua you'll see two similar fields, namely learningRate and learningRates: learningRate sets the same learning rate for all modules, while learningRates is a tensor of individual learning rates, one for each parameter in the network. It's the same size as gradParameters (if I'm not mistaken).
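
For example, a rough sketch of freezing networkA this way (assuming the combined nn.Sequential from the reply above, with networkA as its first module, plus a hypothetical criterion and input/target tensors):

-- count networkA's parameters; they come first in the flattened view
-- because networkA is the first module of combined
local weightsA = networkA:parameters()
local nA = 0
for _, w in ipairs(weightsA) do nA = nA + w:nElement() end

local params, gradParams = combined:getParameters()

-- per-parameter multipliers: 0 freezes networkA, 1 leaves networkB trainable
local lrs = torch.ones(params:nElement()):typeAs(params)
lrs[{{1, nA}}] = 0

local feval = function(x)
    gradParams:zero()
    local output = combined:forward(input)
    local loss = criterion:forward(output, target)
    combined:backward(input, criterion:backward(output, target))
    return loss, gradParams
end

optim.sgd(feval, params, {learningRate = 0.01, learningRates = lrs})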

Another way I've found is to modify https://github.com/soumith/imagenet-multiGPU.torch/blob/master/fbcunn_files/Optim.lua to specify a specific learning rate for a particular module of your choice. You just need to change opt[i].learningRate at line 172 to the desired value.

Jordan

Oct 1, 2015, 10:06:52 PM
to torch7
Well, it's not that I want networks A and B to have different learning rates; it's that I don't want the pretrained network A to have any weight change at all. I guess I could make this work by setting the learning rate to 0 for all of the parameters I don't want to update, though.

Vislab

Oct 2, 2015, 4:34:28 PM
to torch7
Just do a forward pass with the first network and use its output as the input to the next network, then optimize/backprop only the last network. It's super simple.
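
Something like this rough sketch (again assuming a criterion and input/target tensors, which aren't defined in this thread):

local paramsB, gradParamsB = networkB:getParameters()
networkA:evaluate()                         -- optional: put A in evaluation mode (e.g. dropout/batchnorm)

local featuresA = networkA:forward(input)   -- A is used as a fixed feature extractor

local feval = function(x)
    gradParamsB:zero()
    local output = networkB:forward(featuresA)
    local loss = criterion:forward(output, target)
    networkB:backward(featuresA, criterion:backward(output, target))
    return loss, gradParamsB
end

optim.sgd(feval, paramsB, {learningRate = 0.01})

Compared to backpropagating through the whole combined network, this also skips the backward pass through networkA, so it's a bit cheaper.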