copying layer weights and biases between networks

Gilad Sharir

Sep 16, 2014, 8:03:39 AM

to caffe...@googlegroups.com

Hi

I'm trying to copy the weights and biases of a single fully connected layer from one net to another.
I followed the example in "net_surgery.ipynb"; however, I ran into a problem while trying to do the copy operation in Python:

net_1.params['fc8b'][0] = net_2.params['fc8b'][0]
net_1.params['fc8b'][1] = net_2.params['fc8b'][1]

After these lines, the layer I was trying to copy is still not the same in net_1 and net_2.

I also tried:
net_1.params['fc8b'][0].data = net_2.params['fc8b'][0].data

but this gives an error:
AttributeError: can't set attribute

Any ideas how to copy the layer params between 2 nets?

Thanks

Evan Shelhamer

Sep 16, 2014, 2:52:44 PM

to Gilad Sharir, caffe...@googlegroups.com

Hi Gilad,

If you look closely at the assignments in the "editing model parameters" example you will see that they are in-place:

conv_params[pr_conv][1][...] = fc_params[pr][1]

This is needed to properly bridge the Python interface's ndarray and the library's underlying Blob memory. Remember to save the network after your assignment if you want to keep the new parameters.
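
A minimal sketch of an in-place copy for every parameter blob of one layer. The helper name `copy_layer_params` and the stand-in nets built with `SimpleNamespace` are hypothetical, used only so the example runs without Caffe installed; it assumes `net.params[layer_name]` is a sequence of blobs (weights, then biases) whose `.data` is a NumPy array.

```python
import numpy as np
from types import SimpleNamespace


def copy_layer_params(src_net, dst_net, layer_name):
    """Copy all parameter blobs of one layer from src_net into dst_net, in place.

    Hypothetical helper: assumes net.params[layer_name] is a sequence of
    blobs (weights first, then biases) whose .data is a NumPy array.
    """
    for src, dst in zip(src_net.params[layer_name], dst_net.params[layer_name]):
        if src.data.shape != dst.data.shape:
            raise ValueError(
                "shape mismatch: %s vs %s" % (src.data.shape, dst.data.shape)
            )
        # Slice assignment writes into the memory dst already owns.
        dst.data[...] = src.data


def make_fake_net(fill):
    """Stand-in for a loaded net with one fully connected layer named 'fc8b'."""
    weights = SimpleNamespace(data=np.full((3, 4), fill, dtype=np.float32))
    biases = SimpleNamespace(data=np.full((3,), fill, dtype=np.float32))
    return SimpleNamespace(params={"fc8b": [weights, biases]})


net_1 = make_fake_net(0.0)
net_2 = make_fake_net(1.5)
weights_buffer = net_1.params["fc8b"][0].data  # handle on the original memory

copy_layer_params(net_2, net_1, "fc8b")
```

After the call, `net_1`'s arrays hold `net_2`'s values while still being the same array objects as before, which is the property the in-place assignment is meant to preserve. With real nets, the destination net would then be saved to keep the new parameters.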

For background on this see https://github.com/BVLC/caffe/pull/311#issuecomment-40047852

Evan Shelhamer

--
You received this message because you are subscribed to the Google Groups "Caffe Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to caffe-users...@googlegroups.com.
To post to this group, send email to caffe...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/caffe-users/a9c16aa6-9bad-4a74-b63c-c91cc46979fc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Gilad Sharir

Sep 16, 2014, 3:03:01 PM

to caffe...@googlegroups.com, gilis...@gmail.com, shel...@eecs.berkeley.edu

Thanks for the reply.
I already tried the assignment as you described, but it gives an error:

TypeError: 'Blob' object does not support item assignment

I managed to solve it by writing my own Copy function in _caffe.cpp (in the CaffeBlob class) and calling it from Python.

Jonathan L Long

Sep 16, 2014, 10:26:51 PM

to Gilad Sharir, caffe...@googlegroups.com, Evan Gerard Shelhamer

To be clear: Copying blobs in pycaffe works in the usual numpy way. Evan elided the ".data" accessor in his assignment; the important difference is between attribute assignment and numpy slice assignment.

If you write

blob.data = some_array

you are doing attribute assignment; you are saying "hey net, instead of using the memory you were using before for that blob, use this other memory I have here". This is not allowed (with certain exceptions). If you write

blob.data[...] = some_array

you are doing numpy slice assignment; you are saying "hey net, copy the memory I have here into the memory you've already allocated for that blob". That is what you want.

(If you don't include ".data" at all, you are assigning an ndarray to a Blob, which doesn't make sense; a Blob holds two different ndarrays.)
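
A minimal sketch of the two kinds of assignment. `FakeBlob` is a hypothetical stand-in, not pycaffe's Blob; it only assumes that `.data` is exposed as a read-only attribute backed by a NumPy array, which is enough to reproduce both behaviours.

```python
import numpy as np


class FakeBlob:
    """Hypothetical stand-in for a Blob: .data is readable but not rebindable."""

    def __init__(self, shape):
        self._data = np.zeros(shape, dtype=np.float32)

    @property
    def data(self):
        # No setter is defined, so `blob.data = x` raises AttributeError.
        return self._data


blob = FakeBlob((2, 3))
some_array = np.arange(6, dtype=np.float32).reshape(2, 3)
original_buffer = blob.data

# Attribute assignment: tries to swap the blob's memory, which is rejected.
try:
    blob.data = some_array
    rebinding_allowed = True
except AttributeError:
    rebinding_allowed = False

# Slice assignment: copies values into the memory the blob already owns.
blob.data[...] = some_array
```

Here the attribute assignment fails, while the slice assignment leaves `blob.data` as the same array object with the new values copied into it.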

JLL

Yiak Wang Yi

Jan 11, 2017, 6:58:52 AM

to Caffe Users, gilis...@gmail.com, shel...@eecs.berkeley.edu

Hi, I have the same problem when I try to upgrade channels from pretrained network parameters. Could you kindly explain how to create a customised copy function that can be called from Python?

On Tuesday, September 16, 2014 at 11:03:01 PM UTC+8, Gilad Sharir wrote:

Yiak Wang Yi

Jan 11, 2017, 7:09:40 AM

to Caffe Users, gilis...@gmail.com, shel...@eecs.berkeley.edu

I think you are talking about a different thing. We intend to do the assignment and discard the allocated memory.

On Wednesday, September 17, 2014 at 6:26:51 AM UTC+8, Jonathan L Long wrote: