copying layer weights and biases between networks


Gilad Sharir

Sep 16, 2014, 4:03:39 AM
to caffe...@googlegroups.com
Hi

I'm trying to copy the weights and biases of a single layer from one net to another one (the layer is fully connected). 
I followed the example in "net_surgery.ipynb"; however, I ran into a problem while trying to do the copy operation in Python:

net_1.params['fc8b'][0] = net_2.params['fc8b'][0]
net_1.params['fc8b'][1] = net_2.params['fc8b'][1]

After these lines, the layer which I was trying to copy is not the same in net_1 and net_2.

I also tried :
net_1.params['fc8b'][0].data = net_2.params['fc8b'][0].data

but this gives an error:
AttributeError: can't set attribute

Any ideas how to copy the layer params between 2 nets?

Thanks

Evan Shelhamer

Sep 16, 2014, 10:52:44 AM
to Gilad Sharir, caffe...@googlegroups.com
Hi Gilad,

If you look closely at the assignments in the "editing model parameters" example you will see that they are in-place:

conv_params[pr_conv][1][...] = fc_params[pr][1]

This is needed to properly bridge the Python interface's ndarray and the library's underlying Blob memory. Remember to save the network after your assignment if you want to keep the new parameters.
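
A minimal sketch of the in-place point, using plain numpy only. The `backing` array is a hypothetical stand-in for memory owned by the library; no Caffe call is involved, and the real Blob bridge may differ in detail.

```python
import numpy as np

# Hypothetical stand-in for memory owned by the library (not real Caffe memory).
backing = np.zeros(4, dtype=np.float32)
weights = backing  # the array the Python side sees; same memory as `backing`

new_values = np.arange(4, dtype=np.float32)

# In-place assignment: copies new_values into the existing buffer.
weights[...] = new_values
in_place_visible = bool(np.array_equal(backing, new_values))

# Rebinding: only the Python name changes; `backing` is left untouched.
weights = np.full(4, 9, dtype=np.float32)
rebinding_visible = bool(np.array_equal(backing, weights))
```

Only the `[...]` form changes what the owner of the buffer sees; rebinding the name leaves the original memory as it was.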


Evan Shelhamer


Gilad Sharir

Sep 16, 2014, 11:03:01 AM
to caffe...@googlegroups.com, gilis...@gmail.com, shel...@eecs.berkeley.edu
Thanks for the reply.
I already tried the assignment as you described, but it gives an error:

TypeError: 'Blob' object does not support item assignment

I managed to solve it by writing my own Copy function in the _caffe.cpp code (in the CaffeBlob class) and calling it from Python.

Jonathan L Long

Sep 16, 2014, 6:26:51 PM
to Gilad Sharir, caffe...@googlegroups.com, Evan Gerard Shelhamer
To be clear: Copying blobs in pycaffe works in the usual numpy way. Evan elided the ".data" accessor in his assignment; the important difference is between attribute assignment and numpy slice assignment.

If you write
blob.data = some_array
you are doing attribute assignment; you are saying "hey net, instead of using the memory you were using before for that blob, use this other memory I have here". This is not allowed (with certain exceptions). If you write
blob.data[...] = some_array
you are doing numpy slice assignment; you are saying "hey net, copy the memory I have here into the memory you've already allocated for that blob". That is what you want.
(If you don't include ".data" at all, you are assigning an ndarray to a Blob, which doesn't make sense; a Blob holds two different ndarrays.)
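
The distinction above can be sketched as a small helper. `StandInBlob`, `StandInNet`, and `copy_layer_params` are hypothetical names introduced for illustration; the stand-in classes only mimic the behaviour described in this thread (a `data` attribute that cannot be reassigned, and `params` mapping a layer name to its weight and bias blobs) so that the example runs without Caffe.

```python
import numpy as np


class StandInBlob:
    """Hypothetical stand-in for a pycaffe Blob; `data` has no setter."""

    def __init__(self, shape):
        self._data = np.zeros(shape, dtype=np.float32)

    @property
    def data(self):
        # Read-only attribute: `blob.data = x` raises AttributeError.
        return self._data


class StandInNet:
    """Hypothetical stand-in exposing `params` as layer name -> [weights, biases]."""

    def __init__(self, layer_shapes):
        self.params = {
            name: [StandInBlob(w_shape), StandInBlob(b_shape)]
            for name, (w_shape, b_shape) in layer_shapes.items()
        }


def copy_layer_params(dst_net, src_net, layer):
    """Copy every parameter blob of `layer` from src_net into dst_net in place."""
    for dst_blob, src_blob in zip(dst_net.params[layer], src_net.params[layer]):
        if dst_blob.data.shape != src_blob.data.shape:
            raise ValueError("shape mismatch in layer %r" % layer)
        # numpy slice assignment: writes into memory the net already owns.
        dst_blob.data[...] = src_blob.data


shapes = {"fc8b": ((3, 5), (3,))}
net_1, net_2 = StandInNet(shapes), StandInNet(shapes)
net_2.params["fc8b"][0].data[...] = 1.5   # pretend these are trained weights
net_2.params["fc8b"][1].data[...] = -2.0  # and trained biases

copy_layer_params(net_1, net_2, "fc8b")
```

Assuming real pycaffe nets expose `params` the same way, as the snippets in this thread suggest, the helper would be called as `copy_layer_params(net_1, net_2, 'fc8b')`, followed by saving the net as Evan notes.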

JLL

Yiak Wang Yi

Jan 11, 2017, 1:58:52 AM
to Caffe Users, gilis...@gmail.com, shel...@eecs.berkeley.edu
Hi, I have the same problem when I try to upgrade channels from pretrained network parameters. Could you kindly explain how to create a customised copy function that can be called from Python?

On Tuesday, September 16, 2014 at 11:03:01 PM UTC+8, Gilad Sharir wrote:

Yiak Wang Yi

Jan 11, 2017, 2:09:40 AM
to Caffe Users, gilis...@gmail.com, shel...@eecs.berkeley.edu
I think you are talking about a different thing. We intend to do the assignment and discard the allocated memory.

On Wednesday, September 17, 2014 at 6:26:51 AM UTC+8, Jonathan L Long wrote: