I am not sure exactly what you mean by loading two different pre-trained models into a new model.
If you mean you have two pre-trained versions of a model and want to combine them into a new model, you would need to load each network and then figure out how to combine them. This would involve taking the weights from each network and combining them somehow. Maybe you could use a process similar to what Caffe does when averaging updates across multiple GPUs. I am not sure there is a good theoretical method for doing this. Within Caffe it would be relatively easy to perform mathematical operations (add/subtract/etc.) on the weights, but I am not sure how well the resulting model would perform.
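As a rough sketch of that averaging idea, using plain NumPy dictionaries to stand in for each network's `net.params` (the layer names and shapes here are made up for illustration):

```python
import numpy as np

# Hypothetical parameter dicts standing in for net_a.params / net_b.params;
# each entry maps a layer name to its weight array.
net_a = {'conv1': np.full((2, 3), 1.0), 'fc1': np.full((4,), 3.0)}
net_b = {'conv1': np.full((2, 3), 3.0), 'fc1': np.full((4,), 5.0)}

# Average the weights of the layers the two networks share,
# similar in spirit to how multi-GPU training averages updates.
net_c = {name: (net_a[name] + net_b[name]) / 2.0
         for name in net_a if name in net_b}

print(net_c['conv1'][0, 0])  # 2.0
print(net_c['fc1'][0])       # 4.0
```

With real Caffe nets the same loop would read and write `net.params[layer][0].data` instead of dictionary entries.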
- Create and train each of the separate networks you want to use as the source (NET_A and NET_B for reference)
- Create the combined network (NET_C for reference) - This network should contain the parts of the first two networks you want to use as a source
- Load NET_C (read the PROTOTXT file or create the network programmatically)
- For each source network (NET_A and NET_B):
- Load the network (PROTOTXT file and pre-trained model (CAFFEMODEL file))
- Iterate through the source network layers and update the associated NET_C layer
- Save the new NET_C network
I used this method for my research and it worked pretty well, just following the basic Net Surgery page as a reference.
The following is an excerpt of the basic setup I used (I am kind of a newbie at Python, so I am sure there are better ways to do this):
#!/usr/bin/python
import sys

# Import and set up the caffe environment
caffe_root = '/usr/local/src/caffe/'
sys.path.insert(0, caffe_root + 'python')
import caffe

# Indices of a layer's parameter blobs
WEIGHTS = 0
BIASES = 1

# Copy the parameters of each layer in `source` into the layer of
# `destination` whose name matches but carries the modality prefix
def updateModality(modality, source, destination):
    MOD = modality + '_'
    print('Performing update for modality: ' + MOD)
    for layer in source.params:
        if (MOD + layer) in destination.params:
            try:
                params = source.params[layer][WEIGHTS].data, source.params[layer][BIASES].data
                destination.params[MOD + layer][WEIGHTS].data.flat = params[WEIGHTS].flat
                destination.params[MOD + layer][BIASES].data.flat = params[BIASES].flat
                print('Updated layer: ' + MOD + layer)
            except IndexError:
                print('\tNo bias blob for layer: ' + layer)
            except Exception as e:
                print('\tCaught a different exception: ' + str(e))

# Set up the Caffe environment
caffe.set_device(0)
caffe.set_mode_gpu()
#caffe.set_mode_cpu()

if len(sys.argv) != 4:
    print('Error! Invalid number of command-line arguments: ' + str(len(sys.argv)))
    print('Usage: ' + sys.argv[0] + ' source.prototxt source.caffemodel dest.prototxt')
    sys.exit(1)

SPROTO = sys.argv[1]  # source network definition
SMODEL = sys.argv[2]  # source pre-trained weights
DEST = sys.argv[3]    # combined (destination) network definition

# Read in the networks
trained_net = caffe.Net(SPROTO, SMODEL, caffe.TRAIN)
update_net = caffe.Net(DEST, caffe.TRAIN)

modalities = ['ARR1', 'ARR2', 'ARRAY1', 'ARRAY2', 'HEAD0', 'HEAD1', 'HEAD2', 'HEAD3', 'LAPEL0', 'LAPEL1', 'LAPEL2', 'LAPEL3']
for mod in modalities:
    updateModality(mod, trained_net, update_net)

print('Saving the model...')
update_net.save('updated-' + DEST + '.caffemodel')
In my case, I was using a single source network for the pre-training component, but I used multiple copies of it to update a single network with duplicates of certain layer parameters. My application has multiple input sources, and I wanted to use a pre-trained model for each input modality type. I did this across the different modalities defined in the "modalities" list. For the source, I used GoogLeNet as the reference network. For the NET_C version of my updates (update_net in the code), I prepended the names in the "modalities" list to the layer names of the network being updated.
Using the "updateModality" function, the update_net (NET_C) layers get updated for the given modality.
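The naming convention can be illustrated with plain strings (these layer names are made up; in a real network they come from the PROTOTXT definitions):

```python
# Hypothetical layer names from a source network and a destination
# network that follows the modality-prefix convention.
source_layers = ['conv1', 'fc6']
dest_layers = ['ARR1_conv1', 'ARR1_fc6', 'HEAD0_conv1', 'HEAD0_fc6']

# For one modality, find the destination layers that will receive
# the source layer's parameters.
modality = 'ARR1'
matches = [modality + '_' + layer for layer in source_layers
           if modality + '_' + layer in dest_layers]
print(matches)  # ['ARR1_conv1', 'ARR1_fc6']
```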
This method could also be used to combine the layers (mathematically) if that is what you need to do. Just find the layers you want to combine (in NET_A and NET_B) and then perform your operations on them. Then save the results in the appropriate layer in NET_C.
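For example, one possible operation is a weighted blend of the same layer from the two source networks (again with NumPy arrays standing in for the real parameter blobs; the shapes and weighting factor are made up):

```python
import numpy as np

# Hypothetical weight arrays for the same layer in NET_A and NET_B.
a_weights = np.full((3, 3), 2.0)
b_weights = np.full((3, 3), 6.0)

# Weighted blend: here 75% NET_A and 25% NET_B. With real networks the
# result would be written into net_c.params[layer][0].data before saving.
alpha = 0.75
c_weights = alpha * a_weights + (1.0 - alpha) * b_weights
print(c_weights[0, 0])  # 3.0
```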
After you are finished with your layer updates, you would just need to run the fine-tuning operations.
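For reference, fine-tuning is normally launched with the caffe binary, pointing it at a solver for NET_C and at the combined weights file (the file names below are placeholders for your own files):

```
# Fine-tune NET_C starting from the combined weights
./build/tools/caffe train \
    --solver=solver.prototxt \
    --weights=updated-net_c.caffemodel
```

The solver's `net:` field should reference the NET_C PROTOTXT, and a lower `base_lr` than the original training run is common when fine-tuning.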
Hopefully, this helps.
Patrick