How to use multiple input images


Daniela G

Jun 9, 2016, 7:13:28 AM
to Caffe Users
Hello,

I have 5 images that characterize a class, and I want to train on all of them in a multiple-input net (with 5 inputs).

I searched about it and I found this answer: http://stackoverflow.com/questions/27436987/caffe-multiple-input-images

But I don't know:
1. How to change the convert_imageset file to put my images in different channels.
2. Or what is the right way to use multiple data layers. I read about CONCAT layer but I'm not sure what it will do to my inputs.

Can anyone help me?

Thanks in advance.

Daniela G

Jun 14, 2016, 8:28:24 AM
to Caffe Users
Anyone?

Daniel Moodie

Jun 14, 2016, 2:58:12 PM
to Caffe Users
Hello,

You should be able to do what you want with the Stack Overflow question that you linked.
Create an HDF5 database with five datasets, one for each of your images. Populate the database accordingly, then use the HDF5Data layer to load them with something like this:

layer {
  name: "data"
  type: "HDF5Data"
  top: "data1"
  top: "data2"
  top: "data3"
  top: "data4"
  top: "data5"
  top: "label"
  include {
    phase: TEST
  }
  hdf5_data_param {
    source: "databases/test.h5list"
    batch_size: 50
    shuffle: true
  }
}

Then each image will be presented as a separate blob which can be fed into your network.
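A minimal sketch of building such a file with h5py (the shapes, file names, and label values here are illustrative assumptions, not anything Caffe requires; adjust them to your data):

```python
import h5py
import numpy as np

# Illustrative data: 10 samples, each sample made of five 3x28x28 images.
num_samples = 10
labels = np.arange(num_samples, dtype=np.float32)

with h5py.File('train1.h5', 'w') as f:
    for i in range(1, 6):
        imgs = np.random.rand(num_samples, 3, 28, 28).astype(np.float32)
        f.create_dataset('data%d' % i, data=imgs)  # matches top: "data1" ... "data5"
    f.create_dataset('label', data=labels)

# The HDF5Data layer's `source` is a text file listing .h5 paths, one per line.
with open('train.h5list', 'w') as f:
    f.write('train1.h5\n')
```

The dataset names must match the `top` names in the layer definition, and every dataset has to have the same number of samples along the first axis.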

Daniela G

Jun 16, 2016, 5:08:50 AM
to Caffe Users
I wanted to use a Data layer since I read it's faster than HDF5Data.
Isn't there another way to do it?

Daniel Moodie

Jun 16, 2016, 12:16:28 PM
to Caffe Users
Just speculating, can you define two data layers with differently named tops?

Daniela G

Jun 16, 2016, 6:27:15 PM
to Caffe Users
I tried that but it's not working. I don't know if I'm doing something wrong or if it isn't supposed to work.

I did something like this:

n.data1, n.label = L.Data(....)
n.data2 = L.Data(....)

n.conv1_1 = L.Convolution(n.data1, ...)
....
n.pool1_1 = L.Pool(...)

n.conv1_2 = L.Convolution(n.data2, ...)
...
n.pool1_2 = L.Pool(....)

n.conc = L.Concat([n.pool1_1, n.pool1_2])

Daniela G

Jun 20, 2016, 11:54:41 AM
to Caffe Users
I'm still having the same problem.
My full python code to create the net is the following:

    n = caffe.NetSpec()
    n.data1, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb[0],
                              transform_param=dict(scale=1./255), ntop=2)
    n.data2 = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb[1],
                     transform_param=dict(scale=1./255), ntop=1)

    n.conv1_1 = L.Convolution(n.data1, ...)
    n.pool1_1 = L.Pooling(n.conv1_1, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.conv2_1 = L.Convolution(n.pool1_1, ...)
    n.pool2_1 = L.Pooling(n.conv2_1, kernel_size=2, stride=2, pool=P.Pooling.MAX)

    n.conv1_2 = L.Convolution(n.data2, ...)
    n.pool1_2 = L.Pooling(n.conv1_2, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.conv2_2 = L.Convolution(n.pool1_2, ...)
    n.pool2_2 = L.Pooling(n.conv2_2, kernel_size=2, stride=2, pool=P.Pooling.MAX)

    n.conc = L.Concat([n.pool2_1, n.pool2_2])
    n.ip1 = L.InnerProduct(n.conc, ...)
    n.relu1 = L.ReLU(n.ip1, in_place=True)
    n.score = L.InnerProduct(n.relu1, ...)

    n.accuracy = L.Accuracy(n.score, n.label)
    n.loss = L.SoftmaxWithLoss(n.score, n.label)

    return n.to_proto()

But I get the following error:
AttributeError: 'list' object has no attribute '_to_proto'


I also tried the HDF5Data layer but I get the same error. What's wrong with the net?

Daniel Moodie

Jun 20, 2016, 1:27:28 PM
to Caffe Users
Hi Daniela,

I haven't worked with the Python interface, but can you list the output of type(n.data2)?

It looks like L.Data is returning a list of tops which is being stored in n.data2 instead of the individual element. If this is the case, it can be solved with: n.data2 = n.data2[0]

oeb

Jun 22, 2016, 4:23:47 AM
to Caffe Users
You can create an LMDB database with 3*5 channels instead of just 3 (assuming RGB images), then use a Slice layer to split it into multiple output blobs.


Assuming an input np.ndarray imgs, you can create the LMDB in Python like this (adapted from a comment of shelhamer's on GitHub):

# Create LMDB database from images
import lmdb
import numpy as np
import caffe

np.shape(imgs)  # = [numImgs, 15, m, n]
in_db = lmdb.open('lmdbfile', map_size=int(1e9))  # 1 GB
with in_db.begin(write=True) as in_txn:
    for in_idx, in_ in enumerate(imgs):
        im_dat = caffe.io.array_to_datum(in_)
        im_dat.label = someNumber  # Or create a second database and data channel to hold
                                   # labels if more than one dimension is needed here
        in_txn.put('{:0>10d}'.format(in_idx), im_dat.SerializeToString())
in_db.close()



And for the slice layer (not sure how to write it with the Python interface) I guess it would look like:
layer {
    name: "slice_frames"
    type: "Slice"
    bottom: "data"
    top: "data_1"
    top: "data_2"
    top: "data_3"
    top: "data_4"
    top: "data_5"
    slice_param {
        axis: 1
        slice_point: 3
        slice_point: 6
        slice_point: 9
        slice_point: 12
    }
}
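The slice_point values are just running channel offsets: one cut at every multiple of the per-image channel count. A tiny hypothetical helper (not part of Caffe) makes the pattern explicit:

```python
# Hypothetical helper: cut points for a blob that stacks num_imgs images
# of `channels` channels each along the channel axis (axis 1).
def slice_points(num_imgs, channels=3):
    # num_imgs - 1 cuts, at every multiple of `channels`
    return [channels * i for i in range(1, num_imgs)]

print(slice_points(5))  # five RGB images -> [3, 6, 9, 12]
```

So for 5 RGB images the 15-channel blob is cut at 3, 6, 9 and 12, giving five 3-channel tops.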
Hope this helps

Daniela G

Jul 20, 2016, 5:12:12 AM
to Caffe Users
Thank you.

QiJin Y

Aug 22, 2016, 3:05:21 AM
to Caffe Users
No need for the []:

 n.conc = L.Concat([n.pool2_1, n.pool2_2])   
  ==>   
 n.conc = L.Concat(n.pool2_1, n.pool2_2)
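That matches the AttributeError Daniela reported: NetSpec apparently serializes each assigned value by calling its _to_proto method, and a plain Python list has none. A Caffe-free sketch of the failure mode (the Top class below is a stand-in, not Caffe's real class):

```python
# Stand-in for a NetSpec top that knows how to serialize itself.
class Top:
    def _to_proto(self):
        return 'ok'

single = Top()            # what L.Concat(a, b) assigns: one top
wrapped = [Top(), Top()]  # what L.Concat([a, b]) ends up storing: a list

assert single._to_proto() == 'ok'
try:
    wrapped._to_proto()   # AttributeError: 'list' object has no attribute '_to_proto'
except AttributeError as err:
    failure = str(err)
```

Passing the bottoms as separate arguments keeps each assigned value a single top, so serialization succeeds.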

On Monday, June 20, 2016 at 11:54:41 PM UTC+8, Daniela G wrote: