Training using RGB and Depth images (CONCAT layer)


Arjun Sharma

Oct 6, 2014, 5:48:17 AM
to caffe...@googlegroups.com
I created leveldb for the depth images from the following dataset


Although the leveldb was created successfully, when I load it using the data layer, it shows 3 channels instead of 1. (The depth image should have only 1 channel, or am I wrong?)

Also, when using the concat layer and concatenating the rgb and the depth layer I get 101 Test scores instead of 2.

So I wanted to know whether the following is the correct way to concat 2 data layers channel wise:-

name: "CaffeNet"
layers {
  name: "data1"
  type: DATA
  top: "data1"
  top: "label1"
  data_param {
    source: "rgb_train1_leveldb"
    mean_file: "../../data/rgbd/rgb_mean1.binaryproto"
    batch_size: 100
    crop_size: 144
    mirror: true
  }
}
layers {
  name: "data2"
  type: DATA
  top: "data2"
  top: "label2"
  data_param {
    source: "depth_train1_leveldb"
    mean_file: "../../data/rgbd/depth_mean1.binaryproto"
    batch_size: 100
    crop_size: 144
    mirror: true
  }
}
layers {
  name: "rgbd"
  type: CONCAT
  concat_param {
    concat_dim: 1
  }
  bottom: "data1"
  bottom: "data2"
  top: "rgbd"
}
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "rgbd"
  top: "conv1"


and so on. Finally, in the softmax loss layer in the training file (and the accuracy layer in the test file), I use label1 (label1 and label2 are obviously the same).
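
For reference, the shape arithmetic behind concat_dim: 1 can be sketched in plain Python. This assumes Caffe's N x C x H x W blob layout; concat_shape is a hypothetical helper written for illustration, not a Caffe function. The batch size (100) and crop size (144) are taken from the prototxt above.

```python
def concat_shape(shapes, axis=1):
    """Shape produced by concatenating blobs along `axis`.

    Every other axis must match across inputs. Hypothetical helper,
    written only to illustrate the arithmetic.
    """
    first = shapes[0]
    total = 0
    for shape in shapes:
        if len(shape) != len(first):
            raise ValueError("blobs have different numbers of axes")
        for i, (a, b) in enumerate(zip(first, shape)):
            if i != axis and a != b:
                raise ValueError("axis %d differs: %d vs %d" % (i, a, b))
        total += shape[axis]
    out = list(first)
    out[axis] = total
    return tuple(out)


rgb = (100, 3, 144, 144)    # batch, channels, height, width
depth = (100, 1, 144, 144)  # a true single-channel depth blob
print(concat_shape([rgb, depth]))  # (100, 4, 144, 144)
```

If the depth leveldb really holds 3 channels, the same call yields 6 channels instead of 4, so the first convolution would see a different input depth than intended.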

Thanks,
Arjun

Arjun Sharma

Oct 8, 2014, 2:40:19 AM
to caffe...@googlegroups.com
The depth images are of type uint16 and are single channel. Why does Caffe create a leveldb with 3 channels then?
Can Caffe not handle uint16 format?
Any help will be greatly appreciated.

Thanks,
Arjun
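
One way to confirm what is stored on disk, independent of Caffe or OpenCV, is to read the image header directly. The sketch below assumes the depth images are PNG files, which the thread does not state; png_layout is a hypothetical helper that only parses the IHDR chunk.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Samples per pixel implied by the PNG colour type field
# (type 3 is a palette image: one index per pixel).
CHANNELS = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}


def png_layout(header_bytes):
    """Return (width, height, bit_depth, channels) from the first bytes of a PNG."""
    if header_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Each chunk starts with a 4-byte length and a 4-byte type; IHDR comes first.
    _length, chunk_type = struct.unpack(">I4s", header_bytes[8:16])
    if chunk_type != b"IHDR":
        raise ValueError("IHDR chunk missing")
    width, height, bit_depth, color_type = struct.unpack(">IIBB", header_bytes[16:26])
    return width, height, bit_depth, CHANNELS[color_type]
```

Reading the first 32 bytes of a file opened in binary mode and passing them to png_layout is enough; a 16-bit single-channel depth map would report a bit depth of 16 and 1 channel.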

Arjun Sharma

Oct 8, 2014, 4:25:17 AM
to caffe...@googlegroups.com
Hi,
The depth image file must be opened in OpenCV using type IPL_DEPTH_16U. Otherwise, it opens as a 3-channel file with all zeroes (when opened using cv2.imread in Python OpenCV). Can anyone point me to the file where I can make this change?

Thanks,
Arjun

On Monday, 6 October 2014 15:18:17 UTC+5:30, Arjun Sharma wrote:

Arjun Sharma

Oct 8, 2014, 7:02:03 AM
to caffe...@googlegroups.com
I changed the file io.cpp and set the flag in cv::imread to CV_LOAD_IMAGE_ANYDEPTH | CV_LOAD_IMAGE_ANYCOLOR,
and set the number of channels to 1. Can anyone proficient in OpenCV tell me whether this is the right thing to do?

On Monday, 6 October 2014 15:18:17 UTC+5:30, Arjun Sharma wrote:

Arjun Sharma

Oct 8, 2014, 7:40:26 AM
to caffe...@googlegroups.com
This seems to have worked for me. Hope this will help others trying to work with uint16 images.

Xiaojiang Peng

Mar 26, 2015, 11:32:53 AM
to caffe...@googlegroups.com
I guess you can just set is_color to 0 in the prototxt.

Xiaojiang Peng

Mar 26, 2015, 11:46:33 AM
to caffe...@googlegroups.com
Do you know whether Caffe samples the same index from the RGB and depth databases before concatenation?

Arjun Sharma

Mar 27, 2015, 1:42:07 AM
to caffe...@googlegroups.com
Hi,

I don't know for sure, but with the same batch size (and provided that you had the same order of the RGB and Depth images when creating the leveldb), I think the CONCAT layer should concatenate the correct images together.

Regarding the is_color option, I am working with an older version of Caffe and I don't think this option is supported for me. In any case, I think setting is_color to 0 will treat the image as grayscale (use the OpenCV option to read grayscale images in the imread function) and this may not be what you want.
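
The ordering condition mentioned above can be made explicit when preparing the image lists. The sketch below assumes each database is built from a text file of "path label" lines; paired_listfiles and the sample tuples are hypothetical. The point is that any shuffling happens once, on the paired samples, so entry i in both lists refers to the same scene.

```python
import random


def paired_listfiles(samples, seed=0):
    """Build RGB and depth list-file lines in one shared order.

    `samples` is a list of (rgb_path, depth_path, label) tuples.
    Shuffling the tuples once keeps the two lists aligned index by index.
    """
    order = list(samples)
    random.Random(seed).shuffle(order)
    rgb_lines = ["%s %d" % (rgb, label) for rgb, _depth, label in order]
    depth_lines = ["%s %d" % (depth, label) for _rgb, depth, label in order]
    return rgb_lines, depth_lines
```

Writing each returned list to its own file and creating both databases without any further shuffling keeps the pairs aligned, provided both data layers also use the same batch size.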

Xiaojiang Peng

Mar 27, 2015, 4:57:33 AM
to caffe...@googlegroups.com
Thanks, Arjun. So how should I use this model for prediction with the Python interface? Is there a function for multiple inputs, given the right deploy prototxt? I think you have already managed to do this.

Saumya S

Jun 23, 2015, 1:23:31 AM
to caffe...@googlegroups.com
Hello,

I am very new to Caffe. Could anybody please guide me on how to proceed with classification in the Python interface using this model (which is created by concatenating the data layers during training)?

Thanks,
Saumya
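
A possible approach with pycaffe, sketched under several assumptions: the deploy prototxt declares two input blobs named data1 and data2 and an output blob named prob (all three names are assumptions, not taken from this thread), and both arrays are already cropped and mean-subtracted to match the blob shapes. classify_rgbd is a hypothetical helper; net is expected to be a loaded caffe.Net.

```python
def classify_rgbd(net, rgb, depth,
                  rgb_blob="data1", depth_blob="data2", out_blob="prob"):
    """Fill both input blobs, run a forward pass, and return the output blob.

    `rgb` and `depth` must already have the shapes of the input blobs
    (N x 3 x H x W and N x 1 x H x W). The blob names are assumptions.
    """
    net.blobs[rgb_blob].data[...] = rgb
    net.blobs[depth_blob].data[...] = depth
    outputs = net.forward()  # dict mapping output blob names to arrays
    return outputs[out_blob]
```

Because the network concatenates the two blobs internally, nothing else changes on the Python side: the only difference from a single-input model is that two blobs are filled before calling forward.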

evera...@gmail.com

Jun 23, 2015, 8:37:01 PM
to caffe...@googlegroups.com
Hi Saumya,

I have run into the same problem as you. Did you solve it? Can you give me some advice on how to solve it?

On Tuesday, June 23, 2015 at 1:23:31 PM UTC+8, Saumya S wrote:

Syed Umar Amin

Apr 3, 2016, 4:41:17 AM
to Caffe Users
How can I input a depth image as a colour image?

I am getting the following error when I try to initialize my CaffeNet with depth images and weights from a pretrained ImageNet model.

untrained_sign_net = caffe.Net(sign_net(train=False, subset='train'),
                                weights, caffe.TEST)
untrained_sign_net.forward()
sign_data_batch = untrained_sign_net.blobs['data'].data.copy()
sign_label_batch = np.array(untrained_sign_net.blobs['label'].data, dtype=np.int32)

It's giving me the error below:

Could not open or find file /home/umar/caffe/data/sign/images/S01C01_0001.jpeg

F0403 11:10:06.664624  4562 image_data_layer.cpp:138] Check failed: cv_img.data Could not load /home/umar/caffe/data/sign/images/S01C01_0001.jpeg
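
Since the check fails while loading an image path, a quick way to narrow this down is to verify every path in the image list before starting Caffe. A minimal sketch, assuming the list file has one "path label" entry per line; missing_images, list_path, and root are hypothetical names.

```python
import os


def missing_images(list_path, root=""):
    """Return (line_number, full_path) for every listed image that is not on disk."""
    missing = []
    with open(list_path) as f:
        for line_no, line in enumerate(f, 1):
            line = line.strip()
            if not line:
                continue
            # Split from the right so a path containing spaces stays intact.
            image_path = line.rsplit(None, 1)[0]
            full_path = os.path.join(root, image_path)
            if not os.path.isfile(full_path):
                missing.append((line_no, full_path))
    return missing
```

An empty result means every listed file exists, in which case the next things to check are file permissions and whether the files are valid images.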

