Preprocess Only One Input Data Blob


Suleman K

Feb 17, 2016, 3:29:38 AM
to Caffe Users
I have an input layer in my convnet:

layer {
  name: "tinynet"
  type: "Data"
  top: "data"
  top: "groundtruth"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}

layer {
  name: "tinynet"
  type: "Data"
  top: "data"
  top: "groundtruth"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
    crop_size: 227
  }
  data_param {
    source: "mnist_test_lmdb"
    batch_size: 64
    backend: LMDB
  }
}


Now I want to crop only the "groundtruth" blob, not the "data" blob, and only at test time. My convnet outputs an image, and I want to compute a pixelwise Euclidean distance between the convnet output (which is slightly smaller than the input) and the ground truth. Is there a way to crop the groundtruth blob before the Euclidean distance is computed?

Jan C Peters

Feb 17, 2016, 5:51:00 AM
to Caffe Users
I don't think that is possible. How can you even use image-sized labels with LMDB? I don't think that is possible either. AFAIK Data layers can only provide a single, zero-based integer class label per sample, and cropping that wouldn't make any sense (what would it even mean?). I think the transform params are only applied to the first (data) blob, but I am not sure of that.

Jan

Suleman

Feb 17, 2016, 8:21:22 PM
to Caffe Users
But what if my label is actually a ground truth image? Caffe seems to support that through the Euclidean loss (which is the MSE between the two images). Does LMDB not support that?

Jan C Peters

Feb 18, 2016, 7:39:42 AM
to Caffe Users
That is what I am saying: sure, Caffe does support that, in the way you described for example, but loading labels other than single class values from LMDB is impossible (since the "Datum" used to store a sample in the DB can only hold a scalar int label). But you can do the following: create one LMDB for the input images and another one for the label images. Create two corresponding Data layers in your network. Ignore the second top blob in both cases and just use the first top blob of each Data layer, one as "data" and the other as "label". The rest of the network is the same as usual.

Jan
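
For the test phase, the two-LMDB setup described above could look roughly like the sketch below. It is only an illustration under some assumptions: the LMDB names `test_input_lmdb` and `test_groundtruth_lmdb` and the bottom blob name `output` are hypothetical, both databases are assumed to store their samples in the same order so that pairs stay aligned, and `crop_size: 227` is copied from the original post and would have to match the spatial size of the network output. Because each Data layer has its own `transform_param`, the crop can be given to the ground-truth layer only. In the TEST phase `crop_size` takes a center crop, so this assumes the network output corresponds to the central region of the input.

```
# Input images: no crop.
layer {
  name: "data"
  type: "Data"
  top: "data"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "test_input_lmdb"        # hypothetical name
    batch_size: 64
    backend: LMDB
  }
}

# Ground-truth images: cropped to the (assumed) output size.
layer {
  name: "groundtruth"
  type: "Data"
  top: "groundtruth"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
    crop_size: 227                   # center crop in TEST phase
  }
  data_param {
    source: "test_groundtruth_lmdb"  # hypothetical name
    batch_size: 64
    backend: LMDB
  }
}

# Pixelwise Euclidean loss between the network output and the cropped ground truth.
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "output"                   # hypothetical name of the final blob
  bottom: "groundtruth"
  top: "loss"
  include {
    phase: TEST
  }
}
```

Each Data layer declares a single top, so the unused integer label stored in each Datum is simply not output. The same pattern would not carry over directly to the TRAIN phase with `crop_size`, since crops there are taken at random offsets and the two layers would no longer be spatially aligned.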