How to set "reshape" parameter of FromProto

Gavin Hackeling

Jun 7, 2015, 4:47:21 PM
to caffe...@googlegroups.com
Hi all,

I am attempting to train a fully convolutional network for semantic segmentation. I am following this example: https://gist.github.com/longjon/ac410cad48a088710872#file-readme-md.
My labels LMDB has the dimensions (N, 1, 500, 500). The segmentation annotations have one channel composed of integer labels in [0, 9]. Using labels with 3 channels seems incorrect, and produced a different error. The corresponding BGR images LMDB has dimensions (N, 3, 500, 500). 
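For reference, a label LMDB with these dimensions can be written roughly as follows (only a sketch assuming pycaffe and the lmdb package, with a placeholder path and random placeholder data, not my actual conversion script):

# Sketch: writing single-channel integer label maps to an LMDB.
# The `labels` array and the database path are placeholders.
import lmdb
import numpy as np
import caffe

# N label maps of shape (1, 500, 500), values in [0, 9]
labels = np.random.randint(0, 10, size=(4, 1, 500, 500)).astype(np.uint8)

env = lmdb.open('path/to/labels_lmdb', map_size=int(1e9))
with env.begin(write=True) as txn:
    for i, label in enumerate(labels):
        datum = caffe.io.array_to_datum(label)  # keeps the (1, 500, 500) shape
        txn.put('{:08d}'.format(i).encode('ascii'), datum.SerializeToString())
env.close()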

Initialization fails with the following:

F0607 16:27:55.619662 11724 blob.cpp:472] Check failed: ShapeEquals(proto) shape mismatch (reshape not set)

The following are the actual and expected dimensions of the blob:

I0607 16:27:55.619583 11724 blob.cpp:382] Blob num 1
I0607 16:27:55.619591 11724 blob.cpp:383] Blob channels 1
I0607 16:27:55.619601 11724 blob.cpp:384] Blob height 4096
I0607 16:27:55.619609 11724 blob.cpp:385] Blob width 25088
I0607 16:27:55.619618 11724 blob.cpp:387] Expected num 4096
I0607 16:27:55.619626 11724 blob.cpp:388] Expected channels 512
I0607 16:27:55.619635 11724 blob.cpp:389] Expected height 7
I0607 16:27:55.619644 11724 blob.cpp:390] Expected width 7

From this, it appears that I just need to reshape the blob. My questions are:
1) How do I specify that the blob should be reshaped?
2) Are the dimensions of my labels LMDB expected?

Thanks,
Gavin

Ihsan Ullah

Jun 7, 2015, 7:07:04 PM
to caffe...@googlegroups.com
I think in the link you gave, they have an input format of:
input: 'data'
input_dim: 1
input_dim: 3
input_dim: 500
input_dim: 500

whereas yours is
input_dim: N
input_dim: 3
input_dim: 500
input_dim: 500
I think you should first follow their pattern and see.
regards
ihsan

Gavin Hackeling

Jun 7, 2015, 11:46:10 PM
to Ihsan Ullah, caffe...@googlegroups.com

Thanks Ihsan. I think you are referring to the deploy.prototxt. I am trying to train the network, so I am working with the train_val.prototxt instead. 
My Data layer looks like the following, and sets the batch size to 1.

layer {
  name: "data"
  type: "Data"
  top: "data"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_value: 104.00699
    mean_value: 116.66877
    mean_value: 122.67892
  }
  data_param {
    source: "path/to/my/lmdb"
    batch_size: 1
    backend: LMDB
  }
}
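The labels LMDB is read by a second Data layer of the same form; roughly the following (the source path here is a placeholder):

layer {
  name: "label"
  type: "Data"
  top: "label"
  include {
    phase: TRAIN
  }
  data_param {
    source: "path/to/my/labels_lmdb"
    batch_size: 1
    backend: LMDB
  }
}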

In any case, I have modified my build to reshape layers that do not match the expected size. This seems to work. The value of the loss function jumps around a lot; I am training on an extremely small data set just to check that the process works. So far the network predicts 0 for all pixels of the test image; I suspect that this is due to training for only 10k iterations on an inadequate data set.

I0607 19:12:19.261982 15288 solver.cpp:214] Iteration 10080, loss = 9034.48
I0607 19:12:19.262035 15288 solver.cpp:229]     Train net output #0: loss = 16185.3 (* 1 = 16185.3 loss)
I0607 19:12:19.262049 15288 solver.cpp:489] Iteration 10080, lr = 1e-10
I0607 19:12:31.258111 15288 solver.cpp:214] Iteration 10100, loss = 7571.7
I0607 19:12:31.258159 15288 solver.cpp:229]     Train net output #0: loss = 19408.8 (* 1 = 19408.8 loss)
I0607 19:12:31.258173 15288 solver.cpp:489] Iteration 10100, lr = 1e-10
I0607 19:12:43.530395 15288 solver.cpp:214] Iteration 10120, loss = 8542.87
I0607 19:12:43.530436 15288 solver.cpp:229]     Train net output #0: loss = 13480.4 (* 1 = 13480.4 loss)
I0607 19:12:43.530450 15288 solver.cpp:489] Iteration 10120, lr = 1e-10
I0607 19:12:55.746253 15288 solver.cpp:214] Iteration 10140, loss = 5557.32
I0607 19:12:55.746295 15288 solver.cpp:229]     Train net output #0: loss = 0.00578489 (* 1 = 0.00578489 loss)
I0607 19:12:55.746309 15288 solver.cpp:489] Iteration 10140, lr = 1e-10
I0607 19:13:07.895169 15288 solver.cpp:214] Iteration 10160, loss = 7662.01
I0607 19:13:07.895215 15288 solver.cpp:229]     Train net output #0: loss = 14601.2 (* 1 = 14601.2 loss)
I0607 19:13:07.895231 15288 solver.cpp:489] Iteration 10160, lr = 1e-10
I0607 19:13:20.042099 15288 solver.cpp:214] Iteration 10180, loss = 6612.53
I0607 19:13:20.042146 15288 solver.cpp:229]     Train net output #0: loss = 9599.04 (* 1 = 9599.04 loss)
I0607 19:13:20.042161 15288 solver.cpp:489] Iteration 10180, lr = 1e-10
I0607 19:13:32.172777 15288 solver.cpp:214] Iteration 10200, loss = 9919.12
I0607 19:13:32.172821 15288 solver.cpp:229]     Train net output #0: loss = 0.565339 (* 1 = 0.565339 loss)
I0607 19:13:32.172834 15288 solver.cpp:489] Iteration 10200, lr = 1e-10
I0607 19:13:44.346729 15288 solver.cpp:214] Iteration 10220, loss = 10103.3
I0607 19:13:44.346784 15288 solver.cpp:229]     Train net output #0: loss = 14821.5 (* 1 = 14821.5 loss)
I0607 19:13:44.346798 15288 solver.cpp:489] Iteration 10220, lr = 1e-10
I0607 19:13:56.368582 15288 solver.cpp:214] Iteration 10240, loss = 8975.04
I0607 19:13:56.368634 15288 solver.cpp:229]     Train net output #0: loss = 0.00257222 (* 1 = 0.00257222 loss)
I0607 19:13:56.368646 15288 solver.cpp:489] Iteration 10240, lr = 1e-10
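
For reference, the modification is essentially to pass reshape = true where Net::CopyTrainedLayersFrom copies the pretrained weights into the target blobs in net.cpp; the check that otherwise fails is the one in Blob::FromProto in blob.cpp. Roughly the following sketch (the exact code differs between Caffe versions):

// net.cpp, inside Net<Dtype>::CopyTrainedLayersFrom(const NetParameter& param):
// copy each pretrained blob into the corresponding target blob, letting
// FromProto reshape the target instead of CHECK-failing on a shape mismatch.
for (int j = 0; j < target_blobs.size(); ++j) {
  const bool kReshape = true;  // reshape the target blob to the proto's dimensions
  target_blobs[j]->FromProto(source_layer.blobs(j), kReshape);
}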

Is reshaping the blob the right approach?




Christopher Catton

Jun 7, 2015, 11:59:22 PM
to caffe...@googlegroups.com
I think you're working on the same thing as I am, but you're getting a different error. Any chance you could post the code you're using to create the LMDBs for the labels and data?

Also, I don't think the label data should use only one channel. The different colors represent different class labels, I believe.

Gavin Hackeling

Jun 8, 2015, 12:42:12 AM
to Christopher Catton, caffe...@googlegroups.com

The script I am using is broadly the same as the example Evan posted in a PR and in a few emails; I don't have access to it at the moment.
I understood the network to be predicting a dense matrix of class labels that are represented by integers. The labels can be mapped to colors for visualization. I did test using label data with three channels; Caffe warned that the number of predictions was one third the number of labels. This seemed to corroborate using one channel for the labels. Maybe someone familiar with this work can weigh in.
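By "mapped to colors for visualization" I mean something like the following, with an arbitrary palette (just an illustration, not part of the training pipeline):

# Sketch: turn an (H, W) matrix of integer class labels into an RGB image.
import numpy as np

# one RGB triple per class index 0..9; the colors are arbitrary
palette = np.array([
    [0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0], [0, 0, 128],
    [128, 0, 128], [0, 128, 128], [128, 128, 128], [64, 0, 0], [192, 0, 0],
], dtype=np.uint8)

label_map = np.random.randint(0, 10, size=(500, 500))  # placeholder prediction
color_image = palette[label_map]                        # shape (500, 500, 3)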

Gavin


Evan Shelhamer

Jun 23, 2015, 1:56:13 PM
to Gavin Hackeling, Christopher Catton, caffe...@googlegroups.com
The softmax loss expects the labels to have a single channel, where the value at each position is the class index. In Caffe's N x C x H x W blob order, the segmentation ground truth for a single image of height H and width W would therefore have shape 1 x 1 x H x W.

Different losses expect different label shapes. For instance, Euclidean loss and sigmoid cross-entropy loss expect the prediction and label to have the same shape. The softmax loss label shape is the same as the prediction's, except that the channel dimension is a singleton.
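
In prototxt terms, for dense prediction this looks like the following (the bottom blob names are just examples):

# SoftmaxWithLoss: the score blob is N x C x H x W and the label blob is
# N x 1 x H x W, holding integer class indices.
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "score"   # e.g. 1 x 10 x 500 x 500
  bottom: "label"   # e.g. 1 x 1 x 500 x 500
  top: "loss"
}
# EuclideanLoss and SigmoidCrossEntropyLoss instead require the prediction
# and label blobs to have identical shapes.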

Hope that helps,

Evan Shelhamer
