Another topic about multi-label


mprl

May 26, 2016, 8:55:05 AM
to Caffe Users
Hello,
I'm currently working on multi-label classification with Caffe. I saw that there are many topics about it and spent a lot of time reading them, but I'm still not clear on how it works.

In my case, I have to classify images that each show two letters. To do that, I think I need a CNN with 52 outputs (1 to 26 for the first letter, 27 to 52 for the second letter; if the image to classify is "AA", I want outputs 1 and 27 to be activated).
It's exactly this, but with Caffe (C++ only, if possible) and two characters (letters only).
It seems that the easiest way to proceed is to use an HDF5_DATA layer, which supports multi-label, and a SIGMOID_CROSS_ENTROPY_LOSS layer, for the same reason.
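
The 52-output encoding described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `EncodeTwoLetters` is hypothetical, the indices are 0-based (so "AA" sets entries 0 and 26 rather than 1 and 27), and the resulting vector is what one would store as the label row for each image in the HDF5 file.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Number of classes per letter position (A-Z).
constexpr int kLettersPerSlot = 26;

// Encode a two-letter string such as "AA" as a 52-element multi-hot vector:
// indices 0-25 mark the first letter, indices 26-51 mark the second.
// (Hypothetical helper; the post counts outputs from 1, this sketch from 0.)
std::vector<float> EncodeTwoLetters(const std::string& text) {
  assert(text.size() == 2);
  std::vector<float> label(2 * kLettersPerSlot, 0.0f);
  for (int slot = 0; slot < 2; ++slot) {
    const int c = std::toupper(static_cast<unsigned char>(text[slot]));
    assert(c >= 'A' && c <= 'Z');
    label[slot * kLettersPerSlot + (c - 'A')] = 1.0f;
  }
  return label;
}
```

With this layout every label row contains exactly two ones, one in each group of 26.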

But it also seems that there is no "official" accuracy layer that supports multi-label.
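
Without a built-in layer, one option is to compute the accuracy outside the net from the 52 raw outputs. The sketch below is not a Caffe layer; `ArgMaxInSlot` and `BothLettersAccuracy` are hypothetical helper names, and it assumes the ground truth is available as a pair of 0-based letter indices per image.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

constexpr std::size_t kLettersPerSlot = 26;

// Index (0-25) of the highest score within one 26-wide group of outputs.
int ArgMaxInSlot(const std::vector<float>& scores, std::size_t slot) {
  const auto begin = scores.begin() + slot * kLettersPerSlot;
  const auto best = std::max_element(begin, begin + kLettersPerSlot);
  return static_cast<int>(best - begin);
}

// Fraction of images for which BOTH predicted letters match the ground truth.
// scores[i] holds the 52 network outputs for image i;
// truth[i] holds the (first, second) letter indices, each in 0-25.
double BothLettersAccuracy(const std::vector<std::vector<float>>& scores,
                           const std::vector<std::pair<int, int>>& truth) {
  assert(scores.size() == truth.size() && !scores.empty());
  std::size_t correct = 0;
  for (std::size_t i = 0; i < scores.size(); ++i) {
    assert(scores[i].size() == 2 * kLettersPerSlot);
    if (ArgMaxInSlot(scores[i], 0) == truth[i].first &&
        ArgMaxInSlot(scores[i], 1) == truth[i].second) {
      ++correct;
    }
  }
  return static_cast<double>(correct) / static_cast<double>(scores.size());
}
```

Counting an image as correct only when both letters match is a design choice; a per-letter accuracy would count each group separately.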
Another thing that bothers me is the SLICE layer.
So far, I understand that my net must start like this:

# Training data. "source" must be a text file listing the .h5 files to read.
layer {
  name: "image"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "train_hdf5"
    batch_size: 256
  }
}
# Test data, same layout as the training layer.
layer {
  name: "image"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  hdf5_data_param {
    source: "test_hdf5"
    batch_size: 256
  }
}
# Split the label blob along axis 1, after its first column.
layer {
  name: "slice1"
  type: "Slice"
  bottom: "label"
  top: "first_letter"
  top: "second_letter"
  slice_param {
    axis: 1
    slice_point: 1
  }
}

The SLICE layer is there to split the label. The first blob will be the label of the first letter, the second the label of the second letter.
The rest of my net will operate on the data blob, like a standard single-label CNN.
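
To make the effect of the split concrete, here is a small sketch of what slicing along axis 1 does to a batch of label rows. It is not Caffe code; `SliceAlongAxis1` and `SlicedLabels` are hypothetical names. It assumes the labels are stored row by row, with two columns per image (one class index per letter), which is the layout that slice_point: 1 implies. A 52-column multi-hot label would instead be cut in half with a slice point of 26.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct SlicedLabels {
  std::vector<float> first;   // columns [0, slice_point) of every row
  std::vector<float> second;  // columns [slice_point, width) of every row
};

// Split a row-major (num_rows x width) label matrix into two matrices
// along the column axis, keeping the row order unchanged.
SlicedLabels SliceAlongAxis1(const std::vector<float>& label,
                             std::size_t width, std::size_t slice_point) {
  assert(width > 0 && label.size() % width == 0);
  assert(slice_point > 0 && slice_point < width);
  SlicedLabels out;
  for (std::size_t row = 0; row < label.size() / width; ++row) {
    const auto begin = label.begin() + row * width;
    out.first.insert(out.first.end(), begin, begin + slice_point);
    out.second.insert(out.second.end(), begin + slice_point, begin + width);
  }
  return out;
}
```

For two images labelled "AA" and "BZ" stored as rows (0, 0) and (1, 25), the first output holds 0 and 1 and the second holds 0 and 25.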

But what do I have to do with these two label blobs? Something like this?

layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 52
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "loss"
  type: "SigmoidCrossEntropyLoss"
  bottom: "ip2"
  bottom: "first_letter"
  bottom: "second_letter"
  top: "loss"
}

I don't really get why this label split is required. If I do this, will my first 26 outputs be tied to the first letter and the others to the second letter?
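
One way to reason about this question is to look at what a sigmoid cross-entropy loss computes. The sketch below is a plain C++ illustration, not Caffe's implementation, and `SigmoidCrossEntropy` is a hypothetical name. It assumes a single 52-element 0/1 target per image. Because the loss compares output i only with target i, outputs 0-25 are tied to the first letter simply because target positions 0-25 encode it, so under that assumption no split of the label is needed. As far as I know, Caffe's sigmoid cross-entropy loss layer expects exactly two bottoms (scores and targets), which would also rule out passing the two sliced blobs separately.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Element-wise sigmoid cross-entropy, summed over all outputs of one image.
// logits: raw network outputs (before the sigmoid); targets: 0/1 labels.
// Uses the numerically stable form max(x, 0) - x * t + log(1 + exp(-|x|)).
double SigmoidCrossEntropy(const std::vector<float>& logits,
                           const std::vector<float>& targets) {
  assert(logits.size() == targets.size());
  double loss = 0.0;
  for (std::size_t i = 0; i < logits.size(); ++i) {
    const double x = logits[i];
    const double t = targets[i];
    loss += std::max(x, 0.0) - x * t + std::log1p(std::exp(-std::fabs(x)));
  }
  return loss;
}
```

The loss goes toward zero when the two target positions receive large positive scores and the other 50 receive large negative ones.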

To sum up, to solve my problem, I have to:
- Manually preprocess the images, as the HDF5_DATA layer cannot do it
- Split my images into two HDF5 databases, one for training, the other for testing
- Create my net (so, solve this split layer question and find a multi-label accuracy layer)
- Train my net like any other net
- Write a variant of cpp_classification to test a single image (take the max of the first 26 outputs to predict the first letter and the max of the others to predict the second letter)
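
The last step above, picking one letter from each group of 26 outputs, can be sketched independently of Caffe. `DecodeTwoLetters` is a hypothetical helper; it assumes the 52 scores have already been copied out of the net's output blob into a vector, with the first letter's scores in positions 0-25.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

constexpr std::size_t kLettersPerSlot = 26;

// Turn 52 output scores into a two-letter prediction by taking the
// highest-scoring entry in each 26-wide group.
std::string DecodeTwoLetters(const std::vector<float>& scores) {
  assert(scores.size() == 2 * kLettersPerSlot);
  std::string text;
  for (std::size_t slot = 0; slot < 2; ++slot) {
    const auto begin = scores.begin() + slot * kLettersPerSlot;
    const auto best = std::max_element(begin, begin + kLettersPerSlot);
    text.push_back(static_cast<char>('A' + (best - begin)));
  }
  return text;
}
```

Since only the position of the maximum matters within each group, the same function works on raw scores or on sigmoid outputs.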

And that's all. Are there any other difficulties that I've forgotten?

Thanks for your help!

Maxime

mprl

May 26, 2016, 11:49:06 AM
to Caffe Users
I've found this net for multi-label classification.
It looks perfect for my problem, but it seems a bit weird to me. How can it work? Can Caffe's standard training tool deal with two different accuracy and loss layers?