Multi Task Learning


Riddhiman Dasgupta

Oct 22, 2014, 8:57:36 AM
to caffe...@googlegroups.com
I want to implement multi task networks using Caffe. From what I have understood, if I have 3 tasks, I need to split the final shared layer into 3 layers, and send each resulting layer to its own softmax layer and subsequent loss layer. I can give weights to the different loss layers using the loss_weight parameter to get a weighted sum of the loss. However, I am unable to understand what should be the input to the network. Do I simply have a data layer with multiple tops, one for the data and the others for each label corresponding to each task? 
If I have 3 tasks, then each data sample has 3 labels associated with it. In that case, will using something like this suffice:
layers {
  name: "input"
  type: DATA
  top: "data"
  top: "label1"
  top: "label2"
  top: "label3"
  data_param {
    source: "path/to/leveldb"
    batch_size: 100
  }
} 
As far as I understand, LevelDB can store only one label per image. In that case, how do I feed data to a multi-task network, with a label for each task? Do I define multiple data layers, each with the same data but a different label for its task?
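[Editor's note: the "split the final shared layer into three heads" part of the question could look roughly like this in the old-style prototxt used in this thread. This is a hedged sketch, not from the original post; the layer and blob names ("fc_shared", "fc_task1", "label1", num_output values) are illustrative.]

```
# One classifier head per task; the shared "fc_shared" blob feeds all three.
layers {
  name: "fc_task1"
  type: INNER_PRODUCT
  bottom: "fc_shared"
  top: "fc_task1"
  inner_product_param { num_output: 10 }
}
layers {
  name: "loss_task1"
  type: SOFTMAX_LOSS
  bottom: "fc_task1"
  bottom: "label1"
  loss_weight: 0.5   # per-task weight for the weighted sum of losses
}
# ... analogous fc_task2/loss_task2 and fc_task3/loss_task3 heads ...
```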

lixin7...@gmail.com

Oct 27, 2014, 10:10:34 AM
to caffe...@googlegroups.com
Hi!
I want to build a multi-task network too, but I have no idea how to actually implement it. Have you solved this?

On Wednesday, October 22, 2014 at 8:57:36 PM UTC+8, Riddhiman Dasgupta wrote:

Evan Shelhamer

Oct 28, 2014, 12:28:01 AM
to Riddhiman Dasgupta, caffe...@googlegroups.com
> As far as I realised, leveldb can take only one label associated with an image. In that case, how do I send data to a multi task network, with labels for each task? Do I define multiple data layers? Each data layer will have the same data, but different labels for different tasks?

The general solution is to define a data layer for each input, whether data or label, where each data layer has a single top (called whatever you like).

Some inputs can be combined. For instance, if each label is scalar (like a class label) then they can be combined into a vector for a single top of a DATA or HDF5 layer. If instead your input is an image and the label is an image (e.g. predicting a depth image from the RGB image) then both the "data" and "label" will have their own DATA layer and LMDB.
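[Editor's note: a hedged sketch of the scalar-label case Evan describes, in the thread's old-style prototxt. The idea: pack all three labels into an N×3 array in the HDF5 file, read them through one top, then split that top with a SLICE layer so each task's loss gets its own label blob. File paths and layer names here are illustrative.]

```
layers {
  name: "data"
  type: HDF5_DATA
  top: "data"
  top: "labels"      # shape N x 3: one column per task
  hdf5_data_param {
    source: "path/to/train_h5_list.txt"   # text file listing the .h5 files
    batch_size: 100
  }
}
layers {
  name: "slice_labels"
  type: SLICE
  bottom: "labels"
  top: "label1"
  top: "label2"
  top: "label3"
  slice_param { slice_dim: 1 }   # split along the channel axis
}
```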

Evan Shelhamer

dong zhihong

Oct 30, 2014, 10:57:59 AM
to caffe...@googlegroups.com, riddhiman...@gmail.com, shel...@eecs.berkeley.edu
If the input is an image and the label is an image, translated into prototxt:
layers {
  name: "source_image"
  type: DATA
  top: "source_image"
  data_param {
    source: "path/to/sourceleveldb"
    batch_size: 100
  }
}

layers {
  name: "label_image"
  type: DATA
  top: "label_image"
  data_param {
    source: "path/to/labelleveldb"
    batch_size: 100
  }
} 

Is this right?
In this example, what is the output? Does it predict whether the source image and label image belong to the same class, i.e. 0 or 1?



On Tuesday, October 28, 2014 at 12:28:01 PM UTC+8, Evan Shelhamer wrote:

anigma

Sep 16, 2015, 8:11:21 AM
to Caffe Users

Has anyone succeeded in doing multi-task learning based on the solution proposed by Evan Shelhamer?
I like it a lot, but something bad may potentially happen with this approach.
For example, if my data are audio and images, and the label is a person ID, I should have three data layers, each with a single top, corresponding to the audio signal, the image, and the person ID (I guess all of these should be generated with the same batch size).
But is it guaranteed that the batches from these different layers are not shuffled independently, so that all the information becomes a complete mess?

Evan Shelhamer

Sep 16, 2015, 1:43:08 PM
to Caffe Users
You have to take care that all the data layers are synchronized and load data in the same order. In particular, none of the data layers can be configured to shuffle on-the-fly, since they will each shuffle independently.

Consider making a Python data layer as an alternative. By making a single data layer with multiple tops you can handle any coordination needed, load different formats, do transformations, and so on without C++ hacking. We need to bundle Python data layer examples, but this video data layer by Lisa Anne Hendricks might help: https://github.com/LisaAnne/lisa-caffe-public/blob/lstm_video_deploy/examples/LRCN_activity_recognition/sequence_input_layer.py
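[Editor's note: the coordination logic Evan describes — one data layer with multiple tops so all inputs stay aligned — can be sketched in plain Python. This is a hedged illustration of the idea only, not the real `caffe.Layer` API: the `MultiTopBatcher` class and its names are invented for this sketch. The key point is that a single shared permutation drives every top, so shuffling can never desynchronize them.]

```python
import numpy as np

class MultiTopBatcher:
    """One source of truth for ordering: a single permutation indexes all
    inputs (e.g. audio, image, label), so the batches served for each "top"
    always refer to the same samples."""

    def __init__(self, audio, images, labels, batch_size, seed=0):
        assert len(audio) == len(images) == len(labels)
        self.audio, self.images, self.labels = audio, images, labels
        self.batch_size = batch_size
        self.rng = np.random.RandomState(seed)
        self.order = self.rng.permutation(len(labels))  # shared shuffle
        self.cursor = 0

    def next_batch(self):
        if self.cursor + self.batch_size > len(self.order):
            # Reshuffle once per epoch -- again with ONE shared permutation.
            self.order = self.rng.permutation(len(self.order))
            self.cursor = 0
        idx = self.order[self.cursor:self.cursor + self.batch_size]
        self.cursor += self.batch_size
        # All tops are indexed with the SAME idx, so they cannot desynchronize.
        return self.audio[idx], self.images[idx], self.labels[idx]
```

In a real Caffe Python layer, `forward` would copy these aligned arrays into the layer's top blobs; the shared-permutation idea carries over unchanged.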

Miquel Martí

Feb 19, 2017, 10:48:20 PM
to Caffe Users
Does this still hold? Can one create an LMDB to feed an AnnotatedData layer with multiple tops or is it not implemented?

On Thursday, September 17, 2015 at 2:43:08 AM UTC+9, Evan Shelhamer wrote:

Developer

Mar 9, 2019, 9:44:52 AM
to Caffe Users
Me too, I want to train a model with multiple inputs.
How can I prepare the data, please?