Regression with LevelDB input, HDF5 targets

Dominick Rocco

Dec 30, 2015, 17:20:26
to Caffe Users
Hi,

I'm working on training a network for regression. The architecture is a truncated version of GoogLeNet, with which our group has already seen good results for classification. I am training on a GPU, so I stuck with a LevelDB input to avoid sacrificing prefetch capability. For regression, however, I understand that HDF5 input is recommended for handling floating-point values. To that end, my regression targets are fed in through HDF5.
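
For concreteness, the target file can be produced with a few lines of h5py; this is a minimal sketch with hypothetical names and values, though the dataset name itself does have to match the HDF5Data layer's top blob ("regression" below):

import h5py
import numpy as np

# Hypothetical per-event targets, written in the same order as the LevelDB entries.
energies = np.random.uniform(0.0, 5.0, size=1000).astype(np.float32)

with h5py.File('regression_train.h5', 'w') as f:
    # Caffe looks up HDF5 datasets by the layer's top blob names,
    # so this dataset is named "regression" to match the prototxt below.
    f.create_dataset('regression', data=energies.reshape(-1, 1))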

For regression, I'm using a Euclidean loss layer, which I've configured with both the InnerProduct output and the HDF5 target as bottoms. The loss values look higher than they should be, roughly 30. My target values are in the ballpark of 0-5, mostly around 2. The loss doesn't seem to be converging well, so I'm fiddling with the learning rates. One question I have is whether the Euclidean loss is averaged over the batch, or whether it is a direct sum. If it's a sum, a loss around 30 seems reasonable for early training. If it's the average error, then I suspect something is going wrong.
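
To make the magnitudes concrete, here is a numpy sketch of the two conventions with hypothetical numbers (the 1/2 factor is the usual Euclidean-loss normalization). With targets around 2, early outputs off by roughly 1, and a batch of 64, the summed version lands near 32 while the batch-averaged one sits near 0.5:

import numpy as np

batch = 64
target = np.full((batch, 1), 2.0, dtype=np.float32)            # targets mostly around 2
pred = target + np.random.randn(batch, 1).astype(np.float32)   # outputs off by ~1 early on

diff = pred - target
summed = 0.5 * np.sum(diff ** 2)   # direct sum over the batch: ~32 here
averaged = summed / batch          # averaged over the batch:   ~0.5 here
print(summed, averaged)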

I'm also fishing for insight about potential pitfalls of combining HDF5 and LevelDB inputs. I'll show the input and loss layers below, and I've also attached my full prototxt for training. The LevelDB and HDF5 files are filled simultaneously, and the HDF5 file is indexed with the same keys that go into the LevelDB. Can anyone think of a reason why this approach would be fatally flawed? Or is there some subtle behavior I should be watching out for?
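
As a sanity check on that alignment, I can iterate both stores side by side; this is a sketch, assuming plyvel for LevelDB access and the hypothetical file names from above:

import h5py
import plyvel  # any LevelDB binding would do; plyvel is just what's assumed here

db = plyvel.DB('TrainLevelDB')
with h5py.File('regression_train.h5', 'r') as f:
    targets = f['regression'][:]

# The Data and HDF5Data layers both read their sources sequentially,
# so row i of the HDF5 dataset should pair with the i-th LevelDB key.
keys = [key for key, _ in db.iterator()]
assert len(keys) == len(targets), 'LevelDB and HDF5 entry counts differ'
for key, target in zip(keys[:5], targets[:5]):
    print(key, target)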

Thanks,
Dominick 

In the prototxt configuration, I have the HDF5 layer: 
layer {
  name: "hdfdata"
  type: "HDF5Data"
  top: "regression"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "/phihome/neuralnet/users/rocco/2015-12-28_energy/regression_train.txt"
    batch_size: 64
  }
}
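
(For anyone unfamiliar with the HDF5Data layer: the source above is not an HDF5 file itself but a plain text file listing one .h5 path per line, e.g. hypothetically:)

/phihome/neuralnet/users/rocco/2015-12-28_energy/regression_train.h5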

And similarly the LevelDB layer: 
layer {
  name: "data"
  type: "Data"
  top: "data"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
  }
  data_param {
    source: "/phihome/neuralnet/users/rocco/2015-12-28_energy/TrainLevelDB"
    batch_size: 64
    prefetch: 40
    backend: LEVELDB
  }
}
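
For completeness, the LevelDB side is filled with serialized Datum protos; here is a minimal sketch of one insertion, again with hypothetical names and shapes, assuming plyvel and pycaffe:

import caffe
import numpy as np
import plyvel

db = plyvel.DB('TrainLevelDB', create_if_missing=True)

# Hypothetical single-channel event image; array_to_datum expects
# a (channels, height, width) array and serializes it into a Datum.
image = np.zeros((1, 64, 64), dtype=np.uint8)
datum = caffe.io.array_to_datum(image)

# The key here must match the ordering used for the HDF5 targets.
db.put(b'00000000', datum.SerializeToString())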


The loss layer is configured as follows:
layer {
  name: "loss3/loss3"
  type: "EuclideanLoss"
  bottom: "loss3/output"
  bottom: "regression"
  top: "loss3/loss3"
  loss_weight: 1
}


It looks like things got off the ground alright: 
I1230 15:42:45.672127 11934 layer_factory.hpp:76] Creating layer loss3/loss3
I1230 15:42:45.672137 11934 net.cpp:110] Creating Layer loss3/loss3
I1230 15:42:45.672142 11934 net.cpp:477] loss3/loss3 <- loss3/output
I1230 15:42:45.672150 11934 net.cpp:477] loss3/loss3 <- regression
I1230 15:42:45.672158 11934 net.cpp:433] loss3/loss3 -> loss3/loss3



Attachment: siamese_compact_energy_train_val.prototxt