Window data layer foreground fraction

Bryan Binotti

Nov 3, 2017, 1:06:14 PM
to Caffe Users
Referencing the data layer below:

message WindowDataParameter {
  // Specify the data source.
  optional string source = 1;
  // For data pre-processing, we can do simple scaling and subtracting the
  // data mean, if provided. Note that the mean subtraction is always carried
  // out before scaling.
  optional float scale = 2 [default = 1];
  optional string mean_file = 3;
  // Specify the batch size.
  optional uint32 batch_size = 4;
  // Specify if we would like to randomly crop an image.
  optional uint32 crop_size = 5 [default = 0];
  // Specify if we want to randomly mirror data.
  optional bool mirror = 6 [default = false];
  // Foreground (object) overlap threshold
  optional float fg_threshold = 7 [default = 0.5];
  // Background (non-object) overlap threshold
  optional float bg_threshold = 8 [default = 0.5];
  // Fraction of batch that should be foreground objects
  optional float fg_fraction = 9 [default = 0.25];
  // Amount of contextual padding to add around a window
  // (used only by the window_data_layer)
  optional uint32 context_pad = 10 [default = 0];
  // Mode for cropping out a detection window
  // warp: cropped window is warped to a fixed size and aspect ratio
  // square: the tightest square around the window is cropped
  optional string crop_mode = 11 [default = "warp"];
  // cache_images: will load all images in memory for faster access
  optional bool cache_images = 12 [default = false];
  // append root_folder to locate images
  optional string root_folder = 13 [default = ""];
}


I'm curious whether this data layer balances the data on its own. The source at https://github.com/BVLC/caffe/blob/master/src/caffe/layers/window_data_layer.cpp does mention sampling background and foreground windows for each batch according to fg_fraction, but does it keep sampling until it runs out of either bg or fg data, or does it resample windows it has already used?

Bryan Binotti

Nov 7, 2017, 10:03:58 AM
to Caffe Users
Here's my understanding so far.

At network setup, Caffe pools all the data into foreground windows and background windows:

I1107 09:37:03.030887 16320 window_data_layer.cpp:157] Number of images: 4171
I1107 09:37:03.030887 16320 window_data_layer.cpp:161] class 0 has 450454 samples
I1107 09:37:03.030887 16320 window_data_layer.cpp:161] class 1 has 4014 samples
I1107 09:37:03.030887 16320 window_data_layer.cpp:165] Amount of context padding: 16


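The code below from window_data_layer.cpp then draws each batch from the foreground and background pools in a fixed ratio: fg_fraction of the batch is foreground and the rest is background. Each window is picked with a random index taken modulo the pool size, so sampling is with replacement: neither pool ever runs out, and the smaller pool (class 1 above) is simply resampled more often. So to answer my own question: Caffe does not balance the dataset itself, but it composes every batch with the fg_fraction ratio to limit the bias from the imbalance.
Further, the network above reached an accuracy of 96%, which I take as a strong indicator that the classes are still unbalanced, since accuracy is dominated by the much larger background class.
Hope this helps anyone in the future.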


void WindowDataLayer<Dtype>::load_batch(Batch<Dtype>* batch) {
  // At each iteration, sample N windows where N*p are foreground (object)
  // windows and N*(1-p) are background (non-object) windows

...

  const int num_fg = static_cast<int>(static_cast<float>(batch_size)
      * fg_fraction);
  const int num_samples[2] = { batch_size - num_fg, num_fg };

  int item_id = 0;
  CHECK_GT(fg_windows_.size(), 0);
  CHECK_GT(bg_windows_.size(), 0);

  // sample from bg set then fg set
  for (int is_fg = 0; is_fg < 2; ++is_fg) {
    for (int dummy = 0; dummy < num_samples[is_fg]; ++dummy) {
      // sample a window
      timer.Start();
      const unsigned int rand_index = PrefetchRand();
      vector<float> window = (is_fg) ?
          fg_windows_[rand_index % fg_windows_.size()] :
          bg_windows_[rand_index % bg_windows_.size()];

      bool do_mirror = mirror && PrefetchRand() % 2;