Why might my net be giving detection prediction scores that are all 0s?


Rose Perrone

Nov 27, 2014, 1:07:48 PM
to caffe...@googlegroups.com
When I detect using the BVLC reference net instead of the net I trained, I see non-zero detection prediction scores just fine. But when I train my own net on thousands of images, I get nothing but goose-eggs. My aim is to detect a single ImageNet noun in ImageNet images, so the prediction output has two classes (presence or absence of the noun).

Here are my train_val.prototxt, solver.prototxt, and deploy.prototxt:

train_val.prototxt:
```
name: "nin_imagenet"
layers {
  top: "data"
  top: "label"
  name: "data"
  type: DATA
  data_param {
    source: "/Users/rose/home/video-object-detection/data/imagenet/n07840804/ilsvrc12_train_lmdb"
    backend: LMDB
    batch_size: 64
  }
  transform_param {
    crop_size: 224
    mirror: true
    mean_file: "/Users/rose/home/video-object-detection/data/imagenet/n07840804/image_mean.binaryproto"
  }
  include: { phase: TRAIN }
}
layers {
  top: "data"
  top: "label"
  name: "data"
  type: DATA
  data_param {
    source: "/Users/rose/home/video-object-detection/data/imagenet/n07840804/ilsvrc12_test_lmdb"
    backend: LMDB
    batch_size: 89
  }
  transform_param {
    crop_size: 224
    mirror: false
    mean_file: "/Users/rose/home/video-object-detection/data/imagenet/n07840804/image_mean.binaryproto"
  }
  include: { phase: TEST }
}
layers {
  bottom: "data"
  top: "conv1"
  name: "conv1"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      mean: 0
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  bottom: "conv1"
  top: "conv1"
  name: "relu0"
  type: RELU
}
layers {
  bottom: "conv1"
  top: "cccp1"
  name: "cccp1"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 96
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "gaussian"
      mean: 0
      std: 0.05
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  bottom: "cccp1"
  top: "cccp1"
  name: "relu1"
  type: RELU
}
layers {
  bottom: "cccp1"
  top: "cccp2"
  name: "cccp2"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 96
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "gaussian"
      mean: 0
      std: 0.05
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  bottom: "cccp2"
  top: "cccp2"
  name: "relu2"
  type: RELU
}
layers {
  bottom: "cccp2"
  top: "pool0"
  name: "pool0"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layers {
  bottom: "pool0"
  top: "conv2"
  name: "conv2"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      mean: 0
      std: 0.05
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  bottom: "conv2"
  top: "conv2"
  name: "relu3"
  type: RELU
}
layers {
  bottom: "conv2"
  top: "cccp3"
  name: "cccp3"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 256
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "gaussian"
      mean: 0
      std: 0.05
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  bottom: "cccp3"
  top: "cccp3"
  name: "relu5"
  type: RELU
}
layers {
  bottom: "cccp3"
  top: "cccp4"
  name: "cccp4"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 256
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "gaussian"
      mean: 0
      std: 0.05
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  bottom: "cccp4"
  top: "cccp4"
  name: "relu6"
  type: RELU
}
layers {
  bottom: "cccp4"
  top: "pool2"
  name: "pool2"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layers {
  bottom: "pool2"
  top: "conv3"
  name: "conv3"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "gaussian"
      mean: 0
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  bottom: "conv3"
  top: "conv3"
  name: "relu7"
  type: RELU
}
layers {
  bottom: "conv3"
  top: "cccp5"
  name: "cccp5"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 384
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "gaussian"
      mean: 0
      std: 0.05
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  bottom: "cccp5"
  top: "cccp5"
  name: "relu8"
  type: RELU
}
layers {
  bottom: "cccp5"
  top: "cccp6"
  name: "cccp6"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 384
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "gaussian"
      mean: 0
      std: 0.05
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  bottom: "cccp6"
  top: "cccp6"
  name: "relu9"
  type: RELU
}
layers {
  bottom: "cccp6"
  top: "pool3"
  name: "pool3"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layers {
  bottom: "pool3"
  top: "pool3"
  name: "drop"
  type: DROPOUT
  dropout_param {
    dropout_ratio: 0.5
  }
}
layers {
  bottom: "pool3"
  top: "conv4"
  name: "conv4-1024"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 1024
    pad: 1
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "gaussian"
      mean: 0
      std: 0.05
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  bottom: "conv4"
  top: "conv4"
  name: "relu10"
  type: RELU
}
layers {
  bottom: "conv4"
  top: "cccp7"
  name: "cccp7-1024"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 1024
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "gaussian"
      mean: 0
      std: 0.05
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  bottom: "cccp7"
  top: "cccp7"
  name: "relu11"
  type: RELU
}
layers {
  bottom: "cccp7"
  top: "cccp8"
  name: "cccp8-1024"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  convolution_param {
    num_output: 2
    kernel_size: 1
    stride: 1
    weight_filler {
      type: "gaussian"
      mean: 0
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layers {
  bottom: "cccp8"
  top: "cccp8"
  name: "relu12"
  type: RELU
}
layers {
  bottom: "cccp8"
  top: "pool4"
  name: "pool4"
  type: POOLING
  pooling_param {
    pool: AVE
    kernel_size: 6
    stride: 1
  }
}
layers {
  name: "accuracy"
  type: ACCURACY
  bottom: "pool4"
  bottom: "label"
  top: "accuracy"
  include: { phase: TEST }
}
layers {
  bottom: "pool4"
  bottom: "label"
  name: "loss"
  type: SOFTMAX_LOSS
  include: { phase: TRAIN }
}
```
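One detail worth double-checking in the train_val.prototxt above (an observation, not a confirmed diagnosis): relu12 is applied to cccp8, the two-channel class-score layer, before the pool4 average pooling. A ReLU at that point clamps any negative raw class score to exactly 0 before it is averaged, which can make one class's score come out identically 0. A minimal NumPy sketch of that effect, with made-up activation values:

```python
import numpy as np

# Hypothetical raw cccp8 activations (negative class, positive class)
# at four spatial positions, before relu12:
raw = np.array([[-2.1,  5.0],
                [-0.7,  8.2],
                [-3.3,  1.4],
                [-1.0, 12.6]])

after_relu = np.maximum(raw, 0.0)   # relu12: negatives clamped to 0
scores = after_relu.mean(axis=0)    # pool4: average pooling over positions

print(scores)  # [0.  6.8] -- the first class's score is exactly zero
```

If the trained net mostly drives one channel negative, every prediction for that class collapses to 0 after the ReLU, regardless of the input image.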

solver.prototxt:
```
net: "/Users/rose/home/video-object-detection/data/imagenet/n07840804/aux/train_val.prototxt"
test_iter: 1000
test_interval: 1000
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 200000
display: 20
max_iter: 1000
momentum: 0.9
weight_decay: 0.0005
snapshot: 200
snapshot_prefix: "/Users/rose/home/video-object-detection/data/imagenet/n07840804/snapshots"
solver_mode: CPU
```
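A quick back-of-the-envelope check on the solver settings above (my own arithmetic, using the TEST-phase batch_size of 89 from the train_val.prototxt):

```python
# Each test pass runs test_iter batches at the TEST-phase batch size.
test_iter = 1000
test_batch_size = 89  # from the TEST data layer in train_val.prototxt

images_per_test_pass = test_iter * test_batch_size
print(images_per_test_pass)  # 89000
```

If the test LMDB holds far fewer images than this, the test pass cycles through the same images repeatedly, so a much smaller test_iter would give a similar accuracy estimate in a fraction of the time.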

deploy.prototxt
```
name: "nin_imagenet"
input: "data"
input_dim: 10
input_dim: 3
input_dim: 227
input_dim: 227
layers {
  bottom: "data"
  top: "conv1"
  name: "conv1"
  type: CONVOLUTION
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
  }
}
layers {
  bottom: "conv1"
  top: "conv1"
  name: "relu0"
  type: RELU
}
layers {
  bottom: "conv1"
  top: "cccp1"
  name: "cccp1"
  type: CONVOLUTION
  convolution_param {
    num_output: 96
    kernel_size: 1
    stride: 1
  }
}
layers {
  bottom: "cccp1"
  top: "cccp1"
  name: "relu1"
  type: RELU
}
layers {
  bottom: "cccp1"
  top: "cccp2"
  name: "cccp2"
  type: CONVOLUTION
  convolution_param {
    num_output: 96
    kernel_size: 1
    stride: 1
  }
}
layers {
  bottom: "cccp2"
  top: "cccp2"
  name: "relu2"
  type: RELU
}
layers {
  bottom: "cccp2"
  top: "pool0"
  name: "pool0"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layers {
  bottom: "pool0"
  top: "conv2"
  name: "conv2"
  type: CONVOLUTION
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
layers {
  bottom: "conv2"
  top: "conv2"
  name: "relu3"
  type: RELU
}
layers {
  bottom: "conv2"
  top: "cccp3"
  name: "cccp3"
  type: CONVOLUTION
  convolution_param {
    num_output: 256
    kernel_size: 1
    stride: 1
  }
}
layers {
  bottom: "cccp3"
  top: "cccp3"
  name: "relu5"
  type: RELU
}
layers {
  bottom: "cccp3"
  top: "cccp4"
  name: "cccp4"
  type: CONVOLUTION
  convolution_param {
    num_output: 256
    kernel_size: 1
    stride: 1
  }
}
layers {
  bottom: "cccp4"
  top: "cccp4"
  name: "relu6"
  type: RELU
}
layers {
  bottom: "cccp4"
  top: "pool2"
  name: "pool2"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layers {
  bottom: "pool2"
  top: "conv3"
  name: "conv3"
  type: CONVOLUTION
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    stride: 1
  }
}
layers {
  bottom: "conv3"
  top: "conv3"
  name: "relu7"
  type: RELU
}
layers {
  bottom: "conv3"
  top: "cccp5"
  name: "cccp5"
  type: CONVOLUTION
  convolution_param {
    num_output: 384
    kernel_size: 1
    stride: 1
  }
}
layers {
  bottom: "cccp5"
  top: "cccp5"
  name: "relu8"
  type: RELU
}
layers {
  bottom: "cccp5"
  top: "cccp6"
  name: "cccp6"
  type: CONVOLUTION
  convolution_param {
    num_output: 384
    kernel_size: 1
    stride: 1
  }
}
layers {
  bottom: "cccp6"
  top: "cccp6"
  name: "relu9"
  type: RELU
}
layers {
  bottom: "cccp6"
  top: "pool3"
  name: "pool3"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layers {
  bottom: "pool3"
  top: "pool3"
  name: "drop"
  type: DROPOUT
  dropout_param {
    dropout_ratio: 0.5
  }
}
layers {
  bottom: "pool3"
  top: "conv4"
  name: "conv4-1024"
  type: CONVOLUTION
  convolution_param {
    num_output: 1024
    pad: 1
    kernel_size: 3
    stride: 1
  }
}
layers {
  bottom: "conv4"
  top: "conv4"
  name: "relu10"
  type: RELU
}
layers {
  bottom: "conv4"
  top: "cccp7"
  name: "cccp7-1024"
  type: CONVOLUTION
  convolution_param {
    num_output: 1024
    kernel_size: 1
    stride: 1
  }
}
layers {
  bottom: "cccp7"
  top: "cccp7"
  name: "relu11"
  type: RELU
}
layers {
  bottom: "cccp7"
  top: "cccp8"
  name: "cccp8-1024"
  type: CONVOLUTION
  convolution_param {
    num_output: 2
    kernel_size: 1
    stride: 1
  }
}
layers {
  bottom: "cccp8"
  top: "cccp8"
  name: "relu12"
  type: RELU
}
layers {
  bottom: "cccp8"
  top: "pool4"
  name: "pool4"
  type: POOLING
  pooling_param {
    pool: AVE
    kernel_size: 6
    stride: 1
  }
}
# R-CNN classification layer made from R-CNN ILSVRC13 SVMs.
layers {
  name: "fc-rcnn"
  type: INNER_PRODUCT
  bottom: "cccp8"
  top: "fc-rcnn"
  inner_product_param {
    num_output: 2
  }
}
```
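As a sanity check on the geometry of the deploy net above (a sketch using Caffe's output-size rules: convolution rounds down, pooling rounds up; the 1x1 cccp convolutions leave the spatial size unchanged), the 227x227 input should reach pool4 as a 6x6 map, which the 6x6 average pooling reduces to 1x1 per class:

```python
from math import floor, ceil

def conv(h, k, s=1, p=0):
    # Caffe convolution output size: rounds down.
    return floor((h + 2 * p - k) / s) + 1

def pool(h, k, s):
    # Caffe pooling output size: rounds up.
    return ceil((h - k) / s) + 1

h = 227
h = conv(h, 11, s=4)  # conv1 -> 55
h = pool(h, 3, 2)     # pool0 -> 27
h = conv(h, 5, p=2)   # conv2 -> 27
h = pool(h, 3, 2)     # pool2 -> 13
h = conv(h, 3, p=1)   # conv3 -> 13
h = pool(h, 3, 2)     # pool3 -> 6
h = conv(h, 3, p=1)   # conv4 -> 6
h = pool(h, 6, 1)     # pool4 -> 1
print(h)  # 1
```

By the same formulas, the 224 training crop in train_val.prototxt also arrives at pool3 as a 6x6 map, so the 6x6 average pool is consistent for both phases.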

Rose Perrone

Nov 29, 2014, 11:43:22 PM
to caffe...@googlegroups.com
Maybe I wrote these prototxt files incorrectly. Here are the instructions I wrote for creating them. I based my prototxt files on those of nin_imagenet (see the Caffe Model Zoo).

After completing these instructions, you will have made these three files:
train_val.prototxt, solver.prototxt, and deploy.prototxt.
Use absolute paths for simplicity.

1. Copy the model's train_val.prototxt to
data/imagenet/<wnid_dir>/images/<dataset>/aux.
2. Change each "source:" and "mean_file:" line to the lmdb files and
image mean created by prepare_data.py.
3. Find the last layer that contains a "num_output" field and change it to
the number of categories (in my case, two: one positive, one negative).
I also change the batch_size from 64 to 32, in case I run out of memory
on my MacBook.
4. Copy the model's solver.prototxt to
data/imagenet/<wnid_dir>/images/<dataset>/aux.
5. Change the "net:" path to the path of the train_val.prototxt you just
created.
6. Change the "snapshot_prefix:" to
data/imagenet/<wnid_dir>/snapshots/images/<dataset>/snapshots.
7. Change the "solver_mode:" to CPU.
8. Change the "snapshot:" to 200. On my MacBook, it takes 3 minutes per 20
iterations, so this produces one snapshot per half hour. On a K40 machine,
training the bvlc_reference_caffenet on 1000 ImageNet categories takes 26
seconds per 20 iterations, which is ~5.2 ms per image, so I think they're
training on about 5100 images. I'm training on 19500 images (there are 7
times as many negative images as positive images).
9. Change the "max_iter" to 1000. (This is 45000 for NIN, but that would
take me 112 hours, which is 4.7 days.) This is the same as test_iter. It's
annoying that the test is also performed on the untrained network, because
that takes 70 minutes and only tells us the ratio of positive to negative
images.
10. Copy this to a new file called deploy.prototxt:
```
name: "<the name in the train_val.prototxt>"

input: "data"
input_dim: 10
input_dim: 3
input_dim: 227
input_dim: 227
```
11. Append to that all the layers in train_val.prototxt.
12. Delete the first few layers that don't have a "bottom" field.
13. Delete all parameters that have to do exclusively with learning, e.g.:
  - blobs_lr
  - weight_decay
  - weight_filler
  - bias_filler
14. Delete the "accuracy" layer and any layer after it (probably the
softmax loss).
15. Append this final layer to the file:
```
# R-CNN classification layer made from R-CNN ILSVRC13 SVMs.
layers {
  name: "fc-rcnn"
  type: INNER_PRODUCT
  bottom: <name of layer right above this one>
  top: "fc-rcnn"
  inner_product_param {
    num_output: <change this to the number of categories>
  }
}
```
16. In the last layer before the R-CNN layer that contains a "num_output"
field, change that value to the number of categories (in my case, 2).

References for generating deploy.prototxt:
  https://github.com/BVLC/caffe/issues/1245
  https://github.com/BVLC/caffe/issues/261

Rose Perrone

Nov 29, 2014, 11:46:38 PM
to caffe...@googlegroups.com

Rose Perrone

Dec 2, 2014, 9:56:11 PM
to caffe...@googlegroups.com
When I finetune the NIN model rather than train it from scratch, I still get 0s for the negative class, but for the positive class, I get high prediction scores. Here is a sample of the detection results, where the value on the left is the prediction score for the negative class, and the value on the right is the prediction score for the positive class:

0.0 10.1965
0.0 89.606
0.0 15.007
0.0 8.72097
0.0 27.801
0.0 55.6223
0.0 20.7561
0.0 11.9071
0.0 26.5453
0.0 69.9159
0.0 7.99755
0.0 53.7852
0.0 64.1716
0.0 56.2549
0.0 34.6965
0.0 21.247
0.0 31.1896
0.0 115.908
0.0 45.844
0.0 71.3556
0.0 17.0009
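For what it's worth, the values above look like raw net outputs rather than probabilities; the deploy.prototxt posted earlier has no softmax layer. Passing the first row through a softmax (a sketch, not the thread's actual code) shows how saturated these magnitudes are:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(z - np.max(z))
    return e / e.sum()

# First row of the detection results above: (negative, positive).
probs = softmax(np.array([0.0, 10.1965]))
print(probs)  # the positive class gets essentially all the probability mass
```

With raw scores this far apart, the implied probability for the positive class is effectively 1.0 on every row, so the net is not discriminating between classes at all.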

nod

Jan 29, 2015, 10:33:35 AM
to caffe...@googlegroups.com
Experiencing the same issue... did you find a solution for it?