Has anybody run ZF training with COCO dataset?

Chan Kim

Aug 29, 2016, 5:05:05 AM
to Caffe Users

Wow, many questions, few answers here... But I hope someone can give me a helping hand.
I'm having difficulty training the ZF network with the COCO dataset. (First of all, the faster-rcnn git repo doesn't have all the data needed for it; I hope someone could fix that.)

In my case the rpn_labels blob from the rpn_data layer (Python) has shape (1, 1, A*H, W), but rpn_cls_score from the rpn_cls_score layer (convolution) has shape (2, 2, A*H, W).
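
For reference, here is roughly how the two blob shapes can be inspected from pycaffe. This is just a sketch: the prototxt path is an example (the repo doesn't ship a ZF model for COCO, so point it at your own), and it assumes py-faster-rcnn's lib/ directory and the caffe-fast-rcnn python build are on PYTHONPATH.

from fast_rcnn.config import cfg_from_file
import caffe

# Load the same config the training script passes via --cfg, so the Python
# data layers set up the tops the prototxt expects.
cfg_from_file('experiments/cfgs/faster_rcnn_end2end.yml')

net = caffe.Net('models/coco/ZF/faster_rcnn_end2end/train.prototxt', caffe.TRAIN)

score_shape  = net.blobs['rpn_cls_score_reshape'].data.shape   # e.g. (2, 2, A*H, W)
labels_shape = net.blobs['rpn_labels'].data.shape              # e.g. (1, 1, A*H, W)
print(score_shape)
print(labels_shape)

# SoftmaxWithLoss (loss_layer.cpp) requires the leading "num" dimension of the
# score and label blobs to match; that is the check which aborts below.
assert score_shape[0] == labels_shape[0]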

I0829 17:43:22.761220  6751 layer_factory.hpp:76] Creating layer rpn_loss_cls
I0829 17:43:22.761255  6751 net.cpp:114] Creating Layer rpn_loss_cls
I0829 17:43:22.761267  6751 net.cpp:481] rpn_loss_cls <- rpn_cls_score_reshape_rpn_cls_score_reshape_0_split_0
I0829 17:43:22.761286  6751 net.cpp:481] rpn_loss_cls <- rpn_labels
I0829 17:43:22.761303  6751 net.cpp:437] rpn_loss_cls -> rpn_cls_loss
I0829 17:43:22.761330  6751 layer_factory.hpp:76] Creating layer rpn_loss_cls
F0829 17:43:22.762058  6751 loss_layer.cpp:25] Check failed: bottom[0]->num() == bottom[1]->num() (2 vs. 1) The data and label should have the same number.
*** Check failure stack trace: ***
experiments/scripts/faster_rcnn_end2end2.sh: line 57:  6751 Aborted                 (core dumped) ./tools/train_net_e2e.py --gpu ${GPU_ID} --solver models/${PT_DIR}/${NET}/faster_rcnn_end2end/solver.prototxt --weights data/imagenet_models/${NET}.v2.caffemodel --imdb ${TRAIN_IMDB} --iters ${ITERS} --cfg experiments/cfgs/faster_rcnn_end2end.yml ${EXTRA_ARGS}

Can anyone suggest what's wrong?

Chan Kim

Aug 30, 2016, 1:14:26 AM
to Caffe Users
The data blob had num = 2, so I set cfg.TRAIN.IMS_PER_BATCH to 1 and the problem is gone now (a sketch of the change is below).
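
Just a sketch of the change, assuming the stock fast_rcnn.config module; the same setting can also go in the yml file passed with --cfg.

from fast_rcnn.config import cfg

# The anchor target (rpn-data) layer only produces rpn_labels for a single
# image, so the RPN loss layers expect one image per minibatch.
cfg.TRAIN.IMS_PER_BATCH = 1

But I'm faced with another problem later. The error message follows: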

I0830 14:01:57.355235 17807 net.cpp:1023] Copying source layer relu7
I0830 14:01:57.355264 17807 net.cpp:1023] Copying source layer drop7
I0830 14:01:57.355271 17807 net.cpp:1020] Ignoring source layer fc8
I0830 14:01:57.355276 17807 net.cpp:1020] Ignoring source layer prob
Solving...
I0830 14:01:57.466291 17807 net.cpp:602] ## : net_input_blobs_.size() : 0
Traceback (most recent call last):
  File "./tools/train_net_e2e.py", line 114, in <module>
    max_iters=args.max_iters)
  File "/home/ckim/Neuro/py-faster-rcnn.org/tools/../lib/fast_rcnn/train.py", line 160, in train_net
    model_paths = sw.train_model(max_iters)
  File "/home/ckim/Neuro/py-faster-rcnn.org/tools/../lib/fast_rcnn/train.py", line 101, in train_model
    self.solver.step(1)
  File "/home/ckim/Neuro/py-faster-rcnn.org/tools/../lib/roi_data_layer/layer.py", line 145, in forward
    blobs = self._get_next_minibatch()
  File "/home/ckim/Neuro/py-faster-rcnn.org/tools/../lib/roi_data_layer/layer.py", line 63, in _get_next_minibatch
    return get_minibatch(minibatch_db, self._num_classes)
  File "/home/ckim/Neuro/py-faster-rcnn.org/tools/../lib/roi_data_layer/minibatch.py", line 55, in get_minibatch
    num_classes)
  File "/home/ckim/Neuro/py-faster-rcnn.org/tools/../lib/roi_data_layer/minibatch.py", line 125, in _sample_rois
    roidb['bbox_targets'][keep_inds, :], num_classes)
KeyError: 'bbox_targets'

Chan Kim

Aug 30, 2016, 10:58:18 AM
to Caffe Users