py-faster-rcnn train suspended.


Chan Kim

Jul 28, 2016, 8:53:58 PM
to Caffe Users
Hi,

I tried training py-faster-rcnn. With the full VOC2007 and VOC2012 datasets in place, I ran this command:

experiments/scripts/faster_rcnn_alt_opt.sh 0 ZF --set EXP_DIR foobar RNG_SEED 42 TRAIN.SCALES "[400,500,600,700]"

(by the way, I'm using version 95918a575d (Thu Dec 17 22:20:26 2015 -0300))

The log file is here: http://pastebin.com/ZShaX5gW

Also, when the training seems stuck, running 'ps -aux | grep caffe' gives:

ckim     29597  0.0  0.1 1102984 96800 pts/11  S+   00:04   0:01 python ./tools/train_faster_rcnn_alt_opt.py --gpu 0 --net_name ZF --weights data/imagenet_models/ZF.v2.caffemodel --imdb voc_2007_trainval --cfg experiments/cfgs/faster_rcnn_alt_opt.yml --set EXP_DIR foobar RNG_SEED 42 TRAIN.SCALES [400,500,600,700]

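(In Linux, the S+ state means the process is in interruptible sleep, i.e. waiting for an event, and belongs to the foreground process group. It is not stopped by job control, which would show as T.)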
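For reference, the STAT column printed by ps can be decoded mechanically. The sketch below is only an illustration: the helper name describe_stat is hypothetical, and the tables cover just a subset of the codes listed under "PROCESS STATE CODES" in the ps(1) man page.

```python
# Subset of the state codes documented in the ps(1) man page (procps).
PRIMARY_STATES = {
    "D": "uninterruptible sleep (usually I/O)",
    "R": "running or runnable",
    "S": "interruptible sleep (waiting for an event to complete)",
    "T": "stopped by job control signal",
    "Z": "defunct (zombie) process",
}

# Extra characters that may follow the primary state letter.
MODIFIERS = {
    "<": "high priority",
    "N": "low priority",
    "s": "session leader",
    "l": "multi-threaded",
    "+": "in the foreground process group",
}


def describe_stat(stat):
    """Return a list of descriptions for a ps STAT string such as 'S+'.

    Hypothetical helper for illustration; unknown characters are reported
    rather than guessed.
    """
    if not stat:
        raise ValueError("empty STAT string")
    parts = [PRIMARY_STATES.get(stat[0], "unknown state %r" % stat[0])]
    for flag in stat[1:]:
        parts.append(MODIFIERS.get(flag, "unknown flag %r" % flag))
    return parts


if __name__ == "__main__":
    for text in describe_stat("S+"):
        print(text)
```

Read this way, S+ only says that the Python process is sleeping while it waits for something, which is consistent with a parent script waiting on a child training process but does not by itself show where the hang is.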

I'll look into the training Python script, but can anyone suggest what might be wrong? (Testing runs OK.)

Chan Kim

Jul 29, 2016, 3:17:40 AM
to Caffe Users
Hi, I later did a git pull and ran
experiments/scripts/faster_rcnn_alt_opt.sh 0 ZF pascal_voc --set EXP_DIR foobar RNG_SEED 42 TRAIN.SCALES "[400,500,600,700]"
and training is still running now. In the meantime, I had to comment out the "engine: CAFFE" line in models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_rpn_train.pt.
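If the same change is needed in several prototxt files, it could be scripted. This is a minimal sketch, assuming the setting sits on its own line (as in "    engine: CAFFE") and that disabling it means commenting it out with "#", which the protobuf text format accepts as a comment marker. The function name comment_out_engine and the sample layer are made up for illustration and are not taken from the py-faster-rcnn repository.

```python
import re

# Matches a line consisting only of "engine: CAFFE" (plus indentation).
# Assumption: the setting is on its own line, not inline inside "{ ... }".
ENGINE_LINE = re.compile(r"^([ \t]*)(engine[ \t]*:[ \t]*CAFFE\b.*)$", re.MULTILINE)


def comment_out_engine(prototxt_text):
    """Return the text with every standalone 'engine: CAFFE' line commented out."""
    return ENGINE_LINE.sub(r"\1# \2", prototxt_text)


# Made-up layer definition, used only to show the effect.
sample = """layer {
  name: "conv1"
  type: "Convolution"
  convolution_param {
    num_output: 96
    engine: CAFFE
  }
}
"""

if __name__ == "__main__":
    print(comment_out_engine(sample))
```

Lines that are already commented out are left alone, so running the function twice gives the same result as running it once. To apply it to a real file, read the file, pass its contents through the function, and write the result back after keeping a backup copy.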