Hi,
I am wondering whether the pad and resize options offered when creating a database for object detection also re-map the bounding box coordinates in the labels. I have achieved decent accuracy using the KITTI training set described in the walkthrough (with the graph inconsistencies already flagged on GitHub), but I am having difficulty applying the same method to my own data. I have tried formatting the labels as:
(Car -1.00 -1 -1.00 433.00 273.00 613.00 383.00 -1.00 -1.00 -1.00 -1000.00 -1000.00 -1000.00 -10.00)
and
(Car 0.00 0 0.00 433.00 273.00 613.00 383.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00).
Am I correct in thinking DIGITS only needs the object class and the 4 bounding box coordinates?
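To make the question concrete, here is a minimal sketch of how one of these label lines splits into fields, assuming the standard 15-column KITTI layout (type, truncated, occluded, alpha, four bbox values, three dimensions, three location values, rotation_y). The names KittiBox and parse_kitti_line are hypothetical helpers for illustration, not part of DIGITS, and the sketch says nothing about which fields DIGITS actually reads.

```python
from typing import NamedTuple


class KittiBox(NamedTuple):
    """Object class plus the 2D box in pixels (hypothetical helper type)."""
    label: str
    left: float
    top: float
    right: float
    bottom: float


def parse_kitti_line(line: str) -> KittiBox:
    """Extract the class and 2D box from one KITTI-style label line.

    Assumes the 15-column KITTI layout, where columns 4..7 (0-based)
    hold left, top, right, bottom. Surrounding parentheses are stripped
    in case the line was copied with them.
    """
    fields = line.strip().strip("()").split()
    if len(fields) < 15:
        raise ValueError(f"expected at least 15 fields, got {len(fields)}")
    left, top, right, bottom = (float(v) for v in fields[4:8])
    if right <= left or bottom <= top:
        raise ValueError("box has non-positive width or height")
    return KittiBox(fields[0], left, top, right, bottom)


if __name__ == "__main__":
    line = ("Car -1.00 -1 -1.00 433.00 273.00 613.00 383.00 "
            "-1.00 -1.00 -1.00 -1000.00 -1000.00 -1000.00 -10.00")
    print(parse_kitti_line(line))
```

Running a check like this over every label file at least rules out column-count and ordering mistakes before the database is built.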
My images are of size (750,1000,3), with all bounding box edges in the range 80<x<200. So far I have not been able to achieve an mAP > 0 for anything other than the provided KITTI example.
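In case the coordinates are not re-mapped automatically, this is a minimal sketch of the re-mapping that would be needed by hand. It assumes boxes are (left, top, right, bottom) in pixels and sizes are (width, height); the functions scale_box and shift_box and the 500x375 target size are hypothetical, chosen only for illustration, and do not describe what DIGITS does internally.

```python
def scale_box(box, old_size, new_size):
    """Re-map a (left, top, right, bottom) box after an image resize.

    old_size and new_size are (width, height) tuples. Each axis is
    scaled independently, so this also covers non-uniform resizes.
    """
    sx = new_size[0] / old_size[0]
    sy = new_size[1] / old_size[1]
    left, top, right, bottom = box
    return (left * sx, top * sy, right * sx, bottom * sy)


def shift_box(box, pad_left, pad_top):
    """Re-map a box after padding is added on the left and top edges.

    Padding added only on the right or bottom leaves coordinates
    unchanged, so only the left/top offsets matter here.
    """
    left, top, right, bottom = box
    return (left + pad_left, top + pad_top, right + pad_left, bottom + pad_top)


if __name__ == "__main__":
    box = (433.0, 273.0, 613.0, 383.0)
    # Assumption: the image is 1000 wide by 750 high, halved to 500x375.
    print(scale_box(box, (1000, 750), (500, 375)))
    print(shift_box(box, pad_left=10, pad_top=20))
```

If the database tool resizes the images but leaves the labels untouched, the boxes would no longer line up with the objects, which could explain a zero score.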
A nice addition in future versions would be a visualisation of the DB within DIGITS that shows the bounding boxes; this would help confirm that the labels have been formatted correctly. Any help or pointers on where I may be going wrong would be greatly appreciated.
Thanks,
Nerraw