I wanted to share my results with pedestrian detection using the KITTI dataset, because my initial attempt produced some lousy results: mAP (val), precision (val), and recall (val) all ended up being zero.
It turns out that prepare_kitti_data.py (at least the version I used) splits the data in a way that makes validating pedestrian detection impossible. I looked through the validation images and couldn't find any that were really useful for pedestrian detection. When I tested the model on one of the training images, it would sometimes find the pedestrian, which confirmed that I had correctly set the custom class mappings in the dataset creation form.
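For anyone who wants to check their own split before training, here is a rough sketch of how the labels could be counted per class. It assumes the labels are in the standard KITTI object format, where each line of a .txt file describes one object and the first field is the class name. The function name count_classes and the directory you pass in are my own placeholders, not anything that prepare_kitti_data.py provides.

```python
from collections import Counter
from pathlib import Path


def count_classes(label_dir):
    """Count KITTI classes in every .txt label file under label_dir.

    Returns two Counters: the number of objects per class, and the number
    of images that contain at least one object of each class.
    """
    objects = Counter()
    images = Counter()
    for label_file in Path(label_dir).glob("*.txt"):
        classes_in_file = set()
        for line in label_file.read_text().splitlines():
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            objects[fields[0]] += 1  # first field is the class name
            classes_in_file.add(fields[0])
        images.update(classes_in_file)
    return objects, images
```

Running it on the validation label folder and looking at images["Pedestrian"] should show quickly whether the split left anything to validate against.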
So I decided to go through the manual task of creating a Mini-KITTI dataset with just the pedestrian images. It consists of 440 training images and 151 validation images.
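I did the selection by hand, but it could probably be scripted. The sketch below is not what I actually ran; it assumes standard KITTI label files (class name as the first field on each line), and the names has_class and split_ids_with_class, plus the 25% validation fraction, are just illustrative choices that land close to my 440/151 split.

```python
import random
from pathlib import Path


def has_class(label_file, wanted="Pedestrian"):
    """Return True if the KITTI label file has at least one `wanted` object."""
    for line in Path(label_file).read_text().splitlines():
        fields = line.split()
        if fields and fields[0] == wanted:
            return True
    return False


def split_ids_with_class(label_dir, wanted="Pedestrian", val_fraction=0.25, seed=0):
    """Pick the image ids whose labels contain `wanted`, then split them.

    Returns (train_ids, val_ids). The ids are the label file names without
    the .txt extension, so the matching images can be copied afterwards.
    """
    ids = sorted(
        p.stem for p in Path(label_dir).glob("*.txt") if has_class(p, wanted)
    )
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split repeatable
    n_val = int(len(ids) * val_fraction)
    return ids[n_val:], ids[:n_val]
```

Because every selected image contains the target class, both the training and the validation side end up with pedestrians in them, which is the part that was missing from my first attempt.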
When I used this dataset I got the following results. I trained for 120 epochs, but that wasn't really necessary.
mAP (val) = 52.1297
precision (val) = 72.1355
recall (val) = 67.6826
I'm not sure if those numbers are good or bad, but at least they're not zero. :-)
My next step is to shrink the dataset to see what the minimum dataset size really should be, so I can test it on my own dataset. I'm also curious whether it can detect the stig (the dude with the helmet on) in the pedestrian images. There are probably not enough images for that one.
Is anyone using a special tool to create the label file for an image, one where we can draw a box around the object we're interested in and then define what the box contains?
For a few images I can just manually create the label file, but I can't imagine creating it for hundreds of images.
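In the meantime, here is a small sketch of how one label line could be generated from a box's pixel coordinates instead of typing it out. It follows the 15-field KITTI object label layout (type, truncated, occluded, alpha, the 2D box as left/top/right/bottom, then 3D dimensions, location, and rotation_y). Filling the fields other than the 2D box with zeros is my assumption for 2D-only detection; I haven't confirmed that the training tool ignores them. The function name kitti_label_line is made up.

```python
def kitti_label_line(obj_type, left, top, right, bottom):
    """Build one KITTI-format label line for a 2D bounding box in pixels."""
    if right <= left or bottom <= top:
        raise ValueError("box must have positive width and height")
    fields = [
        obj_type,            # class name, e.g. "Pedestrian"
        "0.0",               # truncated (placeholder)
        "0",                 # occluded (placeholder)
        "0.0",               # alpha (placeholder)
        f"{left:.2f}",       # 2D box: left
        f"{top:.2f}",        # 2D box: top
        f"{right:.2f}",      # 2D box: right
        f"{bottom:.2f}",     # 2D box: bottom
        "0.0", "0.0", "0.0", # 3D dimensions h, w, l (placeholders)
        "0.0", "0.0", "0.0", # 3D location x, y, z (placeholders)
        "0.0",               # rotation_y (placeholder)
    ]
    return " ".join(fields)


# One line per object; write all lines for an image to its own .txt file.
print(kitti_label_line("Pedestrian", 100, 150, 180, 320))
```

That still means reading the coordinates off each image by hand, which is why a drawing tool would be so much nicer.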