Hello,
I have trained SSD (Single Shot MultiBox Detector) on a customized version of the VOC dataset, preparing the data by following the steps in SSD's README. I would now like to fine-tune this model on 4-channel (RGB+D) input images.
SSD uses a custom data layer that reads 'AnnotatedDatum' records. The data preparation for it is given in the README.md and is similar to Pascal VOC's data preparation. I would like to feed in 4-channel images with their corresponding annotations, but I am unable to figure out how to prepare them.
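To make the goal concrete, here is a minimal sketch of the kind of 4-channel array I have in mind. It only uses NumPy and does not touch SSD's data tools. The function name `stack_rgbd`, the 300 x 300 size, and the assumption that depth is already an 8-bit map aligned with the RGB image are all mine, purely for illustration.

```python
import numpy as np

def stack_rgbd(rgb, depth):
    """Combine an (H, W, 3) RGB image and an (H, W) depth map into a (4, H, W) array."""
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("rgb must have shape (H, W, 3)")
    if depth.shape != rgb.shape[:2]:
        raise ValueError("depth must have shape (H, W) matching rgb")
    # Append depth as a fourth channel, giving shape (H, W, 4).
    rgbd = np.dstack([rgb, depth])
    # Reorder to channels-first (C, H, W), the per-image layout of Caffe blobs.
    return rgbd.transpose(2, 0, 1)

# Dummy data standing in for a real, aligned RGB/depth pair (assumption).
rgb = np.zeros((300, 300, 3), dtype=np.uint8)
depth = np.ones((300, 300), dtype=np.uint8)
rgbd = stack_rgbd(rgb, depth)
print(rgbd.shape)  # (4, 300, 300)
```

What I cannot work out is how to get an array like this, together with its bounding-box annotations, into the database format that SSD's data layer expects.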
What should I do differently to prepare the dataset correctly?
Regards,
Ankit