I'm trying to use the VGG16 network from keras.applications and am testing it on the ImageNet validation set (i.e. ILSVRC2012_img_val).
The first image in this set appears to be a snake on a beach, and, reasonably enough, the top 5 predictions are sea snake, water snake, diamondback, rock python and sidewinder. The corresponding output nodes are 65, 58, 67, 62 and 68.
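For reference, a minimal sketch of how a top-5 list of node indices like the one above can be read off the network's output. It assumes `probs` is the 1000-element output vector for a single image (for example the first row of the model's prediction array); the function name `top_k_indices` is just illustrative.

```python
def top_k_indices(probs, k=5):
    """Return the indices of the k largest entries of probs, strongest first.

    probs is assumed to be a 1-D sequence of class scores (a plain list or a
    1-D NumPy array both work), one entry per output node.
    """
    # Sort node indices by their score, highest score first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:k]
```

The returned indices are 0-based positions in the output layer, which is the numbering that .keras/models/imagenet_class_index.json uses for its keys.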
However, ILSVRC2014_devkit/data/ILSVRC2014_clsloc_validation_ground_truth.txt, which seems to hold the ground truth labels for the validation set, says that node 490 should have the strongest response for this image. (My understanding, though I forget where I read it, is that the 2014 devkit files really do correspond to the 2012 dataset.) According to .keras/models/imagenet_class_index.json, however, node 490 corresponds to chain mail.
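The comparison described above can be sketched roughly as follows. This assumes the ground truth file has one integer label per line, in validation image order, and that the class index JSON maps string node indices to [WordNet ID, name] pairs. Both helper names are hypothetical, and the sketch deliberately treats the ground truth value as an output node index, which is exactly the step that may be wrong.

```python
def ground_truth_label(lines, image_number):
    """Return the integer label on the given 1-based line.

    lines is any iterable of text lines, such as an open file handle.
    Assumption: one integer label per line, in validation image order.
    """
    for n, line in enumerate(lines, start=1):
        if n == image_number:
            return int(line.strip())
    raise IndexError("no label for image %d" % image_number)


def class_name(class_index, node):
    """Look up (wordnet_id, name) for a node in the loaded class index dict.

    Assumption: keys are node indices stored as strings, e.g. "490".
    """
    wnid, name = class_index[str(node)]
    return wnid, name
```

In use, `lines` would be the opened ground truth file and `class_index` the result of `json.load` on imagenet_class_index.json; calling `class_name(class_index, ground_truth_label(f, 1))` then reproduces the lookup that yields chain mail for the first image.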
So I think the trained net is classifying correctly, but I obviously have a major misunderstanding about how to find the ground truth for ImageNet's validation set. If anyone can explain what I'm getting wrong, it would be a big help.
Thanks.