I trained a model with Caffe (through the command line) and, according to the log files, it reaches an accuracy of 63%. However, when I run a Python script (acc2.py) to test the accuracy, every image is predicted as the same class, with very similar (but not quite identical) prediction values. My goal is to compute the per-class accuracy, but in the example below I just display the predictions for a few images.
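For reference, the per-class accuracy I want could be computed from the true and predicted labels like this (a minimal NumPy sketch, independent of Caffe; `per_class_accuracy` is just an illustrative helper, not part of my script):

```python
import numpy as np

def per_class_accuracy(y_true, y_pred, num_classes):
    """Fraction of correctly classified samples within each class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    accs = []
    for c in range(num_classes):
        mask = y_true == c  # samples whose true label is class c
        if mask.sum() == 0:
            accs.append(float('nan'))  # no samples of this class in the set
        else:
            accs.append(float((y_pred[mask] == c).mean()))
    return accs

# The predicted label is the argmax of each probability vector
probs = np.array([
    [0.20748076, 0.20283087, 0.04773897, 0.28503627, 0.04591063, 0.21100247],
    [0.21177764, 0.20092578, 0.04866471, 0.28302929, 0.04671735, 0.20888527],
])
print(probs.argmax(axis=1))  # both rows peak at class 3
```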
Here are some examples of predictions:
[ 0.20748076 0.20283087 0.04773897 0.28503627 0.04591063 0.21100247] (label 0)
[ 0.21177764 0.20092578 0.04866471 0.28302929 0.04671735 0.20888527] (label 4)
[ 0.19711637 0.20476575 0.04688895 0.28988105 0.0465695 0.21477833] (label 3)
[ 0.21062914 0.20984225 0.04802448 0.26924771 0.05020727 0.21204917] (label 1)
It looks like a preprocessing problem, but I can't figure out what it is. The input data is grayscale, and we convert it to RGB by duplicating the single channel three times. What could be causing the problem?
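For context, the grayscale-to-RGB duplication is essentially the following (a NumPy sketch with a made-up 4×4 image; the point is that the test-time pipeline — channel order, mean subtraction, and scaling — has to match what was used during training):

```python
import numpy as np

# A hypothetical 4x4 grayscale image with pixel values in [0, 255]
gray = np.arange(16, dtype=np.float32).reshape(4, 4)

# Duplicate the single channel three times to get an H x W x 3 image
rgb = np.repeat(gray[:, :, np.newaxis], 3, axis=2)

print(rgb.shape)                                    # (4, 4, 3)
print(np.array_equal(rgb[:, :, 0], rgb[:, :, 2]))   # True: channels identical
```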