Dear Caffe users,
apologies if a similar question has been asked before, but I couldn't find a related answer.
I have trained an MNIST-style classification model on the full set of alphanumeric characters (a-z, 0-9), and what I find confusing is that, looking at the top-2 classification results, the output is almost always binary.
I can confirm that the class probabilities are indeed continuous, but in the majority of cases the model is absolutely certain (1.0) about the predicted class, and only very rarely is it slightly uncertain (0.99 vs. 0.01).
However, for intuitively ambiguous characters (0/O, I/1) I would expect less certainty.
How can it be explained that the outputs are not something like 0.6/0.4, which is what I would expect? To some extent it boils down to the question of verifying whether what the model has learned is actually correct.
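For reference, here is a toy example (plain NumPy, hypothetical logit values, not taken from my actual net) of the kind of saturation I am seeing: even a modest gap between the top two pre-softmax scores already pushes the softmax output very close to 1.0.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Hypothetical logits for four classes; a gap of a few units
# between the top two already yields a near-one-hot output.
logits = np.array([8.0, 2.0, 0.5, -1.0])
probs = softmax(logits)
print(probs.round(4))  # top class comes out above 0.99
```

So my question is essentially whether such saturated logits are expected behaviour for a well-trained model, or a sign of something being off.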
Thank you in advance,
Eduard
PS: This is essentially the regular MNIST model, except that the number of outputs is increased to 36.