Recently, I ran into trouble using the MsCeleb1M-Faces dataset to train a face recognition model. The softmax loss is too large and stays stable around the
value -log(1/N), where N is the number of classes. I ran many experiments to confirm this: whenever I changed the class number N, the softmax loss settled at the new value of
-log(1/N). I also changed the network, but that did not help. I'm really confused. I searched Google for a long time but found nothing. Has anyone run into the same
problem? Or can someone help me? Thanks very much!
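
For reference, -log(1/N) = log(N) is the cross-entropy a softmax classifier gets when all N logits are equal, i.e. when it assigns probability 1/N to every class. Here is a minimal NumPy sketch that reproduces that value. The function names `softmax_cross_entropy` and `uniform_loss` and the class counts are only for illustration; they are not from my training code.

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample, using a numerically stable log-softmax."""
    shifted = logits - np.max(logits)  # subtract the max to avoid overflow in exp
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return -log_probs[label]

def uniform_loss(num_classes):
    """Loss when every class has the same logit, so each probability is 1/N."""
    logits = np.zeros(num_classes)
    return softmax_cross_entropy(logits, label=0)

# Hypothetical class counts, chosen only to show how the value moves with N.
for n in (10, 1000, 100000):
    print(n, uniform_loss(n), -np.log(1.0 / n))
```

The two printed numbers match for each N, so a loss that sits at -log(1/N) is consistent with the network giving (nearly) the same probability to every class, whatever N is.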