I have attempted training a few times.
I have 4K classes, with ~4 million training images and ~200K validation images.
I have set the test batch size and test_iter so that batch size x test_iter = number of validation images.
I have used various batch sizes and learning rates.
Every time, after a few thousand iterations, the test accuracy gets stuck at a value close to 0.1%.
The training loss fluctuates around the random-guess level, i.e. around 8.2 (for 4K classes, ln(4000) ≈ 8.29).
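
As a sanity check on what "random guess" means here, this is a small sketch of the chance-level baseline, assuming a softmax cross-entropy loss and exactly 4000 equally likely classes (my "4K" is approximate, and the function names are just for illustration). A network that outputs a uniform distribution would have a loss of ln(K) and a top-1 accuracy of 1/K.

```python
import math


def chance_loss(num_classes: int) -> float:
    """Cross-entropy of a uniform prediction over num_classes: -ln(1/K) = ln(K)."""
    return math.log(num_classes)


def chance_accuracy(num_classes: int) -> float:
    """Expected top-1 accuracy of a uniform random guess over balanced classes."""
    return 1.0 / num_classes


if __name__ == "__main__":
    k = 4000  # assumed exact class count
    print(f"chance loss:     {chance_loss(k):.2f}")       # 8.29
    print(f"chance accuracy: {100 * chance_accuracy(k):.3f} %")  # 0.025 %
```

So a loss hovering around 8.2 to 8.3 is what I would expect if the network has not learned anything beyond a roughly uniform output.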