Hi Saeed,
There is no obvious relationship between the number of classes and the number of training examples you need. It depends on many factors (including how hard the classes are to tell apart, how diverse each class is, etc.).
I only remember that Yann LeCun said it is really hard to learn a good ConvNet from a binary problem alone. The main problem is the low diversity between examples (50% of the examples belong to the same class). This makes the ConvNet prone to learning non-general features: it learns features specific to each class, which sometimes leads to worse results or overfitting.
To sum up:
1. I think you need fewer than 1.0M examples to train AlexNet. Something like 300k-500k may be enough (maybe even 200k).
2. If you really want to solve that binary problem with AlexNet, use the net already trained on ImageNet (models/bvlc_alexnet). Just fine-tune it on your classes.
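
A rough sketch of how the fine-tuning in point 2 could be started, assuming the `caffe` binary is on your PATH and the pretrained weights were downloaded to models/bvlc_alexnet/bvlc_alexnet.caffemodel. The solver path `my_binary/solver.prototxt` and the helper names are made up for illustration. Caffe copies weights by layer name, so in your train prototxt you would normally rename the last layer (fc8) and set its num_output to 2 so that it is trained from scratch while the other layers start from the ImageNet weights.

```python
import subprocess


def build_finetune_command(solver_path, weights_path, caffe_bin="caffe", gpu=None):
    """Build the argument list for `caffe train` starting from pretrained weights.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    cmd = [caffe_bin, "train", "-solver", solver_path, "-weights", weights_path]
    if gpu is not None:
        # Optional GPU id, passed through to the caffe tool.
        cmd += ["-gpu", str(gpu)]
    return cmd


def run_finetune(cmd):
    """Run the assembled command; raises CalledProcessError if caffe fails."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = build_finetune_command(
        solver_path="my_binary/solver.prototxt",  # assumed path, adjust to yours
        weights_path="models/bvlc_alexnet/bvlc_alexnet.caffemodel",  # assumed location
    )
    # Print the command first; call run_finetune(cmd) once the paths are correct.
    print(" ".join(cmd))
```

In the solver it usually helps to use a lower base learning rate than for training from scratch, since most layers are already close to a good solution.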
Regards,
Bartosz