Train AlexNet for Binary Classification, How much data is required? Still > 1.0M samples?

Saeed Izadi

Aug 18, 2015, 4:57:40 AM
to Caffe Users
I'm going to train AlexNet for a two-class problem (discriminating cars from, say, boats). I studied the reference paper and found that the authors trained their model on 1.2M images. In my own problem I have a very large dataset (more than 500K samples per category). However, since my problem is a binary classification (rather than a 1000-category classification), I'm wondering whether it is necessary to train the model on > 1.0M samples (500K per category), or whether fewer would be sufficient?
Thanks
-Saeed

Bartosz Ludwiczuk

Aug 18, 2015, 12:25:10 PM
to Caffe Users
Hi Saeed,
there is no simple relationship between the number of classes and the number of training examples you need. It depends on many factors (including how hard the classes are to separate, how diverse each class is, etc.).
I only remember that Yann LeCun said it is really hard to learn a good ConvNet from a binary problem alone. The main issue is the low diversity of the labels: half of the examples point to the same class. This makes the ConvNet prone to learning non-general features. It learns features specific to each of the two classes, which sometimes leads to worse results or overfitting.

To sum up:
1. I think you need fewer than 1.0M examples to train AlexNet. Something like 300k-500k may be enough (maybe even 200k).
2. If you really want a good result on that binary problem using AlexNet, start from the net already trained on ImageNet (models/bvlc_alexnet) and just fine-tune it on your two classes.
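
As a rough illustration of point 2, here is a minimal Python sketch that prints a replacement final layer with two outputs and the matching `caffe train` command. It is modelled on Caffe's own fine-tuning example, not on anything tested in this thread. The layer name `fc8_binary`, the solver path, the learning-rate multipliers, and the weights file name `bvlc_alexnet.caffemodel` are assumptions for illustration. The idea is that a renamed last layer gets fresh weights, while every layer whose name is unchanged is initialised from the ImageNet model.

```python
# Sketch only: helper names, the layer name "fc8_binary" and the file paths
# in the usage section are hypothetical, chosen for illustration.
import shlex


def binary_head(name="fc8_binary", bottom="fc7", num_output=2):
    """Return prototxt text for a freshly initialised final layer.

    Because the name differs from the original "fc8", Caffe does not copy
    the 1000-way ImageNet weights into it when fine-tuning.
    """
    lines = [
        'layer {',
        '  name: "%s"' % name,
        '  type: "InnerProduct"',
        '  bottom: "%s"' % bottom,
        '  top: "%s"' % name,
        # Assumed multipliers: let the new layer learn faster than the rest.
        '  param { lr_mult: 10 decay_mult: 1 }  # weights',
        '  param { lr_mult: 20 decay_mult: 0 }  # biases',
        '  inner_product_param {',
        '    num_output: %d' % num_output,
        '    weight_filler { type: "gaussian" std: 0.01 }',
        '    bias_filler { type: "constant" value: 0 }',
        '  }',
        '}',
    ]
    return "\n".join(lines)


def finetune_command(solver, weights, gpu=None):
    """Build the argument list for `caffe train` with pretrained weights."""
    cmd = ["caffe", "train", "-solver", solver, "-weights", weights]
    if gpu is not None:
        cmd += ["-gpu", str(gpu)]
    return cmd


if __name__ == "__main__":
    print(binary_head())
    cmd = finetune_command(
        "models/cars_vs_boats/solver.prototxt",         # hypothetical path
        "models/bvlc_alexnet/bvlc_alexnet.caffemodel",  # assumed file name
        gpu=0,
    )
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed layer would replace the original `fc8` block in a copy of the AlexNet train_val.prototxt, with the loss and accuracy layers pointed at the new top. A lower base learning rate in the solver than for training from scratch is the usual choice here, but that value is a judgment call for your data.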

Regards,
Bartosz