Train AlexNet for Binary Classification, How much data is required? Still > 1.0M samples?


Saeed Izadi

Aug 18, 2015, 4:57:40 AM

to Caffe Users

I'm going to train AlexNet on a two-class problem (discriminating cars from, say, boats). I studied the reference paper and found that they trained their model on 1.2M images. In my own problem I have a very large dataset (more than 500K samples for each category). However, since my problem is a binary classification (instead of a 1000-category classification), I'm wondering whether it is necessary to train the model on > 1.0M samples (500K per category), or whether fewer would be sufficient?

Thanks

-Saeed

Bartosz Ludwiczuk

Aug 18, 2015, 12:25:10 PM

to Caffe Users

Hi Saeed,

There is no obvious correlation between the number of classes and the number of training examples needed. It depends on a lot of factors (including how hard the classes are to separate, how diverse each class is, etc.).

I only remember that Yann LeCun said it is really hard to learn a good ConvNet from a binary problem alone. The main problem is the low diversity among examples (50% of the examples belong to the same class). This makes the ConvNet prone to learning non-general features: it learns features specific to each class, which sometimes leads to worse results or overfitting.

To sum up:

1. I think you need fewer than 1.0M examples to train AlexNet. Something like 300k-500k may be enough (maybe even 200k).

2. If you really want to solve this binary problem with AlexNet, start from the net already trained on ImageNet (models/bvlc_alexnet) and just fine-tune it on your classes.
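
A rough sketch of both points, in case it helps. It assumes your image paths are grouped by class in a Python dict. The helper names (balanced_subsample, finetune_command), the label assignment (0 = car, 1 = boat) and the solver path are made up for illustration; only the caffe train -solver/-weights flags and the models/bvlc_alexnet directory are real Caffe conventions. The weights file name assumes you have downloaded the model into that directory.

```python
import random


def balanced_subsample(paths_by_class, per_class, seed=0):
    """Draw `per_class` items from every class, without replacement.

    `paths_by_class` maps an integer label to a list of image paths
    (hypothetical layout). Returns a shuffled list of (path, label) pairs.
    """
    rng = random.Random(seed)
    subset = []
    for label, paths in sorted(paths_by_class.items()):
        if per_class > len(paths):
            raise ValueError("class %r has only %d samples" % (label, len(paths)))
        subset.extend((p, label) for p in rng.sample(paths, per_class))
    rng.shuffle(subset)  # mix the classes so batches are not single-class
    return subset


def finetune_command(solver,
                     weights="models/bvlc_alexnet/bvlc_alexnet.caffemodel"):
    """Build the command line that starts training from pretrained weights.

    `solver` is the path to your own solver prototxt (an assumption here).
    """
    return ["caffe", "train", "-solver", solver, "-weights", weights]


if __name__ == "__main__":
    # Toy stand-in for the real dataset: 0 = car, 1 = boat (assumed labels).
    data = {
        0: ["car_%d.jpg" % i for i in range(1000)],
        1: ["boat_%d.jpg" % i for i in range(1000)],
    }
    # Try a few training-set sizes and compare validation accuracy for each.
    for per_class in (100, 250, 500):
        subset = balanced_subsample(data, per_class)
        print(per_class, len(subset))
    print(" ".join(finetune_command("my_solver.prototxt")))
```

Training on a few such subsets and watching where validation accuracy stops improving should tell you how much data you actually need. For the fine-tuning run, remember that Caffe copies weights by layer name, so the last layer (fc8) should be renamed and set to 2 outputs in your train prototxt so that it is trained from scratch.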

Regards,

Bartosz
