ImageNet ILSVRC12 - Subset + Data Set Types


chtp...@gmail.com

Oct 8, 2017, 11:56:55 PM
to Caffe Users
Hi All,

I have the full ImageNet ILSVRC12 data set. After running get_ilsvrc_aux.sh [1] I get train/val/test txt files which, if I am correct, list the image identifiers fed to create_imagenet.sh [2].

I have a few questions:
  1. Since this data set is too large, for now I just want to use a subset of it in LMDB format to quickly test larger networks. Is it fine to use just the first 100 or 500 (for example) names in the above text files to generate a smaller data set?
  2. During the training phase, why is the val data set of ImageNet used and not the test data? It seems the test data is never used?
  3. val.txt has rows in the following format: ILSVRC2012_val_00000001.JPEG 65. What is the significance of 65 in this format?

Thank you.
Chetan

Przemek D

Oct 9, 2017, 9:55:39 AM
to Caffe Users
1. For debugging you can use the first 100 examples, but this will only show whether your network is structurally correct. It will not learn anything - note that there are 1000 classes in the ILSVRC data, so even 10k images aren't likely to bring you anywhere near convergence.
2. During the original competition, participants were given training and validation data to create and test their algorithms. The algorithms were then evaluated on the test data to compare teams and determine the winner. Think of the test set as a kind of global validation set that stays hidden from the participants during the competition to prevent them from using it in learning.
3. The number is the index of the class this image belongs to.
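
To illustrate point 1, here is a minimal sketch of picking a small subset that still touches several classes. It assumes every row has the form `<path> <class index>`, as in the val.txt example above. The function name `subset_per_class` and the sample rows are made up for illustration, and the kept rows would still have to be written to a list file before being converted to LMDB.

```python
from collections import defaultdict


def subset_per_class(rows, per_class):
    """Keep at most `per_class` rows for each class index, preserving order.

    Each row is assumed to look like "<path> <class index>".
    """
    kept = []
    seen = defaultdict(int)  # class index -> number of rows kept so far
    for row in rows:
        row = row.strip()
        if not row:
            continue  # skip blank lines
        _path, label = row.rsplit(None, 1)
        if seen[label] < per_class:
            seen[label] += 1
            kept.append(row)
    return kept


# Hypothetical rows, not taken from the real train.txt.
rows = [
    "a/img_1.JPEG 0",
    "a/img_2.JPEG 0",
    "a/img_3.JPEG 0",
    "b/img_4.JPEG 1",
    "b/img_5.JPEG 1",
]
small = subset_per_class(rows, per_class=2)
print(small)
```

Taking the first 100 rows is simpler, but if a list file happens to be grouped by class, those rows may cover only one or two classes. Capping the count per class avoids that, although a subset this small still only checks that the pipeline runs.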

Hope that helps.

chtp...@gmail.com

Oct 9, 2017, 12:00:22 PM
to Caffe Users
Hi Przemek,

On Monday, October 9, 2017 at 6:55:39 AM UTC-7, Przemek D wrote:
1. For debugging you can use the first 100 examples, but this will only show whether your network is structurally correct. It will not learn anything - note that there are 1000 classes in the ILSVRC data, so even 10k images aren't likely to bring you anywhere near convergence.

I just want to run networks from start to end and make sure I use only ImageNet data, so I am not too worried about learning for now. As I understand it, I can directly keep desired number of rows in these text files and there won't be an issue?

I was under the impression that these files are written in a specific way and that each row in one of the three files is correlated with some row in another, so deleting rows without knowing that relationship would be an incorrect use of them.
 
2. During the original competition, participants were given training and validation data to create and test their algorithms. The algorithms were then evaluated on the test data to compare teams and determine the winner. Think of the test set as a kind of global validation set that stays hidden from the participants during the competition to prevent them from using it in learning.

This is clear now. Since I downloaded the data after the competition ended, I also have the test set.

The data set I have also contains "train_t3". I am guessing this was released in addition to train and that there is nothing special about it.
 
3. The number is the index of the class this image belongs to.

Can you please share more details? Does this index come from the train.txt file? As I wrote in my first response, I just want to ensure there is no correlation between how each row is written in test, train and val txt files.

Thanks,
Chetan

Przemek D

Oct 11, 2017, 6:02:06 AM
to Caffe Users
I can directly keep desired number of rows in these text files and there won't be an issue?
There shouldn't be one, as far as I know.
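
As a rough sketch of what trimming a list file could look like, assuming each row is self-contained so that dropping the rest does not affect the rows that remain: the helper `keep_first_rows` and the file names below are hypothetical, and the demo writes made-up rows to a temporary directory rather than touching the real val.txt.

```python
import itertools
import os
import tempfile


def keep_first_rows(src_path, dst_path, n):
    """Copy the first `n` rows of a list file to a new file; return rows written."""
    written = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in itertools.islice(src, n):
            dst.write(line)
            written += 1
    return written


# Demo on a throwaway file with made-up rows (stand-ins for val.txt).
workdir = tempfile.mkdtemp()
full_list = os.path.join(workdir, "val.txt")
small_list = os.path.join(workdir, "val_small.txt")
with open(full_list, "w") as f:
    for i in range(1, 6):
        f.write("img_%08d.JPEG %d\n" % (i, i % 3))

count = keep_first_rows(full_list, small_list, 3)
print(count)
```

Writing to a separate file keeps the full list intact, so the same script can later be pointed at the complete data set again.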

Can you please share more details?
The standard format of an image list file for classification is:
<path/to/image> <class index>
This way, each image is bound to the class it belongs to. Somewhere else you can keep class descriptions that map indices to class names - but the idea is, following the example from your original post, that the image "ILSVRC2012_val_00000001.JPEG" belongs to class 65. I admit I don't understand what you mean by "no correlation between how each row is written in test, train and val txt files" - the format should stay the same in each file: filename, space, class id.
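
To make the row format concrete, here is a small sketch that splits one row into its two fields. It assumes only the `<path/to/image> <class index>` layout described above; `parse_list_row` is an illustrative name, not part of Caffe.

```python
def parse_list_row(row):
    """Split one "<path> <class index>" row into (path, class_index)."""
    # Split on the last run of whitespace so the path itself may contain spaces.
    path, label = row.strip().rsplit(None, 1)
    return path, int(label)


path, class_index = parse_list_row("ILSVRC2012_val_00000001.JPEG 65")
print(path, class_index)
```

Each row is parsed on its own, without looking at any other row or file, which is why keeping only some of the rows leaves the remaining ones valid.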