Caffe Workflow - is it right?


Bene

Dec 14, 2016, 7:49:48 AM
to Caffe Users
Right now I am running Caffe in train mode.
I just want to use one synset of the ImageNet 2012 database -> chairs

So here is my workflow:

1. I downloaded the synset of Chairs (1460 images)

2. I downloaded the train / validation data from ImageNet (50,000 images)

3. I created an LMDB for both data sets, resizing the height and width of the images

a) for the synset of chairs with the included train.txt ( example row: n0300167_18182.JPEG 0 ) -> I think "0" means "class 0 = chair", doesn't it?

b) for the validation data with the included val.txt ( example row: ILSVRC2012_val_0050000.JPEG 355 ) -> I think "355" means "class 355 = some other class", doesn't it?
First problem: the created LMDB is limited to 2 GB -> only 40,000 images ended up in it. I was not able to solve this problem, so I continued with this LMDB.

4. According to the Caffe | ImageNet tutorial you have to subtract the image mean from every image (whatever that means). So I did that twice: once for the train data LMDB and once for the val data LMDB

-> Result: train.binaryproto and val.binaryproto
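In case it helps with the "whatever that means": the image mean is just the per-pixel average over the training images, and Caffe's data layer subtracts it from every input so the data is roughly centered around zero. A rough numpy sketch of the idea (synthetic data, not the actual compute_image_mean tool):

```python
import numpy as np

# Pretend data set: 100 images of shape 3 x 8 x 8 (channels, height, width).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(100, 3, 8, 8)).astype(np.float64)

# The mean image is the element-wise average over all training images.
mean_image = images.mean(axis=0)

# At train/test time each input image has the mean subtracted,
# which centers the data set around zero.
centered = images - mean_image
print(centered.mean())  # essentially zero over the whole set
```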

5. Last but not least, it is necessary to add the right paths to the train_val.prototxt, which is referenced by the solver.prototxt

-> According to the Caffe | ImageNet tutorial you have to add the path of the binaryproto file to the train_val.prototxt - but which one? I chose the train.binaryproto file

After a little bit of configuration of the batch_size (I am using a GTX 960M (4 GB memory) with CUDA and chose batch size train: 50 + batch size val: 5) I was able to start the training process.

Right now it is at 10,000 iterations without an error, but I always have a loss of 0:

I1214 13:42:39.345842  9068 solver.cpp:228] Iteration 9980, loss = 0
I1214 13:42:39.345842  9068 solver.cpp:244]     Train net output #0: loss = 0 (* 1 = 0 loss)

Testing result after 10,000 iterations:

I1214 13:42:50.959086  9068 solver.cpp:337] Iteration 10000, Testing net (#0)
I1214 13:43:10.901937  9068 solver.cpp:404]     Test net output #0: accuracy = 0.0006
I1214 13:43:10.901937  9068 solver.cpp:404]     Test net output #1: loss = 87.2837 (* 1 = 87.2837 loss)


Is that a normal "result" or should I stop the process? Keep in mind I have only one "class" -> chairs

Cheers and thank you!
Bene


 

Patrick McNeil

Dec 14, 2016, 9:27:17 AM
to Caffe Users
I think the issue you are having is related to your dataset.

If you want to do a binary classification, you want your data to have two classes (0 = not chair, 1 = chair, or vice versa). If your data has a file with a class of 355 (or any value greater than 1), I believe Caffe will assume you have all of the classes in between (so Caffe thinks you have classes 0-355, i.e. 356 classes total). So if you only had values of 0 and 355 in your data set, you would get poor training results (since the classes 1-354 would never be trained). Also, your data sets should all use the same labels (training, validation, and testing data). So if you have not chair = 0 and chair = 1 in training, you should use the same values for validation. If they change between the data sets, the network will not predict the correct class.
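Following that advice, one way to collapse the ImageNet labels into a binary chair / not-chair problem is to rewrite the label files before building the LMDB. A minimal sketch (the helper name and example rows are made up, but the "<image> <label>" line format is the one convert_imageset expects):

```python
# Collapse a multi-class Caffe label file ("<image> <label>" per line)
# into a binary one: the chair label stays 0, everything else becomes 1.
def to_binary_labels(lines, chair_label="0"):
    out = []
    for line in lines:
        name, label = line.rsplit(" ", 1)
        out.append("%s %s" % (name, "0" if label == chair_label else "1"))
    return out

rows = [
    "n03001627_18205.JPEG 0",
    "ILSVRC2012_val_0050000.JPEG 355",
]
print(to_binary_labels(rows))
# -> ['n03001627_18205.JPEG 0', 'ILSVRC2012_val_0050000.JPEG 1']
```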

In addition, there should not be a 2 GB limit on your LMDB. Are you storing the LMDB on a file system that only supports files of that size? Did you get an error when making the database? I found that if there are issues with the encoding of an image, the LMDB creation process will error out.

The mean image should be the same for both training and validation.

The output of your training is not correct. Your testing accuracy is only 0.06%. That is not very good on the surface, and if Caffe thinks you have 356 classes, random guessing would already give about 0.28% (1/356), so the network is actually doing worse than chance. I think the issue is related to the number of classes you have defined and the limit on your database size.

Patrick

Bene

Dec 14, 2016, 12:34:07 PM
to Caffe Users
Hello Patrick,
thank you for answering.
Now I did the following:
I took all the chair data and gave them a "0" + I took some validation data (ca. 29,000 images) and gave them a "1"

Example from the train.txt:
n03001627_18205.JPEG 0 -> chair
n03001627_18210.JPEG 0 -> chair
ILSVRC2012_val_00020000.JPEG 1 -> no chair
ILSVRC2012_val_00020001.JPEG 1 -> no chair
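One caveat with a list ordered like this (all chairs first, then all non-chairs): Caffe reads the LMDB sequentially, so the list should be shuffled before the database is built (convert_imageset also has a --shuffle flag for exactly this). A small stdlib sketch, with made-up file names:

```python
import random

# Two label groups, as in the train.txt above (file names are made up).
chairs = ["n03001627_%05d.JPEG 0" % i for i in range(3)]
others = ["ILSVRC2012_val_%08d.JPEG 1" % i for i in range(3)]

lines = chairs + others
random.seed(42)        # fixed seed so the order is reproducible
random.shuffle(lines)  # interleave the two classes

# In practice you would then write this to train.txt:
# open("train.txt", "w").write("\n".join(lines) + "\n")
print(lines)
```

Without shuffling, each mini-batch would contain only one class for long stretches of training, which makes the loss bounce around badly.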

After that I created the LMDB + resized the images to 128 x 128

Then I made the mean file: final.binaryproto

So: there is the validation data left - can I not simply use the chair LMDB (previously generated)? Or should it be a different data set? And if so: how should the val.txt look?


Also: the train phase and the test phase in my train_val.prototxt both need a .binaryproto file -> so is it right to use the final.binaryproto (generated from the train_lmdb) for both - even though the train_lmdb and the val_lmdb are not identical?

Sorry
I just started...
cheers

Bene


Patrick McNeil

Dec 14, 2016, 1:14:57 PM
to Caffe Users
Bene,

Typically you will have three data sets in total: training, validation, and testing. The training set is used for training the model. The validation set is used during the training process to determine how well the model is working on untrained data (to detect overfitting, for example). The testing set is used to test the model outside of the training process.

For all three data sets, you would want to use the same image mean. I normally just use the training set to create the image mean (you could use the entire data set as well). If your data sets are a good representation of your overall data (i.e. the training set is not all chairs and the validation set is not all non-chairs), your image mean should work well.

The val.txt should look the same as the train.txt where there is a single line for each file with the label at the end (0 or 1 in your case). 

Patrick