I've been trying to train a LeNet-style net to classify a set of images.
1. I created an lmdb database of labeled color images
2. I trained a LeNet5 architecture, using modified forms of train_lenet.sh, lenet_train_test.prototxt and lenet_solver.prototxt
3. These modifications were limited to the lenet_train_test.prototxt file:
a. modifying the data_param:source variable to refer to the lmdb with my data
b. changing the inner_product_param:num_output variable from 10 to 18, in order to reflect my 18 classes
4. The accuracy achieved when running my train_lenet.sh was ~95%
5. I next tried to reproduce this accuracy in an IPython notebook, using the Python caffe.Classifier object. The model file I used was a modified form of lenet.prototxt. The modification was limited to:
a. changing the second input_dim variable from 1 to 3 to reflect 3 color planes (i.e. RGB)
6. The resulting accuracy was ~50%
7. Changing the order of the color planes (channels) does not seem to improve anything.
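For reference, the two lenet_train_test.prototxt edits from step 3 look roughly like this (layer names are those of the stock LeNet example; the train lmdb path here is illustrative, not my actual one):

```
layer {
  name: "mnist"
  type: "Data"
  ...
  data_param {
    source: "examples/SGsurfaces/lmdbDBs_28_28/trainLMDB"  # path is illustrative
    batch_size: 64
    backend: LMDB
  }
}
...
layer {
  name: "ip2"
  type: "InnerProduct"
  ...
  inner_product_param {
    num_output: 18  # was 10; one output per class
  }
}
```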
If anyone knows of a way to see the intermediate results of a net (i.e. give the net a single image and inspect the output of each layer), it would really help me debug this. Alternatively, if anyone has a suggestion as to what my mistake might be, that would be a great help. In case it illuminates my mistake, here is the IPython code I used:
import caffe
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import os
import pickle
import sys
import random
caffe_home='/home/me/Caffe/'
# Set the right path to your model definition file, pretrained model weights,
# and the image you would like to classify.
MODEL_FILE = os.path.join(caffe_home,'caffe/examples/SGsurfaces/SG_Surf_lenet.prototxt')
PRETRAINED = os.path.join(caffe_home,'caffe/examples/SGsurfaces/Surf_lenet_iter_10000.caffemodel')
#Load database with images (really want to replace this with actual image files,
#but until things work, use the images the net actually trained on)
import lmdb
db_path = os.path.join(caffe_home,'caffe/examples/SGsurfaces/lmdbDBs_28_28/testLMDB')
lmdb_env = lmdb.open(db_path)
lmdb_txn = lmdb_env.begin()
lmdb_cursor = lmdb_txn.cursor()
#Create net
caffe.set_mode_gpu()
net = caffe.Classifier(MODEL_FILE, PRETRAINED,
                       raw_scale=255,
                       channel_swap=(0,1,2),
                       image_dims=(28, 28))
# Measure accuracy
correct = 0
total = 0
while lmdb_cursor.next() and total < 10000:
    #Get info from lmdb
    value = lmdb_cursor.value()
    #Get image
    datum = caffe.proto.caffe_pb2.Datum()
    datum.ParseFromString(value)
    #Get ground-truth label (only possible because this is the data set
    #the net trained with)
    label = int(datum.label)
    lmdb_image0 = caffe.io.datum_to_array(datum)
    if total == 0:
        print lmdb_image0.shape
    lmdb_image = lmdb_image0.transpose((1,2,0))
    lmdb_image = lmdb_image.astype(np.float32)
    #Get the net's prediction for the image
    prediction = net.predict([lmdb_image], oversample=False)
    #prediction = net.predict([lmdb_image], oversample=True)
    predLabel = np.argmax(prediction)
    #Accumulate accuracy stats
    trueLabel = label
    total += 1
    if trueLabel == predLabel:
        correct += 1
    if total % 1000 == 0:
        print total, correct, float(correct)/float(total)
print "Accuracy = ", float(correct)/float(total)
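After staring at this some more, one thing I'm suspicious of (an assumption on my part, not something I've verified): caffe.Classifier's raw_scale preprocessing multiplies the input array by 255, which is intended for images that caffe.io.load_image has already squashed into [0, 1]. The arrays I pull out of the lmdb Datum are already 0-255 floats, so predict() may be scaling them a second time:

```python
# A quick magnitude check of the suspected double-scaling (plain Python,
# no Caffe needed). caffe.io.load_image returns floats in [0, 1], and
# raw_scale=255 is meant to map those back to [0, 255] before the
# forward pass. The lmdb Datum pixels, however, are already in [0, 255].
raw_scale = 255.0

pixel_from_load_image = 200 / 255.0  # what Classifier expects: [0, 1]
pixel_from_lmdb_datum = 200.0        # what the lmdb Datum actually holds

print(pixel_from_load_image * raw_scale)  # back near 200: in range
print(pixel_from_lmdb_datum * raw_scale)  # 51000: far outside training range
```

If that is the problem, dividing lmdb_image by 255 before predict(), or dropping raw_scale, should make the two pipelines agree. (For per-layer inspection, the lower-level caffe.Net interface exposes each layer's activations via net.blobs[name].data.)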