How to use a trained model to classify an image in Caffe?


商明阳

Mar 20, 2017, 5:21:10 AM
to Caffe Users
I want to use a trained VGG model for classification. I have downloaded the model from its project page, but I don't know how to use it to classify an image in Caffe. I assume code for this already exists, but I have searched and can't find an answer or any example code. Can anyone help? Thanks very much!

mahdi yed

Mar 20, 2017, 10:09:59 AM
to Caffe Users
You can use Python for that: load your model, copy the image into the net's 'data' blob, then run a forward pass.

商明阳

Mar 20, 2017, 10:39:27 AM
to Caffe Users
Thanks for your reply. Can you give me an example link or some code? I understand the procedure, but I don't know how to do it in Caffe.

On Monday, March 20, 2017 at 10:09:59 PM UTC+8, mahdi yed wrote:

Gil Mor

Mar 21, 2017, 3:56:08 PM
to Caffe Users
Play with the classification example from the Caffe GitHub repository. It took me some time to get it working in my case. The size of your input images should match the input size in the model's deploy.prototxt (I used CaffeNet; I'm not sure how the VGG model is configured).
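One way to check the input size a model expects is to read it out of deploy.prototxt. The sketch below is only a rough illustration: `read_input_dims` is a hypothetical helper, the regex covers just the two common ways a prototxt declares its input (`input_dim:` lines or `dim:` entries inside a shape block), it assumes the input declaration comes first in the file, and the sample text is made up rather than copied from the VGG files.

```python
import re


def read_input_dims(prototxt_text):
    """Return the first four input dimensions (N, C, H, W) found in a deploy.prototxt string.

    Hypothetical helper: matches `input_dim: N` lines and `dim: N` entries,
    and assumes the input declaration appears before any other `dim:` fields.
    """
    dims = re.findall(r"\b(?:input_dim|dim)\s*:\s*(\d+)", prototxt_text)
    if len(dims) < 4:
        raise ValueError("could not find four input dimensions")
    return tuple(int(d) for d in dims[:4])


# Made-up sample in the older `input_dim` style.
sample = """
name: "example_net"
input: "data"
input_dim: 10
input_dim: 3
input_dim: 227
input_dim: 227
"""

batch, channels, height, width = read_input_dims(sample)
print("expected input: %d channels, %dx%d" % (channels, height, width))
```

To use it on a real model, read the file with `open(path).read()` and pass the text in; the height and width it reports are the size your images need after cropping or resizing.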

zho...@ihoment.com

Mar 21, 2017, 10:03:05 PM
to Caffe Users
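Hey, have you figured it out yet? I'm a beginner too, so please teach me. I've downloaded the model but don't know how to use it, and when I run the classify.py script I don't know how to read the results.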

mahdi yed

Mar 22, 2017, 1:04:01 PM
to Caffe Users
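Open a Python shell and import caffe and numpy (to work with arrays easily), then load the net:

import caffe
import numpy as np

net = caffe.Net('models/bvlc_reference_caffenet/deploy.prototxt',
                'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel',
                caffe.TEST)

Then read the image and copy it into the data blob. Here transformer is a caffe.io.Transformer that you have already configured with the preprocessing your model needs:

im = caffe.io.load_image('examples.jpg')
net.blobs['data'].data[...] = transformer.preprocess('data', im)  # put the image into the 'data' blob

Then run the forward pass:

out = net.forward()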
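To read the result of net.forward(), look at the probability vector in the output blob and pick the largest entries. A minimal sketch, assuming the output blob is called 'prob' and holds one probability per class; the probability vector and class names below are made up to stand in for real network output, and `top_k` is a hypothetical helper:

```python
def top_k(probs, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(labels[i], float(probs[i])) for i in order[:k]]


# Stand-in for out['prob'][0]: one probability per class.
probs = [0.05, 0.70, 0.20, 0.05]
# Hypothetical class names, in the same order as the network's outputs.
labels = ["cat", "dog", "fox", "owl"]

for label, p in top_k(probs, labels, k=2):
    print("%-4s %.2f" % (label, p))
```

The helper only uses indexing and `len`, so it should work the same way on the NumPy array that Caffe returns.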

On Monday, March 20, 2017 at 10:21:10 AM UTC+1, 商明阳 wrote:

Gil Mor

Mar 22, 2017, 7:05:38 PM
to Caffe Users

I'll post my working solution, but I'm almost sure you'll have to adapt a lot of it to your needs.
I hope it gives you some direction. That's the best I can do.

I use CaffeNet, which comes with a deploy.prototxt that I use for classification.
I needed to change the name of the output layer (fc8 in CaffeNet), just as I had changed it in the train_val.prototxt (which also comes with Caffe).
I also changed the 'num_output' parameter in the output layer from 1000 to 2.

My images are 256x256 because that's the size CaffeNet requires for training. The size required for classification, however, is 227x227 (images are cropped during training).
I could have kept two folders with two different image sizes, but I didn't want that, so I used a workaround:
I 'resize' my images when I insert them into the lmdb, using the --resize_height and --resize_width arguments.
This means I don't resize the actual files, only the images inside the lmdb.
Then I create the image mean from the train_lmdb, and during classification,
after loading an image with caffe.io.load_image, I center-crop the numpy array (which is the loaded image).

Note that you need to apply the same transformations you applied to your training images.
For this I use caffe.io.Transformer. This might be model dependent.
For example: transformer.set_raw_scale('data', 255)  # rescale from [0, 1] to [0, 255] - needed for CaffeNet/AlexNet.
You need to check what preprocessing your model requires.
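To make those transformations concrete, here is a plain NumPy sketch of roughly what the Transformer settings in this post amount to: move channels first, swap RGB to BGR, rescale to [0, 255], then subtract the mean. The function name, the tiny test image, and the mean values are all made up for illustration, and the step order is an assumption about how caffe.io.Transformer applies its settings, so check it against your Caffe version.

```python
import numpy as np


def preprocess_like_transformer(image, mean_bgr, raw_scale=255.0):
    """Approximate the preprocessing described in this post with plain NumPy.

    image: float array of shape (H, W, 3), RGB, values in [0, 1]
    mean_bgr: per-channel mean in BGR order, on the [0, 255] scale
    """
    x = image.astype(np.float32)
    x = x.transpose(2, 0, 1)   # (H, W, C) -> (C, H, W)
    x = x[[2, 1, 0], :, :]     # RGB -> BGR
    x = x * raw_scale          # [0, 1] -> [0, 255]
    # Subtract one mean value per channel, broadcast over all pixels.
    x = x - np.asarray(mean_bgr, dtype=np.float32).reshape(3, 1, 1)
    return x


# A made-up 4x4 pure-red image and made-up mean values.
red = np.zeros((4, 4, 3), dtype=np.float32)
red[..., 0] = 1.0
out = preprocess_like_transformer(red, mean_bgr=[10.0, 20.0, 30.0])
print(out.shape, out[:, 0, 0])
```

The sketch also shows why the scale matters: the mean is stored on the [0, 255] scale, so subtracting it from an image left in [0, 1] would produce inputs far from anything the network saw during training.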

Here are the relevant functions.
Call them in this order:
first create train.txt and val.txt,
then call make_lmdb,
then make_image_mean_binaryproto,
and then binary_classification_from_raw_images.
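For the first step, each line of train.txt and val.txt is a filename followed by its label. A rough sketch of one way to build those lines when the class is encoded in the filename; the helper names, the filenames, the "dog" keyword, and the 80/20 split are all assumptions for illustration:

```python
import random


def make_label_lines(filenames, positive_keyword):
    """Return 'filename label' lines: label 1 if the keyword is in the name, else 0."""
    return ["%s %d" % (name, 1 if positive_keyword in name else 0) for name in filenames]


def split_train_val(lines, val_fraction=0.2, seed=0):
    """Shuffle the lines reproducibly and split them into (train, val) lists."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical filenames; "dog" is an assumed keyword marking class 1.
names = ["dog_001.jpg", "cat_001.jpg", "dog_002.jpg", "cat_002.jpg", "cat_003.jpg"]
train, val = split_train_val(make_label_lines(names, "dog"))
print("\n".join(train))
```

Writing the two lists out is then just a matter of joining them with newlines into train.txt and val.txt in the images folder.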

My code is a little rough in some places; I had to get a prototype going.

Note that I have some global variables:
caffe_root - path to $CAFFE_ROOT
caffe_tools - path to $CAFFE_ROOT/build/tools
my_model_data - path to my images folder
my_model_mean_binaryproto - path to the image mean binaryproto, which is created in my images folder
my_model - just the name of my project
CLASSIFICATION_IMAGE_SIZE = 227  # could be different for VGG

import os
import random
import subprocess
import sys

import numpy as np
import PIL.Image


def binary_classification_from_raw_images(model, weights):
    """
    Classify every image listed in val.txt and print the running accuracy.

    :param model: path to deploy.prototxt. comes with caffenet.
        You need to change the name of the output layer to the same name you gave your output layer in the train_val.prototxt
        and also change the num_output to 2.
    :param weights: path to the .caffemodel snapshot of the fine-tuned model. define how to save it in your solver.prototxt.
    :return: None
    """

    #### set your paths
    # caffe_root = caffe root.
    # my_model_mean_binaryproto = my trained images mean binaryproto.
    # my_model_data = path to my images folder. all my images are in the same folder and i differentiate them by keywords in the filename.
    # In my images folder I also have val.txt and train.txt with filename and their label for training and validation.
    # From these files I also create the lmdbs.

    # * Load `caffe`.
    # The caffe module needs to be on the Python path;
    # we'll add it here explicitly.

    sys.path.insert(0, caffe_root + '/python')
    import caffe

    # * Set Caffe to CPU mode and load the net from disk.
    caffe.set_mode_cpu()

    net = caffe.Net(model,       # defines the structure of the model
                    weights,     # contains the trained weights
                    caffe.TEST)  # use test mode (e.g., don't perform dropout)

    # * Set up input preprocessing. (We'll use Caffe's `caffe.io.Transformer` to do this, but this step is independent of other parts of Caffe, so any custom preprocessing code may be used).
    #
    # Our default CaffeNet is configured to take images in BGR format. Values are expected to start in the range [0, 255] and then have the mean ImageNet pixel value subtracted from them. In addition, the channel dimension is expected as the first (_outermost_) dimension.
    #
    # As caffe.io.load_image will load images with values in the range [0, 1] in RGB format with the channel as the _innermost_ dimension, we are arranging for the needed transformations here.

    #### I don't have .npy array with the mean.
    # load the mean ImageNet image (as distributed with Caffe) for subtraction
    # mu = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
    # mu = mu.mean(1).mean(1) # average over pixels to obtain the mean (BGR) pixel values
    # print 'mean-subtracted values:', zip('BGR', mu)

    # create transformer for the input called 'data'
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})

    # I don't have my image mean in .npy file but in binaryproto. I'm converting it to a numpy array.
    # Took me some time to figure this out.
    blob = caffe.proto.caffe_pb2.BlobProto()
    with open(my_model_mean_binaryproto, 'rb') as mean_file:
        data = mean_file.read()
    blob.ParseFromString(data)
    mu = np.array(caffe.io.blobproto_to_array(blob))
    mu = mu.squeeze()  # The output array had one redundant dimension.

    transformer.set_transpose('data', (2, 0, 1))     # move image channels to outermost dimension
    transformer.set_mean('data', mu)                 # subtract the dataset-mean value in each channel
    transformer.set_raw_scale('data', 255)           # rescale from [0, 1] to [0, 255] - needed for caffenet\Alexnet.
    transformer.set_channel_swap('data', (2, 1, 0))  # swap channels from RGB to BGR

    # ### 3. CPU classification
    #
    # * Now we're ready to perform classification. Even though we'll only classify one image at a time, we'll set a batch size of 32 to demonstrate batching.

    # set the size of the input (we can skip this if we're happy
    # with the default; we can also change it later, e.g., for different batch sizes)
    net.blobs['data'].reshape(32,  # batch size
                              3,   # 3-channel (BGR) images
                              CLASSIFICATION_IMAGE_SIZE, CLASSIFICATION_IMAGE_SIZE)  # image size is 227x227

    correct = 0
    count = 1

    # save labels and prediction in arrays for further statistical analysis.
    y_test, y_pred = [], []

    # in my case, I'm reading the images file names from val.txt.
    val_images = open(my_model_data + "/val.txt").readlines()

    # You don't need to shuffle.. This is how I want to see the output.
    random.shuffle(val_images)

    # I'm saving mis-classified filenames in a file.
    misclassified = open(my_model_data + "/misclassified.txt", "w")

    for image_name_n_label in val_images:

        # in val.txt every line contains - filename label
        image_file, label = my_model_data + "/" + image_name_n_label.split(' ')[0], int(image_name_n_label.split(' ')[1])
        y_test.append(label)

        print(count, ". image: " + os.path.basename(image_file) + " label: " + str(label))
        image = caffe.io.load_image(image_file)

        # image shape is (256, 256, 3). we want it (227, 227, 3) for caffenet.
        if image.shape[0] != CLASSIFICATION_IMAGE_SIZE:

            # I'm cropping the numpy array on the fly so that I don't have to mess with resizing
            # the actual images in a separate folder each time.
            image = center_crop_image(image, CLASSIFICATION_IMAGE_SIZE, CLASSIFICATION_IMAGE_SIZE)
            if image.shape[0] != CLASSIFICATION_IMAGE_SIZE:
                print("!!!!!!! cropped shape ", image.shape)

        transformed_image = transformer.preprocess('data', image)

        # copy the image data into the memory allocated for the net
        # (the single image is broadcast to every slot of the batch)
        net.blobs['data'].data[...] = transformed_image

        ### perform classification
        output = net.forward()

        output_prob = output['prob'][0]  # the output probability vector for the first image in the batch
        print(count, ". prob for 0 class: {0:.5f}, prob for 1 class: {1:.5f}".format(output_prob[0], output_prob[1]))

        predicted_label = output_prob.argmax()
        y_pred.append(predicted_label)

        if predicted_label == label:
            correct += 1
        else:
            print("!!!!!!!!!!!!!!! misclassified")
            misclassified.write(os.path.basename(image_file) + " " + str(label) + " " + str(predicted_label) + "\n")
            # display misclassified images.
            image = PIL.Image.open(image_file)
            image.show()

        accuracy = ((100. * correct) / (count))

        print(count, '. predicted class is: ', output_prob.argmax())
        print(count, ". accuracy: " + str(accuracy))

        print("")

        count += 1

    misclassified.close()
# -------------------------------------------------------------------------------------------------------
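The function above collects y_test and y_pred "for further statistical analysis" but stops at the running accuracy. A small sketch of what that analysis could look like for the two-class case, assuming you return or otherwise expose those two lists; the helper names and the sample labels are made up:

```python
def binary_confusion(y_true, y_pred):
    """Count true/false positives and negatives for labels in {0, 1}."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1:
            counts["tp" if truth == 1 else "fp"] += 1
        else:
            counts["tn" if truth == 0 else "fn"] += 1
    return counts


def precision_recall(counts):
    """Return (precision, recall) for class 1; 0.0 when a denominator is zero."""
    predicted_pos = counts["tp"] + counts["fp"]
    actual_pos = counts["tp"] + counts["fn"]
    precision = counts["tp"] / float(predicted_pos) if predicted_pos else 0.0
    recall = counts["tp"] / float(actual_pos) if actual_pos else 0.0
    return precision, recall


# Made-up labels and predictions standing in for y_test and y_pred.
y_test = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
counts = binary_confusion(y_test, y_pred)
print(counts, precision_recall(counts))
```

With unbalanced classes these counts say more than a single accuracy figure, since they show which kind of mistake the model makes.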

def center_crop_image(image, new_width, new_height):
    """Return the central new_height x new_width region of an (H, W, C) image array."""

    height, width, chan = image.shape

    # Offsets of the crop's top-left corner. With an odd size difference the
    # crop cannot be perfectly centered and sits one pixel toward the top/left.
    top = (height - new_height) // 2
    left = (width - new_width) // 2

    # Use explicit end indices: a negative end index of -0 would select an
    # empty slice when a dimension already has the target size.
    return image[top:top + new_height, left:left + new_width]

# -------------------------------------------------------------------------------------------------------

def make_lmdb(resize):
    """Rebuild train_lmdb and val_lmdb in my_model_data, resizing every image to resize x resize."""

    curr_dir = os.getcwd()
    os.chdir(my_model_data)

    # remove the lmdbs left over from a previous run
    subprocess.call(r"rm -r -f train_lmdb", shell=True)
    subprocess.call(r"rm -r -f val_lmdb", shell=True)
    subprocess.call(r"GLOG_logtostderr=1 " + caffe_tools + "/convert_imageset --resize_height={size} --resize_width={size} --shuffle ./ train.txt train_lmdb".format(size=resize), shell=True)
    subprocess.call(r"GLOG_logtostderr=1 " + caffe_tools + "/convert_imageset --resize_height={size} --resize_width={size} --shuffle ./ val.txt val_lmdb".format(size=resize), shell=True)

    os.chdir(curr_dir)

# ---------------------------------------------------------------------------------------

def make_image_mean_binaryproto():
    """Compute the mean image of train_lmdb and save it as <my_model>_mean.binaryproto."""
    subprocess.call(r"rm -r -f " + my_model_data + "/" + my_model + "_mean.binaryproto", shell=True)
    cmd = caffe_tools + "/compute_image_mean -backend=lmdb " + my_model_data + "/train_lmdb " + my_model_data + "/" + my_model + "_mean.binaryproto"
    subprocess.call(cmd, shell=True)

# ---------------------------------------------------------------------------------------
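If your paths contain spaces, the string-built shell commands above can break. One alternative is to pass the arguments to subprocess as a list. The sketch below only builds the argument lists, using the same flags as make_lmdb and make_image_mean_binaryproto; the helper names and the example paths are hypothetical, and nothing is executed:

```python
import os


def convert_imageset_cmd(tools_dir, image_root, list_file, lmdb_dir, size):
    """Build the convert_imageset argument list with the flags used in make_lmdb."""
    return [
        os.path.join(tools_dir, "convert_imageset"),
        "--resize_height=%d" % size,
        "--resize_width=%d" % size,
        "--shuffle",
        image_root,
        list_file,
        lmdb_dir,
    ]


def compute_image_mean_cmd(tools_dir, lmdb_dir, mean_file):
    """Build the compute_image_mean argument list used in make_image_mean_binaryproto."""
    return [os.path.join(tools_dir, "compute_image_mean"), "-backend=lmdb", lmdb_dir, mean_file]


# Hypothetical paths, for illustration only.
cmd = convert_imageset_cmd("/opt/caffe/build/tools", "./", "train.txt", "train_lmdb", 227)
print(" ".join(cmd))
# To actually run it:
# subprocess.call(cmd, env=dict(os.environ, GLOG_logtostderr="1"))
```

Because each argument is its own list element, no shell quoting is needed, and the GLOG_logtostderr setting moves from the command string into the env argument.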

On Monday, March 20, 2017 at 11:21:10 AM UTC+2, 商明阳 wrote:

Vinod Patel

May 4, 2017, 6:09:58 AM
to Caffe Users
Hi, I need a small clarification. Why do we need the image range [0, 255] for the AlexNet pretrained model? Why doesn't it work with images in the [0, 1] or [-1, 1] range?