Instead of using a huge pre-created LMDB file, I would like to use the MemoryData layer to load my data on the fly. This allows for real-time data augmentation and many other interesting things. As a simple experiment I am trying to feed the MNIST data to the MemoryData layer in Python. To do this I extract minibatches from the LMDB database in Python, which I then feed to the network through the MemoryData layer. For testing I still use the LMDB data layer, but for the training phase I load the data with the MemoryData layer.

First I import some libraries and set some environment variables:

caffe_root = '/home/diag/caffe-master/'  # this file is expected to be in {caffe_root}/examples
import sys
import os
import numpy as np
sys.path.insert(0, caffe_root + 'python')
os.chdir('/media/diag/Data/Python_scripts/')
import caffe
import lmdb
from pylab import *
miniBatchsize = 100
caffe.set_device(0)
caffe.set_mode_gpu()
solver = caffe.SGDSolver('mnist_python/lenet_auto_solver.prototxt')
Next I define the function that extracts minibatches from the LMDB database:

def getData(it):
    # the LMDB keys are zero-padded indices; wrap around at the end
    stats = env.stat()
    nrEntries = stats['entries']
    begin = it * miniBatchsize % nrEntries
    end = begin + miniBatchsize
    # read the first entry once to get the image dimensions
    ID = '{:08}'.format(0)
    raw_datum = txn.get(ID)
    datum = caffe.proto.caffe_pb2.Datum()
    datum.ParseFromString(raw_datum)
    channels, height, width = datum.channels, datum.height, datum.width
    imageData = np.zeros((end - begin, channels, height, width), dtype='float32')
    labels = np.zeros((end - begin, 1, 1, 1), dtype='float32')
    count = 0
    for i in range(begin, end):
        ID = '{:08}'.format(i)
        raw_datum = txn.get(ID)
        datum = caffe.proto.caffe_pb2.Datum()
        datum.ParseFromString(raw_datum)
        flat_x = np.fromstring(datum.data, dtype=np.uint8)
        x = flat_x.reshape(datum.channels, datum.height, datum.width)
        y = datum.label
        imageData[count, :, :, :] = x
        labels[count, 0, 0, 0] = y
        count += 1
    return imageData, labels
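The wrap-around indexing in getData can be checked in isolation (a minimal sketch, assuming the standard 60000-entry MNIST training set and a minibatch size of 100):

```python
# Standalone check of the begin/end arithmetic used in getData above.
miniBatchsize = 100   # matches the setting in the script
nrEntries = 60000     # MNIST training set size

def batch_window(it):
    begin = it * miniBatchsize % nrEntries
    end = begin + miniBatchsize
    return begin, end

# One full epoch is 600 iterations; iteration 600 wraps back to key 0.
assert batch_window(0) == (0, 100)
assert batch_window(599) == (59900, 60000)
assert batch_window(600) == (0, 100)
# Because 60000 is a multiple of 100, end never exceeds nrEntries,
# so txn.get is never asked for a key past the last entry.
assert all(batch_window(i)[1] <= nrEntries for i in range(2000))
```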
This part is probably not the issue: I have checked that the images are extracted correctly, and the images and labels end up in 4-dimensional arrays of shape [100,1,28,28] and [100,1,1,1] respectively.

Then for the actual training of the network I use the following code:

%%time
niter = 10000
test_interval = 500
train_loss = np.zeros(niter)
test_acc = np.zeros(int(np.ceil(niter / test_interval)))
output = np.zeros((niter, 8, 10))
# the main solver loop
env = lmdb.open('/media/diag/Data/Python_scripts/mnist_python/mnist_train_lmdb/', readonly=True)
with env.begin() as txn:
    for it in range(niter):
        (imageData, labels) = getData(it)
        solver.net.set_input_arrays(imageData, labels)
        solver.step(1)
        train_loss[it] = solver.net.blobs['loss'].data
        solver.test_nets[0].forward(start='conv1')
        output[it] = solver.test_nets[0].blobs['ip2'].data[:8]
        if it % test_interval == 0:
            print 'Iteration', it, 'testing...'
            correct = 0
            for test_it in range(100):
                solver.test_nets[0].forward()
                correct += sum(solver.test_nets[0].blobs['ip2'].data.argmax(1)
                               == solver.test_nets[0].blobs['label'].data)
            test_acc[it // test_interval] = correct / 1e4
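The accuracy bookkeeping in the test loop can be sketched with plain NumPy (the shapes and values below are made up for illustration, not taken from the real network):

```python
import numpy as np

# ip2 produces one score per class; argmax picks the predicted digit,
# and correct accumulates matches against the label blob.
batch = 4
scores = np.zeros((batch, 10), dtype='float32')
labels = np.array([3, 1, 4, 1], dtype='float32')
# make the predicted class equal the label for the first three samples only
for i, lab in enumerate([3, 1, 4, 7]):
    scores[i, lab] = 1.0

correct = np.sum(scores.argmax(1) == labels)
# 3 of 4 predictions match; over 100 test batches of 100 images the
# running total is divided by 1e4, exactly as in the loop above
assert correct == 3
```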
Here I open the LMDB file, extract a minibatch from it, and feed it through the network with:

solver.net.set_input_arrays(imageData, labels)
solver.step(1)
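As far as I understand, set_input_arrays is picky about array layout: it expects 4-D, float32, C-contiguous arrays. A standalone sketch of the coercion I use to rule that out (the random data here is a placeholder, not my actual minibatches):

```python
import numpy as np

# Placeholder arrays with the same shapes as the minibatches above.
imageData = np.random.rand(100, 1, 28, 28).astype('float32')
labels = np.random.randint(0, 10, (100, 1, 1, 1)).astype('float32')

# Slicing or transposing can produce non-contiguous views, so coerce
# explicitly before handing the arrays to the solver.
imageData = np.ascontiguousarray(imageData, dtype='float32')
labels = np.ascontiguousarray(labels, dtype='float32')

assert imageData.flags['C_CONTIGUOUS'] and imageData.dtype == np.float32
assert labels.flags['C_CONTIGUOUS'] and labels.shape == (100, 1, 1, 1)
```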
The network starts training without any problem, but the training loss does not decrease and the validation accuracy does not increase (the loss hovers around 2.30, roughly ln(10), and the accuracy stays at 0.1, i.e. chance level for 10 classes). MNIST should converge rapidly, so it seems something is wrong with my approach. I have looked through all the topics about the MemoryData layer in Python, but none of them explain how to get it to work. What am I doing wrong?