Hello!
What is the proper way to retrieve image data from an LMDB dataset? The function caffe.io.datum_to_array() fails because the size of datum.data doesn't equal datum.channels * datum.height * datum.width: the images are stored in encoded form (PNG or JPG) rather than as raw pixels.
Here is a code snippet for an MNIST dataset:
import os
import lmdb
import caffe

# Expand '~' explicitly so the path resolves to the home directory
path = os.path.expanduser('~/digits/digits/jobs/20151102-000300-1f84')
db_train = 'train_db'
db_val = 'val_db'

lmdb_env = lmdb.open(os.path.join(path, db_train), readonly=True)
lmdb_txn = lmdb_env.begin()
lmdb_cursor = lmdb_txn.cursor()
datum = caffe.proto.caffe_pb2.Datum()

for key, value in lmdb_cursor:
    datum.ParseFromString(value)
    label = datum.label
    data = caffe.io.datum_to_array(datum)
Here's the error I get:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-19b8a378d124> in <module>()
     16     datum.ParseFromString(value)
     17     label = datum.label
---> 18     data = caffe.io.datum_to_array(datum)

/share/tools-install/caffe/python/caffe/io.pyc in datum_to_array(datum)
     84     if len(datum.data):
     85         return np.fromstring(datum.data, dtype=np.uint8).reshape(
---> 86             datum.channels, datum.height, datum.width)
     87     else:
     88         return np.array(datum.float_data).astype(float).reshape(

ValueError: total size of new array must be unchanged
I can display the image properly if I decode datum.data with PIL:
import io
from PIL import Image
Image.open(io.BytesIO(datum.data))
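
To make the mismatch concrete, here is a minimal sketch of how the two cases could be told apart and decoded. The helper names sniff_encoding and decode_datum_bytes are hypothetical (they are not part of Caffe or DIGITS), and the magic-byte check is only a heuristic that assumes the encoded images are PNG or JPEG. Note that the encoded branch returns PIL's layout, (H, W) for grayscale or (H, W, C) for colour, not the (C, H, W) layout that datum_to_array produces.

```python
import io

# File signatures for the two encodings mentioned above
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"


def sniff_encoding(data, channels, height, width):
    """Guess how the bytes of a Datum's data field are stored.

    Hypothetical helper: returns 'png' or 'jpeg' when the bytes start with
    that format's signature, 'raw' when the byte count equals
    channels * height * width (what datum_to_array expects), and
    'unknown' otherwise.
    """
    if data.startswith(PNG_MAGIC):
        return "png"
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    if len(data) == channels * height * width:
        return "raw"
    return "unknown"


def decode_datum_bytes(data, channels, height, width):
    """Return the image as a numpy array (hypothetical helper).

    Raw bytes are reshaped to (channels, height, width); encoded bytes are
    decoded with PIL and come back in PIL's own layout.
    """
    # Imported here so sniff_encoding stays usable without numpy or PIL
    import numpy as np
    from PIL import Image

    kind = sniff_encoding(data, channels, height, width)
    if kind == "raw":
        return np.frombuffer(data, dtype=np.uint8).reshape(
            channels, height, width)
    if kind in ("png", "jpeg"):
        return np.array(Image.open(io.BytesIO(data)))
    raise ValueError("unrecognised datum payload of %d bytes" % len(data))
```

Inside the loop above, this would be called as decode_datum_bytes(datum.data, datum.channels, datum.height, datum.width) in place of caffe.io.datum_to_array(datum).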

Side question: where in the DIGITS source code could I have found the answer to this?
Cheers,
marco