Read data from lmdb incorrectly


Lillian Liu

May 7, 2016, 3:31:07 AM
to Caffe Users

Hello, I am working with non-image (floating-point) data, which I saved into an lmdb. Writing seems to work, since I can read the correct values back from the lmdb in Python (I guess because I can specify the correct type myself with "np.fromstring(datum.data, dtype=np.float64)").

However, the values in the data blob are messed up. Is this a bug in Caffe, or is it just a limitation that Caffe can only interpret datum.data saved in lmdb as uint8?

Btw, it seems to me that Caffe has no way of knowing which data type to use when interpreting datum.data, so I assume uint8 is the default type? But this is not stated in any of the tutorials or examples.
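
A minimal sketch of why the blob values could look scrambled, assuming (as suspected above) that the bytes in datum.data are consumed one uint8 at a time. It uses only the standard library's struct module in place of NumPy. The features list and both helper names are made up for illustration, and little-endian byte order is an assumption.

```python
import struct


def as_float64_values(raw_bytes):
    """Interpret every 8 bytes as one little-endian float64."""
    count = len(raw_bytes) // 8
    return list(struct.unpack("<%dd" % count, raw_bytes))


def as_uint8_values(raw_bytes):
    """Interpret every single byte as one unsigned 8-bit value."""
    return list(bytearray(raw_bytes))


# Hypothetical feature vector, packed as raw float64 bytes.
features = [0.5, 1.0, 2.0]
raw = struct.pack("<%dd" % len(features), *features)

# Reading with the matching type recovers the original values.
print(as_float64_values(raw))

# Reading byte by byte yields 8 unrelated small integers per value.
print(len(as_uint8_values(raw)))
print(as_uint8_values(raw)[:8])
```

Nothing in the byte string records which of the two readings is intended, so the reader has to pick one.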


Sorry in advance for my ignorance; I am a complete novice with Caffe.


Lillian

Mohamed Ezz

May 7, 2016, 5:04:09 AM
to Caffe Users
When you create your lmdb database, use datum.float_data to store your data, instead of datum.data. You don't need to store anything in datum.data in this case.

Lillian Liu

May 7, 2016, 1:33:47 PM
to Caffe Users
Thanks a lot for your reply!!!  

I did everything in Python with pycaffe. When saving my data into the lmdb, I did:

datum = caffe.proto.caffe_pb2.Datum()
datum.float_data = features.astype(np.float)

But I got the following error:
  File "/opt/anaconda2/lib/python2.7/site-packages/google/protobuf/internal/python_message.py", line 440, in setter
    '"%s" in protocol message object.' % proto_field_name)
AttributeError: Assignment not allowed to repeated field "float_data" in protocol message object.

Could you please let me know the right way to set datum.float_data?

Btw, does that mean that datum.data can only be interpreted as uint8 in Caffe?

Thanks again!!!

Jan

May 9, 2016, 4:59:47 AM
to Caffe Users
Oh, I completely forgot about float_data, yes that should work.

float_data is a repeated field in the protobuf message, which means you cannot assign to it directly. You'd have to do something like:

datum.float_data.extend(features.astype(np.float32).flatten())
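
To spell that out a little, here is a rough sketch of filling a Datum this way. The helper name fill_float_datum and its parameters are hypothetical, not part of pycaffe. It only relies on the Datum fields channels, height, width, label and float_data, and it leaves datum.data empty as suggested earlier in the thread.

```python
def fill_float_datum(datum, values, channels, height=1, width=1, label=0):
    """Store flat float values in datum.float_data and record the blob shape.

    `datum` is expected to behave like caffe.proto.caffe_pb2.Datum().
    """
    values = [float(v) for v in values]
    if len(values) != channels * height * width:
        raise ValueError("number of values does not match the given shape")
    datum.channels = channels
    datum.height = height
    datum.width = width
    datum.label = label
    # A repeated protobuf field cannot be assigned to; extend() appends in place.
    datum.float_data.extend(values)
    return datum


# Intended use with pycaffe (not executed here, since it needs Caffe installed):
#   datum = caffe.proto.caffe_pb2.Datum()
#   flat = features.astype(np.float32).flatten()
#   fill_float_datum(datum, flat, channels=flat.size)
#   serialized = datum.SerializeToString()  # the value to store in the lmdb
```

Treating the whole feature vector as channels with height and width of 1 is just one possible layout for non-image data; other shapes work as long as the product matches the number of values.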

Jan

Mohamed Ezz

May 9, 2016, 5:10:30 AM
to Caffe Users
Also take a look here for many Caffe IO convenience functions: https://github.com/BVLC/caffe/blob/master/python/caffe/io.py
It should give you a nice overview of how to use the pycaffe interface for IO.

Lillian Liu

May 9, 2016, 10:35:50 AM
to Caffe Users
Thank you very much!!

p.Paul

May 31, 2017, 6:00:14 AM
to Caffe Users
Hello, thank you for your support! Where can I find the C++ equivalent of that file? I need to know how datum.data and datum.float_data are stored, given a Datum* datum.
The problem is that I have an lmdb that stores both float and uint8 values. I am using the C++ interface and cannot find an efficient way of parsing the data, since datum.data comes back empty:
const string& data = datum.data();
data.size() is 0.
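
One possible reading of the empty string, going by the earlier replies: entries written through float_data leave datum.data empty, so a reader has to check which field is populated. Below is a small sketch of that check, written in Python to match the rest of the thread; the helper name datum_values is hypothetical. In C++ the protobuf-generated accessors for a repeated field would normally be datum.float_data_size() and datum.float_data(i), though that is an assumption about the generated code rather than something confirmed here.

```python
def datum_values(datum):
    """Return the payload of a Datum-like object as a flat list of numbers."""
    if len(datum.data) > 0:
        # Byte payload: one uint8 value per byte.
        return list(bytearray(datum.data))
    # Otherwise fall back to the repeated float field.
    return list(datum.float_data)
```

The same function then handles both kinds of entries in a mixed lmdb, one Datum at a time.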