importing non-homogeneous data


Chris Kogelnik

Apr 24, 2015, 3:10:59 PM
to tor...@googlegroups.com
I'm having difficulty loading binary data from a file into a tensor when the file mixes several data types. The final tensor should be of a single float type.

For example, the byte stream is a double followed by an integer: eight bytes f0, f1, ..., f7, then four bytes i0, i1, i2, i3.

I can't seem to find a method to coerce the 8 double bytes, or the 4 int bytes, into a Lua float.

Using bytes = torch.ByteStorage(file) as the input stream treats each byte as an independent element and doesn't allow a slice of bytes to be coerced into another type.

Similarly, ffi.cast('float', ...) doesn't appear to work with bytes:cdata() or bytes:data() as the second argument.

Is there nothing in the language that will allow coercion?


Using the struct package and unpack() does provide a solution, but it requires the input as a string. I'm trying to avoid that, both for performance reasons and because strings live in Lua-managed memory, which is limited.


What's the best way to deal with this?

Thanks!

soumith

Apr 26, 2015, 11:56:30 PM
to torch7 on behalf of Chris Kogelnik
Hey Chris,

Have you looked at using the Torch File API?
With that, you can do something like:
f:readFloat(8)
followed by f:readInt(3)
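
For the exact layout in the question (one double, then one int), a minimal sketch of the File API route might look like the following. It assumes Torch7 is installed and that the file uses the machine's native byte order; the scratch file and the values 3.5 and 42 are made up so the example can run on its own.

```lua
-- Sketch only: assumes Torch7 and native byte order.
require 'torch'

local path = os.tmpname()  -- hypothetical scratch file for the demo

-- Write one record: an 8-byte double followed by a 4-byte int.
local out = torch.DiskFile(path, 'w'):binary()
out:writeDouble(3.5)
out:writeInt(42)
out:close()

-- Read it back with the typed readers; no manual byte coercion is needed.
local f = torch.DiskFile(path, 'r'):binary()
local d = f:readDouble()  -- consumes 8 bytes, returns a Lua number
local i = f:readInt()     -- consumes 4 bytes, returns a Lua number
f:close()
os.remove(path)

-- Store both fields in a single-precision tensor.
local record = torch.FloatTensor({d, i})
print(record)
```

Called with a count, as in f:readFloat(8), the readers return a storage holding that many values instead of a single number.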


If you really want to convert from a Byte pointer (unsigned char*) to a float*, you can do it with ffi.cast. bytes:data() will be the right second argument there, and float* will be the correct first argument. However, I recommend the less dangerous approach (using the File API).
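
A rough sketch of the pointer-cast route, for comparison. It uses only the LuaJIT ffi library, with a hand-built 12-byte buffer standing in for bytes:data(); the buffer contents are made up. Note that a cast reinterprets the bytes rather than converting them, so the 8 double bytes have to be read through a double* and only then narrowed to float. It also assumes native byte order and a platform that tolerates the pointer alignment involved.

```lua
-- Sketch only: assumes LuaJIT (the runtime Torch7 uses) and native byte order.
local ffi = require 'ffi'

-- Stand-in for bytes:data(): 12 raw bytes holding a double, then an int32.
local buf = ffi.new('uint8_t[12]')
ffi.copy(buf, ffi.new('double[1]', 3.5), 8)
ffi.copy(buf + 8, ffi.new('int32_t[1]', 42), 4)

-- Reinterpret the same bytes through typed pointers.
local d = ffi.cast('double*', buf)[0]       -- bytes 0..7 read as a double
local i = ffi.cast('int32_t*', buf + 8)[0]  -- bytes 8..11 read as an int32

-- Narrow both values to single precision.
local as_float = ffi.new('float[2]', {d, i})
print(as_float[0], as_float[1])
```

Nothing checks that the offsets or types match the data, which is why the File API is the safer choice.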




Chris Kogelnik

Apr 27, 2015, 11:55:56 AM
to tor...@googlegroups.com
Thanks Soumith,

I missed the File API.  It looks to be just what I need.

I got around the problem by using:
bytes = ByteStorage()
bytes_data = torch.data(bytes)
then copying the bytes directly into Tensors of the matching types, and using Tensor conversion, such as Tensor.float(), for coercion.
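
One way that workaround could look, as a sketch rather than the exact code used: it assumes Torch7 with LuaJIT, uses ffi.copy for the byte copy (an assumption, the post does not say how the copy was done), and fills a 12-byte storage with made-up values in place of real file contents.

```lua
-- Sketch only: assumes Torch7 on LuaJIT and native byte order.
require 'torch'
local ffi = require 'ffi'

-- Stand-in for the file contents: a double (3.5), then an int (42).
local bytes = torch.ByteStorage(12)
local bytes_data = torch.data(bytes)  -- raw unsigned char* into the storage
ffi.copy(bytes_data, torch.data(torch.DoubleTensor({3.5})), 8)
ffi.copy(bytes_data + 8, torch.data(torch.IntTensor({42})), 4)

-- Copy each field's raw bytes into a tensor of the matching type...
local dt = torch.DoubleTensor(1)
local it = torch.IntTensor(1)
ffi.copy(torch.data(dt), bytes_data, 8)
ffi.copy(torch.data(it), bytes_data + 8, 4)

-- ...then let Tensor conversion do the numeric coercion to float.
local result = torch.cat(dt:float(), it:float())
print(result)
```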
