importing non-homogeneous data

Chris Kogelnik

Apr 24, 2015, 3:10:59 PM

to tor...@googlegroups.com

I'm having difficulty loading binary data from a file into a tensor, where the data is of different types. The final tensor is of a single float type.

For example, the byte stream is a double followed by an integer -- f0, f1, ..., f7, i0, i1, i2, i3.

I can't seem to find a way to coerce the 8 double bytes into a Lua number, or the 4 int bytes into a float.

Using bytes = torch.ByteStorage(file) as the input stream treats each byte as an independent element and doesn't allow coercion of slices.

Similarly, ffi.cast('float', ...) doesn't appear to work with bytes:cdata() or bytes:data() as the second argument.

Is there nothing in the language that will allow coercion?

Using the struct package and unpack() does provide a solution, but it requires the input as a string. I'm trying to avoid this, both for performance reasons and because strings live in Lua-managed memory, which is limited.

What's the best way to deal with this?

Thanks!

soumith

Apr 26, 2015, 11:56:30 PM

to torch7 on behalf of Chris Kogelnik

Hey Chris,

Have you looked at using the Torch File API?

With that, you can do something like:
f:readFloat(8)

followed by f:readInt(3)

for example.
https://github.com/torch/torch7/blob/master/doc/file.md
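For the double-then-int layout from the question, the File API approach might look roughly like the sketch below. The file name record.bin and the sample values are made up for illustration, and the snippet writes the file itself so it can be run end to end. It assumes the data is in the machine's native byte order and that a C int is 4 bytes. Note that the argument to readFloat(n) or readInt(n) is an element count, not a byte count, so the single 8-byte double is read with readDouble().

```lua
require 'torch'

-- Hypothetical sample file: one double followed by one int.
local path = 'record.bin'
local w = torch.DiskFile(path, 'w')
w:binary()                    -- DiskFile defaults to ASCII mode
w:writeDouble(3.5)
w:writeInt(7)
w:close()

-- Read the two fields back in the same order.
local f = torch.DiskFile(path, 'r')
f:binary()
local d = f:readDouble()      -- consumes 8 bytes, returns a Lua number
local i = f:readInt()         -- consumes sizeof(int) bytes, returns a Lua number
f:close()

-- Both values are plain Lua numbers, so they can fill a float tensor directly.
local t = torch.FloatTensor({d, i})
print(t)
```

Calling a read method with a count, e.g. f:readDouble(n), returns a Storage of n elements instead of a single number, which may be more convenient for long runs of the same type.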

If you really want to convert from a Byte pointer (unsigned char*) to a float*, you can do it with ffi.cast. bytes:data() will be the right second argument there, and float* will be the correct first argument. However, I recommend the less dangerous approach (using the File API).
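A rough sketch of the ffi.cast route, for comparison. The first argument has to be a pointer type such as 'double*'; casting to the value type 'float' is likely why the attempt in the question failed. The 12-byte storage here is an in-memory stand-in for file data and the values are invented; it assumes native byte order and a suitably aligned buffer.

```lua
require 'torch'
local ffi = require 'ffi'

-- Stand-in for bytes loaded from a file: 8 bytes of double, 4 bytes of int.
local bytes = torch.ByteStorage(12)
ffi.copy(bytes:data(), ffi.new('double[1]', 3.5), 8)
ffi.copy(bytes:data() + 8, ffi.new('int32_t[1]', 7), 4)

-- Reinterpret slices of the byte buffer through typed pointers.
local p = bytes:data()                      -- unsigned char*
local d = ffi.cast('double*', p)[0]         -- bytes 0..7 as a double
local i = ffi.cast('int32_t*', p + 8)[0]    -- bytes 8..11 as a 32-bit int

print(d, i)
```

Nothing checks bounds or alignment here, which is the "dangerous" part: an offset that runs past the end of the storage reads arbitrary memory.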


Chris Kogelnik

Apr 27, 2015, 11:55:56 AM

to tor...@googlegroups.com

Thanks Soumith,

I missed the File API. It looks to be just what I need.

I got around the problem by using:

bytes = ByteStorage()

bytes_data = torch.data(bytes)

then copying the bytes directly into Tensors of the appropriate types, and finally using Tensor conversion, such as tensor:float(), for coercion.
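One way this workaround could be spelled out is sketched below. It is a guess at the general shape, not the exact code used: the byte buffer is an in-memory stand-in with invented values, ffi.copy does the raw byte copies, and the names dbl, ints and out are hypothetical.

```lua
require 'torch'
local ffi = require 'ffi'

-- Stand-in for bytes loaded from a file: one double, then one 32-bit int.
local bytes = torch.ByteStorage(12)
ffi.copy(bytes:data(), ffi.new('double[1]', 3.5), 8)
ffi.copy(bytes:data() + 8, ffi.new('int32_t[1]', 7), 4)
local src = torch.data(bytes)               -- raw unsigned char* pointer

-- Copy each field's bytes into a tensor of the matching type.
local dbl = torch.DoubleTensor(1)
local ints = torch.IntTensor(1)
ffi.copy(dbl:data(), src, 8)
ffi.copy(ints:data(), src + 8, 4)

-- Tensor copy converts element types, giving a single float tensor.
local out = torch.FloatTensor(2)
out:narrow(1, 1, 1):copy(dbl)
out:narrow(1, 2, 1):copy(ints)
print(out)
```

Compared with casting pointers in place, copying into typed tensors sidesteps alignment concerns at the cost of an extra copy.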
