Converting Java byte[] array to Python's bytes or bytearray


Krzysztof Leśniewski

May 4, 2018, 3:13:41 AM
to Jep Project
I want to pass a Java byte array (byte[]) to Python (3.6.3) and have it represented as bytes or bytearray, both for interoperability and so that it looks familiar to Python developers. I have a working example, but it feels awkward, so I wanted to ask whether I am missing something. I would expect the following behavior:

byte[] input = new byte[] { 0x01, 0x7f, (byte) 0x80, (byte) 0xff };
jep.set("input", input);
jep.eval("print(type(input))"); // Should print: <class 'bytes'>

However, the input is of type <class 'jep.PyJArray'>. This is not too bad; I could convert it to bytes as follows:

byte[] input = new byte[] { 0x01, 0x7f, (byte) 0x80, (byte) 0xff };
jep.set("input_pyjarray", input);
jep.eval("input = bytes(input_pyjarray)");


However, I get:

jep.JepException: <class 'ValueError'>: bytes must be in range(0, 256)

This is due to a difference in how the two languages interpret bytes. Java's byte is a signed type, interpreted as a two's complement number, so its value is in the range [-128, 127]. Python, however, expects each value to be in the range [0, 255]. In binary form there should be no difference (although I don't know how Python represents bytes). I would expect PyJArray to store the byte array in a form that is easy to work with in Python. However, to make it compatible, I have to convert the values. In the end, I end up with the following code:

byte[] input = new byte[] { 0x01, 0x7f, (byte) 0x80, (byte) 0xff };
jep.set("input_pyjarray", input);
jep.eval("input = bytes(b % 256 for b in input_pyjarray)");

The example above works, but perhaps I am missing something and none of this is necessary, especially since the following code works just fine without any such conversion:

byte[] pythonInput = jep.getValue_bytearray("input");

Is there a better way to pass a byte array from Java to Python?
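
The signed/unsigned mismatch described in the question can be reproduced in plain Python, without Jep. The following is only a sketch under stated assumptions: the list java_signed is a hypothetical stand-in for the values a Java byte[] would expose element by element, and to_python_bytes is an illustrative helper name, not part of Jep. It suggests that the "b % 256" conversion changes only how each value is interpreted, not the underlying bit pattern.

```python
from array import array

# Hypothetical stand-in for the signed values Java reports for
# { 0x01, 0x7f, (byte) 0x80, (byte) 0xff }.
java_signed = [1, 127, -128, -1]


def to_python_bytes(signed_values):
    """Map signed 8-bit values (-128..127) to a bytes object (0..255)."""
    # Masking with 0xFF keeps the low 8 bits; for this range it gives
    # the same result as v % 256.
    return bytes(v & 0xFF for v in signed_values)


converted = to_python_bytes(java_signed)

# Packing the same signed values into a raw signed-char buffer and
# reading it back as bytes yields the identical byte sequence.
raw = array("b", java_signed).tobytes()

print(converted)
print(raw == converted)
```

Calling bytes(java_signed) directly raises ValueError, because bytes only accepts integers in range(0, 256).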

Erik Johansson

May 4, 2018, 12:22:14 PM
to jep-p...@googlegroups.com

Hi Krzysztof,

I don't know anything about your application, but I find it much easier to use Numpy arrays in Python than Python arrays (lists). It is much easier to access raw data and convert or format it however you like in Numpy. Also, Jep has good support for passing Java native arrays into Numpy arrays and back to Java as well.

Hope this helps,

Erik

--
You received this message because you are subscribed to the Google Groups "Jep Project" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jep-project...@googlegroups.com.
To post to this group, send email to jep-p...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jep-project/4cfb7124-2944-4cb9-8e53-634d5315e942%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

-- 
************************************************
Erik Johansson
Project Manager, Wavefront Correction System
National Solar Observatory
Daniel K Inouye Solar Telescope
3665 Discovery Drive, Boulder, CO 80303
Tel: 303-735-7723
************************************************

Krzysztof Leśniewski

May 7, 2018, 8:39:38 AM
to Jep Project
Hi Erik,

Thank you for the advice. We do use Numpy, but not for this purpose. The use case here is to pass a serialized data object, which Python then deserializes. The serialized message is just a byte array (blob) whose "life" ends once it is deserialized; it is not used for data analysis with Numpy. However, it should be possible to pass an NDArray and deserialize the message from it. I don't know how the two solutions are represented internally, and so which integration is more suitable here. Would you still recommend going for NDArray in this case, or do you think the proposed code offers more advantages?

Krzysztof

Nathan Jensen

May 7, 2018, 11:34:27 AM
to Jep Project
Python 3 has a bytes object.  https://docs.python.org/3/library/stdtypes.html#binary-sequence-types-bytes-bytearray-memoryview

Jep.getValue_bytearray(String str) was left over from before we added numpy support. It remains for backwards compatibility. I recommend you use NDArray; you would get an ndarray of type int8 or uint8. The API for NDArray is unlikely to change in the future, but we may update Jep in a future release to better support your use case. But if you have something that works, there is nothing wrong with that.
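
On the Python side, an int8 ndarray can be turned into a bytes object without per-element conversion. The sketch below assumes that NumPy is installed and that the data arrives as a one-dimensional int8 array. The array is built directly with NumPy as a stand-in, since how Jep delivers it is not shown in this thread.

```python
import numpy as np

# Stand-in for the int8 ndarray Python would receive; the values match
# the Java array { 0x01, 0x7f, (byte) 0x80, (byte) 0xff }.
signed = np.array([1, 127, -128, -1], dtype=np.int8)

# tobytes() copies the raw buffer, so the two's complement bit pattern
# is preserved and no range check is applied.
payload = signed.tobytes()

# view() reinterprets the same memory as unsigned 8-bit values.
unsigned = signed.view(np.uint8)

print(payload)
print(unsigned.tolist())
```

The resulting payload can then be handed to whatever deserializer expects a bytes object.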

