Apache Arrow support?


Thomas Browne

Feb 21, 2021, 2:17:26 PM
to Numerical Elixir (Nx)

I'm very excited about Elixir's new direction and intend to use the new Nx stack for all my scientific computing, because XLA is so powerful. I do have a question, however. There are many powerful libraries in other languages, and one common "glue" that seems to be emerging between them is the Apache Arrow in-memory columnar data format.

Arrow, as you probably know, was co-created by Wes McKinney, who wrote Python's pandas, and is something of a ground-up rethink of that work with many attractive features. Its biggest selling point is "zero copy" serialization and even shared memory buffers across languages.

It seems to me this would be an ideal interop library for Elixir to support, as it would give access to Java, Python, Fortran, and other ecosystems through efficient sharing of big buffer pointers between all the languages. Is there a plan to support Apache Arrow?

I'd offer to help, and maybe I will, but I don't yet feel comfortable enough with dirty NIFs and wrapping C++.

José Valim

Feb 21, 2021, 4:57:20 PM
to elix...@googlegroups.com
Hi Thomas,

We definitely welcome folks exploring Apache Arrow within Elixir. I don't think it would allow us to share pointers between languages but it could be used as a serialization/deserialization format. I would recommend exploring encoding/decoding the Apache Arrow format using Elixir itself. The tooling and features it inherited from Erlang for this purpose are pretty great!


Thomas Browne

Feb 21, 2021, 7:37:50 PM
to elix...@googlegroups.com

Okay great!

Could you point me to some initial resources in the Erlang ecosystem that you think are good to get me started exploring? Anything in Erlang that allows me to access memory blocks nice 'n fast. I assume you mean this https://erlang.org/doc/man/binary.html but if there's anything else, please let me know.

If and when I get somewhere I'll put up a repo and share.

José Valim

Feb 22, 2021, 12:49:02 AM
to elix...@googlegroups.com
Yes, the binary blobs will be binaries in Elixir, which you can match and decode using the bit syntax. The result should be the same whether you write it in Erlang or Elixir.
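To give a feel for what matching and decoding a binary looks like, here is a minimal sketch. It is not an Arrow decoder: it only assumes a buffer of consecutive little-endian signed 32-bit integers (Arrow's columnar format is little-endian by default) and a hypothetical length-prefixed frame. The module name `BufferSketch` and all function names are made up for illustration.

```elixir
defmodule BufferSketch do
  # Hypothetical module, for illustration only; not part of Nx or any Arrow library.

  # Decode a binary of consecutive little-endian signed 32-bit integers.
  # The binary comprehension stops silently at any trailing bytes that do
  # not form a full 4-byte value.
  def decode_int32s(binary) when is_binary(binary) do
    for <<value::little-signed-integer-size(32) <- binary>>, do: value
  end

  # Encode a list of integers back into the same layout.
  def encode_int32s(values) when is_list(values) do
    for value <- values, into: <<>> do
      <<value::little-signed-integer-size(32)>>
    end
  end

  # Split an assumed frame layout: a 4-byte little-endian length,
  # then that many payload bytes, then whatever remains.
  def split_frame(
        <<len::little-unsigned-integer-size(32), payload::binary-size(len), rest::binary>>
      ) do
    {:ok, payload, rest}
  end

  # Anything too short or malformed falls through to this clause.
  def split_frame(_other), do: :error
end
```

For example, `BufferSketch.decode_int32s(<<1, 0, 0, 0, 255, 255, 255, 255>>)` returns `[1, -1]`. The matched `payload` and `rest` are sub-binaries that reference the original data, which is part of why decoding in pure Elixir can be reasonably cheap.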

Here are more examples and docs:



