Is Arrow going to be a competitor for Alluxio?

Dima Fadeyev

Apr 6, 2016, 9:42:46 AM
to Alluxio Users
Hello, everyone,

I've been reading about the Arrow in-memory storage format, and it seems like Arrow is going to be a competitor for Alluxio for some use cases (when Alluxio is used as a temporary in-memory datastore or as a buffer for data exchange between several apps). Am I right, or do Alluxio's and Arrow's use cases not overlap?

Gene Pang

Apr 6, 2016, 10:36:16 AM
to Alluxio Users
Hi Dima,

Thanks for the question! I think Arrow is an interesting new project. As far as I can tell, it is an in-memory format for columnar data, not a distributed system. It could be used by computation frameworks like Impala, Spark, Drill, etc. Alluxio, by contrast, is an in-memory virtual distributed storage system. So I think there are great opportunities for Alluxio and Arrow to work together. For example, applications can currently use Alluxio as a storage system for in-memory file data. I think it would be very interesting to explore letting applications use Alluxio as a storage system for data in Arrow format. This combination could be very beneficial for users and applications.

Thanks,
Gene

Jais Sebastian

Feb 2, 2017, 12:02:20 PM
to Alluxio Users
Hi Gene,

Does Alluxio support storing data in Arrow format from Spark? Is there an example of this, and any performance comparison with the Parquet format?

Thanks,
Jais

Gene Pang

Feb 15, 2017, 9:11:05 AM
to Alluxio Users
Hi Jais,

I am not aware of any Arrow format support in Alluxio. However, if an application is able to write out files in Arrow format, it may already work with Alluxio.

Thanks,
Gene