Spark SQL on Alluxio vs. on HDFS

Cam Mach

Dec 13, 2017, 5:30:05 PM
to Alluxio Developers
Hello everyone, I am running a stress test on Alluxio using Spark. Spark on Alluxio is supposed to perform better than Spark on HDFS, right? But it turned out to be worse. Are there any Alluxio configuration settings that I should turn on or adjust to make it perform better? Note that all of my data is fully loaded into memory.
Here is the specific scenario: generate 2 GB of arbitrary data on HDFS and on Alluxio. Then run Spark SQL to read the data from HDFS, filter it, and count the rows. Do the same for Alluxio, and measure the time taken for both.
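For reference, the measurement can be sketched roughly as follows in PySpark. This is only an illustration of the read, filter, count, and timing steps, not the exact job: the paths, host names, the plain-text data format, the filter string, and the helper names `time_runs`, `filter_and_count`, and `compare` are all assumptions made for the example.

```python
import time
from statistics import median


def time_runs(action, repeats=3):
    """Call action() `repeats` times and return the wall-clock seconds of each call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        action()
        timings.append(time.perf_counter() - start)
    return timings


def filter_and_count(spark, path, needle):
    """Read text lines from `path`, keep lines containing `needle`, and count them."""
    # Imported here so the timing helper above works without PySpark installed.
    from pyspark.sql.functions import col

    df = spark.read.text(path)  # assumption: data is plain text, one record per line
    return df.filter(col("value").contains(needle)).count()


def compare(spark, paths, needle="a", repeats=3):
    """Return the median filter-and-count time in seconds for each labelled path."""
    results = {}
    for label, path in paths.items():
        timings = time_runs(lambda: filter_and_count(spark, path, needle), repeats)
        results[label] = median(timings)
    return results


if __name__ == "__main__":
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("hdfs-vs-alluxio").getOrCreate()
    # Hypothetical locations; replace with the real NameNode and Alluxio master.
    paths = {
        "hdfs": "hdfs://namenode:8020/bench/data",
        "alluxio": "alluxio://alluxio-master:19998/bench/data",
    }
    for label, seconds in compare(spark, paths).items():
        print(f"{label}: median {seconds:.2f} s")
    spark.stop()
```

Repeating each read and reporting the median is a choice made here so that a single cold first run does not dominate the comparison; reading `alluxio://` paths also assumes the Alluxio client jar is on the Spark classpath.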
I appreciate your help.