Compare performance between in-memory file system (tmpfs) and Alluxio


Tigi

Aug 18, 2016, 9:53:16 AM
to Alluxio Users
Hello,

I run Alluxio in local mode on one machine, and Spark in local mode on one machine as well. I compared file-read performance between Spark with an in-memory file system (tmpfs) and Spark with Alluxio. Reading from tmpfs (8 s for 10,000 files) is faster than reading from Alluxio's memory (35 s for 10,000 files). Is this normal?
Since Alluxio is a distributed in-memory file system, shouldn't its performance be better than tmpfs?

I use only the "Filesystem Client API" to read files from Alluxio with Spark, and the files are mounted and loaded into Alluxio from disk.

Tigi

Aug 18, 2016, 10:22:47 AM
to Alluxio Users
 
I run Alluxio 1.2.0 on 64-bit CentOS 7 with 16 GB of RAM, Spark 1.6.1, and Java 1.8.0_77.

Calvin Jia

Aug 18, 2016, 5:51:24 PM
to Alluxio Users
Hi,

What were the sizes of your files? Alluxio may have more overhead (especially with smaller files) than reading directly from tmpfs, due to the additional communication required. However, as you pointed out, that additional communication is what enables Alluxio to be distributed, which means you can generalize workloads to take advantage of multiple machines. Tmpfs could be used for a micro-benchmark but cannot provide the same functionality as Alluxio.

Hope this helps,
Calvin

Tigi

Aug 19, 2016, 3:46:10 AM
to Alluxio Users
Hi Calvin,

The files are small (about 60 KB). Would reducing the block size reduce the overhead? I have already reduced the block size (to 100 KB). In addition, I am not taking advantage of multiple machines; maybe that is the problem. I just found the difference between Alluxio and tmpfs to be huge...

Thanks for your information,
Thibault


Calvin Jia

Aug 19, 2016, 7:44:22 PM
to Alluxio Users
Hi,

Given the size of the files, I think you are encountering something similar to the small files problem, where the overhead of the system dominates the run time. Perhaps you can try comparing with one large file or many large (> 1 block) files?
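To put rough numbers on the small files effect, here is a back-of-the-envelope sketch in Java using the timings reported earlier in this thread (8 s vs 35 s for 10,000 files). It assumes every file is about 60 KB, which is the size mentioned above but was not confirmed for that exact run; the class and method names are only illustrative.

```java
// Rough per-file cost derived from the timings reported in this thread.
// Assumption: all 10,000 files are about 60 KB each.
class SmallFileOverhead {

    /** Average wall-clock time spent per file, in milliseconds. */
    static double perFileMillis(double totalSeconds, int fileCount) {
        return totalSeconds * 1000.0 / fileCount;
    }

    /** Effective read throughput in MB/s (1 MB = 1000 KB here). */
    static double throughputMBps(double totalSeconds, int fileCount, double fileSizeKB) {
        return fileCount * fileSizeKB / 1000.0 / totalSeconds;
    }

    public static void main(String[] args) {
        int files = 10_000;
        double fileSizeKB = 60.0;      // assumed file size
        double tmpfsSeconds = 8.0;     // reported for tmpfs
        double alluxioSeconds = 35.0;  // reported for Alluxio

        double tmpfsMs = perFileMillis(tmpfsSeconds, files);
        double alluxioMs = perFileMillis(alluxioSeconds, files);

        System.out.printf("tmpfs:   %.2f ms/file, %.1f MB/s%n",
                tmpfsMs, throughputMBps(tmpfsSeconds, files, fileSizeKB));
        System.out.printf("Alluxio: %.2f ms/file, %.1f MB/s%n",
                alluxioMs, throughputMBps(alluxioSeconds, files, fileSizeKB));
        System.out.printf("extra cost per file with Alluxio: %.2f ms%n",
                alluxioMs - tmpfsMs);
    }
}
```

Under these assumptions the gap works out to a few milliseconds of fixed cost per file. That cost would matter much less with fewer, larger files, which is why comparing with large files is a more informative test.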

Hope this helps,
Calvin

Tigi

Aug 20, 2016, 11:41:50 AM
to Alluxio Users

Hi,

It's a good idea. I will test it as soon as I have understood and solved a weird error: an Alluxio "Frame size () larger than max length ()" error that appears when I read more small files (> 35,000). It is not an alluxio.security.authentication.type problem, because I run Alluxio locally and the Alluxio master address is correct. I have also modified alluxio.network.thrift.frame.size.bytes.max, but with no result.

Thank you again,
Thibault


Calvin Jia

Aug 22, 2016, 3:39:55 AM
to Alluxio Users
Hi,

How are you currently running your workload? In some cases, if you specify an incorrect master port, the client will cache it, and future accesses through the same client will hit this error even after you correct the port.

Hope this helps,
Calvin

Tigi

Aug 22, 2016, 4:30:10 AM
to Alluxio Users
Hi,

I use the commands ./bin/alluxio fs mount -readonly alluxio://19998/partition1 /data/partition1 and ./bin/alluxio fs load /partition110/* to import data into Alluxio memory. To read the data, I use the spark-submit command bin/spark-submit --driver-class-path /usr/local/spark-1.6.1-bin-hadoop2.6/lib/alluxio/* --class "Main" /home/javaTest/target/spark-target.jar...
The master port is 19998.

My error:
16/08/22 10:16:05 ERROR logger.type: Frame size (17277505) larger than max length (16777216)!
org.apache.thrift.transport.TTransportException: Frame size (17277505) larger than max length (16777216)!
        at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:137)
        at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
        at org.apache.thrift.protocol.TProtocolDecorator.readMessageBegin(TProtocolDecorator.java:135)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
        at alluxio.thrift.FileSystemMasterClientService$Client.recv_listStatus(FileSystemMasterClientService.java:503)
        at alluxio.thrift.FileSystemMasterClientService$Client.listStatus(FileSystemMasterClientService.java:489)
        at alluxio.client.file.FileSystemMasterClient$8.call(FileSystemMasterClient.java:220)
        at alluxio.client.file.FileSystemMasterClient$8.call(FileSystemMasterClient.java:216)
        at alluxio.AbstractClient.retryRPC(AbstractClient.java:324)
        at alluxio.client.file.FileSystemMasterClient.listStatus(FileSystemMasterClient.java:216)
        at alluxio.client.file.BaseFileSystem.listStatus(BaseFileSystem.java:195)
        at alluxio.client.file.BaseFileSystem.listStatus(BaseFileSystem.java:186)
        at Main.main(Main.java:119) (List<URIStatus> status = fs.listStatus(path);)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Exception in thread "main" java.io.IOException: Failed after 32 retries.
        at alluxio.AbstractClient.retryRPC(AbstractClient.java:334)
        at alluxio.client.file.FileSystemMasterClient.listStatus(FileSystemMasterClient.java:216)
        at alluxio.client.file.BaseFileSystem.listStatus(BaseFileSystem.java:195)
        at alluxio.client.file.BaseFileSystem.listStatus(BaseFileSystem.java:186)
        at Main.main(Main.java:119)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


Thibault

Bin Fan

Aug 22, 2016, 2:06:31 PM
to Alluxio Users

Tigi

Aug 23, 2016, 4:28:51 AM
to Alluxio Users
Hi Bin Fan,

My error is not related to the FAQ's answer. I think this error is caused by the small files, or by the number of files I process. The error appears when I process about 40,000 files (1 file = 60 KB). But when I process the same quantity of data with bigger files (5,000 files; 1 file = 500 KB), the error disappears...
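As a rough check on that hypothesis, the sketch below estimates how many file entries could fit in one listing response, using the two byte counts from the error log (frame size 17277505, max length 16777216). It assumes the failing listing covered about 40,000 files and that the response is dominated by per-file entries of similar size; neither is confirmed in the thread, and the class and method names are hypothetical.

```java
// Estimate of the listing size limit implied by the logged frame sizes.
// Assumptions: ~40,000 files in the failing listing, equal-sized entries.
class ListStatusFrameEstimate {

    static final long MAX_FRAME_BYTES = 16_777_216L;      // "max length" in the log (16 MiB)
    static final long OBSERVED_FRAME_BYTES = 17_277_505L; // "Frame size" in the log

    /** Average number of response bytes used per listed file. */
    static double bytesPerEntry(long frameBytes, int entries) {
        return (double) frameBytes / entries;
    }

    /** Largest number of entries whose response still fits in the frame limit. */
    static long maxEntriesPerListing(long maxFrameBytes, double bytesPerEntry) {
        return (long) Math.floor(maxFrameBytes / bytesPerEntry);
    }

    public static void main(String[] args) {
        int assumedFiles = 40_000; // assumed size of the failing listing

        double perEntry = bytesPerEntry(OBSERVED_FRAME_BYTES, assumedFiles);
        long maxEntries = maxEntriesPerListing(MAX_FRAME_BYTES, perEntry);

        System.out.printf("approx. bytes per entry: %.0f%n", perEntry);
        System.out.println("approx. max entries per listing: " + maxEntries);
        System.out.printf("approx. response size for 5,000 files: %.1f MB%n",
                5_000 * perEntry / 1_000_000.0);
    }
}
```

With these assumptions, each entry costs a little over 400 bytes and the limit is reached somewhere just under 40,000 files, while a listing of 5,000 files stays far below it. This fits the observation that the error depends on the number of files rather than on the total amount of data.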

Thibault

Tigi

Aug 24, 2016, 9:46:01 AM
to Alluxio Users
So, in case someone has the same error: in my case, the error is caused by List<URIStatus> status = fs.listStatus(path). I think this list is probably limited by size or by quantity of data... To work around the problem, I created several partitions that I read with Spark in a distributed way. I chose this solution because changing the frame size (alluxio.network.thrift.frame.size.bytes.max) did not work.
Alluxio's performance is better than tmpfs when the files are big and not too numerous...
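For anyone wanting to try the same workaround, here is a minimal Java sketch of the idea of splitting one large set of files into several partitions so that each listing stays small. It is not the code used in this thread: the file names, the partition size of 10,000, and the /partitionN naming are assumptions for illustration, and it only computes the grouping without touching Alluxio.

```java
import java.util.ArrayList;
import java.util.List;

// Splits a large list of file names into bounded groups, one per partition.
class ListingPartitioner {

    /** Splits items into consecutive groups of at most maxPerPartition entries. */
    static <T> List<List<T>> partition(List<T> items, int maxPerPartition) {
        if (maxPerPartition <= 0) {
            throw new IllegalArgumentException("maxPerPartition must be positive");
        }
        List<List<T>> parts = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxPerPartition) {
            int end = Math.min(start + maxPerPartition, items.size());
            parts.add(new ArrayList<>(items.subList(start, end)));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Hypothetical file names standing in for the real data set.
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 40_000; i++) {
            names.add("file-" + i + ".dat");
        }

        // 10,000 per partition is an arbitrary choice, well below the count
        // at which the error was reported.
        List<List<String>> parts = partition(names, 10_000);
        for (int p = 0; p < parts.size(); p++) {
            System.out.println("/partition" + (p + 1) + " -> "
                    + parts.get(p).size() + " files");
        }
    }
}
```

Each group can then be placed in its own directory and listed separately, so no single listing has to return every file at once.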

Thank you to those who tried to help me

Thibault

Calvin Jia

Aug 24, 2016, 2:02:05 PM
to Alluxio Users
Hi,

Thanks for posting your solution. I think that makes sense, since what may be happening is that the listStatus response is too large. Would you mind opening a JIRA ticket describing this bug?

Thanks again,
Calvin