Alluxio doesn't write files into the lower tier if the higher tier is full


Zaicheng Wang

Nov 20, 2016, 4:17:40 PM
to Alluxio Users
When asking questions, please specify some details about your configuration, such as:
- Which version of Alluxio are you using?
Alluxio 1.3.0
- Which computation framework and version are you using?
Spark 1.6, Hadoop 2.7.1
- Which under storage system and version are you using?
HDFS
- Are you running with tiered storage? If so, what is your configuration?
2 tiers:
MEM: 20 GB * 40 workers
SSD: 40 GB * 40 workers
- What is your OS version?
Amazon Linux (Red Hat based)
- What is your JAVA version?
1.7

I am using Alluxio to cache a bunch of big files (1.7 TB in total, each file is about 2 GB).
The MEM tier is 0.8 TB and the SSD tier is 1.6 TB. I got the following exception:

16/11/19 09:05:16 WARN TaskSetManager: Lost task 95.0 in stage 0.0 (TID 78, ip-10-10-48-40.ec2.internal): java.io.IOException: Failed to cache: Not enough space left on worker ip-10-10-48-40.ec2.internal/10.10.48.40:29998 to store blockId 3808428037. Please consult http://www.alluxio.org/docs/1.3/en/Debugging-Guide.html for common solutions to address this problem.
	at alluxio.client.file.FileOutStream.handleCacheWriteException(FileOutStream.java:358)
	at alluxio.client.file.FileOutStream.write(FileOutStream.java:314)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:60)
	at java.io.DataOutputStream.write(DataOutputStream.java:107)
	at org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.writeObject(TextOutputFormat.java:82)
	at org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.write(TextOutputFormat.java:103)
	at org.apache.spark.SparkHadoopWriter.write(SparkHadoopWriter.scala:96)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply$mcV$sp(PairRDDFunctions.scala:1199)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply(PairRDDFunctions.scala:1197)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply(PairRDDFunctions.scala:1197)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1250)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1205)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1185)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Not enough space left on worker ip-10-10-48-40.ec2.internal/10.10.48.40:29998 to store blockId 3808428037. Please consult http://www.alluxio.org/docs/1.3/en/Debugging-Guide.html for common solutions to address this problem.
	at alluxio.client.block.RetryHandlingBlockWorkerClient.requestSpace(RetryHandlingBlockWorkerClient.java:255)
	at alluxio.client.block.LocalBlockOutStream.flush(LocalBlockOutStream.java:121)
	at alluxio.client.block.BufferedBlockOutStream.write(BufferedBlockOutStream.java:104)
	at alluxio.client.file.FileOutStream.write(FileOutStream.java:305)
	... 17 more


This means that even though there is enough space in the lower tier, Alluxio doesn't write to the lower tier.
However, when I split the files into smaller pieces and copy them one by one, Alluxio is able to write them into the lower tier.
Any idea about this?


Thanks,


Calvin Jia

Nov 20, 2016, 10:07:12 PM
to Alluxio Users
Hi,

How many files are you writing concurrently to each worker? In order for Alluxio to move a file's data down to a lower tier, the file must first be completed. In your case, if a worker holds more than 20 GB of incomplete files, you will see this error, since none of that data is eligible to be moved to the lower tier.
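
As a rough back-of-the-envelope check (just arithmetic on the sizes from your configuration above, not any Alluxio API), the number of files that can safely be in flight per worker looks like this:

// Blocks of files that are still being written cannot be moved to a lower tier,
// so the total in-progress data on a worker must fit in its MEM tier.
val memTierPerWorkerGB = 20.0  // MEM tier per worker, from the configuration above
val avgFileSizeGB      = 2.8   // approximate size of each output file
val maxInFlightFiles   = (memTierPerWorkerGB / avgFileSizeGB).toInt
println(s"at most $maxInFlightFiles concurrent writes per worker")  // prints 7; leave some headroom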

Hope this helps,
Calvin

Zaicheng Wang

Nov 21, 2016, 12:26:38 AM
to Alluxio Users
I am writing 624 files concurrently across 39 nodes.
Each file should be about 2.78 GB, which is less than 20 GB.
 

On Sunday, November 20, 2016 at 7:07:12 PM UTC-8, Calvin Jia wrote:

Calvin Jia

Nov 21, 2016, 1:08:55 PM
to Alluxio Users, wcatp1...@gmail.com
Hi,

Since you are writing about 16 files per node concurrently and each file is around 2.8 GB, roughly 45 GB of in-progress data is being written on each worker, well over the 20 GB MEM tier, which leads to the error. To solve this, you can try writing with lower concurrency or with smaller files. For example, if your files are 2.8 GB, try writing only 6 files concurrently on each machine (you can do this by modifying the Spark job configuration, as sketched below).
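
A minimal sketch of one way to do that, assuming one executor per node (the property values below are placeholders to tune, not recommendations):

import org.apache.spark.{SparkConf, SparkContext}

// Cap concurrent tasks per executor so that at most ~6 output files
// (6 * ~2.8 GB ≈ 17 GB) are being written on a worker at once,
// which stays under the 20 GB MEM tier.
val conf = new SparkConf()
  .setAppName("alluxio-write-example")   // placeholder app name
  .set("spark.executor.cores", "6")      // at most 6 concurrent tasks per executor
  .set("spark.task.cpus", "1")           // each task takes one core
val sc = new SparkContext(conf)
// ... build and save the RDD as before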

Hope this helps,
Calvin

Zaicheng Wang

Nov 21, 2016, 2:08:35 PM
to Alluxio Users, wcatp1...@gmail.com
Yes,
I tried increasing the parallelism and it works.
Is this by design? It means I have to estimate the input file size before I can choose the level of parallelism in Spark when using Alluxio.
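
Roughly what the change looks like (just a sketch run in spark-shell, where sc already exists; the paths and target size are placeholders):

// More partitions means smaller output files, so the blocks being written on
// each worker stay well under the 20 GB MEM tier.
val totalInputBytes = 1.7e12                                 // ~1.7 TB of input in total
val targetFileBytes = 1.0e9                                  // aim for roughly 1 GB per output file
val numPartitions   = math.ceil(totalInputBytes / targetFileBytes).toInt  // 1700
val input = sc.textFile("alluxio://master:19998/input")      // placeholder input path
input.repartition(numPartitions)
     .saveAsTextFile("alluxio://master:19998/output")        // placeholder output path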

Thanks,



On Monday, November 21, 2016 at 10:08:55 AM UTC-8, Calvin Jia wrote:

Calvin Jia

Nov 21, 2016, 2:19:15 PM
to Alluxio Users, wcatp1...@gmail.com
Hi,

This is currently a limitation in Alluxio, and we are looking to improve the behavior so that the memory required is concurrency * block size instead of concurrency * file size. However, this is generally an edge case. Could you share your use case for such a concurrent-write-heavy workload?
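
To make the difference concrete with the numbers from this thread (512 MB is just the default Alluxio block size, used here as an assumption):

// Today a worker needs roughly concurrency * file size of free space in the top tier;
// with the proposed change it would only need roughly concurrency * block size.
val concurrentWrites = 16     // in-progress files per worker in this thread
val fileSizeGB       = 2.8
val blockSizeGB      = 0.5    // assumed 512 MB block size
val neededNow        = concurrentWrites * fileSizeGB   // ≈ 44.8 GB, more than the 20 GB MEM tier
val neededProposed   = concurrentWrites * blockSizeGB  // ≈ 8.0 GB, fits easily
println(f"now: $neededNow%.1f GB, proposed: $neededProposed%.1f GB")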

Thanks,
Calvin

Calvin Jia

Jan 2, 2017, 1:38:40 PM
to Alluxio Users, wcatp1...@gmail.com
Hi Zaicheng,

Were you able to tune the system to successfully handle your workload? Feel free to share your use case if you think Alluxio can be better designed to account for it.

Thanks,
Calvin