Spark OFF_HEAP raises "java.lang.RuntimeException: org.apache.spark.storage.BlockNotFoundException"


swin...@gmail.com

Mar 25, 2015, 9:27:42 PM
to tachyo...@googlegroups.com
      I run a job with Spark 1.3 / Tachyon 0.6.1 / Hadoop 2.5.0-cdh5.2.0. When Tachyon's memory reaches 100% used, the job fails with "java.lang.RuntimeException: org.apache.spark.storage.BlockNotFoundException". I use rdd.persist(StorageLevel.OFF_HEAP); doesn't this mode store the RDD in the under filesystem? When there are too many RDDs to fit in Tachyon, how can I avoid losing them?

Thank you for any help!

Calvin Jia

Mar 26, 2015, 4:32:06 AM
to tachyo...@googlegroups.com
Hi,

Currently, I think Spark uses the TRY_CACHE write mode for Tachyon, meaning data will be lost when it is evicted (see this thread: https://groups.google.com/forum/#!topic/tachyon-users/xb8zwqIjIa4). This seems to be an integration issue between Spark and Tachyon. For now, you may need to persist the data to disk if the data set is too large.
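A minimal sketch of that disk-backed fallback (the input path here is illustrative, not from the original job):

import org.apache.spark.storage.StorageLevel

// hypothetical input; any RDD works the same way
val rdd = sc.textFile("hdfs://namenode:9000/path/to/input")

// MEMORY_AND_DISK spills partitions that do not fit in memory to local disk,
// so blocks are read back or recomputed instead of being silently dropped
// the way evicted TRY_CACHE blocks in Tachyon are.
rdd.persist(StorageLevel.MEMORY_AND_DISK)
rdd.count()  // materializes the cache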

Haoyuan Li

Mar 26, 2015, 4:42:06 AM
to swin...@gmail.com, tachyo...@googlegroups.com
You may wanna try Tiered storage in Tachyon 0.6.1. 

On Wed, Mar 25, 2015 at 6:27 PM, <swin...@gmail.com> wrote:
      I run a job with Spark 1.3 / Tachyon 0.6.1 / Hadoop 2.5.0-cdh5.2.0. When Tachyon's memory reaches 100% used, the job fails with "java.lang.RuntimeException: org.apache.spark.storage.BlockNotFoundException". I use rdd.persist(StorageLevel.OFF_HEAP); doesn't this mode store the RDD in the under filesystem? When there are too many RDDs to fit in Tachyon, how can I avoid losing them?

Thank you for any help!




--
Haoyuan Li
AMPLab, EECS, UC Berkeley

swin...@gmail.com

Mar 26, 2015, 6:45:49 AM
to tachyo...@googlegroups.com, swin...@gmail.com
What is tiered storage? Is there documentation about it?

On Thursday, March 26, 2015 at 4:42:06 PM UTC+8, Haoyuan Li wrote:

Calvin Jia

Mar 27, 2015, 4:12:34 PM
to tachyo...@googlegroups.com, swin...@gmail.com
Tiered storage is a new feature in Tachyon 0.6. The documentation can be found here: http://tachyon-project.org/Hierarchy-Storage-on-Tachyon.html
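As a rough illustration, a two-tier setup (memory over local disk) looks something like the following in the worker's configuration; the property names follow the 0.6 hierarchical-storage docs, and the paths and quotas are made up for the example:

# top tier: memory (values illustrative)
tachyon.worker.hierarchystore.level.max=2
tachyon.worker.hierarchystore.level0.alias=MEM
tachyon.worker.hierarchystore.level0.dirs.path=/mnt/ramdisk
tachyon.worker.hierarchystore.level0.dirs.quota=16GB
# second tier: local disk, so eviction from memory demotes blocks instead of dropping them
tachyon.worker.hierarchystore.level1.alias=HDD
tachyon.worker.hierarchystore.level1.dirs.path=/data1/tachyon
tachyon.worker.hierarchystore.level1.dirs.quota=100GB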

苗海泉

Jul 27, 2016, 2:51:58 AM
to Alluxio Users, tachyo...@googlegroups.com
Hello everybody, I use Spark 1.6.2 and Alluxio 1.2.0. When I test persist in spark-shell with the code below, what is happening?
Is it because the new version of Alluxio does not support this?
If you know the reason, please tell me, thank you very much!

scala> val afile = sc.textFile("hdfs://spark29:9000/home/logs/nat/nat_1467220740000_1467220800000")
afile: org.apache.spark.rdd.RDD[String] = hdfs://spark29:9000/home/logs/nat/nat_1467220740000_1467220800000 MapPartitionsRDD[9] at textFile at <console>:27

scala> afile.count
res4: Long = 88                                                                 

scala> import org.apache.spark.storage.StorageLevel
import org.apache.spark.storage.StorageLevel

scala> afile.persist(StorageLevel.OFF_HEAP)
res5: afile.type = hdfs://spark29:9000/home/logs/nat/nat_1467220740000_1467220800000 MapPartitionsRDD[9] at textFile at <console>:27

scala> afile.count
[Stage 1:>                                                          (0 + 2) / 2]16/07/27 14:32:58 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 2, spark24): org.apache.spark.storage.BlockException: Block manager failed to return cached value for rdd_9_0!
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:158)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

16/07/27 14:32:58 ERROR TaskSetManager: Task 0 in stage 1.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 6, spark24): org.apache.spark.storage.BlockException: Block manager failed to return cached value for rdd_9_0!
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:158)

Pei Sun

Jul 27, 2016, 7:53:51 PM
to 苗海泉, Alluxio Users, tachyo...@googlegroups.com
Hi,
    It is not recommended to use persist(OFF_HEAP) anymore; this feature was removed in Spark 2.0.0. You can use saveAsTextFile or saveAsObjectFile (slower) instead.
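For example, writing through Alluxio instead of caching OFF_HEAP might look like this (the master host and port are illustrative; 19998 is the default Alluxio master port):

// write the computed RDD into Alluxio-managed storage
afile.saveAsTextFile("alluxio://spark29:19998/nat_1467220740000_1467220800000")

// later reads are served through Alluxio
val cached = sc.textFile("alluxio://spark29:19998/nat_1467220740000_1467220800000")
cached.count()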

Pei




--
Pei Sun

苗海泉

Jul 29, 2016, 5:50:53 AM
to Alluxio Users, tachyo...@googlegroups.com
Hello, I want to persist some data to HDFS, so I use the following command:
alluxio fs persist /home/miaohq/shigzNatRadius
It tells me the path is already persisted, but I can't see it in HDFS. Why?
Then I copy it to the local filesystem with:
alluxio fs copyToLocal /home/miaohq/shigzNatRadius /data/spark/miaohq/dpidata/dpilog

The command copies part of the data and then blocks. shigzNatRadius is a directory with 121 subdirectories, each containing 2-3 files.
Please tell me why? Thank you very much!


Java version: 1.8.0_77
Alluxio version: 1.2.0
OS: Linux spark29 2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 x86_64 x86_64 GNU/Linux

Pei Sun

Jul 29, 2016, 11:12:16 AM
to 苗海泉, Alluxio Users, tachyo...@googlegroups.com
Hey, 
   Can you share your Alluxio logs? Also, since it says the file is already persisted, can you try to ls the HDFS directory and see what is in it? A sketch of the check is below.
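A minimal sketch of that check (the under-storage root is illustrative; substitute whatever alluxio.underfs.address points at in your setup):

# what Alluxio thinks is persisted
alluxio fs ls /home/miaohq/shigzNatRadius

# what actually landed in the HDFS under storage
hdfs dfs -ls /underfs/home/miaohq/shigzNatRadius   # /underfs is a made-up root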

Pei




--
Pei Sun

Pei Sun

Aug 8, 2016, 11:46:07 AM
to 苗海泉, Alluxio Users
Hi,
   1. If you have linked the Alluxio client jar to Spark, you will see Alluxio-related information in the Spark client logs. You can control where the Spark logs go by modifying ${YOUR_SPARK_HOME}/conf/log4j.properties.
   2. You need to put the Alluxio client jar on Spark's classpath and then use an alluxio:// URI in the Spark application (sketched below). You can find documentation here.
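A minimal sketch of point 2, assuming a separately deployed Alluxio master (the jar path, host, and port are illustrative):

# put the Alluxio client jar on both the driver and executor classpaths
spark-shell \
  --conf spark.driver.extraClassPath=/path/to/alluxio-core-client-1.2.0-jar-with-dependencies.jar \
  --conf spark.executor.extraClassPath=/path/to/alluxio-core-client-1.2.0-jar-with-dependencies.jar

Then, inside the shell, address files through the Alluxio master:

// read and write via the alluxio:// scheme
val rdd = sc.textFile("alluxio://spark29:19998/home/miaohq/shigzNatRadius")
rdd.saveAsTextFile("alluxio://spark29:19998/home/miaohq/output")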
Pei 

I would be glad to share the Alluxio logs, but I found nothing about the Spark persist in them.

I am not sure how Spark finds Alluxio to persist its cached data. Our Alluxio is deployed separately from the Spark cluster; we are not using an Alluxio embedded in Spark.

Is this wrong? Am I missing some special configuration for Spark or Alluxio?