Spark OFF_HEAP raises "java.lang.RuntimeException: org.apache.spark.storage.BlockNotFoundException"


swin...@gmail.com

Mar 25, 2015, 9:27:42 PM
to tachyo...@googlegroups.com
      I run a job with Spark 1.3 / Tachyon 0.6.1 / Hadoop 2.5.0-cdh5.2.0. When Tachyon's memory reaches 100% used, the job fails with "java.lang.RuntimeException: org.apache.spark.storage.BlockNotFoundException". I use rdd.persist(StorageLevel.OFF_HEAP); doesn't this mode store the RDD in the under filesystem? When there are too many RDDs to fit in Tachyon, how can I avoid losing them?

Thank you for any help!

Calvin Jia

Mar 26, 2015, 4:32:06 AM
to tachyo...@googlegroups.com
Hi,

Currently, I think Spark uses the TRY_CACHE write mode for Tachyon, meaning data will be lost when it is evicted (see this thread: https://groups.google.com/forum/#!topic/tachyon-users/xb8zwqIjIa4). This seems to be an integration issue between Spark and Tachyon. For now, you may need to persist the data to disk if the data set is too large.
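A minimal sketch of that disk-backed fallback (the input path here is illustrative, not from the original job):

import org.apache.spark.storage.StorageLevel

// hypothetical input; any RDD works the same way
val rdd = sc.textFile("hdfs://namenode:9000/path/to/input")

// MEMORY_AND_DISK spills partitions that do not fit in memory to local disk,
// so blocks are read back or recomputed instead of being silently dropped
// the way evicted TRY_CACHE blocks in Tachyon are.
rdd.persist(StorageLevel.MEMORY_AND_DISK)
rdd.count()  // materializes the cache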

Haoyuan Li

Mar 26, 2015, 4:42:06 AM
to swin...@gmail.com, tachyo...@googlegroups.com
You may wanna try Tiered storage in Tachyon 0.6.1. 

On Wed, Mar 25, 2015 at 6:27 PM, <swin...@gmail.com> wrote:
      I run a job with Spark 1.3 / Tachyon 0.6.1 / Hadoop 2.5.0-cdh5.2.0. When Tachyon's memory reaches 100% used, the job fails with "java.lang.RuntimeException: org.apache.spark.storage.BlockNotFoundException". I use rdd.persist(StorageLevel.OFF_HEAP); doesn't this mode store the RDD in the under filesystem? When there are too many RDDs to fit in Tachyon, how can I avoid losing them?

Thank you for any help!




--
Haoyuan Li
AMPLab, EECS, UC Berkeley

swin...@gmail.com

Mar 26, 2015, 6:45:49 AM
to tachyo...@googlegroups.com, swin...@gmail.com
What is tiered storage? Is there documentation about it?

On Thursday, March 26, 2015 at 4:42:06 PM UTC+8, Haoyuan Li wrote:

Calvin Jia

Mar 27, 2015, 4:12:34 PM
to tachyo...@googlegroups.com, swin...@gmail.com
Tiered storage is a new feature in Tachyon 0.6. The documentation can be found here: http://tachyon-project.org/Hierarchy-Storage-on-Tachyon.html
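As a rough illustration, a two-tier setup (memory over local disk) looks something like the following in the worker's configuration; the property names follow the 0.6 hierarchical-storage docs, and the paths and quotas are made up for the example:

# top tier: memory (values illustrative)
tachyon.worker.hierarchystore.level.max=2
tachyon.worker.hierarchystore.level0.alias=MEM
tachyon.worker.hierarchystore.level0.dirs.path=/mnt/ramdisk
tachyon.worker.hierarchystore.level0.dirs.quota=16GB
# second tier: local disk, so eviction from memory demotes blocks instead of dropping them
tachyon.worker.hierarchystore.level1.alias=HDD
tachyon.worker.hierarchystore.level1.dirs.path=/data1/tachyon
tachyon.worker.hierarchystore.level1.dirs.quota=100GB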

苗海泉

Jul 27, 2016, 2:51:58 AM
to Alluxio Users, tachyo...@googlegroups.com
Hello everybody, I use Spark 1.6.2 and Alluxio 1.2.0. When I test persist in spark-shell with the code below, what is happening?
Is it because the new version of Alluxio does not support this?
If you know the reason, please tell me, thank you very much!

scala> val afile = sc.textFile("hdfs://spark29:9000/home/logs/nat/nat_1467220740000_1467220800000")
afile: org.apache.spark.rdd.RDD[String] = hdfs://spark29:9000/home/logs/nat/nat_1467220740000_1467220800000 MapPartitionsRDD[9] at textFile at <console>:27

scala> afile.count
res4: Long = 88                                                                 

scala> import org.apache.spark.storage.StorageLevel
import org.apache.spark.storage.StorageLevel

scala> afile.persist(StorageLevel.OFF_HEAP)
res5: afile.type = hdfs://spark29:9000/home/logs/nat/nat_1467220740000_1467220800000 MapPartitionsRDD[9] at textFile at <console>:27

scala> afile.count
[Stage 1:>                                                          (0 + 2) / 2]16/07/27 14:32:58 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 2, spark24): org.apache.spark.storage.BlockException: Block manager failed to return cached value for rdd_9_0!
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:158)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

16/07/27 14:32:58 ERROR TaskSetManager: Task 0 in stage 1.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 6, spark24): org.apache.spark.storage.BlockException: Block manager failed to return cached value for rdd_9_0!
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:158)

Pei Sun

Jul 27, 2016, 7:53:51 PM
to 苗海泉, Alluxio Users, tachyo...@googlegroups.com
Hi,
    It is not recommended to use persist(OFF_HEAP) anymore; this feature was removed in Spark 2.0.0. You can use saveAsTextFile or saveAsObjectFile (slower) instead.
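For example, writing through Alluxio instead of caching OFF_HEAP might look like this (the master host and port are illustrative; 19998 is the default Alluxio master port):

// write the computed RDD into Alluxio-managed storage
afile.saveAsTextFile("alluxio://spark29:19998/nat_1467220740000_1467220800000")

// later reads are served through Alluxio
val cached = sc.textFile("alluxio://spark29:19998/nat_1467220740000_1467220800000")
cached.count()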

Pei




--
Pei Sun

苗海泉

Jul 29, 2016, 5:50:53 AM
to Alluxio Users, tachyo...@googlegroups.com
Hello, I want to persist some data to HDFS, so I use the following command:
alluxio fs persist /home/miaohq/shigzNatRadius
It tells me the path is already persisted, but I can't see it in HDFS. Why?
Then I copy it to the local filesystem with:
alluxio fs copyToLocal /home/miaohq/shigzNatRadius /data/spark/miaohq/dpidata/dpilog

The command copies part of the data and then blocks. shigzNatRadius is a directory with 121 subdirectories, each containing 2-3 files.
Please tell me why? Thank you very much!


Java version: 1.8.0_77
Alluxio version: 1.2.0
OS: Linux spark29 2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 x86_64 x86_64 GNU/Linux

Pei Sun

Jul 29, 2016, 11:12:16 AM
to 苗海泉, Alluxio Users, tachyo...@googlegroups.com
Hey, 
   Can you share your Alluxio logs? Also, since it says the file is already persisted, can you try to ls the HDFS directory and see what is in it? A sketch of the check is below.
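A minimal sketch of that check (the under-storage root is illustrative; substitute whatever alluxio.underfs.address points at in your setup):

# what Alluxio thinks is persisted
alluxio fs ls /home/miaohq/shigzNatRadius

# what actually landed in the HDFS under storage
hdfs dfs -ls /underfs/home/miaohq/shigzNatRadius   # /underfs is a made-up root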

Pei




--
Pei Sun

Pei Sun

Aug 8, 2016, 11:46:07 AM
to 苗海泉, Alluxio Users
Hi,
   1. If you have linked the Alluxio client jar to Spark, you will see Alluxio-related information in the Spark client logs. You can control where the Spark logs go by modifying ${YOUR_SPARK_HOME}/conf/log4j.properties.
   2. You need to put the Alluxio client jar on Spark's classpath and then use an alluxio:// URI in the Spark application (sketched below). You can find documentation here.
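A minimal sketch of point 2, assuming a separately deployed Alluxio master (the jar path, host, and port are illustrative):

# put the Alluxio client jar on both the driver and executor classpaths
spark-shell \
  --conf spark.driver.extraClassPath=/path/to/alluxio-core-client-1.2.0-jar-with-dependencies.jar \
  --conf spark.executor.extraClassPath=/path/to/alluxio-core-client-1.2.0-jar-with-dependencies.jar

Then, inside the shell, address files through the Alluxio master:

// read and write via the alluxio:// scheme
val rdd = sc.textFile("alluxio://spark29:19998/home/miaohq/shigzNatRadius")
rdd.saveAsTextFile("alluxio://spark29:19998/home/miaohq/output")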
Pei 

I would be glad to share the Alluxio logs, but I found nothing about the Spark persist in them.

I am not sure how Spark finds Alluxio to persist its cached data. Our Alluxio is deployed separately from the Spark cluster; we are not using an Alluxio embedded in Spark.

Is this wrong? Am I missing some special configuration for Spark or Alluxio?