Out of memory when training on 2GB of logs.

Kim Trang Le

Jan 19, 2016, 10:36:29 PM
to predictionio-dev
Dear all,
I trained a 2GB log with MySQL and distributed Spark on an 8GB RAM machine, and this error occurred:


[Stage 1:>                                                          (0 + 4) / 4][WARN] [MemoryStore] Not enough space to cache rdd_4_1 in memory! (computed 63.0 MB so far)
[WARN] [MemoryStore] Not enough space to cache rdd_4_2 in memory! (computed 99.9 MB so far)
[WARN] [MemoryStore] Not enough space to cache rdd_4_3 in memory! (computed 96.2 MB so far)

[Stage 4:>                                                          (0 + 4) / 4][ERROR] [Executor] Exception in task 0.0 in stage 4.0 (TID 12)
[ERROR] [SparkUncaughtExceptionHandler] Uncaught exception in thread Thread[Executor task launch worker-1,5,main]
[WARN] [TaskSetManager] Lost task 0.0 in stage 4.0 (TID 12, localhost): java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2271)
        at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
        at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
        at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1876)
        at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1785)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1188)
        at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
        at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:44)
        at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:81)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:236)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

[ERROR] [TaskSetManager] Task 0 in stage 4.0 failed 1 times; aborting job
[Stage 4:>                                                          (0 + 3) / 4][WARN] [QueuedThreadPool] 1 threads could not be stopped
[WARN] [TaskSetManager] Lost task 2.0 in stage 4.0 (TID 14, localhost): TaskKilled (killed intentionally)
[WARN] [TaskSetManager] Lost task 1.0 in stage 4.0 (TID 13, localhost): TaskKilled (killed intentionally)

To address this problem, do I need to specify driver-memory and executor-memory for pio train?
I am using an 8GB RAM machine.
Will it work with --driver-memory 4G --executor-memory 8G? If not, to how many gigabytes should I reduce them? Thanks so much.
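For example, I would run something like this (a sketch, assuming pio train passes everything after -- through to spark-submit):

    pio train -- --driver-memory 4G --executor-memory 8G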

Shalltell Uduojie

Jan 29, 2016, 8:30:02 PM
to predictionio-dev
You may need to run Spark in a standalone cluster (http://spark.apache.org/docs/latest/spark-standalone.html). This is how I resolved my issue, in addition to specifying the driver and executor memory.
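Roughly what that looks like, as a sketch (the master host and memory values here are placeholders; run the scripts from your Spark installation directory):

    # start a standalone master and attach one worker to it
    ./sbin/start-master.sh
    ./sbin/start-slave.sh spark://your-master-host:7077

    # train against the cluster with explicit memory settings
    pio train -- --master spark://your-master-host:7077 --driver-memory 4G --executor-memory 4G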

Pat Ferrel

Jan 30, 2016, 12:20:02 PM
to Kim Trang Le, predictionio-dev
Are you still using an 8GB machine? Is there any way to upgrade to 16GB, or better yet 32GB? Although some templates may work with an 8GB machine, we consider 16GB the minimum requirement.

Kim Trang Le

Jan 31, 2016, 11:02:01 PM
to Pat Ferrel, predictionio-dev
OK, thanks Pat, I got it. I upgraded to 16GB, and with some Spark configuration it is OK now.
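For anyone hitting the same error, the relevant knobs are along these lines (a sketch; the values are illustrative for a 16GB machine, not my exact settings):

    # conf/spark-defaults.conf
    spark.driver.memory    4g
    spark.executor.memory  8g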
