[ERROR] [ActorSystemImpl] Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-4] shutting down ActorSystem [sparkDriver]


Firman Gautama

Jun 1, 2015, 5:57:42 AM
to predicti...@googlegroups.com

Hello All,

I would like to report an error that occurs when I try to train on around 1.4 million items of data with the recommendation module.

driver memory = 4g
executor memory = 12g (tried with 16g too)

Here is the verbose output:


[INFO] [Console$] Using existing engine manifest JSON at /home/firman/30days/manifest.json
[INFO] [Runner$] Submission command: /pio/PredictionIO-0.9.3/vendors/spark-1.3.1-bin-hadoop2.6/bin/spark-submit --master spark://nn01.staging.us-tmp.xxxxxxxxxx.net:7077 --driver-memory 4G --executor-memory 16G --conf spark.akka.frameSize=1024 --class io.prediction.workflow.CreateWorkflow --jars file:/home/firman/30days/target/scala-2.10/template-scala-parallel-recommendation-assembly-0.1-SNAPSHOT-deps.jar,file:/home/firman/30days/target/scala-2.10/template-scala-parallel-recommendation_2.10-0.1-SNAPSHOT.jar --files file:/pio/PredictionIO-0.9.3/conf/log4j.properties,file:/etc/hadoop/conf/core-site.xml,file:/etc/hbase/conf/hbase-site.xml --driver-class-path /pio/PredictionIO-0.9.3/conf:/etc/hadoop/conf:/etc/hbase/conf file:/pio/PredictionIO-0.9.3/lib/pio-assembly-0.9.3.jar --engine-id WtB0kwNl9oPmR4HSN6HyQdTJydjXHfJE --engine-version c8ab3be2e4e998f4c7a675dd3e7babbf05a84504 --engine-variant file:/home/firman/30days/engine.json --verbosity 0 --json-extractor Both --env PIO_STORAGE_SOURCES_HBASE_TYPE=hbase,PIO_ENV_LOADED=1,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta,PIO_FS_BASEDIR=/home/firman/.pio_store,PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost,PIO_STORAGE_SOURCES_HBASE_HOME=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hbase,PIO_HOME=/pio/PredictionIO-0.9.3,PIO_FS_ENGINESDIR=/home/firman/.pio_store/engines,PIO_STORAGE_SOURCES_LOCALFS_PATH=/home/firman/.pio_store/models,PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch,PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=/pio/PredictionIO-0.9.3/vendors/elasticsearch-1.4.4,PIO_FS_TMPDIR=/home/firman/.pio_store/tmp,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE,PIO_CONF_DIR=/pio/PredictionIO-0.9.3/conf,PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300,PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
[INFO] [Engine] Extracting datasource params...
[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.
[INFO] [Engine] Datasource params: (,DataSourceParams(30days,None))
[INFO] [Engine] Extracting preparator params...
[INFO] [Engine] Preparator params: (,Empty)
[INFO] [Engine] Extracting serving params...
[INFO] [Engine] Serving params: (,Empty)
[INFO] [Remoting] Starting remoting
[INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://sparkDriver@nn01.staging.us-tmp.xxxxxxxxxx.net:49886]
[INFO] [Engine$] EngineWorkflow.train
[INFO] [Engine$] DataSource: xxxxxxxxxx.data30days.DataSource@136bc6a7
[INFO] [Engine$] Preparator: xxxxxxxxxx.data30days.Preparator@6388062b
[INFO] [Engine$] AlgorithmList: List(xxxxxxxxxx.data30days.ALSAlgorithm@6528571)
[INFO] [Engine$] Data santiy check is on.
[INFO] [Engine$] xxxxxxxxxx.data30days.TrainingData does not support data sanity check. Skipping check.
[INFO] [Engine$] xxxxxxxxxx.data30days.PreparedData does not support data sanity check. Skipping check.
[Stage 16:>                                                        (0 + 0) / 32][WARN] [TaskSetManager] Stage 16 contains a task of very large size (77479 KB). The maximum recommended task size is 100 KB.
[Stage 16:>                                                       (0 + 32) / 32][ERROR] [ActorSystemImpl] Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-4] shutting down ActorSystem [sparkDriver]


Regards,
Firman
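
For reference: in PredictionIO 0.9.x, memory settings like these are typically forwarded to spark-submit, e.g. "pio train -- --driver-memory 4G --executor-memory 12G" (everything after "--" is passed through, as the "Submission command" log line above shows). One quick way to confirm what the running job actually received is a sketch like the following, assuming a SparkContext "sc" is in scope (for example inside an algorithm's train method):

// Hedged sketch: print the memory settings the running job sees.
println(sc.getConf.get("spark.driver.memory", "<unset>"))
println(sc.getConf.get("spark.executor.memory", "<unset>"))
// Max heap of the current JVM (the driver, when printed from the driver).
println(s"driver max heap: ${Runtime.getRuntime.maxMemory / (1024 * 1024)} MB")

Note that spark.driver.memory only takes effect when set before the driver JVM starts, i.e. via the spark-submit flag, not from code at runtime.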


 

Donald Szeto

Jun 1, 2015, 1:33:07 PM
to predicti...@googlegroups.com, firman....@gmail.com
Hi Firman,

Do you see other detailed messages inside "pio.log" in the directory where you launched the "pio train" command?

Regards,
Donald



Firman Gautama

Jun 1, 2015, 6:41:18 PM
to predicti...@googlegroups.com, firman....@gmail.com
Hi Donald,

I tried increasing the executor memory, but that didn't solve the problem.
Then I increased the driver memory, and that seems to have fixed it.

"pio train" now works.

Below is the verbose dump from the previous run's "pio.log".

-----
2015-06-01 09:04:14,387 INFO  io.prediction.tools.console.Console$ [main] - Using existing engine manifest JSON at /home/firman/30days/manifest.json
2015-06-01 09:04:16,618 INFO  io.prediction.tools.Runner$ [main] - Submission command: /pio/PredictionIO-0.9.3/vendors/spark-1.3.1-bin-hadoop2.6/bin/spark-submit --driver-memory 4G --executor-memory 2G --class io.prediction.workflow.CreateWorkflow --jars file:/home/firman/30days/target/scala-2.10/template-scala-parallel-recommendation-assembly-0.1-SNAPSHOT-deps.jar,file:/home/firman/30days/target/scala-2.10/template-scala-parallel-recommendation_2.10-0.1-SNAPSHOT.jar --files file:/pio/PredictionIO-0.9.3/conf/log4j.properties,file:/etc/hadoop/conf/core-site.xml,file:/etc/hbase/conf/hbase-site.xml --driver-class-path /pio/PredictionIO-0.9.3/conf:/etc/hadoop/conf:/etc/hbase/conf file:/pio/PredictionIO-0.9.3/lib/pio-assembly-0.9.3.jar --engine-id WtB0kwNl9oPmR4HSN6HyQdTJydjXHfJE --engine-version c8ab3be2e4e998f4c7a675dd3e7babbf05a84504 --engine-variant file:/home/firman/30days/engine.json --verbosity 0 --json-extractor Both --env PIO_STORAGE_SOURCES_HBASE_TYPE=hbase,PIO_ENV_LOADED=1,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta,PIO_FS_BASEDIR=/home/firman/.pio_store,PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost,PIO_STORAGE_SOURCES_HBASE_HOME=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hbase,PIO_HOME=/pio/PredictionIO-0.9.3,PIO_FS_ENGINESDIR=/home/firman/.pio_store/engines,PIO_STORAGE_SOURCES_LOCALFS_PATH=/home/firman/.pio_store/models,PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch,PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=/pio/PredictionIO-0.9.3/vendors/elasticsearch-1.4.4,PIO_FS_TMPDIR=/home/firman/.pio_store/tmp,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE,PIO_CONF_DIR=/pio/PredictionIO-0.9.3/conf,PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300,PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
2015-06-01 09:04:20,424 INFO  io.prediction.controller.Engine [main] - Extracting datasource params...
2015-06-01 09:04:20,539 INFO  io.prediction.workflow.WorkflowUtils$ [main] - No 'name' is found. Default empty String will be used.
2015-06-01 09:04:20,561 INFO  io.prediction.controller.Engine [main] - Datasource params: (,DataSourceParams(30days,None))
2015-06-01 09:04:20,562 INFO  io.prediction.controller.Engine [main] - Extracting preparator params...
2015-06-01 09:04:20,564 INFO  io.prediction.controller.Engine [main] - Preparator params: (,Empty)
2015-06-01 09:04:20,920 INFO  io.prediction.controller.Engine [main] - Extracting serving params...
2015-06-01 09:04:20,920 INFO  io.prediction.controller.Engine [main] - Serving params: (,Empty)
2015-06-01 09:04:23,367 INFO  Remoting [sparkDriver-akka.actor.default-dispatcher-3] - Starting remoting
2015-06-01 09:04:23,639 INFO  Remoting [sparkDriver-akka.actor.default-dispatcher-3] - Remoting started; listening on addresses :[akka.tcp://sparkDriver@nn01.staging.us-tmp.xxxxxxxxxx.net:22898]
2015-06-01 09:04:24,897 INFO  io.prediction.controller.Engine$ [main] - EngineWorkflow.train
2015-06-01 09:04:24,898 INFO  io.prediction.controller.Engine$ [main] - DataSource: xxxxxxxxxx.data30days.DataSource@29c134e1
2015-06-01 09:04:24,899 INFO  io.prediction.controller.Engine$ [main] - Preparator: xxxxxxxxxx.data30days.Preparator@86369c6
2015-06-01 09:04:24,900 INFO  io.prediction.controller.Engine$ [main] - AlgorithmList: List(xxxxxxxxxx.data30days.ALSAlgorithm@5a3770d2)
2015-06-01 09:04:24,901 INFO  io.prediction.controller.Engine$ [main] - Data santiy check is on.
2015-06-01 09:04:27,394 INFO  io.prediction.controller.Engine$ [main] - xxxxxxxxxx.data30days.TrainingData does not support data sanity check. Skipping check.
2015-06-01 09:04:27,395 INFO  io.prediction.controller.Engine$ [main] - xxxxxxxxxx.data30days.PreparedData does not support data sanity check. Skipping check.
2015-06-01 09:08:35,547 WARN  org.elasticsearch.transport [elasticsearch[Mark Todd][transport_client_worker][T#1]{New I/O worker #1}] - [Mark Todd] Received response for a request that has timed out, sent [20760ms] ago, timed out [3499ms] ago, action [cluster:monitor/nodes/info], node [[#transport#-1][nn01.staging.us-tmp][inet[localhost/127.0.0.1:9300]]], id [47]
2015-06-01 09:18:17,497 ERROR org.apache.spark.executor.Executor [Executor task launch worker-2] - Exception in task 2.0 in stage 7.0 (TID 15)
java.lang.OutOfMemoryError: GC overhead limit exceeded
2015-06-01 09:18:17,503 ERROR org.apache.spark.executor.Executor [Executor task launch worker-5] - Exception in task 5.0 in stage 7.0 (TID 18)
java.lang.OutOfMemoryError: GC overhead limit exceeded
at scala.collection.immutable.HashMap$HashTrieMap.updated0(HashMap.scala:328)
at scala.collection.immutable.HashMap$HashTrieMap.updated0(HashMap.scala:326)
at scala.collection.immutable.HashMap.updated(HashMap.scala:54)
at scala.collection.immutable.HashMap$SerializationProxy.readObject(HashMap.scala:516)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
2015-06-01 09:18:17,503 ERROR org.apache.spark.executor.Executor [Executor task launch worker-9] - Exception in task 9.0 in stage 7.0 (TID 22)
java.lang.OutOfMemoryError: GC overhead limit exceeded
2015-06-01 09:18:17,502 ERROR org.apache.spark.executor.Executor [Executor task launch worker-10] - Exception in task 10.0 in stage 7.0 (TID 23)
java.lang.OutOfMemoryError: GC overhead limit exceeded
2015-06-01 09:18:17,538 ERROR org.apache.spark.util.SparkUncaughtExceptionHandler [Executor task launch worker-10] - Uncaught exception in thread Thread[Executor task launch worker-10,5,main]
java.lang.OutOfMemoryError: GC overhead limit exceeded
2015-06-01 09:18:17,502 ERROR org.apache.spark.executor.Executor [Executor task launch worker-0] - Exception in task 0.0 in stage 7.0 (TID 13)
java.lang.OutOfMemoryError: GC overhead limit exceeded
2015-06-01 09:18:17,539 ERROR org.apache.spark.util.SparkUncaughtExceptionHandler [Executor task launch worker-0] - Uncaught exception in thread Thread[Executor task launch worker-0,5,main]
java.lang.OutOfMemoryError: GC overhead limit exceeded
2015-06-01 09:18:17,497 ERROR org.apache.spark.executor.Executor [Executor task launch worker-6] - Exception in task 6.0 in stage 7.0 (TID 19)
java.lang.OutOfMemoryError: GC overhead limit exceeded
at scala.collection.immutable.HashMap$HashTrieMap.updated0(HashMap.scala:328)
at scala.collection.immutable.HashMap$HashTrieMap.updated0(HashMap.scala:326)
at scala.collection.immutable.HashMap$HashTrieMap.updated0(HashMap.scala:326)
at scala.collection.immutable.HashMap.updated(HashMap.scala:54)
at scala.collection.immutable.HashMap$SerializationProxy.readObject(HashMap.scala:516)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
2015-06-01 09:18:17,538 ERROR org.apache.spark.util.SparkUncaughtExceptionHandler [Executor task launch worker-5] - Uncaught exception in thread Thread[Executor task launch worker-5,5,main]
java.lang.OutOfMemoryError: GC overhead limit exceeded
at scala.collection.immutable.HashMap$HashTrieMap.updated0(HashMap.scala:328)
at scala.collection.immutable.HashMap$HashTrieMap.updated0(HashMap.scala:326)
at scala.collection.immutable.HashMap.updated(HashMap.scala:54)
at scala.collection.immutable.HashMap$SerializationProxy.readObject(HashMap.scala:516)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
2015-06-01 09:18:17,532 ERROR org.apache.spark.util.SparkUncaughtExceptionHandler [Executor task launch worker-9] - Uncaught exception in thread Thread[Executor task launch worker-9,5,main]
java.lang.OutOfMemoryError: GC overhead limit exceeded
2015-06-01 09:18:17,525 ERROR org.apache.spark.util.SparkUncaughtExceptionHandler [Executor task launch worker-2] - Uncaught exception in thread Thread[Executor task launch worker-2,5,main]
java.lang.OutOfMemoryError: GC overhead limit exceeded
2015-06-01 09:18:17,524 ERROR org.apache.spark.executor.Executor [Executor task launch worker-4] - Exception in task 4.0 in stage 7.0 (TID 17)
java.lang.OutOfMemoryError: GC overhead limit exceeded
2015-06-01 09:18:17,561 ERROR org.apache.spark.util.SparkUncaughtExceptionHandler [Executor task launch worker-4] - Uncaught exception in thread Thread[Executor task launch worker-4,5,main]
java.lang.OutOfMemoryError: GC overhead limit exceeded
2015-06-01 09:18:17,562 ERROR akka.actor.ActorSystemImpl [sparkDriver-akka.actor.default-dispatcher-14] - exception on LARS’ timer thread
java.lang.OutOfMemoryError: GC overhead limit exceeded
2015-06-01 09:18:17,522 ERROR org.apache.spark.executor.Executor [Executor task launch worker-8] - Exception in task 8.0 in stage 7.0 (TID 21)
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.io.ObjectInputStream$HandleTable$HandleList.<init>(ObjectInputStream.java:3480)
at java.io.ObjectInputStream$HandleTable.markDependency(ObjectInputStream.java:3305)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at scala.collection.immutable.HashMap$SerializationProxy.readObject(HashMap.scala:516)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
2015-06-01 09:18:17,565 ERROR org.apache.spark.util.SparkUncaughtExceptionHandler [Executor task launch worker-8] - Uncaught exception in thread Thread[Executor task launch worker-8,5,main]
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.io.ObjectInputStream$HandleTable$HandleList.<init>(ObjectInputStream.java:3480)
at java.io.ObjectInputStream$HandleTable.markDependency(ObjectInputStream.java:3305)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at scala.collection.immutable.HashMap$SerializationProxy.readObject(HashMap.scala:516)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
2015-06-01 09:18:17,517 ERROR org.apache.spark.executor.Executor [Executor task launch worker-7] - Exception in task 7.0 in stage 7.0 (TID 20)
java.lang.OutOfMemoryError: GC overhead limit exceeded
2015-06-01 09:18:17,564 ERROR akka.actor.ActorSystemImpl [sparkDriver-akka.actor.default-dispatcher-14] - Uncaught fatal error from thread [sparkDriver-scheduler-1] shutting down ActorSystem [sparkDriver]
java.lang.OutOfMemoryError: GC overhead limit exceeded
2015-06-01 09:18:17,546 WARN  org.apache.spark.scheduler.TaskSetManager [task-result-getter-1] - Lost task 2.0 in stage 7.0 (TID 15, localhost): java.lang.OutOfMemoryError: GC overhead limit exceeded

2015-06-01 09:18:17,543 ERROR org.apache.spark.util.SparkUncaughtExceptionHandler [Executor task launch worker-6] - Uncaught exception in thread Thread[Executor task launch worker-6,5,main]
java.lang.OutOfMemoryError: GC overhead limit exceeded
at scala.collection.immutable.HashMap$HashTrieMap.updated0(HashMap.scala:328)
at scala.collection.immutable.HashMap$HashTrieMap.updated0(HashMap.scala:326)
at scala.collection.immutable.HashMap$HashTrieMap.updated0(HashMap.scala:326)
at scala.collection.immutable.HashMap.updated(HashMap.scala:54)
at scala.collection.immutable.HashMap$SerializationProxy.readObject(HashMap.scala:516)
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
2015-06-01 09:18:17,567 ERROR org.apache.spark.util.SparkUncaughtExceptionHandler [Executor task launch worker-7] - Uncaught exception in thread Thread[Executor task launch worker-7,5,main]
java.lang.OutOfMemoryError: GC overhead limit exceeded
2015-06-01 09:18:17,573 ERROR org.apache.spark.scheduler.TaskSetManager [task-result-getter-1] - Task 2 in stage 7.0 failed 1 times; aborting job
2015-06-01 09:18:17,579 WARN  org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation [main-EventThread] - This client just lost it's session with ZooKeeper, closing it. It will be recreated next time someone needs it
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:401)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:319)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2015-06-01 09:18:17,581 WARN  org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation [main-EventThread] - This client just lost it's session with ZooKeeper, closing it. It will be recreated next time someone needs it
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:401)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:319)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)

-----

2015-06-01 09:38:44,729 INFO  io.prediction.tools.console.Console$ [main] - Using existing engine manifest JSON at /home/firman/30days/manifest.json
2015-06-01 09:38:47,194 INFO  io.prediction.tools.Runner$ [main] - Submission command: /pio/PredictionIO-0.9.3/vendors/spark-1.3.1-bin-hadoop2.6/bin/spark-submit --master spark://nn01.staging.us-tmp.xxxxxxxxxx.net:7077 --driver-memory 4G --executor-memory 16G --conf spark.akka.frameSize=1024 --class io.prediction.workflow.CreateWorkflow --jars file:/home/firman/30days/target/scala-2.10/template-scala-parallel-recommendation-assembly-0.1-SNAPSHOT-deps.jar,file:/home/firman/30days/target/scala-2.10/template-scala-parallel-recommendation_2.10-0.1-SNAPSHOT.jar --files file:/pio/PredictionIO-0.9.3/conf/log4j.properties,file:/etc/hadoop/conf/core-site.xml,file:/etc/hbase/conf/hbase-site.xml --driver-class-path /pio/PredictionIO-0.9.3/conf:/etc/hadoop/conf:/etc/hbase/conf file:/pio/PredictionIO-0.9.3/lib/pio-assembly-0.9.3.jar --engine-id WtB0kwNl9oPmR4HSN6HyQdTJydjXHfJE --engine-version c8ab3be2e4e998f4c7a675dd3e7babbf05a84504 --engine-variant file:/home/firman/30days/engine.json --verbosity 0 --json-extractor Both --env PIO_STORAGE_SOURCES_HBASE_TYPE=hbase,PIO_ENV_LOADED=1,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta,PIO_FS_BASEDIR=/home/firman/.pio_store,PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost,PIO_STORAGE_SOURCES_HBASE_HOME=/opt/cloudera/parcels/CDH-5.4.1-1.cdh5.4.1.p0.6/lib/hbase,PIO_HOME=/pio/PredictionIO-0.9.3,PIO_FS_ENGINESDIR=/home/firman/.pio_store/engines,PIO_STORAGE_SOURCES_LOCALFS_PATH=/home/firman/.pio_store/models,PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch,PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=/pio/PredictionIO-0.9.3/vendors/elasticsearch-1.4.4,PIO_FS_TMPDIR=/home/firman/.pio_store/tmp,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE,PIO_CONF_DIR=/pio/PredictionIO-0.9.3/conf,PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300,PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
2015-06-01 09:38:51,093 INFO  io.prediction.controller.Engine [main] - Extracting datasource params...
2015-06-01 09:38:51,209 INFO  io.prediction.workflow.WorkflowUtils$ [main] - No 'name' is found. Default empty String will be used.
2015-06-01 09:38:51,231 INFO  io.prediction.controller.Engine [main] - Datasource params: (,DataSourceParams(30days,None))
2015-06-01 09:38:51,232 INFO  io.prediction.controller.Engine [main] - Extracting preparator params...
2015-06-01 09:38:51,234 INFO  io.prediction.controller.Engine [main] - Preparator params: (,Empty)
2015-06-01 09:38:51,591 INFO  io.prediction.controller.Engine [main] - Extracting serving params...
2015-06-01 09:38:51,592 INFO  io.prediction.controller.Engine [main] - Serving params: (,Empty)
2015-06-01 09:38:53,994 INFO  Remoting [sparkDriver-akka.actor.default-dispatcher-3] - Starting remoting
2015-06-01 09:38:54,251 INFO  Remoting [sparkDriver-akka.actor.default-dispatcher-3] - Remoting started; listening on addresses :[akka.tcp://sparkDriver@nn01.staging.us-tmp.xxxxxxxxxx.net:49886]
2015-06-01 09:38:55,656 INFO  io.prediction.controller.Engine$ [main] - EngineWorkflow.train
2015-06-01 09:38:55,657 INFO  io.prediction.controller.Engine$ [main] - DataSource: xxxxxxxxxx.data30days.DataSource@136bc6a7
2015-06-01 09:38:55,658 INFO  io.prediction.controller.Engine$ [main] - Preparator: xxxxxxxxxx.data30days.Preparator@6388062b
2015-06-01 09:38:55,658 INFO  io.prediction.controller.Engine$ [main] - AlgorithmList: List(xxxxxxxxxx.data30days.ALSAlgorithm@6528571)
2015-06-01 09:38:55,659 INFO  io.prediction.controller.Engine$ [main] - Data santiy check is on.
2015-06-01 09:38:58,706 INFO  io.prediction.controller.Engine$ [main] - xxxxxxxxxx.data30days.TrainingData does not support data sanity check. Skipping check.
2015-06-01 09:38:58,707 INFO  io.prediction.controller.Engine$ [main] - xxxxxxxxxx.data30days.PreparedData does not support data sanity check. Skipping check.
2015-06-01 09:41:40,502 WARN  org.apache.spark.scheduler.TaskSetManager [sparkDriver-akka.actor.default-dispatcher-4] - Stage 16 contains a task of very large size (77479 KB). The maximum recommended task size is 100 KB.
2015-06-01 09:44:22,309 ERROR akka.actor.ActorSystemImpl [sparkDriver-akka.actor.default-dispatcher-3] - Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-4] shutting down ActorSystem [sparkDriver]
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2271)
at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:191)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:82)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverActor$$anonfun$launchTasks$1.apply(CoarseGrainedSchedulerBackend.scala:183)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverActor$$anonfun$launchTasks$1.apply(CoarseGrainedSchedulerBackend.scala:181)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverActor.launchTasks(CoarseGrainedSchedulerBackend.scala:181)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverActor.makeOffers(CoarseGrainedSchedulerBackend.scala:167)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverActor$$anonfun$receiveWithLogging$1.applyOrElse(CoarseGrainedSchedulerBackend.scala:131)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:53)
at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:42)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at org.apache.spark.util.ActorLogReceive$$anon$1.applyOrElse(ActorLogReceive.scala:42)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverActor.aroundReceive(CoarseGrainedSchedulerBackend.scala:74)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Regards,
Firman
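
A note on the failure mode: the driver-side "java.lang.OutOfMemoryError: Java heap space" above is thrown inside JavaSerializerInstance.serialize while launching tasks, which lines up with the earlier warning that Stage 16 carries tasks of 77479 KB against a recommended 100 KB. The driver serializes every task closure, so a large object captured in a closure gets copied into all 32 tasks at once; raising --driver-memory relieves the symptom. The usual structural fix in Spark is to ship such a structure to each executor once as a broadcast variable. A hedged, self-contained sketch with placeholder names — not the template's actual code — follows:

import org.apache.spark.{SparkConf, SparkContext}

object BroadcastSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("broadcast-sketch"))
    // Stand-in for a large lookup structure that would otherwise be
    // captured in every task closure and re-serialized per task.
    val bigMap: Map[Int, String] = (1 to 1000000).map(i => i -> ("item-" + i)).toMap
    val bigMapBc = sc.broadcast(bigMap) // serialized once, fetched per executor
    val ids = sc.parallelize(1 to 100, numSlices = 32)
    // Tasks reference only the small broadcast handle, keeping closures tiny.
    val names = ids.map(i => bigMapBc.value.getOrElse(i, "unknown"))
    names.take(5).foreach(println)
    sc.stop()
  }
}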

Donald Szeto

Jun 1, 2015, 8:36:45 PM
to predicti...@googlegroups.com, firman....@gmail.com
Interesting. Are you using a stock template without modification?


Firman Gautama

Jun 1, 2015, 11:27:31 PM
to Donald Szeto, predicti...@googlegroups.com
Hi Donald,

We are using the stock template.

The only thing we changed is:
- The "buy" value in DataSource.scala: we increased it from 4.0 to 10.0, and the value for "buy" events when importing to the event server is > 10.

Regards,
Firman
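
For reference, the stock recommendation template's DataSource.scala maps event types to implicit rating values; the change described above roughly corresponds to the sketch below. This is a paraphrase with illustrative names, not the template's code verbatim:

// Hedged sketch of the event-to-rating mapping; names are illustrative.
case class ImportedEvent(event: String, rating: Option[Double])

def ratingOf(e: ImportedEvent): Double = e.event match {
  case "rate" => e.rating.getOrElse(0.0) // explicit rating events
  case "buy"  => 10.0                    // stock template assigns 4.0 here
  case other  => sys.error("Unexpected event " + other)
}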

