Cannot create a new H2OContext

euge...@gmail.com

Jul 18, 2016, 3:24:42 PM
to H2O Open Source Scalable Machine Learning - h2ostream
Hi,

I cannot create a new H2OContext, either from sparkling-shell or from a Scala jar. I am trying to run this on a Cloudera 5.7.1 cluster:

~/sparkling-water-1.6.3$ bin/sparkling-shell --num-executors 3 --executor-memory 4g --driver-memory 4g --master yarn-client

-----
  Spark master (MASTER)     : yarn-client
  Spark home   (SPARK_HOME) : /opt/cloudera/parcels/CDH/lib/spark
  H2O build version         : 3.8.2.3 (turchin)
  Spark build version       : 1.6.1
----

WARNING: User-defined SPARK_HOME (/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/spark) overrides detected (/opt/cloudera/parcels/CDH/lib/spark).
WARNING: Running spark-class from user-defined location.
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=384m; support was removed in 8.0
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
      /_/

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_91)
Type in expressions to have them evaluated.
Type :help for more information.
16/07/18 13:40:42 WARN spark.SparkContext: Dynamic Allocation and num executors both set, thus dynamic allocation disabled.
Spark context available as sc (master = yarn-client, app id = application_1468865872508_0003).
SQL context available as sqlContext.

scala> import org.apache.spark.h2o._
import org.apache.spark.h2o._

scala> val h2oContext = new H2OContext(sc).start()
16/07/18 13:41:14 WARN h2o.H2OContext: Increasing 'spark.locality.wait' to value 30000
16/07/18 13:41:14 WARN h2o.H2OContext: The property 'spark.scheduler.minRegisteredResourcesRatio' is not specified!
We recommend to pass `--conf spark.scheduler.minRegisteredResourcesRatio=1`
java.lang.NoSuchFieldException: classServer
        at java.lang.Class.getDeclaredField(Class.java:2070)
        at org.apache.spark.repl.h2o.H2OIMain.stopClassServer(H2OIMain.scala:74)
        at org.apache.spark.repl.h2o.H2OIMain.<init>(H2OIMain.scala:41)
        at org.apache.spark.repl.h2o.H2OIMain$.createInterpreter(H2OIMain.scala:180)
        at org.apache.spark.repl.h2o.H2OInterpreter.createInterpreter(H2OInterpreter.scala:151)
        at org.apache.spark.repl.h2o.H2OInterpreter.initializeInterpreter(H2OInterpreter.scala:105)
        at org.apache.spark.repl.h2o.H2OInterpreter.<init>(H2OInterpreter.scala:328)
        at water.api.scalaInt.ScalaCodeHandler.createInterpreterInPool(ScalaCodeHandler.scala:100)
        at water.api.scalaInt.ScalaCodeHandler$$anonfun$initializeInterpeterPool$1.apply(ScalaCodeHandler.scala:94)
        at water.api.scalaInt.ScalaCodeHandler$$anonfun$initializeInterpeterPool$1.apply(ScalaCodeHandler.scala:93)
        at scala.collection.immutable.Range.foreach(Range.scala:141)
        at water.api.scalaInt.ScalaCodeHandler.initializeInterpeterPool(ScalaCodeHandler.scala:93)
        at water.api.scalaInt.ScalaCodeHandler.<init>(ScalaCodeHandler.scala:37)
        at org.apache.spark.h2o.H2OContext$.registerScalaIntEndp(H2OContext.scala:830)
        at org.apache.spark.h2o.H2OContext$.registerClientWebAPI(H2OContext.scala:750)
        at org.apache.spark.h2o.H2OContext.start(H2OContext.scala:225)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:35)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:37)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:39)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:41)
        at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:43)
        at $iwC$$iwC$$iwC$$iwC.<init>(<console>:45)
        at $iwC$$iwC$$iwC.<init>(<console>:47)
        at $iwC$$iwC.<init>(<console>:49)
        at $iwC.<init>(<console>:51)
        at <init>(<console>:53)
        at .<init>(<console>:57)
        at .<clinit>(<console>)
        at .<init>(<console>:7)
        at .<clinit>(<console>)
        at $print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1045)
        at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1326)
        at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:821)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:852)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:800)
        at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
        at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
        at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
        at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
        at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
        at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
        at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1064)
        at org.apache.spark.repl.Main$.main(Main.scala:31)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Any ideas?

Thanks,
Eugene.

mat...@0xdata.com

Jul 19, 2016, 11:16:10 AM
to H2O Open Source Scalable Machine Learning - h2ostream
Hey Eugene,

I see you are using the Spark 1.6.1 build bundled with CDH 5.7.1. From what I remember, it differs a bit from the official Spark 1.6.1 distribution and includes some code that was committed for Spark 2.0, which Sparkling Water 1.6.x does not yet officially support.

Could you install the official Spark 1.6.1 on your CDH cluster and use that instead?
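
For what it's worth, a quick way to confirm which Spark build the shell actually picked up is to check the version string from the REPL; the CDH parcel builds carry a "-cdh" suffix, while an official distribution reports a plain version such as "1.6.1":

scala> sc.version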

Regards,
Mateusz

euge...@gmail.com

Jul 19, 2016, 5:10:29 PM
to H2O Open Source Scalable Machine Learning - h2ostream, mat...@0xdata.com
Hi Mateusz,

Thank you for your response. I tried it on an older cluster (CDH 5.6.0) with Sparkling Water 1.5.16, and it seems to work there.

euge...@gmail.com

Jul 19, 2016, 5:58:53 PM
to H2O Open Source Scalable Machine Learning - h2ostream, mat...@0xdata.com
It is working on a CDH 5.6.0 cluster with sparkling-shell (1.5.16). However, in my Scala code the same line errors out (val h2oContext = new H2OContext(sc).start()). Could it be an sbt build issue?

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 27, nrmcdhdn03.com): java.lang.AssertionError: assertion failed: There should be at least one parser provider
        at scala.Predef$.assert(Predef.scala:179)
        at org.apache.spark.h2o.H2OContextUtils$$anonfun$8.apply(H2OContextUtils.scala:111)
        at org.apache.spark.h2o.H2OContextUtils$$anonfun$8.apply(H2OContextUtils.scala:104)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
        at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
        at scala.collection.AbstractIterator.to(Iterator.scala:1157)
        at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
        at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
        at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
        at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
        at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:905)
        at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:905)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1850)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1850)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:88)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

Thanks,
Eugene

mat...@0xdata.com

Jul 20, 2016, 5:16:30 AM
to H2O Open Source Scalable Machine Learning - h2ostream, mat...@0xdata.com
Hey Eugene,

What do you mean by "in my Scala code"? Did you build a fat jar and run it with spark-submit?

Mateusz

euge...@gmail.com

Jul 20, 2016, 12:34:39 PM
to H2O Open Source Scalable Machine Learning - h2ostream, mat...@0xdata.com
Hi Mateusz,

Yes, it is a Scala program that uses Sparkling Water; I build the executable jar with sbt and then run it with spark-submit. Here is the Scala code:
package main.scala

import java.awt.image._
import java.io.File
import javax.imageio.ImageIO
import org.apache.spark.h2o.{H2OContext, H2OFrame, _}
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StringType, StructField, StructType}
import org.apache.spark.{SparkConf, SparkContext}
import scala.collection.mutable.ArrayBuffer

object SparkDL {

    def main(args: Array[String]) {
        val conf = new SparkConf().setAppName("SparkDL")
        val sc = new SparkContext(conf)
        //val h2oContext = H2OContext.getOrCreate(sc, 4)

        // This is the line that errors out under spark-submit (see the
        // "There should be at least one parser provider" trace above).
        val h2oContext = new H2OContext(sc).start()

        import h2oContext._
        import h2oContext.implicits._

        print("\n\n Done h2oContext \n\n")

        h2oContext.stop()
        sc.stop()
    }
}
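
The H2OContext startup warning earlier in the thread recommends passing --conf spark.scheduler.minRegisteredResourcesRatio=1. In a driver program like this one, the property can also be set directly on the SparkConf; a minimal sketch, changing nothing else:

val conf = new SparkConf()
    .setAppName("SparkDL")
    // Recommended by the H2OContext warning: make the scheduler wait for
    // all executors to register, so H2O can start a node on each of them.
    .set("spark.scheduler.minRegisteredResourcesRatio", "1")
val sc = new SparkContext(conf)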

Thanks,
Eugene.

euge...@gmail.com

Jul 20, 2016, 4:16:16 PM
to H2O Open Source Scalable Machine Learning - h2ostream, mat...@0xdata.com
Here is the command I ran, together with its output:

spark-submit --class h2oTest --verbose test.jar

Using properties file: /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/spark/conf/spark-defaults.conf
Adding default property: spark.serializer=org.apache.spark.serializer.KryoSerializer
Adding default property: spark.eventLog.enabled=true
Adding default property: spark.shuffle.service.enabled=true
Adding default property: spark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hadoop/lib/native
Adding default property: spark.yarn.historyServer.address=http://nrmcdhnn01:18088
Adding default property: spark.dynamicAllocation.schedulerBacklogTimeout=1
Adding default property: spark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hadoop/lib/native
Adding default property: spark.yarn.config.gatewayPath=/opt/cloudera/parcels
Adding default property: spark.yarn.config.replacementPath={{HADOOP_COMMON_HOME}}/../../..
Adding default property: spark.shuffle.service.port=7337
Adding default property: spark.master=yarn-client
Adding default property: spark.authenticate=false
Adding default property: spark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hadoop/lib/native
Adding default property: spark.eventLog.dir=hdfs://nameservice1/user/spark/applicationHistory
Adding default property: spark.dynamicAllocation.enabled=false
Adding default property: spark.dynamicAllocation.minExecutors=0
Adding default property: spark.dynamicAllocation.executorIdleTimeout=600
Adding default property: spark.yarn.jar=local:/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/spark/lib/spark-assembly.jar
Parsed arguments:
  master                  yarn-client
  deployMode              null
  executorMemory          null
  executorCores           null
  totalExecutorCores      null
  propertiesFile          /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/spark/conf/spark-defaults.conf
  driverMemory            null
  driverCores             null
  driverExtraClassPath    null
  driverExtraLibraryPath  /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hadoop/lib/native
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 null
  archives                null
  mainClass               h2oTest
  primaryResource         file:/fs01/home/dars/cronJobs/javaProgs/test.jar
  name                    h2oTest
  childArgs               []
  jars                    null
  packages                null
  packagesExclusions      null
  repositories            null
  verbose                 true

Spark properties used, including those specified through
 --conf and those from the properties file /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/spark/conf/spark-defaults.conf:
  spark.executor.extraLibraryPath -> /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hadoop/lib/native
  spark.yarn.jar -> local:/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/spark/lib/spark-assembly.jar
  spark.driver.extraLibraryPath -> /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hadoop/lib/native
  spark.authenticate -> false
  spark.yarn.historyServer.address -> http://nrmcdhnn01:18088
  spark.yarn.am.extraLibraryPath -> /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hadoop/lib/native
  spark.eventLog.enabled -> true
  spark.dynamicAllocation.schedulerBacklogTimeout -> 1
  spark.yarn.config.gatewayPath -> /opt/cloudera/parcels
  spark.serializer -> org.apache.spark.serializer.KryoSerializer
  spark.dynamicAllocation.executorIdleTimeout -> 600
  spark.dynamicAllocation.minExecutors -> 0
  spark.shuffle.service.enabled -> true
  spark.yarn.config.replacementPath -> {{HADOOP_COMMON_HOME}}/../../..
  spark.shuffle.service.port -> 7337
  spark.eventLog.dir -> hdfs://nameservice1/user/spark/applicationHistory
  spark.master -> yarn-client
  spark.dynamicAllocation.enabled -> false


Main class:
h2oTest
Arguments:

System properties:
spark.executor.extraLibraryPath -> /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hadoop/lib/native
spark.yarn.jar -> local:/opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/spark/lib/spark-assembly.jar
spark.driver.extraLibraryPath -> /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hadoop/lib/native
spark.authenticate -> false
spark.yarn.historyServer.address -> http://nrmcdhnn01:18088
spark.yarn.am.extraLibraryPath -> /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hadoop/lib/native
spark.eventLog.enabled -> true
spark.dynamicAllocation.schedulerBacklogTimeout -> 1
SPARK_SUBMIT -> true
spark.yarn.config.gatewayPath -> /opt/cloudera/parcels
spark.serializer -> org.apache.spark.serializer.KryoSerializer
spark.shuffle.service.enabled -> true
spark.dynamicAllocation.minExecutors -> 0
spark.dynamicAllocation.executorIdleTimeout -> 600
spark.app.name -> h2oTest
spark.yarn.config.replacementPath -> {{HADOOP_COMMON_HOME}}/../../..
spark.jars -> file:/fs01/home/dars/cronJobs/javaProgs/test.jar
spark.submit.deployMode -> client
spark.shuffle.service.port -> 7337
spark.eventLog.dir -> hdfs://nameservice1/user/spark/applicationHistory
spark.master -> yarn-client
spark.dynamicAllocation.enabled -> false
Classpath elements:
file:/fs01/home/dars/cronJobs/javaProgs/test.jar


16/07/20 15:08:20 INFO SparkContext: Running Spark version 1.5.0-cdh5.6.0
16/07/20 15:08:21 INFO SecurityManager: Changing view acls to: dars
16/07/20 15:08:21 INFO SecurityManager: Changing modify acls to: dars
16/07/20 15:08:21 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(dars); users with modify permissions: Set(dars)
16/07/20 15:08:22 INFO Slf4jLogger: Slf4jLogger started
16/07/20 15:08:22 INFO Remoting: Starting remoting
16/07/20 15:08:22 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark...@10.1.22.228:45638]
16/07/20 15:08:22 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark...@10.1.22.228:45638]
16/07/20 15:08:22 INFO Utils: Successfully started service 'sparkDriver' on port 45638.
16/07/20 15:08:22 INFO SparkEnv: Registering MapOutputTracker
16/07/20 15:08:22 INFO SparkEnv: Registering BlockManagerMaster
16/07/20 15:08:22 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-8a34191f-41c7-4a18-bd81-6848e7d5ec79
16/07/20 15:08:22 INFO MemoryStore: MemoryStore started with capacity 530.3 MB
16/07/20 15:08:22 INFO HttpFileServer: HTTP File server directory is /tmp/spark-cae4fa8e-4095-4be9-af59-dee38f4d6963/httpd-00a75912-2003-429b-9e2a-b84b89181a68
16/07/20 15:08:22 INFO HttpServer: Starting HTTP Server
16/07/20 15:08:22 INFO Utils: Successfully started service 'HTTP file server' on port 57168.
16/07/20 15:08:22 INFO SparkEnv: Registering OutputCommitCoordinator
16/07/20 15:08:22 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/07/20 15:08:22 INFO SparkUI: Started SparkUI at http://10.1.22.228:4040
16/07/20 15:08:23 INFO SparkContext: Added JAR file:/fs01/home/dars/cronJobs/javaProgs/test.jar at http://10.1.22.228:57168/jars/test.jar with timestamp 1469045303252
16/07/20 15:08:23 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
16/07/20 15:08:23 INFO RMProxy: Connecting to ResourceManager at nrmcdhnn01/10.1.22.228:8032
16/07/20 15:08:23 INFO Client: Requesting a new application from cluster with 5 NodeManagers
16/07/20 15:08:23 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (20480 MB per container)
16/07/20 15:08:23 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/07/20 15:08:23 INFO Client: Setting up container launch context for our AM
16/07/20 15:08:23 INFO Client: Setting up the launch environment for our AM container
16/07/20 15:08:23 INFO Client: Preparing resources for our AM container
16/07/20 15:08:24 INFO Client: Uploading resource file:/tmp/spark-cae4fa8e-4095-4be9-af59-dee38f4d6963/__spark_conf__8616009882475768276.zip -> hdfs://nameservice1/user/dars/.sparkStaging/application_1467990577394_1659/__spark_conf__8616009882475768276.zip
16/07/20 15:08:24 INFO SecurityManager: Changing view acls to: dars
16/07/20 15:08:24 INFO SecurityManager: Changing modify acls to: dars
16/07/20 15:08:24 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(dars); users with modify permissions: Set(dars)
16/07/20 15:08:24 INFO Client: Submitting application 1659 to ResourceManager
16/07/20 15:08:25 INFO YarnClientImpl: Submitted application application_1467990577394_1659
16/07/20 15:08:26 INFO Client: Application report for application_1467990577394_1659 (state: ACCEPTED)
16/07/20 15:08:26 INFO Client:
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: root.dars
         start time: 1469045305002
         final status: UNDEFINED
         tracking URL: http://nrmcdhnn01:8088/proxy/application_1467990577394_1659/
         user: dars
16/07/20 15:08:27 INFO Client: Application report for application_1467990577394_1659 (state: ACCEPTED)
16/07/20 15:08:28 INFO Client: Application report for application_1467990577394_1659 (state: ACCEPTED)
16/07/20 15:08:29 INFO Client: Application report for application_1467990577394_1659 (state: ACCEPTED)
16/07/20 15:08:29 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka.tcp://spark...@10.1.22.232:50247/user/YarnAM#-1010418489])
16/07/20 15:08:29 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> nrmcdhnn01, PROXY_URI_BASES -> http://nrmcdhnn01:8088/proxy/application_1467990577394_1659), /proxy/application_1467990577394_1659
16/07/20 15:08:29 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/07/20 15:08:30 INFO Client: Application report for application_1467990577394_1659 (state: ACCEPTED)
16/07/20 15:08:31 INFO Client: Application report for application_1467990577394_1659 (state: RUNNING)
16/07/20 15:08:31 INFO Client:
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: 10.1.22.232
         ApplicationMaster RPC port: 0
         queue: root.dars
         start time: 1469045305002
         final status: UNDEFINED
         tracking URL: http://nrmcdhnn01:8088/proxy/application_1467990577394_1659/
         user: dars
16/07/20 15:08:31 INFO YarnClientSchedulerBackend: Application application_1467990577394_1659 has started running.
16/07/20 15:08:31 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 35563.
16/07/20 15:08:31 INFO NettyBlockTransferService: Server created on 35563
16/07/20 15:08:31 INFO BlockManager: external shuffle service port = 7337
16/07/20 15:08:31 INFO BlockManagerMaster: Trying to register BlockManager
16/07/20 15:08:31 INFO BlockManagerMasterEndpoint: Registering block manager 10.1.22.228:35563 with 530.3 MB RAM, BlockManagerId(driver, 10.1.22.228, 35563)
16/07/20 15:08:31 INFO BlockManagerMaster: Registered BlockManager
16/07/20 15:08:31 INFO EventLoggingListener: Logging events to hdfs://nameservice1/user/spark/applicationHistory/application_1467990577394_1659
16/07/20 15:08:34 INFO YarnClientSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@nrmcdhdn02:44315/user/Executor#-269944410]) with ID 2
16/07/20 15:08:35 INFO BlockManagerMasterEndpoint: Registering block manager nrmcdhdn02:56371 with 530.3 MB RAM, BlockManagerId(2, nrmcdhdn02, 56371)
16/07/20 15:08:35 INFO YarnClientSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@nrmcdhdn03:45009/user/Executor#551084165]) with ID 1
16/07/20 15:08:35 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
16/07/20 15:08:35 WARN H2OContext: Increasing 'spark.locality.wait' to value 30000
16/07/20 15:08:35 WARN H2OContext: The property 'spark.scheduler.minRegisteredResourcesRatio' is not specified!

We recommend to pass `--conf spark.scheduler.minRegisteredResourcesRatio=1`
16/07/20 15:08:35 INFO H2OContext: Starting H2O services: Sparkling Water configuration:
  workers        : None
  cloudName      : sparkling-water-dars_-1486143172
  flatfile       : true
  clientBasePort : 54321
  nodeBasePort   : 54321
  cloudTimeout   : 60000
  h2oNodeLog     : INFO
  h2oClientLog   : WARN
  nthreads       : -1
  drddMulFactor  : 10
16/07/20 15:08:35 INFO BlockManagerMasterEndpoint: Registering block manager nrmcdhdn03:36657 with 530.3 MB RAM, BlockManagerId(1, nrmcdhdn03, 36657)
16/07/20 15:08:35 INFO SparkContext: Starting job: collect at SpreadRDDBuilder.scala:103
16/07/20 15:08:35 INFO DAGScheduler: Got job 0 (collect at SpreadRDDBuilder.scala:103) with 21 output partitions
16/07/20 15:08:35 INFO DAGScheduler: Final stage: ResultStage 0(collect at SpreadRDDBuilder.scala:103)
16/07/20 15:08:35 INFO DAGScheduler: Parents of final stage: List()
16/07/20 15:08:35 INFO DAGScheduler: Missing parents: List()
16/07/20 15:08:35 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at mapPartitionsWithIndex at SpreadRDDBuilder.scala:99), which has no missing parents
16/07/20 15:08:36 INFO MemoryStore: ensureFreeSpace(2160) called with curMem=0, maxMem=556038881
16/07/20 15:08:36 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 2.1 KB, free 530.3 MB)
16/07/20 15:08:36 INFO MemoryStore: ensureFreeSpace(1356) called with curMem=2160, maxMem=556038881
16/07/20 15:08:36 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1356.0 B, free 530.3 MB)
16/07/20 15:08:36 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.1.22.228:35563 (size: 1356.0 B, free: 530.3 MB)
16/07/20 15:08:36 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:861
16/07/20 15:08:36 INFO DAGScheduler: Submitting 21 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at mapPartitionsWithIndex at SpreadRDDBuilder.scala:99)
16/07/20 15:08:36 INFO YarnScheduler: Adding task set 0.0 with 21 tasks
16/07/20 15:08:36 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, nrmcdhdn02, partition 0,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:36 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, nrmcdhdn03, partition 1,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:40 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on nrmcdhdn02:56371 (size: 1356.0 B, free: 530.3 MB)
16/07/20 15:08:40 INFO BlockManagerInfo: Added rdd_0_0 in memory on nrmcdhdn02:56371 (size: 16.0 B, free: 530.3 MB)
16/07/20 15:08:40 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, nrmcdhdn02, partition 2,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:40 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 4487 ms on nrmcdhdn02 (1/21)
16/07/20 15:08:40 INFO BlockManagerInfo: Added rdd_0_2 in memory on nrmcdhdn02:56371 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:40 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, nrmcdhdn02, partition 3,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:40 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 88 ms on nrmcdhdn02 (2/21)
16/07/20 15:08:40 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on nrmcdhdn03:36657 (size: 1356.0 B, free: 530.3 MB)
16/07/20 15:08:40 INFO BlockManagerInfo: Added rdd_0_3 in memory on nrmcdhdn02:56371 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:40 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, nrmcdhdn02, partition 4,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:40 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 98 ms on nrmcdhdn02 (3/21)
16/07/20 15:08:40 INFO BlockManagerInfo: Added rdd_0_4 in memory on nrmcdhdn02:56371 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:40 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, nrmcdhdn02, partition 5,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:40 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 4) in 108 ms on nrmcdhdn02 (4/21)
16/07/20 15:08:40 INFO BlockManagerInfo: Added rdd_0_5 in memory on nrmcdhdn02:56371 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, nrmcdhdn02, partition 6,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_1 in memory on nrmcdhdn03:36657 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 5) in 112 ms on nrmcdhdn02 (5/21)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, nrmcdhdn03, partition 7,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_6 in memory on nrmcdhdn02:56371 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 4794 ms on nrmcdhdn03 (6/21)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_7 in memory on nrmcdhdn03:36657 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, nrmcdhdn02, partition 8,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, nrmcdhdn03, partition 9,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 6.0 in stage 0.0 (TID 6) in 114 ms on nrmcdhdn02 (7/21)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 7.0 in stage 0.0 (TID 7) in 82 ms on nrmcdhdn03 (8/21)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_8 in memory on nrmcdhdn02:56371 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_9 in memory on nrmcdhdn03:36657 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 10.0 in stage 0.0 (TID 10, nrmcdhdn02, partition 10,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 11.0 in stage 0.0 (TID 11, nrmcdhdn03, partition 11,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 8.0 in stage 0.0 (TID 8) in 115 ms on nrmcdhdn02 (9/21)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 9.0 in stage 0.0 (TID 9) in 110 ms on nrmcdhdn03 (10/21)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_10 in memory on nrmcdhdn02:56371 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_11 in memory on nrmcdhdn03:36657 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 12.0 in stage 0.0 (TID 12, nrmcdhdn02, partition 12,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 13.0 in stage 0.0 (TID 13, nrmcdhdn03, partition 13,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 10.0 in stage 0.0 (TID 10) in 124 ms on nrmcdhdn02 (11/21)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 11.0 in stage 0.0 (TID 11) in 120 ms on nrmcdhdn03 (12/21)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_12 in memory on nrmcdhdn02:56371 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_13 in memory on nrmcdhdn03:36657 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 14.0 in stage 0.0 (TID 14, nrmcdhdn02, partition 14,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 15.0 in stage 0.0 (TID 15, nrmcdhdn03, partition 15,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 12.0 in stage 0.0 (TID 12) in 120 ms on nrmcdhdn02 (13/21)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 13.0 in stage 0.0 (TID 13) in 115 ms on nrmcdhdn03 (14/21)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_14 in memory on nrmcdhdn02:56371 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_15 in memory on nrmcdhdn03:36657 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 16.0 in stage 0.0 (TID 16, nrmcdhdn02, partition 16,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 17.0 in stage 0.0 (TID 17, nrmcdhdn03, partition 17,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 14.0 in stage 0.0 (TID 14) in 106 ms on nrmcdhdn02 (15/21)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 15.0 in stage 0.0 (TID 15) in 97 ms on nrmcdhdn03 (16/21)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_16 in memory on nrmcdhdn02:56371 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_17 in memory on nrmcdhdn03:36657 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 18.0 in stage 0.0 (TID 18, nrmcdhdn02, partition 18,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 16.0 in stage 0.0 (TID 16) in 91 ms on nrmcdhdn02 (17/21)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 19.0 in stage 0.0 (TID 19, nrmcdhdn03, partition 19,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_18 in memory on nrmcdhdn02:56371 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 17.0 in stage 0.0 (TID 17) in 124 ms on nrmcdhdn03 (18/21)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 20.0 in stage 0.0 (TID 20, nrmcdhdn02, partition 20,PROCESS_LOCAL, 2021 bytes)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 18.0 in stage 0.0 (TID 18) in 93 ms on nrmcdhdn02 (19/21)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_19 in memory on nrmcdhdn03:36657 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO BlockManagerInfo: Added rdd_0_20 in memory on nrmcdhdn02:56371 (size: 40.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 19.0 in stage 0.0 (TID 19) in 115 ms on nrmcdhdn03 (20/21)
16/07/20 15:08:41 INFO TaskSetManager: Finished task 20.0 in stage 0.0 (TID 20) in 101 ms on nrmcdhdn02 (21/21)
16/07/20 15:08:41 INFO DAGScheduler: ResultStage 0 (collect at SpreadRDDBuilder.scala:103) finished in 5.456 s
16/07/20 15:08:41 INFO YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/07/20 15:08:41 INFO DAGScheduler: Job 0 finished: collect at SpreadRDDBuilder.scala:103, took 5.961555 s
16/07/20 15:08:41 INFO ParallelCollectionRDD: Removing RDD 0 from persistence list
16/07/20 15:08:41 INFO BlockManager: Removing RDD 0
16/07/20 15:08:41 INFO SpreadRDDBuilder: Detected 2 spark executors for 2 H2O workers!
16/07/20 15:08:41 INFO H2OContext: Launching H2O on following 2 nodes: (1,nrmcdhdn03,-1),(2,nrmcdhdn02,-1)
16/07/20 15:08:41 INFO SparkContext: Starting job: collect at H2OContextUtils.scala:174
16/07/20 15:08:41 INFO DAGScheduler: Got job 1 (collect at H2OContextUtils.scala:174) with 2 output partitions
16/07/20 15:08:41 INFO DAGScheduler: Final stage: ResultStage 1(collect at H2OContextUtils.scala:174)
16/07/20 15:08:41 INFO DAGScheduler: Parents of final stage: List()
16/07/20 15:08:41 INFO DAGScheduler: Missing parents: List()
16/07/20 15:08:41 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[3] at map at H2OContextUtils.scala:104), which has no missing parents
16/07/20 15:08:41 INFO MemoryStore: ensureFreeSpace(2816) called with curMem=3516, maxMem=556038881
16/07/20 15:08:41 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 2.8 KB, free 530.3 MB)
16/07/20 15:08:41 INFO MemoryStore: ensureFreeSpace(1796) called with curMem=6332, maxMem=556038881
16/07/20 15:08:41 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1796.0 B, free 530.3 MB)
16/07/20 15:08:41 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 10.1.22.228:35563 (size: 1796.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:861
16/07/20 15:08:41 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 1 (MapPartitionsRDD[3] at map at H2OContextUtils.scala:104)
16/07/20 15:08:41 INFO YarnScheduler: Adding task set 1.0 with 2 tasks
16/07/20 15:08:41 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 21, nrmcdhdn03, partition 0,NODE_LOCAL, 2885 bytes)
16/07/20 15:08:41 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 22, nrmcdhdn02, partition 1,NODE_LOCAL, 2885 bytes)
16/07/20 15:08:41 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on nrmcdhdn03:36657 (size: 1796.0 B, free: 530.3 MB)
16/07/20 15:08:41 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on nrmcdhdn02:56371 (size: 1796.0 B, free: 530.3 MB)
16/07/20 15:08:42 WARN TaskSetManager: Lost task 1.0 in stage 1.0 (TID 22, nrmcdhdn02): java.lang.AssertionError: assertion failed: There should be at least one parser provider
16/07/20 15:08:42 INFO TaskSetManager: Lost task 0.0 in stage 1.0 (TID 21) on executor nrmcdhdn03: java.lang.AssertionError (assertion failed: There should be at least one parser provider) [duplicate 1]
16/07/20 15:08:42 INFO TaskSetManager: Starting task 0.1 in stage 1.0 (TID 23, nrmcdhdn03, partition 0,NODE_LOCAL, 2885 bytes)
16/07/20 15:08:42 INFO TaskSetManager: Starting task 1.1 in stage 1.0 (TID 24, nrmcdhdn02, partition 1,NODE_LOCAL, 2885 bytes)
16/07/20 15:08:42 INFO TaskSetManager: Lost task 0.1 in stage 1.0 (TID 23) on executor nrmcdhdn03: java.lang.AssertionError (assertion failed: There should be at least one parser provider) [duplicate 2]
16/07/20 15:08:42 INFO TaskSetManager: Starting task 0.2 in stage 1.0 (TID 25, nrmcdhdn03, partition 0,NODE_LOCAL, 2885 bytes)
16/07/20 15:08:42 INFO TaskSetManager: Lost task 1.1 in stage 1.0 (TID 24) on executor nrmcdhdn02: java.lang.AssertionError (assertion failed: There should be at least one parser provider) [duplicate 3]
16/07/20 15:08:42 INFO TaskSetManager: Starting task 1.2 in stage 1.0 (TID 26, nrmcdhdn02, partition 1,NODE_LOCAL, 2885 bytes)
16/07/20 15:08:42 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 10.1.22.228:35563 in memory (size: 1356.0 B, free: 530.3 MB)
16/07/20 15:08:42 INFO BlockManagerInfo: Removed broadcast_0_piece0 on nrmcdhdn02:56371 in memory (size: 1356.0 B, free: 530.3 MB)
16/07/20 15:08:42 INFO TaskSetManager: Lost task 0.2 in stage 1.0 (TID 25) on executor nrmcdhdn03: java.lang.AssertionError (assertion failed: There should be at least one parser provider) [duplicate 4]
16/07/20 15:08:42 INFO BlockManagerInfo: Removed broadcast_0_piece0 on nrmcdhdn03:36657 in memory (size: 1356.0 B, free: 530.3 MB)
16/07/20 15:08:42 INFO TaskSetManager: Starting task 0.3 in stage 1.0 (TID 27, nrmcdhdn03, partition 0,NODE_LOCAL, 2885 bytes)
16/07/20 15:08:42 INFO TaskSetManager: Lost task 1.2 in stage 1.0 (TID 26) on executor nrmcdhdn02: java.lang.AssertionError (assertion failed: There should be at least one parser provider) [duplicate 5]
16/07/20 15:08:42 INFO TaskSetManager: Starting task 1.3 in stage 1.0 (TID 28, nrmcdhdn02, partition 1,NODE_LOCAL, 2885 bytes)
16/07/20 15:08:42 INFO ContextCleaner: Cleaned accumulator 1
16/07/20 15:08:42 INFO BlockManager: Removing RDD 0
16/07/20 15:08:42 INFO ContextCleaner: Cleaned RDD 0
16/07/20 15:08:42 INFO TaskSetManager: Lost task 1.3 in stage 1.0 (TID 28) on executor nrmcdhdn02: java.lang.AssertionError (assertion failed: There should be at least one parser provider) [duplicate 6]
16/07/20 15:08:42 ERROR TaskSetManager: Task 1 in stage 1.0 failed 4 times; aborting job
16/07/20 15:08:42 INFO TaskSetManager: Lost task 0.3 in stage 1.0 (TID 27) on executor nrmcdhdn03: java.lang.AssertionError (assertion failed: There should be at least one parser provider) [duplicate 7]
16/07/20 15:08:42 INFO YarnScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
16/07/20 15:08:42 INFO YarnScheduler: Cancelling stage 1
16/07/20 15:08:42 INFO DAGScheduler: ResultStage 1 (collect at H2OContextUtils.scala:174) failed in 0.344 s
16/07/20 15:08:42 INFO DAGScheduler: Job 1 failed: collect at H2OContextUtils.scala:174, took 0.373840 s
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 1.0 failed 4 times, most recent failure: Lost task 1.3 in stage 1.0 (TID 28, nrmcdhdn02): java.lang.AssertionError: assertion failed: There should be at least one parser provider
Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1294)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1282)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1281)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1281)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1507)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1469)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1850)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1921)
        at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:905)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
        at org.apache.spark.rdd.RDD.collect(RDD.scala:904)
        at org.apache.spark.h2o.H2OContextUtils$.startH2O(H2OContextUtils.scala:174)
        at org.apache.spark.h2o.H2OContext.start(H2OContext.scala:214)
        at h2oTest$.main(h2oTest.scala:12)
        at h2oTest.main(h2oTest.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.AssertionError: assertion failed: There should be at least one parser provider
16/07/20 15:08:42 INFO SparkContext: Invoking stop() from shutdown hook
16/07/20 15:08:42 INFO SparkUI: Stopped Spark web UI at http://10.1.22.228:4040
16/07/20 15:08:42 INFO DAGScheduler: Stopping DAGScheduler
16/07/20 15:08:42 INFO YarnClientSchedulerBackend: Shutting down all executors
16/07/20 15:08:42 INFO YarnClientSchedulerBackend: Interrupting monitor thread
16/07/20 15:08:42 INFO YarnClientSchedulerBackend: Asking each executor to shut down
16/07/20 15:08:42 INFO YarnClientSchedulerBackend: Stopped
16/07/20 15:08:42 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/07/20 15:08:42 INFO MemoryStore: MemoryStore cleared
16/07/20 15:08:42 INFO BlockManager: BlockManager stopped
16/07/20 15:08:42 INFO BlockManagerMaster: BlockManagerMaster stopped
16/07/20 15:08:42 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/07/20 15:08:42 INFO SparkContext: Successfully stopped SparkContext
16/07/20 15:08:42 INFO ShutdownHookManager: Shutdown hook called
16/07/20 15:08:42 INFO ShutdownHookManager: Deleting directory /tmp/spark-cae4fa8e-4095-4be9-af59-dee38f4d6963

mat...@0xdata.com

Jul 21, 2016, 9:36:39 AM
to H2O Open Source Scalable Machine Learning - h2ostream, mat...@0xdata.com
Hey Eugene,

I haven't seen this error before. I will try to reproduce it when I have a free moment, but I am not sure when that will be.

For the time being, could you install the official Spark distribution on your cluster instead of using the one that comes with CDH?
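
One thing worth checking on the sbt side in the meantime: the "There should be at least one parser provider" assertion comes from H2O's parser registry, which discovers its ParserProvider implementations through java.util.ServiceLoader, i.e. through META-INF/services registration files inside the jars. If the fat jar is built with a merge strategy that drops or flattens META-INF entries, those registration files never make it into test.jar, and the executors find no parser providers at all. A sketch of a build.sbt fragment that keeps them, assuming the sbt-assembly plugin is used (the catch-all cases are only illustrative defaults):

// Concatenate ServiceLoader registration files across dependency jars;
// discarding them makes ServiceLoader-based lookups come up empty.
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", "services", xs @ _*) => MergeStrategy.concat
  case PathList("META-INF", xs @ _*)             => MergeStrategy.discard
  case x                                         => MergeStrategy.first
}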

Regards,
Mateusz
