Running sparkling-water application on Apache Spark with --master yarn-client : ISSUE


Devesh Kandpal

Mar 6, 2016, 10:55:11 PM
to H2O Open Source Scalable Machine Learning - h2ostream
Hi,


I'm running my Sparkling Water application with: ./spark-submit --master yarn-client --class className.class pathToJar.jar

My Hadoop pseudo-distributed configuration has a ResourceManager, with a single NameNode and a DataNode.

I've tried running ./spark-shell --master yarn-client, and I get a Scala prompt without any errors.

I'm bundling all H2O-related jars as dependencies in an uber jar. I'm using Apache Spark 1.6 (tried with Spark 1.5 as well; it did not work) with Hadoop 2.6.3 and Sparkling Water 1.5.10 dependencies. I get the following exception:

Exception in thread "main" java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
        at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:104)
        at org.apache.spark.SparkContext.getExecutorStorageStatus(SparkContext.scala:1546)
        at org.apache.spark.h2o.H2OContext.numOfSparkExecutors(H2OContext.scala:150)
        at org.apache.spark.h2o.H2OContext.createSpreadRDD(H2OContext.scala:255)
        at org.apache.spark.h2o.H2OContext.start(H2OContext.scala:183)
        at amo.AMOCognitive$.main(AMOCognitive.scala:63)
        at amo.AMOCognitive.main(AMOCognitive.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)





The error I get at line 63 is where I initialize the H2OContext. Find the relevant code below:



import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.h2o._

// Configure and start the Spark context
val conf = new SparkConf()
  .setAppName("AMO Cognitive Tool")
  .set("spark.logConf", "false")
val sc = new SparkContext(conf)

implicit val sqlContext = SQLContext.getOrCreate(sc)
import sqlContext.implicits._

// Start H2O services on top of the Spark cluster (this is line 63)
val h2oContext = H2OContext.getOrCreate(sc).start()


I have even tried changing the last line above to:

  val h2oContext = new H2OContext(sc).start()


However, this did not help either.



Could someone please help me out with this?

Appreciate your time and help.



Regards,
Devesh.



Michal Malohlava

Mar 7, 2016, 1:05:26 PM
to h2os...@googlegroups.com
Hi Devesh,

The Sparkling Water 1.5.10 libraries are designed for Spark 1.5.x.
We are preparing the 1.6 release right now.

However, did you see the same error with Spark 1.5?
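In the meantime, if you stay on Spark 1.5.x, make sure the uber jar pins a matching pair of versions. Here is a minimal build.sbt sketch; the Scala binary version (2.10), the Spark patch version, and the exact artifact coordinates are my assumptions, so verify them against Maven Central:

// build.sbt - pin Spark 1.5.x together with Sparkling Water 1.5.10 (sketch)
scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  // Spark itself is provided by the cluster, so keep it out of the uber jar
  "org.apache.spark" %% "spark-core" % "1.5.2" % "provided",
  "org.apache.spark" %% "spark-sql"  % "1.5.2" % "provided",
  // Sparkling Water release line matching Spark 1.5.x
  "ai.h2o" %% "sparkling-water-core" % "1.5.10"
)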

I would recommend that you:

  • increase the available memory in the driver and the executors (options spark.driver.memory or spark.yarn.am.memory, and spark.executor.memory),
  • make the cluster homogeneous - use the same memory value for the driver and the executors,
  • increase the PermGen size if you are running on top of Java 7 (options spark.driver.extraJavaOptions or spark.yarn.am.extraJavaOptions, and spark.executor.extraJavaOptions),
  • in rare cases, it helps to increase spark.yarn.driver.memoryOverhead, spark.yarn.am.memoryOverhead, or spark.yarn.executor.memoryOverhead (see the example command after this list).
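For example, a spark-submit invocation along these lines; the 4g and 512 values are placeholders to adjust for your cluster, and the class/jar names are the ones from your original command:

./spark-submit --master yarn-client \
  --driver-memory 4g \
  --executor-memory 4g \
  --driver-java-options "-XX:MaxPermSize=384m" \
  --conf spark.executor.extraJavaOptions=-XX:MaxPermSize=384m \
  --conf spark.yarn.executor.memoryOverhead=512 \
  --class className.class pathToJar.jar

Note that in yarn-client mode the driver JVM is the spark-submit process itself, so its memory and JVM options must be passed through --driver-memory and --driver-java-options rather than --conf.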

For running Sparkling Water on top of YARN:

  • make sure that YARN provides stable containers; do not use a preemptive YARN scheduler,
  • make sure that the Spark application master has enough memory, and increase its PermGen size as well,
  • in case of a container failure, YARN should not restart the container, and the application should terminate gracefully.

Furthermore, we recommend configuring the following Spark properties to speed up and stabilize the creation of H2O services on top of the Spark cluster (a combined spark-submit example follows the list):

  • spark.locality.wait - context: all; value: 3000. The number of milliseconds to wait for a task launch on a data-local node before falling back to a less-local one. We recommend increasing it to make sure that H2O tasks are processed locally with their data.
  • spark.scheduler.minRegisteredResourcesRatio - context: all; value: 1. Make sure that Spark starts scheduling only once it sees 100% of the resources.
  • spark.task.maxFailures - context: all; value: 1. Do not retry failed tasks.
  • spark.driver.extraJavaOptions, spark.executor.extraJavaOptions, and spark.yarn.am.extraJavaOptions - context: all; value: -XX:MaxPermSize=384m. Increase the PermGen size if you are running on Java 7. Make sure to configure it on the driver, the executors, and the YARN application master.
  • spark.yarn.driver.memoryOverhead, spark.yarn.am.memoryOverhead, and spark.yarn.executor.memoryOverhead - context: yarn; value: increase as needed. Increase the memory overhead if necessary.
  • spark.yarn.max.executor.failures - context: yarn; value: 1. Do not restart executors after a failure; fail the computation directly instead.
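Put together, a sketch of passing these settings on the command line; the values simply mirror the list above and are starting points rather than tuned numbers:

./spark-submit --master yarn-client \
  --conf spark.locality.wait=3000 \
  --conf spark.scheduler.minRegisteredResourcesRatio=1 \
  --conf spark.task.maxFailures=1 \
  --conf spark.executor.extraJavaOptions=-XX:MaxPermSize=384m \
  --conf spark.yarn.am.extraJavaOptions=-XX:MaxPermSize=384m \
  --conf spark.yarn.max.executor.failures=1 \
  --class className.class pathToJar.jar

The same properties can also live in conf/spark-defaults.conf if you prefer not to repeat them on every submit.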

Michal
