Does Zeppelin work with Spark in yarn-client mode?


Jianshi Huang

Oct 9, 2014, 1:58:47 AM
to zeppelin-...@googlegroups.com
The documentation says I need to set up Spark's master URL, which I think is not available in YARN mode.

Jianshi

Jianshi Huang

Oct 9, 2014, 2:42:43 AM
to zeppelin-...@googlegroups.com
I checked the code for createSparkContext. Looks like it should be OK to use yarn-client.

kevin...@gmail.com

Oct 9, 2014, 4:44:31 AM
to zeppelin-...@googlegroups.com
Oops, it doesn't support YARN mode.

On Thursday, October 9, 2014 at 2:42:43 PM UTC+8, Jianshi Huang wrote:

Matt Narrell

Oct 9, 2014, 11:28:26 AM
to zeppelin-...@googlegroups.com
Or yarn-cluster.  This would be very nice to have working.

moon soo Lee

Oct 9, 2014, 10:48:32 PM
to zeppelin-...@googlegroups.com
I have created an issue for supporting YARN cluster mode: https://zeppelin-project.atlassian.net/browse/ZEPPELIN-172

moon soo Lee

Oct 9, 2014, 10:49:12 PM
to zeppelin-...@googlegroups.com

kevin...@gmail.com

Oct 11, 2014, 12:59:30 AM
to zeppelin-...@googlegroups.com
Hi, moon

Yarn-client is OK. You need to add the YARN dependency in the pom, replace protobuf-2.4.1.jar with the 2.5.0 version, and export SPARK_JAR.
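For example, roughly like this (a sketch only; the jar names, versions, and directories are placeholders that depend on your Spark and Zeppelin builds, and the YARN dependency itself goes into the pom before building):

    # Point SPARK_JAR at your Spark assembly so it can be shipped to YARN
    export SPARK_JAR=/path/to/spark/lib/spark-assembly-1.1.0-hadoop2.4.0.jar

    # Swap the bundled protobuf 2.4.1 jar for the 2.5.0 version
    # (the lib directory shown is an assumption)
    rm /path/to/zeppelin/lib/protobuf-java-2.4.1.jar
    cp /path/to/protobuf-java-2.5.0.jar /path/to/zeppelin/lib/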

Thanks,
Kevin.
On Friday, October 10, 2014 at 10:49:12 AM UTC+8, moon soo Lee wrote:

Jianshi Huang

Oct 11, 2014, 2:20:20 AM
to kevin...@gmail.com, zeppelin-...@googlegroups.com
Thanks Kevin,

Adding a YARN dependency and changing the protobuf version look strange to me; those should be dependencies of Spark.

Another reason to use spark-submit?

Jianshi

--
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/

troy

Nov 6, 2014, 6:17:59 PM
to zeppelin-...@googlegroups.com
Please elaborate on what you mean here by "Yarn-client is OK". Do you mean it would work if the changes you describe are made? In that case, could you please explain the changes that you tried and that work for you?

Thanks

kevin...@gmail.com

Nov 9, 2014, 8:29:51 PM
to zeppelin-...@googlegroups.com
Hi, troy

Now, if you set the SPARK_HOME environment variable, Zeppelin will use spark-submit to run, and you can set a running mode such as yarn-client just like with spark-shell.
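For example (paths are placeholders):

    # Make SPARK_HOME visible to Zeppelin before starting it,
    # e.g. in conf/zeppelin-env.sh or the shell that launches the daemon
    export SPARK_HOME=/path/to/spark

    # Zeppelin then goes through spark-submit, so yarn-client can be set
    # as the master, just as you would run: spark-shell --master yarn-client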

Thanks,
Kevin.

On Friday, November 7, 2014 at 7:17:59 AM UTC+8, troy wrote:

Ophir Cohen

Jan 26, 2015, 4:40:44 AM
to zeppelin-...@googlegroups.com
Hi Guys,
I lost you in the thread.
I'm trying to make Zeppelin work with Spark on YARN, but it does not work.
I tried with Spark 1.2 and 1.1 and it still does not work.

Few questions:
  1. Does the underlying Hadoop version matter? Should I try 2.2.0?
  2. Can Zeppelin work with Spark 1.2?
  3. What should the 'master' parameter be? 'yarn-client'? 'yarn-submit'?
  4. Should I build Zeppelin with other parameters? Should I change the protobuf version?
  5. What does it mean to set SPARK_HOME? Where? In the startup script?

Thanks,
Ophir

Ophir Cohen

Jan 26, 2015, 9:36:17 AM
to zeppelin-...@googlegroups.com
No matter what I try, I'm getting:

ERROR [2015-01-26 14:34:19,199] ({pool-2-thread-2} Job.java[run]:164) - Job failed
java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.deploy.SparkHadoopUtil$
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:228)
        at com.nflabs.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:184)
        at com.nflabs.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:118)
        at com.nflabs.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:271)
        at com.nflabs.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:81)
        at com.nflabs.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:31)
        at com.nflabs.zeppelin.interpreter.LazyOpenInterpreter.bindValue(LazyOpenInterpreter.java:67)
        at com.nflabs.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:174)
        at com.nflabs.zeppelin.scheduler.Job.run(Job.java:147)
        at com.nflabs.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:85)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

Can anybody help?

Ophir

kevin...@gmail.com

Jan 26, 2015, 9:47:55 PM
to zeppelin-...@googlegroups.com
Hi, Cohen

For your questions:
1. Does the underlying Hadoop version matter? Should I try 2.2.0?
No.
2. Can Zeppelin work with Spark 1.2?
Yes, I think it's independent of the Spark version.
3. What should the 'master' parameter be? 'yarn-client'? 'yarn-submit'?
yarn-client.
4. Should I build Zeppelin with other parameters? Should I change the protobuf version?
No.
5. What does it mean to set SPARK_HOME? Where? In the startup script?
It's an environment variable specifying your Spark directory (you must have a Spark installation).
 
Best Regards,
Kevin.
On Monday, January 26, 2015 at 5:40:44 PM UTC+8, Ophir Cohen wrote:

Ophir Cohen

Jan 26, 2015, 11:10:00 PM
to kevin...@gmail.com, zeppelin-...@googlegroups.com

Thanks for your quick reply,
I'll try again.
Just to be clear:
Where do you export SPARK_HOME? Should zeppelin-env.sh work?


Rajat Gupta

Jan 27, 2015, 3:33:42 AM
to zeppelin-...@googlegroups.com


ERROR [2015-01-26 14:34:19,199] ({pool-2-thread-2} Job.java[run]:164) - Job failed
java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.deploy.SparkHadoopUtil$
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:228)
        at com.nflabs.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:184)
        at com.nflabs.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:118)
        at com.nflabs.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:271)

I think the ^^ error happens because the classes in the spark/yarn folder are not made available by Zeppelin as of now. After compiling Zeppelin, I copied my Spark assembly jar into the spark/interpreter dir and things worked beyond this point.

Ophir Cohen

Jan 27, 2015, 4:26:29 AM
to zeppelin-...@googlegroups.com
I feel I'm almost there, though I didn't get your point: you copied the Spark assembly from where to where?
I can't find any spark/interpreter in the Zeppelin dir...

Can you elaborate?
Thanks!

Ophir Cohen

Jan 27, 2015, 4:32:57 AM
to zeppelin-...@googlegroups.com
OK, found it.
Now I get another exception:
ERROR [2015-01-27 09:29:19,884] ({Thread-152} JobProgressPoller.java[run]:37) - Can not get or update progress
com.nflabs.zeppelin.interpreter.InterpreterException: org.apache.spark.SparkException: YARN mode not available ?

Any idea?

Ophir Cohen

Jan 27, 2015, 4:51:09 AM
to zeppelin-...@googlegroups.com
BTW, the underlying error is:
Caused by: java.lang.NoSuchMethodError: org.apache.spark.scheduler.cluster.YarnSchedulerBackend.actorSystem()Lakka/actor/ActorSystem;
    at org.apache.spark.scheduler.cluster.YarnSchedulerBackend.<init>(YarnSchedulerBackend.scala:45)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.<init>(YarnClientSchedulerBackend.scala:28)

It looks like I'm missing some jar but I can't figure out which...

The full exception:
ERROR [2015-01-27 09:47:43,924] ({Thread-21} JobProgressPoller.java[run]:37) - Can not get or update progress

com.nflabs.zeppelin.interpreter.InterpreterException: org.apache.spark.SparkException: YARN mode not available ?
    at com.nflabs.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:83)
    at com.nflabs.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:31)
    at com.nflabs.zeppelin.interpreter.LazyOpenInterpreter.getProgress(LazyOpenInterpreter.java:78)
    at com.nflabs.zeppelin.notebook.Paragraph.progress(Paragraph.java:149)
    at com.nflabs.zeppelin.scheduler.JobProgressPoller.run(JobProgressPoller.java:35)
Caused by: org.apache.spark.SparkException: YARN mode not available ?
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:1597)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:310)

    at com.nflabs.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:184)
    at com.nflabs.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:118)
    at com.nflabs.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:271)
    at com.nflabs.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:81)
    ... 4 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:1594)
    ... 9 more
Caused by: java.lang.NoSuchMethodError: org.apache.spark.scheduler.cluster.YarnSchedulerBackend.actorSystem()Lakka/actor/ActorSystem;
    at org.apache.spark.scheduler.cluster.YarnSchedulerBackend.<init>(YarnSchedulerBackend.scala:45)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.<init>(YarnClientSchedulerBackend.scala:28)
    ... 14 more

Rajat Gupta

Jan 27, 2015, 5:06:25 AM
to zeppelin-...@googlegroups.com
Sorry, I meant the zeppelin/interpreter/spark directory.
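i.e. something like this (the assembly jar name is only an example from my build):

    cp $SPARK_HOME/lib/spark-assembly-1.2.0-hadoop2.4.0.jar zeppelin/interpreter/spark/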


Ophir Cohen

Jan 27, 2015, 5:20:01 AM
to zeppelin-...@googlegroups.com
Yep, found it, thanks!
Now I have something else 😳