How to run a Spark program on spark-notebook


mhmd....@gmail.com

Jul 8, 2015, 12:20:30 PM
to spark-not...@googlegroups.com
Hi,

I am excited to find this amazing project: we can run a Spark program directly without any setup or configuration :-D
In fact, I recently developed two programs: a Spark program and a Spark Streaming one.
I would like to know if it is possible to integrate and run my programs on spark-notebook?

Sincerely,
Mhmd

andy petrella

Jul 8, 2015, 11:57:37 PM
to mhmd....@gmail.com, spark-not...@googlegroups.com

Hello!

Glad you discovered us and that you like the tool 😄
Sure you can run your programs, because you can inject jars into a notebook. To run them there, they will need to take the SparkContext as an argument, since it's provided by the notebook itself.
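
A minimal sketch of that pattern, with hypothetical names (MyJob, inputFile); the notebook exposes the context to cells as sparkContext:

import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

object MyJob {
  // The job receives the notebook's SparkContext instead of creating its own.
  def run(sc: SparkContext, inputFile: String): Long = {
    val lines: RDD[String] = sc.textFile(inputFile)
    lines.count()  // e.g. return the number of lines
  }
}

// In a notebook cell, pass the notebook-provided context:
// MyJob.run(sparkContext, "/path/to/input.txt")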

Hope it answers your question 😅
Enjoy

Cheers
Andy



mhmd....@gmail.com

Jul 9, 2015, 6:49:15 AM
to spark-not...@googlegroups.com, mhmd....@gmail.com
Thanks for replying :-)
But I still don't know how to inject my application's jar into the notebook and run it. Could you give me the steps, or some URL explaining that?

Cheers
Mhmd

andy petrella

Jul 9, 2015, 6:54:59 AM
to mhmd....@gmail.com, spark-not...@googlegroups.com
Sure, here is some info; come to gitter if it's not enough, it will be easier to help you there:
https://github.com/andypetrella/spark-notebook/#import-download-dependencies
This is another way.

Just one note: these will put your deps in the spark.jars conf, so the whole lot will be sent to the cluster, but that's generally not an issue.

I recommend you use the configuration for that (application.conf, or the cluster/notebook metadata) rather than `:dp` or `:cp`.
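
For the notebook-metadata route, the relevant keys in a notebook's JSON metadata look roughly like this (key names as in the README linked above; the coordinates are illustrative):

"customLocalRepo": "/path/to/repo",
"customDeps": [
  "com.example % my-app % 1.0.0"
]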

hth

cheers
andy




Mohammed GHESMOUNE

Jul 9, 2015, 10:57:53 AM
to andy petrella, spark-not...@googlegroups.com
Thank you for the links :-)
I added my jars to a repository /path/to/repo, then I set the local repo:
:local-repo /path/to/repo
After that, I tried to run my program by typing:
RunPgm.run(
  inputFile = "conf/test/resources/data1test.txt",
  outputDir = "target/surefire-reports",
  nbRow = 3,
  nbCol = 3,
  maxIter = 30
)
However, I got an error: not found: value RunPgm.
Please, could you tell me how to run my program?

Cheers,
Mhmd

andy petrella

Jul 9, 2015, 11:21:58 AM
to Mohammed GHESMOUNE, spark-not...@googlegroups.com

Oki,
the easiest is to use the :cp version then. Just create a cell with :cp pointing to your jar.
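
For example, with hypothetical paths and package names, one cell containing only:

:cp /path/to/my-app-assembly-1.0.jar

and then, in a following cell:

import com.example.RunPgm  // whatever package your jar actually defines
RunPgm.run(...)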

The thing is that your jar will need to have all its dependencies in it (uber jar or assembly).

To use :dp you'll need your project in an ivy2 repo (sbt publishLocal) or in a maven repo. For the former you can point the local repo at your .ivy2 folder; for the latter you'll need to add your local or remote maven repo to the remote repos.
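
A sketch of both routes (plugin version and coordinates are illustrative). For the assembly, add sbt-assembly to project/plugins.sbt, and mark Spark as "provided" so it isn't bundled, since the notebook already ships it:

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.0")

// build.sbt
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.0" % "provided"

then run sbt assembly and point :cp at the resulting jar. For the :dp route, after sbt publishLocal, something along these lines (see the README linked earlier for the exact cell syntax):

:local-repo /home/you/.ivy2/local
:dp com.example % my-app_2.10 % 1.0.0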

sorry writing on a phone now... hard

Mohammed GHESMOUNE

Jul 10, 2015, 5:36:00 AM
to andy petrella, spark-not...@googlegroups.com
Cool, it seems to be working with :cp...
But I get an error (SparkException: Only one SparkContext may be running in this JVM)!
I tried sparkContext.getConf.set("spark.driver.allowMultipleContexts", "true"), but I get the same error!
Could you tell me how to do it correctly, or how to get, in my application, the SparkContext initialized by the notebook?
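
The error usually means the application creates its own context with new SparkContext(...); note also that sparkContext.getConf returns a copy, so setting spark.driver.allowMultipleContexts on it has no effect. Two sketches of how to avoid the clash (names are illustrative): either take the notebook's sparkContext as an argument, as in the earlier sketch, or, since Spark 1.4, reuse whatever context already exists:

import org.apache.spark.{SparkConf, SparkContext}

// getOrCreate returns the already-running context if there is one,
// instead of failing with "Only one SparkContext may be running".
val sc = SparkContext.getOrCreate(new SparkConf().setAppName("my-app"))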

Thanks for your help :-)

andy petrella

Jul 10, 2015, 7:50:16 AM
to Mohammed GHESMOUNE, spark-not...@googlegroups.com
Mmmh, which version of the notebook are you using?
I think this problem was resolved in 0.5.2; otherwise I would recommend building a distro from the repo, or generating one from the website on the master branch.
:-( sorry

Mohammed Ghesm...

Jul 10, 2015, 10:46:24 AM
to andy petrella, spark-not...@googlegroups.com
I use the 0.5.2 version (spark-notebook-0.5.2-scala-2.10.4-spark-1.4.0-hadoop-2.6.0); even when I downloaded the master branch, I got the same error (Only one SparkContext may be running in this JVM) :-(

andy petrella

Jul 10, 2015, 10:48:10 AM
to Mohammed Ghesm..., spark-not...@googlegroups.com
Really, using :cp? Damn, that's weird. I'll have to look at that.

In the meantime, you can call reset after :cp in another cell.
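
Schematically, that's one cell containing only:

:cp /path/to/my-app-assembly.jar

followed by a separate cell that rebuilds the context (a zero-argument call is assumed here; reset is the helper the notebook provides):

reset()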

