What is the kernel for Apache Spark with Scala?

mauricio...@sheffield.ac.uk

Jan 21, 2017, 6:45:12 AM
to sage-cloud
Hi, 

I've seen kernels for Apache Spark with Python and SageMath, but I'd like to use SageMathCloud for Apache Spark with Scala. Is that kernel not included yet? If not, how should I run Scala code with Spark?

Thanks

Harald Schilly

Jan 21, 2017, 6:48:31 AM
to sage-cloud
Hello, we don't have such a kernel, and we'll likely discontinue the
Spark ones we do have. They aren't really working well, and there is
also no cluster behind them.

-- harald

Harald Schilly

Jan 22, 2017, 5:49:26 AM
to sage-cloud
Hello, I've looked into this, and my conclusion is that everything
regarding Spark is kind of broken. I don't know how to tame it so
that it runs inside a less-privileged project in SMC.

Regarding Scala, the idea would be to load the Spark libraries and
then start from there with the context. The problem I run into is
that it starts creating temporary files in a directory under /tmp.
That's not a good idea, because /tmp is shared between projects. I
tried to set the environment variable

_JAVA_OPTIONS=-Djava.io.tmpdir=/projects/<your project id>/tmp/

and you can check that it is set via "echo $_JAVA_OPTIONS" in a
terminal in SMC. However, this error still shows up:

10:37:52,139 |-ERROR in ch.qos.logback.core.rolling.RollingFileAppender[FILE] -
openFile(/tmp/javatmp/jupyter-scala.log,true) call failed.
java.io.FileNotFoundException: /tmp/javatmp/jupyter-scala.log (Permission denied)

in the output of

cat ~/.smc/jupyter/jupyter-notebook.log
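
By the way, from inside the Scala kernel you can also check which temp
directory the JVM actually ended up with; the _JAVA_OPTIONS setting
should be reflected in the java.io.tmpdir system property if it was
picked up:

```
// java.io.tmpdir is the standard JVM system property for the temp
// directory; -Djava.io.tmpdir=... should show up here if it was applied.
println(sys.props("java.io.tmpdir"))
```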

The code to start up Spark in a Scala kernel would be something like this:

```
import java.io.File

// Add every jar from the SMC Spark installation to the kernel's
// classpath (classpath.addPath comes from the jupyter-scala kernel).
val sparkJars = new File("/projects/spark/spark/jars/")
sparkJars.listFiles.filter(_.isFile).foreach(f => classpath.addPath(f.toString))

import org.apache.spark.{SparkConf, SparkContext}

// Create a local-mode context with a single worker thread.
val conf = new SparkConf().setAppName("local").setMaster("local[1]")
val sc = new SparkContext(conf)
sc
```
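
One more idea that I haven't been able to verify in SMC: Spark's own
scratch space can be redirected with the standard "spark.local.dir"
configuration key, which might sidestep the shared /tmp at least for
Spark itself. A rough sketch (the ~/tmp path is just an example):

```
// Sketch only: point Spark's scratch directory at a project-local
// directory instead of the shared /tmp.
import java.io.File
import org.apache.spark.{SparkConf, SparkContext}

val scratch = new File(sys.env("HOME"), "tmp")
scratch.mkdirs() // Spark expects its local dir to exist

val conf = new SparkConf()
  .setAppName("local")
  .setMaster("local[1]")
  .set("spark.local.dir", scratch.getAbsolutePath)
val sc = new SparkContext(conf)
```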

If you have any insights into this, I'd be happy if you shared them
with us :-)

-- Harald
