How do I import implicits into a Scala worksheet?


anonygrits

Jun 7, 2016, 2:03:58 AM
to Scala IDE User
I use worksheets a lot in my data science dev workflow. One of my key reasons for using Eclipse over IntelliJ is that currently, I can use Spark in my Eclipse Scala worksheets but not in the IntelliJ ones.

The one thing that puts me back in the REPL is when I need to import sqlContext implicits (import sqlContext.implicits._). Does anyone know if there is a way to enable this with Eclipse settings? I'm using the 4.4.1 version.
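For context, the pattern in question looks roughly like this in a Spark 1.x REPL or worksheet (a sketch; the local master, app name, and column names are illustrative, and the Spark jars are assumed to be on the classpath):

```scala
// Spark 1.x pattern (sketch; assumes Spark jars are available)
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setMaster("local").setAppName("ws"))
val sqlContext = new SQLContext(sc)

// Brings toDF/toDS and the Encoder implicits into scope; note that this needs
// a stable reference, which is why sqlContext must be a val.
import sqlContext.implicits._

val df = sc.parallelize(Seq(("a", 1), ("b", 2))).toDF("name", "count")
```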

Thx!

wpopie...@virtuslab.com

Jun 7, 2016, 4:25:54 AM
to Scala IDE User
I'm not sure, but the jar containing the implicit probably needs to be visible in the worksheet's parent project. Look for spark-sql under the project's 'Referenced Libraries'. If it's not there, you need to add it to the project.

anonygrits

Jun 8, 2016, 5:06:10 AM
to Scala IDE User
Hey, thanks! I'm not sure how I missed something so obvious; I just assumed that having it in my build.sbt would take care of things. I must not have that file set up correctly.
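For reference, a minimal build.sbt declaring the spark-sql dependency might look like this (a sketch; the project name and version numbers are assumptions, not taken from this thread):

```scala
// build.sbt (sketch; adjust versions to your installation)
name := "testApp"

scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.1",
  "org.apache.spark" %% "spark-sql"  % "1.6.1"  // provides sqlContext.implicits._
)
```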

wpopie...@virtuslab.com

Jun 8, 2016, 7:13:59 AM
to Scala IDE User
Great. Just for the future: there is an sbt plugin, sbteclipse, that generates the Eclipse project from build.sbt.
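Hooking the plugin in takes one line in project/plugins.sbt, for example (the version number is an assumption):

```scala
// project/plugins.sbt (sketch; pick a current sbteclipse version)
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "4.0.0")
```

Running `sbt eclipse` then (re)generates the .classpath and .project files that Scala IDE reads.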

anonygrits

Jun 8, 2016, 8:06:00 AM
to Scala IDE User
Hmm... I use that already. Maybe I need to rerun sbt eclipse each time I change my build.sbt file? 

Usually I start off with a very basic build.sbt and then add to it as I go along. I build from the command line, and in Eclipse I do "refresh", but now that I think about it, I hadn't connected the fact that I need to run sbt eclipse again, rather than just for getting the files set up the first time.

It's kind of funny how easy it is to miss the obvious things (though probably not so funny for the folks who answer the same questions over and over again...)

Thanks!

wpopie...@virtuslab.com

Jun 8, 2016, 8:15:56 AM
to Scala IDE User
Right now, sbt eclipse should be rerun every time build.sbt is changed. The problem is that the .settings directory is also recreated, so it loses its contents. And no worries, two pairs of eyes sometimes see more than one alone. :)

anonygrits

Jun 8, 2016, 8:37:00 AM
to Scala IDE User
Cool. Thanks for explaining the why!

Thanks!

anonygrits

Sep 12, 2016, 6:21:28 AM
to Scala IDE User
Hello - sorry to bother everyone again about this, but after switching to Spark 2.0, "import spark.implicits._" (updated from "import sqlContext.implicits._") stopped working. I definitely have the spark-sql jar in my Referenced Libraries, and I've added these imports as well:

import org.apache.spark.sql
import org.apache.spark.sql.SparkSession

The code runs in spark-shell but not in my worksheet, and it's slowing down my dev process. I need the implicits to read in my data as a data set of a specific case class. Is there a way around this?

Thanks!

wpopie...@virtuslab.com

Sep 13, 2016, 4:00:09 AM
to Scala IDE User
Hi,
I'm looking into it, but in the meantime, could you import the implicit that's used and apply it explicitly, just to check whether it is visible on the worksheet classpath?
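Applying the implicit explicitly could look like this (a sketch based on the code later in this thread; Encoders.product is what import spark.implicits._ would otherwise derive for a case class):

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

case class test(name: String)

val spark = SparkSession.builder().master("local").appName("testApp").getOrCreate()

// Build the encoder by hand instead of relying on implicit resolution
val enc: Encoder[test] = Encoders.product[test]

val testDS = spark.read.format("csv")
  .option("header", "true")
  .load("/test/path/data.csv")
  .as[test](enc)  // passes the encoder explicitly, bypassing the implicit lookup
```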
W.

iulian dragos

Sep 13, 2016, 7:51:25 AM
to scala-i...@googlegroups.com
What exactly is happening? Can you share your code?

--
You received this message because you are subscribed to the Google Groups "Scala IDE User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-ide-user+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/scala-ide-user/6ff3d830-3720-4ded-b580-beb364aa92be%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
« Je déteste la montagne, ça cache le paysage »
Alphonse Allais

anonygrits

Sep 14, 2016, 3:02:01 AM
to Scala IDE User
Hello - how do I check the worksheet classpath?

anonygrits

Sep 14, 2016, 3:06:23 AM
to Scala IDE User
Here are the basic parts of it (runs in shell but not in worksheet):

import org.apache.spark.sql
import org.apache.spark.sql.SparkSession

object testApp {
  val spark = SparkSession
    .builder()
    .master("local")
    .appName("testApp")
    .getOrCreate()

  import spark.implicits._

  val file = "/test/path/data.csv"

  case class test(
    name: String
  )

  val testDS = spark.read.format("csv")
    .option("header", "true")
    .load(file)
    .as[test]
}

anonygrits

Sep 14, 2016, 5:29:02 AM
to Scala IDE User
Never mind... I'm more awake now. I already have the spark-sql jar for Spark 2.0 included in the Referenced Libraries list. That's what you mean, right?

If you have any other ideas, can you let me know? Thx!

wpopie...@virtuslab.com

Sep 16, 2016, 6:31:03 AM
to Scala IDE User
Bad news, anonygrits:

it looks like the problem is not in the implicits but in code execution. The worksheet is a little handicapped compared to a Scala application: when execution of the code fails, you don't get any information about it anywhere, such as a console or a log. My advice is to turn your worksheet code into a Scala application and look at the console output.
Here is an example worksheet.sc:


import org.apache.spark.sql
import org.apache.spark.sql.SparkSession

object testApp {
  val spark = SparkSession
     .builder()
     .master("local")
     .appName("testApp")
     .getOrCreate()                               //> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.propert
                                                  //| ies
                                                  //| 16/09/16 12:06:35 INFO SparkContext: Running Spark version 2.0.0
                                                  //| 16/09/16 12:06:35 WARN NativeCodeLoader: Unable to load native-hadoop librar
                                                  //| y for your platform... using builtin-java classes where applicable
                                                  //| 16/09/16 12:06:35 INFO SecurityManager: Changing view acls to: wpopielarski
                                                  //|
                                                  //| 16/09/16 12:06:35 INFO SecurityManager: Changing modify acls to: wpopielarsk
                                                  //| i
                                                  //| 16/09/16 12:06:35 INFO SecurityManager: Changing view acls groups to:
                                                  //| 16/09/16 12:06:35 INFO SecurityManager: Changing modify acls groups to:
                                                  //| 16/09/16 12:06:35 INFO SecurityManager: SecurityManager: authentication disa
                                                  //| bled; ui acls disabled; users  with view permissions: Set(wpopielarski); gro
                                                  //| ups with view permissions: Set(); users  with modify permissions: Set(wpopie
                                                  //| larski); groups with modify permissions: Set()
                                                  //| 16/09/16 12:06:36 INFO Utils: Successfully started service '
                                                  //| Output exceeds cutoff limit.

  import spark.implicits._

  val file = "C:\\Users\\wpopielarski\\work\\tools\\runtime-equinox-weaving\\ProjectD\\test\\path\\data.csv"
                                                  //> file  : String = C:\Users\wpopielarski\work\tools\runtime-equinox-weaving\Pr
                                                  //| ojectD\test\path\data.csv

  case class test (
    name: String
  )

  def testDS = spark.read.format("csv")
    .load(file)
    .as[String]                                   //> testDS: => org.apache.spark.sql.Dataset[String]
}

and here is its counterpart as a regular Scala application:
package acme

import org.apache.spark.sql
import org.apache.spark.sql.SparkSession

object testApp extends App {

  val spark = SparkSession
     .builder()
     .master("local")
     .appName("testApp")
     .getOrCreate()

  import spark.implicits._

  val file = "C:\\Users\\wpopielarski\\work\\tools\\runtime-equinox-weaving\\ProjectD\\test\\path\\data.csv"

  case class test (
    name: String
  )

  def foo = spark.read.format("csv")
    .load(file)
    .as[String]
}




wpopie...@virtuslab.com

Sep 16, 2016, 6:34:27 AM
to Scala IDE User

Don't forget to extend your main object with App:
object testApp extends App { ... }

anonygrits

Sep 18, 2016, 1:38:29 PM
to Scala IDE User
Thanks! I'm not sure exactly what you mean, but maybe I need to extend App even in the worksheet? I didn't have to do that in earlier versions of Spark. Somehow you got it to work without extending App though, so maybe you mean something else?

Everything else runs just fine in the console, which is why I can't figure out why it won't run in the worksheet.

iulian dragos

Sep 18, 2016, 2:56:12 PM
to scala-i...@googlegroups.com
But what are the actual errors that you're seeing?


anonygrits

Sep 18, 2016, 3:37:02 PM
to Scala IDE User
The error states that I need to import the Spark session implicits for the encoder to work for Datasets. I've imported them, but my worksheet still errors on my case class. I'm not sure how the worksheet example above was run so that it appears to work, but it doesn't work for me.

The other advice above about turning it into a spark App and looking at the console is fine, but it doesn't really work for my data science workflow. I like using the worksheets because I can see the data transformations step by step. If I have to use the app, then there's not much of an advantage for using Eclipse over IntelliJ or just using the spark-shell. The worksheets are the only reason I've stayed with Eclipse for so long...

Thanks to everyone for their inputs. I need to keep moving forward on my project, so I'm going to see if I can get worksheets up & running in IntelliJ.

iulian dragos

Sep 18, 2016, 5:56:52 PM
to scala-i...@googlegroups.com
I think it would be much easier for everyone if you just copy-pasted the error and your code.


anonygrits

Sep 18, 2016, 6:16:04 PM
to Scala IDE User
The code is shown in one of the prior messages, and the error cannot be copy-pasted, since that's not the way (my) worksheet works.

I'm not sure how to close this thread, but please consider it closed at this point since I'm exploring other options. Thanks so much!!

wpopie...@virtuslab.com

Sep 19, 2016, 2:42:49 AM
to Scala IDE User
You don't have to extend your object with App in a worksheet. But the worksheet doesn't tell you much about runtime errors, so to be sure that all your dependencies are included and visible, it's better to create a regular Scala application and check the errors there. As you can see in my example, the implicits in the worksheet work fine; moreover, if there is a compilation problem with the implicits, the worksheet reports that too. The only problem with the worksheet is runtime errors, especially ones indicating missing classes: unfortunately, no information is left behind to see the execution result.

anonygrits

Sep 19, 2016, 6:01:11 AM
to Scala IDE User
Yes, thank you.

anonygrits

Sep 19, 2016, 11:18:32 AM
to Scala IDE User
It looks like your code is not reading in the data as the case class test. Does it work for you if you change it to "as[test]"?

I'm trying to read in my data as a dataset of case classes.

wpopie...@virtuslab.com

Sep 20, 2016, 2:24:59 AM
to Scala IDE User
You're right, 'def foo' needs to be turned into 'val foo', but then I run into this:
<code>
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.IllegalStateException: unread block data
 at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(Unknown Source)
 at java.io.ObjectInputStream.readObject0(Unknown Source)
 at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
 at java.io.ObjectInputStream.readSerialData(Unknown Source)
 at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
 at java.io.ObjectInputStream.readObject0(Unknown Source)
 at java.io.ObjectInputStream.readObject(Unknown Source)
 at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
 at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:253)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 at java.lang.Thread.run(Unknown Source)
</code>
I don't understand this error, but the Spark community says it is caused by some inconsistency among Spark libs. If your example works fine, I'd appreciate it if you could share your build.sbt libraryDependencies for testing.

anonygrits

Sep 20, 2016, 5:27:01 AM
to Scala IDE User
I'm not sure what you mean by 'def foo'? Can you post your code again?

I meant that the last part of your csv import should say 'as[test]' rather than 'as[String]'.

Maybe you meant to reply to a different post?

wpopie...@virtuslab.com

Sep 21, 2016, 3:28:40 AM
to Scala IDE User
Oh sorry, it should be as[test], but then the worksheet is gone. :) Just create the class test outside the worksheet script and import it inside, like this:

in file acme/test.scala:

package acme

case class test(name: String)

and then in the worksheet:

...
val file = ...

import acme.test

val testDS = spark.read.format("csv")
  .load(file)
  .as[test]

That should be OK then.

anonygrits

Sep 21, 2016, 4:59:29 AM
to Scala IDE User
Does this run for you? Or are you just saying that it "should" work?

The reason why I ask is because I just spent a whole hour trying to make this run, and there are still encoder errors. I'm curious to see what your actual files look like and the output that you get.

wpopie...@virtuslab.com

Sep 21, 2016, 6:31:39 AM
to Scala IDE User
'Should work' means 'it works for me', so I'm not sure whether it does for others.