import org.apache.spark.sql
import org.apache.spark.sql.SparkSession
The code runs in spark-shell but not in my worksheet, and it's slowing down my dev process. I need the implicits to read in my data as a data set of a specific case class. Is there a way around this?
Thanks!
--
You received this message because you are subscribed to the Google Groups "Scala IDE User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-ide-user+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/scala-ide-user/6ff3d830-3720-4ded-b580-beb364aa92be%40googlegroups.com.
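[Editor's note: one common cause of this symptom is that the worksheet wraps all code in a synthetic object, and the implicit Encoder derivation from `spark.implicits._` can fail for a case class defined inside such a nested scope. A minimal sketch of the usual workaround, with the case class moved to the top level — `Record`, `ReadAsDataset`, and the file path are made up for illustration, and in practice the case class would live in its own compiled source file rather than in the worksheet:]

```scala
package acme

// Top-level case class (not nested in the worksheet's wrapper object),
// which gives it the stable path that Spark's Encoder derivation needs.
case class Record(name: String)

object ReadAsDataset {
  import org.apache.spark.sql.SparkSession

  val spark = SparkSession
    .builder()
    .master("local")
    .appName("ReadAsDataset")
    .getOrCreate()

  import spark.implicits._ // brings Encoder[Record] derivation into scope
  import acme.Record

  // Read the CSV as a typed Dataset[Record].
  val ds = spark.read
    .option("header", "true")
    .csv("/test/path/data.csv") // made-up path
    .as[Record]
}
```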
import org.apache.spark.sql
import org.apache.spark.sql.SparkSession

object testApp {
  val spark = SparkSession
    .builder()
    .master("local")
    .appName("testApp")
    .getOrCreate()

  import spark.implicits._

  val file = "/test/path/data.csv"

  case class test(name: String)

  val testDS = spark.read.format("csv")
    .option("header", "true")
    .load(file)
    .as[test]
}
If you have any other ideas, can you let me know? Thx!
import org.apache.spark.sql
import org.apache.spark.sql.SparkSession

object testApp {
  val spark = SparkSession
    .builder()
    .master("local")
    .appName("testApp")
    .getOrCreate()
    //> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    //| 16/09/16 12:06:35 INFO SparkContext: Running Spark version 2.0.0
    //| 16/09/16 12:06:35 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    //| 16/09/16 12:06:35 INFO SecurityManager: Changing view acls to: wpopielarski
    //| 16/09/16 12:06:35 INFO SecurityManager: Changing modify acls to: wpopielarski
    //| 16/09/16 12:06:35 INFO SecurityManager: Changing view acls groups to:
    //| 16/09/16 12:06:35 INFO SecurityManager: Changing modify acls groups to:
    //| 16/09/16 12:06:35 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(wpopielarski); groups with view permissions: Set(); users with modify permissions: Set(wpopielarski); groups with modify permissions: Set()
    //| 16/09/16 12:06:36 INFO Utils: Successfully started service '
    //| Output exceeds cutoff limit.

  import spark.implicits._

  val file = "C:\\Users\\wpopielarski\\work\\tools\\runtime-equinox-weaving\\ProjectD\\test\\path\\data.csv"
    //> file  : String = C:\Users\wpopielarski\work\tools\runtime-equinox-weaving\ProjectD\test\path\data.csv

  case class test(name: String)

  def testDS = spark.read.format("csv")
    .load(file)
    .as[String] //> testDS: => org.apache.spark.sql.Dataset[String]
}
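[Editor's note: a side point on the `.as[String]` variant above — if a plain `Dataset[String]` of raw lines is all that is needed, `spark.read.textFile` returns one directly, so no `.as[...]` conversion (and no encoder derivation) is involved. A sketch; the object name and path are made up:]

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

object ReadLines {
  val spark = SparkSession
    .builder()
    .master("local")
    .appName("ReadLines")
    .getOrCreate()

  // textFile already yields a Dataset[String], one element per line,
  // so no .as[...] call and no implicit Encoder are required.
  def lines: Dataset[String] =
    spark.read.textFile("/test/path/data.csv") // made-up path
}
```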
package acme

import org.apache.spark.sql
import org.apache.spark.sql.SparkSession

object testApp extends App {
  val spark = SparkSession
    .builder()
    .master("local")
    .appName("testApp")
    .getOrCreate()

  import spark.implicits._

  val file = "C:\\Users\\wpopielarski\\work\\tools\\runtime-equinox-weaving\\ProjectD\\test\\path\\data.csv"

  case class test(name: String)

  def foo = spark.read.format("csv")
    .load(file)
    .as[String]
}
object testApp extends App { ... }
Everything else runs just fine in the console, which is why I can't figure out why it won't run in the worksheet.
Thanks! I'm not sure exactly what you mean, but maybe I need to extend App even in the worksheet? I didn't have to do that in earlier versions of Spark. Somehow you got it to work without extending App though, so maybe you mean something else?
The other advice above about turning it into a Spark app and looking at the console is fine, but it doesn't really work for my data science workflow. I like using the worksheets because I can see the data transformations step by step. If I have to use an app, then there's not much of an advantage to using Eclipse over IntelliJ or just the spark-shell. The worksheets are the only reason I've stayed with Eclipse for so long...
Thanks to everyone for their inputs. I need to keep moving forward on my project, so I'm going to see if I can get worksheets up & running in IntelliJ.
I'm not sure how to close this thread, but please consider it closed at this point since I'm exploring other options. Thanks so much!!
I'm trying to read in my data as a dataset of case classes.
package acme

case class test(name: String)
...
val file = ...
import acme.test

val testDS = spark.read.format("csv")
  .load(file)
  .as[test]
I ask because I just spent a whole hour trying to make this run, and there are still encoder errors. I'm curious to see what your actual files look like and what output you get.
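[Editor's note: one way to narrow down encoder errors like these — a debugging sketch, not a confirmed fix for the worksheet issue — is to materialize the encoder explicitly with `Encoders.product`, so a derivation failure surfaces at that line rather than inside `.as[...]`. This assumes `acme.test` is compiled at the top level of its own source file, as in the snippet above; the object name and path are made up:]

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}
import acme.test

object EncoderCheck {
  val spark = SparkSession
    .builder()
    .master("local")
    .appName("EncoderCheck")
    .getOrCreate()

  // Request the encoder explicitly: if derivation for acme.test fails,
  // the compile error points here instead of at the .as[test] call below.
  implicit val testEncoder: Encoder[test] = Encoders.product[test]

  val testDS = spark.read.format("csv")
    .option("header", "true")
    .load("/test/path/data.csv") // made-up path
    .as[test]
}
```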