lazy val conf = new SparkConf().setAppName(getClass.getSimpleName)
  .setMaster(SparkMaster)
  .set("spark.cassandra.connection.host", CassandraHosts)
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrator", "com.datastax.killrweather.KillrKryoRegistrator")
  .set("spark.cleaner.ttl", SparkCleanerTtl.toString)
  .setJars(Array("/projects/radtech.io/killrweather/killrweather-app/target/scala-2.10/app_2.10-1.0.0-SNAPSHOT.jar"))
  //.setJars(SparkContext.jarOfClass(this.getClass).toSeq)

lazy val sc = new SparkContext(conf)
sc.addJar("/projects/radtech.io/spark-cassandra-connector/spark-connector/target/spark-cassandra-connector-assembly-1.3.0-SNAPSHOT.jar")
sc.addJar("/projects/radtech.io/killrweather/killrweather-core/core_2.10-1.0.0-SNAPSHOT.jar")Hi Helena,Thanks for the feedback. I have added the jar with setJars, but that still results in the same error which sort of makes sense when I think about it. Would this not have to be a fatjar, assembly added, for it to work? If I simply add the generated jar from sbt package, it will not contain any of the dependencies such as the spark-cassandra-connector, datastax-driver, ... which the executors may be dependent on; am I missing something here?I can think of ways to achieve this by adding:
.set("spark.executor.extraClassPath", "/common/file/share/here/*")
Or use addJar for each of the required jars, easy enough to do using something like the sbt native packager and then just loading all jars from the lib directory, or the appropriate ones something like:
jars = listJars(“./lib/*.jar”).map(_.filter(_.size != 0)).toSeq.flatten
if (jars ! = null) {
jars.foreach(sparkContext.addJar)
}I have the first one working, but it needs to be automated and cleaned up.
So am I missing something obvious here? Is there a easier or cleaner solution that I'm missing? I don't think there is but wanted to validate.
-Todd
.set("spark.executor.extraClassPath", "/common/file/share/here/*")
Or use addJar for each of the required jars, easy enough to do using something like the sbt native packager and then just loading all jars from the lib directory, or the appropriate ones something like:
jars = listJars(“./lib/*.jar”).map(_.filter(_.size != 0)).toSeq.flatten
if (jars ! = null) {
jars.foreach(sparkContext.addJar)
}
I have the first one working, but it needs to be automated and cleaned up.
So am I missing something obvious here? Is there a easier or cleaner solution that I'm missing? I don't think there is but wanted to validate.
-Todd
./bin/spark-shell --master spark://ubuntu:7077 \
  --driver-class-path /home/automaton/spark-cassandra-connector/spark-cassandra-connector/target/scala-2.10/spark-cassandra-connector-assembly-1.2.0-SNAPSHOT.jar \
  --conf spark.executor.extraClassPath=/home/automaton/spark-cassandra-connector/spark-cassandra-connector/target/scala-2.10/spark-cassandra-connector-assembly-1.2.0-SNAPSHOT.jar \
  --conf spark.cassandra.connection.host=127.0.0.1
Jose
Hi Todd,
Any reason why you are not building a fat jar for the Spark Cassandra connector?
Mohammed
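For reference, the usual way to produce such a fat jar is sbt-assembly. The sketch below is only illustrative: the plugin and dependency versions, project name, and merge rules are placeholders rather than the actual KillrWeather build settings.

// project/plugins.sbt (placeholder version):
//   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")

// build.sbt
name := "killrweather-app"

scalaVersion := "2.10.5"

libraryDependencies ++= Seq(
  // Spark itself is "provided": it is already on the cluster, so keep it out of the assembly.
  "org.apache.spark" %% "spark-core" % "1.2.1" % "provided",
  // The connector and its transitive dependencies (e.g. the DataStax Java driver)
  // do end up in the fat jar, which is the point.
  "com.datastax.spark" %% "spark-cassandra-connector" % "1.2.0"
)

// Resolve files that appear in more than one dependency jar.
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _                             => MergeStrategy.first
}

The single jar produced by sbt assembly can then be the one jar passed to setJars (or to spark-submit), instead of tracking each dependency jar separately.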
From: spark-conn...@lists.datastax.com [mailto:spark-conn...@lists.datastax.com]
On Behalf Of Todd Nist
Sent: Tuesday, May 26, 2015 1:58 PM
To: spark-conn...@lists.datastax.com
Subject: Re: spark.executor.extraClassPath - Values not picked up by executors
Hi Helena,
Thanks for the feedback. I have added the jar with setJars, but that still results in the same error, which sort of makes sense when I think about it. Would this not have to be a fat jar (an assembly) for it to work? If I simply add the jar generated by sbt package, it will not contain any of the dependencies, such as the spark-cassandra-connector, datastax-driver, ... that the executors may depend on; am I missing something here?
I can think of ways to achieve this by adding:
.set("spark.executor.extraClassPath", "/common/file/share/here/*")
Or use addJar for each of the required jars. That is easy enough to do using something like the sbt native packager and then loading all of the jars (or just the appropriate ones) from the lib directory, something like:
val jars = Option(new java.io.File("./lib").listFiles()).toSeq.flatten
  .filter(f => f.getName.endsWith(".jar") && f.length != 0)
  .map(_.getAbsolutePath)
jars.foreach(sparkContext.addJar)
I have the first one working, but it needs to be automated and cleaned up.
So am I missing something obvious here? Is there an easier or cleaner solution that I'm missing? I don't think there is, but I wanted to validate.
-Todd
On Tue, May 26, 2015 at 8:23 AM, Helena Edelson <helena....@datastax.com> wrote:
Hi Todd,
You may just need to add the jar to SparkConf.setJars() in the app, or it could be a misconfiguration in your build changes.
I have not pushed the latest version upgrades to KillrWeather because we have not yet released the connector version supporting spark 1.3. This is coming soon. I’ll be updating the repo before heading to talk at ScalaDays Amsterdam in 2 weeks.
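Helena's setJars() suggestion and Mohammed's fat-jar suggestion fit together; a minimal sketch, assuming the application jar is an assembly that already bundles the connector and its dependencies (the path and app name below are placeholders):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("killrweather-app")
  .set("spark.cassandra.connection.host", "127.0.0.1")
  // One assembly jar shipped to the executors instead of many individual dependency jars.
  .setJars(Seq("/path/to/killrweather-app-assembly-1.0.0-SNAPSHOT.jar"))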
The master, the workers (executors), and the driver are different processes, and each has its own classpath.
spark.executor.extraClassPath is applied only to the executors, not to the driver. That means each executor will have the connector jar on its classpath, but the driver will not. To add a jar to the driver's classpath, use the --driver-class-path command line option or set spark.driver.extraClassPath in spark-defaults.conf on the driver's machine.
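A minimal sketch of that split, assuming standard Spark behaviour; the path and app name are placeholders, and the driver-side settings appear only as comments because they must be supplied when the driver JVM is launched, not from application code:

import org.apache.spark.{SparkConf, SparkContext}

// Placeholder: a path that exists on every worker machine (shared mount or pre-copied jar).
val connectorJar = "/common/file/share/here/spark-cassandra-connector-assembly.jar"

val conf = new SparkConf()
  .setAppName("classpath-example")
  // Added to each executor's classpath when the executor JVM is launched for this app.
  .set("spark.executor.extraClassPath", connectorJar)

// The driver's own classpath cannot be extended here, because this code is already
// running inside the driver JVM. Use --driver-class-path on spark-submit/spark-shell,
// or set spark.driver.extraClassPath in spark-defaults.conf on the driver's machine.

val sc = new SparkContext(conf)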