java.lang.RuntimeException: Failed to load class for data source: org.apache.spark.sql.cassandra


Benjamin Ross

Jul 30, 2015, 4:51:37 PM
to DataStax Spark Connector for Apache Cassandra
Hey all,
I’m running what should be a very straightforward application of the Cassandra SQL connector, and I’m getting an error. I'm relatively new to Spark and Scala, so I'm sure there's something I'm doing wrong, but it doesn't make much sense.

I'm submitting this to spark as follows:
spark-submit test-2.0.5-SNAPSHOT-jar-with-dependencies.jar

This is the error I'm getting. My jar is shaded, so I assume this shouldn’t happen? I’ve confirmed that org.apache.spark.sql.cassandra and org.apache.cassandra classes are in the jar.

Exception in thread "main" java.lang.RuntimeException: Failed to load class for data source: org.apache.spark.sql.cassandra
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.sources.ResolvedDataSource$.lookupDataSource(ddl.scala:220)
at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:233)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:114)
at com.latticeengines.test.CassandraTest$.main(CassandraTest.scala:33)
at com.latticeengines.test.CassandraTest.main(CassandraTest.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15/07/30 15:34:47 INFO spark.SparkContext: Invoking stop() from shutdown hook
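For context, the lookup that fails here works roughly as follows: Spark 1.4's ResolvedDataSource tries to load the provider string as a class name, then retries with ".DefaultSource" appended, and raises this RuntimeException if neither class is on the classpath. A rough sketch (not Spark's actual source):

```scala
// Rough sketch of Spark 1.4's data-source lookup (not the actual source):
// the provider string is tried as-is, then with ".DefaultSource" appended.
// The RuntimeException above means neither class was found on the classpath.
object DataSourceLookup {
  def lookupDataSource(provider: String): Class[_] = {
    val loader = Thread.currentThread().getContextClassLoader
    try loader.loadClass(provider)
    catch {
      case _: ClassNotFoundException =>
        try loader.loadClass(provider + ".DefaultSource")
        catch {
          case _: ClassNotFoundException =>
            sys.error(s"Failed to load class for data source: $provider")
        }
    }
  }
}
```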



Here’s the code I’m trying to run:
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.mean

object CassandraTest {
  def main(args: Array[String]) {
    println("Hello, scala!")

    val conf = new SparkConf(true).set("spark.cassandra.connection.host", "127.0.0.1")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val df = sqlContext
      .read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("table" -> "kv", "keyspace" -> "test"))
      .load()
    val w = Window.orderBy("value").rowsBetween(-2, 0)
    df.select(mean("value").over(w))
  }
}


Here's my pom:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">

  <modelVersion>4.0.0</modelVersion>
  <artifactId>test</artifactId>
  <packaging>jar</packaging>
  <name>${component-name}</name>

  <properties>
    <component-name>le-sparkdb</component-name>
    <hadoop.version>2.6.0.2.2.0.0-2041</hadoop.version>
    <scala.version>2.10.4</scala.version>
    <spark.version>1.4.1</spark.version>
    <avro.version>1.7.7</avro.version>
    <parquet.avro.version>1.4.3</parquet.avro.version>
    <le.domain.version>2.0.5-SNAPSHOT</le.domain.version>
    <le.common.version>2.0.5-SNAPSHOT</le.common.version>
    <le.eai.version>2.0.5-SNAPSHOT</le.eai.version>
    <spark.cassandra.version>1.2.4</spark.cassandra.version>
  </properties>

  <parent>
    <groupId>com.latticeengines</groupId>
    <artifactId>le-parent</artifactId>
    <version>2.0.5-SNAPSHOT</version>
    <relativePath>le-parent</relativePath>
  </parent>

  <build>
    <plugins>
      <!-- <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jar-plugin</artifactId>
        <configuration>
          <archive>
            <manifest>
              <mainClass>com.latticeengines.test.CassandraTest</mainClass>
            </manifest>
          </archive>
        </configuration>
      </plugin> -->

      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
          <archive>
            <manifest>
              <mainClass>com.latticeengines.test.CassandraTest</mainClass>
            </manifest>
          </archive>
        </configuration>
        <executions>
          <execution>
            <id>make-assembly</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-eclipse-plugin</artifactId>
        <version>${maven.eclipse.version}</version>
        <configuration>
          <downloadSources>true</downloadSources>
          <downloadJavadocs>true</downloadJavadocs>
          <projectnatures>
            <projectnature>org.scala-ide.sdt.core.scalanature</projectnature>
            <projectnature>org.eclipse.jdt.core.javanature</projectnature>
          </projectnatures>
          <buildcommands>
            <buildcommand>org.scala-ide.sdt.core.scalabuilder</buildcommand>
          </buildcommands>
          <classpathContainers>
            <classpathContainer>org.scala-ide.sdt.launching.SCALA_CONTAINER</classpathContainer>
            <classpathContainer>org.eclipse.jdt.launching.JRE_CONTAINER</classpathContainer>
          </classpathContainers>
          <excludes>
            <exclude>org.scala-lang:scala-library</exclude>
            <exclude>org.scala-lang:scala-compiler</exclude>
          </excludes>
          <sourceIncludes>
            <sourceInclude>**/*.scala</sourceInclude>
            <sourceInclude>**/*.java</sourceInclude>
          </sourceIncludes>
        </configuration>
      </plugin>
    </plugins>
    <sourceDirectory>src/main/scala</sourceDirectory>
  </build>

  <dependencies>
    <dependency>
      <groupId>com.twitter</groupId>
      <artifactId>parquet-avro</artifactId>
      <version>${parquet.avro.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.avro</groupId>
      <artifactId>avro</artifactId>
      <version>${avro.version}</version>
    </dependency>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.10</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>com.latticeengines</groupId>
      <artifactId>le-domain</artifactId>
      <version>${le.domain.version}</version>
      <exclusions>
        <exclusion>
          <groupId>javax.servlet</groupId>
          <artifactId>servlet-api</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.cassandra</groupId>
      <artifactId>cassandra-all</artifactId>
      <version>2.2.0</version>
    </dependency>
    <dependency>
      <groupId>com.latticeengines</groupId>
      <artifactId>le-common</artifactId>
      <version>${le.common.version}</version>
      <exclusions>
        <exclusion>
          <groupId>javax.servlet</groupId>
          <artifactId>servlet-api</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>com.datastax.spark</groupId>
      <artifactId>spark-cassandra-connector_2.10</artifactId>
      <version>${spark.cassandra.version}</version>
    </dependency>
  </dependencies>

</project>


Thanks so much in advance!

Alex Liu

Jul 30, 2015, 4:58:17 PM
to DataStax Spark Connector for Apache Cassandra, benjam...@gmail.com

Add the Spark Cassandra connector jar to spark-submit, so the executor can find the class.


alex

Benjamin Ross

Jul 30, 2015, 5:11:10 PM
to DataStax Spark Connector for Apache Cassandra, benjam...@gmail.com, alex...@datastax.com
Tried that... same issue. Also, should that be necessary considering that I built the jar shaded?

~/bin/spark/bin/spark-submit test-2.0.5-SNAPSHOT-jar-with-dependencies.jar --jars ~/.m2/repository/com/datastax/spark/spark-cassandra-connector_2.10/1.2.4/spark-cassandra-connector_2.10-1.2.4.jar
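One thing worth noting about the command above: spark-submit treats everything after the application jar as arguments to the application's own main method, so a --jars flag placed there never reaches spark-submit itself. The same invocation with the flag in front of the jar (same paths as above) would be:

```shell
# spark-submit options must precede the application jar; anything after
# the jar is passed to the application's main(), not to spark-submit.
~/bin/spark/bin/spark-submit \
  --jars ~/.m2/repository/com/datastax/spark/spark-cassandra-connector_2.10/1.2.4/spark-cassandra-connector_2.10-1.2.4.jar \
  test-2.0.5-SNAPSHOT-jar-with-dependencies.jar
```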


Exception in thread "main" java.lang.RuntimeException: Failed to load class for data source: org.apache.spark.sql.cassandra
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.sources.ResolvedDataSource$.lookupDataSource(ddl.scala:220)
at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:233)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:114)
at com.latticeengines.test.CassandraTest$.main(CassandraTest.scala:32)
at com.latticeengines.test.CassandraTest.main(CassandraTest.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)




Benjamin Ross

Jul 30, 2015, 5:41:30 PM
to DataStax Spark Connector for Apache Cassandra, benjam...@gmail.com, alex...@datastax.com
Really stuck here - any help would be much appreciated! Thanks so much in advance.

Alex Liu

Jul 30, 2015, 5:48:35 PM
to DataStax Spark Connector for Apache Cassandra, benjam...@gmail.com
1.2.4 doesn't have the data source org.apache.spark.sql.cassandra.

Try 1.3.x or 1.4.x.
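In pom terms, the fix amounts to bumping the connector version property to a 1.4.x release matching Spark 1.4.1; for example (version number illustrative):

```xml
<!-- spark-cassandra-connector 1.4.x tracks Spark 1.4; the 1.2.x line
     predates the org.apache.spark.sql.cassandra data source. -->
<spark.cassandra.version>1.4.0</spark.cassandra.version>
```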

alex

Benjamin Ross

Jul 30, 2015, 6:12:40 PM
to DataStax Spark Connector for Apache Cassandra, benjam...@gmail.com, alex...@datastax.com
Oy! Thanks. That did it.