Databricks runtime 7.3 LTS and Connector 3.0 incompatibility


Nishant J

Oct 15, 2020, 1:24:34 PM
to DataStax Spark Connector for Apache Cassandra
Is there any compatibility issue between the Connector and Databricks 7.3 LTS (Spark 3.0.1, Scala 2.12)? I created a cluster with the Spark connector (spark-cassandra-connector_2.12-3.0.0.jar) and all of its transitive dependencies, but I get this error on a simple select statement:

java.io.IOException: Failed to open native connection to Cassandra at {##.###.#.##:9042} :: com.typesafe.config.impl.ConfigImpl.newSimpleOrigin(Ljava/lang/String;)Lcom/typesafe/config/ConfigOrigin;

Root cause:
Caused by: java.lang.NoSuchMethodError: com.typesafe.config.impl.ConfigImpl.newSimpleOrigin(Ljava/lang/String;)Lcom/typesafe/config/ConfigOrigin; at com.typesafe.config.ConfigOriginFactory.newSimple(ConfigOriginFactory.java:42)

I uploaded the connector jar and all of its dependencies to the cluster. Here is the list of dependencies that Maven pulled for me:

HdrHistogram-2.1.11.jar
commons-lang3-3.9.jar
config-1.3.4.jar
java-driver-core-shaded-4.7.2.jar
java-driver-mapper-runtime-4.7.2.jar
java-driver-query-builder-4.7.2.jar
java-driver-shaded-guava-25.1-jre-graal-sub-1.jar
javatuples-1.2.jar
jcip-annotations-1.0-1.jar
jsr305-3.0.2.jar
metrics-core-4.0.5.jar
native-protocol-1.4.10.jar
paranamer-2.8.jar
reactive-streams-1.0.2.jar
scala-library-2.12.11.jar
scala-reflect-2.12.11.jar
slf4j-api-1.7.26.jar
spark-cassandra-connector-driver_2.12-3.0.0.jar
spark-cassandra-connector_2.12-3.0.0.jar
spotbugs-annotations-3.1.12.jar

Jaroslaw Grabowski

Oct 16, 2020, 2:28:48 AM
to spark-conn...@lists.datastax.com
Try the assembly instead: it contains all of the dependencies, and some of them are shaded to mitigate version mismatches like the one you hit.
Let us know if the assembly works for you.
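For reference, the assembly is published as its own artifact on Maven Central; a sketch of attaching it instead of the individual jars (the coordinate is assumed from the connector's 3.0.0 naming scheme):

```shell
# One jar with shaded dependencies, instead of the connector jar
# plus every transitive dependency uploaded by hand:
spark-shell --packages com.datastax.spark:spark-cassandra-connector-assembly_2.12:3.0.0
```

On Databricks, the equivalent is installing that Maven coordinate as a cluster library.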


Alex Ott

Oct 16, 2020, 4:23:29 AM
to DataStax Spark Connector for Apache Cassandra
It works for most cases, until you enable and start to use CassandraSparkExtensions, for example for direct join. In that case it will fail with a ClassNotFoundException. The reason is that DBR 7.x contains pieces of Spark 3.1.0 that have breaking changes in the optimizer. That's tracked as https://datastax-oss.atlassian.net/browse/SPARKC-626

The code change is trivial, just changed imports (there is a PR for that), but it requires compiling SCC against Spark 3.1.0-SNAPSHOT. I've got a build, based on SCC master, that works just fine - I'm able to do direct join...

P.S. Also, please note that to use CassandraSparkExtensions you need to have a cluster init script in place that copies the assembly before the driver & executors start...
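A minimal sketch of such an init script, assuming the assembly jar was uploaded to DBFS first (the `/dbfs/FileStore/jars/...` path is an assumption):

```shell
#!/bin/bash
# Databricks puts everything under /databricks/jars/ on the driver and
# executor classpath at JVM startup, so copy the assembly there beforehand.
cp /dbfs/FileStore/jars/spark-cassandra-connector-assembly_2.12-3.0.0.jar /databricks/jars/
```

The extensions themselves are then enabled via the Spark config `spark.sql.extensions com.datastax.spark.connector.CassandraSparkExtensions`.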
With best wishes,                    Alex Ott
http://alexott.net/
Twitter: alexott_en (English), alexott (Russian)

Nishant J

Oct 16, 2020, 9:57:38 PM
to DataStax Spark Connector for Apache Cassandra, ale...@gmail.com
Thanks for the pointers. It worked for me on 7.3 LTS (Spark 3.0.1). I didn't have to run the init script to use simple Dataset and RDD queries.
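For context, a sketch of such a simple Dataset query against a live cluster (the connection host, keyspace, and table names here are hypothetical):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()
spark.conf.set("spark.cassandra.connection.host", "10.0.0.1")

// Plain DataFrame read through the Cassandra data source; this path
// works without the init script or CassandraSparkExtensions enabled.
val df = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "my_ks", "table" -> "my_table"))
  .load()

df.select("id", "value").show()
```

Only the extensions-backed features (direct join, predicate pushdown through the optimizer rules) need the assembly on the boot classpath.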

Alex Ott

Oct 17, 2020, 4:27:17 AM
to Nishant J, DataStax Spark Connector for Apache Cassandra
Yes, it works without init scripts, but with the current version you can't use the more advanced Dataframe functionality, like direct join, even if you use the init script and set CassandraSparkExtensions.