IncompatibleClassChangeError with Delta Lake 1.2.0


ashok kumar

Apr 20, 2022, 2:21:07 AM
to Delta Lake Users and Developers
I am using Delta Lake 1.2.0 with Spark 3.2.1.

I am getting the error below. Any idea why we are getting this?

java.lang.IncompatibleClassChangeError: class org.apache.spark.sql.catalyst.plans.logical.DeltaDelete has interface org.apache.spark.sql.catalyst.plans.logical.UnaryNode as super class
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at org.apache.spark.sql.delta.DeltaAnalysis.apply(DeltaAnalysis.scala:61)
    at org.apache.spark.sql.delta.DeltaAnalysis.apply(DeltaAnalysis.scala:54)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:211)
    at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
    at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
    at scala.collection.immutable.List.foldLeft(List.scala:91)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200)
    at scala.collection.immutable.List.foreach(List.scala:431)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:222)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:218)
    at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:167)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:218)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:182)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:203)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:202)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:88)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:196)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
    at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:196)
    at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:88)
    at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:86)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:78)
    at org.apache.spark.sql.Dataset$.$anonfun$ofRows$1(Dataset.scala:90)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:88)
    at org.apache.spark.sql.streaming.DataStreamReader.loadInternal(DataStreamReader.scala:186)
    at org.apache.spark.sql.streaming.DataStreamReader.load(DataStreamReader.scala:143)
    at org.apache.spark.deploy.stream.***.kafkaDf(FlowLogIngestionSpark.scala:67)
    at org.apache.spark.deploy.stream.**.writeToSink(FlowLogIngestionSpark.scala:140)
    at org.apache.spark.deploy.stream.**.execute(FlowLogIngestionSpark.scala:100)
    at org.apache.spark.deploy.**.start(SparkApp.scala:59)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Shixiong(Ryan) Zhu

Apr 20, 2022, 9:48:39 AM
to ashok kumar, Delta Lake Users and Developers
How did you set up Delta Lake? It looks like you are not actually using Delta Lake 1.2.0, so there may be a setup issue. To check whether you are using the right Delta Lake version, you can add the following line to print the location of the Delta Lake jar:

classOf[org.apache.spark.sql.catalyst.plans.logical.DeltaDelete].getResource("DeltaDelete.class")
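
For reference, a minimal way to run that check from a spark-shell session or a small piece of Scala driver code might look like the sketch below (the println wrapper and the value name are only illustrative):

// Prints the URL of the jar that actually provides DeltaDelete at runtime.
// If this points at an unexpected jar, the classpath is mixing incompatible
// Delta Lake / Spark builds.
val deltaDeleteLocation =
  classOf[org.apache.spark.sql.catalyst.plans.logical.DeltaDelete]
    .getResource("DeltaDelete.class")
println(s"DeltaDelete loaded from: $deltaDeleteLocation")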

Best Regards,

Ryan



ashok kumar

Apr 28, 2022, 2:58:15 AM
to Delta Lake Users and Developers
Thanks, I will try this to check the version.

Mohamad Shaker

Jun 15, 2022, 12:06:31 PM
to Delta Lake Users and Developers
Hi, 


I'm facing the same issue and tried to print the location of the Delta Lake jar using:


classOf[org.apache.spark.sql.catalyst.plans.logical.DeltaDelete].getResource("DeltaDelete.class")

My code started failing with the same error when it reached the above line of code.
Error message: Exception in thread "main" java.lang.IncompatibleClassChangeError: class org.apache.spark.sql.catalyst.plans.logical.DeltaDelete has interface org.apache.spark.sql.catalyst.plans.logical.UnaryNode as super class

I have "io.delta" %% "delta-core" % "1.2.0" as a dependency in my build.sbt and it was working before bumping delta  1.0.0  --> 1.2.1 & Spark 3.1.1 to 3.2.0 

best,

Shixiong(Ryan) Zhu

Jun 15, 2022, 12:41:33 PM
to Mohamad Shaker, Delta Lake Users and Developers
If the Delta Lake version is compatible with your Spark version, could you double-check whether you set the following Spark confs in your application?

--conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" --conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog"

You can set them via the CLI as above, or configure them when creating your SparkSession (Python code):

spark = SparkSession.builder \
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog") \
  ...
  .getOrCreate()
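
For a Scala application, the equivalent configuration would look roughly like this sketch (the app name is a placeholder):

import org.apache.spark.sql.SparkSession

// Same two Delta-related confs as the Python example, set from Scala.
val spark = SparkSession.builder()
  .appName("delta-app")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()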

Best Regards,

Ryan


Mohamad Shaker

Jun 16, 2022, 12:40:33 AM
to Delta Lake Users and Developers
It turned out to be a problem with the Spark pre-installed on EMR 6.6.0 (Spark version 3.2.0-amzn-0).

Using the pre-installed spark-shell (spark-shell --packages io.delta:delta-core_2.12:1.2.0 --conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" --conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog"), I get:

scala> classOf[org.apache.spark.sql.catalyst.plans.logical.DeltaDelete].getResource("DeltaDelete.class")
error: Error while emitting <console>
assertion failed: Invalid superClass in Lorg/apache/spark/sql/catalyst/plans/logical/DeltaDelete;: Some(Lorg/apache/spark/sql/catalyst/plans/logical/UnaryNode;)

However, when I manually installed the official Spark 3.2 on EMR and ran its spark-shell (launched the same way as above), I get:

 scala> classOf[org.apache.spark.sql.catalyst.plans.logical.DeltaDelete].getResource("DeltaDelete.class")
res0: java.net.URL = jar:file:/root/.ivy2/jars/io.delta_delta-core_2.12-1.2.0.jar!/org/apache/spark/sql/catalyst/plans/logical/DeltaDelete.class

So obviously there is something wrong with the EMR build of Spark/Hadoop!

Best,

Shixiong(Ryan) Zhu

Jun 16, 2022, 4:14:32 PM
to Mohamad Shaker, Delta Lake Users and Developers
Great to hear that you figured it out. Would you be able to file a ticket with EMR so that they can fix it and Delta Lake users on EMR won't hit the same problem in the future? Thanks!

Best Regards,

Ryan

