java.lang.NoClassDefFoundError: com/google/protobuf/CodedInputStream$

Zda Kao

Nov 12, 2020, 5:32:11 PM
to ScalaPB
I got the error "User class threw exception: java.lang.NoClassDefFoundError: com/google/protobuf/CodedInputStream$" when running on AWS EMR Spark:

Release label:emr-6.0.0
Hadoop distribution:Amazon 3.2.1
Applications:Spark 2.4.4, Hive 3.1.2

My configuration runs fine locally:

scalaVersion := "2.12.10"

build.sbt:
libraryDependencies ++= Seq(
  ...
  // ScalaPB with SparkSQL
  "com.thesamet.scalapb" %% "sparksql-scalapb" % "0.9.0"
)

// Hadoop contains an old protobuf runtime that is not binary compatible
// with 3.0.0. We shade ours to prevent runtime issues.
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.protobuf.**" -> "shadeproto.@1").inAll,
  ShadeRule.rename("scala.collection.compat.**" -> "shadecompat.@1").inAll
)

PB.targets in Compile := Seq(
  scalapb.gen() -> (sourceManaged in Compile).value
)

assembly.sbt:
resolvers += Resolver.url("bintray-sbt-plugins", url("http://dl.bintray.com/sbt/sbt-plugin-releases"))(Resolver.ivyStylePatterns)

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

build.properties:
sbt.version = 1.3.13

scalapb.sbt:
addSbtPlugin("com.thesamet" % "sbt-protoc" % "0.99.27")

libraryDependencies += "com.thesamet.scalapb" %% "compilerplugin" % "0.9.6"

Class:
sbt: com.google.protobuf:protobuf-java:3.8.0:jar

I tried copying protobuf-java-3.8.0.jar to the /home/hadoop/extrajars folder and referencing it in the Spark default config:
- sudo vim /etc/spark/conf/spark-defaults.conf
spark.driver.extraClassPath :/home/hadoop/extrajars/*
spark.executor.extraClassPath :/home/hadoop/extrajars/* 

I also referenced it in the spark-submit command:
--driver-library-path /home/hadoop/extrajars/postgresql-42.2.11.jar,/home/hadoop/extrajars/spark-sql-kafka-0-10_2.12-2.4.4.jar,/home/hadoop/extrajars/spark-streaming-kafka-0-10_2.12-2.4.4.jar,/home/hadoop/extrajars/jdbc-4.50.3.jar,/home/hadoop/extrajars/config-1.4.0.jar,/home/hadoop/extrajars/scala-logging_2.13-3.9.2.jar,/home/hadoop/extrajars/sparksql-scalapb_2.12-0.9.0.jar,/home/hadoop/extrajars/frameless-dataset_2.12-0.8.0.jar,/home/hadoop/extrajars/frameless-core_2.12-0.8.0.jar,/home/hadoop/extrajars/scalapb-runtime_sjs0.6_2.12-0.9.6.jar,/home/hadoop/extrajars/lenses_sjs0.6_2.12-0.9.6.jar,/home/hadoop/extrajars/protobuf-java-3.8.0.jar \
--driver-class-path /home/hadoop/extrajars/postgresql-42.2.11.jar,/home/hadoop/extrajars/spark-sql-kafka-0-10_2.12-2.4.4.jar,/home/hadoop/extrajars/spark-streaming-kafka-0-10_2.12-2.4.4.jar,/home/hadoop/extrajars/jdbc-4.50.3.jar,/home/hadoop/extrajars/config-1.4.0.jar,/home/hadoop/extrajars/scala-logging_2.13-3.9.2.jar,/home/hadoop/extrajars/sparksql-scalapb_2.12-0.9.0.jar,/home/hadoop/extrajars/frameless-dataset_2.12-0.8.0.jar,/home/hadoop/extrajars/frameless-core_2.12-0.8.0.jar,/home/hadoop/extrajars/scalapb-runtime_sjs0.6_2.12-0.9.6.jar,/home/hadoop/extrajars/lenses_sjs0.6_2.12-0.9.6.jar,/home/hadoop/extrajars/protobuf-java-3.8.0.jar \
--jars /home/hadoop/extrajars/postgresql-42.2.11.jar,/home/hadoop/extrajars/spark-sql-kafka-0-10_2.12-2.4.4.jar,/home/hadoop/extrajars/spark-streaming-kafka-0-10_2.12-2.4.4.jar,/home/hadoop/extrajars/jdbc-4.50.3.jar,/home/hadoop/extrajars/config-1.4.0.jar,/home/hadoop/extrajars/scala-logging_2.13-3.9.2.jar,/home/hadoop/extrajars/sparksql-scalapb_2.12-0.9.0.jar,/home/hadoop/extrajars/frameless-dataset_2.12-0.8.0.jar,/home/hadoop/extrajars/frameless-core_2.12-0.8.0.jar,/home/hadoop/extrajars/scalapb-runtime_sjs0.6_2.12-0.9.6.jar,/home/hadoop/extrajars/lenses_sjs0.6_2.12-0.9.6.jar,/home/hadoop/extrajars/protobuf-java-3.8.0.jar \

The default protobuf version on AWS EMR with Spark 2.4.4 is protobuf-java-2.5.0.jar.

Any advice is much appreciated.



Nadav Samet

Nov 12, 2020, 11:21:03 PM
to Zda Kao, ScalaPB
Hi Zda, how do you deploy to EMR? The recommended way is to create a fat jar using sbt-assembly and pass that as a single dependency to spark-submit. It looks like you are passing individual jars manually, some of which are related to Scala.js (the ones with sjs0.6 in their name), which is probably not what you want.
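That workflow looks roughly like this; the main class name, master, and jar path below are placeholders for your own project:

```shell
# Build the fat jar with sbt-assembly (the shading rules in build.sbt are applied here)
sbt assembly

# Submit the single assembled jar -- no long --jars or --driver-class-path lists needed
spark-submit \
  --class com.example.MySparkJob \
  --master yarn \
  target/scala-2.12/myapp-assembly-0.1.0.jar
```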

-Nadav




Zda Kao

Nov 13, 2020, 5:52:53 PM
to ScalaPB
Hi Nadav,

I'm new to Scala and wasn't familiar with sbt; I thought sbt package was the way to create the fat jar. I was able to use the assembly plugin to create the fat jar now. However, I ran into many deduplicate errors, so I reduced the build to only the essential jars for testing and was able to deploy successfully to AWS EMR Spark. Now I'm going to read up on how to resolve the duplicates.
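A common way to resolve those deduplicate errors is a merge strategy in build.sbt. A minimal sketch for sbt-assembly 0.14.x, assuming the conflicts are only in jar metadata files (adjust the patterns to whatever paths the error messages report):

```scala
assemblyMergeStrategy in assembly := {
  // jar metadata (manifests, signatures) is safe to drop in a fat jar
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  // fall back to the plugin's default strategy for everything else
  case path =>
    val default = (assemblyMergeStrategy in assembly).value
    default(path)
}
```

Blanket strategies like MergeStrategy.first can hide real version conflicts, so it is usually better to start from the default and only special-case the paths sbt-assembly actually complains about.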

Thank you very much for your help,

Zda