Spark Cluster - o.a.s.s.BlockManagerId; local class incompatible: stream classdesc serialVersionUID

Todd Nist

Jan 20, 2015, 3:41:42 PM1/20/15
to zeppelin-...@googlegroups.com

I created the following issue, https://github.com/NFLabs/zeppelin/issues/292, on the GitHub site, but wanted to see if anyone here may have input on this. Apologies for the dual posting; I'm new to the group and not sure which channel gets the most views.

When executing the below example from the tutorial:

val bankText = sc.textFile("/Users/work/zeppelin_test_data/bank/bank-full.csv")

case class Bank(age:Integer, job:String, marital : String, education : String, balance : Integer)

val bank = bankText.map(s=>s.split(";")).filter(s=>s(0)!="\"age\"").map(
    s=>Bank(s(0).toInt, 
            s(1).replaceAll("\"", ""),
            s(2).replaceAll("\"", ""),
            s(3).replaceAll("\"", ""),
            s(5).replaceAll("\"", "").toInt
        )
)

bank.registerTempTable("bank”)
In a spark standalone cluster I am getting the following error thrown which seems to indicate there are two instances of the class.

INFO [2015-01-20 13:12:25,018] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:99) - Job paragraph_1421776075682_-56845921 finished by scheduler com.nflabs.zeppelin.spark.SparkInterpreter1551788863
ERROR [2015-01-20 13:12:51,607] ({sparkDriver-akka.actor.default-dispatcher-17} Slf4jLogger.scala[apply$mcV$sp]:66) - org.apache.spark.storage.BlockManagerId; local class incompatible: stream classdesc serialVersionUID = -7366074099953117729, local class serialVersionUID = 1677335532749418220
java.io.InvalidClassException: org.apache.spark.storage.BlockManagerId; local class incompatible: stream classdesc serialVersionUID = -7366074099953117729, local class serialVersionUID = 1677335532749418220
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:621)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
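
One way to confirm which build each side is loading is to print the serialVersionUID that the local classpath computes for BlockManagerId. This is just a quick sketch; running it in a spark-shell on the Zeppelin host and again on a worker should reproduce the two differing values from the stack trace if the jars come from different Spark builds:

import java.io.ObjectStreamClass

// Look up the serialization descriptor for BlockManagerId as loaded
// from this JVM's classpath and print its computed serialVersionUID.
val uid = ObjectStreamClass.lookup(
    Class.forName("org.apache.spark.storage.BlockManagerId")
).getSerialVersionUID
println(uid)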
My Spark cluster is running spark-1.1.1-bin-hadoop2.4. I am starting the master and workers as follows:

  • Master: $SPARK_HOME/sbin/start-master.sh
  • Worker1: $SPARK_HOME/bin/spark-class org.apache.spark.deploy.worker.Worker spark://radtech.io:7077 --webui-port 8081 --cores 2 --memory 1G
  • Worker2: $SPARK_HOME/bin/spark-class org.apache.spark.deploy.worker.Worker spark://radtech.io:7077 --webui-port 8082 --cores 2 --memory 1G
I have set the MASTER in the $ZEPPELIN_HOME/conf/zeppelin-env.sh as follows:

#!/bin/bash

# export JAVA_HOME=
export MASTER=spark://radtech.io:7077   # spark master
Any ideas on what I have missed here?  TIA for the assistance.

-Todd

Todd Nist

Jan 20, 2015, 4:21:19 PM1/20/15
to zeppelin-...@googlegroups.com
As I read through this again, it seems like I have two different versions of Spark coming into play here: the one Zeppelin is using and the one the cluster is running. That is the only way I can see this being a problem. I did the build as follows:

mvn install -DskipTests -Dspark.version=1.1.1 -Dhadoop.version=2.4.0

Am I missing something else? I see in the $ZEPPELIN_HOME/interpreter/spark directory that all the Spark artifacts are version 1.1.1, so that seems right. Is there somewhere else I should be looking?
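
For what it is worth, a quick sanity check (assuming sc.version is available in this release) is to print the Spark version string in both a Zeppelin paragraph and a spark-shell on the cluster; the two should match exactly:

// Prints the version of Spark this context was built against, e.g. "1.1.1".
println(sc.version)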