I'm currently running the TPCx-BB version of BigBench downloaded from the TPC website. I'm trying to get the benchmark to run using "spark_sql" engine. I downloaded and compiled the latest Apache versions of Hive (2.1.0) & Spark (1.6.0) and configured the benchmark to use these.
I run the Data Generation, Populate Metastore stages of the benchmark using Hive MR since I got to know that these aren't supported on Spark Sql. When I get to the Power Stage (i.e. running the queries) using spark_sql, I get this annoying error regarding some logging library. I've tried installing different versions of Spark & Hive but nothing helped so far. Here's the snippet of the error:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.session.SessionState$LogHelper.<init>(Lorg/apache/commons/logging/Log;)V
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.<init>(SparkSQLCLIDriver.scala:255)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:139)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Did anyone face a similar issue? Any advice on this would be appreciated.