BigBench Queries using Spark SQL

Janaki

Jul 8, 2016, 1:49:03 PM
to Big Data Benchmark for BigBench
Hello,

I'm currently running the TPCx-BB version of BigBench, downloaded from the TPC website, and I'm trying to get the benchmark to run using the "spark_sql" engine. I downloaded and compiled the latest Apache versions of Hive (2.1.0) and Spark (1.6.0) and configured the benchmark to use them.

I run the Data Generation and Populate Metastore stages of the benchmark using Hive on MapReduce, since I learned that these stages aren't supported on Spark SQL. When I get to the Power stage (i.e. running the queries) with spark_sql, it fails with a NoSuchMethodError that involves a logging class. I've tried installing different versions of Spark and Hive, but nothing has helped so far. Here's the error:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.session.SessionState$LogHelper.<init>(Lorg/apache/commons/logging/Log;)V
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.<init>(SparkSQLCLIDriver.scala:255)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:139)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties

Did anyone face a similar issue? Any advice on this would be appreciated.

Thanks,
Janaki

Yan Tang

Jul 22, 2016, 2:39:49 AM
to Big Data Benchmark for BigBench
Hi Janaki,
    It seems that your Spark package is not compatible with your Hive package. Have you built your Hive binary against Spark 1.6?
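
    The error message itself points the same way. The part after the method name is a JVM method descriptor, and decoding it shows which signature the caller expected. Below is a minimal Python sketch of that decoding; the function names (parse_type, describe_missing_method) are made up for illustration and are not part of Spark, Hive, or BigBench.

```python
import re

# JVM primitive type codes used in method descriptors.
PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}


def parse_type(desc, i):
    """Decode one type starting at desc[i]; return (readable_name, next_index)."""
    dims = 0
    while desc[i] == "[":          # each leading '[' adds one array dimension
        dims += 1
        i += 1
    if desc[i] == "L":             # object type: L<internal/class/name>;
        end = desc.index(";", i)
        name = desc[i + 1:end].replace("/", ".")
        i = end + 1
    else:                          # single-letter primitive code
        name = PRIMITIVES[desc[i]]
        i += 1
    return name + "[]" * dims, i


def describe_missing_method(message):
    """Split a NoSuchMethodError message into owner class, method, params, return type."""
    pattern = (r"(?P<owner>[\w.$]+)\."
               r"(?P<method><init>|<clinit>|[\w$]+)"
               r"\((?P<params>[^)]*)\)(?P<ret>.+)")
    match = re.fullmatch(pattern, message.strip())
    if match is None:
        raise ValueError("not a recognizable method reference: " + message)
    params, i, raw = [], 0, match.group("params")
    while i < len(raw):
        type_name, i = parse_type(raw, i)
        params.append(type_name)
    return_type, _ = parse_type(match.group("ret"), 0)
    return {
        "owner": match.group("owner"),
        "method": match.group("method"),   # "<init>" means a constructor
        "params": params,
        "returns": return_type,
    }


# The method reference copied from the stack trace above.
info = describe_missing_method(
    "org.apache.hadoop.hive.ql.session.SessionState$LogHelper."
    "<init>(Lorg/apache/commons/logging/Log;)V"
)
print(info)
```

    Here the missing method is the SessionState$LogHelper constructor that takes an org.apache.commons.logging.Log. As far as I know, Spark 1.6 is compiled against Hive 1.2.1, while Hive 2.x switched its logging to SLF4J, so a Hive 2.1.0 jar on the classpath would no longer offer that constructor. Please treat this as a likely explanation to verify rather than a confirmed cause.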

Best Regards,
Yan