Spark Notebook - External package seems to be loaded but cannot be imported


Sebastian Hätälä

Nov 4, 2016, 5:33:51 AM
to Hue-Users
Hi all,

I have the following problem: on my HDP 2.5 cluster running Spark (v1.6.2) and Livy, I want to create a Spark Notebook in Hue. All applications we will ever write depend on the package ai.h2o:sparkling-water-core. Therefore, I added the following setting to my spark-defaults.conf: spark.jars.packages = ai.h2o:sparkling-water-core_2.10:1.6.8

I also set the log level for both Spark and Livy to DEBUG. As I understand it, Hue creates a Livy interactive session whenever I open a Spark Notebook. Judging from the Livy logs, the package gets loaded (see the attached 'livy-livy-server.out'). The Spark History Server also indicates that the package has been loaded: 'file:/home/livy/.ivy2/jars/ai.h2o_sparkling-water-core_2.10-1.6.8.jar' is listed as an entry in the 'spark.jars' field, and 'http://<LIVY_SERVER>:51720/jars/ai.h2o_h2o-core-3.10.0.7.jar' is part of the classpath entries.
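One way to take spark-defaults.conf out of the picture would be to request the package explicitly when the Livy session is created. The following is only a minimal sketch of the JSON body for Livy's `POST /sessions` endpoint, using its `kind` and `conf` fields. The helper name `build_livy_session_request` is made up for illustration, and whether Livy accepts this particular configuration key depends on how the server is set up. I have not confirmed that this resolves the import error.

```python
import json


def build_livy_session_request(packages, kind="spark"):
    """Build the JSON body for a Livy POST /sessions request.

    `packages` is a list of Maven coordinates; they are joined into the
    comma-separated form that spark.jars.packages expects.
    (Hypothetical helper, not part of Hue or Livy.)
    """
    return {
        "kind": kind,  # "spark" requests a Scala interpreter session
        "conf": {"spark.jars.packages": ",".join(packages)},
    }


body = build_livy_session_request(["ai.h2o:sparkling-water-core_2.10:1.6.8"])
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent to the Livy server by hand (for example with curl), and the resulting session compared against the one Hue creates.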

However, when I then try to import a class from the loaded package in the Spark Notebook in Hue, I get the following error:

> import org.apache.spark.h2o._
error: object h2o is not a member of package org.apache.spark
import org.apache.spark.h2o._
                        ^

When I start a spark-shell (without additional configuration) the same command works!

What am I doing wrong?
livy-livy-server.out

Sebastian Hätälä

Nov 5, 2016, 2:17:12 PM
to Hue-Users
Sorry for the double post, but I have done some further research that I would like to share. Combined, the access.log and error.log contain the following:

access         INFO     192.168.120.29 shaetaelae - "POST /notebook/api/create_session HTTP/1.1"
connectionpool INFO     Resetting dropped connection: itfin109.it.zeb.de
access         INFO     192.168.120.29 shaetaelae - "POST /notebook/api/execute HTTP/1.1"
access         INFO     192.168.120.29 shaetaelae - "POST /notebook/api/check_status HTTP/1.1"
access         INFO     192.168.120.29 shaetaelae - "POST /notebook/api/fetch_result_data HTTP/1.1"
decorators     ERROR    Error running <function fetch_result_data at 0x7f07485b6488>
Traceback (most recent call last):
  File "/usr/local/hue/desktop/libs/notebook/src/notebook/decorators.py", line 81, in decorator
    return func(*args, **kwargs)
  File "/usr/local/hue/desktop/libs/notebook/src/notebook/api.py", line 218, in fetch_result_data
    response['result'] = get_api(request, snippet).fetch_result(notebook, snippet, rows, start_over)
  File "/usr/local/hue/desktop/libs/notebook/src/notebook/connectors/spark_shell.py", line 314, in fetch_result
    raise QueryError(msg)
QueryError
access         INFO     192.168.120.29 shaetaelae - "POST /jobbrowser/jobs/ HTTP/1.1"

Am I right to suspect that the issue originates in decorators.py, api.py, or spark_shell.py?
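To make the call chain in the traceback easier to read, the frames can be pulled out with a small script. This is just a sketch: the regular expression and the helper name `extract_frames` are my own, and the traceback text is copied from the log excerpt in this post. It only shows where the exception was raised, not why.

```python
import re

# Traceback text as it appears in the Hue error.log excerpt.
TRACEBACK = '''Traceback (most recent call last):
  File "/usr/local/hue/desktop/libs/notebook/src/notebook/decorators.py", line 81, in decorator
    return func(*args, **kwargs)
  File "/usr/local/hue/desktop/libs/notebook/src/notebook/api.py", line 218, in fetch_result_data
    response['result'] = get_api(request, snippet).fetch_result(notebook, snippet, rows, start_over)
  File "/usr/local/hue/desktop/libs/notebook/src/notebook/connectors/spark_shell.py", line 314, in fetch_result
    raise QueryError(msg)
QueryError
'''

# Matches the 'File "...", line N, in func' header of each frame.
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\w+)')


def extract_frames(text):
    """Return (file name, line number, function) for each frame, outermost first."""
    return [
        (m.group("path").rsplit("/", 1)[-1], int(m.group("line")), m.group("func"))
        for m in FRAME_RE.finditer(text)
    ]


frames = extract_frames(TRACEBACK)
innermost = frames[-1]  # the frame in which the exception was raised
for name, line, func in frames:
    print(f"{name}:{line} in {func}")
```

The last frame is the one that raised QueryError, here in fetch_result in spark_shell.py; the two frames above it only pass the call through.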

Henry Arnold

Jun 2, 2017, 6:37:24 AM
to Hue-Users
Hi, did you ever figure out how to add the Sparkling Water jars through spark-defaults?

Thanks