The first cell already failed, with the message:
ImportError: No module named 'sparkdl'
I installed that module as a cluster library and then got:
ImportError: No module named 'keras'
I installed this one as well, and then the same thing happened with tensorflow. At that point I got:
ConnectException error: This is often caused by an OOM error that causes the connection to the Python REPL to be closed. Check your query's memory usage.
I tried different installation orders; in particular, since keras is built on top of tensorflow, I installed tensorflow before keras. Eventually I ended up with a list of all the required modules: sparkdl, tensorflow, tensorflowonspark, tensorframes, kafka, jieba, keras.
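For reference, here is a sketch of how the whole list could be installed from a notebook instead of the cluster library UI, assuming a Databricks runtime that exposes `dbutils.library` (I have not verified this on my cluster, and I am deliberately not pinning versions because I do not know which ones sparkdl expects):

```python
# Sketch only: install the whole package list from the notebook, assuming the
# runtime exposes dbutils.library. No versions are pinned here; the correct
# versions for sparkdl are exactly the part I am unsure about.
packages = ["tensorflow", "tensorframes", "tensorflowonspark",
            "kafka", "jieba", "keras", "sparkdl"]

for pkg in packages:
    dbutils.library.installPyPI(pkg)

# Restart the Python process so the newly installed packages become importable.
dbutils.library.restartPython()
```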
I wish this list were documented somewhere. Even after installing everything, I was still getting an error message:
AttributeError: module 'tensorflow' has no attribute 'Session'
As far as I know, `Session` is a core `tensorflow` API. Googling did not yield a solution for PySpark.
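My guess (and it is only a guess) is that the cluster picked up TensorFlow 2.x, where `Session` was removed from the top-level namespace and only survives under `tf.compat.v1`. This is the kind of check and workaround I had in mind, untested on the cluster:

```python
# Sketch: check which TensorFlow is actually installed and fall back to the
# 1.x-style API if it is a 2.x release (where tf.Session no longer exists).
import tensorflow as tf

print(tf.__version__)

if tf.__version__.startswith("2."):
    # The 1.x behaviour is still reachable through the compat module.
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

sess = tf.Session()  # should no longer raise AttributeError
```

Even if this silences the error, I suspect sparkdl really expects a 1.x TensorFlow, so pinning the version at install time may be the cleaner fix; I would appreciate confirmation.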
I found the Spark Deep Learning repository on GitHub and read the current recommendations, in case the answer is there: https://github.com/databricks/spark-deep-learning/blob/master/README.md
The README advises: "To work with the latest code, Spark 2.3.0 is required and Python 3.6 & Scala 2.11 are recommended". So I would need to create a cluster with these versions, but there is no such option when I create a cluster (see the attached picture); I can only choose Spark 2.4.* or 2.2.*.
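To compare against that recommendation, this is what I would run on the cluster to see which versions I actually have (just printing versions, nothing more):

```python
# Sketch: print the versions available on the cluster, to compare against the
# README's "Spark 2.3.0 / Python 3.6" recommendation.
import sys

print("Spark:", spark.version)            # `spark` is the SparkSession predefined in Databricks notebooks
print("Python:", sys.version.split()[0])
```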
Can somebody please help me?
Best,
Mya