install a module to pyspark kernel in all-spark-notebook

Tim Harsch

Feb 5, 2018, 3:31:39 PM
to Project Jupyter
This seems like it should be pretty basic, but I'm having a hard time installing a module for the pyspark kernel. I must be missing something...

If I choose a python3 notebook, the module (matplotlib) is already there and works as expected.  I notice that root has default python as /opt/conda/bin/python (python 3.6.3).  The jovyan user has python as /usr/bin/python (python 2.7.12).

If I try to use the module matplotlib in the pyspark kernel, I get this:
No module named matplotlib.pyplot
Traceback (most recent call last):
ImportError: No module named matplotlib.pyplot

I've tried several things, including installing pip for /usr/bin/python and installing the matplotlib module there, but the import still fails.

I have a docker container that inherits from all-spark-notebook, so I can modify the container OS if needed.

I look at the kernel.json and see:
/usr/local/share/jupyter/kernels/pysparkkernel/kernel.json

{"argv":["python","-m","sparkmagic.kernels.pysparkkernel.pysparkkernel", "-f", "{connection_file}"],
 "display_name":"PySpark"
}

I assume that, since python is not fully qualified, it picks up /usr/bin/python from the PATH.
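
One way to check that assumption is to print which interpreter is actually running and which `python` the PATH resolves to. The snippet below is only a sketch: the helper name `describe_python` is made up for illustration, and it reports on whatever process runs it (a terminal as jovyan, a python3 notebook), which is not necessarily where the pyspark kernel executes notebook cells.

```python
import sys

try:
    from shutil import which  # Python 3.3+
except ImportError:
    # Fallback for Python 2.7, which has no shutil.which
    from distutils.spawn import find_executable as which


def describe_python():
    """Report the running interpreter and the `python` found on PATH.

    Hypothetical helper, not part of Jupyter or sparkmagic.
    """
    return {
        "running": sys.executable,   # interpreter executing this code
        "on_path": which("python"),  # what a bare `python` resolves to (or None)
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_python().items()):
        print("{} = {}".format(key, value))
```

If `running` and `on_path` differ between the python3 kernel and a jovyan shell, that would support the idea that the unqualified `python` in kernel.json resolves to a different interpreter than the one that has matplotlib.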

Any ideas?

Thanks,
Tim

Luciano Resende

Feb 5, 2018, 6:02:57 PM
to jup...@googlegroups.com
What you might be missing is passing some of this configuration through to Spark.

We have some configuration examples in the Jupyter Enterprise Gateway documentation.

Also note that, if you are in a distributed Spark environment, the library might not be available on the machines where the work actually runs. In your local environment you might only have the Spark driver running, so a module installed there is not necessarily visible to the executors.
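
To see whether that is what is happening, a rough probe like the one below can report, per host, which Python interpreter is in use and whether the module imports there. This is a sketch under assumptions: `probe_module` and `probe_executors` are hypothetical helper names, `sc` is assumed to be an existing SparkContext in the notebook session, and the partition count is arbitrary, so it may not reach every executor.

```python
import importlib
import socket
import sys


def probe_module(module_name):
    """Return (hostname, interpreter path, importable?) for the current process."""
    try:
        importlib.import_module(module_name)
        importable = True
    except ImportError:
        importable = False
    return socket.gethostname(), sys.executable, importable


def probe_executors(sc, module_name, num_partitions=8):
    """Run probe_module inside Spark tasks and collect the distinct results.

    `sc` is assumed to be a live SparkContext; nothing here creates one.
    """
    return (
        sc.parallelize(range(num_partitions), num_partitions)
        .map(lambda _: probe_module(module_name))
        .distinct()
        .collect()
    )


# Driver-side check (works without Spark):
#   print(probe_module("matplotlib.pyplot"))
# Executor-side check (needs a SparkContext named sc):
#   for row in probe_executors(sc, "matplotlib.pyplot"):
#       print(row)
```

If the driver-side call reports the module as importable while some executor rows do not, the library is missing from the Python environment that Spark uses on those hosts rather than from the notebook image itself.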

Please let us know if this is not the case.
