Running BigDL on LakeFS


Ndu jude Leonard

Jun 7, 2022, 7:49:48 PM
to User Group for BigDL
Hi Team,

I am struggling to set up BigDL on our server. The server already runs LakeFS, along with a Jupyter notebook configured to run Spark for reading from and writing to the LakeFS storage system. I would also like to run BigDL from that same Jupyter notebook. I have tried a couple of methods, including installing it with conda and linking the conda kernel to Jupyter, but I keep getting a "no module found" error. The same error occurs when I install it with a plain pip install command.

Is there a convenient way to do this?

Xin Qiu

Jun 7, 2022, 10:21:59 PM
to User Group for BigDL
How can I reproduce your error? Could you share the commands you ran and the steps you followed?

Xin Qiu

Jun 8, 2022, 3:18:34 AM
to User Group for BigDL

You can try reinstalling BigDL by following https://bigdl.readthedocs.io/en/latest/doc/UserGuide/python.html#install
Then, to make sure your installation is successful, try running this simple BigDL example in a Jupyter notebook: https://github.com/intel-analytics/BigDL/blob/main/apps/dogs-vs-cats/transfer-learning.ipynb

Ndu jude Leonard

Jun 8, 2022, 10:45:59 AM
to User Group for BigDL
I was actually aiming to install the BigDL build for Spark 3, so I used the command given in the doc, "pip install bigdl-spark3". It installed successfully, but the module was not found when I tried to import it like this:

from bigdl.orca import init_orca_context 
sc = init_orca_context()

Then I tried "pip install bigdl" instead, but I still got the same error.

My question is: can BigDL still run on a cluster that is already running Spark?

Xin Qiu

Jun 8, 2022, 10:28:29 PM
to User Group for BigDL
Could you check whether the bigdl/orca directory exists on your file system? pip tells you where bigdl is installed when you run "pip install".
Does the module-not-found error also occur if you just use a Python interactive shell?
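
One way to run both checks from inside the notebook itself is sketched below. It prints which interpreter the kernel is using and where (if anywhere) that interpreter would import bigdl from. The helper name locate_module is made up for this illustration and is not part of BigDL. If the interpreter path differs from the environment where pip installed bigdl, the kernel is likely pointing at a different Python.

```python
import importlib.util
import sys


def locate_module(name):
    """Return where `name` would be imported from, or None if it cannot be found.

    `locate_module` is a hypothetical helper for this thread, not a BigDL API.
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package (for example "bigdl") is itself missing.
        return None
    if spec is None:
        return None
    if spec.origin:
        return spec.origin
    # Namespace packages have no single origin, only search locations.
    return list(spec.submodule_search_locations or [])


# The interpreter the notebook kernel (or shell) is actually running.
print("interpreter:", sys.executable)

for module_name in ("bigdl", "bigdl.orca", "pyspark"):
    print(module_name, "->", locate_module(module_name))
```

Running the same cell in a plain Python shell started from the conda environment and comparing the two outputs should show whether the notebook and pip agree on the environment.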

>> My question is, can BigDL still run on a cluster that is already running Spark?
It is best to use the pyspark that pip installs. Before you start Jupyter or Python, unset the SPARK environment variables (you can find them with "env | grep SPARK").
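
If restarting Jupyter from a clean shell is not convenient, a rough equivalent can be attempted from within Python, as in the sketch below. It assumes that matching on variable names containing "SPARK" is close enough to what "env | grep SPARK" finds (grep also matches values), and that it runs before pyspark or bigdl is imported in that process. The function names are hypothetical, chosen for this example only.

```python
import os


def spark_env_vars(environ):
    """Return the variables whose name contains "SPARK" (an assumed filter)."""
    return {key: value for key, value in environ.items() if "SPARK" in key}


def clear_spark_env(environ):
    """Remove the SPARK-related variables from `environ` in place.

    Returns the removed variables so they can be inspected or restored.
    """
    removed = spark_env_vars(environ)
    for key in removed:
        del environ[key]
    return removed


# Run this first in the notebook, before importing pyspark or bigdl.
for name, value in clear_spark_env(os.environ).items():
    print("unset", name, "=", value)
```

This only affects the current Python process and anything it launches afterwards. Variables exported in the shell or in the Jupyter kernel spec will come back on the next start.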
