Jupyter Notebook with IPython parallel on AWS EMR 4.7.1 (Hadoop 2.7.2, Hive 1.0.0, Hue 3.7.1, etc.)


cloud.t...@gmail.com

Jul 9, 2016, 7:20:18 AM
to Project Jupyter
Hi All,

I have created an AWS EMR cluster with the following:

- Hadoop 2.7.2
- Hive 1.0.0
- Hue 3.7.1
- Mahout 0.12.0
- Pig 0.14.0
- Spark 1.6.1

This EMR cluster has 1 master node and 2 core nodes. I have configured a Jupyter Notebook server with password protection and HTTPS. I am going through the IPython parallel documentation at https://ipyparallel.readthedocs.io/en/latest/intro.html, but I am finding it difficult to understand how it actually works. Could someone share their experience of enabling distributed processing across the nodes? What steps have to be performed on the master node? What steps have to be performed on the core nodes?

Thanks and regards, :)

cloud.t...@gmail.com

Jul 9, 2016, 11:04:55 AM
to Project Jupyter
To enable cluster mode, I executed the following commands on the EMR master node:

$ pip-2.7 install ipyparallel

$ jupyter serverextension enable --py ipyparallel
$ jupyter nbextension install --py ipyparallel
$ jupyter nbextension enable --py ipyparallel

$ ipcluster start -n 4 &

$ jupyter notebook --debug > log.file 2>&1

Cluster mode (i.e. ipyparallel) is still not enabled, even though the log shows that the extension has been loaded. The log is as follows:

[D 15:02:16.251 NotebookApp] Searching [u'/home/hadoop', '/home/hadoop/.jupyter', '/usr/etc/jupyter', '/usr/local/etc/jupyter', '/etc/jupyter'] for config files
[D 15:02:16.252 NotebookApp] Looking for jupyter_config in /etc/jupyter
[D 15:02:16.252 NotebookApp] Looking for jupyter_config in /usr/local/etc/jupyter
[D 15:02:16.253 NotebookApp] Looking for jupyter_config in /usr/etc/jupyter
[D 15:02:16.253 NotebookApp] Looking for jupyter_config in /home/hadoop/.jupyter
[D 15:02:16.253 NotebookApp] Looking for jupyter_config in /home/hadoop
[D 15:02:16.254 NotebookApp] Looking for jupyter_notebook_config in /etc/jupyter
[D 15:02:16.255 NotebookApp] Looking for jupyter_notebook_config in /usr/local/etc/jupyter
[D 15:02:16.255 NotebookApp] Looking for jupyter_notebook_config in /usr/etc/jupyter
[D 15:02:16.255 NotebookApp] Looking for jupyter_notebook_config in /home/hadoop/.jupyter
[D 15:02:16.256 NotebookApp] Loaded config file: /home/hadoop/.jupyter/jupyter_notebook_config.py
[W 15:02:16.259 NotebookApp] Unrecognized JSON config file version, assuming version 1
[D 15:02:16.259 NotebookApp] Loaded config file: /home/hadoop/.jupyter/jupyter_notebook_config.json
[D 15:02:16.260 NotebookApp] Looking for jupyter_notebook_config in /home/hadoop
[W 15:02:16.831 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
[W 15:02:16.832 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using authentication. This is highly insecure and not recommended.
[I 15:02:16.881 NotebookApp] Loading IPython parallel extension
[I 15:02:16.892 NotebookApp] Serving notebooks from local directory: /home/hadoop
[I 15:02:16.892 NotebookApp] 0 active kernels
[I 15:02:16.892 NotebookApp] The Jupyter Notebook is running at: http://[all ip addresses on your system]:8900/
[I 15:02:16.893 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[D 15:02:39.322 NotebookApp] 200 GET /api/sessions?_=1468076123847 (116.75.85.209) 3.27ms
[D 15:02:39.324 NotebookApp] 200 GET /api/terminals?_=1468076123848 (116.75.85.209) 0.87ms
[D 15:02:39.609 NotebookApp] 200 GET /api/contents?type=directory&_=1468076123849 (116.75.85.209) 5.73ms
[D 15:02:41.556 NotebookApp] Using contents: services/contents
[D 15:02:41.617 NotebookApp] 200 GET /tree (116.75.85.209) 61.98ms

Could anyone suggest a possible reason why cluster mode is not starting?
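
One way to narrow this down might be to check, independently of the notebook UI, whether the engines started by `ipcluster start -n 4` have actually registered with the controller. The following is only a minimal sketch, assuming ipyparallel is importable from the notebook kernel and the controller is running under the default profile. `wait_for_engines` and `check_cluster` are hypothetical helper names, not part of ipyparallel.

```python
import time


def wait_for_engines(client, expected, timeout=30.0, poll=0.5):
    """Poll client.ids until `expected` engines are registered or the timeout expires.

    `client` is anything with an `ids` attribute, such as an ipyparallel Client.
    Returns the list of engine ids seen on the last poll.
    """
    deadline = time.time() + timeout
    while True:
        ids = list(client.ids)
        if len(ids) >= expected or time.time() >= deadline:
            return ids
        time.sleep(poll)


def check_cluster(expected=4):
    """Connect to the default-profile controller and report registered engines."""
    # Imported here so the helpers above load even where ipyparallel is absent.
    import ipyparallel as ipp

    rc = ipp.Client()  # assumption: controller started under the default profile
    ids = wait_for_engines(rc, expected)
    print("registered engine ids: %s" % ids)
    return ids
```

Calling `check_cluster()` from a notebook cell would show whether four engine ids come back. If they do, the engines themselves are running and the problem is more likely on the notebook extension side.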

cloud.t...@gmail.com

Jul 10, 2016, 11:46:12 PM
to Project Jupyter
Once ipengine processes on multiple hosts have been started and connected to the ipcontroller on a remote machine, how can I make sure these remote engines are actually used when such programs are executed?
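
To make the question concrete, here is a rough sketch of how one might check this from the client side: ask every engine for its hostname, then push some work through a load-balanced view. It assumes an ipyparallel `Client` that is already connected to the controller. The helper names (`engines_per_host`, `report_engine_hosts`, `square`, `run_balanced`) are hypothetical and only for illustration.

```python
import socket
from collections import Counter


def engines_per_host(hostnames):
    """Count how many engines reported each hostname."""
    return dict(Counter(hostnames))


def report_engine_hosts(client):
    """Ask every registered engine for its hostname and tally engines per host."""
    view = client[:]  # DirectView over all engines currently registered
    hostnames = view.apply_sync(socket.gethostname)
    return engines_per_host(hostnames)


def square(x):
    """Trivial stand-in for a real task."""
    return x * x


def run_balanced(client, values):
    """Map `square` over `values`, letting the scheduler pick the engines."""
    view = client.load_balanced_view()
    return view.map_sync(square, values)
```

With `rc = ipyparallel.Client()`, `report_engine_hosts(rc)` should list the core node hostnames if the remote engines are connected, and `run_balanced(rc, range(10))` would spread the calls across whichever engines are free.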

