disco nodes will not start

james.m...@infinityworks.com

Jul 28, 2015, 12:08:20 PM
to Disco-development
Hi,

I've followed the instructions for installing disco slaves using the "make install-node" command rather than "make install" which I do on the master. Following this I execute the "python setup.py install" command.

However when I then try to execute /<disco home>/bin/disco start I get an error message:

    stderr: /bin/sh: 1: disco: not found

However, the file is in /<disco home>/bin/, the pwd is /<disco home>/bin/, and the file has the correct rwx permissions.

So I created a shell script to start disco:

#!/bin/sh

cd /home/disco/disco/bin
sudo ./disco start
 
When I execute this I get a different error message:

    DISCO_HOME is not specified, where should Disco live?

But doing "echo $DISCO_HOME" from _any_ user account gives the correct installation directory (i.e. the one where disco is checked out).

I do not have these problems with the disco master install.

Any suggestions?

Thanks

Shayan Pooya

Jul 29, 2015, 2:18:20 AM
to disc...@googlegroups.com
Hello James,

1. On the master node, you should use `make install`. make
install-node is just for the worker nodes. The worker nodes do not
need to have the disco command in PATH.
2. I am assuming you are using sudo for running "make install-node"
and "python setup.py install"

You should do either the system-wide installation (recommended):
http://disco.readthedocs.org/en/develop/start/install_sys.html

Or the per-user installation:
http://disco.readthedocs.org/en/develop/start/install.html

They should not be mixed.

3.
> But doing "echo $DISCO_HOME" from _any_ user account gives the correct installation directory (i.e the one where disco is checked out to)

It is very likely that sudo is dropping that environment variable.
This is not the root cause, just a symptom. But you can still
continue investigating with something like:
$ sudo env | grep -i disco
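
A minimal sketch of what that check is likely to show, assuming sudo resets the environment (the `env_reset` default in sudoers). Here `env -i` stands in for sudo so the example runs without root, the path is taken from the start script earlier in the thread, and the `probe` variable is made up for illustration.

```shell
#!/bin/sh
# Assumed install path, taken from the start script earlier in the thread.
DISCO_HOME=/home/disco/disco
export DISCO_HOME

# Hypothetical probe: prints DISCO_HOME as seen by a child process.
probe='echo "DISCO_HOME=${DISCO_HOME:-unset}"'

# 1. Ordinary child process: inherits the exported variable.
inherited=$(/bin/sh -c "$probe")

# 2. Scrubbed environment (env -i), similar in effect to sudo's env_reset.
scrubbed=$(env -i /bin/sh -c "$probe")

# 3. Scrubbed environment with the variable passed through explicitly.
passed=$(env -i DISCO_HOME="$DISCO_HOME" /bin/sh -c "$probe")

echo "inherited: $inherited"
echo "scrubbed:  $scrubbed"
echo "passed:    $passed"
```

If the variable turns out to be missing under sudo, `sudo -E ./disco start` or an `env_keep` entry in sudoers may carry it through, depending on the local sudoers policy. That only addresses the symptom described above.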

james.m...@infinityworks.com

Aug 3, 2015, 12:08:32 PM
to Disco-development
OK.

Maybe if I walk you through my setup process you can find the error since I'm getting the following error in my log:

2015-08-03 15:17:26.695 [info] <0.7.0> Application disco started on node disco_8989_master@discomaster
2015-08-03 15:17:26.695 [info] <0.97.0>@node_mon:slave_start:109 Starting node "disco_8989_slave" on "discoslave2" ("discoslave2")
2015-08-03 15:17:26.696 [info] <0.96.0>@node_mon:slave_start:109 Starting node "disco_8989_slave" on "discoslave1" ("discoslave1")
2015-08-03 15:17:26.696 [info] <0.95.0>@node_mon:slave_start:109 Starting node "disco_8989_slave" on "discoslave0" ("discoslave0")
2015-08-03 15:17:27.217 [error] <0.134.0> ** Connection attempt from disallowed node disco_8989_slave@discoslave1 **
2015-08-03 15:17:27.243 [error] <0.136.0> ** Connection attempt from disallowed node disco_8989_slave@discoslave2 **
2015-08-03 15:17:27.248 [error] <0.138.0> ** Connection attempt from disallowed node disco_8989_slave@discoslave0 **
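
A small sketch for pulling the rejected node names out of a master log like the one above. The function name `disallowed_nodes` is made up for illustration, and the log is fed on stdin because the log file's location is not given here.

```shell
#!/bin/sh
# Hypothetical helper: read a Disco master log on stdin and print the
# unique node names that appear in "disallowed node" errors.
disallowed_nodes() {
    sed -n 's/.*Connection attempt from disallowed node \([^ ]*\) \*\*.*/\1/p' | sort -u
}

# Sample input: the three error lines quoted above.
disallowed_nodes <<'EOF'
2015-08-03 15:17:27.217 [error] <0.134.0> ** Connection attempt from disallowed node disco_8989_slave@discoslave1 **
2015-08-03 15:17:27.243 [error] <0.136.0> ** Connection attempt from disallowed node disco_8989_slave@discoslave2 **
2015-08-03 15:17:27.248 [error] <0.138.0> ** Connection attempt from disallowed node disco_8989_slave@discoslave0 **
EOF
```

Every slave showing up in that list suggests the cause is something shared by all of them rather than a single misconfigured host.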

The Master (clean Ubuntu trusty64 install)
=========

1. Create 'admin' user group with sudoers, etc
2. Configure sshd
3. Add all disco nodes to host file
4. Install Git and Python 2.7.x
5. Install Erlang
6. Create a 'disco' user in the 'admin' group
7. Upload disco user ssh key and configure ssh properties
8. Upload the common .erlang-cookie to disco user home
9. Set the DISCO_HOME location
10. Checkout disco to DISCO_HOME
11. Do 'make install'
12. Do 'python setup.py install'
13. Make sure disco user has correct permissions for directories (e.g. /usr/var/disco)
14. Insert the slave hostnames and worker counts in to /usr/var/disco/disco_8989.config
15. start disco master

Each Slave (clean Ubuntu trusty64 install)
=========

1. Create 'admin' user group with sudoers, etc
2. Configure sshd
3. Add all disco nodes to host file
4. Install Git and Python 2.7.x
5. Install Erlang
6. Create a 'disco' user in the 'admin' group
7. Upload disco user ssh key and configure ssh properties
8. Upload the common .erlang-cookie to disco user home
9. Set the DISCO_HOME location
10. Checkout disco to DISCO_HOME
11. Do 'make install-node'
12. Do 'python setup.py install'
13. Make sure disco user has correct permissions for directories (e.g. /usr/var/disco)
14. start disco slave

I can see the nodes listed in the Disco Master UI but they have a red bar not a black one.

From the troubleshooting guide I've checked the following:
* Can ssh without password from any disco node to any other
* Disco installed in same path on all machines
* Same .erlang-cookie on all machines with permissions 0400
* slave:start(localhost, "testnode") and 'net_adm:ping(testnode@localhost)' work as expected
* hostnames are resolved using hosts file not DNS - this is tricky to change
* ssh localhost "python DISCO_HOME/lib/disco/worker/classic/worker.py" - returns the expected response.

To me it looks like there's still some issue with the master talking to the slaves but I'm struggling to see what it might be - any suggestions?

Thanks
James

james.m...@infinityworks.com

Aug 4, 2015, 11:03:54 AM
to Disco-development, james.m...@infinityworks.com
Problem solved: I had named the cookie file '.erlang-cookie' instead of '.erlang.cookie'.
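
For anyone hitting the same thing, a rough sketch of a check that would catch this on each node. The function name is hypothetical, `/home/disco` is assumed to be the disco user's home directory, and the permission test assumes GNU `stat` as found on Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper: verify the Erlang cookie file in a home directory.
# Assumes GNU stat (stat -c '%a') for reading the permission bits.
check_erlang_cookie() {
    home_dir=$1
    cookie="$home_dir/.erlang.cookie"
    if [ ! -f "$cookie" ]; then
        if [ -f "$home_dir/.erlang-cookie" ]; then
            # The mistake from this thread: hyphen instead of dot.
            echo "misnamed: found .erlang-cookie, expected .erlang.cookie"
        else
            echo "missing: $cookie"
        fi
        return 1
    fi
    mode=$(stat -c '%a' "$cookie")
    if [ "$mode" != "400" ]; then
        echo "bad mode: $mode (expected 400)"
        return 1
    fi
    echo "ok: $cookie"
}

# Assumed home directory of the disco user.
check_erlang_cookie /home/disco || echo "fix the cookie file before starting the node"
```

Running it on the master and on every slave checks the file name and permissions only; the cookie contents still have to match across machines.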