socket.gaierror: [Errno -2] Name or service not known


Bálint Szebenyi

Jun 24, 2014, 6:18:34 PM
to scoop...@googlegroups.com
Hi!

I am currently trying to get scoop to work on my desktop+laptop. (Scoop: 0.7.1, Python 2.7.6)

I try to launch scoop this way:
python -m scoop --hostfile scoop_hostfile deap_tutorial.py

However I get:
[2014-06-25 00:09:41,820] __main__  INFO    Worker(s) launched using /bin/zsh
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/szebenyib/.virtualenvs/gp/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 302, in <module>
    b.main()
  File "/home/szebenyib/.virtualenvs/gp/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 92, in main
    self.run()
  File "/home/szebenyib/.virtualenvs/gp/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 290, in run
    futures_startup()
  File "/home/szebenyib/.virtualenvs/gp/lib/python2.7/site-packages/scoop/bootstrap/__main__.py", line 271, in futures_startup
    run_name="__main__"
  File "/home/szebenyib/.virtualenvs/gp/lib/python2.7/site-packages/scoop/futures.py", line 64, in _startup
    result = _controller.switch(rootFuture, *args, **kargs)
  File "/home/szebenyib/.virtualenvs/gp/lib/python2.7/site-packages/scoop/_control.py", line 176, in runController
    execQueue = FutureQueue()
  File "/home/szebenyib/.virtualenvs/gp/lib/python2.7/site-packages/scoop/_types.py", line 256, in __init__
    self.socket = Communicator()
  File "/home/szebenyib/.virtualenvs/gp/lib/python2.7/site-packages/scoop/_comm/scoopzmq.py", line 62, in __init__
    s.connect((scoop.BROKER.externalHostname, scoop.BROKER.task_port))
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
socket.gaierror: [Errno -2] Name or service not known
[2014-06-25 00:09:51,063] launcher  (127.0.0.1:60534) INFO    Finished cleaning spawned subprocesses.

I can log in with ssh using .ssh/config as described in the tutorial. After logging in, I can run the necessary Python files on the remote machine (I tested this because I previously had permission failures, which I solved by putting my .virtualenvs directory there, readable by the separate user I created for scoop). I have also tried using the same user (same virtualenv, same directory structure) on both machines, to no avail. In my ssh config I connect directly by IP on my local network, so I guess this cannot be a router problem.
I really don't know why there is a networking problem here.

Can you please help me with it?

Yannick Hold-Geoffroy

Jul 3, 2014, 11:16:57 PM
to scoop...@googlegroups.com
Hello,

Thanks for your interest in SCOOP. Sorry for the delay in the answer.

Before contacting other workers, a worker must determine its own externally routable address (IP or DNS). The trick SCOOP uses for this is to create a UDP socket and connect it to the broker. When everything is fine, the resulting socket holds the desired externally routable address of the worker. Note that a "connect()" on a UDP socket doesn't actually establish a connection or exchange packets with its target; it only forces the resolution of the hostname.
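
For readers who want to try the UDP trick in isolation, here is a minimal sketch. It is not SCOOP's actual code: the function name external_address and the port number 5555 are made up for illustration. It only uses the standard socket module and shows where a socket.gaierror comes from when the broker hostname cannot be resolved.

```python
import socket


def external_address(broker_host, broker_port):
    """Return the local IP the OS would use to reach the broker, or None.

    Hypothetical helper for illustration; not part of SCOOP.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets. It resolves the
        # hostname and fixes the local address used to reach the target.
        s.connect((broker_host, broker_port))
        return s.getsockname()[0]
    except socket.gaierror:
        # Raised when the hostname cannot be resolved, as in the
        # traceback at the top of this thread.
        return None
    finally:
        s.close()


if __name__ == "__main__":
    # Pointing at the loopback address yields the loopback address,
    # which other machines cannot use to reach this one.
    print(external_address("127.0.0.1", 5555))
```

Calling it with the broker name that the worker actually received (the value of scoop.BROKER.externalHostname in the traceback) should show whether that name resolves on the machine in question.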

But in your case, this method seems to fail. The problem is that the variable "scoop.BROKER.externalHostname" holds an invalid (or non-routable) hostname.
Do you mind sharing your scoop_hostfile so we can see if the problem lies in there? Did you enter exactly the same name in the scoop_hostfile as you did in .ssh/config?
Can you re-run your program, this time passing the verbose flag to SCOOP ( -vvv ), and post the entire output?

As a workaround, can you try using the --tunnel flag to see if it works?

Have a nice day,
Yannick Hold


--
You received this message because you are subscribed to the Google Groups "scoop-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scoop-users...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Bálint Szebenyi

Jul 12, 2014, 10:54:56 AM
to scoop...@googlegroups.com
Hi,

Sorry, I have been on a vacation and that's why I'm only answering now.

I have in the end managed to get it working using the --tunnel switch. However it seemed slow :(
I have set up my scoop file this way:
127.0.0.1 2
192.168.1.3 4

I have tried using the host names as specified in the .ssh/config file (I have verified that those entries were okay, because I could log in with them and run a Python executable).
E.g.:
Host white-testuser
    Port 20000
    User testuser
    Hostname 192.168.1.3
    ServerAliveInterval 300
    ServerAliveCountMax 2
    IdentityFile ~/.ssh/id_rsa_testuser

I have tried using more users and ports, but ended up reaching the laptop by IP and the default port (22). For some reason it did not work with hostnames and other ports. I cannot remember exactly what I tried, though.

Currently I am not using scoop, as I work with an IPython notebook and it is unfortunately not supported there. Once I have a decent working notebook and need performance, I will try to move its contents out to a Python file and add scoop there.

Anyway, thank you for the detailed answer and for developing SCOOP. If the above data is not enough, please let me know and I will find the time to run scoop and post its output.

Bálint Szebenyi

Yannick Hold-Geoffroy

Jul 14, 2014, 2:07:44 PM
to scoop...@googlegroups.com
Hello Mr. Szebenyi,

You are right, the --tunnel flag encrypts every communication and hence slows things down. One way to speed up encryption is to use a faster (but less secure) cipher. If you've got the development version of SCOOP, this can be done by adding a parameter along the lines of '-carcfour,blowfish-cbc' to the ./scoop/launch/constants.py file. You can even activate compression ( '-C' ) if bandwidth is a bigger problem than computing power.

But I don't recommend taking these steps in your case. They may be useful for anyone who wants encryption while still needing performance (i.e. over the internet). Your problem lies in the fact that you wrote 127.0.0.1 as your first host address. This means that the second host will try to connect to 127.0.0.1 to reach the first host, which won't work, because on the second host 127.0.0.1 refers to the second host itself. You should put either the hostname (if resolvable by the second host) or the externally visible IP address of your machine (192.168.1.X) instead of 127.0.0.1. You seem to have tried this already, but I am curious: have you written "white-testuser" in your scoop hostfile, or something else? The configuration you showed seems otherwise fine.
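
As a quick sanity check before launching, a small script along these lines could flag loopback entries in a hostfile. This is only a sketch and not part of SCOOP: the function name loopback_entries is hypothetical, and it assumes the simple "host count" layout, one entry per line, shown earlier in this thread.

```python
import ipaddress


def loopback_entries(hostfile_text):
    """Return hostfile hosts that other machines cannot reach.

    Hypothetical helper; assumes one "host count" entry per line.
    """
    bad = []
    for line in hostfile_text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        host = parts[0]
        try:
            if ipaddress.ip_address(host).is_loopback:
                bad.append(host)
        except ValueError:
            # Not an IP literal, so it is a hostname. Only the obvious
            # "localhost" is flagged; nothing is resolved here.
            if host == "localhost":
                bad.append(host)
    return bad


if __name__ == "__main__":
    hostfile = "127.0.0.1 2\n192.168.1.3 4\n"
    print(loopback_entries(hostfile))
```

With the hostfile posted above, only the 127.0.0.1 entry is reported; replacing it with the machine's 192.168.1.X address leaves the list empty.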

Making SCOOP work in IPython notebooks is something that I really want to achieve in the future.

Thanks for your feedback,
Have a nice day,
Yannick Hold

