After pulling the new Docker image, cluster won't build: gives `ClusterSizeError: No nodes could be started!`


Hatef Monajemi

Aug 9, 2019, 4:04:44 PM
to elasticluster

2019-08-09 19:59:44 00abdbf74f9e elasticluster[1] INFO Determined that provided credentials are not valid.
2019-08-09 19:59:44 00abdbf74f9e elasticluster[1] INFO Attempting to use Google Application Default Credentials.
2019-08-09 19:59:44 00abdbf74f9e elasticluster[1] ERROR Could not start node `frontend001`:  -- <class 'httplib.ResponseNotReady'>
Traceback (most recent call last):
  File "elasticluster/cluster.py", line 580, in _start_node
    node.start()
  File "elasticluster/cluster.py", line 1319, in start
    **self.extra)
  File "elasticluster/providers/gce.py", line 521, in start_instance
    gce = self._connect()
  File "elasticluster/providers/gce.py", line 194, in _connect
    credentials = self._get_credentials()
  File "elasticluster/providers/gce.py", line 162, in _get_credentials
    return GoogleCredentials.get_application_default()
  File "/usr/local/lib/python2.7/site-packages/oauth2client/client.py", line 1271, in get_application_default
    return GoogleCredentials._get_implicit_credentials()
  File "/usr/local/lib/python2.7/site-packages/oauth2client/client.py", line 1256, in _get_implicit_credentials
    credentials = checker()
  File "/usr/local/lib/python2.7/site-packages/oauth2client/client.py", line 1187, in _implicit_credentials_from_gce
    if not _in_gce_environment():
  File "/usr/local/lib/python2.7/site-packages/oauth2client/client.py", line 1042, in _in_gce_environment
    if NO_GCE_CHECK != 'True' and _detect_gce_environment():
  File "/usr/local/lib/python2.7/site-packages/oauth2client/client.py", line 999, in _detect_gce_environment
    http, _GCE_METADATA_URI, headers=_GCE_HEADERS)
  File "/usr/local/lib/python2.7/site-packages/oauth2client/transport.py", line 282, in request
    connection_type=connection_type)
  File "/usr/local/lib/python2.7/site-packages/httplib2/__init__.py", line 2135, in request
    cachekey,
  File "/usr/local/lib/python2.7/site-packages/httplib2/__init__.py", line 1796, in _request
    conn, request_uri, method, body, headers
  File "/usr/local/lib/python2.7/site-packages/httplib2/__init__.py", line 1737, in _conn_request
    response = conn.getresponse()
  File "/usr/local/lib/python2.7/httplib.py", line 1108, in getresponse
    raise ResponseNotReady()
ResponseNotReady

2019-08-09 19:59:44 00abdbf74f9e elasticluster[1] ERROR Could not start cluster `gce`: No nodes could be started!
2019-08-09 19:59:44 00abdbf74f9e elasticluster[1] ERROR Error: No nodes could be started!
Traceback (most recent call last):
  File "/home/elasticluster/__main__.py", line 212, in main
    return self.params.func()
  File "elasticluster/subcommands.py", line 91, in __call__
    return self.execute()
  File "elasticluster/subcommands.py", line 219, in execute
    cluster.start(min_nodes, self.params.max_concurrent_requests)
  File "elasticluster/cluster.py", line 487, in start
    raise ClusterSizeError("No nodes could be started!")
ClusterSizeError: No nodes could be started!
Aborting because of errors: No nodes could be started!.




My GCE cluster configuration:

[cluster/gce]
cloud=google
login=google
setup=ansible-slurm
security_group=default
image_id=ubuntu-1604-xenial-v20181004
frontend_nodes=1
compute_nodes=1
ssh_to=frontend
# Ask for 50G of disk
boot_disk_type=pd-standard
boot_disk_size=50
flavor=n1-standard-4






Hatef Monajemi

Aug 11, 2019, 1:16:07 PM
to elasticluster
This was a credential problem. What I had to do:

gcloud auth application-default login

and then update this section of `elasticluster.sh` so that the new credentials are mounted into the container:

# set up mount commands for host directories
volumes="-v $HOME/.ssh:/home/.ssh -v $HOME/.elasticluster:/home/.elasticluster -v $HOME/.config:/home/.config"
if [ -n "$SSH_AUTH_SOCK" ]; then
    volumes="${volumes} -v $SSH_AUTH_SOCK:/home/.ssh-agent.sock"
fi
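
To check that the credentials are actually where the `$HOME/.config` mount expects them, here is a minimal sketch (not part of ElastiCluster). It assumes the Linux/macOS layout, where `gcloud auth application-default login` writes `~/.config/gcloud/application_default_credentials.json`, and it also honours the standard `GOOGLE_APPLICATION_CREDENTIALS` override. The name `find_adc_file` is hypothetical.

```python
import os


def find_adc_file(home, environ=None):
    """Return the path of an application default credentials file, or None.

    Hypothetical helper: it only locates a file and does not validate the
    credentials stored in it. Linux/macOS directory layout is assumed.
    """
    environ = os.environ if environ is None else environ
    candidates = []

    # An explicit override takes precedence over the default location.
    override = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if override:
        candidates.append(override)

    # Default location written by `gcloud auth application-default login`.
    candidates.append(
        os.path.join(home, ".config", "gcloud",
                     "application_default_credentials.json"))

    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    found = find_adc_file(os.path.expanduser("~"))
    print(found or "no application default credentials found")
```

Running it on the host before starting the container shows whether there is anything for the `-v $HOME/.config:/home/.config` mount to expose.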



Riccardo Murri

Aug 13, 2019, 12:36:32 AM
to Hatef Monajemi, elasticluster
Hello Hatef,

many thanks for investigating this issue! I think this is related to
issue #649 which you reported a while ago.

It should now be better with the latest "master" branch: the
`$HOME/.config` directory is now used automatically if found (no need
to patch the `elasticluster.sh` script); you still need to have valid
Google Cloud credentials, either as "application default credentials"
or as client ID + secret key embedded in ElastiCluster's config file.

Ciao,
R
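
The two credential options mentioned above can be told apart by inspecting the configuration file. A rough sketch in Python 3, assuming the cloud section is named `[cloud/google]` and uses the `gce_client_id` and `gce_client_secret` keys (please check these names against the documentation for your ElastiCluster version). The helper `credential_mode` is hypothetical and only mirrors the fallback visible in the log above, where invalid provided credentials lead to an attempt with application default credentials.

```python
import configparser


def credential_mode(config_text, section="cloud/google"):
    """Guess which credential source a configuration relies on.

    Returns 'client-id' when both the client ID and secret key are set in
    the given section, and 'application-default' otherwise. The section
    and key names are assumptions, not taken from this thread.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if not parser.has_section(section):
        return "application-default"
    client_id = parser.get(section, "gce_client_id", fallback="").strip()
    secret = parser.get(section, "gce_client_secret", fallback="").strip()
    if client_id and secret:
        return "client-id"
    return "application-default"


if __name__ == "__main__":
    example = "[cloud/google]\nprovider=google\ngce_project_id=my-project\n"
    print(credential_mode(example))
```

If this reports `application-default`, the credentials file created by `gcloud auth application-default login` has to be reachable from inside the container.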

Hatef Monajemi

Aug 13, 2019, 1:54:51 AM
to Riccardo Murri, elasticluster
Thank you for the fix Riccardo. Hope all is well.

Hatef
