nomad "client: failed to register node: rpc error: EOF"


er...@coms.io

Dec 23, 2015, 8:12:08 PM
to Nomad
Greetings!

I'm running NodeFabric across three AZs at AWS us-east. They are registered at atlas, and all services are happy.

I've installed the Nomad agent on each of those boxes and started them as servers. They appear happy on the console. Although I told them to register with Atlas, I don't see any evidence of that in Atlas. That said, I see no errors reported either.

I created an Ubuntu server and installed docker, consul, and nomad. I've started the consul agent as a client on this box, and:

ubuntu@ip-10-3-32-133:~$ sudo consul members
    2015/12/24 00:46:43 [INFO] agent.rpc: Accepted client: 127.0.0.1:49671
Node            Address           Status  Type    Build  Protocol  DC
docker.1        10.3.32.133:8301  alive   client  0.6.0  2         dc1
ip-10-3-10-100  10.3.10.100:8301  alive   server  0.5.2  2         dc1
ip-10-3-12-100  10.3.12.100:8301  alive   server  0.5.2  2         dc1
ip-10-3-14-100  10.3.14.100:8301  alive   server  0.5.2  2         dc1

Back in the Consul WebUI, I now show "Docker.1" as a Node with 0 Services reporting. The agent reports alive and reachable in the Serf Health Status.

Now I'm starting to get excited...

I fire up nomad in client mode. At DEBUG level, I see fingerprinting confirm consul, then complain about reading the interface speed and fall back to the default. It reports finding docker, but complains that privileged containers are disabled.

Finally, it shows, "client: available drivers [docker exec]"

It sits for a few seconds and reports:

[ERR] client: failed to register node: rpc error: EOF

Simultaneously, over on the Nomad server console, I see:

[ERR] memberlist: Received invalid msgType (3) from=10.3.32.133:37173

Nomad client and server configs follow. Many thanks in advance; I'm hoping this is painfully obvious.


Server Config example (currently manually wired on each box):

[ec2-user@ip-10-3-10-100 ~]$ cat nomadServer_b.conf
bind_addr = "0.0.0.0"
data_dir = "/var/lib/nomad"

advertise {
  # We need to specify our host's IP because we can't
  # advertise 0.0.0.0 to other nodes in our cluster.
  rpc = "10.3.10.100:4647"
  serf = "10.3.10.100:4648"
}

server {
  enabled = true
  bootstrap_expect = 3
}

atlas {
  infrastructure = REDACTED
  token = REDACTED
  join = true
}



Nomad Client Config:

ubuntu@ip-10-3-32-133:~$ cat nomadClient_32.133.conf
bind_addr = "10.3.32.133"
data_dir = "/var/lib/nomad"
log_level = "DEBUG"

advertise {
  # We need to specify our host's IP because we can't
  # advertise 0.0.0.0 to other nodes in our cluster.
  rpc = "10.3.32.133:4647"
  serf = "10.3.32.133:4648"
}

client {
  enabled = true
  encrypt = "REDACTED"
  node_class = "worker.small"
  network_speed = 1000
}

er...@coms.io

Dec 24, 2015, 12:03:11 AM
to Nomad
Disregard. I had the wrong ports set for the servers on the client side.
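
For anyone hitting the same pair of errors: the corrected setting isn't shown in the client config posted earlier, so here is a minimal sketch of what the client block might look like. It assumes the servers take RPC traffic on port 4647, as advertised in the server config, and that the other two servers (10.3.12.100 and 10.3.14.100, taken from the consul members output) advertise the same port. The memberlist "invalid msgType" error on the server would be consistent with a client dialing the Serf port (4648) instead of the RPC port, though this thread does not confirm which port was actually wrong.

```hcl
client {
  enabled = true

  # Assumption: every server accepts RPC on 4647, matching the
  # advertise block in the server config. Clients should point at
  # the RPC port, not the Serf port (4648).
  servers = [
    "10.3.10.100:4647",
    "10.3.12.100:4647",
    "10.3.14.100:4647"
  ]
}
```

The remaining client settings (node_class, network_speed, and so on) would stay as posted; only the servers list is sketched here.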