basic example runs, but registers and deregisters service with consul

ja...@fpcomplete.com

Apr 10, 2016, 12:20:09 AM
to Nomad
Hello!

I am a bit puzzled by Nomad's behavior here.

I define a super basic job which aims to run redis:

```
job "redis" {
    datacenters = ["us-west-2"]
    type = "service"
    # Create a redis server using the docker image
    task "redis" {
        driver = "docker"
        config {
            image = "redis"
            port_map {
                redis = 6379
            }
        }
        service {
            name = "redis"
            port = "redis"
            check {
                type = "tcp"
                interval = "30s"
                timeout = "5s"
            }
        }
        resources {
            cpu = 20
            memory = 2048
            network {
                mbits = 100
                port "redis" {
                    # static = 10099
                }
            }
        }
    }
}
```

That uses the `redis` docker image, lets nomad manage the port, maps the port from nomad to 6379 in the running redis container, and uses the consul cluster I have for service registration. I also tested this whole spiel with a static host port (rather than letting nomad handle the dynamic port allocation).
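For reference, a minimal sketch of the static-port variant mentioned above, assuming it differs from the job file only in uncommenting the `static` line (the port number 10099 is the one from the commented-out line; nothing else is changed):

```hcl
# Sketch only: the resources block from the job above with the
# static host port enabled instead of a Nomad-assigned dynamic port.
resources {
    cpu = 20
    memory = 2048
    network {
        mbits = 100
        port "redis" {
            # Assumption: same value as the commented-out line in the job file.
            static = 10099
        }
    }
}
```

With this variant the host port is fixed, while `port_map` still maps it to 6379 inside the container.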

I run the job:

```
# nomad run redis.hcl
==> Monitoring evaluation "37958c7e"
    Evaluation triggered by job "redis"
    Allocation "60ec2622" created: node "7aa5f0bf", group "redis"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "37958c7e" finished with status "complete"
```

Check status:

```
# nomad status redis
ID          = redis
Name        = redis
Type        = service
Priority    = 50
Datacenters = us-west-2
Status      = running
Periodic    = false

==> Evaluations
ID        Priority  Triggered By    Status
37958c7e  50        job-register    complete
08101e45  50        job-deregister  complete
f2bed523  50        job-register    complete

==> Allocations
ID        Eval ID   Node ID   Task Group  Desired  Status
60ec2622  37958c7e  7aa5f0bf  redis       run      pending
1c78fed8  f2bed523  7aa5f0bf  redis       stop     dead
```

Looking at the task with `nomad alloc-status`, I can see the IP/port:

```
==> Task Resources
Task: "redis"
CPU  Memory MB  Disk MB  IOPS  Addresses
20   2048       300      0     redis: 10.10.10.10:22290
```

I can look up that service in consul:

```
# dig redis.service.consul | grep redis
; <<>> DiG 9.9.5-3ubuntu0.7-Ubuntu <<>> redis.service.consul
;redis.service.consul.          IN      A
redis.service.consul.   0       IN      A       10.10.10.10
```

But then the nomad server logs show this service as being unregistered:

```
    2016/04/10 03:59:21 [DEBUG] http: Request /v1/allocation/60ec2622-1d09-01e5-e8bd-8012b5a11744 (1.432872ms)
    2016/04/10 03:59:21 [DEBUG] client: state changed, updating node.
    2016/04/10 03:59:21 [DEBUG] client: node registration complete
    2016/04/10 03:59:36 [DEBUG] client: state changed, updating node.
    2016/04/10 03:59:36 [DEBUG] client: node registration complete
    2016/04/10 03:59:46 [INFO] consul: perform sync, deregistering service redis with consul
    2016/04/10 03:59:51 [DEBUG] client: state changed, updating node.
    2016/04/10 03:59:51 [DEBUG] client: node registration complete
```

Sure enough, it's not in the DNS catalog:

```
# dig redis.service.consul | grep redis
; <<>> DiG 9.9.5-3ubuntu0.7-Ubuntu <<>> redis.service.consul
;redis.service.consul.          IN      A
```

If I wait a few seconds, it'll come back:

```
# dig redis.service.consul | grep redis
; <<>> DiG 9.9.5-3ubuntu0.7-Ubuntu <<>> redis.service.consul
;redis.service.consul.          IN      A
redis.service.consul.   0       IN      A       10.10.10.10
```

The service registration is flapping, though I can confirm being able to connect to the service:

```
# nc -v 10.10.10.10 10002
Connection to 10.10.10.10 10002 port [tcp/*] succeeded!
```

(This works the same whether the job uses static or dynamic port allocation.)

Diptanu Choudhury

Apr 10, 2016, 5:15:55 AM
to ja...@fpcomplete.com, Nomad
Hi Jason,

What does the Nomad client configuration look like? It is expected that the Nomad client is configured to talk to a local Consul agent running alongside on the same node.

I have seen this happen when the Nomad client is configured to talk to the Consul server and not the local Consul agent. When that happens, Nomad clients deregister services that other clients might be running.
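As a rough illustration of the setup described above, here is a sketch of a Nomad client configuration that points at a Consul agent on the same node. The option name is an assumption that depends on the Nomad version: releases from around the time of this thread are believed to read `consul.address` from the client `options` block, while later releases use a top-level `consul` block with an `address` field. Check the documentation for the version in use.

```hcl
# Sketch only, not a complete agent configuration.
client {
    enabled = true
    options {
        # Assumption: the local Consul agent listens on its default
        # HTTP address. Point at the agent on this node, not at the
        # Consul servers (e.g. not consul.service.consul:8500).
        "consul.address" = "127.0.0.1:8500"
    }
}
```

Because each Nomad client then only syncs services against its own node's Consul agent, it should no longer see, or remove, services registered by other clients.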

--
This mailing list is governed under the HashiCorp Community Guidelines - https://www.hashicorp.com/community-guidelines.html. Behavior in violation of those guidelines may result in your removal from this mailing list.
 
GitHub Issues: https://github.com/hashicorp/nomad/issues
IRC: #nomad-tool on Freenode
---
You received this message because you are subscribed to the Google Groups "Nomad" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nomad-tool+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nomad-tool/6710b814-bef3-4fce-a11e-19bea2c76ebb%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Thanks,
Diptanu Choudhury

J Boyer

Apr 10, 2016, 10:22:03 AM
to Nomad


On 04/10/2016 05:15 AM, Diptanu Choudhury wrote:
> Hi Jason,
>
> What does the Nomad client configuration look like? It is expected
> that the Nomad client is configured to talk to a local Consul agent
> running alongside on the same node.
>
> I have seen this happen when the Nomad client is configured to talk to
> the Consul Server and not the local Consul Agent. When that happens,
> nomad clients de-registers services that other clients might be running.

You are 100% correct, I had the nomad agents pointed at the consul
servers via `consul.service.consul:8500`. Is there a reason behind that
design? Thanks for the guidance here!

Diptanu Choudhury

Apr 10, 2016, 1:56:05 PM
to J Boyer, Nomad
It's more efficient to do this on the clients than on the server: the clients can precisely add or remove services based on the state of a task. Also, pushing this responsibility to the clients makes it scale better when a Nomad cluster is running tens of thousands of services.


ja...@fpcomplete.com

Apr 10, 2016, 6:34:50 PM
to Nomad


On Sunday, April 10, 2016 at 1:56:05 PM UTC-4, Diptanu Choudhury wrote:
> It's more efficient to do this on the clients than the server, the clients can precisely add or remove services based on the state of a Task. Also, pushing this responsibility to the clients makes it scale better when a Nomad cluster is running 10s of 1000s of services.


That makes a lot of sense, thanks for sharing!