Unable to see any metrics related to Nomad

Sai Kiran

Aug 1, 2019, 11:33:53 AM
to Nomad
Hi,
I am trying to enable telemetry on a Nomad server on Linux, and I am unable to see any metrics being reported.


telemetry.hcl:
telemetry {
  publish_allocation_metrics = true
  publish_node_metrics       = true
  statsd_address             = "localhost:8127"
}


config.hcl

log_level = "DEBUG"
data_dir = "/opt/nomad"

datacenter = "dc1"
bind_addr = "0.0.0.0"

server {
  enabled = "true"
  bootstrap_expect = 3
}



Agent info:

/usr/local/bin/nomad agent -config /etc/nomad.d
==> Loaded configuration from /etc/nomad.d/nomad.hcl, /etc/nomad.d/telemetry.hcl
==> Starting Nomad agent...
==> Nomad agent configuration:

       Advertise Addrs: HTTP: 10.84.132.164:4646; RPC: 10.84.132.164:4647; Serf: 10.84.132.164:4648
            Bind Addrs: HTTP: 0.0.0.0:4646; RPC: 0.0.0.0:4647; Serf: 0.0.0.0:4648
                Client: false
             Log Level: DEBUG
                Region: global (DC: dc1)
                Server: true
               Version: 0.9.1

==> Nomad agent started! Log data will stream in below:

    2019-08-01T15:15:08.186Z [WARN ] agent.plugin_loader: skipping external plugins since plugin_dir doesn't exist: plugin_dir=/opt/nomad/plugins
    2019-08-01T15:15:08.186Z [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=/opt/nomad/plugins
    2019-08-01T15:15:08.186Z [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=/opt/nomad/plugins
    2019-08-01T15:15:08.187Z [INFO ] agent: detected plugin: name=qemu type=driver plugin_version=0.1.0
    2019-08-01T15:15:08.187Z [INFO ] agent: detected plugin: name=java type=driver plugin_version=0.1.0
    2019-08-01T15:15:08.187Z [INFO ] agent: detected plugin: name=docker type=driver plugin_version=0.1.0
    2019-08-01T15:15:08.187Z [INFO ] agent: detected plugin: name=rkt type=driver plugin_version=0.1.0
    2019-08-01T15:15:08.187Z [INFO ] agent: detected plugin: name=raw_exec type=driver plugin_version=0.1.0
    2019-08-01T15:15:08.187Z [INFO ] agent: detected plugin: name=exec type=driver plugin_version=0.1.0
    2019-08-01T15:15:08.187Z [INFO ] agent: detected plugin: name=nvidia-gpu type=device plugin_version=0.1.0
    2019-08-01T15:15:08.193Z [INFO ] nomad: raft: Initial configuration (index=1): [{Suffrage:Voter ID:10.84.132.164:4647 Address:10.84.132.164:4647} {Suffrage:Voter ID:10.84.132.179:4647 Address:10.84.132.179:4647} {Suffrage:Voter ID:10.84.133.229:4647 Address:10.84.133.229:4647}]
    2019-08-01T15:15:08.193Z [INFO ] nomad: serf: EventMemberJoin: on-prem-photon-3.global 10.84.132.164
    2019-08-01T15:15:08.193Z [INFO ] nomad: starting scheduling worker(s): num_workers=1 schedulers="[service batch system _core]"
    2019-08-01T15:15:08.196Z [INFO ] nomad: raft: Node at 10.84.132.164:4647 [Follower] entering Follower state (Leader: "")
    2019-08-01T15:15:08.196Z [INFO ] nomad: serf: Attempting re-join to previously known node: on-prem-photon-1.global: 10.84.133.229:4648
    2019-08-01T15:15:08.196Z [INFO ] nomad: adding server: server="on-prem-photon-3.global (Addr: 10.84.132.164:4647) (DC: dc1)"




I ran netcat to read from UDP port 8127 (netcat -ulp 8127) and don't see any metrics being emitted.
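
For reference, the same listening check can be sketched in Python instead of netcat. This is a minimal sketch, not part of the original post: the helper names `open_listener` and `collect_datagrams` are hypothetical, and it assumes the agent sends plain-text statsd datagrams over UDP to the IPv4 loopback address.

```python
import socket


def open_listener(host="127.0.0.1", port=8127):
    """Bind a UDP socket on host:port (hypothetical helper for illustration)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def collect_datagrams(sock, max_packets=10, timeout=5.0):
    """Return up to max_packets decoded datagrams, stopping when a read times out."""
    sock.settimeout(timeout)
    packets = []
    try:
        while len(packets) < max_packets:
            data, _addr = sock.recvfrom(65535)
            packets.append(data.decode("utf-8", errors="replace"))
    except socket.timeout:
        # No datagram arrived within the timeout; return what we have so far.
        pass
    return packets
```

Calling `collect_datagrams(open_listener(port=8127))` while the agent is running returns an empty list if nothing arrives within the timeout, which would match the empty netcat output described above. The bind call fails with `OSError` if another process already holds the port.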

On a side note, Consul telemetry is enabled on the same box with UDP port 8126, and I am able to see data coming in:
netcat -ulp 8125
consul.memberlist.gossip:0.007190|ms
consul.raft.replication.appendEntries.rpc.71d99b17-c4e7-b01c-a160-188bee152d27:0.183613|ms
consul.raft.replication.appendEntries.logs.71d99b17-c4e7-b01c-a160-188bee152d27:0.000000|c
consul.memberlist.gossip:0.007386|ms
consul.raft.replication.appendEntries.rpc.71d99b17-c4e7-b01c-a160-188bee152d27:0.196174|ms
consul.raft.replication.appendEntries.logs.71d99b17-c4e7-b01c-a160-188bee152d27:0.000000|c
consul.raft.replication.appendEntries.rpc.71d99b17-c4e7-b01c-a160-188bee152d27:0.237609|ms
consul.raft.replication.appendEntries.logs.71d99b17-c4e7-b01c-a160-188bee152d27:0.000000|c
consul.memberlist.gossip:0.006987|ms
consul.raft.replication.appendEntries.rpc.71d99b17-c4e7-b01c-a160-188bee152d27:0.217755|ms
consul.raft.replication.appendEntries.logs.71d99b17-c4e7-b01c-a160-188bee152d27:0.000000|c
consul.raft.replication.appendEntries.rpc.71d99b17-c4e7-b01c-a160-188bee152d27:0.261357|ms
consul.raft.replication.appendEntries.logs.71d99b17-c4e7-b01c-a160-188bee152d27:0.000000|c
consul.raft.replication.heartbeat.71d99b17-c4e7-b01c-a160-188bee152d27:0.236952|ms
consul.raft.replication.appendEntries.rpc.71d99b17-c4e7-b01c-a160-188bee152d27:0.189188|ms
consul.raft.replication.appendEntries.logs.71d99b17-c4e7-b01c-a160-188bee152d27:0.000000|c
consul.memberlist.gossip:0.008832|ms
consul.raft.replication.appendEntries.rpc.71d99b17-c4e7-b01c-a160-188bee152d27:0.259481|ms
consul.raft.replication.appendEntries.logs.71d99b17-c4e7-b01c-a160-188bee152d27:0.000000|c
consul.runtime.num_goroutines:105.000000|g
consul.runtime.alloc_bytes:5406504.000000|g
consul.runtime.sys_bytes:72024312.000000|g
consul.runtime.malloc_count:28696280.000000|g
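
The lines above follow the statsd wire format `name:value|type`. As a minimal sketch (the function names are hypothetical and not from the original post), captured lines can be grouped by their first dotted component to see at a glance which agents are emitting:

```python
def parse_statsd_line(line):
    """Split one statsd line into (name, value, metric_type)."""
    fields = line.strip().split("|")
    if len(fields) < 2:
        raise ValueError(f"not a statsd line: {line!r}")
    # The value follows the last ':' of the first field.
    name, sep, value = fields[0].rpartition(":")
    if not sep or not name:
        raise ValueError(f"missing name or value: {line!r}")
    # Any fields after the type (for example sample rates) are ignored here.
    return name, float(value), fields[1]


def count_by_prefix(lines):
    """Count metrics per leading name component, e.g. 'consul'."""
    counts = {}
    for line in lines:
        name, _value, _metric_type = parse_statsd_line(line)
        prefix = name.split(".", 1)[0]
        counts[prefix] = counts.get(prefix, 0) + 1
    return counts
```

Applied to the capture above, every line falls under the `consul` prefix; a working Nomad setup would presumably also produce lines under a `nomad` prefix on its own port.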



Any help is appreciated.

Lowe Schmidt

Aug 1, 2019, 4:20:28 PM
to Sai Kiran, Nomad
You're looking at the wrong port and/or have configured the wrong port.
 
statsd_address = "localhost:8127"
netcat -ulp 8125
--
Lowe Schmidt | +46 723 867 157


--
This mailing list is governed under the HashiCorp Community Guidelines - https://www.hashicorp.com/community-guidelines.html. Behavior in violation of those guidelines may result in your removal from this mailing list.
 
GitHub Issues: https://github.com/hashicorp/nomad/issues
IRC: #nomad-tool on Freenode

Sai Kiran

Aug 1, 2019, 5:06:12 PM
to Nomad
Sorry, that was a typo here.

Consul's port is 8125, Vault's port is 8126, and Nomad's port is 8127. I see telemetry only for Consul and not for Vault or Nomad.


Mahmood Ali

Aug 1, 2019, 9:47:34 PM
to Sai Kiran, Nomad
Hi Sai,

I just reproduced this, and it does look like a bug affecting statsd_address handling.  Using `datadog_address` (which uses the same statsd protocol with minor differences) causes it to report metrics as expected.  Can you try testing with `datadog_address` as well?

Please open a GitHub issue at https://github.com/hashicorp/nomad/issues/new and we'll follow up. Thanks!

- Mahmood


Mahmood Ali

Aug 1, 2019, 10:03:47 PM
to Sai Kiran, Nomad
Also, can you please test with `statsd_address = "127.0.0.1:8127"`? It's possible that localhost is mapping to the [::1] IPv6 address while the statsd listener is bound to the 127.0.0.1 IPv4 address only.

Thanks,
- Mahmood
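
One way to check the localhost hypothesis above is to ask the system resolver what `localhost` maps to. The following is a minimal sketch; `resolve_udp_targets` is a hypothetical helper name, and the result reflects only the local resolver, not necessarily the address the Nomad agent ends up using.

```python
import socket


def resolve_udp_targets(host, port):
    """List (family_label, address) pairs the resolver returns for a UDP target."""
    labels = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_DGRAM)
    results = []
    for family, _socktype, _proto, _canonname, sockaddr in infos:
        # sockaddr[0] is the address for both IPv4 and IPv6 results.
        results.append((labels.get(family, str(family)), sockaddr[0]))
    return results
```

If `resolve_udp_targets("localhost", 8127)` lists an IPv6 entry such as `::1` first, the explanation above is at least plausible on that host.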

Sai Kiran

Aug 2, 2019, 3:01:12 PM
to Nomad
Hi Mahmood, 
Thanks for getting back to me.

I tried statsd_address with 127.0.0.1 and couldn't see any metrics.

As you suggested, I also tried datadog_address, and that didn't work either.

Consul.json
{
  "telemetry": {
    "dogstatsd_addr": "127.0.0.1:8125",
    "disable_hostname": true
  }
}

Nomad telemetry.hcl

telemetry {
  publish_allocation_metrics = true
  publish_node_metrics       = true
  datadog_address            = "127.0.0.1:8127"
  collection_interval        = "10s"
  disable_hostname           = true
}

Vault telemetry.hcl

{
  "telemetry": {
    "dogstatsd_addr": "127.0.0.1:8126",
    "disable_hostname": true
  }
}

Sébastien Portebois

Aug 2, 2019, 4:23:02 PM
to Mahmood Ali, Sai Kiran, Nomad
Hi 

I can confirm it works using datadog_address on the same port.
We have been successfully collecting metrics this way on many clusters for a while (running the Datadog agent, not in a container):

Our Nomad config:
telemetry {
  publish_allocation_metrics = true
  publish_node_metrics       = true
  datadog_address            = "localhost:8125"
  disable_hostname           = true
}

And in our datadog agent config:
dogstatsd_port: 8125

I agree that the statsd_address would make sense. In our case we didn’t run into the issue because we followed Datadog documentation about Nomad integration, which explicitly documents the datadog_address property:


Sébastien





Sai Kiran

Aug 2, 2019, 4:59:56 PM
to Nomad
Hi Sébastien
We are using Telegraf; I suppose this should work regardless of the collector agent (Datadog/collectd/Telegraf).

As I mentioned above, my problem is that only one of the HashiCorp agents (Consul, Vault, and Nomad are all on the same hosts) is emitting metrics.

I first tried statsd_address, and after the suggestion above I tried datadog_address with 127.0.0.1, and still couldn't see any metrics being emitted.
