Consul is designed for one agent per host - that's why we made the
deterministic host-based IDs (a Nomad agent running on the same host
will pick the same ID for itself, which makes it nice for correlating
things between the two, for example). Folks using Docker often use
--net=host or bind Consul to an address on the bridge network. This
post from the community takes an interesting alternative to those
two approaches - https://medium.com/zendesk-engineering/making-docker-and-consul-get-along-5fceda1d52b9. It weighs two options it ultimately rejects:
Installing a Consul agent per container. Consul’s architecture anticipates a single agent per host IP address, and in most environments a Docker host has a single network-accessible IP address. Running more than one Consul agent per host would cause multiple agents to join the Consul network and claim responsibility for the host, causing major instability in the cluster.
Binding Consul to the Docker bridge IP address. The routing would work properly, but: (a) typically, bridge interfaces are assigned dynamically by Docker; (b) there may be more than one bridge interface; (c) containers would have to know the IP address of the selected bridge interface; and (d) the Consul agent and dnsmasq (discussed below) would not be able to start until the Docker engine has started. We don’t want to create any unnecessary dependencies.
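As a quick sketch of the host-networking option mentioned above, assuming the official `consul` Docker image (the container name and bootstrap flags here are illustrative):

```shell
# Share the host's network namespace so the agent binds the host IP
# directly - one Consul agent per host, as Consul expects.
docker run -d --name=consul --net=host \
  consul agent -server -bootstrap-expect=1 -client=0.0.0.0
```

With host networking there is no port mapping to manage, and other containers on the same host can reach the agent at the host's own address.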
If your architecture is working for you then it's hard to say it's
wrong :-) It doesn't map super cleanly into Consul, though, to have
multiple agents running on the same host. You have to be careful to
give them unique node names and addresses even though they are on
the same node, and Consul's gossip protocol has every agent probing
the others to detect failed hosts, so co-located agents end up
probing each other, which doesn't make much sense. I'd definitely
recommend trying to get to one Consul agent per host if you can.
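If you do keep multiple agents on one host anyway, everything that identifies an agent or opens a socket must be unique per agent. A hypothetical setup for a second (client) agent might look like this - the file path and port numbers are chosen arbitrarily; the keys are Consul's standard config options:

```shell
# Write a config file giving the second agent its own identity,
# data directory, and non-default ports, then start it.
cat > /etc/consul-agent2.json <<'EOF'
{
  "node_name": "agent-2",
  "data_dir": "/opt/consul/agent-2",
  "ports": { "http": 8501, "dns": 8601, "serf_lan": 8311 }
}
EOF
consul agent -config-file=/etc/consul-agent2.json
```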
As far as the host-based IDs are concerned,
https://www.consul.io/docs/agent/options.html#_disable_host_node_id
should be all you need. When the agent starts up it will generate its
UUID and save it off into Consul's data directory in a file called
"node-id". Is it possible that the process that makes your Consul
container starts it up so that all your containers are getting the
same ID?
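A minimal sketch of both fixes, assuming the agent's data directory is /consul/data (the path is illustrative):

```shell
# If a node-id file was accidentally baked into the image, remove it
# so the agent generates a fresh ID on first start:
rm -f /consul/data/node-id

# Or opt out of host-derived IDs entirely and let the agent pick a
# random UUID each time it initializes:
consul agent -disable-host-node-id -data-dir=/consul/data
```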
Besides that, monitoring the host is the same concern as monitoring a VM: you either want to monitor the host or the application container, no matter whether it's a VM or LXC. Still, I would never consider Consul for either of those tasks; I'd rather use Datadog, Icinga, Monit, Munin, and so on. So the monitoring aspect of Consul is irrelevant to me; it's just not the right tool for that.
In the end it comes down to the question you raised, James: does Consul want to be purely an infrastructure element, or also application-aware?
Considering all the features Consul offers:
- KV
- services, not only nodes
- watches to generate configuration
Using Consul only as an infrastructure element would not do it justice; it would degrade it to much less than it actually is.
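The watch feature mentioned above can be sketched with Consul's watch command; the handler script name here is hypothetical:

```shell
# Run ./regen-config.sh with the current health state of the "web"
# service passed as JSON on stdin whenever that service changes:
consul watch -type=service -service=web ./regen-config.sh
```

The handler can then render whatever configuration file the application needs and reload it.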
That said, Consul should see a container as its host, no matter what it is running on. All it should care about is that this container/node is up, along with its services. Whether below it sits a VM, and below the VM a physical host, simply should not matter.
It should also not matter whether a cluster is spread across different physical hosts, across VMs on the same host, or across containers.
That is of course my POV; I'm sure I don't see the whole global picture.
I really like the controversy of this discussion; it sheds light on corners you usually don't look into.
Also, thank you a lot, James, for sharing the insight into the mission HashiCorp is currently driving behind Consul; very valuable.