We are in the process of using Consul to monitor external services. We initially set up Consul in the traditional way, with three Consul servers and three Consul agents. The agents monitored the external services, with checks being performed via the HTTP API. All of this seems to work fine, except for one issue: when an agent node suddenly goes down or loses its connectivity to the server nodes, the service status is incorrectly shown as failing on the Consul server.
Here is what we have done so far: a) Spun up a new instance and registered it with Consul as an external node, together with the services and checks to be monitored. b) Set up the consul_esm daemon on a Consul agent node. The checks appear to run and pass against the remote node. As I understand it, multiple ESM daemons can be set up, and they ensure that one of them acts as leader to execute the checks. If the ESM daemon is unable to reach the node for a specified period of time, the services are deregistered. But if the node itself is down, the service is marked as failing, which is identical to the normal Consul setup.
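For anyone trying to reproduce step (a), here is a minimal sketch of what registering an external node through the catalog HTTP API can look like. It assumes a Consul agent or server reachable on `http://127.0.0.1:8500` with no ACL token required. The helper names, node name, address, service name, and health URL are hypothetical placeholders, not our actual values. The `external-node` and `external-probe` metadata keys are, as far as I understand, what the ESM daemon uses to decide which nodes to watch.

```python
import json
import urllib.request


def build_external_node_payload(node, address, service_name, port, check_url, interval="30s"):
    """Build a /v1/catalog/register body for a node that runs no Consul agent.

    All argument values are illustrative; adjust them to the real environment.
    """
    service_id = f"{service_name}-1"  # hypothetical ID scheme
    return {
        "Node": node,
        "Address": address,
        # Metadata that marks the node as external and asks ESM to probe it.
        "NodeMeta": {"external-node": "true", "external-probe": "true"},
        "Service": {"ID": service_id, "Service": service_name, "Port": port},
        "Checks": [
            {
                "Name": f"{service_name} http check",
                "Status": "passing",  # initial status until the first check runs
                "ServiceID": service_id,
                "Definition": {"HTTP": check_url, "Interval": interval, "Timeout": "5s"},
            }
        ],
    }


def register(payload, consul_addr="http://127.0.0.1:8500"):
    """PUT the payload to the catalog; returns True on HTTP 200.

    consul_addr is an assumption; add an X-Consul-Token header if ACLs are on.
    """
    req = urllib.request.Request(
        f"{consul_addr}/v1/catalog/register",
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status == 200
```

Calling `register(build_external_node_payload("ext-web-1", "10.0.0.50", "web", 80, "http://10.0.0.50/health"))` would then create the node, service, and check in one request. Because the entry is written straight to the catalog, no local agent owns the check, which is why something like the ESM daemon has to run it.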
I am a bit confused about what value consul_esm adds over the traditional Consul setup, where agent nodes perform the checks and communicate with the Consul servers directly, as opposed to going via the ESM daemon.
The main question is: has anyone implemented consul_esm on a large scale? Are there any architecture recommendations for this kind of setup? Is there any detailed discussion or documentation about such a use case, that is, using Consul more for (external) service monitoring than as a service discovery tool?