I was thinking about this as well, but that would create a SPOF in that DC when it goes down. This is unacceptable for us: we have many DCs all around the world and we treat them as equals in the scope of our services and infrastructure design.
Well, let's take the Icinga (version 1) monitoring tool as an example. We currently depend heavily on Puppet exported resources and their built-in support for nagios_* configuration objects for this. But reconfiguring monitoring this way is starting to take ages on each of the Icinga cores, with more and more resources to collect and realize there... It's therefore my priority to find another, much faster solution for our monitoring configuration management.
Please also note we have many Icinga core instances around the world: most of the DCs share one, but some DCs host several. We need to treat the configuration as global; most of the objects are targeted at the Icinga core within the same DC, but some are targeted at another instance (or instances) elsewhere in the world.
I've ended up with the following K/V structure for Icinga configuration exports to Consul. It was crucial for me to include the source Consul DC as one of the keys, to prevent subtree overwrites by consul-replicate, and I've also included a tag which I use to realize the configuration in the right place, so it gets loaded by the correct Icinga core instance. (For simplicity, the example includes only two DCs and the nagios_host object.)
K/V structure for Icinga 1:
- global/CONSUL_DC/icinga/ICINGA_INSTANCE_TAG/host/NODE_FQDN/data/...
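As a concrete illustration of that layout (the FQDN, instance tag, and attribute subkeys here are made up, not from our real tree), a single host exported from dc1 might produce keys like:

```
global/dc1/icinga/icinga-core-dc1/host/web01.example.com/data/address = 10.1.2.3
global/dc1/icinga/icinga-core-dc1/host/web01.example.com/data/hostgroups = linux-servers
```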
consul-replicate sources:
- dc1: global/dc2@dc2
- dc2: global/dc1@dc1
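With this layout, each consul-replicate instance only pulls the foreign DC's export subtree, so the local DC's own writes are never overwritten. A minimal sketch of the dc1 side, assuming the standard -prefix flag and an agent running locally (adjust to your deployment):

```
# Running against the local (dc1) Consul agent: replicate the
# global/dc2 subtree from the dc2 datacenter into the same path locally.
consul-replicate -prefix "global/dc2@dc2"
```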
The consul-template is running on each of the Icinga cores, creating all the configuration files necessary for that instance. This includes specific prefixes and configuration object files for each of the nagios objects and each of the consul DCs too:
template {
  source      = "/etc/consul-template/templates/nagios_host_dc1.ctmpl"
  destination = "/etc/nagios/nagios_host_consul_dc1.cfg"
  perms       = 0600
}
template {
  source      = "/etc/consul-template/templates/nagios_host_dc2.ctmpl"
  destination = "/etc/nagios/nagios_host_consul_dc2.cfg"
  perms       = 0600
}
Each template is then hooked to the specific subtree within the same Consul DC and its own ICINGA_INSTANCE_TAG via the tree function:
/etc/consul-template/templates/nagios_host_dc1.ctmpl:
{{- range $node_fqdn, $data := tree "global/dc1/icinga/ICINGA_INSTANCE_TAG/host/@dc1" | byKey -}}
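Only the opening range of the template is shown above. A fuller sketch of what nagios_host_dc1.ctmpl could look like follows; note that the data/address subkey and the generic-host parent template are assumptions for illustration, not part of our actual tree, and byKey here groups the pairs under host/ by their top-level directory (the node FQDN):

```
{{- range $node_fqdn, $pairs := tree "global/dc1/icinga/ICINGA_INSTANCE_TAG/host/@dc1" | byKey }}
define host {
    host_name    {{ $node_fqdn }}
{{- range $pairs }}
{{- if eq .Key "data/address" }}
    address      {{ .Value }}
{{- end }}
{{- end }}
    use          generic-host
}
{{- end }}
```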
Now, there are many of these nagios objects being managed for each of the dozens of DCs... so it adds up to a lot of templates and config files. But Puppet is in charge here, creating the consul-template configuration dynamically, so it's not a real concern for us now.
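For the record, a hypothetical sketch of that Puppet piece (the module layout, EPP template name, and service name are all made up) would be one generated stanza per nagios object type and Consul DC:

```
# Hypothetical: generate one consul-template stanza per (object, DC) pair.
$consul_dcs   = ['dc1', 'dc2']
$nagios_types = ['host', 'service']

$consul_dcs.each |String $dc| {
  $nagios_types.each |String $type| {
    file { "/etc/consul-template.d/nagios_${type}_${dc}.hcl":
      ensure  => file,
      content => epp('icinga/consul_template_stanza.epp', {
        'type' => $type,
        'dc'   => $dc,
      }),
      notify  => Service['consul-template'],
    }
  }
}
```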