[DNS Interface] Why not make "node" and "service" an optional part of the FQDN


XavM

May 3, 2014, 7:20:36 PM
to consu...@googlegroups.com
Hello,

Just diving into Consul for a few days, but it already looks really promising.

Regarding the DNS interface, there may be good reasons I don't see yet, but why are "node" and "service" mandatory parts of the FQDN?

  dig <node>.node.<datacenter>.<domain>
  dig <tag>.<service>.service.<datacenter>.<domain>

Having them optional, like datacenter and tag, would make Consul much easier to use as the network's only DNS server, with no "recursor" (assuming the appropriate "domain" and "ports" configuration keys are set).
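Something like this agent configuration is what I have in mind (a sketch from memory, untested):

  {
    "domain": "mydomain.com",
    "ports": {
      "dns": 53
    }
  }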

Specifying them would then be needed only for disambiguation, when a node and a service share the same name.

Regards,

Xavier

Armon Dadgar

May 4, 2014, 4:49:03 PM
to XavM, consu...@googlegroups.com
Hey Xavier,

There are two primary reasons. The first is that we use the “.node.” and “.service.” to
disambiguate the query. Without that, we would need to do 2 queries to determine if it
is a node lookup or a service lookup. In the case of a name conflict, there is no sane
way to respond to the query either. The other reason was for human readability. It is
immediately clear when you see a Consul FQDN what is being resolved.
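For example, suppose the labels were optional, the domain is the default "consul", and both a node and a service are named "web":

  dig web.consul          # ambiguous: node or service? needs 2 lookups
  dig web.service.consul  # unambiguous: always a service lookup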

I’m not sure how it affects usability without a recursor, though. Maybe you can clarify?

Best Regards,
Armon Dadgar

XavM

May 4, 2014, 5:57:28 PM
to consu...@googlegroups.com, XavM
Hey Armon,

I was evaluating the possibility of using Consul for what it does (service discovery, health checks, and KV), plus as the main/only DNS server on the network:

- Several dedicated hosts (containers) running a Consul agent in server mode, forming one cluster
- Consul "domain" set to "mydomain.com", and "ports.dns" set to 53
- Each host with resolv.conf set to 127.0.0.1, running a Consul agent in client mode and joining the cluster
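Concretely, on each host (commands are a rough sketch, untested):

  # point the host's resolver at the local Consul agent
  echo "nameserver 127.0.0.1" > /etc/resolv.conf

  # run the agent in client mode and join the cluster (placeholder address)
  consul agent -data-dir=/var/consul -join=10.0.0.1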

When doing so, the real hostname of each host would be different from the name that is resolvable via the A records of the Consul DNS interface:

  server1.mydomain.com would not resolve to anything
  server1.node.mydomain.com is what anyone looking up the IP of server1.mydomain.com would have to use
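In other words, with the setup above:

  dig @127.0.0.1 server1.mydomain.com       # NXDOMAIN
  dig @127.0.0.1 server1.node.mydomain.com  # returns the A record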

Would it make sense to make the "node" part optional and keep "service" mandatory for performance and disambiguation? (When not provided, the lookup would default to "node", just as <datacenter> defaults to the local datacenter when not specified.)

(My point is that when you don't have thousands of DNS requests per second, you don't always need BIND, and the fewer moving parts you have, the safer you feel. Taking BIND out of the loop completely and replacing it with Consul could be an option in my situation, and maybe for others.)

Xavier

Armon Dadgar

May 4, 2014, 6:49:59 PM
to XavM, consu...@googlegroups.com, XavM
We can potentially make the node optional as you said, and require that the “.service.” portion
remain in the FQDN. That way the query remains unambiguous.

That said, I wouldn’t necessarily recommend using Consul to serve public DNS records,
since BIND would probably do a much better job of it. It is also probably safer to forward the
appropriate queries to a private Consul behind BIND.
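That forwarding would look something like this in named.conf (off the top of my head, assuming the default "consul" domain and Consul's default DNS port 8600 on the same host):

  zone "consul" {
    type forward;
    forward only;
    forwarders { 127.0.0.1 port 8600; };
  };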

XavM

May 5, 2014, 6:43:37 AM
to consu...@googlegroups.com, XavM
Great,

I have just created the corresponding issue on GitHub for traceability: https://github.com/hashicorp/consul/issues/126

Regards,

Xavier

Gurminder Gill

Jun 26, 2014, 1:31:14 AM
to consu...@googlegroups.com, mail...@gmail.com
Armon,

Are you suggesting running named in front of Consul on every host? If not, we lose the benefits of local lookups + P2P gossip.

Alternatively, an HTTP API call equivalent to the dig lookup would be great. Right now, the closest one seems to be /v1/health/service/<service>?passing, whose reply is bloated for a simple lookup.
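For example, with a hypothetical service named "web":

  curl "http://127.0.0.1:8500/v1/health/service/web?passing"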

Thanks,
Gurminder

Armon Dadgar

Jun 27, 2014, 1:09:58 AM
to Gurminder Gill, consu...@googlegroups.com, Xavier M !!
Gurminder,

Could you clarify what you mean? I don't suggest running named in front of Consul on every host.
What I meant is that I do not recommend serving public DNS records via Consul directly. Using it
for internal DNS directly is fine.

Generally, BIND/named is run on a few hosts in the cluster, not on every node. I don't see any reason to
run it on each node.

The HTTP API you referenced is effectively what Consul runs internally to service DNS requests.

Best Regards,
Armon Dadgar

Gurminder Gill

Jun 27, 2014, 8:12:24 PM
to Armon Dadgar, consu...@googlegroups.com, Xavier M !!
Hi Armon,

I was under the impression that the Consul client stores service discovery data locally and stays consistent via P2P gossip, so I didn't want to put the Consul client behind a dedicated BIND (extra hops). Also, the recommendation for public DNS queries was not to rely on the recursion option within the client, which means I would need additional components (BIND/dnsmasq) per host for best performance. Overall, I was a bit lost on the optimal client-side setup.

Now it seems that the client is simply a wrapper for service discovery and requires a round trip to a server for every lookup. For this scenario, running a dedicated BIND + Consul client is OK per your recommendation.

I am a little surprised that Consul is not syncing service discovery data using gossip; it's a CP system. The fact that people use TTLs means eventual consistency is OK for this use case. Such a system would effectively give you a dynamic short TTL with local-lookup performance and HA, which really shines when running elastic services on many containers. Thoughts?

I love the depth of features that Consul provides; running those on top of Serf's AP-style system would have been nice.

Independent of that, I am still not sure about the optimal client setup. In your experiments, what works best at scale?

Option 1) Run our own dedicated BIND servers + Consul. Extra pain, SPOF, bottleneck. Note that we currently rely on a dedicated BIND from our cloud provider.
Option 2) Run dnsmasq + Consul on each host. Extra component; the Consul client becomes a redundant hop.
Option 3) Skip the Consul client and only run dnsmasq on each host. Seems most reasonable, especially if TTLs are used.
Option 4) Use the HTTP API. But the optimal call is marked internal.
Option 5) ???

Thanks,
Gurminder

Armon Dadgar

Jun 28, 2014, 10:03:58 PM
to Gurminder Gill, consu...@googlegroups.com, Xavier M !!
Gurminder,

Consul is a CP system: the servers contain all the state, and the clients do RPCs to the servers
for all requests. With this model, the clients don’t actually hold any data. The P2P gossip
is used to track the location of the servers. There are a number of trade-offs here, but the CP
architecture enables a much richer feature set than an AP system. We went this route after
having built Serf and dealing with the limitations of AP. (Here is more: http://www.consul.io/intro/vs/serf.html)

If you want more of an AP like system, Serf is always available too!

That said, there are 2 deployment styles we recommend (both sketched below):

1) Consul client on all nodes, with DNS recursor configured. All DNS queries are serviced
by the local Consul agent, and the recursor is used for non-Consul queries. This is simple to
configure, simple to deploy.

2) Consul client on all nodes, with central BIND servers. In this case, assume there are 3
Consul servers. On each server node, you also run BIND. BIND forwards Consul requests
to the local agent, and recurses for non-Consul requests. Since you are running N=3 BINDs,
you don’t have a SPOF (all three name servers are in /etc/resolv.conf). This is a slightly more
complex deployment, but works very well since BIND enables DNS caching and other features
that Consul does not support.
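For reference, a rough sketch of both (addresses are placeholders):

  # Style 1: agent config with a recursor for non-Consul queries
  { "recursor": "8.8.8.8" }

  # Style 2: /etc/resolv.conf on every node lists the three BIND servers
  nameserver 10.0.0.1
  nameserver 10.0.0.2
  nameserver 10.0.0.3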

These two options both work very well and offer a high level of reliability. Both are fairly
straightforward to set up as well.

Hope that helps!

Best Regards,
Armon Dadgar

Gurminder Gill

Jun 29, 2014, 1:39:01 PM
to Armon Dadgar, consu...@googlegroups.com, Xavier M !!
Hi Armon,

Thank you for the detailed guidance. It helps.

Best Regards,
Gurminder

Jason Barnes

Jul 25, 2014, 10:49:34 AM
to consu...@googlegroups.com, gen...@gmail.com, mail...@gmail.com
Armon, 

It is my understanding that, with the following setup, you are only able to specify one host as the DNS recursor?

Are there plans to allow multiple hosts for redundancy? Or is there something I have overlooked, or another method that would allow multiple hosts for redundancy?

1) Consul client on all nodes, with DNS recursor configured. All DNS queries are serviced
by the local Consul agent, and the recursor is used for non-Consul queries. This is simple to
configure, simple to deploy.

Thanks
Jason

Armon Dadgar

Jul 25, 2014, 10:53:40 AM
to consu...@googlegroups.com, Jason Barnes, mail...@gmail.com, gen...@gmail.com
Hey Jason,

Currently only a single recursor is supported. We could support multiple recursors in the future,
but for now, the simplest solution is to run a local dnsmasq or alternative recursor and point
Consul at that.
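That would look roughly like this (a sketch; dnsmasq provides the upstream redundancy, Consul only needs the one local recursor):

  # /etc/dnsmasq.conf: multiple upstream resolvers for redundancy
  server=8.8.8.8
  server=8.8.4.4

  # Consul agent config: point the single recursor at the local dnsmasq
  { "recursor": "127.0.0.1" }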

Best Regards,
Armon Dadgar

Jason Barnes

Jul 25, 2014, 1:55:29 PM
to consu...@googlegroups.com, barnes...@gmail.com, mail...@gmail.com, gen...@gmail.com
Thank you for the quick response!

