Nomad service-to-service communication

ANKIT ROHILLA

Jun 27, 2018, 10:22:59 AM
to Nomad
Hi,
I want to understand how Nomad's communication mechanism works. That is, I want to deploy two services that talk to each other and track which IPs are communicating with each other. My driver should only be Docker.
So I want to know which container is talking to which other containers.
Is there a good example of this kind of setup where I can create two containers and have them talk to each other, so that I can inspect the traffic further?
Please point me to any link that provides job files for this. I am having trouble configuring the jobs.
Thanks

Justin DynamicD

Jun 28, 2018, 6:22:50 PM
to Nomad

So there's a bit of confusion here. Nomad is a scheduler; it doesn't actually configure any communication. Instead it simply looks at the available clients/nodes and selects an appropriate target that can host the defined job. It's up to the services themselves to discover each other using whatever tools are available. Since you mention containers: for each job, Nomad simply looks for a client capable of running the container as described, then makes sure that container stays running.

When it comes to services "discovering each other," you're really talking about what's been coined "service discovery," or its big-brother variant, "service mesh." If you're staying within the HashiCorp toolset, you need to look at Consul. It's a very solid service-discovery tool, and it just released more formal service mesh capabilities into beta. It integrates into Nomad very nicely (just point at the Consul cluster in your Nomad config file and add an ACL token if needed), and once that's done you can declare a service name per job, which sets you up to do exactly what you're looking for.
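For reference, the Nomad-side integration is just a consul block in the agent configuration, along these lines (a minimal sketch; the address and token values are placeholders for your environment):

# Nomad agent configuration fragment -- values are placeholders
consul {
  address = "127.0.0.1:8500"   # local Consul agent's HTTP address
  token   = "REPLACE_ME"       # only needed if Consul ACLs are enabled
}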


But to restate: Nomad is a "pure" scheduler; it does not directly manage network communication. Nomad clients on the local host directly call bash/docker/whatever and return the results to the Nomad servers. Any network configuration comes from the driver, which just translates the command to the host.

ANKIT ROHILLA

Jul 2, 2018, 12:00:14 AM
to Nomad
Ok, got it. Thanks.
One more question.
If I have a container running a service on a client, and that container wants to access some other service (also run as containers) in the cluster, which might be on the same client or a different one, how will that container contact Consul and get its request directed to a suitable container running that external service?
Will that service be accessed through the naming convention that Consul DNS maintains, i.e. 'service_name.service.consul'? If yes, how would the container know which IP and port Consul's DNS is running on?
I tried the above, and from inside the container it cannot resolve the service name, which means it doesn't know where the DNS is running. It can, however, connect to a client from inside the container using the client's IP (but that is not what I want).
Also, if I have to access a service from outside the cluster, does Consul expose a single endpoint for that service, similar to what a load balancer does?
Please help. I am really stuck at this point.
Please ask if anything is unclear.

Regards
Ankit R

Justin DynamicD

Jul 2, 2018, 3:52:16 PM
to Nomad
So "there is no one way" ... "all opinions are my own" ... "blah blah blah".

OK ... so there are a number of ways to handle service discovery.  As you correctly mentioned, you can leverage Consul's DNS capabilities to resolve the service name to an IP, which also happens to be the easiest method IMO, so let's run with that.

In order to do this, we first need to make sure your DNS resolves "service.consul".  As you are likely already aware, in a typical Docker configuration, Docker will use its own internal name resolution for discovery, then forward to the DNS entries configured on the host once that fails.  This means that as long as your Docker node can resolve "service.consul", so will your containers (unless they are explicitly configured to look somewhere else).

I personally borrowed from HashiCorp's "1 million container challenge" in my own design: https://github.com/hashicorp/c1m, but you don't have to do it that way.  In that design, the Docker node has Consul installed locally, plus dnsmasq installed and configured with conditional forwarding to it.  So basically anything ending in ".consul" forwards to the local Consul agent at 127.0.0.1:8600, and everything else goes through the normal DNS routes.  Again, that's me following HashiCorp's design: if you'd prefer to just set up conditional forwarders on your DNS servers, that will work perfectly fine too.
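If you go the dnsmasq route, the conditional forwarding amounts to a single directive, roughly like this (a sketch assuming Consul's DNS is answering on 127.0.0.1:8600 on the node; the file path is just an example):

# /etc/dnsmasq.d/10-consul -- example path, adjust to your distro
# Send every *.consul query to the local Consul agent's DNS port;
# everything else falls through to the normal upstream resolvers.
server=/consul/127.0.0.1#8600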

Once we know that containers can resolve "service.consul", the next step is to make sure your container publishes itself to Consul so it can be discovered.  Here Nomad can help, as you can define not only the service but also health checks right from the Nomad job using the service stanza: https://www.nomadproject.io/docs/job-specification/service.html.  I highly recommend configuring health checks as well, as this allows Consul to mark service instances as healthy or unhealthy based on their responses, which will assist with the final part of your question coming up.
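To make that concrete, here is a rough sketch of what such a job might look like (the job/service names, image, ports and health-check path are all placeholders, not a definitive example):

# Hypothetical Nomad job file (2018-era syntax); names and values are placeholders
job "wiki" {
  datacenters = ["dc1"]

  group "web" {
    task "wiki" {
      driver = "docker"

      config {
        image = "example/wiki:latest"   # placeholder image
        port_map {
          http = 8080                   # container port to expose
        }
      }

      resources {
        network {
          port "http" {}                # Nomad picks a host port and maps it
        }
      }

      # Registers the task as "wiki" in Consul (resolvable as wiki.service.consul)
      # and attaches an HTTP health check so only healthy instances are returned.
      service {
        name = "wiki"
        port = "http"
        tags = ["proxy"]

        check {
          type     = "http"
          path     = "/health"          # placeholder health endpoint
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}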

So now we have a resolvable Consul system that will respond with the IP of a known healthy instance when queried: "wiki.service.consul" will work as long as my "wiki" service is healthy somewhere.  Cool, right?  But we can also take it a step further.  You asked about load balancers, and this really gets into the next level of Consul integration: the API.  There are a couple of routes you can play with here.  Systems like Traefik and Fabio are load balancers that talk natively to Consul and dynamically update as containers come online/offline.  If you need or want to stay with more traditional load balancers like HAProxy, you can use tools like consul-template to dynamically generate and maintain their configuration.  As an example of this, I actually have this little guy stored: https://github.com/Justin-DynamicD/haproxy-consultemplate.  It's a Consul template I wrote that automatically adds to HAProxy any service in Consul that has been tagged "proxy".  It then has a few extra K/V lookup tricks that let me override the defaults.  I really should update the readme there ... it was kind of written for myself.
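The consul-template approach boils down to rendering config from whatever Consul reports as healthy; a bare-bones sketch of the pattern (not the template from the repo above, and the "wiki" backend is a placeholder) looks like:

# Hypothetical consul-template fragment for an HAProxy backend.
# For each healthy instance of the "wiki" service in Consul, render one
# "server" line; consul-template re-renders the file whenever the catalog
# changes (a reload command for HAProxy is wired up separately).
backend wiki_backend
  balance roundrobin{{ range service "wiki" }}
  server {{ .Node }}_{{ .Port }} {{ .Address }}:{{ .Port }} check{{ end }}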

I hope this helps give you some ideas on how to get things working.

ANKIT ROHILLA

Jul 3, 2018, 3:16:17 AM
to Nomad
Thanks for taking the time to reply. I have a few more questions.

" This means, as long as your docker node can resolve "service.consul" then so will your containers (unless explicitly configured to look somewhere else)."
In my case the Docker node, or what we call the client in Nomad, is able to resolve 'service.consul', i.e. when I run 'dig @127.0.0.1 -p 8600 service_name.service.consul' on the node, it lists all the clients on which the specified service is running.
But when I go inside a container and try the same thing, it is not able to resolve the name.
Am I doing it all wrong?
Please tell me the steps. I just want the containers to be able to resolve '.consul' from the inside and, if possible, to let Consul discover the container IP addresses along with the services running on them, because right now I can't find the IPs of the containers running inside the nodes. I am using conntrack and iptables to get the container IPs, but as you said, "Once we know that containers can resolve "service.consul", the next step is to make sure your container publishes itself to Consul so it can be discovered." It seems Consul can show those IPs directly, and I can use its APIs to obtain them, which is a far better approach than tracking iptables and connections with conntrack.
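For example, I am assuming something along these lines would list every healthy instance with the address and port that was registered for it (the agent address and service name are placeholders):

# Ask the local Consul agent's HTTP API for healthy instances of a service
# (assumes the default HTTP address 127.0.0.1:8500; adjust to your setup)
curl http://127.0.0.1:8500/v1/health/service/service_name?passing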

Thanks
Ankit R

Justin DynamicD

Jul 3, 2018, 2:26:09 PM
to Nomad
It depends on how you are running your container.

By default, containers run in their own network namespace and have ports PAT'd out as needed.  In this scenario, "127.0.0.1" will not hit Consul, because Consul isn't running _within_ the container, nor is the container sharing the host network.  Docker does some trickery, mounting the host's resolv.conf over the container's so it can use the same DNS resolvers.  More detail here: https://docs.docker.com/v17.09/engine/userguide/networking/default_network/configure-dns/, but the gist of it is that you basically have to treat the container a little differently than a traditional app on a host.


So essentially your test is misleading: the container itself doesn't host Consul, so it will never answer that dig (unless you run your container in host network mode).  Instead, just try a simple ping of the service name and see what happens.  If you installed dnsmasq like I mentioned, it will be in the resolv.conf and forward as needed; if you added a conditional forwarder to your DNS topology, the same result should occur.  If you want your container to query Consul directly, use the host's IP, not 127.0.0.1, to avoid the issue you're seeing.
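In other words, from inside the container the test would look more like this (a sketch; 10.0.0.5 stands in for your host's IP, and Consul must be bound to an address the container can actually reach):

# Resolve through whatever is in the container's resolv.conf
# (dnsmasq forwarding or a conditional forwarder on your DNS servers):
ping wiki.service.consul

# Or query Consul's DNS directly, using the host's IP instead of 127.0.0.1:
dig @10.0.0.5 -p 8600 wiki.service.consul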

Does that make sense?