Hello,
I am using Ansible to deploy Kubernetes to bare metal with the playbooks from https://github.com/kubernetes/contrib/tree/master/ansible and am having problems with DNS resolution from pods that are not running on the master node.
Here are some details on the setup:
Docker: Docker version 1.13.1, build 092cba3
Flanneld: Flanneld version 0.5.5
Kubernetes: Kubernetes v1.4.5
OS: Linux version 4.4.0-59-generic (buildd@lgw01-11) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #80-Ubuntu SMP Fri Jan 6 17:47:47 UTC 2017
Testing with the instructions from https://kubernetes.io/docs/admin/dns/ in a pod running on the master node yields:
Server: 10.254.0.10
Address 1: 10.254.0.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes.default
Address 1: 10.254.0.1 kubernetes.default.svc.cluster.local
while testing on a different node yields:
Server: 10.254.0.10
Address 1: 10.254.0.10
nslookup: can't resolve 'kubernetes.default'
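To take nslookup out of the picture, the same lookup can be repeated from a pod on each node with a short Python script. This is only a sketch and not part of the playbooks: the helper name resolve is made up for illustration, and it assumes the pod image has Python 3 available.

```python
import socket


def resolve(name):
    """Return the sorted unique addresses for name, or [] if the lookup fails."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        # Same situation nslookup reports as "can't resolve".
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Names taken from the nslookup test and the healthz probe.
    for name in ("kubernetes.default", "kubernetes.default.svc.cluster.local"):
        addrs = resolve(name)
        print(name, "->", addrs if addrs else "can't resolve")
```

Running it in a pod on the master and in a pod on another node should show whether the failure depends on the node rather than on the lookup tool.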
I have also used the Vagrant scripts provided by the repository, using ubuntu16 with libvirt, and see behaviour consistent with that of the physical machines.
I should also note that I tested the Vagrant script using centos7 (which installed Kubernetes 1.4.0), and that appeared to work; however, I wasn't able to adapt that setup to the ubuntu16 case.
Some information from the VM setup:
Logs from kube-dns kubedns:
I0224 19:28:08.359739 1 server.go:94] Using https://10.254.0.1:443 for kubernetes master, kubernetes API: <nil>
I0224 19:28:08.365076 1 server.go:99] v1.5.0-alpha.0.1651+7dcae5edd84f06-dirty
I0224 19:28:08.365111 1 server.go:101] FLAG: --alsologtostderr="false"
I0224 19:28:08.365174 1 server.go:101] FLAG: --dns-port="10053"
I0224 19:28:08.365195 1 server.go:101] FLAG: --domain="cluster.local."
I0224 19:28:08.365217 1 server.go:101] FLAG: --federations=""
I0224 19:28:08.365224 1 server.go:101] FLAG: --healthz-port="8081"
I0224 19:28:08.365229 1 server.go:101] FLAG: --kube-master-url=""
I0224 19:28:08.365256 1 server.go:101] FLAG: --kubecfg-file=""
I0224 19:28:08.365262 1 server.go:101] FLAG: --log-backtrace-at=":0"
I0224 19:28:08.365269 1 server.go:101] FLAG: --log-dir=""
I0224 19:28:08.365275 1 server.go:101] FLAG: --log-flush-frequency="5s"
I0224 19:28:08.365281 1 server.go:101] FLAG: --logtostderr="true"
I0224 19:28:08.365285 1 server.go:101] FLAG: --stderrthreshold="2"
I0224 19:28:08.365290 1 server.go:101] FLAG: --v="0"
I0224 19:28:08.365294 1 server.go:101] FLAG: --version="false"
I0224 19:28:08.365299 1 server.go:101] FLAG: --vmodule=""
I0224 19:28:08.365389 1 server.go:138] Starting SkyDNS server. Listening on port:10053
I0224 19:28:08.365615 1 server.go:145] skydns: metrics enabled on : /metrics:
I0224 19:28:08.365684 1 dns.go:166] Waiting for service: default/kubernetes
I0224 19:28:08.366448 1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0224 19:28:08.366517 1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0224 19:28:38.366301 1 dns.go:172] Ignoring error while waiting for service default/kubernetes: Get https://10.254.0.1:443/api/v1/namespaces/default/services/kubernetes: dial tcp 10.254.0.1:443: i/o timeout. Sleeping 1s before retrying.
E0224 19:28:38.367639 1 reflector.go:214] pkg/dns/dns.go:154: Failed to list *api.Endpoints: Get https://10.254.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.254.0.1:443: i/o timeout
E0224 19:28:38.368457 1 reflector.go:214] pkg/dns/dns.go:155: Failed to list *api.Service: Get https://10.254.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.254.0.1:443: i/o timeout
I0224 19:29:09.367199 1 dns.go:172] Ignoring error while waiting for service default/kubernetes: Get https://10.254.0.1:443/api/v1/namespaces/default/services/kubernetes: dial tcp 10.254.0.1:443: i/o timeout. Sleeping 1s before retrying.
E0224 19:29:09.368291 1 reflector.go:214] pkg/dns/dns.go:154: Failed to list *api.Endpoints: Get https://10.254.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.254.0.1:443: i/o timeout
E0224 19:29:09.369085 1 reflector.go:214] pkg/dns/dns.go:155: Failed to list *api.Service: Get https://10.254.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.254.0.1:443: i/o timeout
I0224 19:29:14.031575 1 server.go:133] Received signal: terminated, will exit when the grace period ends
I0224 19:29:40.368220 1 dns.go:172] Ignoring error while waiting for service default/kubernetes: Get https://10.254.0.1:443/api/v1/namespaces/default/services/kubernetes: dial tcp 10.254.0.1:443: i/o timeout. Sleeping 1s before retrying.
E0224 19:29:40.369086 1 reflector.go:214] pkg/dns/dns.go:154: Failed to list *api.Endpoints: Get https://10.254.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.254.0.1:443: i/o timeout
E0224 19:29:40.369799 1 reflector.go:214] pkg/dns/dns.go:155: Failed to list *api.Service: Get https://10.254.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.254.0.1:443: i/o timeout
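The timeouts above suggest that the kube-dns pod cannot open a TCP connection to the API server's service IP at all. A minimal sketch for checking this independently of kubedns, run from a pod or from the node itself, could look like the following. The address and port come from the logs; the helper name can_connect and the 3-second timeout are assumptions chosen for illustration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False


if __name__ == "__main__":
    # Service IP and port of the API server, as seen in the kubedns logs.
    print("10.254.0.1:443 reachable:", can_connect("10.254.0.1", 443))
```

If this succeeds on the master but fails on the other nodes, the problem would seem to lie in how service IPs are routed on those nodes rather than in kube-dns itself.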
Logs from kube-dns dnsmasq:
dnsmasq[1]: started, version 2.76 cachesize 1000
dnsmasq[1]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
dnsmasq[1]: using nameserver 127.0.0.1#10053
dnsmasq[1]: read /etc/hosts - 7 addresses
Logs from kube-dns healthz:
2017/02/24 19:14:33 Healthz probe on /healthz-kubedns error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-02-24 19:14:33.130111978 +0000 UTC, error exit status 1
...
2017/02/24 19:30:03 Healthz probe on /healthz-dnsmasq error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-02-24 19:29:43.134054709 +0000 UTC, error exit status 1
Thanks,
Thomas