Flannel pods CrashLoopBackOff when using kubeadm to install a Kubernetes cluster on Ubuntu 16.04


Chen Li

Mar 2, 2017, 4:25:45 AM
to kubernetes-sig-cluster-lifecycle
Hi all,

I'm following the guide https://kubernetes.io/docs/getting-started-guides/kubeadm/ to install k8s on 3 Ubuntu 16.04 nodes.

On the master node, I ran:
a) kubeadm init --pod-network-cidr 10.244.0.0/16
b) kubectl create -f kube-flannel.yml

Then I ran the join command on the other 2 nodes.
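For completeness, the join command on each worker was of this form (the token and master address here are placeholders, not my real values):

kubeadm join --token <token> <master-ip>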

But the flannel pods on the other 2 nodes always go down after a short while.

NAMESPACE     NAME                                                         READY     STATUS    RESTARTS   AGE       IP                NODE
default       kube-flannel-ds-9cx5f                                        1/2       CrashLoopBackOff 7          15m       174.37.107.68     glb-shanghai-jason-performance-006
default       kube-flannel-ds-fwfl3                                        2/2       CrashLoopBackOff  7          14m       173.193.128.142   glb-shanghai-jason-performance-005
default       kube-flannel-ds-vnx2t                                        2/2       Running   0          15m       173.193.128.154   glb-shanghai-jason-performance-004
kube-system   dummy-2088944543-42419                                       1/1       Running   0          16m       173.193.128.154   glb-shanghai-jason-performance-004
kube-system   etcd-glb-shanghai-jason-performance-004                      1/1       Running   0          16m       173.193.128.154   glb-shanghai-jason-performance-004
kube-system   kube-apiserver-glb-shanghai-jason-performance-004            1/1       Running   0          16m       173.193.128.154   glb-shanghai-jason-performance-004
kube-system   kube-controller-manager-glb-shanghai-jason-performance-004   1/1       Running   0          16m       173.193.128.154   glb-shanghai-jason-performance-004
kube-system   kube-discovery-1769846148-jdhbh                              1/1       Running   0          16m       173.193.128.154   glb-shanghai-jason-performance-004
kube-system   kube-dns-2924299975-gf0mm                                    4/4       Running   0          16m       10.244.0.2        glb-shanghai-jason-performance-004
kube-system   kube-proxy-0rvnz                                             1/1       Running   0          16m       173.193.128.154   glb-shanghai-jason-performance-004
kube-system   kube-proxy-3bjxb                                             1/1       Running   0          15m       174.37.107.68     glb-shanghai-jason-performance-006
kube-system   kube-proxy-452dz                                             1/1       Running   0          14m       173.193.128.142   glb-shanghai-jason-performance-005
kube-system   kube-scheduler-glb-shanghai-jason-performance-004            1/1       Running   0          16m       173.193.128.154   glb-shanghai-jason-performance-004



Using the docker command to check the failing pod, I get:

docker logs 7589cb83f2a2
E0302 09:09:51.398785       1 main.go:127] Failed to create SubnetManager: error retrieving pod spec for 'default/kube-flannel-ds-9cx5f': Get https://10.96.0.1:443/api/v1/namespaces/default/pods/kube-flannel-ds-9cx5f: dial tcp 10.96.0.1:443: i/o timeout



10.96.0.1 is the kubernetes service cluster IP:

kubectl get svc
NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   10.96.0.1    <none>        443/TCP   17m
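If it helps, I believe the cluster IP is only a virtual address that kube-proxy translates to the real API server endpoint; that mapping can be checked with:

kubectl get endpoints kubernetes

which should show the master's address and port (173.193.128.154:6443 here).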


Can anyone help me here?

Thanks,
-chen


Pavel Moukhataev

Mar 2, 2017, 7:01:03 PM
to kubernetes-sig-cluster-lifecycle
It seems that you ran into the same issue I did: https://groups.google.com/forum/#!topic/kubernetes-sig-cluster-lifecycle/Ze_4kxxdeA4.
10.96.0.1 is on the service network, and flannel is trying to reach the API server through it. There are iptables rules that redirect this virtual address to the API server's real address.
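For example, on a node where the proxy is working, that redirect shows up in the nat table roughly like this (a sketch only; the hashed chain names differ per cluster, and 6443 assumes the kubeadm default API server port):

iptables-save -t nat | grep 'default/kubernetes'
-A KUBE-SERVICES -d 10.96.0.1/32 -p tcp --dport 443 -j KUBE-SVC-...
-A KUBE-SVC-... -j KUBE-SEP-...
-A KUBE-SEP-... -p tcp -j DNAT --to-destination <master-ip>:6443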

To make sure this is the problem, can you please attach the following:
1) the list of network interfaces you have (ip address)
2) the list of routes (ip route); you didn't change the interface, so the default one should be used in your case
3) the output of iptables-save
4) a check of whether the API server is reachable from within the master node (using its real address), from a worker node using that real address, and from a worker node using the 10.96.0.1 address, also trying the other network interfaces, e.g. curl --interface <iface> ... (see the example commands below)
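Something like the following, for example (a sketch; 6443 is the kubeadm default API server port, /version is just a convenient path, and -k only skips certificate verification for the connectivity test):

# from the master node and from a worker node, against the API server's real address
curl -k https://<master-ip>:6443/version

# from a worker node, against the service address, forcing a particular source interface
curl -k --interface <iface> https://10.96.0.1:443/version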


On Thursday, March 2, 2017 at 12:25:45 UTC+3, Chen Li wrote:

Chen Li

Mar 2, 2017, 8:40:06 PM
to kubernetes-sig-cluster-lifecycle
Here is the output from one of the 2 worker nodes:

1) list of network interfaces you have (ip address)


ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:25:90:f5:46:b2 brd ff:ff:ff:ff:ff:ff
    inet 10.6.27.89/26 brd 10.6.27.127 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::225:90ff:fef5:46b2/64 scope link 
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:25:90:f5:46:b3 brd ff:ff:ff:ff:ff:ff
    inet 174.37.107.68/27 brd 174.37.107.95 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::225:90ff:fef5:46b3/64 scope link 
       valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:25:90:f5:46:b4 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:25:90:f5:46:b5 brd ff:ff:ff:ff:ff:ff
6: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 1e:12:58:46:14:b5 brd ff:ff:ff:ff:ff:ff
7: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:6c:9a:20:59 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever



2) list of routes (ip route)


ip route
default via 174.37.107.65 dev eth1 onlink 
10.0.0.0/8 via 10.6.27.65 dev eth0 
10.6.27.64/26 dev eth0  proto kernel  scope link  src 10.6.27.89 
161.26.0.0/16 via 10.6.27.65 dev eth0 
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.0.1 linkdown 
174.37.107.64/27 dev eth1  proto kernel  scope link  src 174.37.107.68 



3) provide iptables-save


iptables-save
# Generated by iptables-save v1.6.0 on Thu Mar  2 19:37:16 2017
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:DOCKER - [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-SEP-IT2ZTR26TO4XFPTO - [0:0]
:KUBE-SEP-NEA4B7MEWJDH6SEV - [0:0]
:KUBE-SEP-YIL6JZP7A3QYXJU2 - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-SVC-ERIFXISQEP7F7OF4 - [0:0]
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
:KUBE-SVC-TCOU7JCQXEZGVUNU - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A DOCKER -i docker0 -j RETURN
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
-A KUBE-SEP-IT2ZTR26TO4XFPTO -s 10.244.0.2/32 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-MARK-MASQ
-A KUBE-SEP-IT2ZTR26TO4XFPTO -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp" -m tcp -j DNAT --to-destination 10.244.0.2:53
-A KUBE-SEP-NEA4B7MEWJDH6SEV -s 173.193.128.154/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-NEA4B7MEWJDH6SEV -p tcp -m comment --comment "default/kubernetes:https" -m recent --set --name KUBE-SEP-NEA4B7MEWJDH6SEV --mask 255.255.255.255 --rsource -m tcp -j DNAT --to-destination 173.193.128.154:6443
-A KUBE-SEP-YIL6JZP7A3QYXJU2 -s 10.244.0.2/32 -m comment --comment "kube-system/kube-dns:dns" -j KUBE-MARK-MASQ
-A KUBE-SEP-YIL6JZP7A3QYXJU2 -p udp -m comment --comment "kube-system/kube-dns:dns" -m udp -j DNAT --to-destination 10.244.0.2:53
-A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-SEP-IT2ZTR26TO4XFPTO
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m recent --rcheck --seconds 10800 --reap --name KUBE-SEP-NEA4B7MEWJDH6SEV --mask 255.255.255.255 --rsource -j KUBE-SEP-NEA4B7MEWJDH6SEV
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -j KUBE-SEP-NEA4B7MEWJDH6SEV
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m comment --comment "kube-system/kube-dns:dns" -j KUBE-SEP-YIL6JZP7A3QYXJU2
COMMIT
# Completed on Thu Mar  2 19:37:16 2017
# Generated by iptables-save v1.6.0 on Thu Mar  2 19:37:16 2017
*filter
:INPUT ACCEPT [8:540]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [19:4968]
:DOCKER - [0:0]
:DOCKER-ISOLATION - [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-SERVICES - [0:0]
-A INPUT -j KUBE-FIREWALL
-A FORWARD -j DOCKER-ISOLATION
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A DOCKER-ISOLATION -j RETURN
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
COMMIT
# Completed on Thu Mar  2 19:37:16 2017





4) a check of whether the API server is reachable from within the master node (using its real address), from a worker node using that real address, and from a worker node using the 10.96.0.1 address, also trying the other network interfaces, e.g. curl --interface <iface> ...


curl: (7) Failed to connect to 10.36.249.226 port 443: Connection refused

curl: (7) Failed to connect to 173.193.128.154 port 443: Connection refused

===> The curl to 10.96.0.1 does not return for a very long time...

Chen Li

Mar 2, 2017, 8:42:52 PM
to kubernetes-sig-cluster-lifecycle
Tried the commands you posted in the other thread:



curl --interface eth1 https://10.96.0.1:443/
curl: (60) server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.


 curl --interface eth0 https://10.96.0.1:443/



Looks like eth1 can work (the certificate error at least means the request reached the API server), while the eth0 one just hangs.
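I guess the certificate warning can be skipped for this connectivity test by adding -k; if the path through eth1 really works, the API server should still answer with something (probably an Unauthorized error):

curl -k --interface eth1 https://10.96.0.1:443/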

How do I solve this?

Pavel Moukhataev

Mar 3, 2017, 6:19:24 AM
to kubernetes-sig-cluster-lifecycle
Yep, it seems you have the same trouble. You have 2 interfaces, eth0 and eth1, and flannel or the proxy set up iptables using eth1, because eth1 is the default interface on your host. But to send a request to an address like 10.36.249.226 (and to the 10.96.0.1 service IP, which also falls under 10.0.0.0/8), eth0 is used, because you have this route:

# ip route
10.0.0.0/8 via 10.6.27.65 dev eth0 

So you can either use another network for the services, like 192.168.x.x; in that case the default interface will be used to send data to it. Or you can just remove that route (10.0.0.0/8 via 10.6.27.65 dev eth0). I don't know why you have that network configuration, so I can't say exactly; maybe there is a VPN.
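Roughly, the two options would look like this (a sketch only; check that your kubeadm version supports --service-cidr, the CIDR value is just an example, and deleting the route may break access to other 10.x networks you rely on, so verify that first):

# option 1: re-create the cluster with a service network outside 10.0.0.0/8
kubeadm reset
kubeadm init --service-cidr 192.168.96.0/20 --pod-network-cidr 10.244.0.0/16

# option 2: on each node, delete the broad route so traffic to the service IP follows the default route via eth1
ip route del 10.0.0.0/8 via 10.6.27.65 dev eth0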


On Thursday, March 2, 2017 at 12:25:45 UTC+3, Chen Li wrote:

Vasista T

Apr 21, 2017, 7:11:23 AM
to kubernetes-sig-cluster-lifecycle
I'm having the same issue on a CentOS 7 VM.


My ip route output shows:

default via 192.168.170.1 dev ens160  proto static  metric 100
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.0.1
192.168.170.0/23 dev ens160  proto kernel  scope link  src 192.168.170.152  metric 100

I can see ens160 instead of eth0 or eth1.

Is this an issue?