Kubernetes on AWS: External load balancer cannot reach service about 10% of the time

Amit Saha

Sep 13, 2018, 2:07:37 AM
to Kubernetes user discussion and Q&A
Hi all,

I posted this query on Stack Overflow [1], but I'm hoping the community here has some insights. I have a k8s cluster set up via kops on AWS. My initial setup was a single node, on which I deployed a service and exposed it via an internal AWS load balancer. Once the load balancer was created, the AWS console showed that it could not reach the service on the instance. I verified that the service was running and reachable via the Cluster IP from the master, but it was not reachable via the instance IP and the `NodePort`.
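
For context, the Service is defined roughly along these lines (a minimal sketch, not my actual manifest; the name, selector, and ports are placeholders, and the expected value of the internal-ELB annotation depends on the Kubernetes version):

```yaml
# Rough sketch of the Service -- name, selector, and ports are placeholders.
# The annotation key is the standard in-tree AWS one; the expected value
# ("0.0.0.0/0" vs "true") varies by Kubernetes version.
apiVersion: v1
kind: Service
metadata:
  name: my-app
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: "0.0.0.0/0"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```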

Having faced this issue before, I created a new node. The load balancer then showed that it could reach one of the two instances: it could reach the service via the new node's IP, even though the two pods were still running on my old node. I then increased the number of replicas of my service by 1, so a pod was running on the new node too. At that point the load balancer showed that it could reach both instances.

My theory is that the old node is forwarding traffic to the pod on the new node, since the iptables rules generated by kube-proxy probabilistically forward traffic to one of the containers running on different nodes.
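
To illustrate what I mean: with endpoints on both nodes, the kube-proxy (iptables mode) NAT rules on every node look roughly like this. The chain suffixes, NodePort, and pod IPs below are made up for illustration; the real rules are in the gist linked below. There is one KUBE-SEP chain per endpoint, here one pod on each node:

```
-A KUBE-NODEPORTS -p tcp -m tcp --dport 30080 -j KUBE-SVC-XXXXXXXXXXXXXXXX
-A KUBE-SVC-XXXXXXXXXXXXXXXX -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-AAAAAAAAAAAAAAAA
-A KUBE-SVC-XXXXXXXXXXXXXXXX -j KUBE-SEP-BBBBBBBBBBBBBBBB
-A KUBE-SEP-AAAAAAAAAAAAAAAA -p tcp -m tcp -j DNAT --to-destination 10.0.1.23:8080
-A KUBE-SEP-BBBBBBBBBBBBBBBB -p tcp -m tcp -j DNAT --to-destination 10.0.2.45:8080
```

In other words, a health check arriving at either node's NodePort can be DNAT'ed to a pod on the other node.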


FWIW, I am using amazon-vpc-cni-k8s. I have posted more details, including the iptables rules/traces, here: https://gist.github.com/amitsaha/81072353afffb5dd6edcfc71c7a47405#file-aws-internal-elb-instance-issue-md


I can reproduce it about 1 in 20 times. 

Has anyone seen this issue before?

Thanks for any insights.

- Amit.