Network Policy to limit open connections per pod


jtro...@gmail.com

Mar 28, 2018, 10:54:35 AM
to Kubernetes user discussion and Q&A
Is there anything similar to a network policy that limits x open connections per pod?

During a 100k TPS load test, a subset of pods had errors connecting to a downstream service and we maxed out the nf_conntrack table (500k), which affected the rest of the pods on every node that hit this issue - which turned out to be 55% of the cluster.
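
For reference, how close a node is to that limit can be checked with something like this (a rough sketch, assuming shell access to the node and that the nf_conntrack module is loaded):

    # current number of tracked connections vs. the configured ceiling
    sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max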

Besides handling this at the application level, I wanted to protect the cluster as a whole so that not one deployment can affect the entire cluster in this manner.

Thanks for any help.

-Jonathan

Rodrigo Campos

Mar 28, 2018, 11:44:29 AM
to kubernet...@googlegroups.com
Just curious, but why not change the conntrack limit?
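
For reference, the limit in question is the nf_conntrack_max sysctl on each node; raising it looks roughly like this (the values are illustrative only, and the hash table size is usually bumped along with it):

    # raise the conntrack table ceiling on the node
    sudo sysctl -w net.netfilter.nf_conntrack_max=1048576

    # resize the underlying hash table (commonly set to about max/4)
    echo 262144 | sudo tee /sys/module/nf_conntrack/parameters/hashsize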

Tim Hockin

Mar 28, 2018, 11:50:19 AM
to Kubernetes user discussion and Q&A
The simple answer is to change the limit. The more robust answer would be to make the limit more dynamic, but that can fail at runtime if, for example, kernel memory is fragmented. Also, I am not sure that tunable can be live-adjusted.

:(

We have ideas about how to be more frugal with conntrack records, but have not had anyone follow up on that work.  So much to do.
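
For what it's worth, kube-proxy also sizes this table when it starts, so the node-level value can be managed through its flags; the numbers below are examples, not recommendations:

    # relevant kube-proxy flags (other flags omitted); if I recall correctly, kube-proxy sets
    # nf_conntrack_max to roughly max(conntrack-min, conntrack-max-per-core * number of cores)
    kube-proxy --conntrack-min=524288 --conntrack-max-per-core=65536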

Jonathan Tronson

Mar 28, 2018, 11:57:55 AM
to kubernet...@googlegroups.com
When the downstream service went south, we went from ~25k to 500k entries in the table in less than a minute. I wouldn't think there is a reasonable number to set it to that could prevent the entire node from being affected; TPS was so high that a higher limit could delay the catastrophe a bit, but not prevent it.

We also noticed that when this breakdown occurs, the network traffic and CPU utilization on our DNS servers increase tremendously.
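
If it helps anyone debugging something similar, the growth rate is easy to watch on a node while the failure is happening:

    # watch the conntrack table fill up, refreshing once per second
    watch -n1 cat /proc/sys/net/netfilter/nf_conntrack_count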

Matthias Rampke

Mar 29, 2018, 3:24:01 AM
to kubernet...@googlegroups.com
Did you check what the tracked connections were? We had to massively reduce the timeouts on UDP tracking, and that got things under control well. Check whether your application may be doing one DNS request per transaction / outgoing request; this happens in many standard libraries unless you take great care.
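
For reference, those UDP tracking timeouts are node-level sysctls; the values below are examples only, not recommendations:

    # shorten how long unreplied and replied UDP flows (e.g. DNS lookups) stay in the conntrack table
    sudo sysctl -w net.netfilter.nf_conntrack_udp_timeout=10
    sudo sysctl -w net.netfilter.nf_conntrack_udp_timeout_stream=60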

/MR


jtro...@gmail.com

Apr 5, 2018, 1:56:30 PM
to Kubernetes user discussion and Q&A

After installing conntrack, I dumped the list of connections by state and created a pivot table in Excel to group the connections by source and destination. The vast majority of the TCP connections were in SYN_SENT or TIME_WAIT, the source IPs were the flannel IPs of each of the nodes (10.x.x.0) in our cluster, and the destination IPs/ports belonged to just 2 pods. That deployment was getting crushed by connections it couldn't respond to because a downstream system was unavailable, so connections backed up as SYN_SENT and TIME_WAIT and we hit our 500k limit for that EC2 instance type (c4.4xlarge).

We are looking at some form of a circuit breaker framework, and also at limiting connections at the Spring Boot/Tomcat level. It would be nice if we could also do that as a Network Policy in kube.
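
In case it helps anyone doing the same analysis, the grouping can also be done straight from the conntrack dump instead of a spreadsheet (a rough sketch; field positions may vary with the conntrack version):

    # count tracked TCP connections by state (the state is the 4th field of each tcp entry)
    sudo conntrack -L -p tcp 2>/dev/null | awk '{print $4}' | sort | uniq -c | sort -rn

    # top destination ip/port pairs among SYN_SENT and TIME_WAIT entries (original direction only)
    sudo conntrack -L -p tcp 2>/dev/null | awk '/SYN_SENT|TIME_WAIT/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^dst=/)   { d = $i }
            if ($i ~ /^dport=/) { print d, $i; break }
        }
    }' | sort | uniq -c | sort -rn | head

On the Spring Boot/Tomcat side, server.tomcat.max-connections and server.tomcat.accept-count are the usual knobs for capping connections, if I have the property names right.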
