Multi-AZ vs. Single-AZ Clusters on AWS: Cross-AZ Data Transfer Costs?


Henning Jacobs

unread,
Oct 19, 2016, 4:04:00 PM10/19/16
to Kubernetes user discussion and Q&A
We are still in the discussion of setting up our Kubernetes clusters on AWS. The default seems to be multi-AZ for most people (http://kubernetes.io/docs/admin/multiple-zones/), but this leads to unnecessary cross-AZ costs:
  • AWS charges for all inter-AZ data transfer (0.01 USD per GB)
  • AWS ELB integration will use cross-zone ELBs
  • Kubernetes scheduler spreads pods across zones (failure domains)
  • Kubernetes nodePort iptables forwarding will load balance across all pods disregarding AZs
=> You end up with unnecessary cross-AZ traffic even for simple stateless apps, as traffic goes through the ELB to one node (e.g. in AZ "a") and is then forwarded to a pod in another AZ (e.g. "b").

This might not be an issue for small scale users, but data transfer costs can quickly explode with scale.... :-/
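To put a rough number on it (a back-of-envelope sketch; the 50 TB/month traffic volume is a made-up assumption, the per-GB rate is the one quoted above):

```python
# Back-of-envelope cross-AZ cost sketch.
# The traffic volume below is a made-up assumption; the rate is the
# 0.01 USD/GB inter-AZ price mentioned above.
price_per_gb_usd = 0.01
cross_az_tb_per_month = 50            # hypothetical traffic crossing AZs
gb = cross_az_tb_per_month * 1024     # TB -> GB
monthly_cost = gb * price_per_gb_usd
print(f"{monthly_cost:.2f} USD/month")  # 512.00 USD/month
```

Small per-GB rates add up fast once cross-AZ traffic reaches tens of TB.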

Anybody out there running Multi-AZ Kubernetes on AWS and checking/optimizing their data transfer costs?

Thanks.

- Henning

Rodrigo Campos

unread,
Oct 19, 2016, 4:37:49 PM10/19/16
to kubernet...@googlegroups.com
I'm not using multi-AZ, but the behavior of hopping from one node to another to reach a pod is much reduced with the still-in-the-works (and alpha, I think) feature to preserve source IP.

The way I've seen it used to preserve source IP is, basically, to avoid the extra hop: the LB just hands the packet to the node it hit, which serves it locally.

That might help reduce bandwidth too.

But I don't know what folks are doing about bandwidth costs in multi-AZ clusters. Just had that thought while reading :)
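From memory, enabling it looks roughly like this (the annotation name/value and the app name are my recollection of the alpha API, so double-check against your Kubernetes version):

```yaml
# Sketch, assuming the 1.4-era alpha annotation for preserving source IP.
# Annotation name/value and the service name are from memory -- verify
# before relying on this.
apiVersion: v1
kind: Service
metadata:
  name: my-app
  annotations:
    service.beta.kubernetes.io/external-traffic: OnlyLocal
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```

With that set, kube-proxy only forwards external traffic to pods on the node that received it, which is what kills the extra (possibly cross-AZ) hop.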
--
You received this message because you are subscribed to the Google Groups "Kubernetes user discussion and Q&A" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-users+unsubscribe@googlegroups.com.
To post to this group, send email to kubernetes-users@googlegroups.com.
Visit this group at https://groups.google.com/group/kubernetes-users.
For more options, visit https://groups.google.com/d/optout.

Henning Jacobs

unread,
Oct 20, 2016, 2:27:26 AM10/20/16
to Kubernetes user discussion and Q&A
I guess you refer to https://github.com/kubernetes/kubernetes/pull/29409 ? I did not know about it, so thanks for the hint (that made me search on GH..) :-)

How do you run your cluster if it's not multi-AZ? Do you only run one cluster in one AZ or one cluster per AZ + federation (which seems to be immature/pre-alpha from what we tested)?

Can you also describe your reasons why you chose single-AZ?

Casey Lucas

unread,
Oct 20, 2016, 3:51:40 PM10/20/16
to Kubernetes user discussion and Q&A

This doesn't specifically answer your question, but since you're "in discussion", something else to think about:

If you are planning on running multi-AZ and want your etcd cluster to be HA, then you'll need more than two AZs because the etcd cluster needs a quorum. For example, if you deploy a three-node etcd cluster (1 member in AZ A and 2 other members in AZ B) and you lose connectivity between AZs, then your etcd cluster will not function.

We decided not to use a cross-AZ cluster and instead go with one cluster per AZ. Reasons:

- we currently only deploy to two AZs and would need to go to three+ for etcd
- strict separation for upgrades. No need to worry about Kubernetes-related upgrade problems with multiple separate clusters: blow one entire cluster away, install the latest Kubernetes fresh, and then do the other. You'll need to do some external load balancing, of course.
- no need to worry about cross-AZ latency. For example, etcd should be tuned to allow for inter-AZ latency.
- no kubernetes related cross-AZ costs.
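On the etcd latency point: the knobs are etcd's heartbeat/election flags. The values below are only illustrative (etcd's tuning guidance is roughly heartbeat-interval ~= RTT between members, election-timeout ~= 10x heartbeat), so size them to your measured cross-AZ RTT:

```sh
# Illustrative values only -- check etcd's tuning docs for your RTT.
etcd --name etcd0 \
  --heartbeat-interval 100 \
  --election-timeout 1000
```

In a single-AZ cluster the defaults are usually fine, which is part of the appeal.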

HTH,
Casey

Casey Lucas

unread,
Oct 20, 2016, 4:07:22 PM10/20/16
to kubernet...@googlegroups.com

I realized my etcd comments were misleading. To clarify, if you have a single etcd setup like this:

AZ A: etcd0, etcd1

AZ B: etcd2

 

… and AWS loses AZ A, then your AZ B etcd side of the cluster will not be available because the AZ A failure broke quorum.
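A tiny sketch of the quorum math for that layout (member/AZ names as above):

```python
# Quorum math for the 2+1 layout above: quorum = floor(n/2) + 1.
members = {"etcd0": "AZ-A", "etcd1": "AZ-A", "etcd2": "AZ-B"}
quorum = len(members) // 2 + 1  # 2 of 3

def survives_loss_of(az):
    """True if the remaining members can still reach quorum."""
    remaining = sum(1 for member_az in members.values() if member_az != az)
    return remaining >= quorum

print(survives_loss_of("AZ-B"))  # True: etcd0 + etcd1 still form a majority
print(survives_loss_of("AZ-A"))  # False: etcd2 alone cannot reach quorum
```

So a two-AZ etcd deployment always has one AZ whose loss takes down the whole cluster, no matter how you split the members.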
Sorry for any confusion,

Casey


Henning Jacobs

unread,
Oct 21, 2016, 2:47:28 AM10/21/16
to Kubernetes user discussion and Q&A
Thanks Casey!

Yes, we are aware of the etcd problems with fewer than three AZs.

How do you manage cluster-balancing, are you using a federated API server or do it from client side? Do you do DNS round robin for your services or do you run one cross-AZ ELB for them?

I fear that federation is not ready yet (e.g. some basics like the auth webhook are not implemented), and cross-cluster balancing would need some thought, i.e. essentially another layer of scheduling (which we would probably need to build on our own).

Casey Lucas

unread,
Oct 21, 2016, 7:45:41 AM10/21/16
to kubernet...@googlegroups.com

We are not yet in production with Kubernetes (but are working toward that goal), so keep that in mind. We intend to deploy the same containers to multiple clusters (AZs) for HA. Once federation is ready, I imagine we'll start using that feature. Like you, I don't think it's ready for prime time just yet, but I'm excited about the possibilities. We do not run ELBs for our existing, non-k8s endpoints. Instead we run multiple nginx instances per AZ, set up to prefer backends in the same AZ (lower cross-AZ fees) but fail over cross-AZ if needed. We use RR DNS. At this time, we intend to do the same for k8s.
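Roughly like this in nginx terms (all names and addresses are made up; the `backup` flag is what gives the same-AZ preference):

```nginx
# Sketch with made-up names/addresses. Primary servers are in this
# nginx instance's own AZ; the "backup" server in the other AZ is
# only used when all primaries are unavailable.
upstream my_app {
    server 10.0.1.10:8080;         # same-AZ backend
    server 10.0.1.11:8080;         # same-AZ backend
    server 10.0.2.10:8080 backup;  # cross-AZ, failover only
}

server {
    listen 80;
    location / {
        proxy_pass http://my_app;
    }
}
```

So in normal operation traffic stays inside the AZ, and cross-AZ fees only show up during a failure.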

 

-casey

Henning Jacobs

unread,
Oct 21, 2016, 2:20:18 PM10/21/16
to Kubernetes user discussion and Q&A
Thanks for the insights. Yes, I think one cluster per AZ could be the future, but we'll probably go with a simple multi-AZ approach for now.
We are planning to first migrate smaller microservices to Kubernetes in the next week(s) to gain production experience (which is what we are desperately lacking).

Are you going to be at Kubecon? I would be more than happy to chat in person (sig-aws apparently meets Tuesday evening).