Are Mega Data Centers Necessary?...

Rao Dronamraju

Nov 7, 2009, 4:32:45 PM

to cloud-c...@googlegroups.com

From the beginning it has been professed that cloud computing (CC), for economies-of-scale reasons, needs mega data centers.

It certainly makes sense from an economies-of-scale perspective, but are physical mega data centers the only way to achieve those economies of scale?

Why not make mini-clouds from reasonably sized data centers (say 10,000+ as opposed to 100,000+) and connect them in such a way that they form a virtual mega data center?

If the mini data centers run out of resource elasticity, they seamlessly borrow / migrate resources from other nearby mini data centers.

Of course, one major issue here is: can WAN bandwidth support this?

That is why, in an earlier post some time back, I asked: why spend $75 - $100 billion on healthcare? Why not on Internet2?

By increasing the WAN/Internet2 bandwidth, you help not only the CC industry but also the internet/web industry.

Now, which has the better ROI: investing in healthcare or in Internet2?

Ray DePena

Nov 7, 2009, 6:22:11 PM

to cloud-c...@googlegroups.com

If no one has called it yet, I call dibs on CAN (Cloud Area Network) ;-)

--
Ray DePeña
Director, Stealth Startups
Strategic Business Advisor

http://www.linkedin.com/in/raydepena
Sacramento, CA 95630
(916) 941-5558

yarapavan

Nov 8, 2009, 4:53:25 AM

to Cloud Computing

I vaguely remember Sun's container datacenter approach in this case.


Bob Sutterfield

Nov 8, 2009, 11:11:36 AM

to cloud-c...@googlegroups.com

It's not just the WAN bandwidth (which isn't itself a trivial problem), it's the latency between the workload and its storage. You can light the fiber as fat as you like with DWDM or whatever, but physics still controls the round-trip travel time. Different workloads are differently suited to different distances from their storage.
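A back-of-the-envelope sketch of that physics bound, assuming light in optical fiber travels at roughly the vacuum speed divided by a refractive index of about 1.47 (an assumed typical value, not a figure from this thread), and ignoring switching, queuing, and non-straight cable routes, all of which only add delay:

```python
# Lower bound on round-trip time imposed by propagation delay alone.
# FIBER_REFRACTIVE_INDEX is an assumed typical value for silica fiber.
C_VACUUM_KM_S = 299_792.458
FIBER_REFRACTIVE_INDEX = 1.47


def min_round_trip_ms(distance_km: float) -> float:
    """Return the minimum round-trip time in milliseconds over a straight fiber run."""
    speed_km_s = C_VACUUM_KM_S / FIBER_REFRACTIVE_INDEX
    return 2 * distance_km / speed_km_s * 1000


if __name__ == "__main__":
    # Hypothetical distances between a workload and its storage.
    for km in (1, 100, 1000, 4000):
        print(f"{km:>5} km: at least {min_round_trip_ms(km):.3f} ms per round trip")
```

Adding more wavelengths raises bandwidth but leaves this floor unchanged, so a workload that issues many small sequential storage requests pays it on every request.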

For many workloads, it's important to be very near their storage. This leads even to re-architecting the datacenter network from a routed hierarchy to a flat switched mesh, just to reduce the number of switching and routing networking devices in the path. In the public cloud (e.g. AWS) there's a strong performance incentive to use ephemeral instance storage (e.g. EC2) for primary processing, rather than pay the latency penalty for access to permanent storage services (e.g. S3).

So yes, mega data centers are necessary for mega workloads and mega storage, until we either solve the speed of light problem or get better at sharding workloads and storage.

See the hierarchy of storage characteristics (latency, bandwidth, capacity) in Jeff Dean's preso (particularly slides 5-8 and 24) at http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf

--

Bob Sutterfield

b...@sutterfields.us

http://www.linkedin.com/in/BobSutterfield

Peglar, Robert

Nov 8, 2009, 11:46:50 AM

to cloud-c...@googlegroups.com

Dean’s talk is very good. The only quibble I have with it is his citation of disk failures – pages 9 & 10. Instead of 1-5% disk AFR, it should be 0.01-0.05% disk AFR, and on page 10, O(1000s) should be O(10s) in a year, using best practices and storage elements.

The reason that disk AFR is very important is this: disks are permanent, not ephemeral. CPU failures (servers) are one thing, since they hold very little data; disk failures, OTOH, are critical to avoid. If a design assumes high disk AFR, it’s forced into replicating data at least 2x (if not 3x or more, present in some designs today) just to overcome same. Plus, we know that over half of reported disk failures, upon close examination (e.g. running drive-level diagnostics), are false positives, which makes the problem much worse. That is non-optimal in both technical and economic terms.
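To make the arithmetic behind the two sets of figures concrete, here is a minimal sketch. The fleet size of 100,000 disks and the 1,000 TB usable capacity are assumptions chosen for illustration; the AFR ranges are the ones quoted above, and the function names are hypothetical.

```python
def expected_failures_per_year(num_disks: int, afr: float) -> float:
    """Expected number of disk failures in a year for a given annualized failure rate."""
    return num_disks * afr


def raw_capacity_needed_tb(usable_tb: float, replicas: int) -> float:
    """Raw capacity required to hold `usable_tb` of data at a given replication factor."""
    return usable_tb * replicas


if __name__ == "__main__":
    fleet = 100_000  # assumed fleet size, not taken from the talk or the thread
    for label, afr in (("1%", 0.01), ("5%", 0.05), ("0.01%", 0.0001), ("0.05%", 0.0005)):
        failures = expected_failures_per_year(fleet, afr)
        print(f"AFR {label:>6}: about {failures:,.0f} failures per year")
    for replicas in (1, 2, 3):
        print(f"{replicas}x replication: {raw_capacity_needed_tb(1000, replicas):,.0f} TB raw")
```

Under that assumed fleet size, the higher AFR range lands in the thousands of failures per year and the lower range in the tens, which is the gap between the two positions.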

Rob


Rao Dronamraju

Nov 8, 2009, 1:09:35 PM

to cloud-c...@googlegroups.com

Bob,

Yes, I agree that latency is definitely a problem. But in life we all make compromises, especially when it comes to economics. When we cannot buy a mansion, we settle for a large house. Our expectations are adjusted to reality. We do not expect to drive cars that go more than 200 mph or fly planes that go more than 1000 mph (at this time). So, considering that there are hundreds if not thousands of hosting providers across the country and the world in existence today with mini-cloud-size facilities, is it not a lot more economical (especially after an almost-Great Depression) to make virtual mega data centers out of them than to build these huge mega data centers? Yes, you certainly need to “get better at sharding workloads and storage,” as you say. Also, thanks for the great slides.


Bob Sutterfield

Nov 8, 2009, 4:59:00 PM

to cloud-c...@googlegroups.com

Robert Peglar wrote:

> The reason that disk AFR is very important is this; they are permanent, not ephemeral. CPU failures (servers) are one thing, since they hold very little data; disk failures, OTOH, are critical to avoid.

In Google's designs, all hardware at every level of scope and scale is assumed to be ephemeral. This runs from memory chip bit error rates to data center power grid availability.

> If a design assumes high disk AFR, it’s forced into replicating data at least 2x (if not 3x or more, present in some designs today) just to overcome same. Plus, we know that over half of reported disk failures, upon close examination (e.g. running drive-level diagnostics) are false positive, which makes the problem much worse. That is non-optimal both in technical and economic terms.

If the highest goal of the design is overall system availability and the second highest goal is low latency for each request, it makes sense to pay more in replication multipliers and in maintenance costs (diagnostics, inventory, and time to return-to-service). Also, by now Google doesn't assume disk (or any other) failure rates; they measure their fleet's historical experience and project future expectations. So their architecture, software designs, and staffing levels are well informed by data.

Alan Ho

Nov 9, 2009, 12:19:10 AM

to cloud-c...@googlegroups.com

Another reason why you may want multiple data centers is the concept of "computing at the edge", which means to bring computing as close to the customer as possible. While this is theoretically possible, I personally think there are a few technologies that need to be baked into the applications on "DAY 1".

1. Cost-based routing / cost-based allocation - Without cost-based routing, requests end up going to the wrong datacenter, and you lose the benefit of distributed datacenters.
2. Intelligent replication - Assuming that it is impossible to have high bandwidth between each data center, high-read applications (e.g. Amazon) need the ability to read from a local datacenter and have the data replicated to other datacenters as necessary.
3. Multi-master systems - Assuming that it is impossible to have high bandwidth between each data center, high-write applications (e.g. Gmail) need the ability to write to the local datacenter and have the data replicated to other datacenters as necessary. The traditional database does not support this!
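One way item 1 might look in practice is sketched below. The cost function (round-trip time plus a load penalty), the `load_weight_ms` weight, and the datacenter names are all hypothetical choices for illustration; a real deployment would pick its own cost terms.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Datacenter:
    name: str
    rtt_ms: float       # measured round trip from the client to this datacenter
    utilization: float  # 0.0 (idle) to 1.0 (saturated)


def routing_cost(dc: Datacenter, load_weight_ms: float = 100.0) -> float:
    """Combine network distance and current load into one comparable cost."""
    return dc.rtt_ms + load_weight_ms * dc.utilization


def pick_datacenter(candidates: Iterable[Datacenter]) -> Datacenter:
    """Route the request to the candidate with the lowest cost."""
    return min(candidates, key=routing_cost)


if __name__ == "__main__":
    candidates = [
        Datacenter("local-mini", rtt_ms=10.0, utilization=0.95),
        Datacenter("regional-mega", rtt_ms=60.0, utilization=0.20),
    ]
    chosen = pick_datacenter(candidates)
    print(f"route to {chosen.name} (cost {routing_cost(chosen):.1f})")
```

With these made-up numbers the nearly saturated local site loses to the more distant one, which is the kind of decision plain nearest-site routing cannot make.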

My opinion is that high-bandwidth interconnects between data centers are not a tractable problem for public clouds. It's much better if the data locality issue is tackled upfront instead. Now, latency between data centers is an interesting problem - it might make sense to ask what type of latency is important to you. For many applications, it is the TP99 (99th percentile of latency) that matters for each request. In those cases, the latency is typically dominated by bandwidth between data centers. That's why, if you want good latency, it makes sense to build datacenter locality into your application -> but the more datacenters you have, the harder it is to keep the data local.
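For readers unfamiliar with TP99, a minimal sketch of how it differs from the median follows. It uses the nearest-rank percentile definition (other definitions interpolate), and the latency samples are invented: 98 requests served locally and 2 that had to cross to another datacenter.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical request latencies in milliseconds.
    latencies_ms = [20.0] * 98 + [400.0] * 2
    print("median:", percentile(latencies_ms, 50), "ms")
    print("TP99:  ", percentile(latencies_ms, 99), "ms")
```

A small fraction of cross-datacenter requests barely moves the median but sets the TP99, which is why data locality matters so much for tail latency.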

In my opinion, it is far too early for the average developer to tackle these issues, so it makes sense to go with mega data centers for each "region" (Europe / NA / Asia / etc). 3 mega data centers per region is probably the sweet spot for application builders because it balances the need for redundancy and the ease of developing applications with datacenter locality.

Regards,
Alan Ho

Peglar, Robert

Nov 9, 2009, 6:17:30 AM

to cloud-c...@googlegroups.com

@Bob

Can’t disagree with your last paragraph. But, meeting that goal given massive increases (non-linear) in data using high disk AFR is increasingly difficult as more FTE labor has to be applied just to keep the disk farm running. Disks may be cheap, but humans are expensive.

Plus, if the true goal is overall system availability, at some point the concept of reliable s/w on top of unreliable h/w falls apart, unless you have an endless supply of cheap humans for break/fix. What is really needed is reliable s/w on top of reliable h/w; then there are far fewer scaling issues. If the disk h/w is autonomic, you have a shot at scalability.

Of course, all this is predicated on the assumption that optimizing for low FTE cost is a goal. It may very well not be in Google’s case.

Rob

Robert Peglar
Vice President, Technology, Storage Systems Group
Xiotech Corporation
Robert...@xiotech.com
952 983 2287 (Office)
314 308 6983 (Mobile)
636 532 0828 (Fax)
www.xiotech.com : Toll-Free 866 472 6764


Bob Sutterfield

Nov 9, 2009, 9:50:03 AM

to cloud-c...@googlegroups.com

Robert Peglar wrote:

> meeting that goal [overall availability] given massive increases (non-linear) in data using high disk AFR is increasingly difficult as more FTE labor has to be applied just to keep the disk farm running. Disks may be cheap, but humans are expensive... all this is predicated on the assumption that optimizing for low FTE cost is a goal. It may very well not be in Google’s case.

One way to reduce human costs is to staff for average needs, not peak, which means leveling the demand. That means building resilient systems that can survive a while, meeting SLAs, even with broken components. So a spindle that packs in at 2:00am isn't an emergency requiring immediate attention. It can be left in place until the day shift arrives, and replaced as part of their (optimally sorted) batch repair process.

Rao Dronamraju

Nov 9, 2009, 12:46:20 PM

to cloud-c...@googlegroups.com

“Another reason why you may want multiple data centers is the concept of "computing at the edge", which means to bring computing as close to the customer as possible. While this is theoretically possible, I personally think there are a few technologies that need to be baked into the applications on "DAY 1".”

There are a lot of hosting providers who are already computing close to the edge, that is, close to their customers. This infrastructure has been put in place over the last 15 years (ever since the internet/web became ubiquitous). These facilities and their proximity to customers should alleviate the latency problem. They have also worked out the BW problems over the last 15 years. But I do agree with Bob that if you go this route, the inter-miniCloud/datacenter latency would be a problem, and consequently you need to do workload/processing proximity optimization. In addition, this distributed architecture distributes the risk and BW congestion associated with mega data centers. Even with mega data centers we will come across BW and latency issues, because in order to run them efficiently they will have to aggregate hundreds of multi-clients from across the country and continents. This in itself will cause both BW and latency issues.

Another issue also comes to mind: considering that the CC industry is headed down the private cloud route, at least for the next 3 to 5 years, the private clouds are distributed across the country. So in the next phase, the hybrid clouds will augment the capacity of these private clouds, and they will also possibly be closer to their customers – the private clouds. So economics is driving the architecture/topology of the clouds to be distributed rather than consolidated, monolithic mega data centers. It will be interesting to see what happens in the next 5 years.


Jayarama Shenoy

Nov 9, 2009, 1:20:40 PM

to cloud-c...@googlegroups.com

Hi

Two comments:
- SSDs offer an improvement of almost two orders of magnitude in random read performance (most relevant to a search-engine-type application). And despite the flattening of the Zipfian distribution (which appears to be stabilizing now?), a quantitative analysis of real web search workloads (well, not Google's - there aren't public traces for those that I am aware of) shows significant amenability to dynamic tiering across hybrid SSD & HDD storage.

I.e. the close synergy between multi-replication for performance and multi-replication for availability which served Google so well in a 2000-2003 architecture needs to be revisited IMHO. It was quite appropriate (perhaps) then with HDD being your only medium, but there are more tools to be brought to bear in 2009-2012. (We don't publicly know what GOOG is doing at the moment, probably not standing still is my guess).

- Not frequently mentioned is that one weakness of the "nine fives" architecture of hordes of modest performance servers is power consumption. In some ways Google type architecture is trading off a lot of capex for a sub-optimal opex.

It is hard for me to believe that the optimal solutions are exclusively at either end point (i.e. the most vanilla servers on one side and only the best EMC/NetApp/HP/IBM gear that money can buy on the other) and not somewhere in between.

This aligns with my first comment. In some ways, it is worse to keep tossing thousands of servers at a problem (because it burns up a lot of power and that is actually more precious to us all than money) and replicating like crazy than to spend lots of (and certainly even a bit of) money on doing something less brute force.
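The tiering point in the first comment can be illustrated with a small sketch: under an idealized Zipf popularity model, what share of reads would land on the hottest items if only those were kept on SSD? The item counts and exponents are assumptions for illustration, not measurements from any real search trace.

```python
def zipf_hit_ratio(num_items: int, hot_items: int, exponent: float = 1.0) -> float:
    """Fraction of accesses that go to the `hot_items` most popular of `num_items`,
    assuming access probability proportional to 1 / rank**exponent."""
    if not 0 <= hot_items <= num_items:
        raise ValueError("hot_items must be between 0 and num_items")
    weights = [1.0 / rank ** exponent for rank in range(1, num_items + 1)]
    return sum(weights[:hot_items]) / sum(weights)


if __name__ == "__main__":
    total = 100_000  # hypothetical number of distinct objects
    ssd_tier = 1_000  # hypothetical: hottest 1% held on SSD
    for exponent in (1.0, 0.8, 0.6):
        ratio = zipf_hit_ratio(total, ssd_tier, exponent)
        print(f"exponent {exponent}: {ratio:.1%} of reads served from the SSD tier")
```

A flatter distribution (smaller exponent) lowers the share served by the hot tier, which is why the flattening mentioned above matters to how much a hybrid design can gain.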

Jay

