Moore’s Law: The Future of Cloud Computing from the Bottom Up


GregO

Feb 9, 2010, 5:37:52 PM
to Cloud Computing
Below is a snippet from my latest blog..

I would like to hear from others what the effect of the "Cloud Chip"
will have on Virtualization and Cloud Computing...

Thanks,
GregO

Moore’s Law: The Future of Cloud Computing from the Bottom Up

I'm a serial entrepreneurial leader. It's an art/science, left/right
brain thing. I have to say that one of the most challenging parts of
creating a compelling strategy, leading a company or building products
is getting people to see the possibilities, transitions and tipping
points. Imagineering the future calls me to look back at what made
companies great -- specifically, how they capitalized on paradigm
shifts while the rest missed it. Reading the recent bestseller,
Outliers, it struck me that, not only do you have to be smart, but you
have to be in the right place with the experience to see and grab the
brass ring.

Moore's Law is one of those history lessons that has traditionally
been a touchpoint pointing the way to the future. Simply put,
Moore's Law describes a long-term trend in the history of computing
hardware, in which the number of transistors that can be placed
inexpensively on an integrated circuit has doubled approximately every
two years.

Translation: compute power has reliably doubled at a decreased cost
every two years.
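
A back-of-the-envelope in Python (the 1971 baseline is the Intel 4004's 2,300 transistors; the two-year doubling period is the usual rule of thumb, not a law of physics):

    # Rough sketch of Moore's Law: transistor budget doubling every ~2 years.
    def transistors(year, base_year=1971, base_count=2300, doubling_years=2.0):
        return base_count * 2 ** ((year - base_year) / doubling_years)

    for year in (1971, 1991, 2011):
        print(year, f"{transistors(year):,.0f}")
    # 2011 projects to ~2.4 billion transistors, in the right ballpark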

In a recent announcement, Intel gave a glimpse of what the future will
look like. The "Cloud" chip will have 48 cores, is available to
Intel's ISV partners today and will be shipping in volume in less than
18 months. The quote from the Intel dude stated that it will increase
the power of what is available today by 10-20 times. Oh my.... Buckle
your seatbelt .... Moore's law just took a giant step up the paradigm.

<the rest at http://blog.appzero.com/ >

Jim Starkey

Feb 9, 2010, 5:56:02 PM
to cloud-c...@googlegroups.com
I'm skeptical, very skeptical. I see the system cost of a large number
of cores -- memory contention and memory bandwidth contention -- but I
don't see the benefit unless there is an application that needs memory
shared between a large number of threads. 48 cores with 36 stalled
waiting on the memory controller doesn't strike me as a good architecture.
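
For what it's worth, that 36-of-48 picture falls out of a one-line supply/demand model. A quick Python sketch; the per-core demand and channel bandwidth are made-up numbers, not published specs:

    # Toy supply/demand model behind the "36 of 48 stalled" picture.
    # Per-core demand and channel bandwidth are illustrative assumptions.
    def stalled(cores=48, per_core_gbps=2.0, channels=4, channel_gbps=6.0):
        servable = (channels * channel_gbps) / per_core_gbps  # cores memory can feed
        return cores - min(cores, int(servable))

    print(stalled())   # -> 36: only 12 of 48 cores can be kept fed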

In the absence of such an application (and one that doesn't also require
scale-out), a more useful configuration is a server sled -- single
board, single power supply, on board switch but a half dozen servers
each with dedicated memory and maybe a dedicated local disk. James
Hamilton has written about these. A sled gives the server density of a
massive-core system without the memory contention, and is probably cheaper.

Intel, I think, is pushing what they think they know how to build.
Whether there is any market pull for this, I don't know, but I doubt it.

The future doesn't belong to scale-up (bigger, faster machines) but to
scale-out (more, cheaper machines). Maybe Intel is just looking for
their next boat to miss, but cloud computing will always be happier with
more, cheaper machines (Jan will insist on a high powered logo on each
on, though).

GregO wrote:
> Below is a snippet from my latest blog..
>
> I would like to hear from others what the effect of the "Cloud Chip"
> will have on Virtualization and Cloud Computing...
>
> Thanks,
> GregO
>

> Moore's Law: The Future of Cloud Computing from the Bottom Up


--
Jim Starkey
Founder, NimbusDB, Inc.
978 526-1376

Jan Klincewicz

Feb 9, 2010, 7:57:22 PM
to cloud-c...@googlegroups.com
Fundamentally, I agree with Jim.  There are very few bits of code out there used for commercial purposes today that can take advantage of a modern quad processor.  Once a chip has gone multi-core, adding additional cores is not that big a deal, and we see at this point that the move from quad core to hex core gives about a 30% increase. This is not linear.  Also, servers are not CPU bound, thus we get diminishing returns pumping them full of cores.  Right now they are out of sync with I/O and storage.

Memory has become much less expensive, but we have precious few applications that can use more than 4GB since most of the world runs 32-bit code. I quoted 4GB of server memory to a customer today at $208.00 U.S.

HPTC outliers: please refrain from reminding me how your protein folding app can chew up a TB of RAM and 64 cores...   What's your email server running?

Virtualization is a paradigm that already shifted 10 years ago.  It IS a good way to squeeze more resources out of a single box, but very few truly fault tolerant solutions exist for SMP VMs. Check out Marathon Technologies http://www.marathontechnologies.com/ for that.  Running 80 VMs on a host that will inevitably fail sometime is not what most people do in production environments.

Jim likes to talk about "sleds."  It must be snowing where he is too.  I find the concept very similar to blade servers (which we also called a paradigm shift five years ago).   Sharing components like power supplies is a great idea, likewise virtualized I/O.  More efficient servers, though, are still an evolutionary and not a revolutionary accomplishment.

A company called 3Leaf  http://www.3leafsystems.com/ seems to have a means of cobbling together a huge SMP box out of smaller ones and sharing memory, but again, aside from HPTC applications, few programs scale up this well in the commodity space.

I am having my cojones busted for suggesting that running high densities of VMs on dirt-cheap white boxes is ill-advised, but I stand by my assertion that if all your eggs are in one basket, don't cheap out on the basket.

Moore's Law has stood the test of quite a bit of time, but at the end of the day, what are the practical ramifications of mega-powerful CPUs without highly available apps to run on them?   If I were to predict the NEXT real paradigm shift, I would look for a more grid-oriented software architecture, where the loss of a single server is insignificant.   Google seems to operate its search this way, but they are an "outlier" with the capacity and finances to focus on a specific area of compute.

IMO, Intel is looking to attach the "Cloud" mojo to a product to jump on the bandwagon just like Compaq called itself the "Non-Stop Internet Company" back in 1999. 





--
Cheers,
Jan

Ricky Ho

Feb 9, 2010, 8:11:38 PM
to cloud-c...@googlegroups.com
As much of application data can be modeled as a graph (e.g. a large social graph), I wonder if anyone has come across any Graph DB offerings in the cloud, so that doing large-scale graph algorithms is possible.

Something like the Pregel model from Google ... Here is the model I'm thinking of ...
http://horicky.blogspot.com/2010/02/nosql-graphdb.html
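
For the curious, here is a minimal single-process sketch of the vertex-centric superstep loop in Python, just to illustrate the model (no partitioning, combiners, or fault tolerance):

    # Minimal Pregel-style BSP loop: each superstep, every vertex consumes
    # its inbox, updates state, and sends messages to its neighbors.
    def pregel(graph, state, inbox, compute, max_supersteps=30):
        for _ in range(max_supersteps):
            outbox = {v: [] for v in graph}
            active = False
            for v in graph:
                msgs, inbox[v] = inbox[v], []
                state[v], sent = compute(v, state[v], msgs, graph[v])
                for dest, m in sent:
                    outbox[dest].append(m)
                    active = True
            inbox = outbox
            if not active:          # no messages in flight: halt
                break
        return state

    # Example: single-source shortest path (in hops) from vertex 'a'.
    graph = {'a': ['b', 'c'], 'b': ['c'], 'c': []}
    state = {v: float('inf') for v in graph}
    inbox = {v: [] for v in graph}
    inbox['a'] = [0]                # seed the source

    def compute(v, dist, msgs, neighbors):
        best = min([dist] + msgs)
        if best < dist:             # improved: propagate to neighbors
            return best, [(n, best + 1) for n in neighbors]
        return dist, []

    print(pregel(graph, state, inbox, compute))  # {'a': 0, 'b': 1, 'c': 1}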

Rgds,
Ricky


Rao Dronamraju

Feb 9, 2010, 9:06:59 PM
to cloud-c...@googlegroups.com

"Today's VM landscape breaks down close to 80% Windows, 15% Linux and 5%
other."

Are you sure about this statistic? With millions of web servers deployed
in the last 10 to 15 years during the internet/web era, and it being my
understanding that the majority of these web servers are Linux/Apache/Tomcat
servers, does MS have such a whopping 80-to-15 edge over Linux? Even if
you include application and database servers, the ratio seems heavily
skewed in favor of MS. Also, most DNS, DHCP, etc. run on Linux more than MS.

About the 48 cores and making use of them, I agree that there are two models
to fully utilize the cores: 1. the load balancing model and 2. the parallelism model.

The LB model is easy. All you have to do is make sure that the scheduler
keeps the cores fully scheduled. The parallelism model is more difficult, as
we all know that parallelization at this scale is difficult from many
different perspectives.

It appears to me that you are talking about Solaris Containers for MS
Windows, is that right?...

I think the biggest challenge will be not only leveraging the cores through
parallelization techniques but managing such mega (millions of VMs)
sprawl. If a cloud has 10,000 servers and each one hosts 240 VMs, that is
2,400,000 VMs in a 10,000-server data center. A 10,000-server data center is
not considered a big cloud, based on what is being built: MS with a
500,000-server cloud in San Antonio and/or Chicago, Amazon already close
to 50,000 I heard, and Rackspace 25,000. So managing all these millions of
VMs is going to be a challenge of galactic proportions.
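
In Python terms, the sprawl arithmetic is short and scary:

    # VM sprawl back-of-envelope: density per host times host count.
    servers, vms_per_server = 10_000, 240
    print(f"{servers * vms_per_server:,} VMs")   # 2,400,000 VMs to manage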

I disagree with your prediction about MS Windows being at 75%. I think Linux
will be the dominant VM OS in the clouds. In fact, it is my guess that the
reason MS is building its own mega data centers is because it knows that
most clouds might use Linux and hence it needs to drive the MS Windows based
clouds itself.

My prediction is that 2012 is when public clouds will see major growth, as by
that time the industry will have solved the security and management problems
that we are seeing today.


Erik Sliman

Feb 9, 2010, 10:49:17 PM
to cloud-c...@googlegroups.com
Jim and Jan,

Your points are all valid, and are primary concerns.  I understand the concerns with memory contention and failover.  Given that this was once the argument for why cheap Wintel boxes could never displace the mainframe, my gut tells me that these high-density multi-core chips do have a real chance in the cloud, if for no other reason than their ability to save space and reduce power consumption.  So, let's play devil's advocate with the concerns.

Memory contention:  imagine 48 cores and 12-channel memory.  What is the REAL % of wait time for the threads?  It seems potentially contentious, but without intricate knowledge of what % of a core's time requires an open memory channel, I don't really know how contentious it will really be, particularly considering potential core idle time.
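
To at least bound it, here is a toy binomial model in Python; p, the fraction of a cycle in which a core wants a channel, is a pure guess:

    # Toy answer to "what is the REAL % of wait time?": if each of 48 cores
    # independently wants a memory channel with probability p in a cycle,
    # how often do more than 12 of them collide? p is an assumption.
    from math import comb

    def p_contended(channels=12, cores=48, p=0.25):
        return sum(comb(cores, k) * p**k * (1 - p)**(cores - k)
                   for k in range(channels + 1, cores + 1))

    for p in (0.10, 0.25, 0.50):
        print(f"p={p}: P(more than 12 of 48 want a channel) = {p_contended(p=p):.3f}")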

Failover:  Why does a VM have to depend on a CPU housing for this?  Let's assume it is using a fiber optic SAN and the memory state of a VM requiring high availability is mirrored.  Why can't the VM fail over to another box if the box it is in fails?  This is, fundamentally, the type of clustering that made cheap hardware capable of being used to build data centers and supercomputers.

Xen does it
http://sheepy.org/node/65

Erik
OpenStandards.net

Peglar, Robert

Feb 10, 2010, 8:14:08 AM
to cloud-c...@googlegroups.com
In corporate America, I don't doubt this statistic. Overall, the x86
server market (physical) was 75% Windows/25% other, mostly Linux. In
other words, if a physical server shipped, 75% of them were imaged with
Windows. It is very reasonable to assume the P->V conversion of these
servers follows the original pattern.

This stat ignores the RISC server market, which is roughly 50% AIX and
25/25 for each of Solaris and HP-UX. But the unit shipments of these
servers are far below x86 servers.

As for the OSes/hypervisors running in cloud compute centers, I would
agree that the majority of them will be Linux/VMware with some Hyper-V
and Citrix. Not a great amount of native Windows, but never
underestimate Microsoft's ability to take market share as time goes by.

Rob


Robert Peglar
Vice President, Technology, Storage Systems Group
Xiotech Corporation | Toll-Free: 866.472.6764
o 952 983 2287 m 314 308 6983 f 636 532 0828
Robert...@xiotech.com | www.xiotech.com

Peglar, Robert

Feb 10, 2010, 8:27:23 AM
to cloud-c...@googlegroups.com

@Erik,

 

Good post.  I agree on your point about running VM farms using shared storage (SAN) – but alas, some folks insist on running VM farms on DAS, believing it to be cheaper.  It’s not, over the long term, but the perception persists.  Also, those folks would rather transfer risk to the user, i.e. VM outages, than construct optimal farms.

 

As for cheap whiteboxes replacing mainframes, it hasn’t happened.  The venerable machine is still around, still running, still a terrific platform for running tons of Linux instances, for example.  It won’t go away, in my lifetime at least.

 

One thing, though: VM failover isn't clustering, it's failover.  The difference is simple: one instance which can move between N platforms, versus N instances (on N platforms) communicating with each other and managing shared resources.  Both are HA tactics, but the method is different.

 

Rob



Jan Klincewicz

Feb 10, 2010, 9:05:39 AM
to cloud-c...@googlegroups.com
Erik:

Failover refers to the ability of an OS instance to RESTART after failure. Failover still means sufficient downtime to re-boot an OS and load apps.

Mirroring the memory state (and CPU state) is what Marathon EverRun does (I mentioned them in my last post).  Marathon can do this for physical as well as virtual Xen servers.  I suppose you could call this a two-node cluster, but the real term is "Fault Tolerant" as opposed to "Highly Available".

A configuration where every VM in a 48-core box was running in lockstep with its twin on a machine powered by a separate grid would be a pretty reliable design.

Jan Klincewicz

Feb 10, 2010, 9:21:52 AM
to cloud-c...@googlegroups.com
Rob:

Clustering is defined differently in the Windows world than the Linux world.  I'd guess the majority of Windows clusters are two-node, active-passive.  I understand the distinction, but unfortunately, the definition evolves to reflect whatever the marketers can get the public to accept.  As even a 2-node cluster requires a quorum disk, shared storage is a must.  Do you really see many users deploying VMs without at least a NAS?  I have seen a few "one-offs" but you pretty much give up all the great benefits of virtualization when you skip a common storage platform.

Peglar, Robert

Feb 10, 2010, 10:16:44 AM
to cloud-c...@googlegroups.com

There are plenty of VMs deployed w/o shared storage, absolutely. This has led to data movers such as VMware SRM, which moves the data for the VM from one disk (or array) to another. Without this piece, VM failover would not be possible. The mere existence of s/w bits like this indicate the demand for time/space tradeoffs, sadly.

Rob

Darren Sykes

Feb 10, 2010, 10:50:33 AM
to cloud-c...@googlegroups.com
My understanding was that the VMware technology to move running machines between storage devices was a direct result of the work they put in to allow their customers to move from VMFS 2 to 3. At the time it was called DMotion (D for data) internally at VMware, but was renamed when they decided to make it a production feature.

We (and most people I've spoken to about this) use the feature when reorganising data or moving between storage platforms. Whilst I suppose it would be useful for DAS VMware environments, I'm certain that's not why it was originally conceived.

________________________________

From: cloud-c...@googlegroups.com on behalf of Peglar, Robert
Sent: Wed 2/10/2010 15:16
To: cloud-c...@googlegroups.com
Subject: RE: [ Cloud Computing ] Moore's Law: The Future of Cloud Computing from the Bottom Up

There are plenty of VMs deployed w/o shared storage, absolutely. This has led to data movers such as VMware SRM, which moves the data for the VM from one disk (or array) to another. Without this piece, VM failover would not be possible. The mere existence of s/w bits like this indicate the demand for time/space tradeoffs, sadly.

Rob


Rao Dronamraju

Feb 10, 2010, 10:59:33 AM
to cloud-c...@googlegroups.com

From this report it appears that the market share seems to be 43% to 15%.

"* Microsoft Windows server revenue was $4.5 billion in 3Q09 showing a 12.8%
year-over-year decline and comprising 43.0% of all server revenue in the
quarter. Windows servers account for the single largest segment, by
operating system, in the worldwide server market.

* Linux server revenue declined 12.6% year over year to $1.5 billion in the
quarter. Linux servers now represent 14.8% of all server revenue, up
slightly from 14.0% a year ago."

http://www.idg.com/www/pr.nsf/0/02732ED3B5E320328525768000659F41

Although the numbers are quarterly year over year, the overall ratio per year
would be close.

Ray Nugent

Feb 10, 2010, 11:08:42 AM
to cloud-c...@googlegroups.com
Darren, how much do you use this feature? Is it critical or nice to have?

Ray



Erik Sliman

Feb 10, 2010, 11:13:01 AM
to cloud-c...@googlegroups.com
Unless you are looking to profit directly from support revenue, revenue stats do not translate into usage when comparing a free OS to a purely commercial one.  What about CentOS?

Jan Klincewicz

Feb 10, 2010, 11:21:12 AM
to cloud-c...@googlegroups.com
There are two different live migration tasks we are talking about.  The best known is called vMotion by VMware (and XenMotion by the folks at Citrix).  I believe MSFT just calls it "Live Migration."  This entails moving a VM from one host to another such that the CPU and memory states are consistent and there is no perceivable downtime during the migration.

Live STORAGE migration, on the other hand (which to my knowledge is unique to VMware), refers to the ability to move the file(s) on disk (containing the VM) from one storage array to another without disrupting the availability of the running VM.  This could obviously come in very handy in a large shop where, for example, a "test and dev" machine passes QA and can be migrated from an inexpensive test NAS to a more robust production FC SAN with no interruption.

For an SMB with a single CLARiiON box or MSA1500, this will not be an issue.  As you can imagine, if you need this feature, you need it very badly, but if you don't, then it is just a checklist item to put on your RFP to ensure VMware wins <g>.
--
Cheers,
Jan

Jim Starkey

Feb 10, 2010, 11:26:14 AM
to cloud-c...@googlegroups.com
The question isn't market share but relative server population. When
extrapolating from server revenues to server population, it is probably
important to keep in mind that Linux is free.


--

Jaymes Davis

Feb 10, 2010, 11:29:21 AM
to cloud-c...@googlegroups.com

OK, it's not SRM (that's Site Recovery Manager); you're talking about Storage VMotion.

Peglar, Robert

Feb 10, 2010, 11:34:14 AM
to cloud-c...@googlegroups.com
Revenue, yes. What I was referring to was unit shipments. Windows
tends to get less revenue/server and the RISC Unix more revenue/server.
So, it's very possible that Windows has 75% of the unit shipments but
only 43% of the revenue.

Rob

Peglar, Robert

Feb 10, 2010, 11:32:35 AM
to cloud-c...@googlegroups.com

Darren, you have the history correct, yes.  But remember many VMs on version 2 were direct attached, and therefore needed the DMotion utility to move them along.  So, the tool really has multiple effects.

 


Erik Sliman

Feb 10, 2010, 11:49:25 AM
to cloud-c...@googlegroups.com
@Jan

You said, "Failover refers to the ability of a OS instance to RESTART after failure."  I don't know how "restart" became a requirement of "failover" in any circle.  I agree with this definition:

"In computing, failover is the capability to switch over automatically to a redundant or standby computer server, system, or network upon the failure or abnormal termination of the previously active application,[1] server, system, or network."  (source:  wikipedia). 

That last sentence on that page ties in interestingly with this conversation, and no, I did not edit any part of the page.  :)

"The use of virtualization software has allowed failover practices to become less reliant on physical hardware."

The cluster statement I made was intended to be more analogous than literal, as clustering tends to address scalability in addition to high availability (HA).  The HA aspect comes into play in that a cluster can usually continue to run uninterrupted when one node fails, with use of that node failing over to other nodes. 

Erik
OpenStandards.net

Darren Sykes

Feb 10, 2010, 12:00:30 PM
to cloud-c...@googlegroups.com
It's a nice-to-have. It occasionally gets us out of a hole, but usually one that wouldn't have been there with adequate planning.
 
However, as has been mentioned, you could use it to move VMs between physical servers 'on the fly' when using DAS. It'd be a pain though: you'd need to SVM the machine to a shared disk (presumably a small holding area on a NAS box), VMotion the machine to another physical server and then SVM it back to DAS on the new host. Shared storage is a much simpler solution which offers many other benefits.
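
In sketch form (storage_vmotion and vmotion below are hypothetical stand-ins that just name the steps, not the real VMware SDK calls):

    # The three-hop DAS-to-DAS live move described above, as a sketch.
    # storage_vmotion() and vmotion() are hypothetical helpers, NOT the
    # actual VMware API; they only name the steps.
    def storage_vmotion(vm, dest_datastore):
        print(f"SVM {vm}: relocate disk to {dest_datastore}")

    def vmotion(vm, dest_host):
        print(f"VMotion {vm}: live-migrate compute to {dest_host}")

    def move_between_das_hosts(vm, dst_host, shared_nas="holding-nas"):
        storage_vmotion(vm, shared_nas)          # 1. SVM onto the shared holding area
        vmotion(vm, dst_host)                    # 2. live-migrate to the new host
        storage_vmotion(vm, f"{dst_host}-das")   # 3. SVM back onto local DAS

    move_between_das_hosts("vm42", "esx02")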
 
I'm not really sure SVM would be of much use in a cloud environment. I suppose it could be used to offer more disk performance at times of peak loading with some intelligence around the migration process, but the actual IO cost of movement would probably be excessive since it's really a series of file copies of all the data that represents the VM.
 
 
 
 


Jan Klincewicz

Feb 10, 2010, 1:52:49 PM
to cloud-c...@googlegroups.com
@Erik

You are correct in the case of active-active clusters (if you are talking about Oracle 10g or OpenVMS or Beowulf clusters, for example).  Under the definition of clustering as it pertains to the current state of the art in x86 virtualization (with few exceptions such as Marathon, which I mentioned, or even VMware's similar feature, which can provide fault tolerance to SINGLE vCPU instances), there will be a re-start of the OS in the event of a cluster member failing.  In these cases, clustering has ZIP to do with "scalability" but does allow for load balancing and additional availability.

I would consider the word "uninterrupted" to perhaps be applicable to the CLUSTER itself, but certainly not to the OS instances and apps running on the failed host.

Rao Dronamraju

Feb 10, 2010, 4:13:19 PM
to cloud-c...@googlegroups.com

I agree, but in the absence of real numbers one can derive some inference...

 

Since Linux is free, the cost (license + service) per unit (server) of Linux should be lower than the cost (license + service) per Windows server, should it not?

 

If the cost per unit is lower, that means the unit numbers are higher. That should mean there are more Linux VMs in deployment (those that are directly attributable to revenue plus those that are installed via free downloads).

 

So the market share of the numbers deployed would be 15+% for Linux and 43% for Windows. I would think the ratio could be something like 30% to 45%.

 

But that is my guesstimate, as I have not been keeping track of Linux vs. Windows (deployed) market share for 3+ years.
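
A quick back-of-envelope in Python; only the revenue figures come from the IDC release above, the average selling prices are pure guesses:

    # Unit share from revenue share, IDC 3Q09 figures.
    # The average selling prices (ASPs) are guesses for illustration only.
    win_rev, linux_rev = 4.5e9, 1.5e9      # from the IDC release above
    win_asp, linux_asp = 5000, 3500        # assumed $ per server
    win_units, linux_units = win_rev / win_asp, linux_rev / linux_asp
    total = win_units + linux_units
    print(f"Windows {win_units/total:.0%}, Linux {linux_units/total:.0%}")
    # With these ASPs: Windows ~68%, Linux ~32% of the units they jointly ship.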

 

 



Greg Pfister

Feb 10, 2010, 6:50:44 PM
to Cloud Computing
My reaction to the original Intel announcement
(http://bit.ly/9ReZCQ):

I echo Jim's comment that it's engineers saying "here's something
we know how to do." ... but I note that "know" is limited
since it's a lab prototype, not a product ... So, OK, they got their
QuickConnect multilink memory interconnect, so now they can string
cores out to as many as they can cram on as big a chip as they can
make. Good test of the cache coherence algorithms.

After that, they say "here are a random bunch of hot topics we think
maybe it might be good for," topics chosen by hardware engineers and
not software guys.

Time will tell whether anybody uses it for anything like what they
speculate. At least this time they're not saying it will enable brain
simulation or something like that.

I'd want to know the memory latency and bandwidth, and the IO latency
and bandwidth, before saying it's good for anything at all.

And I just *love* their equation of 48-way parallel performance to 2-
or 4-way parallel performance, undoubtedly achieved by just multiplying
by the number of processors. Perfect scaling, anybody? Out to 48
cores? ROTFLMAO.

(See my recent Perils of Parallel post about Larrabee memory
bandwidth. Do you want to write code that will be memory starved if on
average it accesses more than 1 byte per instruction?)
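
Putting rough numbers on the starvation point (the clock, IPC, and aggregate bandwidth here are illustrative assumptions, not anything Intel published):

    # Bytes-per-instruction test: demand vs. supply of memory bandwidth.
    # Clock, IPC, and aggregate bandwidth are illustrative assumptions.
    cores, ghz, ipc, bytes_per_instr = 48, 1.0, 1.0, 1.0
    demand = cores * ghz * ipc * bytes_per_instr   # GB/s the cores generate
    supply = 21.0                                  # GB/s assumed memory bandwidth
    print(f"demand {demand:.0f} GB/s vs supply {supply:.0f} GB/s "
          f"-> {'starved' if demand > supply else 'fine'}")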

Greg Pfister
http://perilsofparallel.blogspot.com/

Sassa

Feb 11, 2010, 5:53:51 AM
to Cloud Computing
CPUs already have memory they can use exclusively; it is called cache.
So, more on-chip memory, plus programmatic coherency control, please.
Use DIMMs as a page cache (swap) and a shared mem communication
device.

You have cheap boxes of adequate power because somebody's pushing the
envelope - like not everyone buys a racing car, but any given Escort
gets cheaper because of the people who do. So you can't say the future
belongs to scale-out alone.


Sassa


Greg O'Connor

Feb 11, 2010, 9:10:23 AM
to cloud-c...@googlegroups.com

I think you guys have completely missed the point about what is happening in the virtualization market, in the data center and in particular on the Windows platform.

 

@jim & @jan – Both of you talk about an application @jim “In the absence of such an application..” @jan “There are very few bits of code out there used for commercial purposes today that can take advantage of a modern Quad processor”.

 

I surely have done a poor job of communicating in this blog.

 

The point is that hypervisors and OSes can run multiple OSes or apps and partition the workload across all these cores. It is not about a single app!  Unless, of course, we are talking about the hypervisor or OS as an app that schedules, partitions and utilizes the silicon.

 

It is about a lot of apps that quite frankly are not doing a lot of stuff concurrently.

 

Here is the data from Gartner "Dataquest Insight: Virtualization Market Size Driven by Cost Reduction, Resource Utilization and Management Advantages," 5 January 2009:

                          2008       2009       2010       2011
Windows (Server)        $5,419     $5,952     $6,457     $6,907
Sun Solaris             $1,362     $1,366     $1,377     $1,383
Linux (Server)          $1,407     $1,568     $1,771     $1,980
IBM AIX                 $1,010     $1,021     $1,042     $1,050
IBM System z              $973       $975       $978       $980
HP-UX                     $821       $829       $844       $855
Total                  $10,993    $11,711    $12,469    $13,155

 

If you assume all the boxes cost the same, then Windows has a 50% share of the number of boxes that ship. Clearly they have more on a numbers basis, since they on average cost less than the UNIX ones listed above. I will see if I can get the numbers of units shipped that go along with this report.
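
The ~50% figure checks out against the 2008 column in one line of Python:

    # Windows share of the 2008 total from the Gartner table above.
    print(f"{5419 / 10993:.1%}")   # -> 49.3%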

 

Let me go on record that I agree with your points.

- Memory contention will be a challenge (the big point of the blog)
- I/O bandwidth is as well (no help here offered)
- Throughput does not scale up as fast as the number of cores
- There are *few* SINGLE applications that can take advantage of multi-core

 

I was at a large bank in NYC discussing this with them last month. They have 70,000 server machines (physical). Close to 50,000 run Windows. 98% of those Windows machines run 1 application; actually less than one, because an app is often 3 tiers and spread across multiple machines. Most of these "apps" run at <10% CPU utilization. There is a huge cost to running 25,000 apps that do very little work compared to a month-end process or a risk analysis app. I asked how much it costs to run and maintain 25,000 apps that do very little; they would not say...

 

Can you imagine the pain and cost of managing and running all the little apps in a huge data center? This argument is not about clustering, HPC or fail-over; it is about the tons of little apps that consume way too much of the budget.

 

I would bet that 20% of the apps make up 80% of the total CPU and I/O consumption of a Fortune 1000 data center.
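
A crude consolidation estimate under these numbers (the 60% target utilization per consolidated host is my assumption, and it ignores memory and I/O headroom):

    # Crude consolidation math for the long tail: 25,000 one-app servers
    # at ~10% CPU, packed to an assumed target utilization.
    import math

    servers, avg_util, target = 25_000, 0.10, 0.60
    hosts_needed = math.ceil(servers * avg_util / target)
    print(hosts_needed, f"({1 - hosts_needed / servers:.0%} fewer boxes)")
    # -> 4167 hosts instead of 25,000, about 83% fewer boxes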

 

The cloud chip is for these 80% of the apps.  Stacking them up and running 200+ copies of the same operating system is silly and will make the memory contention issues even worse. @jim, I have seen many a post from you about how wasteful an OS on top of an OS is; why not this time?

 

It is interesting to me that public clouds are mostly Linux; heck, Rackspace doesn't even have a Windows offering yet. Clearly public clouds are about developing new applications. As the cloud market matures I am hopeful that it will deal with the long tail of applications, not just the 10-20% of the apps that get the most visibility.

 

I also agree that everyone just sticks in the word cloud to be part of the movement.

 

I most likely can't produce a Gartner report that says the long tail of business applications runs on the Windows platform, but if there were one I would be willing to bet my kids' college tuition that this is true. I have 5 boys; it is a lot of money!

 

One more time:

- Anyone who wants to keep their job at a Fortune 1000 only runs one app per OS on a Windows server
- The # of VMs that run a Windows-only application will increase dramatically from 10+ today over the coming years
- In many cases the VMs will be running the same version of the OS, consuming tons of memory
- Memory access is one of the challenges in utilizing all the cores in a multi-core chip
- A giant reduction in running all these copies of the OS can be attained if you run more than one app per OS
- Windows needs better app isolation to run more than one app at a time
- A good way to reduce the 80% of the apps that do only 20% of the work is to consolidate them

 

Any better?

 

It is the boring unwashed mass of applications that are not the least bit technically challenging. I should have stated that right up front... My bad.

 

Cheers

GregO

 

 

 

 

 

 



Rao Dronamraju

Feb 11, 2010, 11:22:25 AM
to cloud-c...@googlegroups.com

"I was at a large bank in NYC discussing this with them last month. They have 70,000 server machines (physical). Close to 50,000 run Windows. 98% of those Windows machines run 1 application; actually less than one, because an app is often 3 tiers and spread across multiple machines. Most of these "apps" run at <10% CPU utilization. There is a huge cost to running 25,000 apps that do very little work compared to a month-end process or a risk analysis app. I asked how much it costs to run and maintain 25,000 apps that do very little; they would not say..."

 

Greg, did you ask them why, even after virtualization has been around for nearly 10 years and with some 50,000+ servers running 1 application each at <10% CPU utilization, the bank has not thought of server consolidation as a first step? The bankers are probably too busy giving each other $140 billion(!) bonuses from the TARP money. :-)

 

"The cloud chip is for these 80% of the apps. Stacking them up and running 200+ copies of the same operating system is silly and will make the memory contention issues even worse. @jim, I have seen many a post from you about how wasteful an OS on top of an OS is; why not this time?"

 

Yes, I agree with your point that 200+ copies of the OS are not necessary. But "OS on top of OS": a hypervisor is not an OS in the traditional sense. There is a LOT of OS functionality that is NOT in the hypervisors. If you look at KVM, it is only a Linux module plus QEMU, not a lot of duplicated OS function like other hypervisor folks have done, especially in the area of I/O. Although, I must add that the KVM folks had the luxury of learning from the other hypervisor folks and improvising/innovating on it. Most hypervisors are around 100,000 lines of code. In addition, if the workload is application centric, then most of the pages in memory would be the application plus those pages of the OS that are absolutely needed. But your point about why run 200 copies of the OS is well taken. So it appears to me that you are proposing something like Solaris Containers on Windows, is that right?

 

Your Gartner table below seems to suggest that Windows Server has the highest cost reduction, resource utilization and management advantages. Is this because they have the highest cost, lowest resource utilization and management disadvantages, so that when they are virtualized you get the most out of them? Does this also mean Linux & Unix have very little inefficiency built into them, hence very little ROI in cost reduction, resource utilization and management advantages?

 

 



Miha Ahronovitz

Feb 11, 2010, 11:46:08 AM
to cloud-c...@googlegroups.com
Greg,

I read this post completely absorbed. I think it has some of the most relevant content ever posted on this group, and we have had many good posts. The observation that out of 70,000 physical servers at a customer in NYC the average utilization is 10%, in the year 2010, shows the complete waste of resources, power, and people. After decades of working in distributed resource management (DRM) software - whose goal is 100% utilization - we are no better at large scale. The few optimally run clouds are the exception confirming the rule of waste.

There is NO magic OS, v12n, or other software solution with universal adoption that solves the resource optimization question. There are hundreds of thousands of data centers and organizations where 10% utilization and lower is the norm. Too scattered for a common solution.

Without a goal, there is no way to optimize anything. Optimize for what? We can optimize for response time in compute-intensive work, for throughput, for constant levels of service in a persistent application, or for maximum revenues in a pay-per-use cloud. All of these concepts are goals different from 100% utilization. To conclude, the cloud business model will probably induce:

  1. An optimization of computer resources for any goal
  2. The ability to optimize for more meaningful goals instead of only the ubiquitous 100% utilization
  3. Consistency of optimizations. It is much easier to optimize per core, not per CPU, in a specific cloud versus on a worldwide scale. (SGE does per-core optimizations for Nehalem processors.)
  4. The ability of a public cloud running optimally (say 100%) to offer service at lower costs to a client organization with 10% utilization and make a huge profit

I am not an expert on v12n (few people on this group are), but virtualization is nothing but one detail in the arsenal of potential tools for cloud engineers designing a cloud with clearly defined goals. What Greg did is present further data promoting clouds as inevitable.

Clouds are the natural anti-oxidant diet that will naturally solve the data center malaise beyond expectations.

Miha

PS: Regarding Windows servers having 50% of the market... Computerworld says:

"We're told Linux is the only OS with a growing market share: Windows and Mac OS X actually shrank. The Net Applications report also shows Windows 7 already dwarfing all versions of Mac OS combined."
As of February 1, W7 has a 10% share among desktop OSs, up from 2.8% in November.
http://tinyurl.com/y9a32au

My own data in recent years in HPC showed 80% of downloads are for Linux, 10% for Windows and 10% all others. The data centers are heavily Linux shops. Windows is still desktops and localized clusters.




Erik Sliman

Feb 11, 2010, 1:14:09 PM
to cloud-c...@googlegroups.com
@Greg

Good post.  There is a lot of useful info to digest in your post.  Your perspective is a healthy contribution to cloud discussions. 

In the applications arena, Java is currently the #1 language, accounting for nearly 20% of new lines written.  Amazingly, C is still #2, despite being the oldest language in the top 10.  Java was created to be "write once, run anywhere," permitting nearly all Java apps to run on all the popular operating systems.  C has become very portable, particularly in the server apps market.

You pointed out that most public clouds run Linux. In the server market, Java is very dominant, and runs on all the operating systems you mentioned, Linux, Unix and Windows.  C, also, tends to be very portable, permitting Apache and nearly all the DBMS to run on all the major operating systems.  PHP, the #3 language according to TIOBE, is also OS independent. 

What I believe is happening in the DC market is that the OS is being challenged to provide value beyond being a user interface we are familiar with.  It is becoming a commodity in the server market, with the best differentiation being its cloud capabilities (VM automation and provisioning).  On this playing field, an OS that is both free and driven by the largest contribution of continuous innovation with contributors such as IBM, Novell and RedHat has the best advantage.  In looking at both the OS level and application level solutions that are creating our public cloud infrastructure, I see both Linux (OS) and Java (app language) currently dominating.

The public cloud does not require new apps to thrive.  It is a new home for old apps.  As for apps dependent on Windows, such as IIS or SQL Server, they'll depend on Windows based public clouds.  But, IIS has never been the predominant web server, nor has SQL Server been the predominant DBMS.  Many of those three tier apps have Apache on the web server, and DB2, Oracle or MySQL in the database tier, and Java, PHP and Python in the middle (e.g., WebSphere, WebLogic, JBoss), particularly among your Fortune 500 companies. 

Many of those apps run on Wintel in DCs today, but not because they can't run on Linux.  The primary resistance I've seen to Linux in the DC has been DC operators themselves who don't want to "support multiple OSes" and are "comfortable with Windows", probably rooted in its use as their desktop.  These same DCs tend to house Unix, but Unix is often run by different operators.  Clouds are changing this as these are the very roles being automated by virtualization and the continued automation the cloud brings.  Licensing cost is increasing in importance while labor cost is decreasing in importance in the DC TCO equation.  The Windows sales pitch for DC TCO has always been about human productivity, not licensing cost. 

Erik

Jan Klincewicz

unread,
Feb 11, 2010, 2:42:46 PM2/11/10
to cloud-c...@googlegroups.com
I hate to inject reality into a good infomercial, but is anyone considering that the ability to host more VMs per server because of additional cores does absolutely nothing to ensure the availability of those VMs should a host fail?  This has nothing to do with the potential for utilization, balancing I/O, and memory contention.  I just don't see a "paradigm shift" in an evolutionary step of adding additional cores to a CPU so more VMs can run on a server which will inevitably fail because of a $2.00 component.

Certainly, virtualization allows fuller use of a host's resources, and I don't think anyone has argued against that in the past decade.  I stand by my assertion that until fault tolerance is part of any platform, adding workloads "just because you can" is ill-advised regardless of the hypervisor, OS, or app.
Cheers,
Jan

Ricky Ho

unread,
Feb 11, 2010, 8:04:37 PM2/11/10
to cloud-c...@googlegroups.com

The cloud computing environment has some inherent constraints and limitations, such as high latency and eventual consistency.  The Cloud MapReduce implementation has some interesting tricks to get around these constraints, and I think it will be very useful for architects designing cloud-based apps.


http://horicky.blogspot.com/2010/02/cloud-mapreduce-tricks.html


Of course, Cloud MapReduce itself is a strong alternative to Hadoop, which may also be of interest to some members of this group.

  

Comments and feedback are welcome.

 

Rgds,

Ricky


Greg Pfister

unread,
Feb 12, 2010, 1:00:38 PM2/12/10
to Cloud Computing
Greg,

As others have said, you have lots of good data here, and it
corresponds to what I also know. I happened to see, two years ago,
utilization data collected from 1000s of server systems, over many
months, that showed the mean, median, and mode of the utilization to
all be 10%-15%. This was not just Wintel or Lintel systems; this
included big *nix boxes, too.

So, I AGREE WITH YOU. Virtualization is good and an appropriate way
forward and a good way -- a slam-dunk -- to use multiple cores. See my
blog post of a while ago on "Why IT departments should not fear
multicore."
(http://bit.ly/buT1gW)

But that's not the point. The point for me is: What does a 48-core
chip have to do with fixing this mess of single-app/single machine?
(Assuming agreement that it is a mess; some would disagree.)

The answer is: Nothing. Zero. Not one blessed thing.

*If* everybody were _already_ happily virtualized or partitioned and
running multiple apps on single boxes, *then* a gazillion cores per
chip would be relevant as a way to reduce the number of boxes.

But they're not already virtualized. As you said. And having this
hardware won't help them do so; the inhibitors have nothing to do with
the number of cores. And we don't even know it's any good, really.
It's just a "here's what I know how to do" from Intel Labs.

Greg Pfister
http://perilsofparallel.blogspot.com/

On Feb 11, 7:10 am, "Greg O'Connor" <oc.g...@gmail.com> wrote:
> I think you guys have completely missed the point about what is happening in
> the virtualization market, in the data center and in particular on the
> Windows platform.
>
> @jim & @jan – Both of you talk about an application @jim “In the absence of
> such an application..” @jan “There are very few bits of code out there used
> for commercial purposes today that can take advantage of a modern Quad
> processor”.
>
> I surely have done a poor job of communicating in this blog.
>
> The point is that Hypervisors and OS can run multiple OS or apps and
> partition the workload across all these cores. It is not about a single app!
>  Unless of course we are talking about the hypervisor or OS as an app that
> schedules, partitions, and utilizes the silicon.
>
> It is about a lot of apps that quite frankly are not doing a lot of stuff
> concurrently.
>

> Here is the data from *Gartner Dataquest Insight: Virtualization Market Size


> Driven by Cost Reduction, Resource Utilization and Management Advantages,

> 5 January 2009*


Jim Starkey

unread,
Feb 12, 2010, 10:10:22 AM2/12/10
to cloud-c...@googlegroups.com
There is nothing inherent about eventual consistency in the cloud.  It is one technique.  There are others that are ACID.

Let us not confuse "can't" with "don't know how".


-- 
Jim Starkey
Founder, NimbusDB, Inc.
978 526-1376

Miha Ahronovitz

unread,
Feb 12, 2010, 4:28:19 AM2/12/10
to cloud-c...@googlegroups.com
Ricky, your Cloud MapReduce tricks post is fascinating, but it is still too specialized to reach the mainstream non-programming reader. I have two questions, both related to the original paper by Huan Liu and Dan Orban regarding the usage of MapReduce in a cloud:

1. "Building a highly scalable system is not an easy task.
We have to invest in a great deal of engineering efforts
to make sure not only the overall system, but also every
single component are robust and scalable. Worst yet, we
tend to redo everything for the next system that we have
to build. In this paper, we explore a new way of building
these systems, i.e., building them on top of a cloud
OS. Because of its scale (both the size of the infrastructure
and the number of customers), a cloud vendor has to
spend a large amount of engineering efforts to make its
services scalable, possibly more scalable than any other
implementations."


This is fascinating, but it implies that the complexity of a Cloud MR is an insurmountable hurdle for commercial success. I wonder if it is not much easier to manage an Apache Hadoop application integrated with a DRM scheduler (like Sun Grid Engine). Then MapReduce Hadoop applications can be managed from within the data center. In light of these complexities, why should we bother with Cloud MR? (Of course, this is a rhetorical question meant to make you teach us :-) )


2. "We have implemented MapReduce on top of the Amazon
cloud OS. Our implementation has three primary advantages.
First, it is simpler. It has 3,000 lines of Java
code, two orders of magnitude simpler than a traditional
implementation. Second, our implementation is more
scalable because there is no single point of scalability
bottleneck, and we show experimentally that this is true.
Last, our implementation is faster. In one case, it is 60
times faster than the Hadoop implementation."


Wow, 60 times faster than Hadoop? No single point of failure to break the scalability? But with "only" 3,000 lines of Java code, and a lot of tricks one has to know to produce those 3,000 lines, how can Cloud MR become mainstream?
Can an oil company looking at humongous data sets answer a high-level question like "Will Arctic drilling cool high oil prices?" as a MapReduce application? If the answer to such questions can come 60 times faster in a Cloud MR than in Apache Hadoop, imagine how much faster a project is completed....

Maybe we should invent a new, improved, MapReduce-focused Cloud OS. The Amazon Cloud OS was not written with MapReduce in mind, and this is why Huan and Orban's achievement is so remarkable, yet tough to implement...
What we need is a good product manager to match the benefits of a Cloud MR built from scratch to the needs of cash-rich large corporations for the type of applications described above, and make a private cloud for their internal use. Those corporations can easily duplicate the MR Amazon WS as their secure internal cloud. Another option is to create a new company operating this cloud for large institutions. All we need is good field work to make sure we have the ACHING NEEDS ADDRESSED and that we know for sure, by name, the first 20 customers who will buy the Cloud MR services...

2 cents,

Miha

From: Ricky Ho <rickyp...@yahoo.com>
To: cloud-c...@googlegroups.com
Sent: Thu, February 11, 2010 5:04:37 PM
Subject: [ Cloud Computing ] Cloud Map Reduce Tricks

Ricky Ho

unread,
Feb 12, 2010, 1:48:19 PM2/12/10
to cloud-c...@googlegroups.com
I am trying to illustrate a technique for tackling a common problem in the cloud, since a good portion of cloud storage has the eventual-consistency characteristic.

Of course, using an ACID DB avoids this problem.  So what tricks should I talk about for working around it?

Rgds,
Ricky

From: Jim Starkey <jsta...@nimbusdb.com>
To: cloud-c...@googlegroups.com
Sent: Fri, February 12, 2010 7:10:22 AM
Subject: Re: [ Cloud Computing ] Cloud Map Reduce Tricks

Ricky Ho

unread,
Feb 12, 2010, 3:22:54 PM2/12/10
to cloud-c...@googlegroups.com
I think the first point may be mis-interpreted.

What the author means is:
1) Building a highly scalable MR from scratch is complex.  But building it on top of a Cloud OS (like Amazon AWS) is NOT.  Looking at the Cloud MR design, it is much simpler than Hadoop, because all the scalability, availability, and resilience come for free.  There are only 3,000 lines of Java code in the Cloud MR implementation.  This is looking from an MR technology provider's standpoint.

From an administrator's standpoint, operating Cloud MR will definitely be similar to using Amazon's Elastic Map/Reduce.  Both of these will be easier than operating your own Hadoop cluster yourself.

At this moment, the Hadoop ecosystem is much larger for sure (HBase, Hive, Pig, Mahout .... etc).  So Cloud MR is not attractive to those MR users who want to stay at higher-level language semantics (like Hive and Pig).  But I guess there is still a fair number of lower-level MR users (writing Java code) who will find Cloud MR interesting.

2) Although the Amazon Cloud is not designed specifically for MapReduce workloads, it targets a more generic distributed architecture that benefits MapReduce workloads as well.  But I don't expect that performance-wise it will be significantly faster than a specially designed Hadoop cluster.

I won't take the 60 times faster too seriously.  The experiment was probably done by combining many files into fewer large files to reduce the number of mappers; otherwise, the central coordinator in a Hadoop environment quickly becomes a bottleneck.  Hadoop is fundamentally optimized for disk streaming I/O, so large files are a must.  On the other hand, Cloud MR argues that network bandwidth is higher than disk bandwidth, and so just uses network storage from the Cloud OS.  I think this is a sound argument.

That said, I am not promoting the use of Cloud MapReduce, as there are many other factors to consider.  I am looking from a pure architectural-elegance point of view and capturing some of the lessons learned that can be applied to other areas of designing cloud apps.

Rgds,
Ricky


From: Miha Ahronovitz <mij...@sbcglobal.net>
To: cloud-c...@googlegroups.com
Sent: Fri, February 12, 2010 1:28:19 AM
Subject: Re: [ Cloud Computing ] Cloud Map Reduce Tricks

Miha Ahronovitz

unread,
Feb 13, 2010, 8:59:00 PM2/13/10
to cloud-c...@googlegroups.com, Miha Ahronovitz
Ricky, there is a definite need to apply MR on a commercial scale right
now. If Hadoop entered places like the New York Times, it was because it
had employees who were visionaries and whiz kids who took destiny into
their own hands. See the interview with Derek Gottfrid. He attended the
world Hadoop conference because they gave him a discount, went back to
NYT and implemented it. See what Derek says:

> *Gottfrid:* I've been working with Hadoop for the last three years.
> Back in 2007, the New York Times decided to make all the public domain
> articles from 1851-1922 available free of charge in the form of images
> scanned from the original paper. That's eleven million articles
> available as images in PDF format. The code to generate the PDFs was
> fairly straightforward, but to get it to run in parallel across
> multiple machines was an issue. As I wrote about in detail back then,
> I came across the MapReduce paper from Google. That, coupled with what
> I had learned about Hadoop, got me started on the road to tackle this
> huge data challenge.
http://saviorodrigues.wordpress.com/2009/09/11/whats-the-nyt-doing-with-hadoop/

I am wondering how many customer-organizations in this world would need
to implement similar, if not identical, projects, among thousands more
equally straightforward problems which could change the information and
decision-making game forever. Whom shall they call? Cloudera? They are a
handful of people, and they are Apache Hadoop, not Cloud MR.

After reading your post, Ricky, Amazon should do something: hire people
with the same know-how as Huan Liu and Dan Orban and offer an easy-to-use
Amazon MR API. But then you are talking of "HBase, Hive, Pig,
Mahout .... etc". What are those, and who uses them, and why? Why not
Java? When should one use a public cloud and when in-house tools for MR?
Can you write a post with your thoughts?

I am not sure how many people in this group really know what MapReduce
is. Sure, they know it is a fantastic tool that fits the concept of the
cloud, but they have no idea whom to call or what to learn first.

There is a huge space to be filled by new start-ups who not only know
technically what they are doing, but have a solid business case for MR
as well.

Thanks Rick,

Miha

> ------------------------------------------------------------------------
> *From:* Miha Ahronovitz <mij...@sbcglobal.net>
> *To:* cloud-c...@googlegroups.com
> *Sent:* Fri, February 12, 2010 1:28:19 AM
> *Subject:* Re: [ Cloud Computing ] Cloud Map Reduce Tricks

> ------------------------------------------------------------------------
> *From:* Ricky Ho <rickyp...@yahoo.com>
> *To:* cloud-c...@googlegroups.com
> *Sent:* Thu, February 11, 2010 5:04:37 PM
> *Subject:* [ Cloud Computing ] Cloud Map Reduce Tricks


Ricky Ho

unread,
Feb 13, 2010, 11:01:43 PM2/13/10
to cloud-c...@googlegroups.com
Miha,


You are probably right. It looks like the majority of the audience in this group are biz folks, not tech folks.

Regarding cloud computing and MapReduce, from the email thread I feel that the cloud computing audience focuses more on infrastructure and operational aspects (virtualization, economics), while the MapReduce audience focuses more on algorithmic aspects (how to structure the processing in an easily parallelizable fashion). Their focuses are quite different at the moment.

Although it is technically possible to run a large Map/Reduce job in the cloud (e.g. run your Hadoop cluster in EC2, or use Amazon's Elastic Map/Reduce), I don't know if any large enterprise is doing this in production yet. I honestly think the bandwidth cost (and the time for data upload into the cloud) is prohibitive for doing large-scale parallel processing in the cloud. There are some mitigation techniques that I mention in an earlier blog post at http://horicky.blogspot.com/2009/08/skinny-straw-in-cloud-shake.html but this is a hard problem in my opinion. I am also looking forward to Amazon publishing some large-scale reference customer cases using Elastic Map/Reduce.

I am not advocating that we should use Cloud MR; as I said, the community behind Hadoop is much bigger (and the presence of the strong Cloudera consulting team is another consideration as well). I am trying to articulate the simplicity (and hence elegance) of Cloud MR's architecture, which comes from being built on top of a Cloud OS.

"Why not Java ?" I don't know how to answer this question. But programming language is just a tool and you use different tools at different level of abstraction. For example, most of the time when designing a parallel algorithm I will use higher level language like PIG / Hive. And when the design looks right, then we can rewrite the algorithm in Java (if you want more control than what the Pig / Hive compiler gives you). But using Java or not is an implementation decision, not important at the design phase.

But hearing your advice, going down the Pig/Hive route may cause even more confusion. For those who are interested, there is another mailing list in the Apache Hadoop project.

Sure, I'll write more blog posts on this as I learn more along the way. And thanks for your detailed feedback and comments.

Rgds,
Ricky


----- Original Message ----
From: Miha Ahronovitz <mij...@sbcglobal.net>
To: cloud-c...@googlegroups.com

Cc: Miha Ahronovitz <mij...@sbcglobal.net>
Sent: Sat, February 13, 2010 5:59:00 PM
Subject: Re: [ Cloud Computing ] Cloud Map Reduce Tricks

Ricky, there is a definite need to apply MR on a commercial scale right now. If Hadoop entered places like the New York Times, it was because it had employees who were visionaries and whiz kids who took destiny into their own hands. See the interview with Derek Gottfrid. He attended the world Hadoop conference because they gave him a discount, went back to NYT and implemented it. See what Derek says:

> *Gottfrid:* I’ve been working with Hadoop for the last three years. Back in 2007, the New York Times decided to make all the public domain articles from 1851-1922 available free of charge in the form of images scanned from the original paper. That’s eleven million articles available as images in PDF format. The code to generate the PDFs was fairly straightforward, but to get it to run in parallel across multiple machines was an issue. As I wrote about in detail back then, I came across the MapReduce paper from Google. That, coupled with what I had learned about Hadoop, got me started on the road to tackle this huge data challenge.

Rao Dronamraju

unread,
Feb 13, 2010, 11:38:43 PM2/13/10
to cloud-c...@googlegroups.com

Miha and Ricky,

A lot of people on this forum know about MR.

It is a "niche" area right now. Just like cloud data bases are new, MR based
applications are also new and there is still plenty of time 2 to 3 years or
more down the line for MR to become mainstream.

A lot of people for some reason think that what Google is doing or did is
everything with respect to the cloud. What Google does is a very specialized
search niche application.

The other 90%+ of the world doesn't give a damn about search; they may use
it to a lesser extent in the form of intranet search provided by
applications like SharePoint etc., but ERP, CRM, billing systems, messaging,
collaboration, e-commerce, etc. are the REAL applications of the REAL IT
world.

Right now, it is the time of IaaS, PaaS and SaaS in the context of migrating
existing enterprises, SMBs and startups to the cloud as much AS IS as
possible, so that there is minimal cost to the customers. Think of how many
data centers are out there in the WORLD in the above market segments. It is a
HUGE market. MR is a relatively small market at this time.

Exotic applications are going to be on the back burner for a while,
especially given the bad economy and investment.

Ray Nugent

unread,
Feb 14, 2010, 12:51:54 AM2/14/10
to cloud-c...@googlegroups.com
Ricky, don't let a couple of stuffed shirts prevent you from posting here. You've got great insight. Keep it coming!

Ray

Sent: Sat, February 13, 2010 8:01:43 PM

Miha Ahronovitz

unread,
Feb 14, 2010, 2:16:09 AM2/14/10
to cloud-c...@googlegroups.com, Miha Ahronovitz
Rao, I can hardly agree with your statements below:

> MR is relatively small market at this time.
> Exotic applications are going to be on the back burner for a while,
> especially given the bad economy and investment.

You think of Google as an "exotic" application, while the DC IaaS and SaaS
are more important because of the recession? I do agree the DC cloud is very
important.
I know many people know how MapReduce works, but few understand the
business potential of MR. MR is equally important to PaaS, SaaS, etc.

Hadoop runs in a disruptive environment (relative to a DC), and the
biggest hurdle is that it requires a separate environment from the data
center, and a separate set of skills. As Cloudera's CTO said, Hadoop (and
MR and the rest) are big hammers in search of nails, for every conceivable
business from oil and gas to analyzing business data (quotes, invoices
in massive volumes). This is why SGE (a DRM) was integrated with Hadoop.
See this customer quote:

> Sun Grid Engine 6.2 Update 5 allows us to run Hadoop jobs within
> exactly the same scheduling and submission environment we use for
> traditional scalar and parallel loads.
>
> Before Sun Grid Engine 6.2 Update 5 we were forced to either dedicate
> specialized clusters or to make use of convoluted, ad-hoc, integration
> schemes; solutions that were both expensive to maintain and
> inefficient to run. Now we have the best of both worlds: high
> flexibility within a single, consistent and robust scheduling system."
You can read the whole quote at http://www.sun.com/software/sge/ (which
is now part of Oracle).

Ricky's post presented another intriguing idea: what about a MapReduce on
top of Amazon, since we have so many complementary services to the cloud
APIs already provided? Assuming Amazon manages to deliver the technology
Ricky describes, we have another way to make MR mainstream, with HUGE
impact on the way we get information to run a business. If, in a business
negotiation, we have asymmetric information, meaning one party knows much
more than the other party, guess who is the winner?

If one day MR is as easy to run as any service offered in the data
center, MR applications will inundate the enterprise, literally. I will
go as far as to say that organizations that ignore Hadoop and MR will no
longer be in business 10 years from now.

Miha


Peglar, Robert

unread,
Feb 14, 2010, 8:28:21 AM2/14/10
to cloud-c...@googlegroups.com
The only thing that is holding back MR implementations in cloud is our
old friend, data.

MR is data-intensive by definition. After all, one of the 'tricks' in
MR is to take huge amounts of data and split it up into smaller
individual files so the M routines can ingest it.

Once again, it's not cloud compute that is 'hard', it's storage. The
mere scheduling of CPU resources is very simple, and we've been doing it
in various formats/fashions for decades now. In fact, one can argue
that cloud compute is a return to batch jobs.

But unless the cloud storage mechanisms get to a point where they are
standardized - via CDMI - and onerous charges for storage and retrieval
aren't in the business model - MR won't take off as a viable model for
the cloud.

Data is central, compute is peripheral.

Rob


Robert Peglar
Vice President, Technology, Storage Systems Group
Xiotech Corporation | Toll-Free: 866.472.6764
o 952 983 2287 m 314 308 6983 f 636 532 0828
Robert...@xiotech.com | www.xiotech.com


Ricky Ho

unread,
Feb 14, 2010, 12:16:16 PM2/14/10
to cloud-c...@googlegroups.com

Miha,

Running Hadoop on Amazon EC2 is available today. You can either DIY and install Hadoop on EC2, or pay 15% more in EC2 charges to use Elastic Map Reduce.

At the time Hadoop (also GFS, Google MR) was designed, there was no cloud concept out there. Hadoop (pretty much based on the Google model) is fundamentally designed to run in a data center where everything is under your full control, which is very different from the cloud environment we have today.

1) Hadoop has focused a lot on optimizing disk performance (e.g. large files, sequential access ... etc). This is no longer important in the cloud, as disk I/O has turned into network I/O.
2) Hadoop has focused a lot on optimizing network I/O (e.g. replica placement, data colocation ... etc). This again is no longer important, as you have no control over the location of data placement.
3) Hadoop assumes a static infrastructure (highly distributed, large numbers of commodity machines) but doesn't take advantage of the power of elasticity, which is a major strength of the cloud environment. For example, in Hadoop you cannot add more Mappers or Reducers to speed up the execution after the job has started.

Therefore, even though you can run Hadoop in the Cloud today, I seriously doubt the overall architecture is optimized.

So what do we need? A specially designed SCHEDULER tailored for (a) the Map/Reduce load characteristics and (b) the cloud environment characteristics.

Hadoop's current scheduler does (a) very well, but fails at (b).
I haven't looked at SGE in much detail, but I would be surprised by any general-purpose scheduler that can do (a) very well. I also think (b) is quite different from the federated grid that SGE was originally designed for. I'd love to be proved wrong.
Cloud MR, to me, seems to be closer, because its design definitely has both (a) and (b) in mind.

However, I think figuring out "What application can I restructure to run in Map/Reduce?" is even more important than "How do I run Map/Reduce efficiently in the infrastructure environment?". I am referring to the transformation of a sequential algorithm into a parallel one.

And yes, I completely agree with you. We are at the beginning of an important phase.

Rao Dronamraju

unread,
Feb 14, 2010, 12:07:45 PM2/14/10
to cloud-c...@googlegroups.com

Miha,

I was not saying anything against Ricky's idea or the discussion that you
both were having.

I was referring to two statements in your discussion.

"I am not sure how many people on this group, know really that Map Reduce
is. Sure they know it is a fantastic tool that fits the concept of cloud,
but have no idea whom to call or what to learn first."

"You are probably right. It looks like the majority audience in this group


are biz folks, not tech folks."

Both of the above statements are wrong...


-----Original Message-----
From: cloud-c...@googlegroups.com
[mailto:cloud-c...@googlegroups.com] On Behalf Of Miha Ahronovitz
Sent: Sunday, February 14, 2010 1:16 AM
To: cloud-c...@googlegroups.com; Miha Ahronovitz

Jan Klincewicz

unread,
Feb 14, 2010, 9:24:14 AM2/14/10
to cloud-c...@googlegroups.com
Rao:

I agree that more potential customers are looking to move their COTS applications over to the cloud than are hoping to re-engineer their businesses to somehow accommodate the MapReduce model.  Searching for one-offs to prove Hadoop's value will find enough candidates to make a case for its importance.

Shiny new toys are attractive and fun.  Then you grow up and realize you have to buy a lawn mower because the shiny new toy won't keep the grass from growing.
Cheers,
Jan

Jan Klincewicz

unread,
Feb 14, 2010, 10:05:16 AM2/14/10
to cloud-c...@googlegroups.com
Rick:

I think there is a split between biz and tech folks on this forum (the numbers would certainly suggest that both are represented) and there are a few individuals with a foot in both camps.  While technology for its own sake is important (and is too often forsaken for want of immediate profits), I think there is a keen interest in "what can help me now."

As it stands, virtualization is a mature and pretty easily understood technology which easily complements "business as usual" - it just condenses it and makes it more efficient.  Massive parallelism will require a change in the way most apps today are constructed and would also touch every aspect of a data center, from operating systems to networking to storage.  No doubt developers who take a long time horizon will be able to build very robust apps with this model, apps which can better withstand the weaknesses of cheap hardware... a weakness that plagues the current approach.

This is not a topic that should be causing fights on this forum.
Cheers,
Jan

Greg Pfister

unread,
Feb 14, 2010, 5:29:06 PM2/14/10
to Cloud Computing
Agreed. It's the data.

As was said a long, long time ago by one of the guys involved in the
founding of this here whole internet thingy:

Distributed computing, FOO. Tell me where the data is, and I'll tell
you where the computing must be.

Greg Pfister
http://perilsofparallel.blogspot.com/

On Feb 14, 6:28 am, "Peglar, Robert" <Robert_Peg...@xiotech.com>
wrote:


> The only thing that is holding back MR implementations in cloud is our
> old friend, data.
>
> MR is data-intensive by definition.  After all, one of the 'tricks' in
> MR is to take huge amounts of data and split it up into smaller
> individual files so the M routines can ingest it.
>
> Once again, it's not cloud compute that is 'hard', it's storage.  The
> mere scheduling of CPU resources is very simple, and we've been doing it
> in various formats/fashions for decades now.  In fact, one can argue
> that cloud compute is a return to batch jobs.
>
> But unless the cloud storage mechanisms get to a point where they are
> standardized - via CDMI - and onerous charges for storage and retrieval
> aren't in the business model - MR won't take off as a viable model for
> the cloud.
>
> Data is central, compute is peripheral.
>
> Rob
>
> Robert Peglar
> Vice President, Technology, Storage Systems Group
> Xiotech Corporation | Toll-Free: 866.472.6764
> o 952 983 2287  m 314 308 6983  f 636 532 0828  

> Robert_Peg...@xiotech.com | www.xiotech.com


Ricky Ho

unread,
Feb 14, 2010, 5:29:23 PM2/14/10
to cloud-c...@googlegroups.com
Jan,

I completely agree.  There is no benefit in restructuring an existing app to Map/Reduce if it is running fine today.  But there is a lot of economic benefit in moving to the cloud.
I think Map/Reduce is more about enabling new apps (which wouldn't be possible without a highly parallel architecture) than about migrating existing apps.

But the world is getting more and more data, and whichever business can sooner extract useful information from these vast amounts of data will be the winner of the future.  This is where Map/Reduce shines.

Rgds,
Ricky


From: Jan Klincewicz <jan.kli...@gmail.com>
To: cloud-c...@googlegroups.com
Sent: Sun, February 14, 2010 7:05:16 AM

Rao Dronamraju

unread,
Feb 14, 2010, 4:23:29 PM2/14/10
to cloud-c...@googlegroups.com

"Data is central, compute is peripheral."

A very interesting statement!

If you look at it from the perspective of the most fundamental of all
processing units (the human brain) and data (the world around us), is data
peripheral or is processing peripheral?...
From your own (brain/processing) perspective, the data/world becomes peripheral.
But from others' perspective, you become data, hence you (your brain, which is
data now) also become peripheral for them.

So the bottom line seems to be that processing and data are both central and
peripheral at the same time. Is this the duality of the nature of data?... or
of processing?...

Just food for thought!

Rao Dronamraju

unread,
Feb 14, 2010, 3:17:35 PM2/14/10
to cloud-c...@googlegroups.com

"You think of Goggle as an "exotic" application, while the DC IaaS, SaaS
are more important because of the recession? I do agree DC cloud is very
important I know many people know how Map Reduce works, but few understand
the
business potential of MR. MR is equally important to PaaS, SaaS etc.."

Many people do not stop and ask one simple question: what is the value that
Google provides in terms of search?... Can you live without Google
search?... Of course 90%+ of the people in this world CAN LIVE without Google
search. Most of Google's usage is because it is FREE. It is no different from
the 100s of unimportant channels that you get in a TV package. If the cable
company's charges go up, people will immediately drop all the unnecessary
channels. Same with most web surfing stuff. It has not become an essential
service like email, ERP, billing systems, etc. in a business.

MR is a very niche application. You have to have MAP and REDUCE as your
fundamental abstractions. Sure, you will have MAP as your fundamental
abstraction in many data-intensive applications. But how many have
REDUCE?... Do genomic applications have REDUCE?...
Do financial applications have REDUCE?... You can always justify it by
saying that you always DERIVE a PATTERN or an INFERENCE from large amounts
of data, so it is a REDUCE abstraction. But where are MAP and REDUCE in an
e-commerce application?... Where are MAP and REDUCE in FICO, MM, SCM of
ERP?...

"I know many people know how Map Reduce works, but few understand the
business potential of MR. MR is equally important to PaaS, SaaS etc.."

At this time the business potential is limited, unless they convert all
applications and their data models to MR. Remember we solve world's
problems. You do not create problems to fit to certain abstrations and then
say here is the solution to the problem because we have an algorithm. You
have a solution for the problem, not a problem for a solution.

"As Cloudera CTO said, Hadoop ( and MR and the rest), are big hammers in
search for nails"

It doesn't matter whether you have a big hammer and/or nails, you need to
have something that needs nailing. If there is nothing that needs nailing,
there is no need for the hammer and nails.

"If one day MR is as easy to run as any services offered in the Data
Center, MR application wil innundate the Enterprise, litterally. I will
go as far as to say that organizations that ignore HAdoop and MR will no
longer be in business 10 years from now."

I totally disagree...it is like saying internet.web is all about surfing.
Although surfing is a large segment primarily consisting of consumers,
remember the B2B, B2C, E2E market segments of internet/web, they makeup for
the backbone of the VALUE delivered, not surfing. Similarly, google's search
and facebook do not make up the VALUE of internet/web/cloud, it is the
business services on these that have the real VALUE delivery.


-----Original Message-----
From: cloud-c...@googlegroups.com
[mailto:cloud-c...@googlegroups.com] On Behalf Of Miha Ahronovitz
Sent: Sunday, February 14, 2010 1:16 AM
To: cloud-c...@googlegroups.com; Miha Ahronovitz

Peglar, Robert

unread,
Feb 14, 2010, 7:10:04 PM2/14/10
to cloud-c...@googlegroups.com
Thanks @Rao - indeed, food for thought. Expanding on the brain analogy
a bit, the data is gathered into the brain - where data can reside at
rest, or be acted upon, as the brain sees fit. Or, even forgotten!
So, in that sense, data is central. The brain has various ways of
gathering the data - the senses - and processes it, then acts (or not) on it
via gross motor activity or other mechanisms (e.g. speech, emotions,
etc.)

The brain is absolutely fascinating, no doubt, as it is its own little
'cloud', I suppose, with all the elements necessary for both compute and
storage. Plus, it commands an entire network (the nervous system) to
get data and send commands. Great design, no question.

Put another way ("data is central, compute is peripheral"), there is
plenty of data without compute, but there is no compute without data.

In the current thread about MapReduce, if the MR codes can't ingest data
fast enough to be efficient, they are not particularly viable. This is
where the rubber hits the cloud road. MR compute in the cloud is
exciting, but getting the MR data efficiently _into_ the cloud is the
trick.

Rob

Miha Ahronovitz

unread,
Feb 14, 2010, 8:42:11 PM2/14/10
to cloud-c...@googlegroups.com
Rao, it's very tempting to continue the exchange, but I will stop here.
All I can say is that I went through a similar transition myself. I did not
understand the business potential of MapReduce to begin with, only to
reach the stage of saying WOW. I know from the few years I have read you on
this list that you will discover it by yourself in your "Kaizen" of
continuous learning and improvement. The only reason people could not
solve the problems MapReduce solves is that it did not exist.

100 years ago, there was no way to travel from America to Europe in one
day. The reason is that commercial aviation did not exist. But to leave
the metaphors aside, here is a New York Times article about Cloudera,
published last year:

http://www.nytimes.com/2009/03/17/technology/business-computing/17cloud.html?_r=1

Look how everyone is smiling in the photo: Christophe Bisciglia, Amr
Awadallah, Jeff Hammerbacher and Mike Olson. What is great about them is
that they see what other people cannot see. This is what the Google founders
did: they saw what others did not notice. By the time we, the "masses",
understand, our opportunity will be to apply, hat in hand, for jobs in
their billion-$ companies...

Also read "What You Didn�t Know About Cloudera"

http://gigaom.com/2010/02/10/what-you-didnt-know-about-cloudera/

> But Olson delivered a surprise when he said that it's wrong to assume
> that his company is solely focused on open source software. On the
> contrary, Cloudera will diversify out of a strategy focused solely on
> it. "Either this quarter or next we will offer an enterprise software
> bundle consisting of proprietary enhancements for Hadoop users," Olson
> said. "Our proprietary apps will complement the open source core, and,
> like Facebook and Yahoo, we continue to have core committers to Hadoop."
Enterprise? Hadoop? Proprietary? We cannot ignore the MapReduce
contributions to the cloud... After all, this is the cloud group on Google.

Cheers,

Miha


Jim Starkey

unread,
Feb 14, 2010, 8:53:34 PM2/14/10
to cloud-c...@googlegroups.com
Greg Pfister wrote:
> Agreed. It's the data.
>
> As was said a long, long time ago by one of the guys involved in the
> founding of this here whole internet thingy:
>
> Distributed computing, FOO. Tell me where the data is, and I'll tell
> you where the computing must be.
>
>

It's the idea that the data resides at a single location that is the
problem.

The data needs to be wherever it's needed. There isn't a world
bandwidth shortage, so the challenge is how to use it wisely. (Hint: a
distributed file system synchronized by an inter-galactic lock manager
isn't the answer.)

--
Jim Starkey
NimbusDB, Inc.
978 526-1376

Ray Nugent

unread,
Feb 14, 2010, 10:29:56 PM2/14/10
to cloud-c...@googlegroups.com
Is there? Without compute, how do you know there is data at all? If it's there but can't be comprehended, does it matter? If a tree falls in the forest...

Ray


From: "Peglar, Robert" <Robert...@xiotech.com>
To: cloud-c...@googlegroups.com
Sent: Sun, February 14, 2010 4:10:04 PM

Subject: RE: [ Cloud Computing ] Cloud Map Reduce Tricks

Ricky Ho

unread,
Feb 14, 2010, 10:45:14 PM2/14/10
to cloud-c...@googlegroups.com
Getting data into the cloud is not an easy problem, but there are some strategies that I find useful in mitigating some of the issues:

1) Create the data at the Cloud in the first place
2) Move the code to the data
3) Partition the data according to processing patterns to maximize data collocation
4) Push more processing at the data source


Please share your bag of tricks as well.

Rgds,
Ricky

----- Original Message ----
From: "Peglar, Robert" <Robert...@xiotech.com>
To: cloud-c...@googlegroups.com
Sent: Sun, February 14, 2010 4:10:04 PM

Ricky Ho

unread,
Feb 14, 2010, 11:22:37 PM2/14/10
to cloud-c...@googlegroups.com
Computer science is about building a "general" model so that a broad range of problems can "look the same" and hence we can apply a "general" solution to solve them. If you have a powerful solution, it is not a bad idea to transform your problem into the form that the solution is designed to solve. For example, a lot of real-life problems can be represented as graph problems and can benefit from the use of graph algorithms.

Inventing a problem for a solution is also not a bad idea. Sometimes we don't recognize a need because we cannot imagine beyond what seems possible.

In my opinion, we should always look at things from both directions. You are absolutely right that we should look at our problem and pick the best technology solution. But we should also look at new technologies and imagine what kind of opportunities that it has enabled.

Map/Reduce, in my opinion, belongs to the latter. If you are completely happy with your existing application, there is no point in transforming it to Map/Reduce. But on the other hand, there are many new opportunities around you that Map/Reduce has enabled. Of course, you can choose to just focus on what you have and ignore these opportunities. And your competitors will be very happy that you do that.

I am surprised to find that quite a lot of basic algorithms (sort, search, statistical calculation, matrix, query joins, graph, machine learning ... etc) can be represented using Map/Reduce and hence enjoy the power of parallelism.

There are two types of applications: transaction systems (most of the examples that you mentioned) and analytic systems. A transaction system is about many concurrent users making simple transactions; Map/Reduce is not relevant for that one. But for an analytic system, where you want to look across a large amount of raw information to extract insight, Map/Reduce is a key enabler.

Rgds,
Ricky

----- Original Message ----
From: Rao Dronamraju <rao.dro...@sbcglobal.net>
To: cloud-c...@googlegroups.com

Ray Nugent

unread,
Feb 14, 2010, 11:33:37 PM2/14/10
to cloud-c...@googlegroups.com
If the cloud is in your datacenter, then getting the data into it is simple. Again, it's not the data, it's the cloud...


From: Ricky Ho <rickyp...@yahoo.com>
To: cloud-c...@googlegroups.com
Sent: Sun, February 14, 2010 7:45:14 PM

Ray Nugent

unread,
Feb 14, 2010, 11:42:35 PM2/14/10
to cloud-c...@googlegroups.com
Just tweeted by Werner Vogels...

"Very honored to have speaking slot at June #nosql event. Returning to my roots. Some of our industry's smartest people are in that movement."

From: Ricky Ho <rickyp...@yahoo.com>
To: cloud-c...@googlegroups.com
Sent: Sun, February 14, 2010 8:22:37 PM

Rao Dronamraju

unread,
Feb 14, 2010, 11:43:54 PM2/14/10
to cloud-c...@googlegroups.com

“Put another way ("data is central, compute is peripheral"), there is
plenty of data without compute, but there is no compute without data.”

 

Data is converted into information through processing (even if it is in the brain)

Data has no value. Information has value.

So data without processing is useless.

 

Please note that data is processed by perception and at this stage has no value because of a lack of semantics. Only when it is processed through cognition does it acquire semantics and become information, and hence have value.

Information/Value = Cognition ( No Value = Perception (Data))

 


From: cloud-c...@googlegroups.com [mailto:cloud-c...@googlegroups.com] On Behalf Of Ray Nugent


Sent: Sunday, February 14, 2010 9:30 PM
To: cloud-c...@googlegroups.com

Rao Dronamraju

unread,
Feb 15, 2010, 12:05:56 AM2/15/10
to cloud-c...@googlegroups.com

"Computer Science is about building a "general" model so a broad range of
problems can "looks the same" and hence we can apply a "general" solution to
solve a broad range of problems. "

Ricky, it is not just computer science. All human learning is about
generalization of the world/universe. This is the only way human beings can
learn. In fact, this is the essence and foundation of machine learning.
Machines cannot learn unless human beings make them learn, and the only way
human beings can make a machine learn is to duplicate the human learning
process in machines. So this is the foundation and essence of human learning,
machine learning, and learning in all fields of study, not just computer science.

"In my opinion, we should always look at things from both directions. You
are absolutely right that we should look at our problem and pick the best
technology solution. But we should also look at new technologies and
imagine what kind of opportunities that it has enabled."

Sure, I agree; especially with new and highly disruptive areas you need to do
that. In fact, we had a discussion on this forum about Intel's 48-core systems
and how they can be applied in the ML and AI areas.

"There are two types of applications. Transaction system (most of the
example that you mentioned about) as well as Analytic system. Transaction
system is about many concurrent users making simple transaction. Map/Reduce
is not relevant for this one. But for analytic system where you want to
look across large amount of raw information to extract insight. Map/reduce
is a key enabler for this."

Yes, I agree with you. In fact, MR is a great candidate for future
applications like analytics, ML, AI, etc. But the real world runs largely
on transactional systems; the existing data centers are heavily
transactional in nature. So MR is great for future applications, but not
suitable for the large transaction-based IT industry that exists today.

Peglar, Robert

unread,
Feb 15, 2010, 7:02:24 AM2/15/10
to cloud-c...@googlegroups.com
Hi Ricky,

Having "the data come to you" is indeed the trick, or as you say
creating the data inside the cloud in the first place. This works
fairly well if the data is highly fragmented, can be assembled over long
periods of time and has little or no cost involved in the gathering (and
waiting). For example, O(millions) of people typing a few sentences on
keyboards, which is transmitted over relatively low-speed links to a
cloud. Over time, the cloud contains many TB of data. That's
'creating' the data at the cloud in the first place.

However, many commercial compute jobs - that IT directors want to move
into the cloud - already have petabytes of data at rest to be analyzed
and/or used in compute jobs. Moving those datasets into a cloud is
cost-prohibitive at best and logistically impossible at worst. Plus,
moving a small portion of that data into a cloud, computing, and moving the
results back to the commercial institution is also a non-starter, mostly
due to the transport time involved. Large datasets are going to stay
put and private clouds will have to be built around them, and/or
non-cloud compute resources put in place (such as those already present
in major datacenters worldwide).

The only way these commercial jobs will migrate into a cloud is through
sufficient bandwidth and sufficiently high-speed links, such as the
overlay Internet2, where one can put up a 10gb/s virtual circuit between
two points. This kind of connectivity and bandwidth is essential to
public cloud's success for anything other than small datasets.

There is also promising R&D now in the areas of data reduction
(compression, incrementalization, deduplication) that will also help.
So far, though, no major breakthroughs. I remain hopeful.

Sassa

unread,
Feb 15, 2010, 7:06:59 AM2/15/10
to Cloud Computing
Jan,

There was a discussion in this group arguing about the actual law of
occurrence of failures. I personally found it very interesting that the
failures were shown NOT to be independent events. If you search the
archives, you'll find a reference to a statistical study showing that CPU
and disk failures in the same room had a short lag between them. Mirroring
the state of VMs to a box powered from a different grid is power-supply
fault tolerance. I can't tell if that's a bigger deal than a "$2.00
component" failure. Can you? I am just curious how concerned I should be.


If 1 X-core box is N times cheaper than X/Y Y-core boxes, I can afford
to waste N times more resources (this automatically means I am not
reaching the capacity of either resource). Alternatively, I can afford
N times more guests (with the overhead guests being passive, waiting for
failover), and reduce time to recovery. Powering off the cores without
tasks is easier to achieve in a larger box than in smaller boxes
(6 x quad-cores @70% = 6 cores idle = can be powered off; 1 x 24-core
@70% = 7 cores idle = can be powered off).


Then the failover cost - yes, it is more expensive to migrate the
whole 24-core box than one quad-core (kinda "stop the world" for more
vs. fewer VMs at once). But does this mean you need to do it at the same
rate? Of course not.


Several people banged on about "memory bandwidth". I buy this as a
prospect. What is a reliable way to estimate it for any given
application?


Sassa


On Feb 10, 2:05 pm, Jan Klincewicz <jan.klincew...@gmail.com> wrote:
> Erik:
>
> Failover refers to the ability of an OS instance to RESTART after failure.
> Failover still means sufficient downtime to re-boot an OS and load apps.
>
>   Mirroring the memory state (and CPU state) is what Marathon EverRun
> does. (I mentioned them in my last post.)  Marathon can do this for physical
> as well as Virtual Xen servers.  I suppose you could call this a two-node
> cluster, but the real term in "Fault Tolerant" as opposed to "Highly
> Available".
>
> A configuration where every VM in a 48-core box was running in lockstep with
> its twin on a machine powered by a separate grid would be a pretty reliable
> design.
>
> On Tue, Feb 9, 2010 at 10:49 PM, Erik Sliman <erikslima...@gmail.com> wrote:
> > Jim and Jan,
>
> > Your points are all valid, and are primary concerns.  I understand the
> > concerns with memory contention and failover.  Being that this was once an
> > argument on why cheap wintel boxes could never displace the mainframe, my
> > gut tells me that these high-density core chips do have a real chance in the
> > cloud, if, for no other reason, than for their ability to save space and
> > reduce power consumption.  So, let's play devil's advocate with the
> > concerns.
>
> > Memory contention:  imagine 48 cores and 12-channel memory.  What is the
> > REAL % of wait time for the threads?  It seems potentially contentious, but
> > without intricate knowledge of what % of a core's time requires an open
> > memory channel, I don't really know how contentious it will really be,
> > particularly considering potential core idle time.
>
> > Failover:  Why does a VM have to depend on a CPU housing for this?  Let's
> > assume it is using a fiber optic SAN and the memory state of a VM's
> > requiring high availability is mirrored.  Why can't the VM fail over to
> > another box if the box it is in fails?  This is, fundamentally, the type of
> > clustering that made cheap hardware capable of being used to build data
> > centers and supercomputers.
>
> > Xen does it
> >http://sheepy.org/node/65
>
> > Erik
> > OpenStandards.net
>
> > On Tue, Feb 9, 2010 at 7:57 PM, Jan Klincewicz <jan.klincew...@gmail.com>wrote:
>
> >> Fundamentally, I agree with Jim.  There are very few bits of code out
> >> there used for commercial purposes today that can take advantage of a modern
> >> Quad processor.  Once a chip has gone multi-core, adding additional cores is
> >> not that big a deal, and we see at this point the move from quad core to hex
> >> core gives about a 30% increase. This is not linear.  Also, servers are not
> >> CPU bound, thus we get diminishing returns pumping them full of cores.
> >> Right now they are out of sync with I/O and Storage.  Memory has become much
> >> less expensive, but we have precious few applications that can use more than
> >> 4GB since most of the world runs 32-bit code. I quoted 4GB of server memory
> >> to customer today at $208.00 U.S.
>
> >> HPTC outliers : Please refrain from reminding me how your protein folding
> >> app can chew up a TB of RAM and 64 cores ...   What's your email server
> >> running ?
>
> >> Virtualization is a paradigm that already shifted 10 years ago.  It IS a
> >> good way to squeeze more resources out of a single box, but very few truly
> >> Fault Tolerant solutions exist for SMP VMs. Check out Marathon Technologies
> >>http://www.marathontechnologies.com/ for that.  Running 80 VMs on a host
> >> that will inevitably fail sometime is not what most people do in Production
> >> environments.
>
> >> Jim likes to talk about "sleds."  It must be snowing where he is too.  I
> >> find the concept very similar to blade servers (which we also called a
> >> paradigm shift five years ago.)   Sharing components like power supplies is
> >> a great idea, likewise virtualized I/O.  More efficient servers, though are
> >> still an evolutionary and not revolutionary accomplishment.
>
> >> A company called 3Leaf  http://www.3leafsystems.com/ seems to have a
> >> means of cobbling together a huge SMP box out of smaller ones and sharing
> >> memory, but again. aside from HPTC applications, few programs scale up this
> >> well in the commodity space.
>
> >> I am having my cojones busted for suggesting that running high densities
> >> of VMs on dirt-cheap white boxes is ill-advised, but I stand by my assertion
> >> that if all your eggs are in one basket, don't cheap out on the basket.
>
> >> Moore's Law has stood the test of quite a bit of time, but at the end of
> >> the day,what are the practical ramifications of mega-powerful CPUs without
> >> highly available apps to run on them ?   If I were to predict the NEXT real
> >> paradigm shift, I would look for a more grid-oriented software architecture,
> >> where the loss of a single server is insignificant.   Google seems to
> >> operate its search this way, but they are an "outlier" with the capacity and
> >> finances to focus on a specific area of compute.
>
> >> IMO, Intel is looking to attach the "Cloud" mojo to a product to jump on
> >> the bandwagon just like Compaq called itself the "Non-Stop Internet Company"
> >> back in 1999.
>
> > @cloudcomp_group
> > Post Job/Resume athttp://cloudjobs.net


> > Buy 88 conference sessions and panels on cloud computing on DVD at
> >http://www.amazon.com/gp/product/B002H07SEC,

> >http://www.amazon.com/gp/product/B002H0IW1Uor get instant access to


> > downloadable versions at
> >http://cloudslam09.com/content/registration-5.html
>
> > ~~~~~
> > You received this message because you are subscribed to the Google Groups
> > "Cloud Computing" group.
> > To post to this group, send email to cloud-c...@googlegroups.com
> > To unsubscribe from this group, send email to
> > cloud-computi...@googlegroups.com
>
> --

> Cheers,
> Jan

Jan Klincewicz

unread,
Feb 15, 2010, 1:31:50 PM2/15/10
to cloud-c...@googlegroups.com
I'm sure there are numerous statistical studies of component failure rates.  Having sold servers for about a decade, I have anecdotal evidence of angry customers (surprisingly few, compared to the number of servers I saw go out the door) who purchased RAID controllers, battery-backed write cache, N+1 redundant power supplies, etc., only to be done in by a CPU voltage regulator module.  I'm not claiming I have statistical evidence to back me up, only that I have seen instances where a cheap, simple electronic component took down a $10,000 server otherwise outfitted with every conceivable high-availability option.  It can and does happen.

Based on that, it is academic how many VMs one runs on any given server.  The real question is the cost of downtime.  You can lose a number of web servers, and probably app servers too, and still load-balance through it.  But a down database on the back end of an n-tier e-commerce stack can render the rest of the servers useless until it is back up.  Cluster-type failover still pretty much requires time for an OS to boot, load its services, connect to networks and storage, and then load the app.  Running off SSD and stripping unnecessary services from the OS can help, but in some cases those 60 seconds can lose a lot of money.







--
Cheers,
Jan

Jeff Darcy

unread,
Feb 15, 2010, 2:22:19 PM2/15/10
to cloud-c...@googlegroups.com
On 02/15/2010 01:31 PM, Jan Klincewicz wrote:
> Cluster-type failover still pretty much requires time for an OS to
> boot, load its services, connect to networks and storage, and then
> load the app.

That's not even close to true.  Failover can only occur when another
node knows enough cluster state to recognize the necessity, and it would
have to be up to know that.  I was working on HA clusters as long ago as
1992 which could fail over *much* faster than the nodes could boot, and
the state of the art hasn't regressed since then.  The limiting factor
tends not to be the speed with which anything can be brought up (nodes
can be held in varying states of readiness, from "barely booted" to
"actively processing the same input but with output disabled", depending
on need) but the time necessary to keep transient problems from causing
premature failover.
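
To make that concrete, here is a minimal failure-detector sketch in Python (all names and thresholds invented for illustration; one heartbeat channel, no real cluster state).  The MISS_LIMIT threshold is exactly the knob described above: it trades detection speed against premature failover on transient stalls.

# Minimal failure-detector sketch (all names and thresholds invented).
# A warm standby -- services already loaded, output disabled -- counts
# missed heartbeats and promotes itself only after MISS_LIMIT consecutive
# misses; no OS boot or app load is involved at takeover time.

HEARTBEAT_INTERVAL = 1.0   # seconds between expected heartbeats
MISS_LIMIT = 3             # consecutive misses before declaring failure

class Standby:
    def __init__(self, now):
        self.last_seen = now
        self.missed = 0
        self.active = False

    def on_heartbeat(self, now):
        # A heartbeat arrived from the active node; reset the counter.
        self.last_seen = now
        self.missed = 0

    def poll(self, now):
        # Called roughly once per HEARTBEAT_INTERVAL by a timer.
        if self.active:
            return
        if now - self.last_seen > HEARTBEAT_INTERVAL:
            self.missed += 1
            self.last_seen = now
        if self.missed >= MISS_LIMIT:
            self.take_over()

    def take_over(self):
        # No OS boot, no app load: just enable output and claim the
        # service address, so failover completes in seconds.
        self.active = True
        print("standby promoted to active")

s = Standby(now=0.0)
s.on_heartbeat(now=1.0)        # active node still alive at t=1.0
for t in (2.5, 3.7, 4.9):      # then it goes silent
    s.poll(now=t)              # third missed interval triggers takeover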

Ricky Ho

unread,
Feb 15, 2010, 3:00:56 PM2/15/10
to cloud-c...@googlegroups.com
Hi Rob,

Let's say that to conduct weekly analytic processing, you need to gather the last five years of data, which is in the multi-petabyte range.

This huge volume of data can be partitioned into two parts:

A = what you have produced over the last five years.
B = what you just produced over the last week.

You ship the hard drives containing A to Amazon and generate B in the cloud from then on.
Will this work?

I assume:
1) The output of the analytic result is small, so downloading it is not a concern.
2) There is a storage cost to keep petabytes of data in the Amazon cloud long term, which I think is acceptable.
3) There is no security concern with storing the data in the Amazon cloud long term, which I think is reasonable, because otherwise you wouldn't consider running Map/Reduce in the cloud anyway.

Peglar, Robert

unread,
Feb 16, 2010, 7:23:23 AM2/16/10
to cloud-c...@googlegroups.com
Won't work, at least not reliably or with integrity at non-trivial
amounts of data.

One of the problems is that joining datasets A and B is not trivial.
You'd have to not just collect set B (e.g. transaction logs, ordered
updates, etc.) but also build a mechanism to play that I/O back into set
A in order.  In other words, it's not just a pile of data; the ordering
of the I/O counts as well.  Now, if you have pure unstructured data, a
bit pile, it's much easier to perform union(A, B).
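
To make the ordering point concrete, a small sketch under invented assumptions (a toy record format, not any real replication protocol):

# Sketch of the ordering problem described above (record format invented
# for illustration): set B must be replayed into set A as an *ordered*
# update log; applying the same records in arrival order can differ.

def replay(base, log):
    """Apply (sequence_no, key, value) updates to `base` in log order."""
    for _, key, value in sorted(log, key=lambda rec: rec[0]):
        base[key] = value
    return base

A = {"acct-1": 100}                            # base dataset shipped on drives
B = [(2, "acct-1", 250), (1, "acct-1", 175)]   # updates, arrived out of order

print(replay(dict(A), B))   # {'acct-1': 250} -- sequence numbers respected

# A naive union that ignores sequence numbers keeps whichever record
# happened to arrive last -- fine for a pile of unstructured bits,
# wrong for transactional data.
naive = dict(A)
for _, key, value in B:
    naive[key] = value
print(naive)                # {'acct-1': 175} -- stale value wins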

1) Agree, analytic results are usually small and not a concern.
2) Disagree; the current business models are very much skewed toward
small datasets.  I've already posted about how it's more cost-efficient
to buy and operate your own terabyte than it is to upload it to S3, keep
it, and download it (once) in a given year.  When you are charged 10
cents/month per GB just for pure storage, and also 10 cents per GB
transferred in or out, that's cost-inefficient.  In order to compete
with efficient onsite storage, the pricing should be at least an order
of magnitude cheaper.  In terabyte terms, it's $100/TB/month and another
$100 to transfer it one way, one time; that's $1,200/TB/year just for
storage (a quick back-of-envelope script follows this list).  I know
plenty of enterprise storage vendors that would be delighted to sell you
a terabyte that runs for 5 years for $6,000, or a petabyte for $6
million.  At that rate, they'd pay your co-lo bill too, and be _way_
money ahead.
3) There are certainly security concerns, unless you store the data
encrypted at rest, which alleviates some of them.  More importantly,
there are huge integrity concerns in large public stores, because they
typically deploy storage devices that do not support DIF, the Data
Integrity Field (e.g. SATA disk drives).  No DIF, no integrity at rest.
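
Here is the back-of-envelope script referenced in point 2, using only the rates quoted there (10 cents/GB-month stored, 10 cents/GB transferred each way; the $6,000/TB on-prem figure is the example above):

# Back-of-envelope comparison using the rates quoted above.
STORE_PER_GB_MONTH = 0.10
XFER_PER_GB = 0.10
GB_PER_TB = 1000

def s3_cost_per_tb(years, uploads=1, downloads=1):
    storage = STORE_PER_GB_MONTH * GB_PER_TB * 12 * years
    transfer = XFER_PER_GB * GB_PER_TB * (uploads + downloads)
    return storage + transfer

print(s3_cost_per_tb(1))   # $1,400: ~$1,200 storage + $200 transfer
print(s3_cost_per_tb(5))   # $6,200 over 5 years -- versus $6,000 on-prem,
                           # before co-lo, power, and admin on either side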

Cheers

Jan Klincewicz

unread,
Feb 16, 2010, 9:04:05 AM2/16/10
to cloud-c...@googlegroups.com
Erik:

In the example you reference, the memory state of a VM is NOT mirrored.  The example is one of common failover, as the statement right up front implies: "I've done some testing and all appears to work fine. However let me stress that this is not live migration so you would suffer about a minute or so outage"

I think I had mentioned previously that both VMware and XenServer (via Marathon Technologies) do offer fault tolerance via lockstepping (mirroring both memory and CPU state across two hosts).  To date, the issue has been that this was limited to VMs with a single vCPU.  Marathon had intended to solve the Virtual SMP issue by now, but I have not seen any information specifically addressing that on their website.

This goes beyond failover, as there is zero downtime.  Although the technique has been used with identical physical servers in the past, it is much more cost-effective with VMs.



<<Failover:  Why does a VM have to depend on a CPU housing for this?  Let's assume it is using a fiber-optic SAN and the memory state of a VM requiring high availability is mirrored.  Why can't the VM fail over to another box if the box it is in fails?  This is, fundamentally, the type of clustering that made cheap hardware capable of being used to build data centers and supercomputers.

Xen does it
http://sheepy.org/node/65 >>




On Tue, Feb 9, 2010 at 10:49 PM, Erik Sliman <eriksl...@gmail.com> wrote:
Jim and Jan,

Your points are all valid, and they are primary concerns.  I understand the worries about memory contention and failover.  But given that this was once an argument for why cheap Wintel boxes could never displace the mainframe, my gut tells me that these high-density-core chips do have a real chance in the cloud, if for no other reason than for their ability to save space and reduce power consumption.  So, let's play devil's advocate with the concerns.

Memory contention:  imagine 48 cores and 12-channel memory.  What is the REAL % of wait time for the threads?  It seems potentially contentious, but without intricate knowledge of what % of a core's time requires an open memory channel, I don't know how contentious it will be in practice, particularly considering potential core idle time.
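
One crude way to frame that question, with made-up demand figures rather than measurements: treat the 12 channels as a shared pool and see when aggregate demand saturates it.

# Crude saturation model (made-up demand figures, not measurements):
# 48 cores share 12 memory channels; each core wants a channel some
# fraction `p` of its cycles.  Once aggregate demand exceeds the 12
# channels, the excess shows up as stall time.

CORES, CHANNELS = 48, 12

def stall_fraction(p):
    demand = CORES * p              # expected channels wanted at once
    if demand <= CHANNELS:
        return 0.0                  # ignores queueing bursts entirely
    return 1 - CHANNELS / demand    # share of memory requests that wait

for p in (0.10, 0.25, 0.50, 0.75):
    print(f"per-core demand {p:.0%}: ~{stall_fraction(p):.0%} stalled")
# 10% -> 0%, 25% -> 0%, 50% -> 50%, 75% -> 67% under this toy model;
# real contention also depends on burstiness, caches, and NUMA layout.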

Failover:  Why does a VM have to depend on a CPU housing for this?  Let's assume it is using a fiber-optic SAN and the memory state of a VM requiring high availability is mirrored.  Why can't the VM fail over to another box if the box it is in fails?  This is, fundamentally, the type of clustering that made cheap hardware capable of being used to build data centers and supercomputers.

Xen does it
http://sheepy.org/node/65

Erik
OpenStandards.net
On Tue, Feb 9, 2010 at 7:57 PM, Jan Klincewicz <jan.kli...@gmail.com> wrote:
Fundamentally, I agree with Jim.  There are very few bits of code out there used for commercial purposes today that can take advantage of a modern quad processor.  Once a chip has gone multi-core, adding more cores is not that big a deal, and we see at this point that the move from quad-core to hex-core gives about a 30% increase.  This is not linear.  Also, servers are not CPU-bound, so we get diminishing returns pumping them full of cores; right now they are out of sync with I/O and storage.  Memory has become much less expensive, but we have precious few applications that can use more than 4GB, since most of the world runs 32-bit code.  I quoted 4GB of server memory to a customer today at $208.00 U.S.
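
That 30% figure is roughly what Amdahl's Law predicts if the workload is about 90% parallelizable (the 0.9 here is an illustrative choice to match the observation, not a measured figure):

# Amdahl's Law, for context on the quad-to-hex figure above.
# `f` is the parallelizable fraction of the workload; 0.9 is an
# illustrative assumption, not a number from this thread.

def speedup(f, cores):
    return 1 / ((1 - f) + f / cores)

f = 0.9
quad, hexa = speedup(f, 4), speedup(f, 6)
print(f"4 cores: {quad:.2f}x   6 cores: {hexa:.2f}x   "
      f"gain: {hexa / quad - 1:.0%}")       # ~30%
print(f"48 cores: {speedup(f, 48):.1f}x")   # only ~8.4x at f = 0.9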

HPTC outliers: please refrain from reminding me how your protein-folding app can chew up a TB of RAM and 64 cores...  What's your email server running?

Virtualization is a paradigm that already shifted 10 years ago.  It IS a good way to squeeze more resources out of a single box, but very few truly fault-tolerant solutions exist for SMP VMs; check out Marathon Technologies http://www.marathontechnologies.com/ for that.  Running 80 VMs on a host that will inevitably fail sometime is not what most people do in production environments.


Jim likes to talk about "sleds."  It must be snowing where he is too.  I find the concept very similar to blade servers (which we also called a paradigm shift five years ago).  Sharing components like power supplies is a great idea, likewise virtualized I/O.  More efficient servers, though, are still an evolutionary, not a revolutionary, accomplishment.

A company called 3Leaf  http://www.3leafsystems.com/ seems to have a means of cobbling together a huge SMP box out of smaller ones and sharing memory, but again, aside from HPTC applications, few programs scale up this well in the commodity space.


I am having my cojones busted for suggesting that running high densities of VMs on dirt-cheap white boxes is ill-advised, but I stand by my assertion that if all your eggs are in one basket, don't cheap out on the basket.

Moore's Law has stood the test of quite a bit of time, but at the end of the day, what are the practical ramifications of mega-powerful CPUs without highly available apps to run on them?  If I were to predict the NEXT real paradigm shift, I would look for a more grid-oriented software architecture, where the loss of a single server is insignificant.  Google seems to operate its search this way, but they are an "outlier" with the capacity and finances to focus on a specific area of compute.

IMO, Intel is looking to attach the "Cloud" mojo to a product to jump on the bandwagon, just like Compaq calling itself the "Non-Stop Internet Company" back in 1999.








--
Cheers,
Jan




Jan Klincewicz

unread,
Feb 16, 2010, 7:22:24 AM2/16/10
to cloud-c...@googlegroups.com
@Jeff:

Before you veer so close to calling someone else a liar (<<That's not even close to true.>>), consider that the clusters with which you have had experience do not represent the entire universe of clusters.  Also, there are many different ways of achieving "failover."  Failover can occur without a node understanding any specific cluster state, as long as some management console is monitoring the state of the pool.  You are making a generalized statement based on a specific example.

This conversation is about failover of virtual servers, and when a host in a resource pool fails, the VMs that were running on that host need to reload from scratch on a surviving host.





--
Cheers,
Jan

Jan Klincewicz

unread,
Feb 16, 2010, 9:07:24 AM2/16/10
to cloud-c...@googlegroups.com
http://tinyurl.com/ykwmhkl

Addition:

According to the above FAQ, Marathon EverRun now supports machines with Virtual SMP.  That had been a severe limitation, as the more critical VMs are likely to be the beefier ones with multiple vCPUs.
--
Cheers,
Jan

Ricky Ho

unread,
Feb 16, 2010, 11:28:33 AM2/16/10
to cloud-c...@googlegroups.com
I think the problem of constructing A U B is solved, either by the log-replay mechanism you describe or by the ETL processes that data-warehouse people are familiar with.  It is not simple, but it will work.

For point (2), the cost is not just the hardware cost but also the administrator cost.  If my company is Google-sized, then of course I won't consider running my business on Amazon.  But what if I am a small online store?  I think the fundamental question is where the break-even point lies (in terms of data volume and processing need) beyond which the public cloud no longer makes sense.  But this is a general cloud computing question, not specific to Map/Reduce.

My thought process is ...
a) If the security concern is so high that you are willing to pay the difference between purchasing and renting disk, go for it.  But don't ignore that you need to hire a DBA, and you may still need to pay the bandwidth cost of loading data into the cloud.

b) If you cannot tolerate the data-upload latency, or don't want to pay the upload bandwidth cost, and are willing to pay the difference between purchasing and renting CPU, go for it.  But don't ignore that you need to hire a system admin, as well as set up the data center.

c) Well, if you actually don't have that much to process, you don't have to invent a new application just to use the cloud and Map/Reduce.  Stay where you are.

Peglar, Robert

unread,
Feb 16, 2010, 12:07:47 PM2/16/10
to cloud-c...@googlegroups.com
If you run a database in the cloud, you still need a DBA; where you run
it doesn't remove the need for an administrator.  So the cost is the
same either way.

As for system admins, you don't necessarily need to pay for those, no.
But again, I am looking at enterprise scale, which still needs an admin
regardless of whether the compute is in the cloud or not.  Expertise is
always needed.  What isn't needed is infrastructure babysitters, to be
colloquial.  Unfortunately, many commercial datacenters have those out
of a lack of regard for good datacenter design, so the perception is
there.  They've designed their datacenters inefficiently and poorly, so
they have to throw FTEs at the problem.

I agree that for SMBs the public cloud may make sense, no question.  But
I am trying to solve petabyte-sized problems, and the sheer physics of
latency and distance (never mind the outrageous costs) just don't permit
much use of the public cloud.  This is not to say it's not useful in
several use cases, as you correctly state.  I think Map/Reduce has great
potential in the cloud for small datasets.

Jeff Darcy

unread,
Feb 16, 2010, 3:14:42 PM2/16/10
to cloud-c...@googlegroups.com
On 02/16/2010 07:22 AM, Jan Klincewicz wrote:

> Before you veer so close to calling someone else a liar (<<That's not
> even close to true.>>), consider that the clusters with which you have
> had experience do not represent the entire universe of clusters.

If you don't want people to point out that your claims are untrue, don't
make untrue claims. The clusters or virtual infrastructures with which
you've had experience don't represent the entire universe either.

> Also, there are many different ways of achieving "failover."

Indeed. You referred to "cluster-type" failover. What does
"cluster-type" mean? You didn't specify, so I applied the definition
that would be most intuitive to people who've worked with clusters since
before virtualization became all the rage, and according to that
definition your claim remains nowhere close to true. It's entirely
possible to do the same kind of clustering between virtual machines as
was previously done between physical ones, instead of relying on the VM
hosts to do it, and by doing so one can achieve failover times much
lower than boot times.

> This conversation is about failover of virtual servers, and when a host
> in a resource pool fails, the VMs that were running on that host need
> to reload from scratch on a surviving host.

So you *assume*, but you know what they say about assumptions. Just
because you hadn't already thought of other ways to do failover doesn't
mean those ways don't exist.
