Rao Dronamraju

Dec 2, 2009, 10:13:45 PM

to cloud-c...@googlegroups.com

“Ultimately, Intel believes its aggressive multicore approach will be the way computers get enough power for tasks such as vision and speech comparable to what humans have.”

As we discussed before, it is the age of IaaS – Intelligence as a Service!

http://news.cnet.com/2300-1001_3-10001951-1.html?tag=mncol

Regards,

Rao

anish.m...@gmail.com

Dec 3, 2009, 2:56:14 AM

to cloud-c...@googlegroups.com

Hi Rao,
We already have single chips that cover most of this functionality: microcontrollers. The interesting thing to notice is that they don't have to come from Intel. Again, it's not a question of power in the case of vision :)
Regards
Anish

Sent from my BlackBerry® wireless device

From: "Rao Dronamraju" <rao.dro...@sbcglobal.net>

Date: Wed, 2 Dec 2009 21:13:45 -0600

To: <cloud-c...@googlegroups.com>

Subject: [ Cloud Computing ] Single Chip Cloud Computer?....

--
~~~~~
Register Today for Cloud Slam 2010 at http://cloudslam10.com
Posting guidelines: http://groups.google.ca/group/cloud-computing/web/frequently-asked-questions
Follow us on Twitter http://twitter.com/cloudcomp_group or @cloudcomp_group
Post Job/Resume at http://cloudjobs.net
Buy 88 conference sessions and panels on cloud computing on DVD at
http://www.amazon.com/gp/product/B002H07SEC, http://www.amazon.com/gp/product/B002H0IW1U or get instant access to downloadable versions at http://cloudslam09.com/content/registration-5.html

~~~~~
You received this message because you are subscribed to the Google Groups "Cloud Computing" group.
To post to this group, send email to cloud-c...@googlegroups.com
To unsubscribe from this group, send email to cloud-computi...@googlegroups.com

Jan Klincewicz

Dec 3, 2009, 7:28:40 AM

to cloud-c...@googlegroups.com

Aside from hypervisors and maybe some real heavy-duty databases or 64-bit Citrix XenApp, I don't see a whole bunch of apps out there eating up 4 or 8 cores, let alone more.

--
Cheers,
Jan

Rao Dronamraju

Dec 3, 2009, 11:12:23 AM

to cloud-c...@googlegroups.com

What Intel is saying is that with multi-core chips on the order of 48 or 100 cores (some other company recently came up with the latter), you can do massive amounts of data/analytics processing in parallel, which is necessary for speech, vision, and other ML/AI applications.

Ray DePena

Dec 3, 2009, 6:22:18 PM

to cloud-c...@googlegroups.com

The Cell BE 8X is used in gaming applications PS/3, XBOX, Supercomputing, Cluster computing, HDTV, military applications, etc.

Anything that combines large data sets and graphics - gaming, weather, etc. will drive demand for this technology.

A 32X was under development but was discontinued, perhaps beaten to market by Intel's 48X? They do continue Cell BE development, though.

-RD

Best Regards,

Ray DePena, MBA, PMP
+1.916.941.5558
Ray.D...@gmail.com
Twitter: @RayDePena
LinkedIn: http://www.linkedin.com/in/raydepena

Jan Klincewicz

Dec 3, 2009, 8:23:49 PM

to cloud-c...@googlegroups.com

If the software doesn't take advantage of all those procs it is a moot point. Apparently, developing multi-core hardware is a lot easier than designing Operating Systems and Apps that take advantage of them.
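One way to put rough numbers on this point is Amdahl's law, which bounds the speedup of a program when only part of it can run in parallel. The sketch below is only an illustration: the parallel fractions are made-up values, and the function name is hypothetical. The 48 is the core count of the Intel chip under discussion.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only part of a program runs in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time no matter how many cores exist;
    # only the parallel part is divided among them.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical parallel fractions, chosen only to show the trend.
    for fraction in (0.50, 0.90, 0.99):
        speedup = amdahl_speedup(fraction, 48)
        print(f"{fraction:.0%} parallel on 48 cores: at most {speedup:.1f}x")
```

Under this model, software that is only half parallel cannot even double its speed on 48 cores, which is the sense in which unused cores are a moot point.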

Ray DePena

Dec 3, 2009, 8:24:44 PM

to cloud-c...@googlegroups.com

Rao,

That may be the case, though I doubt that the AI market is large enough to recoup the cost of manufacturing such products. There would have to be other attractive target markets to make it cost-effective to manufacture.

-RD

Rao Dronamraju

Dec 3, 2009, 8:37:58 PM

to cloud-c...@googlegroups.com

A lot of present-day web/internet apps, especially those that do internet-scale work such as search, semantic web, and analytics, are prime candidates. So are future applications that need real-time performance, including real-time analytics, natural language and speech processing, vision, etc.

Ray DePena

Dec 3, 2009, 8:54:15 PM

to cloud-c...@googlegroups.com

Doesn't the ??X processor have to exist before the software can be developed to take advantage of it?

I'm no expert in this area, but as I understand it, the clients that are using the Cell BE 8X processing capabilities are taking full advantage of the processing power and demanding more.

On the top500.org Supercomputing list, you'll see it in 2nd place (QS22 blade) and those are used for all sorts of protein folding, weather prediction, etc.

I was surprised to learn that the oil and gas industry is one of the biggest users of Supercomputers (to identify potential fields) along with government, military (nuclear simulations), and many others.

Even for the most mundane applications you'll end up needing supercomputing-type power just due to the volume of customers, e.g. in China or India.

-RD

Jan Klincewicz

Dec 3, 2009, 9:41:08 PM

to cloud-c...@googlegroups.com

@RAY:

Do you know any protein folding apps that take advantage of large-scale SMP boxes? Everyone I know involved in those kinds of studies uses scale-out, grid architectures. Also, I'm curious as to how one makes the leap from large populations (i.e. China or India) and their volume to the necessity for supercomputing power. Is there a correlation of some sort?

I realize a spot in the top500 gives some bragging rights, but I do not see any large markets for such boxes as a profitable business. The genome is decoded. Weather, despite how well you analyze it, will still occur as it wants to. Nuclear simulations are nice, but I don't think Iran will buckle under any time soon because the U.S. can prove in a simulation that it would win a confrontation.

Theory is wonderful, but people pay for practical applications. I think it's great to build godlike boxes for the sake of science, but I wouldn't be investing in anyone trying to build a business model around them.

Jeff Darcy

Dec 3, 2009, 9:53:33 PM

to cloud-c...@googlegroups.com

Ray DePena wrote:
> The Cell BE <http://en.wikipedia.org/wiki/Cell_%28microprocessor%29> 8X
> is used in gaming applications PS/3, XBOX, Supercomputing, Cluster
> computing, HDTV, military applications, etc.

Not for much longer.

http://arstechnica.com/hardware/news/2009/11/end-of-the-line-for-ibms-cell.ars

dan cox

Dec 3, 2009, 10:03:40 PM

to cloud-c...@googlegroups.com

This is all great, but IBM is not going forward with any Cell products in the future.

Jeff Darcy

Dec 3, 2009, 10:13:00 PM

to cloud-c...@googlegroups.com

Jan Klincewicz wrote:
> I realize a spot in the top500 gives some bragging rights, but I do not
> see any large markets for such boxes as a profitable business.

In fact, as the founders of my previous company were fond of pointing
out, being near the top of the Top500 has historically been a good
predictor of imminent failure. There's far more money to be made down
at the bottom of the Top500 or just below it, where equipment fits
within the budgets and skill sets of customers with real money - like
those oil folks you mentioned.

> The genome is decoded.

Yeah, all done, nothing to learn there. No other genomes, no
exogenetics, no proteome, no individual variation or second-level
phenomena to study. Just knowing the sequence of nucleotides is sufficient.

> Weather, despite how well you analyze it, will still
> occur as it wants to.

When certain kinds of weather can cause billions of dollars in property
damage or crop-yield differences, don't you think even a few percent
better accuracy in prediction might be worth something?

> Nuclear simulations are nice, but I don't think
> Iran will buckle under any time soon because the U.S. can prove in a
> simulation that it would win a confrontation.

That's not what anyone outside of a 26-year-old movie thinks "nuclear
simulation" refers to. In a world where test-ban treaties preclude
physical tests, simulation is a key part of validating designs both for
their effects in use and their degradation in storage. Maintaining the
viability of our nuclear arsenal in this way does have important
geopolitical effects. A reasonable person might disagree with the
strategy that represents, but could hardly dismiss it as irrelevant.

> Theory is wonderful, but people pay for practical applications. I think
> it's great to build godlike boxes for the sake of science, but I
> wouldn't be investing in anyone trying to build a business model around
> them.

I don't think anyone would ask you to. The real point of such systems,
though, is that there's an established pattern of the technology they
use filtering down to the next couple of tiers. Maybe only the
government (not even the largest private-sector entities!) can afford to
build a Jaguar or Intrepid, but five to ten years from now real
businesses will benefit from what was learned in the process. Those are
efforts worth supporting and watching, for anyone who can look beyond
the next quarter's financial statements.

Ray DePena

Dec 3, 2009, 11:05:08 PM

to cloud-c...@googlegroups.com

@Jan,

One of the first things I learned in marketing is to not assume that the market reflects my own experience. So while I can't name the customers and how they're using the technology, I can tell you IBM sold Supercomputers for scientific, military, weather and other applications.

I can give you examples from the airline and telecommunications industries, both big users of scale-up systems. In telecom, for companies like Verizon, such systems handle massive call detail records for billing. It is not uncommon for next month's billing for their tens of millions of customers to take days to weeks of computer processing.

Now, when you go to places like China and India, where the populations are enormous and the providers are government-backed entities, even 10X the number of customers Verizon may have is less than 10% of their population.

Solutions which seem "large" for the U.S. are mid-sized at best over there given their requirements.

The business model for supercomputers is not like that of high-volume systems where you sell many. A supercomputer vendor will bid on building one for a specific application. The challenges are many; a quick search will point you to some of the applications for supercomputers.

Google quantum mechanical physics, weather forecasting, climate research, molecular modeling (computing the structures and properties of chemical compounds, biological macromolecules, polymers, and crystals), physical simulations (such as simulation of airplanes in wind tunnels, simulation of the detonation of nuclear weapons, and research into nuclear fusion). A particular class of problems, known as Grand Challenge problems, are problems whose full solution requires semi-infinite computing resources.

And U.S. nuclear simulations on supercomputers aren't meant to intimidate any country. Would you rather they continue testing their nukes by detonating them instead of simulations?

-RD

Greg Pfister

Dec 3, 2009, 11:10:24 PM

to Cloud Computing

Cell BE -- yup, dead end.

The Ars Technica article has one bit of logic backwards, though:

Sony isn't doing something other than Cell for next-gen PS *because*
IBM's stopped development.

Rather, IBM's stopped development *because*, apparently, Sony decided
it didn't want Cell or a Cell-like thing for next-gen PS.

If Sony had wanted it, IBM would have built it. That's how the first
one got done.

Greg Pfister
http://perilsofparallel.blogspot.com/

Ray DePena

Dec 3, 2009, 11:12:49 PM

to cloud-c...@googlegroups.com

Hey Dan,

Good to hear from you. True, but the point isn't about IBM or the Cell itself, but rather multi-processors. Original story was about Intel's 48X.

"heterogeneous multiprocessors, of which Cell was the first mass-market example of, are here to stay"

-RD

Greg Pfister

Dec 3, 2009, 11:19:30 PM

to Cloud Computing

On Dec 3, 6:23 pm, Jan Klincewicz <jan.klincew...@gmail.com> wrote:
> If the software doesn't take advantage of all those procs it is a
> moot point.
> Apparently, developing multi-core hardware is a lot easier than
> designing Operating Systems and Apps that take advantage of
> them

Yes, the hardware sure is easier. Especially when you have a bunch of
procs with multilevel caches, and don't provide hardware cache
coherence.

That little bit of work is now to be done in software.

No, I'm not really sure what that means, either. I have some guesses.
They're ugly.

This, by the way, is why it's a cloud on a chip and not a multicore or
multiprocessor. "Cloud" is the word to use now that "cluster" is
déclassé.

Greg Pfister
http://perilsofparallel.blogspot.com/

J. Andrew Rogers

Dec 4, 2009, 2:56:04 AM

to cloud-c...@googlegroups.com

On Thu, Dec 3, 2009 at 6:41 PM, Jan Klincewicz <jan.kli...@gmail.com> wrote:
> I realize a spot in the top500 gives some bragging rights, but I do not see
> any large markets for such boxes as a profitable business. The genome is
> decoded. Weather, despite how well you analyze it, will still occur as it
> wants to. Nuclear simulations are nice, but I don't think Iran will buckle
> under any time soon because the U.S. can prove in a simulation that it would
> win a confrontation.

Top500 is primarily about embarrassingly parallel numerical codes.
Great if you have a code that looks like that, irrelevant otherwise.
Most codes that require high scalability won't run well on a Top500
system, so it is a bit deceptive as a benchmark of either need or
capability.

The major unmet market for massive computational systems is analytics.
In many (most?) cases these codes won't scale on a Top500 system and
require a different hardware architecture that is less available than
Top500. The size of the market for systems that can handle these codes
is essentially incalculable at this point.

--
J. Andrew Rogers
realityminer.blogspot.com

Miha Ahronovitz

Dec 4, 2009, 3:38:27 AM

to cloud-c...@googlegroups.com

Jeff, total agreement here
http://my-inner-voice.blogspot.com/2009/11/does-it-still-make-sense-top-500.html

m

Jeff Darcy

Dec 4, 2009, 9:06:57 AM

to cloud-c...@googlegroups.com

On 12/04/2009 02:56 AM, J. Andrew Rogers wrote:
> The major unmet market for massive computational systems is analytics.
> In many (most?) cases these codes won't scale on a Top500 system and
> require a different hardware architecture that is less available than
> Top500.

The axiom stated in the first part of that sentence does not necessarily
lead to the conclusion in the second. In fact some of those codes can
and do scale today on the same sort of system that dominates the Top500
- i.e. loosely coupled, large node count, fast interconnect. More
importantly, even more will do so tomorrow. It takes time to make an
application designed for a more tightly-coupled architecture run on such
systems, but it can be done. I've personally engaged with enough users
doing just that to believe that many profit-focused organizations have
recognized the need to move in that direction. It's the software
architecture that must adapt, not the hardware architecture. Even for
large code bases, adapting the software to hardware is usually easier
than the other way around. This is often true even when development
cost outweighs operational cost, and as one moves toward more commercial
applications that becomes less and less the case. When you're talking
about multi-million-dollar systems that will mostly run one application
(e.g. drug discovery, geophysics) over and over again, the one-time cost
of hiring a couple of good developers to make it run on cheaper systems
starts to look mighty good compared to the recurring costs of buying
more expensive systems.
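To make the one-time versus recurring cost comparison concrete, here is a minimal sketch of the break-even arithmetic. Every figure in it is hypothetical (no real prices or salaries are implied), and the function name is invented for illustration.

```python
import math


def systems_to_break_even(port_cost: float,
                          expensive_system: float,
                          cheap_system: float) -> int:
    """Smallest number of system purchases at which a one-time port pays off."""
    saving_per_system = expensive_system - cheap_system
    if saving_per_system <= 0:
        raise ValueError("the cheaper system must actually cost less")
    # Each purchase of the cheaper system recovers part of the porting cost;
    # round up because systems are bought whole.
    return math.ceil(port_cost / saving_per_system)


if __name__ == "__main__":
    # Hypothetical: two developers for a year versus a $2M price gap per system.
    purchases = systems_to_break_even(port_cost=400_000,
                                      expensive_system=3_000_000,
                                      cheap_system=1_000_000)
    print(f"Port pays for itself after {purchases} system purchase(s)")
```

With numbers of that shape the port is paid for by the first purchase, and every later purchase is pure saving.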

Jan Klincewicz

Dec 4, 2009, 9:46:02 AM

to cloud-c...@googlegroups.com

<<One of the first things I learned in marketing is to not assume that the market reflects my own experience. So while I can't name the customers and how they're using the technology, I can tell you IBM sold Supercomputers for scientific, military, weather and other applications.>>

I learned the same thing in marketing. I do not presume that because I have no personal use for supercomputers, nobody else does. I do see a trend, however, away from proprietary processors (MIPS, Alpha, SPARC, Cell, etc.). In the context of cloud computing, in which I made my statements, I see (in general) more x86 scale-out than RISC scale-up being adopted. Even Apple caved and went Intel. Granted, I have a bias, but I am making an observation, not expressing an opinion.

<<I can give you examples from the airline and telecommunications industry both big users of scale-up systems. In telecom, for companies like Verizon, such systems handle massive call detail records for billing. It is not uncommon for next months billing for their tens of millions of customers to take days to weeks of computer processing. >>

I worked for Bell Atlantic (Verizon's predecessor) in IT for about 14 years, so I am pretty aware of what their data centers look like. I remember wondering why all their home-grown apps had acronyms that all began with the letter "M". Subsequently, I discovered that it stood for "MECHANIZED." The "phone company" has been around a long time, and suffers from more inertia than most other organizations. I would also question whether the delays in billing are due to insufficient processing cycles or rather inefficient policies, procedures, code and employees.

<<Now, when you go to places like China and India where the populations are enormous and the providers are government backed entities even 10X the number of customers Verizon may have is less than 10% of their population.

Solutions which seem "large" for the U.S. are mid-sized at best over there given their requirements.>>

"Emerging" countries are a very different animal than Western ones. For one, they don't have the disadvantages of legacy that "developed" countries do. Perhaps there is a market for supercomputers to handle the sheer volume of such large populations, but also consider how much of those populations are rural, dirt poor, and more in need of a bowl of rice than a cell phone.

<<The business model for supercomputers is not like that of high volume systems where you sell many. A supercomputer vendor will bid on building one for a specific application. The challenges are many, a quick search will point you to some of the applications for supercomputers.

google quantum mechanical physics, weather forecasting, climate research, molecular modeling (computing the structures and properties of chemical compounds, biological macromolecules, polymers, and crystals), physical simulations (such as simulation of airplanes in wind tunnels, simulation of the detonation of nuclear weapons, and research into nuclear fusion). A particular class of problems, known as Grand Challenge problems, are problems whose full solution requires semi-infinite computing resources. >>

I worked for a while at HP, specifically tasked to sell gear to "Biotech" in the post-genomics era (immediately following the successes of Celera and the Human Genome Project). Though my commissions suffered, I was surprised how many scientists told me that their computing needs in analyzing those results consisted of a laptop and a DOS version of WordPerfect.

<<And U.S. nuclear simulations on supercomputers aren't meant to intimidate any country. Would you rather they continue testing their nukes by detonating them instead of simulations?>>

I'd rather see nuclear proliferation ended. I see no burning need for nuclear simulations or live tests. They make a loud noise and destroy everything near where they are detonated. Beyond that, it's all a matter of degree.

dan cox

Dec 4, 2009, 9:57:35 AM

to cloud-c...@googlegroups.com

Well said. The Top 500 ranking is a performance game based on Linpack, and it has
been the major measurement for years. Statements here are correct in that
in-memory calculations are critical, along with large enough memories and enough
"vendor technical tweaking," which goes on among the Tier 1 players (IBM, HP).
Analytics is the most commercial application for scalable computing today.
Jan's comments are right on. The key here is that parallel computing applications
based on multi-core machines and small compute clusters have made this type
of parallel computing available to tons of folks who, up to ten years ago,
could not have had any chance at it. Thanks to Intel with Xeon and AMD with
Opteron for broadening this field of computing.
Grand Challenge problems still exist (weather, etc.), but providing better CFD
analysis for dampness control in disposable diapers or tensile strength for
plastic bottles on supermarket shelves has been done more effectively by "x86"
systems.
In addition, the key aspects for growth here are graphics and the use of FPU
and GPU auxiliary processors, much the same way 287 math coprocessors were
used alongside the 286 in the early days. This further enhances the ability to
do rapid and more comprehensive functions in a shorter period of time.
I am excited about this technology and what it can bring to improving our
lives. Not so much for the gamers but for the product analysis and medical
analysis aspects. Maybe a faster cure for cancers, or development of
synthetic hearts and kidneys, a cure for diabetes or a raving problem -
autism.

Jim Starkey

Dec 4, 2009, 1:30:21 PM

to cloud-c...@googlegroups.com

How well do codes designed for loosely coupled, fast interconnect work
on commodity hardware and inexpensive networking? Is there, for
example, a suitable entry level configuration of, say, a rackfull of 1U,
double Gigabit Ethernet and switch? In other words, where is the magic
sauce? Interconnect speed, latency, shallow protocol stack (or all of
the above)?

--
Jim Starkey
Founder, NimbusDB, Inc.
978 526-1376

Jeff Darcy

Dec 4, 2009, 2:08:24 PM

to cloud-c...@googlegroups.com

On 12/04/2009 01:30 PM, Jim Starkey wrote:
> How well do codes designed for loosely coupled, fast interconnect work
> on commodity hardware and inexpensive networking? Is there, for
> example, a suitable entry level configuration of, say, a rackfull of 1U,
> double Gigabit Ethernet and switch? In other words, where is the magic
> sauce? Interconnect speed, latency, shallow protocol stack (or all of
> the above)?

Unfortunately, I can't say much more than that it depends on the code.
Code which has a high compute-to-communicate ratio can sometimes run
reasonably well on a GigE-connected cluster. On the other hand, many
applications won't be happy with anything less than DDR IB and you might
as well get that instead of 10GbE since they're about the same price
anyway. It was usually much more about latency than bandwidth, BTW, but
not always. At SiCortex, where the interconnect was very fast relative
to the processors, we found that CFD, oil-patch, and intelligence-agency
codes ran great. Weather and climate models, on the other hand, sucked
rocks. Sometimes the differences were more subtle, too, having less to
do with problem domain than with specific applications or frameworks.
Some bioinformatics codes worked really well; some didn't. In the end,
performance of a given application on a given architecture is a complex
interplay of two factors: the intrinsic fit between the two, and how far
the computational specialists in a particular field are along the
learning curve of how to use the architecture most effectively. There
are huge differences in the second area, almost overwhelming the
differences in the first. That's where I think the really small systems
like SiCortex's SC072 or Cray's CX1 come in; they're cheap enough to use
for training the specialists to write good MPI/PGAS/whatever code which
can then run in production on bigger systems of the same type.
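The compute-to-communicate ratio and the latency-versus-bandwidth point can be sketched with a toy cost model. This is a deliberate simplification that assumes communication never overlaps with computation, and the latency and bandwidth figures below are rough illustrative guesses, not measurements of any real GigE or DDR InfiniBand installation. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interconnect:
    name: str
    latency_s: float        # per-message latency, in seconds
    bandwidth_bytes_s: float  # sustained bandwidth, in bytes per second


def parallel_efficiency(compute_s: float, messages: int,
                        bytes_per_message: int, net: Interconnect) -> float:
    """Fraction of one step spent computing rather than communicating."""
    # Each message pays a fixed latency plus a size-dependent transfer time.
    per_message_s = net.latency_s + bytes_per_message / net.bandwidth_bytes_s
    comm_s = messages * per_message_s
    return compute_s / (compute_s + comm_s)


if __name__ == "__main__":
    # Illustrative, assumed figures only.
    gige = Interconnect("GigE", latency_s=50e-6, bandwidth_bytes_s=125e6)
    ddr_ib = Interconnect("DDR IB", latency_s=2e-6, bandwidth_bytes_s=2e9)

    # A step with 1 ms of computation and 100 small (1 KiB) messages.
    for net in (gige, ddr_ib):
        eff = parallel_efficiency(1e-3, 100, 1024, net)
        print(f"{net.name}: {eff:.0%} of the step is useful computation")
```

With many small messages the latency term dominates the transfer term, which is one way to see why latency often matters more than bandwidth. Raising `compute_s` relative to the message count is the "high compute-to-communicate ratio" case where the cheaper network does reasonably well.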

Jeff Darcy

Dec 4, 2009, 2:53:41 PM

to cloud-c...@googlegroups.com

On 12/02/2009 10:13 PM, Rao Dronamraju wrote:
> * http://news.cnet.com/2300-1001_3-10001951-1.html?tag=mncol *

As further fuel for the loosely-coupled vs. tightly-coupled debate, it's
worth noting that Tilera has a 100-core chip that's already shipping
(i.e. not a mere tech demo).

http://www.tilera.com/products/TILE-Gx.php

It's not x86 and probably doesn't have much virtualization support, but
OTOH it does have coherent cache. Non-disclaimer: I have no association
whatsoever with Tilera.

Peglar, Robert

Dec 4, 2009, 4:03:22 PM

to cloud-c...@googlegroups.com

Jim asked:

" How well do codes designed for loosely coupled, fast interconnect work
on commodity hardware and inexpensive networking? Is there, for
example, a suitable entry level configuration of, say, a rackfull of 1U,
double Gigabit Ethernet and switch? In other words, where is the magic
sauce? Interconnect speed, latency, shallow protocol stack (or all of
the above)?"

Good question - and after having spent a bunch of years in HPC, I would
fall into the 'all of the above' category. Every code is different.
For some, node interconnect is vital because the nodes do a lot of data
passing and messaging. For others, a shallow stack is vital.

Where I think most codes are alike is in their need to keep the CPUs
busy, full of data. This is where end-to-end design is very important -
from the last peripheral to the cores themselves. The worst thing that
can happen to any code is for the CPU to stall waiting for a data
transfer.

In the end, it's the code itself and its ability to take advantage of a
given architecture. That's where humans enter the picture. For
example, rewriting a code so it can take advantage of O(N) cores
simultaneously rather than just O(N/2), or do I/O transfers in parallel
rather than serially, or ...
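As a small illustration of the "I/O transfers in parallel rather than serially" rewrite, here is a sketch using Python's standard thread pool. The function names and the worker count are made up for the example, and whether the parallel version is actually faster depends entirely on the storage behind it.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Iterable, List


def _read(path: str) -> bytes:
    """Read one file completely into memory."""
    return Path(path).read_bytes()


def read_serially(paths: Iterable[str]) -> List[bytes]:
    """Baseline: each read must finish before the next one starts."""
    return [_read(p) for p in paths]


def read_in_parallel(paths: Iterable[str], max_workers: int = 8) -> List[bytes]:
    """Issue reads concurrently so the CPU is not stalled on a single transfer."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(_read, paths))
```

Both functions return the same data in the same order, so the parallel one can be swapped in without touching the code that consumes the results.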

Rob

---
Robert Peglar
Vice President, Technology, Storage Systems Group
Email: mailto:Robert...@xiotech.com
Office: 952 983 2287
Mobile:314 308 6983
Fax: 636 532 0828
Xiotech Corporation
1606 Highland Valley Circle
Wildwood, MO 63005 http://www.xiotech.com/ : Toll-Free 866 472 6764

Greg Pfister

Dec 4, 2009, 6:22:11 PM

to Cloud Computing

I've written a blog post about the Intel 48-core single-chip "cloud",
indicating what is currently known about it, without the hype.

http://bit.ly/75TpRG

Greg Pfister
http://perilsofparallel.blogspot.com/

On Dec 3, 9:19 pm, Greg Pfister <greg.pfis...@gmail.com> wrote:
> On Dec 3, 6:23 pm, Jan Klincewicz <jan.klincew...@gmail.com> wrote:
>
> > If the software doesn't take advantage of all those procs it is a
> > moot point.
> > Apparently, developing multi-core hardware is a lot easier than
> > designing Operating Systems and Apps that take advantage of
> > them
>
> Yes, the hardware sure is easier. Especially when you have a bunch of
> procs with multilevel caches, and don't provide hardware cache
> coherence.
>
> That little bit of work is now to be done in software.
>
> No, I'm not really sure what that means, either. I have some guesses.
> They're ugly.
>
> This, by the way, is why it's a cloud on a chip and not a multicore or
> multiprocessor. "Cloud" is the word to use now that "cluster" is
> déclassé.
>
> Greg Pfister http://perilsofparallel.blogspot.com/


J. Andrew Rogers

Dec 4, 2009, 10:23:20 PM

to cloud-c...@googlegroups.com

On Fri, Dec 4, 2009 at 6:06 AM, Jeff Darcy <je...@pl.atyp.us> wrote:
> On 12/04/2009 02:56 AM, J. Andrew Rogers wrote:
>> The major unmet market for massive computational systems is analytics.
>> In many (most?) cases these codes won't scale on a Top500 system and
>> require a different hardware architecture that is less available than
>> Top500.
>
> The axiom stated in the first part of that sentence does not necessarily
> lead to the conclusion in the second. In fact some of those codes can
> and do scale today on the same sort of system that dominates the Top500
> - i.e. loosely coupled, large node count, fast interconnect.

Sure, some can. Many (most?) do not in a fashion that can be described
as remotely "efficient". Most types of complex analytics (relational,
graph, spatial, etc.) are poorly suited for Top500 systems, so the
analytics that are done are the residue that do scale in this way.

Modest computing systems tuned for graph analytics, to use that as an
example, would savage the largest systems in the Top500 for graph
analytic workloads even though graph-friendly systems are much smaller
by any computing hardware metric you care to use. Architecture matters
a lot.

> It takes time to make an
> application designed for a more tightly-coupled architecture run on such
> systems, but it can be done.

No, it really can't in many cases. Many algorithms are *intrinsically*
unsuited to those types of systems. Perhaps there is some radical new
theoretical computer science breakthrough that will give us equivalent
algorithms that do run well on those types of systems, but we don't
have them now. There are myriad examples of important algorithms that
don't run well on conventional tightly-coupled architectures never
mind loosely-coupled architectures.

> It's the software
> architecture that must adapt, not the hardware architecture. Even for
> large code bases, adapting the software to hardware is usually easier
> than the other way around.

Unless, of course, there is a theoretical computer science reason that
it is not currently possible to make existing algorithms work on
existing hardware. Sometimes tweaking the hardware is easier than
solving longstanding hard problems in theoretical computer science.

> When you're talking
> about multi-million-dollar systems that will mostly run one application
> (e.g. drug discovery, geophysics) over and over again, the one-time cost
> of hiring a couple of good developers to make it run on cheaper systems
> starts to look mighty good compared to the recurring costs of buying
> more expensive systems.

Except for the extremely common case where the only way a developer
could make the codes run on a cheaper system would be to come up with
a major breakthrough in theoretical computer science. Which is
cheaper, more expensive hardware or betting that your developer is the
next theoretical computer science genius?

I think you are underestimating just how many software problems are
*theoretically* incapable of scaling on loosely coupled systems given
existing computer science. The codes people are running on these
systems are the low-hanging fruit that can be cheaply adapted to such
systems. Many other codes will run 10-100x faster on hardware better
suited to the algorithm characteristics, which has quite a bit of
price performance going for it.

Jeff Darcy

Dec 5, 2009, 1:19:59 PM

to cloud-c...@googlegroups.com

J. Andrew Rogers wrote:
> I think you are underestimating just how many software problems are
> *theoretically* incapable of scaling on loosely coupled systems given
> existing computer science.

...while I think you're even more drastically *over*estimating the
number of such problems. The particular type of analytics you're
talking about accounts for what percentage of computer use? Do you have
any figures, or is this just a case of overgeneralizing from one's own
immediate experience?

Ray DePena

Dec 5, 2009, 2:17:26 PM

to cloud-c...@googlegroups.com

@Jeff,

They also have 64X, 36X, and 16X versions. More interesting (for me at least) are the types of applications that can leverage this technology (cut/paste below).

@Greg,

Thank you for the blog entry describing the 48X Intel product.

I may be way over my head here as I'm more of a biz dev. / alliance / product type, and focused on servers, blades, and networking, but I'll go ahead and ask my naive engineering questions anyway as I'm always interested in what applications emerging technology like this can be used for.

Apologies in advance if technical accuracy is off base. Please feel free to correct me, I'm always willing to learn.

So here goes.

1. Don't cache coherence and virtualization techniques introduce large overheads that end up slowing the very large-scale processing being attempted?
2. Similarly with the interconnects and memory controller: even if question 1 were addressed, wouldn't these become bottlenecks?
3. In your opinion, at what point in such designs is the trade-off between shared and unshared memory optimal? Do such multi-core approaches end up having diminishing returns as a result of the shared memory? If so, what sort of design would be optimal to mitigate such effects?

As a side note on video applications: one thing I recall from working in IPTV, particularly streaming, is that the CPU was not the limitation. The bottleneck was the I/O capacity of the server; another was the number of physical ports available.

Now, when it came to transcoding / encoding / decoding, yes: processing codecs, particularly in real time, was an area where CPU processing could be advantageous.
This was an area where we used the Cell-based QS22. DPI was another.

Is the concept of CoD viable with multi-core processors, that is, cores being completely turned off when not in use and turned on when needed? I see the Intel and Tilera processors both do it, though it seems to be just reduced power consumption and not truly "off". Then again, IBM's CoD implementation was not multi-core as I recall, but different physical processors that were turned on/off.

And finally, as a lay person, it seems to me that regardless of whether it's 8x, 16x, 64x, 100x or more, aside from the issues already brought forth, the server itself (memory, bus, controllers, physical ports, etc.), storage components, and the network (switch, LAN, etc) through which this data will flow has to be able to accommodate rather large amounts of bandwidth rather than just shifting the bottleneck from one area to another.

Applications

Advanced Networking:
• Firewall & VPN
• Intrusion Detection & Prevention (IDS/IPS)
• Unified Threat Management (UTM)
• L4-7 deep packet inspection
• Network Monitoring & Forensics

Digital Video:
• Video transcoding/Transrating
• Videoconferencing MCU and endpoints
• Streaming IPTV and Video-on-Demand
• Video Post-Production processing

Wireless Infrastructure:
• Base Transceiver Station (BTS)
• Base Station Controllers (BSC)
• Wireless backbone gateways (GGSN, MGW)

Cloud Computing:
• Web Applications (LAMP)
• Data caching (Memcached)
• Database Applications

Massively Scalable Performance

Feature:
• Array of 16 to 100 general-purpose processor cores (tiles)
• 64-bit VLIW processors with 64-bit instruction bundle
• 3-deep pipeline with up to 3 instructions per cycle
• 32K L1i cache, 32K L1d cache, 256K L2 cache per tile
• Up to 750 billion operations per second (BOPS)
• Up to 200 Tbps of on-chip mesh interconnect
• Over 500 Gbps memory bandwidth with four 64-bit DDR3 controllers
Enables:
• 40 - 80 Gbps Snort® processing
• 40 - 80 Gbps nProbe
• H.264 HD video encode: dozens of streams of 1080p (baseline profile)
• 64+ channels of OFDM baseband receiver processing (wireless)

Power Efficiency

Feature:
• 1.0 to 1.5GHz operating frequency
• 10 to 55W for typical applications
• Idle Tiles can be put into low-power sleep mode
• Power efficient inter tile communications
Enables:
• Highest performance per watt
• Simple thermal management & power supply design
• Small System form factor
• Lowest operating cost

Integrated Solution

Feature:
• Four DDR3 memory controllers with optional ECC
• Up to eight 10GbE XAUI interfaces; 2 Interlaken interfaces
• Three Gen2 PCIe interfaces, each selectable as endpoint or root complex
• Up to 32 GbE MAC interfaces
• Wire-speed mPIPE™ packet processing engine
• On-chip hardware encryption and compression (MiCA™)
Enables:
• Reduces BOM cost - standard interfaces on-chip
• Dramatically reduced board real estate
• Up to 80 Gbps PCIe bandwidth
• Over 80 Gbps of packet I/O bandwidth
• Up to 40 Gbps VPN performance

Ease of Programming

Feature:
• ANSI standard C / C++ compiler
• Advanced profiling and debugging designed for multicore programming
• Supports SMP Linux and virtualization
• TMC libraries for efficient inter-tile communication
Enables:
• Run off-the-shelf C and C++ programs
• Leverage investment in existing code
• Standard multicore communication mechanisms
• Reduce debug and optimization time
• Faster time to production code


--
Best Regards,

Ray DePena, MBA, PMP
+1.916.941.5558

Ray.D...@gmail.com

Rao Dronamraju

Dec 5, 2009, 3:23:02 PM

to cloud-c...@googlegroups.com

Greg,

I agree with you that it is still in the prototype stage, but as you also
mentioned, it is always useful and interesting to work with such large-scale
multicore/parallel systems. I agree with Intel that such systems are
needed for computers to get closer to human (brain) thinking. Since most
information processing by human beings, especially perceptual processing, is
highly parallel and happens in real time, such architectures are best suited
for such applications: vision, speech, and other real-time analytics
applications (some of which have been mentioned in the post, e.g.
financial/Wall Street). In addition, these applications by their very nature
exhibit a high degree of data independence, hence they have a low degree of
cache/memory coherency traffic and contention. The question is how applicable
such architectures are to traditional IT applications, where cache/memory
contention and coherency are a lot more important. A lot of supercomputing
applications are compute intensive, with the result that data changed by one
core can be propagated to the other cores, even in software, by the time they
need it, and the coherency requirements are still met. I won't be surprised
if this is the reason why Intel did not place much emphasis on cache
coherency at this time.


J. Andrew Rogers

Dec 5, 2009, 4:24:47 PM

to cloud-c...@googlegroups.com

It is sufficient to point out how difficult it is to scale traditional
relational OLAP workloads in loosely coupled systems. I would be
willing to bet that relational OLAP is still the predominant type of
data analytics being done today. My specialty is massive scale spatial
and graph analytics so my expertise is in those markets, but I don't
even have to invoke those pathological cases since the argument still
stands for simpler, boring cases.

The majority of analytics codes on small systems are like this.
Relational analytics are far more widely used than MapReduce, yet you
only ever see the latter on large-scale systems. Graph analytics are
even more pathological, not even scaling up to a modern laptop -- the
codes violate the assumptions of most CPU architectures never mind
loosely coupled systems. Spatial analytics are a little better, but
not by much. The requirement for complex analytics doesn't evaporate
when data sets become large, we simply can't implement them. This is
one of the major challenges of large-scale context-based systems.

Loosely coupled systems are dependent on an extremely high degree of
locality in data structures. Many analytic data structures have
intrinsically weak locality such that in extreme cases there is not
even enough locality for a CPU cache to be useful. It is great if your
code is nothing but predictable one-dimensional traversals -- those
parallelize very well on loosely coupled systems -- but most
interesting real-world analytical relationships are not
one-dimensional or amenable to simple brute-force pattern matching and
aggregation.
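
To illustrate the locality argument, here is a toy model in Python that counts misses in a tiny simulated LRU cache for a predictable one-dimensional traversal versus a scattered one over the same addresses. The cache geometry (8 items per line, 64 lines) and the array size are arbitrary assumptions chosen for the sketch, not measurements of any real CPU or workload.

```python
import random
from collections import OrderedDict


def count_misses(addresses, line_size=8, cache_lines=64):
    """Count misses in a small, fully associative LRU cache model."""
    cache = OrderedDict()
    misses = 0
    for addr in addresses:
        line = addr // line_size
        if line in cache:
            cache.move_to_end(line)        # mark line as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict least recently used line
    return misses


n = 100_000
sequential = range(n)            # predictable one-dimensional traversal

rng = random.Random(0)           # fixed seed keeps the run repeatable
scattered = list(range(n))
rng.shuffle(scattered)           # same addresses, weak locality

seq_misses = count_misses(sequential)
rand_misses = count_misses(scattered)
```

In this model the sequential traversal misses once per cache line, while the scattered traversal misses on nearly every access, which is the "not even enough locality for a CPU cache to be useful" case in miniature.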

Greg Pfister

Dec 5, 2009, 6:02:12 PM

to Cloud Computing

Rao,

We've had this discussion before in this forum. I'm not remotely
interested in having it again. We disagree.

(But just to not let others think there's nothing on the other side,
I'll note for the record that saying parallel is necessary for AI is
like saying airplanes have to flap their wings. We need the science
first; then we can figure out how or whether to do it in parallel.)

Greg Pfister
http://perilsofparallel.blogspot.com/

On Dec 5, 1:23 pm, "Rao Dronamraju" <rao.dronamr...@sbcglobal.net>
wrote:


Jeff Darcy

Dec 5, 2009, 6:40:20 PM

to cloud-c...@googlegroups.com

J. Andrew Rogers wrote:
> It is sufficient to point out how difficult it is to scale traditional
> relational OLAP workloads in loosely coupled systems. I would be
> willing to bet that relational OLAP is still the predominant type of
> data analytics being done today.

...and an awful lot of it *is* being done on such loosely coupled systems.

> My specialty is massive scale spatial
> and graph analytics so my expertise is in those markets, but I don't
> even have to invoke those pathological cases since the argument still
> stands for simpler, boring cases.

No, it doesn't. You've claimed that there is a class of problems that
do not run well on top500-style systems, and won't run well on them
without fundamental advances. There certainly are such problems, but
then you also imply that they run fine on some other architecture and
that they're common enough that the people who design computers should
give a hoot. Since the loosely coupled model is already working quite
well in industries from finance to oil to pharma, I would love to hear
what problems and what architectures those are. What special
characteristics do they have that make the applications run so poorly on
what is by now a conventional commodity-based architecture, what magic
do your preferred architectures do, and if the opportunity is so great
then why aren't more people chasing those dollars?

I understand that it can be frustrating when the people who design
computers don't make your job easier, but *why should they*? They're in
this business to make money too.

Rao Dronamraju

Dec 5, 2009, 7:39:45 PM

to cloud-c...@googlegroups.com

Greg,

We do not have to agree at all, but I do not think either Intel or MIT (where you can get a lot more information) would mention anything if the science did not exist.

Here is information about parallelism/massive parallelism and AI from the horse's mouth (Intel).

The Future is Parallel

Many-core chips, parallel processing, and tera-scale computing require a paradigm shift. But that shift gives us the next level in what computing can and will do for our world. It places many challenges before us and opens a vast horizon of opportunities. Think in terms of when PCs first entered the marketplace decades ago and the inspiring applications that followed.

What will future tera-scale workloads look like? What part of these workloads can be parallelized? And how will they benefit on a tera-scale processor and platform? The tera-scale research teams at Intel have engaged with industry and academia to explore these topics.

RMS offers some exciting possibilities. At Intel, we’ve developed several RMS research application codes and primitives, and we’re offering some of them for public research use. They will be combined with many codes developed by leading thinkers and software architects interested in tera-scale research.

Beyond RMS, our research also shows significant performance potential for real-time analytics codes in finance. Others see the potential for tera-scale capabilities in AI, machine-learning optimization, and prediction.

Today, some existing codes can be parallelized. Many others cannot without a major effort. Thinking massively parallel processing from the beginning of software development is a requirement for tera-scale computing. But therein lies the challenge. Parallelizing is not necessarily trivial. It’s an iterative process that will require new tools, optimizers, and compilers. Intel is engaging with research, academia, and industry to spur efforts to discover new parallel programming techniques, parallelizable algorithms, and tools.

Tera-scale computing will require new tera-scale parallel benchmarks to test hardware and software performance. The current benchmarks are not optimized for many-core, tera-scale computing.

These are all areas needing further work to accelerate the development of tera-scale computing.

http://software.intel.com/en-us/articles/strandberg/

http://software.intel.com/en-us/articles/tera-scale-computing-a-parallel-path-to-the-future/


J. Andrew Rogers

Dec 6, 2009, 2:02:33 AM

to cloud-c...@googlegroups.com

On Sat, Dec 5, 2009 at 3:40 PM, Jeff Darcy <je...@pl.atyp.us> wrote:
> You've claimed that there is a class of problems that
> do not run well on top500-style systems, and won't run well on them
> without fundamental advances. There certainly are such problems, but
> then you also imply that they run fine on some other architecture and
> that they're common enough that the people who design computers should
> give a hoot.

Access locality is a continuum. The more locality you have, the
better it will run on loosely coupled clusters.
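
One way to make the continuum concrete is to count how many edges of a graph cross node boundaries once its vertices are spread over a cluster. The sketch below is a toy model under assumed parameters (10,000 vertices, 16 nodes, contiguous block partitioning); none of these figures come from a real workload.

```python
import random


def remote_fraction(edges, owner):
    """Fraction of edges whose two endpoints live on different nodes."""
    remote = sum(1 for u, v in edges if owner(u) != owner(v))
    return remote / len(edges)


num_vertices = 10_000
num_nodes = 16
block = num_vertices // num_nodes   # 625 vertices per node


def owner(v):
    """Contiguous block partitioning: vertex id -> node index."""
    return v // block


# High locality: every vertex links only to its immediate neighbor.
chain_edges = [(v, v + 1) for v in range(num_vertices - 1)]

# Low locality: the same number of edges between random vertex pairs.
rng = random.Random(0)              # fixed seed keeps the run repeatable
random_edges = [
    (rng.randrange(num_vertices), rng.randrange(num_vertices))
    for _ in range(num_vertices - 1)
]

chain_remote = remote_fraction(chain_edges, owner)
random_remote = remote_fraction(random_edges, owner)
```

Only 15 of the 9,999 chain edges cross a node boundary, whereas for uniformly random pairs about 15 in 16 edges are expected to, so nearly every traversal step would pay interconnect latency.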

There is some hardware that is explicitly designed to run codes with
poor locality, but it is not common and it is not as mature as
conventional high-locality silicon. Even if you bought the systems,
you won't be running a Google-scale cluster, though they will still
out-perform a much larger Google-scale cluster. For the kinds of
analytics they are used for they'll still buy you a couple orders of
magnitude in terms of performance for the silicon budget, though
you'll lose high-locality numerical performance in the bargain. The
assumption of locality is well-conserved in commodity silicon.

Since I do a lot of work with graph analytics (which is mostly used to
analyze complex human behavior for predictive purposes), I have some
experience with these kinds of machines in addition to conventional
loosely coupled clusters. If you know what you are doing, the price
performance crossover point between various hardware architectures
does not favor the high-locality assumption nearly as much as you seem
to think it does. It would make my life easier if this was not the
case.

> Since the loosely coupled model is already working quite
> well in industries from finance to oil to pharma, I would love to hear
> what problems and what architectures those are.

Oil is mostly embarrassingly parallel numerical codes, a natural fit
for loosely coupled systems. Finance analytics is largely ill-suited
to loosely coupled systems, they can only run very simple models on
them and definitely not real-time models -- there is a reason they buy
so much odd silicon. The spirit is willing but the flesh is weak. I
have a little less experience with pharma, but that seems to be a
50/50 split in terms of whether or not the codes will run on loosely
coupled systems.

Also realize that in many cases companies and organizations run codes
on loosely coupled systems that are *grossly* inefficient on such
systems because the marginal cost is worth it. Only the most
high-value applications can afford this kind of waste, but similar
codes for many other purposes become economical if the efficiency
improves.

> What special
> characteristics do they have that make the applications run so poorly on
> what is by now a conventional commodity-based architecture, what magic
> do your preferred architectures do, and if the opportunity is so great
> then why aren't more people chasing those dollars?

I don't have a preferred architecture. No one architecture makes my
life easy. Furthermore, the best architecture is highly dependent on
the nature of the application. If an application will run well on a
loosely coupled commodity architecture, that is what gets recommended.
It is purely about bang for the buck. Some weird architectures are
more expensive, but for some types of analytics the bang for the buck
is a large integer factor over conventional architectures. Again, it
is about the assumptions about locality made by the architecture and
how those assumptions are implemented.

Until very recently there hasn't been a large market for scalable
analytics. Graph analytics are a nascent big deal because of social
networks, deep contextual targeting, and similar -- it takes a few
years for hardware architectures to get designed and built. A year ago
it wasn't on anyone's radar except for bleeding edge folks like the
intelligence agencies. Ironically, one of the best current pieces of
hardware for graph analytics was built in the 1990s and was largely
forgotten (no applications at the time) but is making a resurgence
because it outperforms the latest commodity silicon for that task.

The important trend is that almost all high-value analytics are moving
toward that class of algorithm where loosely coupled commodity
clusters offer miserable performance and little scalability.

> I understand that it can be frustrating when the people who design
> computers don't make your job easier, but *why should they*? They're in
> this business to make money too.

I'm all about price performance. That's not the point. For a broad
swath of analytics, you can't get the performance for *any* price on a
loosely coupled architecture. If you allow for other types of
architectures (which may be very cloud-like in their own way) that
class of applications shrinks a bit and it is unambiguously economical
to do things other ways.

The path to massive scalability is not loosely coupled systems, it is
theoretical computer science that makes it plausible to use these
loosely coupled systems in a fashion that actually delivers
scalability. You can't make algorithms scale on architectures they
are not suited for, you have to change the algorithms. There hasn't
been a lot of that going on.

Miha Ahronovitz

Dec 6, 2009, 3:16:54 PM

to cloud-c...@googlegroups.com

@Jeff

"Access locality is a continuum. The more locality you have, the better it will run on loosely coupled clusters."

Small steps are better than no steps at all. Resource management software such as Sun Grid Engine scales up to 64,000 cores, running, for example, a 60,000-core MPI application (face recognition software). For those familiar with parallel applications, this is mind-boggling.

In the next release (6.2 Update 5) SGE will have data aware scheduling. For now we will have a Hadoop (Map-Reduce) integration. SGE will make the best effort to schedule a job to nodes where Hadoop places the data. In the future we will do the same using Oracle Coherence (formerly Tangosol).
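
As a rough illustration of what best-effort data-aware scheduling can look like, here is a small greedy sketch in Python. It is not SGE's algorithm or the Hadoop API; the `schedule` function, its arguments, and the node and block names are all hypothetical, invented only to show the idea of preferring a node that already holds a job's input.

```python
def schedule(jobs, block_locations, free_slots):
    """Greedy best-effort placement (hypothetical, for illustration only).

    jobs:            dict of job name -> input block id
    block_locations: dict of block id -> set of nodes holding a replica
    free_slots:      dict of node name -> free slot count (not mutated)
    """
    slots = dict(free_slots)
    placement = {}
    for job, blk in jobs.items():
        # Prefer nodes that already hold the job's data and have capacity.
        local = [n for n in sorted(block_locations.get(blk, ()))
                 if slots.get(n, 0) > 0]
        # Otherwise fall back to any node with a free slot.
        candidates = local or [n for n in sorted(slots) if slots[n] > 0]
        if not candidates:
            placement[job] = None   # no capacity left; job stays queued
            continue
        node = candidates[0]
        slots[node] -= 1
        placement[job] = node
    return placement
```

Given one slot each on `n1`, `n2` and `n3`, with block `b1` on `n1` and `b2` on `n2`, jobs reading `b1`, `b1`, `b2` in that order land on `n1`, `n2`, `n3`: the first job runs next to its data, and the other two fall back to remote nodes, which is why this is only "best effort".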

Another refinement is better topological scheduling (for now specific to Nehalem). It allows jobs to be scheduled at the core level or the CPU level according to each job's needs.

The two features above produce spectacular performance improvements versus classical one-core-per-CPU clusters. Let's not underestimate the role of middleware improvements in delivering scalability in large clusters.

http://my-inner-voice.blogspot.com/2009/10/features-in-both-sun-grid-engine-6.html

Miha

From: J. Andrew Rogers <realit...@gmail.com>
To: cloud-c...@googlegroups.com

Sent: Sat, December 5, 2009 11:02:33 PM

Subject: Re: [ Cloud Computing ] Single Chip Cloud Computer?....

Greg Pfister

Dec 6, 2009, 10:26:39 PM

to Cloud Computing

Rao, I hope everybody will note that the quotes and references you
provide are from Intel, and not from cognitive and/or behavioral
neuroscientists.

Greg Pfister
http://perilsofparallel.blogspot.com/

On Dec 5, 5:39 pm, "Rao Dronamraju" <rao.dronamr...@sbcglobal.net>
wrote:
> Greg,
>

> http://software.intel.com/en-us/articles/tera-scale-computing-a-paral...
> h-to-the-future/

>
>
>
> -----Original Message-----
> From: Greg Pfister [mailto:greg.pfis...@gmail.com]
> Sent: Saturday, December 05, 2009 5:02 PM
> To: Cloud Computing
> Subject: [ Cloud Computing ] Re: Single Chip Cloud Computer?....
>
> Rao,
>
> We've had this discussion before in this forum. I'm not remotely
> interested in having it again. We disagree.
>
> (But just to not let others think there's nothing on the other side,
> I'll note for the record that saying parallel is necessary for AI is
> like saying airplanes have to flap their wings. We need the science
> first; then we can figure out how or whether to do it in parallel.)
>

> Greg Pfisterhttp://perilsofparallel.blogspot.com/



Rao Dronamraju

Dec 7, 2009, 12:06:01 PM

to cloud-c...@googlegroups.com

Greg,

Yes, I forwarded Intel's postings. I agree that Intel has a self-interest in
promoting its multicore HW in those postings, but I did not see any
self-interest in promoting non-traditional applications of it such as AI, ML
and real-time analytics. They could easily have mentioned traditional
multicore applications like bioinformatics, Fast Fourier Transforms, seismic
analysis, medical imaging, finite element methods, etc., but they did not.
Instead, they specifically mentioned AI, ML and real-time analytics in
addition to RMS. Here are some publicly available references:

Map-Reduce for Machine Learning on Multicore
http://www.cs.stanford.edu/people/ang//papers/nips06-mapreducemulticore.pdf

MapReduce: Distributed Computing for Machine Learning
http://www.icsi.berkeley.edu/~arlo/publications/gillick_cs262a_proj.pdf

UPCRC Multicore Applications Workshop - Session #5 - Human-machine
Interaction.
http://www.researchchannel.org/prog/displayevent.aspx?rID=29992&fID=6431

Parallel Machine Learning Toolbox
http://www.haifa.ibm.com/projects/verification/ml_toolbox/index.html

Regards,
Rao

