Low Power Super Computing

Rick C

May 31, 2022, 10:34:06 PM

I watched a video (well, part of it anyway) about the current top dog super computer that performs 52.2 GFLOPS per watt. I think that's the territory of the GA144, no? I can't recall how many watts it is, but I'm thinking it's around 1 watt running flat out. Of course, it doesn't do floating point ops natively, so not really a good comparison. But for MIPS, it's about 100 GIPS per watt.

Not too shabby for a 12 year old design.

--

Rick C.

- Get 1,000 miles of free Supercharging
- Tesla referral code - https://ts.la/richard11209

Marcel Hendrix

Jun 1, 2022, 12:52:56 AM

On Wednesday, June 1, 2022 at 4:34:06 AM UTC+2, gnuarm.del...@gmail.com wrote:
> I watched a video (well, part of it anyway) about the current top dog super computer that performs 52.2 GFLOPS per watt. I think that's the territory of the GA144, no? I can't recall how many watts it is, but I'm thinking it's around 1 watt running flat out. Of course, it doesn't do floating point ops natively, so not really a good comparison. But for MIPS, it's about 100 GIPS per watt.
>
> Not too shabby for a 12 year old design.

Is there no theoretical limit on the GFLOPS/MIPS given a certain manufacturing process and maybe a few other parameters?

-marcel

Rick C

Jun 1, 2022, 2:00:06 AM

Yes, there is a theoretical limit on the energy used for a given computation. I remember a Scientific American paper about it back when they actually had papers, before they became another Discover magazine.

--

Rick C.

none albert

Jun 1, 2022, 3:54:25 AM

In article <d0c1bdda-dade-44bc...@googlegroups.com>,

Rick C <gnuarm.del...@gmail.com> wrote:
>I watched a video (well, part of it anyway) about the current top dog
>super computer that performs 52.2 GFLOPS per watt. I think that's the
>territory of the GA144, no? I can't recall how many watts it is, but
>I'm thinking it's around 1 watt running flat out. Of course, it doesn't
>do floating point ops natively, so not really a good comparison. But
>for MIPS, it's about 100 GIPS per watt.
>
>Not too shabby for a 12 year old design.

I would love to see someone break the hurdle of 1000 sensible
instructions per second on a GA144 chip.

>Rick C.

Groetjes Albert
--
"in our communism country Viet Nam, people are forced to be
alive and in the western country like US, people are free to
die from Covid 19 lol" duc ha
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

none albert

Jun 1, 2022, 4:00:41 AM

In article <0beff084-7928-42bd...@googlegroups.com>,

Rick C <gnuarm.del...@gmail.com> wrote:
>On Wednesday, June 1, 2022 at 12:52:56 AM UTC-4, Marcel Hendrix wrote:
>> On Wednesday, June 1, 2022 at 4:34:06 AM UTC+2, gnuarm.del...@gmail.com wrote:
>> > I watched a video (well, part of it anyway) about the current top
>> > dog super computer that performs 52.2 GFLOPS per watt. I think that's
>> > the territory of the GA144, no? I can't recall how many watts it is, but
>> > I'm thinking it's around 1 watt running flat out. Of course, it doesn't
>> > do floating point ops natively, so not really a good comparison. But for
>> > MIPS, it's about 100 GIPS per watt.
>> >
>> > Not too shabby for a 12 year old design.
>> Is there no theoretical limit on the GFLOPS/MIPS given a certain
>> manufacturing process and maybe a few other parameters?
>
>Yes, there is a theoretical limit on the energy used for a given
>computation. I remember a Scientific American paper about it back when
>they actually had papers, before they became another Discover magazine.

I remember another article in the same SA about reversible computation
(which doesn't increase entropy) that requires no energy consumption.
Apparently reversible computation can calculate anything.

P.S.
I dropped my subscription when they started expressing energy consumption
as the equivalent number of hairdryers.

Anton Ertl

Jun 1, 2022, 8:13:38 AM

Rick C <gnuarm.del...@gmail.com> writes:
>I watched a video (well, part of it anyway) about the current top dog super
>computer that performs 52.2 GFLOPS per watt. I think that's the territory
>of the GA144, no?

No.

>I can't recall how many watts it is, but I'm thinking it's around 1 watt
>running flat out. Of course, it doesn't do floating point ops natively, so
>not really a good comparison. But for MIPS, it's about 100 GIPS per watt.

Doing what? Supercomputers are evaluated using the linpack benchmark,
which solves a dense system of linear equations
<https://www.top500.org/project/linpack/>, something that
supercomputers tend to do not just for benchmarking.
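
To put the two quoted figures on one scale, here is a minimal sketch in Python (the function name is made up for illustration). It only converts an efficiency in operations per second per watt into energy per operation; it does not make a GA144 instruction comparable to a floating-point operation in Linpack, which is the point of the question above. The 100 GIPS per watt value is Rick's rough recollection, not a measured number.

```python
def joules_per_op(ops_per_second_per_watt: float) -> float:
    """Energy per operation in joules.

    1 W = 1 J/s, so (ops/s)/W is the same as ops/J, and the
    reciprocal is joules per operation.
    """
    return 1.0 / ops_per_second_per_watt


# Figures quoted in this thread (the second one is a rough estimate).
top_machine_pj = joules_per_op(52.2e9) * 1e12  # picojoules per FLOP
ga144_pj = joules_per_op(100e9) * 1e12         # picojoules per instruction

print(f"52.2 GFLOPS/W -> {top_machine_pj:.1f} pJ per FLOP")
print(f"100 GIPS/W    -> {ga144_pj:.1f} pJ per instruction")
```

The two results are of the same order of magnitude, but the units differ: one counts Linpack floating-point operations, the other counts integer instructions.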

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2021: https://euro.theforth.net/2021

Anton Ertl

Jun 1, 2022, 8:14:45 AM

Marcel Hendrix <m...@iae.nl> writes:
>Is there no theoretical limit on the GFLOPS/MIPS given a certain manufacturing
>process and maybe a few other parameters?

Not that I know of.

Anton Ertl

Jun 1, 2022, 8:27:35 AM

Rick C <gnuarm.del...@gmail.com> writes:
>Yes, there is a theoretical limit on the energy used for a given computation.

But it has nothing to do with semiconductor processes.

You are thinking of the Landauer limit
<https://en.wikipedia.org/wiki/Landauer%27s_principle>, which is far
below the power dissipation of computers implemented in current
processes.

The thing about reversible computation is that it does not erase
memory (what costs energy in Landauer's principle), so it would allow
going below the Landauer limit, in a sense. However, you still need
some energy to drive the computation in a specific direction, and more
for driving it faster (at least that's what I read at one point).
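
For a rough sense of how far below current hardware the Landauer limit lies, here is a small Python sketch. The bound is E = k_B * T * ln 2 per erased bit. The temperature of 300 K is an assumption, and comparing the bound with the energy per FLOP implied by the 52.2 GFLOPS per watt figure from this thread is only an order-of-magnitude illustration, since one floating-point operation is not one bit erasure.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact value in the 2019 SI definition
TEMPERATURE_K = 300.0             # assumption: roughly room temperature


def landauer_limit_joules(temperature_k: float) -> float:
    """Minimum energy dissipated when erasing one bit at the given temperature."""
    return BOLTZMANN_J_PER_K * temperature_k * math.log(2)


limit = landauer_limit_joules(TEMPERATURE_K)

# Energy per FLOP implied by 52.2 GFLOPS per watt (figure quoted in the thread).
joules_per_flop = 1.0 / 52.2e9

print(f"Landauer limit at {TEMPERATURE_K:.0f} K: {limit:.3e} J per erased bit")
print(f"Energy per FLOP at 52.2 GFLOPS/W: {joules_per_flop:.3e} J")
print(f"Ratio: {joules_per_flop / limit:.1e}")
```

Under these assumptions the gap is several billion to one, which is why the bound says little about any particular semiconductor process.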

Marcel Hendrix

Jun 1, 2022, 1:00:14 PM

On Wednesday, June 1, 2022 at 2:27:35 PM UTC+2, Anton Ertl wrote:
[..]

> The thing about reversible computation is that it does not erase
> memory (what costs energy in Landauer's principle), so it would allow
> going below the Landauer limit, in a sense. However, you still need
> some energy to drive the computation in a specific direction, and more
> for driving it faster (at least that's what I read at one point).

I expected there to be a minimum amount of energy
to push a bunch of electrons from one detectable state
to another one. Might be same principle as Landauer, but
his idea that information and energy are somehow related
I find hard to grasp.

A boundary that is maybe more of practical concern: are there
theoretical limits related to pipelining (i.e. branch removal)
and/or parallel computing?

The human brain does not seem to have much of a problem with the speed
of communication (between cells), and doesn't overheat. Unfortunately,
most of the time it refuses to compute exactly what I want.

-marcel

Anton Ertl

Jun 1, 2022, 1:49:41 PM

Marcel Hendrix <m...@iae.nl> writes:
>On Wednesday, June 1, 2022 at 2:27:35 PM UTC+2, Anton Ertl wrote:
>[..]
>> The thing about reversible computation is that it does not erase
>> memory (what costs energy in Landauer's principle), so it would allow
>> going below the Landauer limit, in a sense. However, you still need
>> some energy to drive the computation in a specific direction, and more
>> for driving it faster (at least that's what I read at one point).

I think I read it in a collection by Feynman (who held a regular
lecture about physics of computation in the 1980s).

>I expected there to be a minimum amount of energy
>to push a bunch of electrons from one detectable state
>to another one.

I think that's already too implementation-specific for this kind of
reasoning.

>Might be same principle as Landauer, but
>his idea that information and energy are somehow related
>I find hard to grasp.

Information and entropy are related. E.g., consider Maxwell's demon.

>A boundary that is maybe more of practical concern: are there
>theoretical limits related to pipelining (i.e. branch removal)

Pipelining is not the same as branch removal. In (hardware)
pipelining, every pipeline stage adds ~5 gate delays to the delay of
the whole thing, for the holding latches, and for the jitter etc. of
the pipeline stage. It also adds to the power needs (both for the
additional gates and due to clocking higher). Intel planned to deepen
the Pentium 4 pipeline [sprangle&carmean02] in the Tejas (and AMD also
worked on a deeply pipelined CPU at the same time), but both projects
were cancelled in 2005; my guess is that there was a promising cooling
technology that did not work out, so they could not produce CPUs with
such a high power density as planned.
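
The diminishing return from deeper pipelines can be sketched with a toy model in Python. The assumptions are loud ones: the logic splits evenly across stages, each stage pays a fixed overhead of 5 gate delays (the rough figure above), and the total logic depth of 200 gate delays is a hypothetical number chosen for illustration, not taken from the paper or from any real CPU.

```python
def cycle_time(logic_gate_delays: float, stages: int, overhead: float = 5.0) -> float:
    """Cycle time in gate delays for an evenly split pipeline.

    Each stage gets an equal share of the logic plus a fixed per-stage
    overhead for latches, clock skew, and jitter.
    """
    return logic_gate_delays / stages + overhead


LOGIC_DEPTH = 200.0  # hypothetical total logic depth in gate delays

baseline = cycle_time(LOGIC_DEPTH, 10)
for stages in (10, 20, 40, 80):
    t = cycle_time(LOGIC_DEPTH, stages)
    print(f"{stages:3d} stages: cycle = {t:5.1f} gate delays, "
          f"clock speedup vs 10 stages = {baseline / t:.2f}x")
```

In this model, doubling the stage count never doubles the clock rate, because the fixed overhead becomes a larger share of every cycle. The model ignores branch misprediction and power entirely.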

Branch prediction helps avoid the branch penalty of deep pipelines;
you cannot predict a really random branch, but apparently patterns in
the data that we don't see easily can be used by branch predictors.
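
The difference between a patterned and a random branch can be illustrated with a textbook 2-bit saturating counter, sketched here in Python. This is a classic teaching example and an assumption for illustration only; it is not how the predictors in the CPUs discussed here actually work, which are far more elaborate.

```python
import random


def prediction_accuracy(outcomes):
    """Fraction of branches predicted correctly by one 2-bit saturating counter.

    Counter states 0..3; states 2 and 3 predict "taken".
    """
    counter = 2  # start in the "weakly taken" state
    correct = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken == taken:
            correct += 1
        # Move toward 3 on taken, toward 0 on not taken, saturating at both ends.
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return correct / len(outcomes)


# A loop branch: taken 9 times, then not taken once, repeated.
loop_branch = ([True] * 9 + [False]) * 100

# A branch with no pattern at all (seeded so the run is repeatable).
rng = random.Random(0)
random_branch = [rng.random() < 0.5 for _ in range(10_000)]

loop_accuracy = prediction_accuracy(loop_branch)
random_accuracy = prediction_accuracy(random_branch)
print(f"loop branch:   {loop_accuracy:.2%}")
print(f"random branch: {random_accuracy:.2%}")
```

The loop branch is mispredicted only at each loop exit, while the random branch stays near coin-flip accuracy no matter what the counter does.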

@InProceedings{sprangle&carmean02,
author = {Eric Sprangle and Doug Carmean},
title = {Increasing Processor Performance by Implementing
Deeper Pipelines},
crossref = {isca02},
pages = {25--34},
url = {http://www.cs.cmu.edu/afs/cs/academic/class/15740-f03/public/doc/discussions/uniprocessors/technology/deep-pipelines-isca02.pdf},
annote = {This paper starts with the Willamette (Pentium~4)
pipeline and discusses and evaluates changes to the
pipeline length. In particular, it gives numbers on
how lengthening various latencies would affect IPC;
on a per-cycle basis the ALU latency is most
important, then L1 cache, then L2 cache, then branch
misprediction; however, the total effect of
lengthening the pipeline to double the clock rate
gives the reverse order (because branch
misprediction gains more cycles than the other
latencies). The paper reports 52 pipeline stages
with 1.96 times the original clock rate as optimal
for the Pentium~4 microarchitecture, resulting in a
reduction of core time by a factor of 1.45 and an overall
speedup of about 1.29 (including waiting for
memory). Various other topics are discussed, such as
nonlinear effects when introducing bypasses, and
varying cache sizes. Recommended reading.}
}

>and/or parallel computing?

Amdahl's law. Often underestimated, often overestimated.
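
As a quick illustration of Amdahl's law, here is a short Python sketch. With a fraction p of the work parallelizable over n processors, the speedup is 1 / ((1 - p) + p / n). The values p = 0.95 and the processor counts are arbitrary examples, not measurements of any real program.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Speedup predicted by Amdahl's law for a fixed-size problem."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


P = 0.95  # example value: 95% of the work parallelizes perfectly

for n in (1, 8, 64, 1024):
    print(f"{n:5d} processors: speedup = {amdahl_speedup(P, n):6.2f}")

# Upper bound as the processor count grows without limit: 1 / (1 - p).
print(f"limit: {1.0 / (1.0 - P):.1f}")
```

Even with 95% of the work parallel, the serial 5% caps the speedup at about 20, however many processors are added.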

>The human brain does not seem to have much of a problem with the speed
>of communication (between cells),

It does not compute very fast.

>and doesn't overheat.

Actually humans are reported to spend 25% of their energy on the
brain, and certainly more when people are thinking hard. And it can
become too hot.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html

New standard: https://forth-standard.org/
EuroForth 2022: http://www.euroforth.org/ef22/cfp.html

Rick C

Jun 1, 2022, 4:45:54 PM

The analysis has to be abstract. Electrons are not the only way to perform logic.

--

Rick C.

none albert

Jun 2, 2022, 5:35:12 AM

In article <6e8d4873-b1bc-4973...@googlegroups.com>,

Marcel Hendrix <m...@iae.nl> wrote:
>On Wednesday, June 1, 2022 at 2:27:35 PM UTC+2, Anton Ertl wrote:
>[..]
>> The thing about reversible computation is that it does not erase
>> memory (what costs energy in Landauer's principle), so it would allow
>> going below the Landauer limit, in a sense. However, you still need
>> some energy to drive the computation in a specific direction, and more
>> for driving it faster (at least that's what I read at one point).
>
>I expected there to be a minimum amount of energy
>to push a bunch of electrons from one detectable state
>to another one. Might be same principle as Landauer, but
>his idea that information and energy are somehow related
>I find hard to grasp.

Not the same as Landauer. Electric energy and gravitational energy
are types of free energy. They can be converted into each other
without loss. Theoretically. Going from 95 to 99 to 99.9% efficiency
is possible, but each step requires more and more sophistication.
That kind of thing. Lossless in a hard-to-reach limit.

>
>A boundary that is maybe more of practical concern: are there
>theoretical limits related to pipelining (i.e. branch removal)
>and/or parallel computing?

I ignored that article because it wasn't practical, and I
didn't see consequences for real life.

>-marcel

groetjes Albert