Marcel Hendrix <m...@iae.nl> writes:
>On Wednesday, June 1, 2022 at 2:27:35 PM UTC+2, Anton Ertl wrote:
>[..]
>> The thing about reversible computation is that it does not erase
>> memory (which is what costs energy in Landauer's principle), so it
>> would allow going below the Landauer limit, in a sense. However,
>> you still need some energy to drive the computation in a specific
>> direction, and more for driving it faster (at least that's what I
>> read at one point).
I think I read it in a collection by Feynman (who gave a regular
course on the physics of computation in the 1980s).
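To put a number on that limit, here is a back-of-the-envelope
calculation of the Landauer bound k_B*T*ln(2) per erased bit in
Python; the Boltzmann constant is the standard SI value, and the
300K room temperature is just an assumption for illustration:

import math

K_B = 1.380649e-23   # Boltzmann constant in J/K (exact SI value)
T = 300.0            # assumed room temperature in kelvin

e_bit = K_B * T * math.log(2)
print("Landauer limit at %gK: %.3e J per erased bit" % (T, e_bit))
# prints about 2.871e-21 J; real CMOS switching dissipates many
# orders of magnitude more per bit than that.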
>I expected there to be a minimum amount of energy
>to push a bunch of electrons from one detectable state
>to another one.
I think that's already too implementation-specific for this kind of
reasoning.
>Might be the same principle as Landauer's, but I find his idea
>that information and energy are somehow related hard to grasp.
Information and entropy are related. E.g., consider Maxwell's demon:
the demon has to record its measurements somewhere, and erasing that
record costs at least kT ln(2) per bit, which is what rescues the
second law (Bennett's resolution of the paradox).
>A boundary that is maybe more of a practical concern: are there
>theoretical limits related to pipelining (i.e. branch removal)
Pipelining is not the same as branch removal. In (hardware)
pipelining, every pipeline stage adds ~5 gate delays to the latency
of the whole thing, for the holding latches and for the clock skew,
jitter etc. of the pipeline stage. It also adds to the power needs
(both for the additional gates and due to the higher clock rate).
Intel planned to deepen the Pentium 4 pipeline [sprangle&carmean02]
in the Tejas (and AMD also worked on a deeply pipelined CPU at the
same time), but both projects were cancelled in 2004; my guess is
that a promising cooling technology did not work out, so they could
not produce CPUs with as high a power density as planned.
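To see the diminishing returns, here is a toy model in Python. The
numbers are made-up assumptions in the spirit of the ~5 gate delays
mentioned above, not measurements of any real CPU: a total logic
depth D is split into n stages, each paying a fixed overhead o, so
the cycle time is D/n + o.

D = 100.0   # assumed total logic depth, in gate delays
o = 5.0     # assumed per-stage overhead (latches, skew, jitter)

base = D / 5 + o                 # cycle time of a 5-stage baseline
for n in (5, 10, 20, 40, 80):
    cycle = D / n + o
    print("%2d stages: cycle %6.2f gate delays, clock %.2fx baseline"
          % (n, cycle, base / cycle))

In this model, going from 5 to 80 stages (16 times the stages,
latches, and clocking power) buys only a 4x clock; and that is
before counting the deeper pipeline's higher misprediction penalty.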
Branch prediction helps avoid the branch penalty of deep pipelines;
you cannot predict a truly random branch, but branch predictors can
apparently exploit patterns in the data that we do not easily see.
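As an illustration, here is a minimal sketch in Python of the classic
2-bit saturating-counter predictor (a textbook scheme; actual
predictors, including the Pentium 4's, are more sophisticated):

def predict(counter):
    # counter in 0..3; predict taken when in the upper half
    return counter >= 2

def update(counter, taken):
    # saturating increment/decrement towards the actual outcome
    return min(counter + 1, 3) if taken else max(counter - 1, 0)

# On a loop branch (taken 9 times, then not taken once), only the
# loop exit mispredicts:
counter, hits = 2, 0
outcomes = [True] * 9 + [False]   # assumed 10-iteration loop
for taken in outcomes:
    hits += (predict(counter) == taken)
    counter = update(counter, taken)
print("%d/%d predicted correctly" % (hits, len(outcomes)))  # 9/10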
@InProceedings{sprangle&carmean02,
author = {Eric Sprangle and Doug Carmean},
title = {Increasing Processor Performance by Implementing
Deeper Pipelines},
crossref = {isca02},
pages = {25--34},
url = {http://www.cs.cmu.edu/afs/cs/academic/class/15740-f03/public/doc/discussions/uniprocessors/technology/deep-pipelines-isca02.pdf},
annote = {This paper starts with the Willamette (Pentium~4)
pipeline and discusses and evaluates changes to the
pipeline length. In particular, it gives numbers on
how lengthening various latencies would affect IPC;
on a per-cycle basis the ALU latency is most
important, then L1 cache, then L2 cache, then branch
misprediction; however, the total effect of
lengthening the pipeline to double the clock rate
gives the reverse order (because branch
misprediction gains more cycles than the other
latencies). The paper reports 52 pipeline stages
with 1.96 times the original clock rate as optimal
for the Pentium~4 microarchitecture, resulting in a
reduction of core time by a factor of 1.45 and an
overall speedup of about 1.29 (including waiting for
memory). Various other topics are discussed, such as
nonlinear effects when introducing bypasses, and
varying cache sizes. Recommended reading.}
}
>and/or parallel computing?
Amdahl's law. Often underestimated, often overestimated.
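For the record, the formula: with a fraction p of the work
parallelizable and n processors, the speedup is 1/((1-p)+p/n), which
saturates at 1/(1-p). A quick Python illustration with an assumed
parallel fraction of 0.95:

def amdahl(p, n):
    return 1.0 / ((1.0 - p) + p / n)

for n in (2, 8, 64, 1024):
    print("n = %4d: speedup = %5.2f" % (n, amdahl(0.95, n)))
# even 95% parallel code saturates near 1/(1-0.95) = 20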
>The human brain does not seem to have much of a problem with the
>speed of communication (between cells),
It does not compute very fast.
>and doesn't overheat.
Actually, humans are reported to spend 25% of their energy on the
brain, and certainly more when thinking hard. And the brain can
become too hot.
New standard:
https://forth-standard.org/
EuroForth 2022:
http://www.euroforth.org/ef22/cfp.html