Yet another Spectré variant

MitchAlsup

Sep 21, 2022, 11:52:31 AM
https://misc0110.net/web/files/netspectre.pdf

This paper describes a means to leak data over a network without
controlling any of the code being executed. Once again it illustrates
how microarchitectural state is exposed to the architectural level.

One variant uses the original "mistrain the branch predictor" trick
and then uses the mistraining to alter cache state, which can then
be measured.
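
For concreteness, the usual shape of such a leak gadget is roughly the
classic Spectre v1 pattern (a minimal C sketch, not the paper's exact
code; the array names are illustrative):

  /* x is attacker-influenced; the branch predictor is first trained
     with in-bounds values of x. */
  if (x < array1_size)                  /* mispredicted as taken */
      y = array2[array1[x] * 4096];     /* speculative load leaves a
                                           per-value cache footprint */

The attacker then times accesses to array2 to see which line was
brought in, recovering array1[x].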

The important variant, however, measures the delay of turning
back on function units (AVX2 in particular) that power down to
<well> save power.

I suspect that those architectures with separate FP register files
AND a means to avoid saving and restoring that file on various
context switches would also be a candidate to leak data by, in
effect, measuring whether (or not) the FP file was loaded at any
given point in time.

The paper makes the point that it is hard to "make these leaks
go away" because GBOoO designs have all the requisite means
to make microarchitectural state visible with a high precision timer.
However, neither My 66000 nor Mill is attackable using these means.

Anton Ertl

Sep 21, 2022, 1:00:49 PM
MitchAlsup <Mitch...@aol.com> writes:
>The important variant, however, measures the delay of turning
>back on function units (AVX2 in particular) that power down to
><well> save power.

The way to deal with this is to power up the upper part of a
powered-down AVX unit only when the AVX operation becomes
architectural.

>I suspect that those architectures with separate FP register files
>AND a means to avoid saving and restoring that file on various
>context switches would also be a candidate to leak data by, in
>effect, measuring whether (or not) the FP file was loaded at any
>given point in time.

That is an architectural operation, so it cannot be used for a Spectre
variant. Moreover, executing an FP instruction architecturally rarely
reveals something interesting about the program, so I don't expect a
non-speculative side-channel attack through that, either.

>The paper makes the point that it is hard to "make these leaks
>go away" because GBOoO designs have all the requisite means
>to make microarchitectural state visible with a high precision timer.

I outlined above how to make the AVX-powerup side channel
non-speculative. Not particularly hard.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

MitchAlsup

Sep 21, 2022, 2:18:39 PM
On Wednesday, September 21, 2022 at 12:00:49 PM UTC-5, Anton Ertl wrote:
> MitchAlsup <Mitch...@aol.com> writes:
> >The important variant, however, measures the delay of turning
> >back on function units (AVX2 in particular) that power down to
> ><well> save power.
<
> The way to deal with this is to power up the upper part of a
> powered-down AVX unit only when the AVX operation becomes
> architectural.
<
How do you do the AVX calculation with AVX powered down ?
AVX does not become "architectural" until the instruction gets
retired. So if you have a body of code with an AVX path and a non-
AVX path, one trains the predictor, and causes the non-AVX path
to be predicted.........
<
> >I suspect that those architectures with separate FP register files
> >AND a means to avoid saving and restoring that file on various
> >context switches would also be a candidate to leak data by, in
> >effect, measuring whether (or not) the FP file was loaded at any
> >given point in time.
<
> That is an architectural operation, so it cannot be used for a Spectre
> variant. Moreover, executing an FP instruction architecturally rarely
> reveals something interesting about the program, so I don't expect a
> non-speculative side-channel attack through that, either.
<
Many architectures with separate FP files have a bit in the program
status structure indicating whether the FP file is loaded. If the FP
file is not being used, you don't want to pay the latency of loading
the file. However, when the application performs an FP instruction,
the core takes a trap to load the FP file state.
<
The measurement is not of the FP calculation, but of whether it took
a trap to load the file (or not). Being several hundred cycles, it is
easy to measure remotely.
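
The probe itself can be as simple as timing one FP instruction (a
sketch in portable C; the remote variant times the network response
instead):

  #include <time.h>

  volatile double a = 1.0, b = 2.0;
  struct timespec t0, t1;
  clock_gettime(CLOCK_MONOTONIC, &t0);
  volatile double c = a + b;        /* first FP use may take the trap */
  clock_gettime(CLOCK_MONOTONIC, &t1);
  /* A delta several hundred cycles above a bare FP add means the trap
     fired, i.e. the FP file was not yet loaded. */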
<
> >The paper makes the point that it is hard to "make these leaks
> >go away" because GBOoO designs have all the requisite means
> >to make microarchitectural state visible with a high precision timer.
>
> I outlined above how to make the AVX-powerup side channel
> non-speculative. Not particularly hard.
<
I don't think your solution actually works.

Anton Ertl

Sep 21, 2022, 4:49:17 PM
MitchAlsup <Mitch...@aol.com> writes:
>On Wednesday, September 21, 2022 at 12:00:49 PM UTC-5, Anton Ertl wrote:
>> MitchAlsup <Mitch...@aol.com> writes:
>> >The important variant, however, measures the delay of turning
>> >back on function units (AVX2 in particular) that power down to
>> ><well> save power.
><
>> The way to deal with this is to power up the upper part of a
>> powered-down AVX unit only when the AVX operation becomes
>> architectural.
><
>How do you do the AVX calculation with AVX powered down ?
>AVX does not become "architectural" until the instruction gets
>retired.

Assuming all of that was true, the way to go would be to let the
instruction progress into retirement state (if it becomes
architectural) without performing the computation, only then power up
the AVX unit, then replay it. Cost: it takes 20 or so cycles longer
(in addition to the thousands of cycles until powerup is done) until
the program can progress.

However, on actual Intel CPUs like the Skylake, the lower 128 bits of
the AVX unit are always powered on, and they can perform AVX256
instructions even in that state, just more slowly; that's because
powering up the upper half takes substantial time, and they don't want
to stop the core during that time. So the only thing that happens is
that they start powering up the upper half not when a speculative
AVX256 instruction is decoded, but when an architectural AVX256
instruction is retired. Cost: it takes 20 or so cycles longer until
the upper half is powered up and AVX256 can run at full speed.

>> >I suspect that those architectures with separate FP register files
>> >AND a means to avoid saving and restoring that file on various
>> >context switches would also be a candidate to leak data by, in
>> >effect, measuring whether (or not) the FP file was loaded at any
>> >given point in time.
><
>> That is an architectural operation, so it cannot be used for a Spectre
>> variant. Moreover, executing an FP instruction architecturally rarely
>> reveals something interesting about the program, so I don't expect a
>> non-speculative side-channel attack through that, either.
><
>Many architectures with separate FP files have a bit in the program
>status structure indicating whether the FP file is loaded. If the FP
>file is not being used, you don't want to pay the latency of loading
>the file. However, when the application performs an FP instruction,
>the core takes a trap to load the FP file state.
><
>The measurement is not of the FP calculation, but of whether it took
>a trap to load the file (or not). Being several hundred cycles, it is
>easy to measure remotely.

Sure, and it tells the remote side that the program performed an
architectural FP instruction. That's not a side channel that allows
revealing speculatively accessed data, i.e., not a Spectre variant.

anti...@math.uni.wroc.pl

Sep 23, 2022, 7:43:00 PM
Consider My 66000 with a 256-bit wide execution unit for VMM.
A significant proportion of code will get close to optimal
performance using a 128-bit wide execution unit, so powering
down half of the execution unit can give valuable power
savings. And one needs to decide when to power down part
of the execution unit and when to power the whole thing back up.
If done "correctly" this will not introduce extra trouble. But
there are tricky aspects, and only on seeing a real implementation
would we know whether My 66000 is affected. Of course, a narrow
in-order My 66000 (or Mill) will have fewer tricky aspects.

More generally, I would say that modern architectures show
variable data-dependent performance. Various performance-
enhancing measures lead to covert channels. At one
time DEC tried to create a computer system free from
covert channels. IIRC their system ran at half the speed
of a conventional system. It eliminated most, but not
all, covert channels. Eventually this project got
cancelled. Modern systems are much more complicated and
much faster than old ones. So covert channels are
likely to have much higher bandwidth, and low-bandwidth
channels are likely to be more tricky. It makes sense
to eliminate blatant holes like the original Meltdown or
Spectre. But I am not sure that eliminating all covert
channels is feasible. And when there are covert
channels, then there is the possibility that an attacker can
find appropriate gadgets in existing code and use them
to sniff data from the system.

One way to plug network covert channels would be to
have a fixed service time: always give the response
after a fixed time, even if the system can compute the answer faster.
But that would require a completely different software
architecture, and would almost surely slow down network
performance quite a lot. So for network servers,
slowing down the processor by avoiding speculation may
be preferable to a fixed response time. OTOH computational
nodes (and GUI clients) that devote only a small portion
of their resources to the network may prefer speculative execution
to get high compute performance, slowing down the
network at the software level.
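
A sketch of that fixed-service-time idea (C; send_reply() and the time
budget are hypothetical, not from any real server):

  #include <time.h>

  /* Compute the answer first, then hold the reply until a fixed
     deadline, so response latency carries no information. */
  void respond_at_deadline(struct timespec start, long budget_ns)
  {
      struct timespec deadline = start;
      deadline.tv_nsec += budget_ns;
      deadline.tv_sec  += deadline.tv_nsec / 1000000000;
      deadline.tv_nsec %= 1000000000;
      clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, NULL);
      send_reply();   /* hypothetical */
  }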

--
Waldek Hebisch

Anton Ertl

Sep 24, 2022, 1:04:33 PM
anti...@math.uni.wroc.pl writes:
>But I am not sure that eliminating all covert
>channels is feasible.

Unlikely.

>And when there are covert
>channels, then there is the possibility that an attacker can
>find appropriate gadgets in existing code and use them
>to sniff data from the system.

The classical approach is to write code dealing with secret keys and
the like in a way that avoids timing variations and avoids
key-dependent memory accesses, and therefore the timing and cache side
channels.
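
E.g., a constant-time comparison, the standard building block of that
approach (a minimal sketch):

  #include <stddef.h>

  /* Runs in time that depends only on n, not on where (or whether)
     the buffers differ, and makes no key-dependent memory accesses. */
  int ct_compare(const unsigned char *a, const unsigned char *b, size_t n)
  {
      unsigned char diff = 0;
      for (size_t i = 0; i < n; i++)
          diff |= a[i] ^ b[i];
      return diff != 0;   /* 0 if equal */
  }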

And then came Spectre, which allows the attacker to induce unknown
parts of the process to not only speculatively access the secret keys,
but also to extract them through various side channels. But if the
speculative side channels are closed (and that's possible), we don't
need to worry about that and can go back to the classical approach.

There is also Rowhammer, which also goes beyond the classical
approach, and which can (and should) also be fixed.

Michael S

Sep 24, 2022, 1:21:09 PM
What is a 'covert channel' ?
I think it's not the same as 'side-channel vulnerability' and I
think that the authors of the article do not use it in the meaning
of 'side-channel vulnerability'.
IMHO, what they call a 'covert channel' is something completely
unavoidable and completely non-dangerous, except when the system
in question is already heavily compromised in some other way.

Anton Ertl

Sep 25, 2022, 4:35:28 AM
Michael S <already...@yahoo.com> writes:
>What is a 'covert channel' ?
>I think it's not the same as 'side-channel vulnerability' and I
>think that the authors of the article do not use it in the meaning
>of 'side-channel vulnerability'.

Looking at <https://en.wikipedia.org/wiki/Covert_channel>, it seems to
me that it has more or less the same meaning as "side channel".
"Covert channel" seems to be the term more used in connection with
networking, "side channel" elsewhere. And looking at the Netspectre
paper again, it seems to me that they use these two terms
synonymously. E.g., looking at the first occurence of "side channel":

|By using a novel side channel based on the execution time of AVX2
|instructions

and the first occurrence of "covert channel":

|We present a novel high-performance AVX-based covert channel

They obviously refer to the same channel here.

My intuition would be that a covert channel is if the sender intends
to send the information (in a covert way), while it's a side channel
if the sending is an unintentional (and unwanted) side effect of
whatever the sender is doing. In the case of Netspectre the data
transfer is certainly unintended by the sender, however.

anti...@math.uni.wroc.pl

Sep 25, 2022, 10:56:00 AM
Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
> anti...@math.uni.wroc.pl writes:
> >But I am not sure that eliminating all covert
> >channels is feasible.
>
> Unlikely.
>
> >And when there are covert
> >channels, then there is the possibility that an attacker can
> >find appropriate gadgets in existing code and use them
> >to sniff data from the system.
>
> The classical approach is to write code dealing with secret keys and
> the like in a way that avoids timing variations and avoids
> key-dependent memory accesses, and therefore the timing and cache side
> channels.
>
> And then came Spectre, which allows the attacker to induce unknown
> parts of the process to not only speculatively access the secret keys,
> but also to extract them through various side channels. But if the
> speculative side channels are closed (and that's possible), we don't
> need to worry about that and can go back to the classical approach.

Well, Spectre is bad. But look at the gadget they have:

  if (x < limit) {
      if (array[x] < y) {
          ....
      }
  }

The first conditional is essentially a classic bounds check. I would
like bounds checks to be used more widely. For this it is natural
to demand that, in the case of correct branch prediction, the latency of
code without the bounds check should be the same as the latency of code
containing the bounds check. Which means that the inner 'if' should
execute in parallel with the outer 'if'. This is likely to lead to some
period of time where the inner 'if' can produce microarchitectural side
effects before the misprediction is detected.

Of course, given a specific problem one can invent a solution.
In the case of classic Spectre one could limit cache replacement
to "architectural" cases only. But already this is tricky,
as this would mean waiting for several instructions to retire.
One could have some ephemeral state which is propagated to
a more persistent form only when the instructions causing the change
are retired. But such ephemeral state is likely to cause its
own troubles. For example, extra temporary buffers are
subject to contention, and that may cause delays dependent
on data accessed in a speculative way.

I think that one aspect is _very_ damaging. Namely, one could
hope to put programs into separate security domains and hope
that a program leaking info to itself is harmless. But
unfortunately it is not.

Let me add that my interest is in computational tasks for
which I would like to have the best performance and for which
security concerns are almost non-existent. But IMO
Spectre implies that there are significant performance
costs for security. This leads to the question of whether we should
depend on security in high-performance contexts.

BTW: Architects of in-order machines may claim that they
are immune to Spectre. But if the machine is wide enough
there will be temptation to provide speculative execution
at the software level. While a static compiler is unlikely to
schedule an array fetch in parallel with a bounds check, due to
a possible page fault, I would expect some JITers to do such a
thing: the JITer runtime can suppress the effect of a page fault,
and if JITer statistics indicate an expected
time win the JITer could optimize. In the case of the AVX side
channel even a static compiler could decide that speculatively
executing an AVX instruction is harmless. That would
defeat the hardware defence that you proposed...

--
Waldek Hebisch

Anton Ertl

Sep 25, 2022, 1:00:13 PM
anti...@math.uni.wroc.pl writes:
>Well, Spectre is bad. But look at the gadget they have:
>
> if (x < limit) {
>     if (array[x] < y) {
>         ....
>     }
> }
>
>The first conditional is essentially a classic bounds check. I would
>like bounds checks to be used more widely. For this it is natural
>to demand that, in the case of correct branch prediction, the latency of
>code without the bounds check should be the same as the latency of code
>containing the bounds check. Which means that the inner 'if' should
>execute in parallel with the outer 'if'. This is likely to lead to some
>period of time where the inner 'if' can produce microarchitectural side
>effects before the misprediction is detected.
>
>Of course, given a specific problem one can invent a solution.
>In the case of classic Spectre one could limit cache replacement
>to "architectural" cases only. But already this is tricky,
>as this would mean waiting for several instructions to retire.
>One could have some ephemeral state which is propagated to
>a more persistent form only when the instructions causing the change
>are retired. But such ephemeral state is likely to cause its
>own troubles. For example, extra temporary buffers are
>subject to contention, and that may cause delays dependent
>on data accessed in a speculative way.

Please elaborate by working out a gadget that would have this problem.

I think that we don't have such problems, for the following reason:
Spectre attacks involve mispredicted branches. In the gadget above,
you only get a speculative out-of-bounds access if x is actually
outside the bounds, but predicted to be inside. So some cycles later
all of the speculative work is dumped[1], and any contention for the
ephemeral buffers vanishes along with all that. You have to be
careful about other resource contention, however, e.g., cache reads
from shared caches (and on a cache-coherent machine all caches are
shared); I have outlined some ideas for that problem.

[1] Or at least should be; Spectre attacks known to me make use of
microarchitectural state changes that are not dumped.

>I think that one aspect is _very_ damaging. Namely, one could
>hope to put programs into separate security domains and hope
>that a program leaking info to itself is harmless. But
>unfortunately it is not.

Of course not; that would mean that you would have to secure the whole
code, not just the small pieces that handle secret keys and such.

>But IMO
>Spectre implies that there are significant performance
>costs for security.

What makes you think so? I think it does not.

>BTW: Architects of in-order machines may claim that they
>are immune to Spectre. But if the machine is wide enough
>there will be temptation to provide speculative execution
>at the software level. While a static compiler is unlikely to
>schedule an array fetch in parallel with a bounds check, due to
>a possible page fault

IA-64 has a special architectural feature to move the load in the
gadget above up above the bounds check; and it can move the dependent
load (needed for Spectre v1 and v2) up above the bounds check, too.
So the combination of a compiler that performs these code motions and
IA-64 is vulnerable to Spectre.

And any other in-order architecture intended for high performance
execution of latency-limited code will have similar features for
compiler speculation.
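
In C-like pseudocode the transformation is roughly this (a sketch;
IA-64 expresses the deferred fault with ld.s and the later check with
chk.s):

  /* Before: the load is guarded by the bounds check. */
  if (x < limit)
      if (array[x] < y) { .... }

  /* After compiler speculation: the load is hoisted above the check
     with its fault deferred; a later check runs recovery code if the
     speculative load faulted. */
  t = array[x];            /* speculative load (ld.s), fault deferred */
  if (x < limit) {
      /* chk.s on t branches to recovery here if the load faulted */
      if (t < y) { .... }
  }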

Thomas Koenig

Sep 25, 2022, 3:15:11 PM
anti...@math.uni.wroc.pl <anti...@math.uni.wroc.pl> schrieb:

> Well, Spectre is bad. But look at the gadget they have:
>
> if (x < limit) {
>     if (array[x] < y) {
>         ....
>     }
> }
>
> The first conditional is essentially a classic bounds check. I would
> like bounds checks to be used more widely.

So would I.

However, I think the default action for a failed bounds check, a
program abort, should not prevent too much of a side channel.

Quadibloc

Sep 25, 2022, 3:31:35 PM
On Sunday, September 25, 2022 at 1:15:11 PM UTC-6, Thomas Koenig wrote:

> However, I think the default action for a failed bounds check, a
> program abort, should not prevent too much of a side channel.

I will assume that "prevent" was a typo for "present", and indeed this
answers the objection of Waldek Hebisch that wide in-order machines
could do speculative execution in software. If programs "speculatively"
access memory to which they do not have access rights, that is a
*real* access outside of bounds, and thus it won't be ignored or
glossed over, it will cause the program to be aborted.

John Savard

Anton Ertl

Sep 25, 2022, 5:18:01 PM
Quadibloc <jsa...@ecn.ab.ca> writes:
>On Sunday, September 25, 2022 at 1:15:11 PM UTC-6, Thomas Koenig wrote:
>
>> However, I think the default action for a failed bounds check, a
>> program abort, should not prevent too much of a side channel.
>
>I will assume that "prevent" was a typo for "present",

Ah, that makes more sense.

It seems that Thomas Koenig is considering a user-level program that
exits when the user inputs an out-of-bounds value. As long as the
attacker has the possibility to call the program again, this does not
stop the attack. Ok, the attacker now has to live with the overhead
of process creation and teardown for every probe, but that just
reduces the bandwidth.

>and indeed this
>answers the objection of Waldek Hebisch that wide in-order machines
>could do speculative execution in software. If programs "speculatively"
>access memory to which they do not have access rights, that is a
>*real* access outside of bounds, and thus it won't be ignored or
>glossed over, it will cause the program to be aborted.

CPUs with the Meltdown bug will actually access memory that's
MMU-protected but mapped (e.g., kernel pages from a user-level program
before Meltdown mitigations).

Spectre is about a different kind of access: It's about memory that
can be accessed as far as the CPU is concerned, but that the program
does not access architecturally (Burroughs B5500-style protection).

BGB

Sep 25, 2022, 5:29:15 PM
Yeah. A compiler may not safely do these types of optimizations,
specifically because they would lead to side effects.

Realistically, the "cleverness" of a compiler optimization needs to be
bounded to what it can do without risking unintended changes to program
semantics, and something like this would fail this test in a pretty
major way.

This is where humans differ, as they are better able to determine which
types of semantic changes are acceptable.


Some types of optimizations fall into a gray area, such as those based
on strict-aliasing / type-based alias analysis. Many compilers consider
these as acceptable, so will use them by default.

Some others will not use them. In BGBCC, they are disabled by default,
and are explicitly opt-in. I have turned them on for some of my test
programs, mostly because by the time one has the program working
correctly in GCC or Clang, this part of the work is basically already done.


Then, there are still some longer term unresolved issues:
One of the demos in Doom desyncs differently on BJX2 than it does on
x86 (with MSVC, GCC, and Clang), but does match the behavior seen with
GCC on 32-bit ARM (in particular, an Imp and a Baron seem to
behave in slightly different ways in that demo).

Note that the demos still technically desync in all cases, just in
"slightly different ways". Apparently, for things like ZDoom, it uses
stuff to detect specific WADs from specific Doom versions and similar,
and then enables/disables stuff to try to match the behavior of a
specific engine version, and also does stuff to fake memory contents for
some originally out-of-bounds memory accesses, etc. My Doom port doesn't
really do any of this.


Likewise, ROTT seems to have some more significant issues:
MSVC, GCC, and Clang all seem to play the demos without desync;
The demos desync rapidly on BJX2, for reasons I can't seem to identify,
though it does seem to be timing sensitive, and the demo may diverge in
different ways every time the demo is played.

In both cases, I have not found any code-generation options which change
the observed behaviors (this would make it easier to track down), nor
have I been able to locate any code which "obviously misbehaves".


In the Heretic port, everything is consistent between targets, though
demos play correctly for the Shareware WAD, but are prone to desync for
the release WAD.

For Hexen, demos seem to playback without any desync issues.

For Quake, demos work in an entirely different way (it is a log of
messages sent from the server to the client, rather than a recording of
the player's keyboard inputs), so desync is no longer an issue.



Though, I can note that there are cases where x86 and BJX2 will produce
different behaviors, such as shifts where a variable shift amount is
negative:
x86 masks the shift count (modulo the operand width), so a negative
amount acts as a large positive one;
BJX2 will shift in the other direction.
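
A minimal illustration (in C a negative shift count is undefined
behavior, so each target simply exposes what its shifter does):

  int x = 1, n = -1;
  int r = x << n;   /* x86: count masked mod 32, acts as x << 31;
                       BJX2 (per the above): shifts the other way,
                       i.e. x >> 1 */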

Though, it is possible that edge cases like this could also be related
to the demo desync behavior. Early on, I mostly ignored desync, but at
this point it serves as an indication for whether or not there is a
behavioral divergence somewhere.


Did recently add a compiler check to BGBCC to detect use of
uninitialized local variables (after such a variable went and started
causing something to crash), and went and cleaned up a bunch of these in
TestKern and similar.

No obvious change in these areas, but did fix up some stuff that was
likely to cause bugs in other areas (and a few "potentially significant"
ones related to memory-management and the filesystem and similar).

The check isn't perfect, as it may miss cases where a variable is
assigned in one branch but not in another, or some convoluted use of
"goto" which could lead to false positives.

But, mostly works (if I had a mechanism to reliably determine this, this
would also allow for potentially "more clever" register allocation).


Though, in any case, it would appear that the compiler does statically
assign most of the variables which are eligible for static assignment
(and most of those that remain are effectively ineligible).

Generally, it appears that non-leaf functions also have a higher
proportion of ineligible variables, so full-static assignment appears to
be common in leaf functions, but rare in non-leaf functions (averages:
statically_assigned=5.3, eligible=6.5, total_vars=17.2, *).

*: The total also includes constants, global variables, etc. Global
variables are ineligible for static-assignment in non-leaf functions, as
this would require some way of proving that a called function does not
change the global variable in some way (or could potentially call
another function which could potentially change the variable, ...).

Constants are a special case, only considered "eligible" if the function
could otherwise go full-static, otherwise trying to statically assign
them to a register makes things worse than if they are left to being
dynamically assigned (partly as the code in question isn't really smart
enough to figure out which constants will be handled as immediate
values, and it is a waste to burn a register on a constant which is only
ever used as an immediate).

But, OTOH, I guess "most of the local variables and similar are held in
registers in most cases" is still a "net win".

And, also, possibly one could argue against people writing code where
significant amounts of program state is held in global variables (my own
coding styles mostly avoid this, but things like Doom and similar still
use a lot of global variables).


Could potentially improve some cases by "gluing" some small blocks
together into larger blocks in cases where unconditional branches are
used, but this would have a detrimental effect on code density.


Or, in some ways, compiler development is kind of a pain...

..



anti...@math.uni.wroc.pl

Sep 25, 2022, 7:41:24 PM
The context was a network server or OS kernel. The kernel does not have
the option of exiting... More generally, in a network context leaving a
request without an answer is considered bad behaviour. Servers do
this if they think that they are under attack. But things can
go out of bounds due to silly errors, and the server normally
produces some error message. This gives enough space for a side
channel.

--
Waldek Hebisch

anti...@math.uni.wroc.pl

Sep 25, 2022, 7:44:04 PM
Well, the program has full right to access this memory, it just does
not want to. The context is a network server: it has secret info which
it does not want to disclose to the client.

--
Waldek Hebisch

anti...@math.uni.wroc.pl

Sep 25, 2022, 8:30:29 PM
One problem is "some cycles". If "some cycles" depends on speculative
state, then measuring the delay caused by "some cycles" leaks data. To
make this delay independent of ephemeral state looks tricky, unless
you make this delay much longer than it is now.

Another potential problem is what exactly gets dumped? Is ephemeral
state belonging to the valid path preserved? If yes, it can speed
up further execution in ways which depend on the dumped state. If
no, you can significantly increase the cost of mispredicted branches.

Third, in the time between starting the speculative path and dumping
the speculative state, some instructions may retire. Their
execution time (and possibility to retire) may depend on the
availability of ephemeral state, so may leak info due to
contention.

Of course, one could try to overprovision the CPU so that no
contention for ephemeral resources is possible. At least
in the case of cache buffers this looks quite costly to me.

--
Waldek Hebisch

Thomas Koenig

Sep 26, 2022, 1:52:27 AM
anti...@math.uni.wroc.pl <anti...@math.uni.wroc.pl> schrieb:
In that case, doing something that will take longer might work,
such as calling sched_yield() or equivalent. High performance
on error responses should not play a role.

Depending on the context, this might lead to a DoS attack, though,
so this should not be done blindly.

Andy Valencia

Sep 26, 2022, 9:25:26 AM
> (Waldek Hebisch writes:)
> ...
> Let me add that my interest is in computational tasks for
> which I would like to have the best performance and for which
> security concerns are almost non-existent. But IMO
> Spectre implies that there are significant performance
> costs for security. This leads to the question of whether we should
> depend on security in high-performance contexts.

Well put! What would such a bifurcation of the computing market look like?

Andy Valencia
Home page: https://www.vsta.org/andy/
To contact me: https://www.vsta.org/contact/andy.html

John Dallman

Sep 26, 2022, 9:58:28 AM
In article <166419859003.8563....@media.vsta.org>,
van...@vsta.org (Andy Valencia) wrote:

> > (Waldek Hebisch writes:)
> > Spectre implies that there are significant performance
> > costs for security. This leads to the question of whether we should
> > depend on security in high-performance contexts.
> Well put! What would such a bifurcation of the computing market
> look like?

That's an interesting question. The HPC world is already a thing of its
own. But there's a large category of work where performance is important
that runs on the same kinds of computers as office automation.

It uses beefier processors, larger memories, better graphics cards and so
on, but it benefits from being on the same OSes as office work. Companies
really like engineers being able to use MS Office on the same machines as
they use for CAD, EDA and so on.

John

Thomas Koenig

Sep 26, 2022, 12:21:06 PM
John Dallman <j...@cix.co.uk> schrieb:
That is exactly the case with my compute workstation. I would
dearly like to run it under Linux, but it is running Windows.
Easier to administrate, for once.

And as for HPC - people who disparaged JCL should take a look
at slurm.

Anton Ertl

Sep 26, 2022, 4:06:54 PM
It does not depend on the value of array[x], so it does not leak that.
At least that's my expectation, based on the fact that resolving the
branch prediction for "if (x<limit)" does not depend on array[x], and
the branch retires before array[x] and anything that depends on array[x]
(which are not retired at all if they are in a misspeculation).

>To
>make this delay independent of ephemeral state looks tricky, unless
>you make this delay much longer than it is now.

I don't see that it's tricky at all to make it independent of the
state that would be attacked with this gadget. If you think there is
a gadget that speculatively reveals something secret through the
length of the misprediction penalty, please present it. It seems to
me that the only thing revealed is latencies of architectural
accesses.

>Another potential problem is what exactly gets dumped? Is ephemeral
>state belonging to the valid path preserved?

Speculative results are either retired to architectural state if the
instructions are executed architecturally, or they are canceled/dumped
if the instruction was misspeculated.

>If yes, it can speed
>up further execution in ways which depend on the dumped state.

Please elaborate. Normal OoO does not do this. The speculative state
of a misspeculated instruction is not used in further execution of the
correct path. The only issue here is the Spectre problem where
microarchitectural state is not kept speculative until retirement, but
made permanent right away. But a Spectre-proof CPU will not do that.

There have been proposals for value prediction, where a supposed
advantage would be that the correct path after a mispredicted path
might benefit from getting some of the results earlier, but AFAIK that
never has become reality, and Spectre-proof hardware would not go
there.

>If
>no, you can significantly increase the cost of mispredicted branches.

What makes you think so?

>Third, in the time between starting the speculative path and dumping
>the speculative state, some instructions may retire. Their
>execution time (and possibility to retire) may depend on the
>availability of ephemeral state, so may leak info due to
>contention.

Instructions are issued in-order and retired in-order, and in case of
resource contention, they better give priority to the older
instruction to avoid hardware deadlocks. In any case, a Spectre-proof
CPU will give priority to the older instruction, so if a retired
(i.e., architectural) instruction sees resource contention, it sees
only resource contention with an older instruction, which is also
architectural.

The only issue is multi-core CPUs with speculative accesses to shared
caches and such, and I have discussed solutions to that earlier.

Quadibloc

Sep 26, 2022, 5:07:50 PM
On Sunday, September 25, 2022 at 5:41:24 PM UTC-6, anti...@math.uni.wroc.pl wrote:

> The context was a network server or OS kernel. The kernel does not have
> the option of exiting.

In that case, I am really confused. If the kernel-mode code is malicious, it does not
need to exploit any bugs. It already has what it wants, full control of the computer.

John Savard

EricP

Sep 27, 2022, 12:44:02 PM
It seems to me that the situations where Spectre could matter occur
when those with responsibility for hardware have no authority or
control over the selection of or details of what software executes.
E.g. timeshare (cloud), downloaded binary apps on desktop or mobile.
Blocking Spectre and friends for one applies to all.

Rather than trying to fix Spectre for everyone equally all the time,
perhaps HW & SW can separate off these untrusted binaries.

The performance hit for Spectre blocking could be addressed not by
making all processes pay, but by only making "untrusted" ones pay.
The enable/disable of Spectre blocking mechanism can't depend on compiler
constructs like Retpoline because that assumes we have source code access.
It also assumes one can identify source code points which might be
vulnerable, which is probably unrealistic beyond toy programs.

This would focus the hardware changes in two areas:
a) eliminating things that never should have leaked cross-domain info
in the first place AND can be isolated relatively easily.
E.g. Marking branch predictor table entries with Super/User and
HW thread T0/T1 flags eliminating cross domain info access.

b) all other more expensive HW changes have an OS controlled
enable/disable that stalls certain operations while in an
unresolved branch shadow or purges caches on thread switch.

On process create, the parent process sets a flag indicating to the
OS that a child is trusted/untrusted WRT Spectre et al.
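
A sketch of what that could look like at the API level (hypothetical
names; no real OS call is implied):

  /* Parent marks the child untrusted; the OS then enables the
     expensive Spectre mitigations for that process only. */
  pid_t pid = spawn_with_flags(path, argv, SPAWN_UNTRUSTED_SPECTRE);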

Timeshare companies could then have two pricing models, one where
customer code is trusted, perhaps requiring inspection, or not.
Individuals can decide for themselves whose source or binaries they trust.


Andy Valencia

Sep 27, 2022, 2:31:57 PM
EricP <ThatWould...@thevillage.com> writes:
> ...
> The performance hit for Spectre blocking could be addressed not by
> making all processes pay, but by only making "untrusted" ones pay.

I was pondering this just this morning, wondering about something along the
lines of big.LITTLE, but in this case a strict in-order CPU to run the
untrusted things--closed source binaries, web browsers, and such. There's a
big hit in performance from losing OoO and any other speculation, but if it's
on a CPU with its design focused here, there's probably a lot of design
streamlining to offset some of that. 2x slower when the dust settles? I'd
live with that.

Otherwise it's time for a compute brick hanging off your machine. But then
you're back to hoping the endless compromises via architectural probing
on the controlling system don't cost you too dearly. It doesn't currently
feel like an arms race we're winning.

MitchAlsup

Sep 27, 2022, 6:02:58 PM
On Saturday, September 24, 2022 at 12:04:33 PM UTC-5, Anton Ertl wrote:
> anti...@math.uni.wroc.pl writes:
> >But I am not sure that eliminating all covert
> >channels is feasible.
> Unlikely.
> >And when there are covert
> >channels, then there is the possibility that an attacker can
> >find appropriate gadgets in existing code and use them
> >to sniff data from the system.
> The classical approach is to write code dealing with secret keys and
> the like in a way that avoids timing variations and avoids
> key-dependent memory accesses, and therefore the timing and cache side
> channels.
>
> And then came Spectre, which allows the attacker to induce unknown
> parts of the process to not only speculatively access the secret keys,
> but also to extract them through various side channels. But if the
> speculative side channels are closed (and that's possible), we don't
> need to worry about that and can go back to the classical approach.
<
Neither My 661x0 nor MILLs are sensitive to Spectré.
>
> There is also Rowhammer, which also goes beyond the classical
> approach, and which can (and should) also be fixed.
<
Rowhammer is easily fixed. What you need is a write buffer between
data-to-be-written and data-going-out-over-DRAM-pins; {An L3 cache is
just perfect, BTW.} Then as the line is modified and then pushed out of
CPU caches, it lands in the L3, and is available to be written. Any demand
request certainly sees the last data written, and overwriting of the data
updates the data in L3; the write remains pending until, at some point,
the DIMM[j].ROW[k].Column[n] is available to write, then the data from the
L3 is written into DRAM and its status updated in L3.
<
You can touch, modify, evict the line BILLIONs of times, and it gets written
once or twice in DRAM, and the subsequent address submitted to DRAM
is not the same as the address being RowHammered. Q.E.D.
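
In effect the L3 acts as a coalescing write buffer. As a software
analogy (a sketch with hypothetical types, not the actual hardware):

  #include <stdbool.h>
  #include <stdint.h>
  #include <string.h>

  struct pending { uint64_t line; uint8_t data[64]; bool dirty; };

  /* Every eviction of the same line overwrites the one pending entry;
     only a periodic drain performs a real DRAM write, so a line
     evicted billions of times reaches its DRAM row once or twice. */
  void evict_to_l3(struct pending *e, uint64_t line, const uint8_t *d)
  {
      e->line = line;
      memcpy(e->data, d, 64);
      e->dirty = true;          /* still exactly one pending write */
  }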

MitchAlsup

Sep 27, 2022, 6:14:54 PM
Yes, it is sad that the code used to protect that resource is the
very code that enables the attack vector.
<
> For this it is natural
> to demand that, in the case of correct branch prediction, the latency of
> code without the bounds check should be the same as the latency of code
> containing the bounds check. Which means that the inner 'if' should
> execute in parallel with the outer 'if'. This is likely to lead to some
> period of time where the inner 'if' can produce microarchitectural side
> effects before the misprediction is detected.
<
So the pipeline designer has to prevent that microarchitectural state
from ever becoming visible long enough to be "seen" with a high
precision timer.
>
> Of course, given a specific problem one can invent a solution.
> In the case of classic Spectre one could limit cache replacement
> to "architectural" cases only. But already this is tricky,
<
Having done a design that did exactly this, I can say it is not
tricky, it is straightforward--however it does tie up some resources
longer which may lead to minor losses.
<
> as this would mean waiting for several instructions to retire.
<
As long as that latent state can be "snooped" by the pipeline before
it gets retired into the cache, very little harm is done. So, not just
"critical word first" but "whole line" while it is in the buffer.
Similarly for the TLB, too.
<
> One could have some ephemeral state which is propagated to
> a more persistent form only when the instructions causing the change
> are retired. But such ephemeral state is likely to cause its
> own troubles. For example, extra temporary buffers are
> subject to contention, and that may cause delays dependent
> on data accessed in a speculative way.
<
The temporary buffers I am using are typically called "Miss Buffers".
And even rather puny implementations have 6-8 of them.
>
> I think that one aspect is _very_ damaging. Namely, one could
> hope to put programs into separate security domains and hope
> that a program leaking info to itself is harmless. But
> unfortunately it is not.
>
> Let me add that my interest is in computational tasks for
> which I would like to have the best performance and for which
> security concerns are almost non-existent. But IMO
> Spectre implies that there are significant performance
> costs for security. This leads to the question of whether we should
> depend on security in high-performance contexts.
<
Correction: avoiding Spectré attack vectors on current machines
has a significant negative impact on performance.
However, it is possible to build architectures and implementations
thereof which are not sensitive to Spectré. The bad part is that neither
of the known implementations is ready to be purchased.
>
> BTW: Architects of in-order machines may claim that they
> are immune to Spectre. But if the machine is wide enough
> there will be temptation to provide speculative execution
> at the software level. While a static compiler is unlikely to
> schedule an array fetch in parallel with a bounds check, due to
> a possible page fault, I would expect some JITers to do such a
> thing: the JITer runtime can suppress the effect of a page fault,
> and if JITer statistics indicate an expected
> time win the JITer could optimize. In the case of the AVX side
> channel even a static compiler could decide that speculatively
> executing an AVX instruction is harmless. That would
> defeat the hardware defence that you proposed...
>
AVX is NEVER harmless.
> --
> Waldek Hebisch

MitchAlsup

Sep 27, 2022, 6:21:07 PM
xterm--open a shell on the remote big server farm and copy and paste the data
into eXcel to make your graphs.
>
> John

MitchAlsup

Sep 27, 2022, 6:31:47 PM
On Monday, September 26, 2022 at 3:06:54 PM UTC-5, Anton Ertl wrote:
> anti...@math.uni.wroc.pl writes:
> >Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
> >> anti...@math.uni.wroc.pl writes:
> >> >Well, Spectre is bad. But look at the gadget they have:
> >> >
> >> > if (x < limit) {
> >> >     if (array[x] < y) {
> >> >         ....
> >> >     }
> >> > }
<snip>
> >
> >One problem is "some cycles". If "some cycles" depends on speculative
> >state, then measuring the delay caused by "some cycles" leaks data.
> It does not depend on the value of array[x], so it does not leak that.
> At least that's my expectation, based on the fact that resolving the
> branch prediction for "if (x<limit)" does not depend on array[x], and
> the branch retires before array[x] and anything that depends on array[x]
> (which are not retired at all if they are in a misspeculation).
<
GBOoO machines can see that the access of array[x] has no dependencies
and can AGEN at the same time (or earlier than the CMP-BC). The LD is
still covered by the shadow of the prediction and cannot retire.
<
Meltdown is observed if one can determine that the access occurred micro-
architecturally but not architecturally. Intel x86s were more sensitive than
AMD x86s. The observation used the result of the first access to seed the
address of a second access, and then measure if the second access
brought a line into the cache (or not). We used to call this pipeline
strategy "flying blind" whereas it was we the designers who were blind
to the loopholes this opened up.
<
If/when CPU designers cause AGEN-access to fail when presented with
illegal data, Meltdown disappears. That is, you have to know the hit of AGEN[n]
before you allow the result of AGEN[n] to create an address for AGEN[n+1].
<
<
> >To
> >make this delay independent of ephemeral state looks tricky, unless
> >you make this delay much longer than it is now.
<
> I don't see that it's tricky at all to make it independent of the
> state that would be attacked with this gadget. If you think there is
> a gadget that speculatively reveals something secret through the
> length of the misprediction penalty, please present it. It seems to
> me that the only thing revealed is latencies of architectural
> accesses.
<
It is not tricky at all--but it may be area intensive (just a bit) and
will add some sequencing.
<
> >Another potential problem is what exactly gets dumped? Is ephemeral
> >state belonging to valid path preserved?
<
> Speculative results are either retired to architectural state if the
> instructions are executed architecturally, or they are canceled/dumped
and everything younger than them
> if the instruction was misspeculated.
It doesn't matter how the decision to back up was decided.

MitchAlsup

Sep 27, 2022, 6:32:42 PM
There are these things called Hypervisors which give the illusion that GuestOSs
have complete control of the computer.
>
> John Savard

MitchAlsup

Sep 27, 2022, 6:41:08 PM
Is a zero privilege hypervisor (worker) thread that happens to access
hypervisor memory User or Super ? Seems to me he has to access
pages which will be marked Super, but he otherwise has no need to
execute privileged instructions so why is he Super ?
>
> b) all other more expensive HW changes have an OS controlled
> enable/disable that stalls certain operations while in an
> unresolved branch shadow or purges caches on thread switch.
<
From what I have seen of Spectré, leaving the predictor(s) in one state
is just about as dangerous as flipping it to another state. Perhaps the
root of the problem is the predictors ? Could we migrate predictors
with the threads for which they have been trained ? No--that likely takes
too much time.
>
> On process create the parent process sets a flag indicating to
> OS that a child is trusted/untrusted WRT Spectre et al.
>
> Timeshare companies could then have two pricing models, one where
> customer code is trusted, perhaps requiring inspection, or not.
<
Compiled from a language not in the JIT category on a trusted compiler.
JITs from a trusted local library can run trusted, JITs from everywhere else
are untrusted.
<
> Individuals can decide for themselves whose source or binaries they trust.
<
s/Individuals/trained security personnel/
s/themselves/users of their system/

Quadibloc

Sep 27, 2022, 10:47:23 PM
It is true that "social engineering" is a thing.

However, one thing at a time. *If* it is possible for the user of a computer
to specify that code coming from certain sources is untrusted, and the
operating system will genuinely enforce that specification... _then_ it
is also possible to bring in security people to improve on the individual
user's judgment.

Some computers are used at home, others are used in places of work.
Both run Windows, although usually different SKUs. If the operating
system is porous as all get-out, the corporate security department will
not be able to save the situation.

John Savard

Quadibloc

Sep 27, 2022, 10:48:02 PM
Oh, dear. I did overlook that possible situation.

John Savard

Quadibloc

Sep 27, 2022, 11:01:42 PM
On Tuesday, September 27, 2022 at 10:44:02 AM UTC-6, EricP wrote:

> The performance hit for Spectre blocking could be addressed not by
> making all processes pay, but by only making "untrusted" ones pay.

This is the approach I advocate - for the simple reason that the performance
costs of mitigating Spectre are high, and people desire the maximum
possible performance from their computer systems.

But just because such an approach is _desirable_ doesn't mean it is
feasible; am I engaging in wishful thinking?

Based on certain naive assumptions, it _would_ be feasible.

If I assume:

- Spectre and its cousins are the only serious security threat facing
computers today; it's possible to make operating systems fully
secure against every other potential vulnerability, and

- Supply-chain attacks and the like are so uncommon we don't
really have to worry about them; the only real danger to computers
is where, in an E-mail or on a web site, there lurks malicious code
that users will unknowingly execute because it appears that what
they're doing is reading a legitimate E-mail, and using the functionality
of a reputable web site.

Then the model of a computer that runs all the *Internet* stuff on
a 486-type computer with no Spectre problems, but runs your
games and your spreadsheet on the real screaming-fast OoO
processor without a care in the world... *would* be secure.

Of course, my two assumptions can, quite easily, be shown not
only to be wrong, but ludicrously wrong.

Despite that, I still think that the kind of computer I'm suggesting
is a *good first step* because it would prevent about 95% of existing
computer compromises. But, yes, I realize it can't be used all by
itself; once this kind of computer became the "standard", hackers would
shift to making use of the opportunities that remained.

John Savard

Thomas Koenig

Sep 28, 2022, 3:52:25 AM
MitchAlsup <Mitch...@aol.com> schrieb:

> From what I have seen of Spectré, leaving the predictor(s) in one state
> is just about as dangerous as flipping it to another state. Perhaps the
> root of the problem is the predictors ? Could we migrate predictors
> with the threads for which they have been trained ? No--that likely takes
> too much time.

How big are today's predictors, how many bits would have to
be stored?

In looking at such a feature, there is also the question of how
fine-grained such a migration would be - kernel vs user,
userid, process level, thread level, per executable program
...

Apart from the security aspects, there is also a potential performance
upside by not having other processes/threads stomp on the information
gathered.

Being able to manipulate predictors could also have other benefits:
a hardware-supported form of profile-guided optimization on the one hand,
and being able to flush all predictors for the current thread for security-
relevant code on the other.

The different variants would really need a cost-benefit analysis,
for which I lack the basic information :-)

Terje Mathisen

Sep 28, 2022, 4:54:52 AM
This all supposes that the attacker isn't using instructions that force
cache eviction, right?

I.e. most LOCK prefix'ed operations, as well as user-level cache clear
ops would suffice to generate actual RAM updates afaik?

Terje


--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

MitchAlsup

Sep 28, 2022, 12:25:31 PM
On Wednesday, September 28, 2022 at 3:54:52 AM UTC-5, Terje Mathisen wrote:
> MitchAlsup wrote:
> > On Saturday, September 24, 2022 at 12:04:33 PM UTC-5, Anton Ertl wrote:
> >> There is also Rowhammer, which also goes beyond the classical
> >> approach, and which can (and should) also be fixed.
> > <
> > Rowhammer is easily fixed. What you need is a write buffer between
> > data-to-be-written and data-going-out-over-DRAM-pins; {An L3 cache is
> > just perfect, BTW.} Then as the line is modified and then pushed out of
> > CPU caches, it lands in the L3, and is available to be written. Any demand
> > request certainly sees the last data written, and overwriting of the data
> > updates the data in L3; the write remains pending until, at some point,
> > the DIMM[j].ROW[k].Column[n] is available to write, then the data from the
> > L3 is written into DRAM and its status updated in L3.
> > <
> > You can touch, modify, evict the line BILLIONs of times, and it gets written
> > once or twice in DRAM, and the subsequent address submitted to DRAM
> > is not the same as the address being RowHammered. Q.E.D.
<
> This all supposes that the attacker isn't using instructions that force
> cache eviction, right?
<
The explanation above used user-level-cache-eviction.
My solution to the problem is (IS) independent of user-level-cache-eviction.
In effect, I changed the definition of cache eviction not-to-DRAM but
close enough that nobody could tell if it was in DRAM and also in a cache
near DRAM.
>
> I.e. most LOCK prefix'ed operations, as well as user-level cache clear
> ops would suffice to generate actual RAM updates afaik?
<
The DRAM is not actually updated instantaneously, or necessarily, but
every requestor in the system will see it in its current (last written) state.
DRAM is updated in a time period commensurate with the probability of power
failure, and the danger and damages of such.
This time period, and the number of DRAM updates performed are sufficient
to prevent hammering of a single word line in the DRAM.

Robert Swindells

Sep 28, 2022, 1:24:25 PM
On Tue, 27 Sep 2022 15:02:57 -0700 (PDT), MitchAlsup wrote:

> Neither My 661x0 nor MILLs are sensitive to Spectré.

What is Spectré? Is it similar to Spectre or something completely
different?

Or are you trying to prevent your posts getting found by someone searching
for Spectre?

MitchAlsup

Sep 28, 2022, 1:30:05 PM
No, the contrapositive:: I can find my posts by searching.

Quadibloc

Sep 28, 2022, 8:00:48 PM
On Wednesday, September 28, 2022 at 11:24:25 AM UTC-6, Robert Swindells wrote:

> What is Spectré? Is it similar to Spectre or something completely
> different?

Well, it would be pronounced "spectray", but the correct French spelling of Spectre
does not include any accents.

John Savard

Terje Mathisen

Sep 29, 2022, 6:20:40 AM
Is that even possible in a multi-CPU environment, or are those simply
not important anymore?

I'm guessing a sufficiently large $L3 (last-level cache) can make it
impossible to rowhammer a bunch of lines over a very short time period,
i.e. writing to single bytes (or words) in a large number of cache
lines, in which case "adding another layer of indirection" is once again
the solution. :-)

Having RAM chips which simply aren't vulnerable to any possible access
pattern, because they either have a sufficiently short refresh interval, or
adaptive refresh based on the number of writes, would make me feel a bit
safer though.

Quadibloc

Sep 29, 2022, 11:18:40 AM
On Thursday, September 29, 2022 at 4:20:40 AM UTC-6, Terje Mathisen wrote:

> Having RAM chips which simply aren't vulnerable to any possible access
> pattern, because they either have a sufficiently short refresh interval, or
> adaptive refresh based on the number of writes, would make me feel a bit
> safer though.

Very definitely. Since Rowhammer exists, the question is why any other
kind of RAM chip is still being manufactured. Unless, of course, this would
reduce performance and/or increase cost - because *no* costs are acceptable
for security.

Instead, just make it clear that the penalty for trying to hack a computer is
instant death! Then the problem is solved at zero cost!

Except the problem is that there are such things as nation-state attacks,
*and* some of the countries responsible, like Russia and China, have
nuclear weapons, so regime change is not available.

So this explains why we need to go to the trouble and expense of making
our computers secure.

And it seems to me that while the cost of protecting against Spectre is
too high to impose on trusted code, the cost of protecting against
Rowhammer seems moderate enough that it can be tolerated, and a way
to turn it off for trusted code isn't needed. Although adaptive refresh,
instead of simply a higher refresh rate, would turn it off for well-behaved
code.

If a DRAM memory spent more of its time refreshing, that would impose
a speed penalty; but if, instead, it was simply divided into smaller parts...
no, that would simply be a _way_ to make it possible for a higher proportion
of the time to be spent refreshing. But even refreshing once every 16 cycles
would be tolerable, which is a very high refresh rate.

John Savard

MitchAlsup

Sep 29, 2022, 11:39:30 AM
The strategy works when CPUs > 0 and when CPUs >=1
>
> I'm guessing a sufficiently large $L3 (last-level cache) can make it
> impossible to rowhammer a bunch of lines over a very short time period,
<
That is the basic strategy

MitchAlsup

Sep 29, 2022, 11:50:48 AM
On Thursday, September 29, 2022 at 10:18:40 AM UTC-5, Quadibloc wrote:
> On Thursday, September 29, 2022 at 4:20:40 AM UTC-6, Terje Mathisen wrote:
>
> > Having RAM chips which simply aren't vulnerable to any possible access
> > pattern, because they either have a sufficiently short refresh interval, or
> > adaptive refresh based on the number of writes, would make me feel a bit
> > safer though.
<
> Very definitely. Since Rowhammer exists, the question is why any other
> kind of RAM chip is still being manufactured. Unless, of course, this would
> reduce performance and/or increase cost - because *no* costs are acceptable
> for security.
<
DRAM manufacturers are constrained to produce parts that can be soldered down
or put onto DIMMs {typically DDR[k]--both of which have tight specifications,
agreed to in national standards.} They don't want to change--so it is up to chip
designers to fix RowHammer.
>
> Instead, just make it clear that the penalty for trying to hack a computer is
> instant death! Then the problem is solved at zero cost!
>
> Except the problem is that there are such things as nation-state attacks,
> *and* some of the countries responsible, like Russia and China, have
> nuclear weapons, so regime change is not available.
<
This problem would also go away if your previous paragraph was active.
>
> So this explains why we need to go to the trouble and expense of making
> our computers secure.
<
I can see a case where the NSA would want to make a computer LESS
secure so they could give away things that are untrue without the receiving
party getting suspicious.
>
> And it seems to me that while the cost of protecting against Spectre is
> too high to impose on trusted code, the cost of protecting against
> Rowhammer seems moderate enough that it can be tolerated, and a way
> to turn it off for trusted code isn't needed. Although adaptive refresh,
> instead of simply a higher refresh rate, would turn it off for well-behaved
> code.
<
I don't see Spectré as being hard to eliminate.*
I see RowHammer as easy to eliminate.
<
(*) That Intel, AMD, ARM, RISC-V have failed does not mean it is impossible.
>
> If a DRAM memory spent more of its time refreshing, that would impose
> a speed penalty; but if, instead, it was simply divided into smaller parts...
> no, that would simply be a _way_ to make it possible for a higher proportion
> of the time to be spent refreshing. But even refreshing once every 16 cycles
> would be tolerable, which is a very high refresh rate.
<
There are those national standards you are pushing against. Standards to
which all DRAM manufacturers have agreed.
>
> John Savard

EricP

Sep 29, 2022, 12:15:30 PM