Is there any possibility of adding condition codes?

Leway Colin

Dec 23, 2021, 9:22:22 PM

to RISC-V ISA Dev

The RISC-V Instruction Set Manual explains that the conditional branches were designed to include arithmetic comparison operations between two registers, rather than to use condition codes.

Is there any possibility of adding condition codes in the future?

Bruce Hoult

Dec 24, 2021, 1:49:28 AM

to Leway Colin, RISC-V ISA Dev

Never say never, but you'd have to provide very strong proof that there was a measurable and significant advantage in performance, performance/Joule, or performance/mm^2 on a general body of code (such as SPEC, or a web browser). On the contrary, both x86 and ARM vendors go to considerable effort to do macro-op fusion, turning a compare and an adjacent branch into a single operation. Were the designers of their ISAs in the 1970s or 1980s correct, or do their current micro-architects know something more about current technology and typical software?

--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+u...@groups.riscv.org.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/54735f86-5cc5-4086-844e-85718451878bn%40groups.riscv.org.

Robert Finch

Dec 24, 2021, 6:52:15 AM

to RISC-V ISA Dev, Bruce Hoult, RISC-V ISA Dev, Leway Colin

It is possible to do many things using custom instructions, which is part of the value of the architecture: being able to adapt to particular circumstances. But I do not see condition codes being adopted on a large scale. Are these condition codes beyond the norm? If they are I/O state, they could be reached with load / store instructions to a dedicated address.

What I have wondered is whether it is possible to incorporate more arithmetic / logic operations in branch instructions. A couple I have used are branch-and and branch-or, in addition to the relational operations, to take care of cases like “if (a && b)” or “if (a || b)” and absorb them into a single instruction. This does require branch-nand and branch-nor, as the if conditions are inverted. I also miss branch-on-bit-set / clear instructions.

Jim Wilson

Dec 24, 2021, 1:56:29 PM

to Robert Finch, RISC-V ISA Dev, Bruce Hoult, Leway Colin

On Fri, Dec 24, 2021 at 3:52 AM Robert Finch <robf...@gmail.com> wrote:

> What I have wondered is if it is possible to incorporate more arithmetic / logic operations in branch instructions. A couple I have used is branch-and, and branch-or in addition to the relational operations to take care of cases like “if (a && b)” or “if (a || b)” and absorb them into a single instruction. This does require branch-nand and branch-nor as the if conditions are inverted. I also miss branch-on-bit-set / clear instructions.

Another thing that would be useful is add with carry, which works better if you have a carry flag. There are some important applications, like GMP, that really need a fast add with carry.
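
To make the cost concrete, here is a minimal C sketch of multi-limb addition on a machine with no carry flag. The name limbs_add and the 32-bit limb size are illustrative assumptions, not GMP's actual API. Each carry is recovered with an unsigned compare, which is the job SLTU does on RISC-V.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a GMP function): r = a + b over n little-endian
 * 32-bit limbs. Returns the carry out of the top limb (0 or 1). */
uint32_t limbs_add(uint32_t *r, const uint32_t *a, const uint32_t *b, size_t n)
{
    assert(r != NULL && a != NULL && b != NULL);
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t s  = a[i] + b[i];   /* wraps modulo 2^32 */
        uint32_t c1 = s < a[i];      /* carry out of a + b (an SLTU) */
        uint32_t t  = s + carry;     /* fold in the incoming carry */
        uint32_t c2 = t < s;         /* carry out of that add (another SLTU) */
        r[i] = t;
        carry = c1 | c2;             /* at most one of c1, c2 can be set */
    }
    return carry;
}
```

With an add-with-carry instruction, the two compares and the extra add per limb would collapse into one operation, which is presumably the saving that matters for this kind of inner loop.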

Jim

Markku-Juhani O. Saarinen

Dec 24, 2021, 5:29:32 PM

to Jim Wilson, Robert Finch, RISC-V ISA Dev, Bruce Hoult, Leway Colin

Hi All,

The carry flag issue is pretty much the first thing coders bump into when porting cryptographic middleware to RISC-V. Here are my personal thoughts on this often-visited issue.

From a cryptography/security viewpoint, the lack of a carry flag is good and bad.

(1) The lack of a carry (/borrow/overflow/underflow) conditional is mostly good. The use of carry-conditional branching based on input data generally leads to timing attacks, so cryptographers avoid it even on platforms that have such conditionals. I urge people to read up on related speculative execution security issues etc. This is why we have the Zkt data-independent latency extension now (not yet integrated into the main specs, but available in Chapter 5 of the Scalar Crypto spec: https://github.com/riscv/riscv-crypto/releases/download/v1.0.0-rc6-scalar/riscv-crypto-spec-scalar-1.0.0-rc6.pdf).

Learning to live with it, the recommendation has been to use Redundant Binary Representation (RBR) for cryptographic big integer arithmetic (RSA, ECC, and Isogeny cryptography). The lack of a carry bit is not necessarily such a great loss as it might first seem, as packed integer carry chains are not parallelizable. RBR matches how big integer arithmetic is done on vector architectures, which certainly can't support packed (non-RBR) big integers easily. Some notes: https://github.com/riscv/riscv-crypto/blob/master/doc/supp/rbr-arithmetic.adoc
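
As a rough illustration of the idea, the sketch below assumes RBR means a reduced-radix form in which each 32-bit word holds a 28-bit digit plus 4 spare bits. The digit width and the names rbr_add and rbr_normalize are my own choices for the example, not taken from the linked notes. Additions become independent per digit, and carries are propagated later in one pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RBR_BITS 28u                        /* assumed digit width */
#define RBR_MASK ((UINT32_C(1) << RBR_BITS) - 1u)

/* Digit-wise add: every lane is independent, so there is no carry chain.
 * The caller must normalize before the spare bits of a word overflow. */
void rbr_add(uint32_t *r, const uint32_t *a, const uint32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(a[i] <= UINT32_MAX - b[i]);  /* headroom check */
        r[i] = a[i] + b[i];
    }
}

/* Propagate the deferred carries so each digit is below 2^28 again.
 * Returns whatever carries out of the top digit. */
uint32_t rbr_normalize(uint32_t *x, size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        assert(x[i] <= UINT32_MAX - carry);
        uint32_t t = x[i] + carry;
        x[i] = t & RBR_MASK;                /* keep the low 28 bits */
        carry = t >> RBR_BITS;              /* push the rest upward */
    }
    return carry;
}
```

Several rbr_add calls can be issued back to back (or as vector lanes) before a single rbr_normalize, which is the property that makes the missing carry flag less painful in this setting.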

(2) However, there would be code density advantages to an add-with-carry instruction (which is a completely different thing from branching), especially on scalar/embedded RV32. There are large chunks of code that use 64-bit integers. 64-bit addition on RV32 is usually 3x ADD + 1x SLTU, which could potentially be replaced with 1x ADD + 1x "ADC".
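
The instruction count can be seen in a small C sketch of 64-bit addition built from 32-bit halves. The function name add64_rv32 and the array layout (index 0 = low word) are assumptions for illustration; the comments mark which RV32 instruction each line would typically correspond to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* r = a + b, where each operand is a 64-bit value stored as two 32-bit
 * words (index 0 = low word, index 1 = high word). */
void add64_rv32(uint32_t r[2], const uint32_t a[2], const uint32_t b[2])
{
    assert(r != NULL && a != NULL && b != NULL);
    uint32_t lo    = a[0] + b[0];     /* ADD                              */
    uint32_t carry = lo < a[0];       /* SLTU: 1 if the low add wrapped   */
    uint32_t hi    = a[1] + b[1];     /* ADD                              */
    r[0] = lo;
    r[1] = hi + carry;                /* ADD                              */
}
```

A hypothetical "ADC" would compute a[1] + b[1] + carry directly from a carry produced by the first ADD, replacing the SLTU and the last two ADDs.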

64-bit addition is important in prominent crypto algorithms such as SHA2-512 (but not SHA2-256!). Here's the SHA2-512 compression function on RV32 with the newly ratified scalar crypto extension (those _rv32_sha512sig../sum.. intrinsics) -- it still has a lot of SLTUs as it was assumed that we can't have "ADC". https://github.com/rvkrypto/rvkrypto-fips/blob/main/sha2/sha2_cf512_rvk32.c#L33

Note that SLTU is also used by 64-bit integer subtract and there's no C.SLTU in RV32C. So: RV32 CPU designers should be aware that compilers generate a lot of ADD/SUB/SLTU combos for cryptographic workloads.
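
For completeness, a matching sketch of 64-bit subtraction on 32-bit halves, where the borrow is again produced by an unsigned compare. The name sub64_rv32 and the word layout are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* r = a - b on 64-bit values stored as two 32-bit words (index 0 = low). */
void sub64_rv32(uint32_t r[2], const uint32_t a[2], const uint32_t b[2])
{
    assert(r != NULL && a != NULL && b != NULL);
    uint32_t borrow = a[0] < b[0];    /* SLTU: low word needs a borrow */
    uint32_t lo     = a[0] - b[0];    /* SUB */
    uint32_t hi     = a[1] - b[1];    /* SUB */
    r[0] = lo;
    r[1] = hi - borrow;               /* SUB */
}
```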

Cheers,

- Markku

Dr. Markku-Juhani O. Saarinen <mj...@pqshield.com> PQShield, Oxford UK.

Robert Finch

Dec 24, 2021, 5:57:20 PM

to Markku-Juhani O. Saarinen, Jim Wilson, RISC-V ISA Dev, Bruce Hoult, Leway Colin

>Another thing that would be useful is add with carry, which works better if you have a carry flag. There are some important applications, like GMP, that really need a fast add with carry.

Given the two inputs and the result, it is possible to compute what the carry is. One instruction I have added in the past computes the carry for an add or subtract (GAC, generate add carry). The carry can then be added in the next instruction using a three-input add instruction. So it saves an instruction over the 3*ADD + 1*SLTU sequence, requiring 2*ADD (one of them three-input) and a GAC. But the architecture must have support for 3R operations.
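
A possible software model of that sequence is sketched below. The function names gac32 and add64_gac are hypothetical, and the bit formula is one standard way to recover the carry out of the top bit from the operands and the wrapped sum; the actual GAC hardware may compute it differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a "generate add carry" operation: given both addends and the
 * wrapped 32-bit sum, return the carry out of bit 31 (0 or 1). */
uint32_t gac32(uint32_t a, uint32_t b, uint32_t sum)
{
    /* Carry out is 1 if both top bits are set, or if exactly one is set
     * and the sum's top bit is clear (meaning a carry came in and flipped it). */
    return ((a & b) | ((a | b) & ~sum)) >> 31;
}

/* 64-bit add from ADD + GAC + one three-input ADD (index 0 = low word). */
void add64_gac(uint32_t r[2], const uint32_t a[2], const uint32_t b[2])
{
    assert(r != NULL && a != NULL && b != NULL);
    uint32_t lo = a[0] + b[0];             /* ADD             */
    uint32_t c  = gac32(a[0], b[0], lo);   /* GAC             */
    uint32_t hi = a[1] + b[1] + c;         /* three-input ADD */
    r[0] = lo;
    r[1] = hi;
}
```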

MitchAlsup

Dec 28, 2021, 9:51:34 PM

to RISC-V ISA Dev, Robert Finch, jim.wil...@gmail.com, RISC-V ISA Dev, Bruce Hoult, Leway Colin, mj...@pqshield.com

As to condition codes:

The most useful is carry. Carry is the first step into multi-precision <integer> arithmetics. If you are going to do carry, do the entire multi-precision set of arithmetics.

During development, every ISA encounters the pressures to "just add a carry bit" even when the rest of the ISA is condition code free. I did way back in 1981 with M88K ISA and succumbed to the pressure. I suggest two (2) things here: a) do not succumb, b) do real multi-precision arithmetics.

In the My 66000 ISA (free for the asking), I faced the same pressures, but this time around I found a different and more powerful means around the problem, which also preserved the constant run time desired by crypto.

The solution is to have an instruction which "decorates" several following instructions with "extra" OpCode bits. I use this uniformly through the My 66000 ISA, and I invite RISC-V to follow--it has not been and will not be patented (at least by me).

What follows is a series of multi-precision arithmetics which demonstrates the power of using instruction-modifiers; and which I shall return to afterwards:

Multi-precision integer add:

    CARRY R16,{{I}{IO}{IO}{O}}
    ADD   R12,R4,R8     // Carry Out only
    ADD   R13,R5,R9     // Carry In and Out
    ADD   R14,R6,R10    // Carry In and Out
    ADD   R15,R7,R11    // Carry In only

Multi-precision integer multiply:

    CARRY R16,{{I}{IO}{IO}{O}}
    MUL   R12,R4,R8     // Carry Out only
    MUL   R13,R5,R9     // Carry In and Out
    MUL   R14,R6,R10    // Carry In and Out
    MUL   R15,R7,R11    // Carry In only
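
One way to read the multiply sequence in portable C is sketched here, scaled down to 32-bit limbs so that a 64-bit intermediate can hold the full product. The name limbs_mul_1 and the choice of a single multiplier word are assumptions for the example; the variable carry plays the role of R16, receiving the high half of each product and feeding it into the next step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* r = a * m over n little-endian 32-bit limbs.
 * Returns the final high word (what would be left in the carry register). */
uint32_t limbs_mul_1(uint32_t *r, const uint32_t *a, size_t n, uint32_t m)
{
    assert(r != NULL && a != NULL);
    uint32_t carry = 0;                            /* models R16 */
    for (size_t i = 0; i < n; i++) {
        /* (2^32-1)^2 + (2^32-1) < 2^64, so this cannot overflow. */
        uint64_t p = (uint64_t)a[i] * m + carry;   /* carry In  */
        r[i]  = (uint32_t)p;                       /* normal result */
        carry = (uint32_t)(p >> 32);               /* carry Out: high half */
    }
    return carry;
}
```

Unlike the add case, the value carried between steps is a full word rather than a single bit, which is the point of nominating a whole register.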

Multi-precision integer shift:

    CARRY R16,{{I}{IO}{IO}{O}}
    SL    R12,R4,R8     // Carry Out only
    SL    R13,R5,R9     // Carry In and Out
    SL    R14,R6,R10    // Carry In and Out
    SL    R15,R7,R11    // Carry In only
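
The shift case can be sketched the same way: the bits shifted out of one limb are carried into the bottom of the next. This is a scaled-down model with 32-bit limbs and a hypothetical name, limbs_shl, restricted to shift amounts between 1 and 31 to stay within defined C behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shift n little-endian 32-bit limbs left by s bits, with 0 < s < 32.
 * Returns the bits shifted out of the top limb. */
uint32_t limbs_shl(uint32_t *r, const uint32_t *a, size_t n, unsigned s)
{
    assert(r != NULL && a != NULL);
    assert(s > 0 && s < 32);
    uint32_t carry = 0;                  /* models the carry register */
    for (size_t i = 0; i < n; i++) {
        uint32_t v = a[i];               /* read first so r may alias a */
        r[i]  = (v << s) | carry;        /* carry In: bits from the limb below */
        carry = v >> (32 - s);           /* carry Out: bits pushed off the top */
    }
    return carry;
}
```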

Even multi-precision integer 256-bit by 64-bit division:

    CARRY R16,{{I}{IO}{IO}{O}}
    DIV   R12,R4,R8     // Carry Out only
    DIV   R13,R5,R9     // Carry In and Out
    DIV   R14,R6,R10    // Carry In and Out
    DIV   R15,R7,R11    // Carry In only
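
A scaled-down model of the division chain, dividing a multi-limb number by a single word, might look like the following. The name limbs_div_1 and the 32-bit limb size are assumptions; here the value carried between steps is the running remainder, and the steps run from the most significant limb downward.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* q = a / d over n little-endian 32-bit limbs; returns the remainder. */
uint32_t limbs_div_1(uint32_t *q, const uint32_t *a, size_t n, uint32_t d)
{
    assert(q != NULL && a != NULL);
    assert(d != 0);
    uint32_t rem = 0;                              /* models the carry register */
    for (size_t i = n; i-- > 0; ) {                /* most significant limb first */
        uint64_t cur = ((uint64_t)rem << 32) | a[i];   /* carry In: remainder */
        q[i] = (uint32_t)(cur / d);                /* fits in 32 bits because rem < d */
        rem  = (uint32_t)(cur % d);                /* carry Out: new remainder */
    }
    return rem;
}
```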

With suitable specification one can even get strings of these instructions to raise exceptions at the appropriate point (or ignore the exception because "all bits get delivered").

Not shown: this is also the access "port" for multi-precision floating-point arithmetics, exact floating point, and access to the bits that got rounded off (as in Kahan–Babuška summation).
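
For context, this is how software recovers the rounded-off bits of an addition today, using the well-known TwoSum error-free transformation that compensated summation relies on. The function name two_sum is illustrative, and the sketch assumes strict IEEE 754 double arithmetic (no fast-math style reassociation, no extended-precision intermediates).

```c
#include <assert.h>
#include <stddef.h>

/* TwoSum: returns the rounded sum s and stores the exact rounding error in
 * *err, so that a + b == s + *err holds exactly (barring overflow). */
double two_sum(double a, double b, double *err)
{
    assert(err != NULL);
    double s  = a + b;                     /* rounded result */
    double bb = s - a;                     /* the part of b that made it into s */
    *err = (a - (s - bb)) + (b - bb);      /* what rounding threw away */
    return s;
}
```

This costs six floating-point operations per addition in software; the suggestion in the post is that a CARRY-decorated floating add could deliver the second value directly.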

The CARRY instruction supplies a register, and the instruction contains an immediate field. The immediate field supplies the "decoration" for subsequent instructions. As used here (in CARRY), the decoration uses 2 bits for each subsequent instruction: 00 means that the supplied register takes no part in this particular instruction; 01 means that the register supplies an operand into the calculation; 10 means the register captures the result of the instruction; and 11 means the register provides an operand and receives a result. My 66000 ISA has 16-bit immediates available for this instruction-modifier, so up to 8 subsequent instructions can be so "decorated".
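
The encoding described above can be modelled with a small C sketch. This is a toy, not My 66000 semantics: the name carry_adds is hypothetical, only ADD is modelled, and the assumption that bits [1:0] of the immediate decorate the first following instruction (bits [3:2] the second, and so on) is mine, since the post does not state the bit order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 2-bit decoration values as listed in the post. */
enum { CARRY_NONE = 0, CARRY_IN = 1, CARRY_OUT = 2, CARRY_INOUT = 3 };

/* Toy model: execute n shadowed 64-bit ADDs (n <= 8) under one CARRY prefix.
 * *creg models the register nominated by CARRY. */
void carry_adds(uint64_t *creg, uint16_t imm, uint64_t *rd,
                const uint64_t *rs1, const uint64_t *rs2, size_t n)
{
    assert(creg != NULL && rd != NULL && rs1 != NULL && rs2 != NULL);
    assert(n <= 8);                              /* 16-bit immediate, 2 bits each */
    for (size_t k = 0; k < n; k++) {
        unsigned mode = (imm >> (2 * k)) & 3u;   /* assumed bit order, see lead-in */
        uint64_t cin = (mode & CARRY_IN) ? *creg : 0;
        uint64_t s  = rs1[k] + rs2[k];
        uint64_t c1 = s < rs1[k];                /* carry out of rs1 + rs2 */
        uint64_t t  = s + cin;
        uint64_t c2 = t < s;                     /* carry out of adding cin */
        rd[k] = t;
        if (mode & CARRY_OUT)
            *creg = c1 + c2;                     /* captured only when marked O */
    }
}
```

Under that bit-order assumption, a 256-bit add marked Out, In/Out, In/Out, In corresponds to the immediate 0x7E.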

Presto: add 1 instruction to your ISA, and you get <basically> all the forms of multi-precision arithmetic that you can define and train a compiler to use. And in the case of floating point, the inexact bit is NOT set when the result plus the CARRY container can contain all of the produced bits.

Thus, no condition codes are ever needed, and one has direct access to all double-width calculations by adding the least damage to the instruction set, its encoding, and what you have to compile with and to.

Robert Finch

Dec 28, 2021, 11:32:25 PM

to RISC-V ISA Dev, MitchAlsup, Robert Finch, jim.wil...@gmail.com, RISC-V ISA Dev, Bruce Hoult, Leway Colin, mj...@pqshield.com

That is a very cool idea, and no worries about patents too. I am a little confused as to the use of R16 in the examples though. Is it an intermediary register for each following instruction? Is it effectively as if there were an extra target register R16 and extra operand register R16 for each following instruction?

Are the instructions effectively like:

    ADD R16,R12,R4,R8       // Carry Out only
    ADD R16,R13,R5,R9,R16   // Carry In and Out
    ADD R16,R14,R6,R10,R16  // Carry In and Out
    ADD R15,R7,R11,R16      // Carry In only

Or am I missing something?

This seems to require two register write ports.

It looks like the same thing could be accomplished with an intermediate pipeline register, so that two write ports on the register file are not needed.

Allen Baum

Dec 29, 2021, 1:25:42 AM

to Robert Finch, RISC-V ISA Dev, MitchAlsup, jim.wil...@gmail.com, Bruce Hoult, Leway Colin, mj...@pqshield.com

This is tricky; if I understand it correctly, it implicitly uses an anonymous output of a previous op as an anonymous input of the next.

That's cute, because it doesn't need an extra read or write port, nor extra register specifiers, so you're effectively using a bypass result explicitly.

Not the main result though, but just for the carry bit (in the add case; a multiply would have the entire upper half of the result)

Up until you get an interrupt in the middle of the sequence, this looks nice.

When you do, it requires explicit progress state (the carry and the remaining IO bits) to be saved somewhere (CSRs most likely) and restored.

It may also require rename hardware for those CSRs in an OOO processor if more than one of them can be in progress at the same time (e.g. if a loop is unrolled and they're logically independent).

You can unroll these sequences, but you can't interleave the instruction groups; each must be a sequential block of instructions.

A carry bit by itself needs renaming in OOO, of course, but this is trickier because the prefix covers 4 instructions, so it looks like 4 copies instead of 1. (Mitch has probably figured out a way to make that cheaper.)

Prefix instructions always have attracted me, and I've designed a CPU that had one, but in an in-order machine with explicit CSR state for the prefix.

This is cute, as it isn't as special purpose as a carry bit, so you could use this to do an MADD or maybe even SELECT.

It adds complexity to a scheduler, because you can't simply schedule some random instruction between something that produces a carry and something that then uses it.

SELECT is even more problematic because it isn't the case that you're always borrowing from the previous instruction; the condition might be calculated well ahead of time, or one of the operands may have been.

The Mill architecture has the concept of "borrowing" operands from an adjacent op for similar reasons and functionality.

Allen Baum

Dec 29, 2021, 1:34:50 AM

to Robert Finch, RISC-V ISA Dev, MitchAlsup, jim.wil...@gmail.com, Bruce Hoult, Leway Colin, mj...@pqshield.com

Oops, I missed the fact that the anonymous input isn't anonymous; it's defined in the CARRY instruction.

That simplifies some things. You still need to save that carry in the CARRY register at some point, and possibly need an extra cycle to write it if the last op in the sequence has [O], [IO] or [none] semantics.

There are some weird things you might do, like have one instruction with [O] semantics, and the next 2 both using the same carry with [I] semantics.

Albert Cahalan

Dec 29, 2021, 2:27:06 AM

to MitchAlsup, RISC-V ISA Dev, Robert Finch, jim.wil...@gmail.com, Bruce Hoult, Leway Colin, mj...@pqshield.com

> The solution is to have an instruction which "decorates" several following
> instructions with "extra" OpCode bits. I use this uniformly through the My
> 66000 ISA, and I invite RISC-V to follow

It reminds me of the ever-changing ARM Thumb, particularly the
conditionals but also the old ARMv4T jumps.

Jumps had a prefix that loaded bits into the link register. You weren't
supposed to put anything else between the prefix and the jump, but
you certainly could. Behavior was quite predictable and usable, until
one day it was changed to allow more wide encodings.

Conditionals also have lots of "don't do that" in the documentation.
Well, what if you do? You could jump in or out of an if-then block.
You could start a new one. You aren't supposed to, but when has that
ever worked to stop people?

So with your 66000 ISA, I might...

1. CARRY then immediately another CARRY (inside the first)

2. CARRY then some inapplicable instructions

3. CARRY, then call some applicable instructions, then return

4. CARRY, then applicable instructions, then a loop that bypasses the carry

5. jump over a recently executed CARRY

6. CARRY, after having recently jumped over it

Allen Baum

Dec 29, 2021, 2:45:44 AM

to Albert Cahalan, MitchAlsup, RISC-V ISA Dev, Robert Finch, jim.wil...@gmail.com, Bruce Hoult, Leway Colin, mj...@pqshield.com

Yea, you can do odd things. Mitch didn't post a spec, just the general idea.

Using the CARRY register as the Rd of a following instruction that is marked O or IO is another example: which one gets priority?

That's easy: just pick one to have priority, or say it is unspecified.

A spec would have to document exactly what was expected in all of those cases (or just say it is unspecified, and you get what you deserve if you try).

I can certainly imagine incredibly useful things to do with CARRY, LUI, LUI, JALR/LD/SD.

One question is whether the CARRY arguments are temporal or local, e.g. do they apply to the next 4 ops that are executed, even at branch targets, or to the next 4 sequentially located instructions (and what happens when you branch out of the middle of the next 4, etc.)?

It's easy to spec what should happen for all those cases; that's what (good) specs do.

Samuel Falvo II

Dec 29, 2021, 11:33:23 AM

to Allen Baum, Albert Cahalan, MitchAlsup, RISC-V ISA Dev, Robert Finch, jim.wil...@gmail.com, Bruce Hoult, Leway Colin, mj...@pqshield.com

Perhaps one implementation technique is to widen the registers by one bit. So, in effect, each GPR has its own carry bit. So, on an RV32 implementation, all X registers are secretly 33 bits wide, even though you're only given architectural access to 32 of them. A CSR would reflect the carry bits of all the GPRs, allowing them to be saved and restored during context switches.

The vast majority of the time, when fed to the ALU, this bit is masked off, so adds and subtracts ignore this carry input.

But, when enabled for input, the carry bit on (just picking arbitrarily) Xr1 is gated through to the ALU, where it now becomes significant.

I have to admit that I'm confused by the need for an output mask. I would unconditionally write the carry bit, opting instead to just ignore it most of the time. That would also serve to double the number of opcode modifier bits, allowing for longer sequences of multi-precision logic. Alternatively, it would allow you to conditionally use carries on up to five subsequent instructions assuming your CARRY instruction is mapped to CSRWI.
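
A toy C model of the widened-register idea is sketched below. The struct regfile33, the function add33, and in particular the cin_reg parameter (which names the register whose hidden bit is gated into the ALU, or -1 for none) are assumptions made for the example; the proposal above does not pin down how the input carry would be selected. The cbits word doubles as the image of the CSR that would expose all hidden bits for context switches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RV32 register file where each GPR has a hidden 33rd (carry) bit. */
typedef struct {
    uint32_t x[32];   /* architecturally visible 32 bits */
    uint32_t cbits;   /* bit r = hidden carry bit of x[r]; also the CSR image */
} regfile33;

/* rd = rs1 + rs2, plus the hidden bit of register cin_reg when cin_reg >= 0.
 * The carry out is written to rd's hidden bit unconditionally. */
void add33(regfile33 *rf, unsigned rd, unsigned rs1, unsigned rs2, int cin_reg)
{
    assert(rf != NULL && rd < 32 && rs1 < 32 && rs2 < 32 && cin_reg < 32);
    uint64_t cin = (cin_reg >= 0) ? ((rf->cbits >> cin_reg) & 1u) : 0u;
    uint64_t wide = (uint64_t)rf->x[rs1] + rf->x[rs2] + cin;
    if (rd == 0)
        return;                                   /* x0 stays hardwired to zero */
    rf->x[rd] = (uint32_t)wide;
    rf->cbits = (rf->cbits & ~(UINT32_C(1) << rd))
              | ((uint32_t)(wide >> 32) << rd);   /* bit 32 of the sum */
}
```

As the follow-up messages point out, a single hidden bit covers add and subtract but not the multi-bit values produced by multiply, divide, or shift.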

My imagination is perhaps limited here, but I would imagine for a CARRY-like instruction to work properly, such an instruction would need to be decoded *and acted upon* quite early in the pipeline (e.g., in the first decode stage, perhaps), so that the dynamically-extended opcode bits are present for subsequent stages to process.

--

Samuel A. Falvo II

MitchAlsup

Dec 29, 2021, 11:52:24 AM

to RISC-V ISA Dev, Robert Finch, MitchAlsup, jim.wil...@gmail.com, RISC-V ISA Dev, Bruce Hoult, Leway Colin, mj...@pqshield.com

"Or am I missing something?"

No, you got the message.

MitchAlsup

Dec 29, 2021, 11:52:54 AM

to RISC-V ISA Dev, acahalan, RISC-V ISA Dev, Robert Finch, jim.wil...@gmail.com, Bruce Hoult, Leway Colin, mj...@pqshield.com, MitchAlsup

Even when writing multi-precision arithmetic functions, I have not found a need for CARRY inside CARRY.

For example:

    void Long_multiplication( uint64_t multiplicand[],
                              uint64_t multiplier[],
                              uint64_t sum[],
                              uint64_t ilength,
                              uint64_t jlength )
    {
        uint64_t i, j, acarry, mcarry, product;

        for( i = 0; i < (ilength + jlength); i++ )
            sum[i] = 0;

        for( acarry = j = 0; j < jlength; j++ )
        {
            for( mcarry = i = 0; i < ilength; i++ )
            {
                // {hi, lo} = ... denotes a double-width result split across two registers
                {mcarry, product} = multiplicand[i]*multiplier[j] + mcarry;
                {acarry,sum[i+j]} = {sum[i+j]+acarry} + product;
            }
        }
    }

So, while you do use 2 different carries, they are not inside each other, but flat.
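
The pseudocode above uses a {high, low} notation that is not legal C. A runnable approximation is sketched here, scaled down to 32-bit limbs so that a uint64_t can hold each double-width intermediate. The function name, the parameter order, and the explicit write of the final carries into sum[ilength + j] at the end of each row (a step the pseudocode leaves implicit) are my own choices for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* sum = multiplicand * multiplier, little-endian 32-bit limbs.
 * sum must have room for ilength + jlength limbs. */
void long_multiplication(const uint32_t *multiplicand, size_t ilength,
                         const uint32_t *multiplier, size_t jlength,
                         uint32_t *sum)
{
    assert(multiplicand != NULL && multiplier != NULL && sum != NULL);

    for (size_t i = 0; i < ilength + jlength; i++)
        sum[i] = 0;

    for (size_t j = 0; j < jlength; j++) {
        uint32_t mcarry = 0;   /* high half of the previous product */
        uint32_t acarry = 0;   /* carry out of the previous accumulate */
        for (size_t i = 0; i < ilength; i++) {
            /* {mcarry, product} = multiplicand[i] * multiplier[j] + mcarry */
            uint64_t p = (uint64_t)multiplicand[i] * multiplier[j] + mcarry;
            uint32_t product = (uint32_t)p;
            mcarry = (uint32_t)(p >> 32);

            /* {acarry, sum[i+j]} = sum[i+j] + acarry + product */
            uint64_t s = (uint64_t)sum[i + j] + acarry + product;
            sum[i + j] = (uint32_t)s;
            acarry = (uint32_t)(s >> 32);
        }
        /* Both carries land in the next limb, which no row has touched yet. */
        sum[ilength + j] = mcarry + acarry;
    }
}
```

The two carry chains (mcarry for the multiply, acarry for the accumulate) sit side by side in the inner loop rather than one inside the other, matching the "flat" structure described in the post.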

MitchAlsup

Dec 29, 2021, 11:53:00 AM

to RISC-V ISA Dev, Allen Baum, MitchAlsup, RISC-V ISA Dev, Robert Finch, jim.wil...@gmail.com, Bruce Hoult, Leway Colin, mj...@pqshield.com, acahalan

I am merely suggesting I "think" I have it worked out for My 66000, but I present it to RISC-V as a starting point Idea.

I did not use the SRC1 source register of my CARRY instruction because I found the input of one shadowed instruction could simply consume the output of the previous--as Allen indicates.

I am happy to supply all of what I did to My 66000 Specifications, but you need to work out exactly how you bring this idea to fruition.

And as Allen questioned above, I use the register specifier in CARRY as a place to perform the write should an interrupt occur.

MitchAlsup

Dec 29, 2021, 11:53:04 AM

to RISC-V ISA Dev, Samuel Falvo II, acahalan, MitchAlsup, RISC-V ISA Dev, Robert Finch, jim.wil...@gmail.com, Bruce Hoult, Leway Colin, mj...@pqshield.com, Allen Baum

" Perhaps one implementation technique is to widen the registers by one bit."

How does this solve the multiply, divide, shift problems which are not a single bit?

Samuel Falvo II

Dec 29, 2021, 12:29:04 PM

to MitchAlsup, RISC-V ISA Dev, acahalan, Robert Finch, jim.wil...@gmail.com, Bruce Hoult, Leway Colin, mj...@pqshield.com, Allen Baum

On Wed, Dec 29, 2021 at 8:53 AM 'MitchAlsup' via RISC-V ISA Dev <isa...@groups.riscv.org> wrote:

> "Perhaps one implementation technique is to widen the registers by one bit."
>
> How does this solve the multiply, divide, shift problems which are not a single bit?

It wouldn't, clearly. But since the topic seemed focused on carry *flags* or *bits*, it seemed apropos. I'd argue the name of the operation, CARRY, is misleading. I now understand that you've extended the topic to account for entire intermediate results. For this reason, I think a more informative name for such an opcode would borrow from 68000's terminology: EXTEND.

But, your question to me indirectly answers another question that was asked earlier but I don't recall seeing an answer for, which is the purpose for the GPR in the CARRY instruction. That GPR (R16 in the example provided) is used as the extension/carry-over register, and basically adds another read and write port to all operations decorated by CARRY with I and O flags, respectively.

With my understanding of what was intended, I agree that my solution isn't complete.

MitchAlsup

Dec 29, 2021, 12:40:16 PM

to RISC-V ISA Dev, Samuel Falvo II, RISC-V ISA Dev, acahalan, Robert Finch, jim.wil...@gmail.com, Bruce Hoult, Leway Colin, mj...@pqshield.com, Allen Baum, MitchAlsup

" which is the purpose for the GPR in the CARRY instruction."

The GPR provides the syntactic sugar to "connect the dots". In my architecture, the GPR provides a thread-state storage location for a value used as an operand (last operand) or as a result (second result). But there may be many other ways of going about this "connecting of dots".

In many cases the GPR receives the bits that cannot go in the "normal" result (the high bits of a multiply, ...), the GPR carries (sic) bits from one instruction in a sequence to the next instruction in the sequence, and it gets the HOBs of any final result delivered. So the mentioned GPR gets and carries values; in most uses these values exist only between the instructions of that sequence.

Specified correctly, if the final instruction in the sequence does not deliver all of the significant bits, then the instruction can raise overflow (should that kind of semantic be part of your ISA.)

MitchAlsup

Dec 29, 2021, 12:41:26 PM

to RISC-V ISA Dev, Samuel Falvo II, RISC-V ISA Dev, acahalan, Robert Finch, jim.wil...@gmail.com, Bruce Hoult, Leway Colin, mj...@pqshield.com, Allen Baum, MitchAlsup

Also note: How the instruction name is spelled is left up to your group. I happened to spell mine CARRY.

Bruce Hoult

Dec 30, 2021, 1:44:19 AM

to Samuel Falvo II, Allen Baum, Albert Cahalan, MitchAlsup, RISC-V ISA Dev, Robert Finch, jim.wil...@gmail.com, Leway Colin, mj...@pqshield.com

On Thu, Dec 30, 2021 at 5:33 AM Samuel Falvo II <sam....@gmail.com> wrote:

> Perhaps one implementation technique is to widen the registers by one bit. So, in effect, each GPR has its own carry bit. So, on an RV32 implementation, all X registers are secretly 33 bits wide, even though you're only given architectural access to 32 of them. A CSR would reflect the carry bits of all the GPRs, allowing them to be saved and restored during context switches.

Note that some DEC Alpha implementations did this to support the conditional select instruction. The instruction was broken into two µops the first of which tested the condition and (possibly) copied the first option to the destination register. It also set a hidden bit on the destination register based on whether the condition was true or false. The second µop used the hidden bit to determine whether to copy the other option.
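
The two-µop split described above can be modelled in a few lines of C. This is only a sketch of the behaviour as stated in this message; the type preg and the function names are illustrative and do not come from any Alpha documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A destination register with one hidden bit alongside its value. */
typedef struct {
    uint64_t value;
    bool     hidden;   /* records the outcome of the condition test */
} preg;

/* First µop: test the condition, record it in the hidden bit, and copy
 * the first option if the condition holds. */
void cmov_uop1(preg *rd, uint64_t cond, uint64_t first)
{
    assert(rd != NULL);
    rd->hidden = (cond != 0);
    if (rd->hidden)
        rd->value = first;
}

/* Second µop: copy the other option only when the hidden bit says the
 * first µop did not supply the result. */
void cmov_uop2(preg *rd, uint64_t other)
{
    assert(rd != NULL);
    if (!rd->hidden)
        rd->value = other;
}
```

Each µop then reads at most two source values, which is presumably why the split was attractive.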

A hidden bit is pretty cheap.

Mitch's proposal also seems pretty cheap if it doesn't write the register nominated by the CARRY instruction unless there is an interrupt (or maybe at the end). Usually the carry value can exist only in the bypass network, with instructions specified as '00' simply feeding it back unchanged.

The biggest challenge might be the extra (implicit) dependency in OoO designs. No worse than traditional condition codes, and of course most of the time it won't be there at all.

Allen Baum

Dec 30, 2021, 3:07:06 AM

to Bruce Hoult, Samuel Falvo II, Albert Cahalan, MitchAlsup, RISC-V ISA Dev, Robert Finch, jim.wil...@gmail.com, Leway Colin, mj...@pqshield.com

I think it's at least a little worse than a carry bit, because the carry can have its own separate rename logic, but this has to share it with the entire GPR renaming, so it's an extra port on a much larger structure.

It's also a bit weird that it has to be held in the bypass network and not renamed if re-written (which is what Alpha did, if I understand it - but Alpha didn't worry about interrupting in the middle; it was a single instruction).

It's a wart. Maybe not an especially expensive wart, but bugs feed on warts.
