https://salsa.debian.org/Kazan-team/kazan/blob/master/docs/SVprefix%20Proposal.rst
The ISA extension proposal looks logical to me. I like it, myself.
It’s no different at shorter instruction lengths, yet goes further at longer lengths by using more bit fields for extension.
I’m not sure about the utility of large literals.
Large literals reduce pipeline stalls from using constant pools, but large literals are rare.
It might be more valuable to fuse instructions to build large literals.
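One way to read "fuse instructions to build large literals" is macro-op fusion of an existing LUI + ADDI pair, so that a 32-bit constant needs no longer encoding. The sketch below is only an illustration under that assumption: it shows how a constant splits into the two immediates, including the adjustment needed because ADDI sign-extends its 12-bit immediate. The function names are hypothetical and not part of any proposal in this thread.

```python
def split_lui_addi(value):
    """Split a 32-bit constant into (hi20, lo12) for a LUI + ADDI pair."""
    value &= 0xFFFFFFFF
    lo = value & 0xFFF
    if lo >= 0x800:
        lo -= 0x1000  # ADDI sign-extends its 12-bit immediate
    hi = ((value - lo) >> 12) & 0xFFFFF  # compensate for a negative lo
    return hi, lo


def fused_result(hi, lo):
    """Value a fused LUI + ADDI pair would leave in the register (RV32)."""
    return ((hi << 12) + lo) & 0xFFFFFFFF


# Round-trip a few constants, including the sign-extension edge cases.
for v in (0, 0x7FF, 0x800, 0xDEADBEEF, 0xFFFFFFFF):
    hi, lo = split_lui_addi(v)
    assert fused_result(hi, lo) == v
```

A core that fuses the pair would see both halves at decode time and could treat them as one literal, which is the trade-off being weighed against a dedicated long-immediate encoding.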
From: Clifford Wolf <cliffor...@gmail.com>
Sent: Wednesday, April 24, 2019 5:17 AM
To: RISC-V ISA Dev <isa...@groups.riscv.org>
Subject: [isa-dev] Alternative proposal for instructions >32 bit, proposed instructions
Hi,
--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to
isa-dev+u...@groups.riscv.org.
To post to this group, send email to isa...@groups.riscv.org.
Visit this group at
https://groups.google.com/a/groups.riscv.org/group/isa-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/CAG5EYeXxBu9xYD8SwQwYj7pr4vWPntqVn0aBmbrX29wH88xBJw%40mail.gmail.com.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/CAP8PnuS4hNP-1Q2znGh0z-8HUXdDG1s6hz0Wb1TG5iy41Rd0zg%40mail.gmail.com.
One of the parts I specifically like about the proposal (and I think is a hole in x86-64) is support for floating-point immediates. To avoid needing a separate barrel shifter/leading-zero counter for f64 immediates encoded as f32 (because of the need to renormalize denormal numbers), it might be a better idea to use something like the 32-bit version of bfloat16: basically the high half of an f64, with the low half being all zeros.
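A minimal sketch of the "high half of f64" idea, assuming the 32-bit immediate is simply placed in bits 63:32 of the double and bits 31:0 are zero. The helper names are hypothetical. Because the exponent field is already in f64 format, expanding the immediate is a plain shift, with no renormalization step.

```python
import struct


def f64_from_high_half(imm32):
    """Expand a 32-bit immediate to an f64 by zero-filling the low half."""
    bits = (imm32 & 0xFFFFFFFF) << 32
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def high_half_of_f64(x):
    """Return the 32-bit immediate for x, or raise if x does not fit."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    if bits & 0xFFFFFFFF:
        raise ValueError("low 32 bits are not zero; not encodable")
    return bits >> 32


assert high_half_of_f64(1.0) == 0x3FF00000
assert f64_from_high_half(0x3FF80000) == 1.5
```

Values such as 0.1, whose f64 mantissa extends into the low 32 bits, are not encodable this way and would still need a constant pool or a longer immediate.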
Was that 1-2% a measure of static code size or dynamic? I would expect quite a bit less than 1-2% for a dynamic measurement.
I like your proposal but for the automatic immediate variant for all R-type instructions, it seems to me that it is much simpler to have an immediate that always replaces rs2 (as for the existing i instructions).
Also I don't quite see how your scheme works for the 4 register variants.
Perhaps an encoding like this would be simpler?
|3   2                   1                                      |
|1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0|
|---------------------------------------------------------------|
|   funct7    | f3  |0e |   rs1   |opc3 |   rd    |   ??11111   | LS2-type
|   rs3   |f2 | f3  |1e |   rs1   |opc3 |   rd    |   ??11111   | LS3-type
Hi,
On Fri, Apr 26, 2019 at 5:37 PM Rogier Brussee <rogier...@gmail.com> wrote:
> I like your proposal but for the automatic immediate variant for all R-type instructions, it seems to me that it is much simpler to have an immediate that always replaces rs2 (as for the existing i instructions).
I don't quite see what you would gain by only supporting rs2 when it's as easy to support rs1, rs2, and rs3.
Not all instructions are commutative. And for some non-commutative instructions an immediate makes sense on any of the arguments.
Think for example of bdep from the B extension draft proposal. You'd think that the mask (rs2) is the only argument where an immediate would make sense for this instruction. Until you see ctz(bdep(1<<N, x)). This calculates the index of the Nth set bit in x, using the constant 1<<N in rs1.
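The ctz(bdep(1<<N, x)) idiom can be checked with a small software model. This is a sketch, assuming bdep follows the usual bit-deposit semantics (successive low bits of the first operand are scattered to the positions of the set bits in the mask) and that N counts set bits from zero. The function names mirror the instruction names but are plain Python helpers.

```python
def bdep(value, mask, xlen=32):
    """Deposit the low bits of value at the set-bit positions of mask."""
    result = 0
    k = 0  # index of the next source bit in value
    for i in range(xlen):
        if (mask >> i) & 1:
            if (value >> k) & 1:
                result |= 1 << i
            k += 1
    return result


def ctz(x, xlen=32):
    """Count trailing zeros; returns xlen when x is zero."""
    if x == 0:
        return xlen
    return (x & -x).bit_length() - 1


def index_of_nth_set_bit(x, n):
    """Index of the n-th set bit of x, with the constant 1<<n as the first operand."""
    return ctz(bdep(1 << n, x))


x = 0b10110100  # set bits at positions 2, 4, 5, 7
assert [index_of_nth_set_bit(x, n) for n in range(4)] == [2, 4, 5, 7]
```

Here the constant is the first operand and x is the mask, so an immediate form restricted to the second operand would not cover this use.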
> Also I don't quite see how your scheme works for the 4 register variants.
Well, I don't see why it wouldn't, and my text explains how it works.
Maybe if you could point to a concrete problem, that would be more constructive.
> Perhaps an encoding like this would be simpler?
> |3   2                   1                                      |
> |1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0|
> |---------------------------------------------------------------|
> |   funct7    | f3  |0e |   rs1   |opc3 |   rd    |   ??11111   | LS2-type
> |   rs3   |f2 | f3  |1e |   rs1   |opc3 |   rd    |   ??11111   | LS3-type
No, that's worse imo, because in OP-FP there are a couple of unary instructions that only have rs1, not rs2.
Of course unary instructions with a constant are not particularly important, considering they only produce a constant result, but by putting that stuff in the rs2 field instead of rs1 you are destroying the brownfields created by those unary instructions that we inherit in this encoding.
I have no idea what's going on in bit 21 (rs2[1]) in your encoding. There is no need to explicitly tag an instruction as LS2 or LS3. Like there's also no explicit tag to distinguish R-type and R4-type instructions either. The decoder already knows which instruction uses which format.
(Right now there are no ternary instructions in OP or OP-FP. But the B extension proposes 4 ternary instructions in OP, and in the proposed encoding all ternary instructions in OP would use op[26]=1 in order to simplify instruction decoders. This encoding doesn't collide with anything else going on in OP right now, and it would allow for two ternary instructions per minor opcode, or 16 ternary instructions in OP total.)
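The count of 16 can be reproduced with a short sketch. It assumes an R4-style layout in which rs3 occupies bits 31:27, so that once bit 26 is fixed to 1 only bit 25 and the three funct3 bits remain to select an instruction. The predicate name is hypothetical.

```python
OP = 0b0110011  # major opcode OP in the base ISA


def is_ternary_op(insn):
    """True if a 32-bit word is in OP and has bit 26 set (assumed ternary tag)."""
    return (insn & 0x7F) == OP and ((insn >> 26) & 1) == 1


# With bit 26 fixed, bit 25 and funct3 (bits 14:12) select the instruction.
slots = [(funct3, bit25) for funct3 in range(8) for bit25 in range(2)]
assert len(slots) == 16  # two per minor opcode, eight minor opcodes

# The base-ISA encoding of "add x3, x1, x2" has bit 26 clear.
assert not is_ternary_op(0x002081B3)
```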
regards,
 - Clifford
On Friday, April 26, 2019 at 18:31:38 UTC+2, clifford wrote:
> On Fri, Apr 26, 2019 at 5:37 PM Rogier Brussee <rogier...@gmail.com> wrote:
>> I like your proposal but for the automatic immediate variant for all R-type instructions, it seems to me that it is much simpler to have an immediate that always replaces rs2 (as for the existing i instructions).
> I don't quite see what you would gain by only supporting rs2 when it's as easy to support rs1, rs2, and rs3.
Your scheme breaks the property of the encoding that the registers can always be found at the same bits, which is emphasised as a key point in the design of the RISC-V ISA (although by necessity it is obviously also broken for RVC). Unless I misunderstood you, in your scheme rsA can refer to either rs2 if ind = 0b00, or rs1 if ind = 0b01. (I take it that for the LS2 encoding there is a scheme like rs<Ind>, rsA, rsB being in cyclic order, although this is not specified AFAICS.) If nothing else, always making rs2 the immediate reuses more of the decoder for OP and OP-FP, should make it easier to reserve the right instruction in the early decoding phase, and brings it closer to the relation between OP and OP-IMM.
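The OP / OP-IMM relation referred to here can be illustrated with the base 32-bit formats, where rd, funct3, and rs1 sit at the same bits in both and the I-type immediate takes over the bits that hold rs2 and funct7 in R-type. This is a sketch with hypothetical helper names. It models only the existing base ISA, not either of the long-instruction encodings under discussion.

```python
def common_fields(insn):
    """Fields that sit at the same bit positions in R-type and I-type."""
    return {
        "rd": (insn >> 7) & 0x1F,
        "funct3": (insn >> 12) & 0x7,
        "rs1": (insn >> 15) & 0x1F,
    }


def rs2_field(insn):
    """R-type second source register, bits 24:20."""
    return (insn >> 20) & 0x1F


def i_immediate(insn):
    """I-type immediate, bits 31:20, sign-extended."""
    imm = (insn >> 20) & 0xFFF
    return imm - 0x1000 if imm & 0x800 else imm


ADD_X3_X1_X2 = 0x002081B3  # add  x3, x1, x2
ADDI_X3_X1_2 = 0x00208193  # addi x3, x1, 2

# rd, funct3 and rs1 decode identically; only the second operand differs.
assert common_fields(ADD_X3_X1_X2) == common_fields(ADDI_X3_X1_2)
assert rs2_field(ADD_X3_X1_X2) == 2
assert i_immediate(ADDI_X3_X1_2) == 2
```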
> Not all instructions are commutative. And for some non-commutative instructions an immediate makes sense on any of the arguments.
It is definitely less powerful, but that is the cost of greater simplicity. Which register gets the easy and cheap immediate would be one of the design criteria for non-commutative instructions. If there really were enough benefit in being able to specify either operand as an immediate, one can always define two R-type instructions (like sub and a new negsub instruction).
> Think for example of bdep from the B extension draft proposal. You'd think that the mask (rs2) is the only argument where an immediate would make sense for this instruction. Until you see ctz(bdep(1<<N, x)). This calculates the index of the Nth set bit in x, using the constant 1<<N in rs1.
?? Isn't that two instructions? Anyway, see above for the possibility of having an immediate in other
>> Also I don't quite see how your scheme works for the 4 register variants.
> Well, I don't see why it wouldn't and my text explains how it works.
> Maybe if you could point to a concrete problem, that would be more constructive.
It was most certainly my intent to be constructive. More to the point, I was under the (apparently mistaken) impression that the major opcodes OP and OP-FP could only have 3-register (rd, rs1, rs2) instructions and that the 4-register versions referred to the fused multiply-add instructions (MADD and the like), which live in other major opcodes than OP-FP.
>> Perhaps an encoding like this would be simpler?
>> |3   2                   1                                      |
>> |1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0|
>> |---------------------------------------------------------------|
>> |   funct7    | f3  |0e |   rs1   |opc3 |   rd    |   ??11111   | LS2-type
>> |   rs3   |f2 | f3  |1e |   rs1   |opc3 |   rd    |   ??11111   | LS3-type
> No, that's worse imo. Because in OP-FP there are a couple of unary instructions that only have rs1, not rs2.
> Of course unary instructions with a constant are not particularly important, considering they only produce a constant result, but by putting that stuff in the rs2 field instead of rs1 you are destroying the brownfields created by those unary instructions that we inherit in this encoding.
That is a fair point. However, if the f3 and funct7 fields encode several instructions determined by further bits in the rs2 field, all of which are unary, one can simply declare this immediate extension scheme to be not applicable.
Alternatively you could, in that case, use the 32 bits in the immediate field as a funct field, with plenty of room to encode the bits that were encoded in the rs2 field in the original OP-FP encoding.
That's also why I think it makes sense to discuss this kind of stuff now, even though it seems like nobody is planning on building processors with support for large instructions yet, because how we are planning to use the large encoding space can inform how we allocate instructions in the 32-bit encoding space.
Apr 24, 2019, 2:34 PM
Sounds like a good idea to me. This would require changing the WIP ISA extension proposals we've (libre-riscv.org) been working on as I had designed the encodings to take half the 48-bit encoding space (7 LSB bits == 0011111) defined in the RISC-V spec.
https://salsa.debian.org/Kazan-team/kazan/blob/master/docs/SVprefix%20Proposal.rst
On Wed, Apr 24, 2019 at 5:10 PM Jacob Lifshay <program...@gmail.com> wrote:
> Sounds like a good idea to me. This would require changing the WIP ISA extension proposals we've (libre-riscv.org) been working on as I had designed the encodings to take half the 48-bit encoding space (7 LSB bits == 0011111) defined in the RISC-V spec.
> https://salsa.debian.org/Kazan-team/kazan/blob/master/docs/SVprefix%20Proposal.rst
There are two possible scenarios for how to implement something like this within my proposal:
(1) You need this to be a 48-bit instruction and you don't care if your extension can ever become a std extension.
In this case simply use the funct3=110 custom space. Then you'd need to set op[6:0]=0011111 and op[14:12]=110 and the rest of the encoding space is all yours. Afaict you have one reserved bit. So you'd need to find two more bits for this to work.
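As a rough illustration of what scenario (1) leaves available, the sketch below tests whether a 48-bit instruction word falls in the funct3=110 custom space and counts the remaining bits. It assumes the instruction is held as an integer with op[0] as the least significant bit. The constant and function names are hypothetical.

```python
LEN48_OPCODE = 0b0011111  # op[6:0] for this half of the 48-bit space
CUSTOM_FUNCT3 = 0b110     # op[14:12] custom space named in the message


def in_custom_48bit_space(insn48):
    """True if a 48-bit instruction word lies in the funct3=110 custom space."""
    return (insn48 & 0x7F) == LEN48_OPCODE and ((insn48 >> 12) & 0x7) == CUSTOM_FUNCT3


# Bits left for the extension once op[6:0] and op[14:12] are fixed.
FREE_BITS = 48 - 7 - 3
assert FREE_BITS == 38

assert in_custom_48bit_space((CUSTOM_FUNCT3 << 12) | LEN48_OPCODE)
```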
> In this case simply use the funct3=110 custom space. Then you'd need to set op[6:0]=0011111 and op[14:12]=110 and the rest of the encoding space is all yours. Afaict you have one reserved bit. So you'd need to find two more bits for this to work.
I think it would be useful to change the long-instruction proposal to reserve one more funct3 custom space, since there would otherwise be 3 reserved-for-standard-extensions-only funct3 spaces and only 1 custom space.
I think choosing the two custom funct3 values so that they have a Hamming distance of 1 between them will make them more useful, as they can then be treated as if the opcode were 1 bit shorter and the remaining bit can be used as additional space for the other instruction fields.
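The Hamming-distance argument can be made concrete with a small sketch. Which funct3 value would become the second custom space is not stated in the thread, so 0b111 is used here purely as a hypothetical partner for 0b110. The helper names are also illustrative.

```python
from itertools import combinations


def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def adjacent_pairs(width=3):
    """All pairs of funct3 values that differ in exactly one bit."""
    return [(a, b) for a, b in combinations(range(1 << width), 2)
            if hamming(a, b) == 1]


def shared_match(a, b, width=3):
    """(mask, value) matching both a and b; the unmasked bit is free payload."""
    free = a ^ b
    mask = ((1 << width) - 1) & ~free
    return mask, a & mask


# 0b110 and the hypothetical 0b111 differ only in funct3[0].
assert hamming(0b110, 0b111) == 1
assert shared_match(0b110, 0b111) == (0b110, 0b110)
```

With such a pair the decoder only has to match funct3[2:1], and funct3[0] becomes one more bit for the extension's own fields. A pair at a larger distance, such as 0b110 and 0b001, has no such shared pattern.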
On Sun, Apr 28, 2019, 23:06 Clifford Wolf <cliffor...@gmail.com> wrote:
> On Wed, Apr 24, 2019 at 5:10 PM Jacob Lifshay <program...@gmail.com> wrote:
>> Sounds like a good idea to me. This would require changing the WIP ISA extension proposals we've (libre-riscv.org) been working on as I had designed the encodings to take half the 48-bit encoding space (7 LSB bits == 0011111) defined in the RISC-V spec.
>> https://salsa.debian.org/Kazan-team/kazan/blob/master/docs/SVprefix%20Proposal.rst
> There are two possible scenarios for how to implement something like this within my proposal:
> (1) You need this to be a 48-bit instruction and you don't care if your extension can ever become a std extension.
The ISA extension we're working on overlaps quite a bit with the design space for the V extension, so I don't think it will end up becoming a std extension.