The condensed extension (Xcondensed) is an alternative to the compact (C) extension that uses the three quadrants reserved for a 16-bit encoding. Like the C extension, it can be used in combination with the full ISA of instructions of 32 bits or more. Unlike the standard C extension, it can also be used as a standalone instruction set; in fact, it contains a condensed version of the full G instruction set. By design it contains most of the C extension unaltered, namely the part of the C extension responsible for >90% of the savings, so the performance characteristics should be very similar to those of the standard C extension (possibly even improved). Also by design, as in the C extension, every instruction translates to a single 32-bit instruction. Finally, by design it tries to stay as close as possible to the original ISA. Translating one into the other is a mostly mechanical exercise once the proper tradeoffs have been chosen: where to be stingy with the encoding by sacrificing immediate versions of instructions, by restricting registers to only the 8 most common ones, and by restricting the range of immediates (this is the hard part). The part that deviates most from the original ISA (memory-mapped CSRs and privileged instructions) is also the messiest part.
Rationale
* If the 16-bit instruction set is a standalone instruction set, an implementation can choose to use a simpler fixed-length ISA. It should also be simpler than the full ISA together with the C extension. The combination should be more performant, but the question becomes whether it is worth the extra complexity.
* Allows fused ops covering essentially all basic instructions to be composed from condensed ones. E.g. fusing Xc.auipc_ra with Xc.jalr_ra immediately gives a 22-bit version of JAL in just 32 bits, and likewise a 32-bit JAL in 48 bits by fusing auipc ra with Xc.auipc_ra.
* VLIW with bundles of 16-bit instructions seems a bit more reasonable (although such an architecture probably still wants the full ISA).
* An RV16 architecture drops out if somebody needs it.
* To explore what can be done if one is stingy in the encoding of the C extension. It was while dabbling with how RV64IM could be accommodated as a strict superset of RV32IM that I realised there was room for a condensed version of the AMOs, and that there was just enough room for a basic but complete set of floating point instructions.
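To make the fused-jump bullet concrete, here is a minimal sketch of the address arithmetic, following the table rows "auipc_ra: auipc ra imm" and "jalr_ra_ra: jalr ra ra imm<<1". It assumes that the 11-bit jalr variant (jalr_ra_ra) is the one meant in the fusion, that both 11-bit immediates are signed two's complement, and that auipc shifts its immediate by 12 as in the base ISA. The function names are hypothetical and the actual field layout is in the spreadsheet, not here.

```python
def sext(value, bits):
    """Sign-extend the low `bits` bits of value (two's complement)."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def split_offset(offset):
    """Split an even byte offset into the two 11-bit fields of the pair.

    Assumption: offset == (hi << 12) + (lo << 1) with hi, lo signed 11-bit.
    """
    assert offset % 2 == 0, "jump targets are 2-byte aligned"
    lo = sext(offset >> 1, 11)           # signed low part, in halfwords
    hi = (offset - (lo << 1)) >> 12      # what is left for auipc
    assert -(1 << 10) <= hi < (1 << 10), "offset out of 22-bit range"
    return hi & 0x7FF, lo & 0x7FF


def fused_jump(pc, hi11, lo11):
    """Model of the fused pair starting at address pc.

    Returns (target, link): the jump target and the value left in ra.
    """
    ra = pc + (sext(hi11, 11) << 12)              # auipc ra, hi11
    target = (ra + (sext(lo11, 11) << 1)) & ~1    # jalr ra, ra, lo11<<1
    link = pc + 4                                 # address after both 16-bit halves
    return target, link


hi, lo = split_offset(0x12346)
print(hex(fused_jump(0x1000, hi, lo)[0]))
```

Under these assumptions the pair behaves like a single 32-bit wide JAL with 22 immediate bits, which is the point of the bullet: the fused form needs no encoding of its own.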
Of course something has to give. |
* Compared to the C extension, some of the instructions that give the least benefit are replaced by versions that use 3-bit register fields (i.e. the 8 registers used in the C extension), and immediates are reduced to 5 bits. Support for floating point load and store is greatly reduced.
* Only two-register or one-register-plus-immediate versions are provided (like the compact encoding).
* Like the C extension, Xcondensed can be used in combination with the full ISA by teaching only the assembler about it. As a standalone instruction set, however, it will need compiler support. It is very much an instruction set in the RV family, so that support should be relatively easy to add.
* Compared to the 32-bit encoding, there are fewer AMOs, and the fused floating point operations are not supported.
* CSRs are supposed to be memory mapped. This more or less forces one to have AMOs in order to get essentially equivalent semantics.
* Complexity is similar to that of the C extension, but more instructions will inevitably be more complex than fewer.
* RV128 should "just work" because the lx/sx instructions are aligned to XLEN and load or store XLEN bits of memory, i.e. they work as lq/sq. That should work well with an LP128 memory model. However, even on 128-bit machines one might use LP64, where loading 64 bits at a time may be more useful. In any case, if you can afford 128 bits you can afford the full ISA (and there is still a little room in the ISA).
* RV16 just drops out of the ISA. No special provision is made and I didn't think it through very carefully. I am not sure it is even useful, and using just 3/4 of the encoding space for an architecture that is obviously restricted seems inefficient, although such an architecture could not reasonably use a 32-bit ISA and so could use the first quadrant for other purposes without portability worries.

This spreadsheet lists all instructions, their encoding, and how they map to the 32-bit ISA:
https://docs.google.com/spreadsheets/d/1rray4sbhGarasDS6acnWyAlOjLvDqBXX3s1LrBLtFs8/edit?usp=sharing

Below is a list of the IM64 instructions.

instruction           R1      R2      imm     semantics
lxsp                  5rd*    ___     6imm    l[hwdq] rd (imm<<x)(sp)
lwsp                  5rd*    ___     6imm    lw rd (imm<<2)(sp)
lx                    3rd     3rs1    5imm    l[hwdq] rd (imm<<x)(rs1)
lw                    3rd     3rs1    5imm    lw rd (imm<<2)(rs1)
l.b                   3rd     3rs1    ___     lb rd 0(rs1)
l.h                   3rd     3rs1    ___     lh rd 0(rs1)
l.bu                  3rd     3rs1    ___     lbu rd 0(rs1)
l.hu                  3rd     3rs1    ___     lhu rd 0(rs1)
l.wu                  3rd     3rs1    ___     lwu rd 0(rs1)
sxsp                  5rs1*   ___     6imm    s[hwdq] rs1 (imm<<x)(sp)
swsp                  5rs1*   ___     6imm    sw rs1 (imm<<2)(sp)
sx                    3rs1    3rs2    5imm    s[hwdq] rs1 (imm<<x)(rs2)
sw                    3rs1    3rs2    5imm    sw rs1 (imm<<2)(rs2)
s.b                   3rs1    3rs2    ___     sb rs1 0(rs2)
s.h                   3rs1    3rs2    ___     sh rs1 0(rs2)
li                    5rd*    ___     6imm    li rd imm
lui                   3rd     ___     5imm*   lui rd imm
auipc_ra              ___     ___     11imm   auipc ra imm
addi                  5rsd*   ___     6imm*   addi rsd rsd imm
add6i                 5rsd*   ___     6imm*   addi rsd rsd imm<<6
add4i_sp              ___     ___     6imm*   addi sp sp imm<<4
addxi                 3rsd    ___     5imm*   addi rsd rsd imm<<x
axisp                 3rd     ___     5imm*   addi rd sp imm<<x
a7isp                 3rd     ___     5imm*   addi rd sp imm<<7
addi.w                3rsd    ___     5imm    addiw rsd rsd imm
andi                  3rsd    ___     5imm**  andi rsd rsd imm
slli                  5rsd*   ___     6imm*   slli rsd rsd imm
srli                  3rsd    ___     5imm*   srli rsd rsd imm
srai                  3rsd    ___     5imm*   srai rsd rsd imm
srli16                3rsd    ___     ___     srli rsd rsd 16
srai16                3rsd    ___     ___     srai rsd rsd 16
mv                    5rd*    5rs2*   ___     add rd zero rs2
add                   5rsd*   5rs2*   ___     add rsd rsd rs2
sub                   3rsd    3rs2    ___     sub rsd rsd rs2
neg                   5rsd*   ___     ___     sub rsd zero rsd
add.w                 3rsd    3rs2    ___     addw rsd rsd rs2
sub.w                 3rsd    3rs2    ___     subw rsd rsd rs2
and                   3rsd    3rs2    ___     and rsd rsd rs2
or                    3rsd    3rs2    ___     or rsd rsd rs2
xor                   3rsd    3rs2    ___     xor rsd rsd rs2
not                   3rd     3rs1    ___     xori rd rs1 -1
sll                   3rsd    3rs2    ___     sll rsd rsd rs2
srl                   3rsd    3rs2    ___     srl rsd rsd rs2
sra                   3rsd    3rs2    ___     sra rsd rsd rs2
sll.w                 3rsd    3rs2    ___     sllw rsd rsd rs2
srl.w                 3rsd    3rs2    ___     srlw rsd rsd rs2
sra.w                 3rsd    3rs2    ___     sraw rsd rsd rs2
slt                   3rsd    3rs2    ___     slt rsd rsd rs2
sltu                  3rsd    3rs2    ___     sltu rsd rsd rs2
seqz                  3rd     3rs1    ___     sltiu rd rs1 1
sltz                  3rd     3rs1    ___     slt rd rs1 zero
slez                  3rd     3rs1    ___     slti rd rs1 1
mult                  3rsd    3rs2    ___     mul rsd rsd rs2
div                   3rsd    3rs2    ___     div rsd rsd rs2
divu                  3rsd    3rs2    ___     divu rsd rsd rs2
rem                   3rsd    3rs2    ___     rem rsd rsd rs2
remu                  3rsd    3rs2    ___     remu rsd rsd rs2
multh                 3rsd    3rs2    ___     mulh rsd rsd rs2
multhu                3rsd    3rs2    ___     mulhu rsd rsd rs2
multhsu               3rsd    3rs2    ___     mulhsu rsd rsd rs2
mult.w                3rsd    3rs2    ___     mulw rsd rsd rs2
div.w                 3rsd    3rs2    ___     divw rsd rsd rs2
divu.w                3rsd    3rs2    ___     divuw rsd rsd rs2
rem.w                 3rsd    3rs2    ___     remw rsd rsd rs2
remu.w                3rsd    3rs2    ___     remuw rsd rsd rs2
beqz                  3rs1    ___     8imm*   beq rs1 zero imm<<1
bnez                  3rs1    ___     8imm*   bne rs1 zero imm<<1
j                     ___     ___     11imm*  jal zero imm<<1
jal                   ___     ___     11imm*  jal ra imm<<1
jalr_ra_ra            ___     ___     11imm*  jalr ra ra imm<<1
jalr                  3rd     3rs1    ___     jalr rd rs1 0x0
jr                    5rs1    ___     ___     jalr zero rs1 0x0
jalr_ra               5rs1    ___     ___     jalr ra rs1 0x0
ret                   ___     ___     ___     jalr zero ra 0x0 and pop the return stack
illegal               ___     ___     ___     illegal
uret                  ___     ___     ___     uret
sret                  ___     ___     ___     sret
hret                  ___     ___     ___     hret
mret                  ___     ___     ___     mret
sfence.vm_zero        ___     ___     ___     sfence.vm zero
sfence.vm_ra          ___     ___     ___     sfence.vm x1
csrrw_sp_mscratch_sp  ___     ___     ___     csrrw sp mscratch sp
csrrw_sp_hscratch_sp  ___     ___     ___     csrrw sp hscratch sp
csrrw_sp_sscratch_sp  ___     ___     ___     csrrw sp sscratch sp
csrrw_sp_uscratch_sp  ___     ___     ___     csrrw sp uscratch sp
ebreak                ___     ___     ___     ebreak
nop                   ___     ___     ___     nop
ecall                 ___     ___     ___     ecall
fence.i               ___     ___     ___     fence.i
wfi                   ___     ___     ___     wfi
rdcycle_ra            ___     ___     ___     csrr x1 rdcycle
rdinstret_ra          ___     ___     ___     csrr x1 rdinstret
rdtime_ra             ___     ___     ___     csrr x1 rdtime
fence.mem             ___     ___     4imm    fence --imm[3:2]-- imm[1:0]
fence.io              ___     ___     4imm    fence imm[3:2] -- imm[1:0]--
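The "every instruction translates to a single 32-bit instruction" property can be illustrated at the mnemonic level with a handful of rows from the table. This is only a sketch: the `expand` function and its keyword arguments are hypothetical, it works on already-decoded fields rather than on the bit encoding (which lives in the spreadsheet), and it assumes that `x` in `imm<<x` means log2 of the register width in bytes, which is what the `l[hwdq]` notation and the RV128 remark suggest.

```python
# Width suffix of the XLEN-sized load/store for each XLEN (the l[hwdq] rows).
WIDTH_SUFFIX = {16: "h", 32: "w", 64: "d", 128: "q"}


def expand(mnemonic, xlen=64, **f):
    """Return the base-ISA assembly for one condensed instruction.

    `f` holds the decoded fields (rd, rs1, rs2, rsd, imm) as named in the
    table; only a few representative rows are covered.
    """
    x = (xlen // 8).bit_length() - 1   # assumed meaning of x: log2(XLEN / 8)
    sfx = WIDTH_SUFFIX[xlen]
    rules = {
        "lxsp":  lambda: f"l{sfx} {f['rd']}, {f['imm'] << x}(sp)",
        "lwsp":  lambda: f"lw {f['rd']}, {f['imm'] << 2}(sp)",
        "lx":    lambda: f"l{sfx} {f['rd']}, {f['imm'] << x}({f['rs1']})",
        # In the table's naming, sx stores rs1 at an address based on rs2.
        "sx":    lambda: f"s{sfx} {f['rs1']}, {f['imm'] << x}({f['rs2']})",
        "add6i": lambda: f"addi {f['rsd']}, {f['rsd']}, {f['imm'] << 6}",
        "mv":    lambda: f"add {f['rd']}, zero, {f['rs2']}",
        "neg":   lambda: f"sub {f['rsd']}, zero, {f['rsd']}",
        "not":   lambda: f"xori {f['rd']}, {f['rs1']}, -1",
        "beqz":  lambda: f"beq {f['rs1']}, zero, {f['imm'] << 1}",
        "ret":   lambda: "jalr zero, ra, 0",
    }
    return rules[mnemonic]()


print(expand("lx", xlen=128, rd="a0", rs1="a1", imm=1))
print(expand("lx", xlen=32, rd="a0", rs1="a1", imm=1))
```

The same `lx` row yields `lq` with a 16-byte stride on RV128 and `lw` with a 4-byte stride on RV32, which is how the XLEN-scaled rows let RV128 and RV16 "drop out" without extra encodings.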
The Xcondensed-A extension could conceivably be reduced to just lr and sc, which is enough, for example, to run the musl libc. However, I don't know how that would work out with memory-mapped CSRs (if these are needed) and, as Andrew Waterman wrote, bit twiddling of device registers. It seemed more generally useful to have the aq.rl versions of amo.l, amo.s, amo.or, amo.and and amo.swap to "emulate" the CSR instructions with memory-mapped CSRs than to have specialised instructions that pass the CSR value in a register rather than as an immediate. Such specialised instructions would also violate the "every instruction maps to an instruction in the base 32-bit ISA" constraint. Having gone down that road, leaving out amo.add seemed illogical. Having these in both w and x = XLEN versions does add up a bit in terms of encoding space, but it fits.

https://docs.google.com/spreadsheets/d/1rray4sbhGarasDS6acnWyAlOjLvDqBXX3s1LrBLtFs8/edit?usp=sharing

Ciao
Rogier
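As a rough illustration of how AMOs can "emulate" the CSR instructions on memory-mapped CSRs, the sketch below models memory as a Python dict and maps csrrw, csrrs and csrrc onto swap, or and and. The helper names, the 64-bit XLEN and the address used for mscratch are all hypothetical; real AMOs are atomic, which a single-threaded model like this does not capture.

```python
XLEN_MASK = (1 << 64) - 1   # assumption: a 64-bit machine


def amoswap(mem, addr, value):
    """Read the old word, store the new one, return the old value."""
    old = mem[addr]
    mem[addr] = value & XLEN_MASK
    return old


def amoor(mem, addr, value):
    """Read the old word, store old | value, return the old value."""
    old = mem[addr]
    mem[addr] = (old | value) & XLEN_MASK
    return old


def amoand(mem, addr, value):
    """Read the old word, store old & value, return the old value."""
    old = mem[addr]
    mem[addr] = old & value & XLEN_MASK
    return old


def csrrw(mem, csr_addr, value):
    return amoswap(mem, csr_addr, value)             # read and write

def csrrs(mem, csr_addr, mask):
    return amoor(mem, csr_addr, mask)                # read and set bits

def csrrc(mem, csr_addr, mask):
    return amoand(mem, csr_addr, ~mask & XLEN_MASK)  # read and clear bits


MSCRATCH_ADDR = 0xFFFF_0340   # hypothetical memory-mapped location
mem = {MSCRATCH_ADDR: 0b1010}
old = csrrs(mem, MSCRATCH_ADDR, 0b0101)
print(bin(old), bin(mem[MSCRATCH_ADDR]))
```

The one wrinkle is csrrc: the mask has to be inverted in a register before the amo.and, so it costs one extra instruction compared with the immediate-carrying CSR form.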
Hi Rogier,

I thought the previous responses to your proposal got too bogged down in AMOs (when they were just one small element of your overall proposal).

Instead, I found it very useful to compare the major (5-bit) opcode map in your spreadsheet to the RV64GC opcode map in Table 14.3 of the User Level spec (RV v2.1).

What it shows is that your Xcondensed proposal contains a multiplicity of good ideas, each of which (in my opinion) should be considered individually for incorporation into the standard "C" extension. In fact, I hope RVC 1.9 won't get finalised too quickly and that more time is instead spent examining the alternative choices you outline in Xcondensed.

Here are the ideas I especially like, and these should be key decisions to discuss & debate:

1(a). Splitting the RVC "MISC-ALU" opcode into "ALU register" and "ALU-immediate" opcodes. The RVC "MISC-ALU" instruction format is quite messy, being a complex mix of immediate shifts and register-based arithmetic instructions, each group having a different instruction sub-format. Having "ALU" and "ALU-I" opcodes creates a much nicer and tidier instruction format organisation.

1(b). "Paying" for the new ALU-I opcode by downgrading "C.LUI" to a subopcode within "ALU-I". C.LUI is extremely rarely used, <0.1% of dynamic Linux or SPEC code according to Table 14.8 in the RV v2.1 spec. I'd much rather have "ALU-I" with all the extra orthogonality & simplicity it creates, with an 8-register version of LUI as one of the eight immediate ALU operations offered by ALU-I. This also more closely matches the uncompressed ISA's distinction between OP-32 and OP-32IMM.

2. Downgrading ADDI4SPN into an instruction within the ALU-I group, thereby freeing an extra major opcode. This major opcode was then used (notionally, in my comparison) to add back the C.JAL instruction that was present in RV32C but lost in RV64C. This improves the orthogonality of RV32C and RV64C, i.e. the only difference between these two instruction sets is then in the load/stores rather than in the branch instructions. C.JAL has slightly higher usage (0.59% vs 0.44%), so this is a choice that favours reduction of dynamic & static instruction count.

3. Rebalancing the opcodes allocated to floating point load/stores versus floating point ops. RV64GC has a massive 4 opcodes for C.FLD, C.FSD, C.FLDSP, C.FSDSP (versus none for FPU operations). This choice was justified by analysis of SPEC2006 code only (as Linux kernel code does not have any floating point at all). Even in SPEC2006, floating point is still only typically ~0.5% or less of instruction counts, although FLD is more common at 1.6% of dynamic instructions. Is it wise to make such an extreme decision (biased towards load/store vs ops) based purely on optimising for a single benchmark set? Admittedly, Xcondensed goes to the other extreme of downgrading FPU load/store to 1/4 of the opcode space of the "load"/"store" major opcodes, i.e. effectively 1/2 of one major opcode in total. It shows, though, that FPU ops can fit nicely into one major opcode, so maybe a better balance can be achieved with two FPU load/store opcodes and 1-2 major opcodes for FPU operations (I'd argue one opcode should be reserved for decimal FP).
On 13 Dec 2016, at 8:52 PM, Xan Phung <xan....@gmail.com> wrote:
> Anyway, congrats to the RISC V designers for their great progress to date! But I hope they will look more closely at some ideas from Rogier's Xcondensed instruction set in the spreadsheet link below.
--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+unsubscribe@groups.riscv.org.
To post to this group, send email to isa...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/isa-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/a2676393-2270-44b9-bb6a-d9692cfe914f%40groups.riscv.org.
Dear Xan,

thank you very much for taking this up and for your excellent comments on the Xcondensed proposal (aka proposal for Cv2.0). Sparking discussion in this direction was exactly what I hoped for.

I completely share your sentiment: the designers did a very good job, but 75% of the opcode space is a very valuable resource, and I believe there is room for improvement, especially in the direction of providing a fixed-width, very basic 16-bit general purpose ISA that doubles as code compression.

Because this thread has been renamed (thanks Xan!), let me for future reference link once more to the spreadsheet of the Xcondensed proposal that implements the points Xan made.

https://docs.google.com/spreadsheets/d/1rray4sbhGarasDS6acnWyAlOjLvDqBXX3s1LrBLtFs8/edit?usp=sharing
On Tuesday, 13 December 2016 08:52:29 UTC+1, Xan Phung wrote:

Hi (to anyone on RISC V Foundation):

Can anyone outline the process required for providing input into RISC V Foundation standards setting/review?

The fact (as pointed out by Rogier) that RV Compressed uses 3/4 of all opcode space was what made me think RVC v1.9 should have more extensive analysis before it gets "frozen" as a v2.0 spec...

Very close and intense review of RVC v1.9 is therefore warranted, perhaps even more than for uncompressed RV itself... not because the RVC designers haven't done great work (they have!) but because RVC2.0 will be the critical step that will lock away the vast portion (~90%) of the RISC V ISA that isn't yet frozen.

I believe the current RVC v1.9 is generally very high quality and the majority of it (but not all of it) is ready for "freezing". It is highly optimised along one dimension, i.e. from the perspective of providing compression for SPEC2006 code & Linux kernel code. However, I think there is a very strong case that it can be improved along other dimensions (e.g. robustness for general purpose computing & versatility across a broader range of use cases), **without** reducing the existing optimisation for SPEC2006/Linux.

(Wasn't a key lesson from the MIPS/SPARC era that over-optimising for a single implementation strategy, or targeting too narrow a set of use cases, is the reason for the mistakes made in those architectures?)

A key example is the "C.ADDI4SPN" instruction. On its own this instruction consumes ~2.5% of the entire RISC V opcode space, i.e. just 40 similar instructions would use up nearly the entire opcode space. It requires its own dedicated instruction format (not used by any other instruction). Yet ADDI4SPN caters for (in my opinion) a specialised use case that is not highly used. (Even in the benchmark code that is its optimisation target, it is only 0.07% of SPEC2006 dynamic count, and 0.05% of Linux dynamic count.)

In comparison, C.JAL has higher usage than C.ADDI4SPN in non-SPEC benchmarks, of up to 0.59%, yet gets left out of RV64C. Also, JAL has merit from the perspective of 16-bit instruction general purpose computing use cases (beyond just the compression use case). Alternatively, use the extra RVC opcode taken from ADDI4SPN to divide the C.MISC-ALU instructions amongst two opcodes, "C.ALU register" and "C.ALU immediate". This will tidy up what is currently a messy "mixed" opcode and create a much nicer & orthogonal set of 16-bit integer ALU instructions.

I also note that RVC v1.9 is only optimised for assembler-stage 32->16 bit compression, and there may be other compression opportunities if the whole compilation stage is redesigned for the 16-bit instructions. This may mean that a nice orthogonal Compressed instruction set makes a better target for compilers, and perhaps better compiler-stage register allocation may mean the 32-register forms of the "C" instructions (see C.LUI/C.ADDIW below) provide even less marginal benefit beyond what 8 registers can provide.

In addition to C.ADDI4SPN, there are also other lower-usage opcodes like C.LUI, C.ADDIW, and the floating point opcodes (up to 20% of the entire opcode space, for just load/store!), which I think should also be reviewed. (LUI/ADDIW can still be provided in an 8-register form within a new ALU-I opcode.)

Anyway, congrats to the RISC V designers for their great progress to date! But I hope they will look more closely at some ideas from Rogier's Xcondensed instruction set in the spreadsheet link below.
On Friday, 7 October 2016 03:44:53 UTC+11, Rogier Brussee wrote:
> This spreadsheet lists all instructions their encoding and how they map to 32 bit ISA.
> https://docs.google.com/spreadsheets/d/1rray4sbhGarasDS6acnWyAlOjLvDqBXX3s1LrBLtFs8/edit?usp=sharing
> I completely share your sentiment, the designers did a very good job, but 75% of the opcode space is a very valuable resource and I believe there is room for improvement especially in the direction of providing a fixed width very basic 16 bit general purpose ISA that doubles as code compression.

This is an interesting design exercise, but a functional 16-bit ISA is an explicit non-goal of the C extension. Xcondensed remains the right name for this effort, because it is outside the scope of the standard RISC-V ISA, which mandates the presence of the 32-bit base.
Dear Andrew,

thanks for your reaction.

[snip]

>> I completely share your sentiment, the designers did a very good job, but 75% of the opcode space is a very valuable resource and I believe there is room for improvement especially in the direction of providing a fixed width very basic 16 bit general purpose ISA that doubles as code compression.

> This is an interesting design exercise, but a functional 16-bit ISA is an explicit non-goal of the C extension. Xcondensed remains the right name for this effort, because it is outside the scope of the standard RISC-V ISA, which mandates the presence of the 32-bit base.

I know that a 16-bit ISA is a non-goal of the C extension, but as it is, the C extension makes it impossible to have something like Xcondensed as an extension of C, or even as something that can coexist with C. Something like an RV32EMXcondensed that only has a fixed-width 16-bit decoder seems a reasonable design point, but you seem to suggest that you (and/or the RISC-V Foundation) explicitly don't want this to be RISC-V but rather something that is at best RISC-V inspired. Is that a deliberate decision or just something that drops out of the Cv1.9 design?