There were some discussions on the ISA-Dev mailing list about DSP instructions recently. Drawing on our past experience in this area, Andes would like to contribute our >150 DSP and packed-SIMD instructions (collectively, the DSP ISA) developed for our AndeStar V3 architecture as a starting basis for the RISC-V “P” extension. Our DSP ISA uses only GPRs and targets high efficiency. It has been quite successful: it made the AndesCore D1088 our most popular core after its debut 3 years ago.
The proposed ISA covers not only SIMD computations but also many of the modes needed in traditional DSP computations, such as fixed-point arithmetic, saturation with an overflow flag, shift-and-round, accumulation to 64 bits (to guarantee precision), bit reversal, etc. Accumulate-to-64b instructions consume 4 reads/2 writes of GPRs. Implementations can split them into 2 cycles to reduce the port count if desired, which is still much better than the 12 cycles needed with the existing instructions, or the 5 cycles needed with the best instructions one can design with only 2R1W. The attached file shows the semantics of the 32-bit DSP ISA. We’re extending the ISA for RV64.
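For illustration, here is a minimal C sketch of two of the modes mentioned above: a packed 16-bit saturating add that sets a sticky overflow flag, and a multiply-accumulate into a 64-bit accumulator. The names `ov_flag`, `sat16`, `sat_add16x2` and `mac64` are hypothetical and are not taken from the attached spec; the exact Andes instruction semantics may differ.

```c
#include <stdint.h>

/* Hypothetical model state: a sticky overflow flag, set on any saturation. */
static int ov_flag = 0;

/* Saturate a 32-bit intermediate to the signed 16-bit range. */
static int16_t sat16(int32_t x)
{
    if (x > INT16_MAX) { ov_flag = 1; return INT16_MAX; }
    if (x < INT16_MIN) { ov_flag = 1; return INT16_MIN; }
    return (int16_t)x;
}

/* Packed saturating add: two independent 16-bit lanes in one 32-bit GPR. */
static uint32_t sat_add16x2(uint32_t a, uint32_t b)
{
    int16_t lo = sat16((int32_t)(int16_t)(a & 0xFFFFu) + (int16_t)(b & 0xFFFFu));
    int16_t hi = sat16((int32_t)(int16_t)(a >> 16) + (int16_t)(b >> 16));
    return ((uint32_t)(uint16_t)hi << 16) | (uint16_t)lo;
}

/* 32x32 multiply accumulated into 64 bits. The addition is done in
 * unsigned arithmetic so that accumulator wrap-around is well defined
 * in C; a real instruction might saturate instead (assumption). */
static int64_t mac64(int64_t acc, int32_t x, int32_t y)
{
    int64_t prod = (int64_t)x * (int64_t)y;
    return (int64_t)((uint64_t)acc + (uint64_t)prod);
}
```

On RV32 a 64-bit accumulator would presumably live in a GPR pair, which is one way to arrive at the 4-read/2-write figure: two reads for the accumulator pair, two for the multiplicands, and two writes for the updated pair.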
DSP and SIMD computations cover a wide range of applications, from audio/voice to image/video, deep learning, and more. One size doesn’t fit all. We’re aware that there are discussions about extensions for high-performance DSP computations. Our DSP ISA covers the high-efficiency end of the spectrum, which is demanded by many high-volume applications. In addition to the DSP ISA spec, Andes can contribute the corresponding compiler support as well.
Please feel free to drop us a line here, or come by for discussion at the upcoming workshop, where we’ll present the performance gains brought by the DSP ISA as part of our 12-minute presentation.
Chuanhua Chang
--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+unsubscribe@groups.riscv.org.
To post to this group, send email to isa...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/isa-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/07af1851-92fc-4a46-ad1e-9c51442c11df%40groups.riscv.org.
On Thu, Nov 23, 2017 at 4:06 PM, chuanhua.chang <chuanhu...@gmail.com> wrote:
--
Cool! Do you have RISC-V instruction encodings worked out for those? (They aren't in this document.)
Note that bit reversal (at least) is part of the bit manipulation extension.
chuanhua.chang wrote:
>
> There were some discussions in the ISA-Dev mailing list about DSP
> instructions recently. With the past experience in this area, Andes
> would like to contribute our >150 DSP and Packed SIMD instructions (or
> simply DSP ISA) developed in our AndeStar V3 architecture as a
> starting basis for the RISC-V “P” extension. Our DSP ISA uses only
> GPRs and targets for high efficiency. We have quite successful
> experience in the DSP ISA, which made the AndesCore D1088 our most
> popular core after its debut 3 years ago.
>
On this Thanksgiving holiday here in the US, I think I can speak for all
of us in saying "thank you" for this.
> The proposed ISA covers not only SIMD computations, but also a lot of
> modes needed in traditional DSP computations such as fixed-points,
> saturation with an overflow flag, shifting-and-round, accumulation to
> 64 bits (to guarantee the precision) , bit-reversal, etc.
> Accumulate-to-64b instructions consume 4 reads/2 writes of GPRs.
> Implementations can split them to 2 cycles to reduce the ports if
> desired, but still much better than 12 cycles using the existing
> instructions, or 5 cycles using the best instructions one can design
> with only 2R1W. The attached file shows the semantics of the 32-bit
> DSP ISA. We’re extending the ISA for RV64.
>
I think that we will probably need to break this up into several smaller
extensions, with some of them under a "DSP" umbrella and others in other
categories. Unfortunately, RVP is already specified in the current
draft ISA spec as using the FP register file, but this could be a good
starting point for a new fixed-point extension (possibly our first to
use 48-bit opcodes) using the integer register file.
The major concern I have with this as written is introducing an overflow
flag, since RISC-V explicitly eschews condition codes. I presume that
AndeStar V3 uses a condition code (and can probably branch on OV, much
like x86).
A wide-accumulate instruction will probably be appropriate, but could
also easily use "magic" CSRs: define user CSRs acc0v0, acc0v1, acc0v2,
acc0i0, acc0i1 with special write behavior: values written to acc0i0
are added to acc0v0 with carry into acc0v1 with second carry into
acc0v2, while values written to acc0i1 are added to acc0v1 with carry
into acc0v2. Additional accumulators can be acc1*, acc2*, etc.
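A minimal C model of the "magic" CSR write behavior described above, assuming XLEN=32 and treating all values as unsigned (the description does not say how signed inputs would be extended). The type `acc_t` and the function names are hypothetical.

```c
#include <stdint.h>

/* One wide accumulator: acc0v0 (low), acc0v1 (middle), acc0v2 (high). */
typedef struct { uint32_t v0, v1, v2; } acc_t;

/* Write to acc0i0: add into v0, carry into v1, second carry into v2. */
static void acc_write_i0(acc_t *a, uint32_t x)
{
    uint64_t s0 = (uint64_t)a->v0 + x;
    a->v0 = (uint32_t)s0;
    uint64_t s1 = (uint64_t)a->v1 + (s0 >> 32);
    a->v1 = (uint32_t)s1;
    a->v2 += (uint32_t)(s1 >> 32);
}

/* Write to acc0i1: add into v1, carry into v2. */
static void acc_write_i1(acc_t *a, uint32_t x)
{
    uint64_t s1 = (uint64_t)a->v1 + x;
    a->v1 = (uint32_t)s1;
    a->v2 += (uint32_t)(s1 >> 32);
}
```

Under this model, a 64-bit product could be accumulated with two CSR writes: the low word to acc0i0 and the high word to acc0i1.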
The zero-overhead loop mechanism is interesting and simple enough to
adapt readily, even though its current MTLBI/MTLEI instructions will not
fit in RISC-V. I suggest simplifying it to a counted implicit branch:
When loopcount is greater than 1 and the program counter is equal to
loopend, the instruction at loopend is executed, but the program counter
is loaded from loopstart instead of advancing normally and loopcount is
decremented by 1. No constraints are placed on the values in loopstart
and loopend. An ordinary instruction fetch fault will be raised if
execution is transferred to loopstart and that address is found to be
invalid. The loopstart and loopend CSRs have special behavior when
written: a sign-extended 12-bit value is multiplied by 4 and added to
the current program counter to produce the value actually stored in the
CSR. Other values are stored as given. This precludes loading
addresses directly that are very close to NULL or top-of-memory, but
neither of those pages should be holding zero-overhead loops and this
allows the loop control to be provided using ADDI x0/CSRW pairs instead
of needing new encoding. Assuming the namespace change goes through,
this could be "Zizoloop" or "Izoloop" easily.
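The counted implicit branch can be sketched as a small C model. This is only an illustration under stated assumptions: a 32-bit program counter, a straight-line loop body with no taken branches, and hypothetical names `zol_t`, `zol_advance` and `zol_csr_write`.

```c
#include <stdint.h>

/* Architectural state of the loop mechanism (hypothetical layout). */
typedef struct { uint32_t pc, loopstart, loopend, loopcount; } zol_t;

/* Called after the instruction at s->pc has executed; insn_len is its
 * size in bytes. Redirects to loopstart when the loop is still live. */
static void zol_advance(zol_t *s, uint32_t insn_len)
{
    if (s->loopcount > 1 && s->pc == s->loopend) {
        s->pc = s->loopstart;   /* implicit backward branch */
        s->loopcount -= 1;
    } else {
        s->pc += insn_len;      /* normal sequential advance */
    }
}

/* Value stored by a write to loopstart/loopend: anything that fits in a
 * sign-extended 12-bit field is taken as a pc-relative offset in units
 * of 4 bytes; any other value is stored as given. */
static uint32_t zol_csr_write(uint32_t pc, uint32_t value)
{
    int32_t sv = (int32_t)value;
    if (sv >= -2048 && sv <= 2047)
        return pc + (uint32_t)sv * 4u;
    return value;
}
```

With these rules a loopcount of N runs the body N times, and the last pass falls through past loopend with loopcount left at 1.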
will start linking in, tomorrow.
what do you think of the "CSR cross[32][6]" idea? sorry below may
not be exactly clear, it's basically a way to generalise all
cross-operations, even the SUNPKD810 rt, ra and ZUNPKD810 rt, ra would
reduce down to one instruction as opposed to 8 right now.
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68
On Fri, Apr 27, 2018 at 11:27 AM, Xan Phung <xan....@gmail.com> wrote:
> Can anyone on the RV Vector WG give more info on:
> 1. vrgather instruction
> 2. VOP: how many unused funct7 opcodes are there? (funct7=bits 31:25).
> 3. "mm" opcodes: are there two "mm" mask opcode bits in VOP (as per Roger
> Espasa's Nov 2017 presentation) or is there one predicate selection bit (as
> per current draft Vector spec)?
> 4. missing vsle instruction (see slide 5 of Roger Espasa's presentation): why
> is vsge (set greater than or equal) provided when it doesn't add any
> functionality not provided by vslt (set less-than), while vsle (set less than
> or equal) is missing from the instruction list?
> 5. vslide instruction
> 6. vextract & vpopc instructions
Luke Kenneth Casson Leighton wrote:
On Fri, Apr 27, 2018 at 11:27 AM, Xan Phung <xan....@gmail.com> wrote:
In relation to the Andes crossed SIMD instructions, I don't know enough
about the use cases of crossed SIMD arithmetic to comment in an informed way
on how they should be provided.
i'm currently looking at a way to implement these in simple-v using:
* xBitManip GREV followed by
* non-crossed SIMD op followed by
* xBitManip GREVI
GREVI is "GREV immediate", not "GREV inverse".
with optimisations being free to perform macro-op fusion on 2 (or even
3) of those. i'm not totally happy with that approach however as it
does rather overload the instruction cache (3 instructions instead of
1, even if the SIMD op itself can be Compressed).
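To make the GREV / non-crossed op / GREVI sequence concrete, here is a C sketch for a 32-bit register holding two 16-bit lanes. `grev32` follows the generalized-reverse definition from the bit manipulation draft; `add16x2` and `crossed_add16x2` are hypothetical helpers for illustration, not proposed instructions.

```c
#include <stdint.h>

/* Generalized reverse: each set bit of k swaps adjacent groups of
 * that size (1, 2, 4, 8 or 16 bits). */
static uint32_t grev32(uint32_t x, unsigned k)
{
    if (k & 1)  x = ((x & 0x55555555u) << 1)  | ((x & 0xAAAAAAAAu) >> 1);
    if (k & 2)  x = ((x & 0x33333333u) << 2)  | ((x & 0xCCCCCCCCu) >> 2);
    if (k & 4)  x = ((x & 0x0F0F0F0Fu) << 4)  | ((x & 0xF0F0F0F0u) >> 4);
    if (k & 8)  x = ((x & 0x00FF00FFu) << 8)  | ((x & 0xFF00FF00u) >> 8);
    if (k & 16) x = ((x & 0x0000FFFFu) << 16) | ((x & 0xFFFF0000u) >> 16);
    return x;
}

/* Non-crossed packed add: two wrapping 16-bit lanes. */
static uint32_t add16x2(uint32_t a, uint32_t b)
{
    uint32_t lo = (a + b) & 0x0000FFFFu;
    uint32_t hi = ((a >> 16) + (b >> 16)) << 16;
    return hi | lo;
}

/* Crossed add: hi = a.hi + b.lo, lo = a.lo + b.hi. The source swap is a
 * GREV by 16; a swapped destination would need one more GREV by 16. */
static uint32_t crossed_add16x2(uint32_t a, uint32_t b)
{
    return add16x2(a, grev32(b, 16));
}
```

The cost discussed above is visible here: each crossed operand or result adds one extra shuffle around the packed operation unless the implementation fuses them.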
i'm really quite concerned about O(N^6) SIMD instruction proliferation
[0], am having difficulty thinking how to do this.
l.
[0] 1: opcode 2: num-elements 3: el-bitwidth 4: src1 cross 5: src2
cross 6: dest cross
In the RVP partitioning scheme I propose these map to:
1: opcode (new opcodes for partitioned ops)
2 - 3: part CSR (implicitly defines both element count and bitwidth, but total width must equal XLEN)
4 - 6: use BitManip to shuffle inputs and outputs
-- Jacob
Hi Luke,
Thanks also for the Berkeley teaching materials on RVV - I guess these are now mostly out of date given the Barcelona changes.
I have also seen the Barcelona slides on the revised Vector proposal, and I notice that (in the base V ISA) they drop polymorphic vector instructions and instead provide multiple sets of FP and integer arithmetic instructions. An extension to V will then re-introduce polymorphic instructions.
Adding the Andes SIMD proposal, this means there will be at least 5 SIMD-like instruction sets in RISC-V:
(i) RVP for integer 8-bit types
(ii) RVP for integer 16-bit types
(iii) RVV Base for integer types
(iv) RVV Base for FP types (with H, S, D and Q subsets)
(v) RVV Extension for polymorphic types
The above seems to contradict the RISC-V philosophy, which had previously critiqued other architectures for their explosion of SIMD instruction counts.
The reason given in the Barcelona slides was "concern on total state needed to hold reg types".