Having read the specification, I am left wondering whether there is a way for the compiler or application to query the current memory hierarchy and use the result to control the non-temporal nature of the memory references. My reading of the spec says no. However, following the proposed hint encoding, one could put a bit pattern in x6 and use ADD x0, x0, x6 as a dynamic NT selection instruction.
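For anyone following along, here is a minimal sketch of how the encodings line up. It assumes the four hints are encoded as ADD x0, x0, x2 through x5, as in the review draft. The x6 entry is only the hypothetical dynamic selector suggested above and is not part of the spec, and the helper name `encode_add` is made up for illustration.

```python
# Sketch: encode the R-type ADD words used as NTL hints.
# R-type ADD has funct7 = 0, funct3 = 0, opcode = 0x33 (OP).
OPCODE_OP = 0x33


def encode_add(rd: int, rs1: int, rs2: int) -> int:
    """Encode ADD rd, rs1, rs2 as a 32-bit RISC-V instruction word."""
    for reg in (rd, rs1, rs2):
        if not 0 <= reg < 32:
            raise ValueError(f"register x{reg} out of range")
    return (rs2 << 20) | (rs1 << 15) | (rd << 7) | OPCODE_OP


# rs2 values of the four hints in the review draft.
NTL_HINTS = {"ntl.p1": 2, "ntl.pall": 3, "ntl.s1": 4, "ntl.all": 5}

# Hypothetical: the "dynamic" selector proposed in this message (not in the spec).
HYPOTHETICAL_DYNAMIC_RS2 = 6

if __name__ == "__main__":
    for name, rs2 in NTL_HINTS.items():
        word = encode_add(0, 0, rs2)
        print(f"{name:9s} add x0, x0, x{rs2} -> {word:#010x}")
    word = encode_add(0, 0, HYPOTHETICAL_DYNAMIC_RS2)
    print(f"(proposed) add x0, x0, x{HYPOTHETICAL_DYNAMIC_RS2} -> {word:#010x}")
```

Note that the x6 form differs from the defined hints only in the rs2 field, so a decoder would need to read x6 at the hint to select the behavior, which a pure hint otherwise never does.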
On Tuesday, February 14, 2023 at 5:02:35 PM UTC-6 Andrew Waterman wrote:
We are delighted to announce the start of the public review period for the proposed Fast-Track extension Zihintntl to the RISC-V ISA. This extension adds non-temporal locality hints, which affect the performance characteristics of memory-access instructions.
The review period begins today, February 14, 2023, and ends on March 31, 2023.
This extension is part of the Unprivileged Specification.
These extensions are described in the PDF spec available at https://drive.google.com/file/d/1QfGFllFivV1cVM899TCRMfpBNwhhZjWp/view?usp=share_link which was generated from the source available in the following GitHub repo: https://github.com/riscv/riscv-isa-manual
To respond to the public review, please either email comments to the public isa-dev mailing list or add issues and/or pull requests to the RISC-V ISA Manual GitHub repo, https://github.com/riscv/riscv-isa-manual. We welcome all input and appreciate your time and effort in helping us by reviewing the specification.
During the public review period, corrections, comments, and suggestions will be gathered for review by the Unprivileged Spec ISA Committee. Any minor corrections and/or uncontroversial changes will be incorporated into the specification. Any remaining issues or proposed changes will be addressed in the public review summary report. If there are no issues that require incompatible changes to the public review specification, the Unprivileged ISA Committee will recommend the updated specifications be approved and ratified by the RISC-V Technical Steering Committee and the RISC-V Board of Directors.
Thanks to all the contributors for all their hard work.
Andrew Waterman
Vice-Chair, Privileged ISA Committee
--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+u...@groups.riscv.org.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/d80acb79-6c97-4480-ad37-8eee45a56b52n%40groups.riscv.org.
Thinking about this, I suspect there's scope here for a (probably existing)
covert channel that can be made easier (i.e. become higher bandwidth) by
manipulating cache state.
It's not a reason for not having such a facility, but it probably is a reason
for having a standard way of making sure that higher access modes can
turn it off.
Paul
Well, the use as a hint would query nothing - it can't - but what it sounds like is that what he is proposing is the equivalent of a dynamic rounding mode, so it would select the cache level based on the contents of x6. I don't see a particular use case for that, really. I
We would prefer to see non-temporal accesses as their own instructions, rather than a hint in a two-instruction sequence.
This would apply equally to stores, loads, and prefetches.
I also wonder if it is wise to provide guidelines on portable usage. Given the breadth of RISC-V designs from microcontrollers to high-perf, those guidelines don’t seem generally applicable. Perhaps they belong in a RISC-V profile, but I would prefer omitting them entirely since performance portability is difficult to achieve in practice.
As an alternative, we could consider specifying a data set size rather than a target cache level. That would resolve issues with implementations having vastly different memory hierarchies and leave it up to a design to decide what to do. For example, a non-temporal access could encode a data set size as a power of two between, say, 16 KB and 1 GB.
-Derek Hower
Qualcomm
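To make the data-set-size idea concrete, here is a rough sketch of one possible field encoding. It reads the range above as 16 KiB to 1 GiB, which gives 17 power-of-two values and fits in 5 bits. The field layout, the round-up-and-clamp policy, and the function names are all assumptions for illustration; nothing like this is in the proposal.

```python
# Sketch: a hypothetical "data set size" field stored as log2(size) - 14.
MIN_LOG2 = 14  # 16 KiB = 2**14 bytes
MAX_LOG2 = 30  # 1 GiB  = 2**30 bytes
FIELD_MAX = MAX_LOG2 - MIN_LOG2  # 16, so the field needs 5 bits


def encode_dataset_size(size_bytes: int) -> int:
    """Round the size up to a power of two, clamp it, and return the field."""
    if size_bytes <= 0:
        raise ValueError("size must be positive")
    # Smallest exponent e such that 2**e >= size_bytes.
    log2_ceil = (size_bytes - 1).bit_length()
    clamped = min(max(log2_ceil, MIN_LOG2), MAX_LOG2)
    return clamped - MIN_LOG2


def decode_dataset_size(field: int) -> int:
    """Return the data set size in bytes represented by a field value."""
    if not 0 <= field <= FIELD_MAX:
        raise ValueError("field out of range")
    return 1 << (field + MIN_LOG2)


if __name__ == "__main__":
    for size in (4096, 16 * 1024, 100_000, 1 << 30, 1 << 40):
        field = encode_dataset_size(size)
        print(f"{size:>15d} bytes -> field {field:2d} "
              f"-> {decode_dataset_size(field)} bytes")
```

An implementation would then compare the decoded size against its own cache capacities to decide where, if anywhere, to allocate the line.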
+ Of course, these problems can be solved with micro-architecture complexity, but why add complexity when the problem is simple to solve in the ISA?
Another difference compared to other documented suggestions for macro fusion is that the Zihintntl fusion is essentially mandated. If you don’t fuse, you aren’t actually implementing the Zihintntl extension.
-Derek
From: Allen Baum <allen...@esperantotech.com>
Sent: Thursday, February 23, 2023 10:35 AM
To: Derek Hower <dho...@qti.qualcomm.com>
Cc: RISC-V ISA Dev <isa...@groups.riscv.org>; kr...@sifive.com <kr...@sifive.com>
Subject: Re: [isa-dev] Public review of Fast Track extension Zihintntl
We allow implementations to say they support Zihintntl even if they
handle as NOP. These are hints, and implementations are free to
ignore them. While some folk might believe we should have a stronger
requirement before allowing an implementation to claim they've
implemented Zihintntl, this doesn't make sense given that it is only a
performance tweak and there is a huge diversity of uarchs that vary
widely in performance for many other reasons.
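As a toy illustration of the point above, the sketch below models two front ends over the same instruction stream: one drops the hints as NOPs, the other attaches each hint to the immediately following instruction when that instruction is a memory access. The instruction stream is just a list of mnemonics, and the function name `annotate` and the small `MEMORY_OPS` subset are invented for this example; it is not a model of any real pipeline.

```python
# Sketch: treating NTL hints as NOPs vs. pairing them with the next instruction.
from typing import Iterable, List, Optional, Tuple

NTL_HINTS = {"ntl.p1", "ntl.pall", "ntl.s1", "ntl.all"}
MEMORY_OPS = {"lw", "sw", "ld", "sd"}  # small illustrative subset


def annotate(stream: Iterable[str],
             honor_hints: bool) -> List[Tuple[str, Optional[str]]]:
    """Return (mnemonic, attached_hint) for every non-hint instruction."""
    out: List[Tuple[str, Optional[str]]] = []
    pending: Optional[str] = None
    for insn in stream:
        if insn in NTL_HINTS:
            # A NOP-style implementation simply discards the hint here.
            pending = insn if honor_hints else None
            continue
        hint = pending if insn in MEMORY_OPS else None
        out.append((insn, hint))
        # The hint covers only the immediately following instruction.
        pending = None
    return out


if __name__ == "__main__":
    program = ["ntl.p1", "lw", "add", "ntl.all", "add", "sw"]
    for honor in (False, True):
        print(f"honor_hints={honor}: {annotate(program, honor)}")
```

Both variants retire the same non-hint instructions in the same order; only the locality annotation differs, which is why ignoring the hint cannot change functional behavior.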
On Feb 27, 2023, at 11:24 AM, Derek Hower <dho...@qti.qualcomm.com> wrote:
If NOP is a valid implementation, then I presume every existing and future RISC-V implementation automatically supports Zihintntl, since the opcode is already defined as a NOP. While that won't break any code from a functional standpoint, it certainly makes it confusing from a performance standpoint.
Given how implementation-dependent this is, why standardize the hint at all?
In legacy ISAs, it makes sense to have standard non-temporal operations since custom instructions aren't an option. At the specification level, it's understood that the effects of legacy non-temporal operations will vary widely, but at the practical level there is a specific expectation about what will happen. For example, in the use case you mentioned earlier about an optimized BLAS library, software is written with the knowledge of how a non-temporal operation is implemented even though it is using a standard encoding with vague behavior.
If, like RISC-V, these legacy ISAs could be extended, would we still have standard non-temporal operations? Custom instructions will be a better match to the specific uarch since they can match the actual memory hierarchy, and they are a fine solution for implementation-specific code. Custom instructions can't be used in generic "performance portable" software, but it's arguable whether such a thing exists anyway.