The User mode CSRs mentioned in the privileged ISA spec version 1.9.1


Akshay Dalvi

Apr 24, 2017, 4:02:45 AM
to RISC-V ISA Dev
Hello Everyone!!

I was going through the RISC-V privileged ISA spec version 1.9.1, which lists more CSRs (ustatus, uie, utvec, and others) than the older v1.7 spec. I have the following queries about these:
  • The fields of the Machine and Supervisor mode CSRs are explained in the spec, but I did not see any explanation of the User mode CSRs. Some User mode CSRs are read-write (URW) while others are read-only (URO). Is there another document that describes what these registers do? Please let me know if I have missed something obvious, as I am more of a hardware guy than an ISA person.
  • Along the same lines, can I assume that the bit fields of ustatus are the same as those of mstatus, and so on?
Please let me know how to go about this.

Thanks!!

Stefan O'Rear

Apr 24, 2017, 12:26:05 PM
to Akshay Dalvi, RISC-V ISA Dev
Those are placeholders for a feature that has not yet been defined.
No more and no less.

-s

Sean Halle

Apr 24, 2017, 6:03:37 PM
to Stefan O'Rear, Akshay Dalvi, RISC-V ISA Dev

From a hardware standpoint, we have had growing concern about the accumulation of complexity in the CSRs.  We are creating an ultra simple high performance compute engine, which we are stripping of as much complexity as possible, while maintaining base OS support.  We will not be implementing the full set of defined CSRs.  This means that if all CSRs are part of the base specification, then we will be non-compliant.

What I propose is to move the non-critical CSRs into an extension.  There appear to be only 4 or 5 CSRs absolutely necessary for Linux support.  Others are helpful, and yet more are simply nice to have.  In order to avoid having a large number of chips out there that don't comply with the standard, I propose to make a base standard that even ultra stripped down compute engines like ours will be compliant with.

The implication is that at least two different versions of Linux will be needed -- one for the stripped down base ISA, which is missing many features and will be lower performance, and a second version of Linux, which takes advantage of all the additional CSRs.  We are fine with having only the high performance version of Linux as the official distribution, as we plan to modify the kernel ourselves, to make it compatible with our lower level of hardware support.

What do you think about that?

Sean
Intensivate




CONFIDENTIALITY NOTICEThis e-mail and its attachments contain information which is confidential. It's use is strictly limited to the stated intent of the provider and is only for the intended recipient(s). If the reader of this e-mail is not the intended recipient, or the employee, agent or representative responsible for delivering the e-mail to the intended recipient, you are hereby notified that any dissemination, distribution, copying or other use of this e-mail is strictly prohibited. If you have received this e-mail in error, please reply immediately to the sender. Thank you.




--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+unsubscribe@groups.riscv.org.
To post to this group, send email to isa...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/isa-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/CADJ6UvNoYgpbiYeJzni88qagNjPe0gQ57qdKAFtHBtFTVA_22Q%40mail.gmail.com.

Stefan O'Rear

Apr 25, 2017, 12:30:15 AM
to Sean Halle, Akshay Dalvi, RISC-V ISA Dev
On Mon, Apr 24, 2017 at 2:52 PM, Sean Halle <sean...@intensivate.com> wrote:
>
> From a hardware standpoint, we have had growing concern about the
> accumulation of complexity in the CSRs. We are creating an ultra simple
> high performance compute engine, which we are stripping of as much
> complexity as possible, while maintaining base OS support. We will not be
> implementing the full set of defined CSRs. This means that if all CSRs are
> part of the base specification, then we will be non-compliant.
>
> What I propose is to move the non-critical CSRs into an extension. There
> appear to be only 4 or 5 CSRs absolutely necessary for Linux support.
> Others are helpful, and yet more are simply nice to have. In order to avoid
> having a large number of chips out there that don't comply with the
> standard, I propose to make a base standard that even ultra stripped down
> compute engines like ours will be compliant with.

Help me understand the part that's confusing to you.

Very few of the CSR numbers, and none of the ones that have been
allocated post-1.7, are mandatory.

The names that Akshay Dalvi is asking about are already part of an
extension, namely "N", which Linux will never use (it's incompatible
with S).

> The implication is that at least two different versions of Linux will be
> needed -- one for the stripped down base ISA, which is missing many features
> and will be lower performance, and a second version of Linux, which takes
> advantage of all the additional CSRs. We are fine with having only the high
> performance version of Linux as the official distribution, as we plan to
> modify the kernel ourselves, to make it compatible with our lower level of
> hardware support.
>
> What do you think about that?

That's exactly what we already have.

-s

Akshay Dalvi

Apr 25, 2017, 2:13:59 AM
to RISC-V ISA Dev, xha...@gmail.com
Hi Stefan,

Thanks for the info. Since those are placeholders, I take it that it is fine to consider only the 6 user mode CSRs mentioned in the RISC-V v1.7 spec (cycle, time, and instret, plus their upper 32-bit equivalents), together with the mandatory Machine mode CSRs, when looking at the v1.9.1 privileged spec. I think that resolves my doubt.

Thanks!

Akshay Dalvi

Apr 25, 2017, 2:15:26 AM
to RISC-V ISA Dev, sor...@gmail.com, xha...@gmail.com, sean...@intensivate.com
Hi Sean,

What you proposed is indeed a good idea; I think it would be worthwhile to do it that way. Thanks!!

Sean Halle

Apr 25, 2017, 5:36:49 AM
to Stefan O'Rear, RISC-V ISA Dev

Hi Stefan,

   Yes we talked about that before, but upon looking again at the spec: https://riscv.org/specifications/privileged-isa/

It wasn't clear that there are more than just a few optional CSRs, only optional privilege levels. For example, M mode is stated as required by all RISC-V implementations, and M-mode CSRs are listed in table 2.5 and table 2.6. There are 49 of them. The tables don't indicate that any are optional. The text describes them on pages 17 through 36, and the only mention of "optional" that I found was with regard to privilege level, although it does imply that a few CSRs might be optional, such as the base and bound CSRs. Quite a few of these M mode CSRs seem to be nice-to-haves, such as vendor ID, time related (mtime), performance counters, and so forth. They do not appear to be critical to correct functional operation. As far as I saw, the spec didn't mention whether these nice-to-haves were optional or not.

In our chip we do want S level, in order to support Linux.  The S mode CSRs are leaner, but do still include artifacts such as the delegation registers, which would not be needed in a stripped down implementation, but are rather in the spec because of the other trap levels that _might_ be implemented (namely U level trap handlers and H level handlers).

We also need the floating point CSRs from the U level, but table 2.2 doesn't indicate which of the U level CSRs are optional and which are required. Page 9 says "The timers, counters, and floating-point CSRs are the only standard user-level CSRs currently defined.", implying that all those types of CSRs are standard.

Honestly, it feels like feature creep.  Many of these functions don't feel like they belong in a one-size-fits-all spec, such as the performance counters and the various clocks, or even the vendor and architecture IDs.  Those just aren't critical things.  Intel did place all this functionality into their core ISA, but it doesn't seem like we should take that as our gold standard of how we want to be.  The plethora of different hardware vendors implementing RISC-V means we have a different set of constraints.  The current approach just feels as though it bakes in the complexity required for full hypervisor support, causing features for hypervisor to leak into lesser modes.  It feels as though it has made choices based on a "standard" design, and imposed those choices on all designs.  In a typical high performance processor, the size of this set of CSRs would likely be insignificant compared to the rest of the logic.  But that's not the case for our design.  Not to mention the resources consumed by the added complexity when implementing, and verifying the implementation of, the spec.

It seems as though we can come up with a cleaner decomposition, which imposes fewer things.  We should be able to find a way that leaves more freedom, and imposes less of a tax for deviating from what are standard implementations these days.

If what you say is true, and most CSRs are, indeed, optional, is there a different document somewhere that says which are which?

Even if many do turn out to be optional, the current way of doing them is tied in with the Linux kernel implementation and with standard tools, which may be implemented with the expectation of optional CSRs being present. We should make this explicit, so that there is a clear indication of which subset of the CSRs a particular compiler or particular C library assumes are present. I did not see anything in the spec document indicated above that would cover this. To be clear, what I mean is the analog of the letters that indicate ISA extensions: have a set of letters for CSR subsets, so that a version of the Linux kernel states the letters for the CSRs it expects (e.g., maybe that kernel uses performance counters), a version of GCC indicates which CSR letters it expects (it might target particular time related CSRs), and so forth.

Best,

Sean
Intensivate





Andrew Waterman

Apr 25, 2017, 3:03:47 PM
to sean...@intensivate.com, Stefan O'Rear, RISC-V ISA Dev
On Tue, Apr 25, 2017 at 2:35 AM, Sean Halle <sean...@intensivate.com> wrote:
>
> Hi Stefan,
>
> Yes we talked about that before, but upon looking again at the spec:
> https://riscv.org/specifications/privileged-isa/
>
> It wasn't clear that there are (more than just a few) optional CSRs, only
> optional privilege levels.. For example, M mode is stated as required by
> all RISC-V implementations, and M-mode CSRs are listed in table 2.5 and
> table 2.6. There are 49 of them. The tables don't indicate that any are
> optional. The text describes them in pages 17 through 36 and the only
> mention of "optional" that I found was in regards to privilege level,
> although it does imply that a few CSRs might be optional, such as the base
> and bound CSRs. Quite a few of these M mode CSRs seem to be nice-to-haves,
> such as vendor ID, time related (mtime), performance counters, and so forth.
> They do not appear to be critical to correct functional operation. The spec
> didn't mention whether these nice-to-haves were optional or not, that I
> saw..

All of the CSRs I'd classify as "nice-to-have" are already optional,
e.g., the performance counters, mvendorid, marchid, mimpid, and
possibly mhartid can be hard-wired to zero. (The cost of providing
them and supplying the value 0 vs. generating an illegal instruction
trap is a wash.)

>
> In our chip we do want S level, in order to support Linux. The S mode CSRs
> are leaner, but do still include artifacts such as the delegation registers,
> which would not be needed in a stripped down implementation, but are rather
> in the spec because of the other trap levels that _might_ be implemented
> (namely U level trap handlers and H level handlers).

The delegation registers don't exist if U-mode traps are not implemented.

>
> We also need the floating point CSRs from the U level.. but table 2.2
> doesn't indicate whether some of the U level CSRs are optional vs some
> required.. page 9 says "The timers, counters, and floating-point CSRs are
> the only standard user-level CSRs currently defined. " implying that all
> those types of CSRs are standard.
>
> Honestly, it feels like feature creep. Many of these functions don't feel
> like they belong in a one-size-fits-all spec, such as the performance
> counters and the various clocks, or even the vendor and architecture IDs.
> Those just aren't critical things. Intel did place all this functionality
> into their core ISA, but it doesn't seem like we should take that as our
> gold standard of how we want to be. The plethora of different hardware
> vendors implementing RISC-V means we have a different set of constraints.
> The current approach just feels as though it bakes in the complexity
> required for full hypervisor support, causing features for hypervisor to
> leak into lesser modes. It feels as though it has made choices based on a
> "standard" design, and imposed those choices on all designs. In a typical
> high performance processor, the size of this set of CSRs would likely be
> insignificant compared to the rest of the logic. But that's not the case
> for our design. Not to mention the resources consumed by the added
> complexity when implementing, and verifying the implementation of, the spec.

Our experience implementing area-efficient processors contradicts this
observation. These additional CSRs are either optional or hard-wired
to zero, and as mentioned above, the latter doesn't necessarily imply
any additional hardware cost.

The problem seems to me to be that a description of the minimal legal
implementation is not written down in one place. We plan to resolve
this by writing up a set of "platform profiles," and we will include a
minimal S-mode profile in that set.

Jacob Bachmeyer

Apr 25, 2017, 9:41:25 PM
to Stefan O'Rear, Sean Halle, Akshay Dalvi, RISC-V ISA Dev
Stefan O'Rear wrote:
> On Mon, Apr 24, 2017 at 2:52 PM, Sean Halle <sean...@intensivate.com> wrote:
>
>> From a hardware standpoint, we have had growing concern about the
>> accumulation of complexity in the CSRs. We are creating an ultra simple
>> high performance compute engine, which we are stripping of as much
>> complexity as possible, while maintaining base OS support. We will not be
>> implementing the full set of defined CSRs. This means that if all CSRs are
>> part of the base specification, then we will be non-compliant.
>>
>> What I propose is to move the non-critical CSRs into an extension. There
>> appear to be only 4 or 5 CSRs absolutely necessary for Linux support.
>> Others are helpful, and yet more are simply nice to have. In order to avoid
>> having a large number of chips out there that don't comply with the
>> standard, I propose to make a base standard that even ultra stripped down
>> compute engines like ours will be compliant with.
>>
>
> Help me understand the part that's confusing to you.
>
> Very few of the CSR numbers, and none of the ones that have been
> allocated post-1.7, are mandatory.
>
> The names that Akshay Dalvi are asking about, are already part of an
> extension, namely "N", which Linux will never use (it's incompatible
> with S).
>

What? Why would user-mode interrupts be incompatible with a
supervisor? Especially when we already have separate cause codes for
interrupts targeted at each privilege level? And user-mode interrupts
are very helpful for implementing user-mode device drivers with minimal
overhead? And why would sideleg exist if N and S are incompatible?

Incompatibility of N and S must be an erroneous statement?

-- Jacob

Sean Halle

Apr 26, 2017, 3:16:40 AM
to Andrew Waterman, RISC-V ISA Dev

Thank you Andrew, for clarifying.  I like the idea of platform profiles.  One way that they could be very valuable is for quickly identifying a fit between binaries and hardware.  For example, if a binary expects particular CSRs to be implemented, then tagging the binary with the profile, or CSR "sets", would allow a quick check against the CSR profile or sets, that a given hardware platform has implemented, and so verify compatibility between them.

To clarify, a hypothetical example could be that libc expects particular CSRs to be present, and bugs may manifest if those CSRs are implemented as all zeros -- I'm thinking about <time.h>. Or perhaps an advanced compiler inserted an auto-tuner into a binary. The performance would be atrocious if the performance counter CSRs were not in place. And so on. If the binary were somehow tagged with the fact that it expects the performance CSRs to be implemented, then an end-user could quickly determine whether this binary is compatible with their machine.

For these purposes, it might be useful to divide the CSRs up into groups, and assign a letter to each group.  And it may be helpful to modify the ELF format, to include a list of expected CSR groups.  Together with these, add a new CSR, whose contents indicate which CSR groups are implemented in the hardware.  This would allow automation of the compatibility check.
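
To make the proposal concrete, the check could reduce to a bitmask comparison: one mask taken from the binary (for example from an ELF note) and one read from the proposed "implemented groups" CSR. The following Python sketch is purely illustrative. The group letters, their bit positions, and the function names are hypothetical; no such letters, ELF field, or CSR are defined by any RISC-V spec.

```python
# Hypothetical CSR group letters and bit positions (not defined by any spec):
# T = timers, P = performance counters, I = ID registers, F = floating point.
CSR_GROUP_BITS = {"T": 0, "P": 1, "I": 2, "F": 3}


def groups_to_mask(letters):
    """Turn a string of group letters, e.g. "TP", into a bitmask."""
    mask = 0
    for letter in letters:
        mask |= 1 << CSR_GROUP_BITS[letter]
    return mask


def missing_groups(required_mask, implemented_mask):
    """Return the letters the binary requires but the hardware lacks."""
    missing = required_mask & ~implemented_mask
    return sorted(
        letter for letter, bit in CSR_GROUP_BITS.items() if missing & (1 << bit)
    )


def is_compatible(required_mask, implemented_mask):
    """True if every required group is implemented."""
    return not missing_groups(required_mask, implemented_mask)
```

A download tool could run `missing_groups(groups_to_mask("TP"), groups_to_mask("TF"))` before fetching a binary and report that group "P" is unavailable on the target.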

Best,

Sean
Intensivate


Stefan O'Rear

Apr 26, 2017, 3:28:57 AM
to Sean Halle, Andrew Waterman, RISC-V ISA Dev
On Wed, Apr 26, 2017 at 12:15 AM, Sean Halle <sean...@intensivate.com> wrote:
> implemented as all zeros -- I'm thinking about <time.h>. Or, perhaps an
> advanced compiler inserted an auto-tuner into a binary. The performance
> would be atrocious if the performance counter CSRs were not in place. And

I'm not sure it's currently _possible_ to rely on performance
counters, because the hpmevent* values are (a) undocumented (b) differ
between implementations.

> For these purposes, it might be useful to divide the CSRs up into groups,
> and assign a letter to each group. And it may be helpful to modify the ELF
> format, to include a list of expected CSR groups. Together with these, add
> a new CSR, whose contents indicate which CSR groups are implemented in the
> hardware. This would allow automation of the compatibility check.

The profiles stuff is for firmware and kernel images, which frequently
don't go through a full (or any) ELF loader.

-s

Andrew Waterman

Apr 26, 2017, 3:52:29 AM
to Sean Halle, RISC-V ISA Dev
On Wed, Apr 26, 2017 at 12:15 AM, Sean Halle <sean...@intensivate.com> wrote:
>
> Thank you Andrew, for clarifying. I like the idea of platform profiles.
> One way that they could be very valuable is for quickly identifying a fit
> between binaries and hardware. For example, if a binary expects particular
> CSRs to be implemented, then tagging the binary with the profile, or CSR
> "sets", would allow a quick check against the CSR profile or sets, that a
> given hardware platform has implemented, and so verify compatibility between
> them.

I think the software story will necessarily vary by privilege mode.

There's no leeway to remove standard U-mode CSRs from the perspective
of U-mode programs, as their presence is mandated by the ABI. But
simpler hardware implementations may defer their implementation to
more-privileged software (e.g., our implementations rely upon M-mode
software to emulate access to the TIME CSR).

S-mode software can probe the existence of CSRs by relying on
illegal-instruction trap behavior, so it is inherently more tolerant
to certain CSRs not being provisioned. (Even so, the SBI may mandate
that certain S-mode CSRs appear to exist, in which case M-mode
illegal-instruction trap handlers can emulate their behavior.)

Most M-mode software will have a complete understanding of the CSRs
actually implemented, and will statically avoid accessing the
nonexistent ones.
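
The probing approach described above for S-mode software can be modelled in a few lines. This is a toy Python sketch, not kernel code: a Python exception stands in for the illegal-instruction trap, and `ToyHart`, `probe_csr`, and the choice of CSRs are assumptions made for illustration (0x100 and 0x103 are the sstatus and sideleg addresses in the privileged spec).

```python
class IllegalInstruction(Exception):
    """Stands in for the illegal-instruction trap on an unimplemented CSR."""


class ToyHart:
    """Hypothetical hart model: a dict of the CSR addresses it implements."""

    def __init__(self, implemented):
        self.csrs = dict(implemented)  # CSR address -> current value

    def csr_read(self, addr):
        if addr not in self.csrs:
            raise IllegalInstruction(hex(addr))
        return self.csrs[addr]


def probe_csr(hart, addr):
    """Return True if reading the CSR completes without trapping."""
    try:
        hart.csr_read(addr)
    except IllegalInstruction:
        return False
    return True


CSR_SSTATUS = 0x100
CSR_SIDELEG = 0x103

# A stripped-down hart with sstatus but without the delegation registers.
hart = ToyHart({CSR_SSTATUS: 0})
available = {name: probe_csr(hart, addr)
             for name, addr in [("sstatus", CSR_SSTATUS), ("sideleg", CSR_SIDELEG)]}
```

In real S-mode code the same idea would be a temporary trap handler installed around a single CSR read, with the handler recording that the trap happened and skipping the faulting instruction.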

Ray Van De Walker

Apr 26, 2017, 1:56:53 PM
to RISC-V ISA Dev
Maybe I'm missing something, but I thought this was already settled?
CSRs should be a side-show. According to the ISA specs I've read,
the crucial items to standardize are the ABI software for each privilege level.
This approach lets vendors do clever hardware, yet keep high levels of software standardization.
Vendors will write the ABIs as part of the applications effort. Ported software should call ABIs.

That said, platform profiles make a great deal of sense.
Vendors should publish them, to support their ABI libraries.
Maybe the RISC-V seal-of-approval could require this...

So does a minimum required set of CSRs for each of the major privilege levels in the ISA spec(s).
The RISC-V specs should have this. The criteria (I thought): What CSRs and system features
can't be hidden by an ABI?


Sean Halle

Apr 26, 2017, 9:11:04 PM
to Andrew Waterman, RISC-V ISA Dev

Thanks Andrew.

If I could verify my understanding: the statement "There's no leeway to remove standard U-mode CSRs from the perspective of U-mode programs, as their presence is mandated by the ABI." seems to imply that all U mode CSRs are, indeed, required and none are optional, but you implement some in such a way that when they are accessed they trap to firmware or to the OS, which then implements their behavior in software. Did I understand that correctly?

Does this imply that there is effectively no hardware cost to the non-implemented CSRs, because they just generate a trap? Does that imply that _all_ CSRs that are optional can be successfully handled this way? That may not be possible. Consider, for example, implementing performance counters via traps: performance counters are part of U mode CSRs, as far as the spec seems to indicate.

There may be some unwitting contradiction in play: on the one hand "presence is mandated by the ABI", on the other hand "most CSRs are nice-to-haves and are optional", on a third hand "defer implementation to software", and on a fourth hand "implement CSR by wiring it to zero". These statements, from various emails, seem to contradict each other. For example, in one place there is mention of implementing non-critical CSRs by tying bits to zero, but in other places there is mention of trapping and implementing by software. Perhaps there is more information behind the statements that hasn't been shared yet?

What I'm hoping for is a clear vision of how all these requirements can all be simultaneously met:
1] Presence of U mode CSRs is required
2] Stripped down compute engines need total CSR area to be a fraction of the area of the scalar register file  (say 1/4 the area)
- -] CSR state is typically implemented as flip-flops (much larger area per bit of state), making this more difficult
3] Performance counters likely cannot be implemented by software, yet are mandated because they are part of U mode CSRs
- -] (also, other CSRs may similarly be barred from being implemented by software)
4] Software -- both user level and OS -- will, over time, come to expect particular CSRs to be functioning
- -] If hardware has not implemented the expected CSRs, or has done so in a crippled way, then the software will have consequences
5] Many different kinds of hardware need to be supported
- -] The approach to CSRs will be different for different categories of hardware -- micro-controllers will likely just not implement any CSRs at all, not even tying them to 0 -- stripped down compute engines will only implement the ones critical to OS functionality (it looks like roughly 6 or 7 total) -- full scale server cores will implement many, or perhaps all, directly.

We also need to satisfy these at the same time:
6] Before downloading a particular binary, a means is needed to check its compatibility with the hardware -- not just functional, but performance related (i.e., software CSRs may be unacceptable to some binaries)
- -] Checking at run time, for which CSRs are implemented in each particular fashion is too late -- the binary has already been selected and installed
7] Coders are often poorly informed about hardware features, and make bad assumptions, and bake those into their code

The main point for me, personally, is the implications for software: programmers often don't spend the time to understand the hardware subtleties. If they see something in the spec, they quickly jump to an intuition about it and go with that. Hence, a warning that CSRs may not be fully implemented won't sink in for many coders. We can feel fairly certain that popular software packages will misunderstand, assume that all CSRs are fully implemented, and make their code rely on the presence of all CSRs, despite the fine print embedded in the spec that warns against this.

My proposal is designed to make it obvious to coders who are in a hurry that some CSRs are optional, and to provide a simple _automated_ way to match binaries with hardware. Tools need to be involved in the process, so that the tool creators are the ones who read and understand the spec, and then add checks to the tools that inform the coders. More importantly, the tools need to automate the matching of binaries to hardware. Think about downloading on the web: we need the tools to know, before download, which binaries are compatible with the target processor. To accomplish this, it isn't enough for certain things to be _possible_. Instead, a system needs to be in place that makes the default way of doing things a good way of doing things. Unless the path of least resistance is one that has good outcomes, software will accumulate worst case behaviors, which will become the norm. That creates de facto standards that we didn't intend. Then hardware has to implement things that are harmful, even though they are optional, in order to support the bad habits baked into popular software. This kind of legacy issue has a long history of repeating itself. I'm proposing that we do something to head off the problem before it arises.

Best,

Sean
Intensivate








Stefan O'Rear

Apr 26, 2017, 9:46:34 PM
to Sean Halle, Andrew Waterman, RISC-V ISA Dev
On Wed, Apr 26, 2017 at 6:10 PM, Sean Halle <sean...@intensivate.com> wrote:
>
> Thanks Andrew.
>
> If I could verify my understanding, the statement "There's no leeway to
> remove standard U-mode CSRs from the perspective of U-mode programs, as
> their presence is mandated by the ABI." Seems to imply that all U mode CSRs
> are, indeed, required -- none are optional.. but you implement some in such
> a way that when they are accessed they trap to firmware or to the OS, which
> then implements their behavior in software. Did I understand that
> correctly?

"standard U-mode CSRs" means CSRs which are defined in "The RISC-V
Instruction Set Manual Volume I: User-Level ISA", of which there are
precisely four:

1. cycle
2. instret
3. time
4. fcsr

fcsr is only required if you are doing hardware floating point. The
other three must appear to exist; they can be hardwired to zero,
emulated by a trap handler, or have a "real" implementation, whatever
is most convenient.

The privileged spec describes other CSRs which might sometimes be
visible in U-mode. Since those CSRs are not always present and are
not described by any ABI, they do not qualify as "standard U-mode
CSRs".

> Does this imply that there is effectively no hardware cost to the
> non-implemented CSRs, because they just generate a trap? Does that imply
> that _all_ CSRs that are optional can be successfully handled this way?
> That may not be possible.. consider, for example, implementing performance
> counters via traps.. performance counters are part of U mode CSRs, as far
> as the spec seems to indicate.

Performance counters are sometimes accessible from U-mode but they are
not mentioned in the "User-Level ISA" so they are not standard U-mode
CSRs.

cycle/time/instret are required to *appear* to exist in U-mode, but
they need not have hardware cost because they can be trapped *and
emulated* in M-mode.
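
To make "trapped and emulated" concrete, here is a minimal Python model of what an M-mode illegal-instruction handler might do when a U-mode program executes rdcycle/rdtime/rdinstret on a core with no hardware counters. This is only a sketch: the Hart class, its field names, and emulate_csr_read are hypothetical and not taken from any real firmware. The CSR addresses (0xC00 to 0xC02) and the CSRRS field layout are the standard ones. Leaving the software counters at zero gives the "hardwired to zero" behavior.

```python
# Illustrative model only: Hart and emulate_csr_read are hypothetical names.
from dataclasses import dataclass, field

CSR_CYCLE, CSR_TIME, CSR_INSTRET = 0xC00, 0xC01, 0xC02  # user-level counter CSRs
OPCODE_SYSTEM = 0x73
FUNCT3_CSRRS = 0b010


@dataclass
class Hart:
    regs: list = field(default_factory=lambda: [0] * 32)  # x0..x31
    mepc: int = 0        # address of the trapping instruction
    sw_cycle: int = 0    # counters maintained by M-mode software
    sw_time: int = 0
    sw_instret: int = 0


def emulate_csr_read(hart: Hart, insn: int) -> bool:
    """Emulate `csrrs rd, csr, x0` for cycle/time/instret.

    Returns False when the instruction is not one this sketch emulates;
    a real handler would then report the illegal instruction.
    """
    opcode = insn & 0x7F
    rd = (insn >> 7) & 0x1F
    funct3 = (insn >> 12) & 0x7
    rs1 = (insn >> 15) & 0x1F
    csr = (insn >> 20) & 0xFFF

    if opcode != OPCODE_SYSTEM or funct3 != FUNCT3_CSRRS or rs1 != 0:
        return False

    values = {
        CSR_CYCLE: hart.sw_cycle,
        CSR_TIME: hart.sw_time,
        CSR_INSTRET: hart.sw_instret,
    }
    if csr not in values:
        return False

    if rd != 0:              # x0 always reads as zero, so never write it
        hart.regs[rd] = values[csr]
    hart.mepc += 4           # resume after the emulated instruction
    return True


hart = Hart(mepc=0x8000_0000, sw_cycle=1234)
RDCYCLE_A0 = 0xC0002573      # `rdcycle a0`, i.e. csrrs a0, cycle, x0
handled = emulate_csr_read(hart, RDCYCLE_A0)
```

After the call, a0 (x10) holds the software cycle count and mepc points at the next instruction, so the U-mode program cannot tell the counter was not real hardware.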

> One feeling is that there may be some unwitting contradiction in play.. on
> the one hand "presence is mandated by the ABI" on the other hand "most CSRs
> are nice-to-haves and are optional" and on a third hand "defer
> implementation to software" and on the fourth hand "implement CSR by wiring
> it to zero" -- these statements, from various emails, seem to contradict
> each other. For example, in one place there is mention of implementing
> non-critical CSRs by tying bits to zero, but in other places there is
> mention of trapping and implementing by software.. Perhaps there is more
> information behind the statements that hasn't been shared yet?

The specs are vague in places where we haven't figured everything out,
and the rules will likely be different between different CSRs. I can
go into a lot more detail on one CSR at a time.

> What I'm hoping for is a clear vision of how all these requirements can all
> be simultaneously met:
> 1] Presence of U mode CSRs is required
> 2] Stripped down compute engines need total CSR area to be a fraction of the
> area of the scalar register file (say 1/4 the area)
> - -] CSR state is typically implemented as flip-flops (much larger area per
> bit of state), making this more difficult
> 3] Performance counters likely cannot be implemented by software, yet are
> mandated because they are part of U mode CSRs
> - -] (also, other CSRs may similarly be barred from being implemented by
> software)

I don't understand "barred". Real performance counters require real
hardware but a degenerate implementation (counts no events, counters
always read as 0) could be done in software just fine.

> 4] Software -- both user level and OS -- will, over time, come to expect
> particular CSRs to be functioning
> - -] If hardware has not implemented the expected CSRs, or has done so in a
> crippled way, then the software will have consequences

TBH I'm more worried about de facto standardization of things like
indexed-memory access.
You live in a world where no two Linux vendors agree on what
instructions an "i686" binary is allowed to use.

-s

Stefan O'Rear

Apr 26, 2017, 9:57:10 PM
to Jacob Bachmeyer, Sean Halle, Akshay Dalvi, RISC-V ISA Dev
On Tue, Apr 25, 2017 at 6:41 PM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
>> The names that Akshay Dalvi is asking about are already part of an
>> extension, namely "N", which Linux will never use (it's incompatible
>> with S).
>
> What? Why would user-mode interrupts be incompatible with a supervisor?
> Especially when we already have separate cause codes for interrupts targeted
> at each privilege level? And user-mode interrupts are very helpful for
> implementing user-mode device drivers with minimal overhead? And why would
> sideleg exist if N and S are incompatible?
>
> Incompatibility of N and S must be an erroneous statement?

Linux does not save and restore the uepc, utvec, etc registers on
process switches.

Linux has no way of determining if uepc is implemented, so it cannot
save and restore uepc.

If you implement uepc+utvec on a chip that will run Linux, all you
have implemented is a high-bandwidth covert channel.

(A supervisor with out of band knowledge could get around this, but I
don't actually think N+S makes sense for some other far more subtle
reasons. I could go into those reasons but on a new thread.)

-s

Jacob Bachmeyer

Apr 26, 2017, 10:51:36 PM
to Stefan O'Rear, Sean Halle, Akshay Dalvi, RISC-V ISA Dev
Stefan O'Rear wrote:
> On Tue, Apr 25, 2017 at 6:41 PM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
>
>>> The names that Akshay Dalvi is asking about are already part of an
>>> extension, namely "N", which Linux will never use (it's incompatible
>>> with S).
>>>
>> What? Why would user-mode interrupts be incompatible with a supervisor?
>> Especially when we already have separate cause codes for interrupts targeted
>> at each privilege level? And user-mode interrupts are very helpful for
>> implementing user-mode device drivers with minimal overhead? And why would
>> sideleg exist if N and S are incompatible?
>>
>> Incompatibility of N and S must be an erroneous statement?
>>
>
> Linux does not save and restore the uepc, utvec, etc registers on
> process switches.
>

Not yet. That is easily changed.

> Linux has no way of determining if uepc is implemented, so it cannot
> save and restore uepc.
>

Sure it does: uepc and such are implemented if and only if the
configuration string declares support for the N extension, just like the
f registers are implemented if and only if the configuration string
declares the F extension and the vector unit exists if and only if the V
extension is present.

> If you implement uepc+utvec on a chip that will run Linux, all you
> have implemented is a high-bandwidth covert channel.
>

This is not correct, since the configuration string will declare the
presence of the N extension and Linux will then know that it must swap
or clear those CSRs.
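
As a rough illustration of that flow, the Python model below checks an ISA string for the N extension and, only if it is present, saves the outgoing task's user-level trap CSRs and restores or clears them for the incoming task. Everything here is an assumption made for the example: the ISA string format ("rv64imafdn", single-letter extensions only), the dictionary-based task and CSR model, and the function names. The CSR list is limited to the registers named in this thread.

```python
# Hypothetical model: only the CSR names come from this thread.
N_CSRS = ("ustatus", "uie", "utvec", "uepc")


def extension_letters(isa: str) -> set:
    """Return the single-letter extensions in a simple ISA string.

    Assumes a string like 'rv64imafdn'; multi-letter extension names
    are outside the scope of this sketch.
    """
    body = isa.lower()
    if not body.startswith("rv"):
        raise ValueError(f"not an ISA string: {isa!r}")
    return set(body[2:].lstrip("0123456789"))


def context_switch(csrs: dict, prev_task: dict, next_task: dict, isa: str) -> None:
    """Swap the N-extension CSRs between tasks when the hart declares N."""
    if "n" not in extension_letters(isa):
        return  # CSRs not implemented, so there is nothing to save or leak

    prev_task["ucsrs"] = {name: csrs[name] for name in N_CSRS}
    saved = next_task.get("ucsrs")
    for name in N_CSRS:
        # A task with no saved state starts from zero, so it cannot
        # observe values left behind by the previous task.
        csrs[name] = saved[name] if saved else 0


hw = {name: 0 for name in N_CSRS}
task_a, task_b = {}, {}
hw["uepc"] = 0x1000                               # task A wrote uepc
context_switch(hw, task_a, task_b, "rv64imafdn")  # A -> B
leaked_to_b = hw["uepc"]                          # what task B can read
hw["uepc"] = 0x2000                               # task B wrote uepc
context_switch(hw, task_b, task_a, "rv64imafdn")  # B -> A
restored_for_a = hw["uepc"]
```

Clearing for fresh tasks is what closes the covert channel in this model; whether a real kernel would clear or lazily swap is a separate design question.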

> (A supervisor with out of band knowledge could get around this, but I
> don't actually think N+S makes sense for some other far more subtle
> reasons. I could go into those reasons but on a new thread.)

The configuration string *exists* to provide exactly that kind of "out
of band" knowledge. I think N+S makes great sense for implementing
user-mode drivers and reducing attack surface. Please explain why you
think it is infeasible?


-- Jacob

Michael Clark

Apr 26, 2017, 11:35:14 PM
to sean...@intensivate.com, Andrew Waterman, RISC-V ISA Dev
Hi Sean,

I don’t think CSR profile labels should be required for user mode executables.

CSRs are somewhat similar to MSRs (Model Specific Registers) on other architectures and are typically only used by the kernel or system software. RDPMC on Intel is a very low-level interface and is only used by very special purpose performance monitoring tools. Performance counters are not part of the user-mode System V ABI and Linux requires a non-default setting to even allow user-mode executables to read the performance monitoring counters. e.g.

echo 2 >  /sys/bus/event_source/devices/cpu/rdpmc

An ELF executable will typically use gettimeofday(2) or clock_gettime(2) to read timers. These are implemented by libc/glibc, and even if the C library reads the CSR timers directly, it exposes only the C library and POSIX interfaces to the user-mode executable. The C library has to handle encapsulation of the system interface (syscall, VDSO, CSR); e.g. if glibc finds that a CSR or VDSO page is not present, it will fall back to the kernel syscall interface. On some platforms, gettimeofday(2) on Linux can be accelerated by reading the time from a read-only VDSO page because this is faster than a syscall; however, these abstractions and fallbacks are encapsulated in the POSIX and C library interface exposed to user-mode executables.

There is the ELF HWCAPs mechanism but this is generally limited to use by the dynamic linker to choose hardware optimised versions of the C library. e.g. SSE optimised string functions (memcpy/memcmp). HWCAPs is likely more relevant at the extension level e.g. a user mode library that depends on the Vector extension may have the V HWCAP label. e.g. a BLAS library that has vector and scalar versions could use HWCAPs so that the dynamic linker loads the accelerated version if the V extension is present. A HWCAPs ELF implementation probably wants to expose the misa CSR somehow to the dynamic loader; perhaps via sysconf(3) or sysctl(3). e.g.

hw.riscv.csr.misa
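
For illustration, this is roughly how a loader could interpret a misa value once it has been exposed by some such interface. The sketch assumes the layout in the privileged spec: the base width is encoded in the top two bits (1 = 32, 2 = 64, 3 = 128) and bits 0 to 25 map to extension letters A to Z. The decode_misa function and the sample value are made up for the example, and a misa that reads as zero (not implemented) yields no base width here.

```python
# Sketch only: decode_misa is a hypothetical helper, not an existing API.
BASE_WIDTHS = {1: 32, 2: 64, 3: 128}


def decode_misa(misa: int, xlen: int) -> dict:
    """Split a misa value into its base width and extension letters."""
    base = (misa >> (xlen - 2)) & 0b11
    extensions = "".join(
        chr(ord("A") + bit) for bit in range(26) if (misa >> bit) & 1
    )
    return {"base_width": BASE_WIDTHS.get(base), "extensions": extensions}


def encode_extensions(letters: str) -> int:
    """Build the low 26 bits of misa from extension letters (test helper)."""
    return sum(1 << (ord(c) - ord("A")) for c in letters.upper())


# Example: an RV64 hart with I, M, A, F, D, C plus S and U modes.
sample_misa = (2 << 62) | encode_extensions("IMAFDCSU")
info = decode_misa(sample_misa, xlen=64)
has_vector = "V" in info["extensions"]   # would select the scalar BLAS here
```

A dynamic linker could use a check like has_vector to choose between the vector and scalar builds of a library.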

The N extension CSRs are very special purpose CSRs for implementing device drivers in user-mode and would not be used by portable executables.

I think there is a reasonable argument for using ELF HWCAPs to load alternate libraries based on the presence of a processor extension; however, there are existing interfaces that can be used.

Michael.

Jacob Bachmeyer

Apr 27, 2017, 12:22:41 AM
to sean...@intensivate.com, Andrew Waterman, RISC-V ISA Dev
Sean Halle wrote:
> If I could verify my understanding, the statement "There's no leeway
> to remove standard U-mode CSRs from the perspective of U-mode
> programs, as their presence is mandated by the ABI." Seems to imply
> that all U mode CSRs are, indeed, required -- none are optional.. but
> you implement some in such a way that when they are accessed they trap
> to firmware or to the OS, which then implements their behavior in
> software. Did I understand that correctly?

In section 2.8 of the RISC-V User ISA spec: "We define the full set of
CSR instructions here, although in the standard user-level base ISA,
only a handful of read-only counter CSRs are accessible." That leads to
the RDCYCLE, RDTIME, and RDINSTRET instructions. So user-visible CSRs
in RVI are not actually required, only that those pseudo-instructions work.

> Does this imply that there is effectively no hardware cost to the
> non-implemented CSRs, because they just generate a trap? Does that
> imply that _all_ CSRs that are optional can be successfully handled
> this way? That may not be possible.. consider, for example,
> implementing performance counters via traps.. performance counters
> are part of U mode CSRs, as far as the spec seems to indicate.

Only "cycle", "time", and "instret" are part of the baseline user ISA.

> One feeling is that there may be some unwitting contradiction in
> play.. on the one hand "presence is mandated by the ABI" on the other
> hand "most CSRs are nice-to-haves and are optional" and on a third
> hand "defer implementation to software" and on the fourth hand
> "implement CSR by wiring it to zero" -- these statements, from
> various emails, seem to contradict each other. For example, in one
> place there is mention of implementing non-critical CSRs by tying bits
> to zero, but in other places there is mention of trapping and
> implementing by software.. Perhaps there is more information behind
> the statements that hasn't been shared yet?
>
> What I'm hoping for is a clear vision of how all these requirements
> can all be simultaneously met:
> 1] Presence of U mode CSRs is required
Only "cycle", "time", and "instret" are present in the RVI baseline ISA;
all the rest are actually defined by various extensions or by the
privileged ISA (which can be swapped out /in toto/ if I understand
correctly). Even these are optional in RV32E, so there could be some
precedent for omitting them entirely.
> 2] Stripped down compute engines need total CSR area to be a fraction
> of the area of the scalar register file (say 1/4 the area)
> - -] CSR state is typically implemented as flip-flops (much larger
> area per bit of state), making this more difficult
A simple model: each compute engine is at all times either running in
U-mode or halted; when an engine is halted, its context (scalar register
file and program counter) is accessible in a memory-mapped region for an
associated control processor. All external interrupts are taken on the
control processor and halting a compute engine causes an interrupt to
the control processor (probably via the PLIC). Compute engines halt
instead of trapping. This would require a slightly modified supervisor,
but as long as the workload is predominately in U-mode, several compute
engines should be able to share a control processor with minimal
contention. In this model, compute engines could halt upon executing
any SYSTEM instruction. As I understand it, this is fully compliant
with the user ISA (since the control processor could emulate
RDCYCLE/RDTIME/RDINSTRET) and would be a non-standard privileged ISA.
RISC-V allows this. (Or you could push to standardize an alternate
privileged ISA for systems that have U-mode-only harts.)
> 3] Performance counters likely cannot be implemented by software, yet
> are mandated because they are part of U mode CSRs
> - -] (also, other CSRs may similarly be barred from being implemented
> by software)
Generally, a CSR that can be fully implemented without hardware support
is probably a waste of a CSR slot. On the other hand, CSRs may control
things that minimal compute engines may or may not actually have. If
your compute engines do not have paging support, then the satp CSR is
useless and can be omitted entirely. (This is plausible in the above
"simple model" if a group of compute engines shares a single virtual
address space mapped using a shared PMMU, possibly even shared with the
control processor, or if the PMMU control registers are also
memory-mapped for the control processor.) If your compute engines are
U-mode-only, then *all* of the privileged CSRs vanish in a puff of
smoke. With no interrupts at all (allowed in the current privileged
spec, which permits interrupts to be routed to specific harts) and traps
handled by halting (and sending an interrupt to the control processor),
the only CSRs you could even have are the cycle, time, and
instructions-retired counters--and fcsr if you implement floating-point.
> 4] Software -- both user level and OS -- will, over time, come to
> expect particular CSRs to be functioning
> - -] If hardware has not implemented the expected CSRs, or has done so
> in a crippled way, then the software will have consequences
The specs set fairly tight limits on what portable software can expect.
For example, the other 29 performance counters are almost completely
unspecified. Portable software cannot rely on them.
> 5] Many different kinds of hardware need to be supported
> - -] The approach to CSRs will be different for different categories
> of hardware -- micro-controllers will likely just not implement any
> CSRs at all, not even tying them to 0 -- stripped down compute engines
> will only implement the ones critical to OS functionality (it looks
> like roughly 6 or 7 total) -- full scale server cores will implement
> many, or perhaps all, directly.

RV32E permits exactly that--omitting all CSRs, but some of the functions
CSRs provide will still be needed (and would probably be
memory-mapped). As I suggested above, a truly minimal compute engine
could execute exclusively in U-mode, and pass all traps off to a nearby
control processor as interrupts.

> we need to also satisfy these, at the same time:
> 6] Before downloading a particular binary, a means is needed to check
> its compatibility with the hardware -- not just functional, but
> performance related (IE, software CSRs may be unacceptable to some
> binaries)
> - -] Checking at run time, for which CSRs are implemented in each
> particular fashion is too late -- the binary has already been selected
> and installed

Unfortunately, this is a general and not-entirely-solved problem. I
suspect that the practical solution, if your compute engines are
specialized enough, would be that special binaries for "Intensivate
FooGrid 9000" (product name made-up for an example) would be produced
and distributed. Labeling those binaries in a machine-readable way is a
different problem.
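
To sketch what such a machine-readable label could enable, the Python below compares a binary's declared requirements against a description of the target before installation. No such label format exists as far as I know; the dictionary keys ("extensions", "hardware_csrs", "csr_impl") and the implementation categories ("hardware", "emulated", "zero") are invented for this example, following the three implementation options discussed earlier in the thread.

```python
# Hypothetical label and target formats, invented for illustration.
def incompatibilities(label: dict, target: dict) -> list:
    """List the reasons a labeled binary does not fit a target, if any."""
    problems = []

    missing = set(label["extensions"]) - set(target["extensions"])
    for ext in sorted(missing):
        problems.append(f"missing extension {ext}")

    # Some binaries may need real counters, not trapped or zeroed ones.
    for csr in sorted(label.get("hardware_csrs", ())):
        impl = target["csr_impl"].get(csr, "absent")
        if impl != "hardware":
            problems.append(f"{csr} is {impl}, binary needs hardware")

    return problems


profiler_label = {"extensions": "imafd", "hardware_csrs": ["cycle", "instret"]}
plain_label = {"extensions": "im"}

stripped_engine = {
    "extensions": "ima",
    "csr_impl": {"cycle": "emulated", "time": "emulated", "instret": "zero"},
}

profiler_problems = incompatibilities(profiler_label, stripped_engine)
plain_problems = incompatibilities(plain_label, stripped_engine)
```

An installer could refuse or warn whenever the returned list is non-empty, which moves the check ahead of download rather than leaving it to run time.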

> 7] Coders are often poorly informed about hardware features, and make
> bad assumptions, and bake those into their code
>
> The main point for me, personally, is implications for software --
> programmers often don't spend the time to understand the hardware
> subtleties, if they see something in the spec they quickly jump to an
> intuition about it and go with that. Hence, warning that CSRs may not
> be fully implemented won't sink in for many coders. We can feel
> fairly certain that popular software packages will misunderstand, and
> assume that all CSRs are fully implemented, and make their code rely
> on the presence of all CSRs -- despite the fine print embedded into
> the spec that warns against this.

How about the fact that the user spec only mentions "time", "cycle", and
"instret"? If user-mode programmers rely on CSRs defined only in the
system ISA spec in portable programs, then PEBKAC and there is no
technical solution other than finding better programmers.

> My proposal is designed to make it obvious to coders, who are in a
> hurry, that some CSRs are optional, and to provide a simple
> _automated_ way to handle matching binaries with hardware. Tools need
> to be involved with the process, so that the tool creators are the
> ones who read the spec, and understand, and then add checks to the
> tools, that then inform the coders. More importantly, the tools need
> to automate the matching of binaries to hardware. Think about
> downloading on the web -- we need the tools to know, before download,
> which binaries are compatible with the target processor. To
> accomplish this, it isn't enough for certain things to be _possible_.
> Instead, a system needs to be in place that makes the default way of
> doing things be a good way of doing things. Unless we make the
> easiest, path of least resistance, be one that has good outcomes, then
> software will accumulate worst case behaviors, which will become the
> norm. That creates de-facto standards that we didn't intend. Then
> hardware has to implement things that are harmful, even though they
> are optional, in order to support the bad habits baked into popular
> software. This kind of legacy issue has a long history of repeating
> itself. I'm proposing that we do something to head off the problem
> before it arises.

Speaking of repeating legacy issues, I think that I understand your
frustration--I have been trying to get SUM changed to never permit
S-mode instruction fetch from a user page mapping since the first
message I sent to this list (back when it had the opposite sense and was
called "PUM") and my effort still seems to fall on deaf ears.



-- Jacob

Michael Clark

Apr 27, 2017, 12:35:36 AM
to jcb6...@gmail.com, sean...@intensivate.com, Andrew Waterman, RISC-V ISA Dev

On 27 Apr 2017, at 4:22 PM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:

Speaking of repeating legacy issues, I think that I understand your frustration--I have been trying to get SUM changed to never permit S-mode instruction fetch from a user page mapping since the first message I sent to this list (back when it had the opposite sense and was called "PUM") and my effort still seems to fall on deaf ears.

The situation has changed since last year when we initially discussed this issue. The ISA manual is now on GitHub so it is possible to create an issue or fork the repository and make a pull request. That way at least it will be tracked.


Michael.

Albert Cahalan

Apr 27, 2017, 3:27:34 AM
to sean...@intensivate.com, Stefan O'Rear, RISC-V ISA Dev
> although it does imply that a few CSRs might be optional, such as the base
> and bound CSRs. Quite a few of these M mode CSRs seem to be nice-to-haves,
> such as vendor ID, time related (mtime), performance counters, and so
> forth. They do not appear to be critical to correct functional operation.
> The spec didn't mention whether these nice-to-haves were optional or not,
> that I saw..
>
> In our chip we do want S level, in order to support Linux. The S mode CSRs
...
> Honestly, it feels like feature creep. Many of these functions don't feel
> like they belong in a one-size-fits-all spec, such as the performance
> counters and the various clocks, or even the vendor and architecture IDs.
> Those just aren't critical things. Intel did place all this functionality
> into their core ISA, but it doesn't seem like we should take that as our
> gold standard of how we want to be. The plethora of different hardware
> vendors implementing RISC-V means we have a different set of constraints.

You make a mess of the architecture for everybody when you fail to
implement some of these things.

Vendor ID: Uh, by what dirty trick should your CPU be detected? It isn't
really OK to be unidentifiable. All sorts of software will need this. Linus is
not too keen on hardware "detection" via compile options; this is the path
that ARM and PowerPC originally went down and it is much hated.
Distributors (such as the Debian FTP maintainers) are loath to add yet
another variant of a common architecture, never mind an obscure one.

Time: This has long been a pain point on x86. There is the RTC, the PIT,
the HPET, the TSC, the PM timers, the local APIC... and generally these
are all awful in one way or another. Trying to choose and use the best of
the available choices has caused lots of trouble over the years. It's also
trouble on PowerPC, where decrementer and timebase behavior has been
changed in incompatible ways. This hurts the architecture.

Stefan O'Rear

Apr 27, 2017, 3:37:34 AM
to Albert Cahalan, Sean Halle, RISC-V ISA Dev
On Thu, Apr 27, 2017 at 12:27 AM, Albert Cahalan <acah...@gmail.com> wrote:
> You make a mess of the architecture for everybody when you fail to
> implement some of these things.

I think that with the device tree adoption we have a pretty good story
for CPU vendor identification and timebase handling. Feel free to
reach out privately if you have questions about how any of this is put
together.

Places where I'm less sure of our story are absolute
time/(virtual-)RTC support, and identification of M-level software.

-s

Jacob Bachmeyer

May 16, 2017, 6:16:25 PM
to Michael Clark, sean...@intensivate.com, Andrew Waterman, RISC-V ISA Dev
Michael Clark wrote:
>> On 27 Apr 2017, at 4:22 PM, Jacob Bachmeyer <jcb6...@gmail.com
And done... It is pull request #62:
https://github.com/riscv/riscv-isa-manual/pull/62

-- Jacob

Michael Clark

May 16, 2017, 8:39:41 PM
to jcb6...@gmail.com, sean...@intensivate.com, Andrew Waterman, RISC-V ISA Dev



Excellent. This is a good change. I'll read your commentary.

