DTS and Config Strings

Don A. Bailey

Jan 10, 2017, 3:31:28 PM
to RISC-V ISA Dev
Hi All,

Over lunch, I was taking a quick review of a previous email thread on Config Strings in RISC-V. I'm now thinking about it in the context of DTS. Are those implementing RISC-V cores using DTS as the model for configuration strings? 

I made the joke in the last thread that I was concerned we were going in the direction of 1275. Ironically, the DTS specification seems to cite 1275 in its lineage. Though, I'm actually starting to understand why DTS is becoming more adopted. 

If you are using (or just pro-) DTS, can you describe why? 

My concerns are the same as in the last thread: that DTS will be a challenge to parse correctly in constrained environments. My renewed interest is that if my kernel uses DTS, I no longer have to have a board-specific port of my kernel for constrained environments (our kernel, Harvest, is also our bootloader). With DTS, we can still act as a bootloader, but in a more abstract way that makes us massively more portable than we were. 

I may not like the complexity of DTS, but I can at least now see the argument for it. 

Thanks for any thoughts.

Don A. Bailey
Founder / CEO
Lab Mouse Security

Stefan O'Rear

Jan 12, 2017, 4:36:39 AM
to Don A. Bailey, RISC-V ISA Dev
On Tue, Jan 10, 2017 at 12:31 PM, Don A. Bailey <do...@securitymouse.com> wrote:
> Hi All,
>
> Over lunch, I was taking a quick review of a previous email thread on Config
> Strings in RISC-V. I'm now thinking about it in the context of DTS. Are
> those implementing RISC-V cores using DTS as the model for configuration
> strings?

AFAIK only the Berkeley cores run general-purpose operating systems
and they use the DTS-inspired config strings.

> I made the joke in the last thread that I was concerned we were going in the
> direction of 1275. Ironically, the DTS specification seems to cite 1275 in
> its lineage. Though, I'm actually starting to understand why DTS is becoming
> more adopted.
>
> If you are using (or just pro-) DTS, can you describe why?

1. Because device tree works well for its intended use case ( multi-MB
kernels with drivers for every piece of hardware ), and it's a
Schelling point with ready support in multiple open-source kernels as
well as supporting utilities (QEMU can generate device trees
corresponding to its internal hardware model, etc).

2. If you're making a board-specific kernel for a board with <1 MB
memory, you are not in the device tree target audience, and that's OK.

3. If your kernel has out-of-band knowledge of the board, then your
kernel doesn't need to look at the device tree. You could still do a
strcmp() check to panic if the kernel and the board got out of sync,
or not if you prefer. _Just because it's there doesn't mean the
kernel is required to look at it._

4. For intermediate sizes (let's say 64KB to 1MB), you might have a
fixed set of drivers but still use the device tree to get the memory
map. This is also a valid design point.

5. Including it will make tethered debuggers happier, but it's
perfectly OK to have it and not use it in the kernel itself.

> My concerns are the same as the last thread, that DTS will be a challenge to
> parse correctly on constrained environments. My renewed interest is that if

2 & 3 above.

> my kernel uses DTS, I no longer have to have a board-specific port of my
> kernel for constrained environments (our kernel, Harvest, is also our
> bootloader). With DTS, we can still act as a bootloader, but in a more
> abstract way that makes us massively more portable than we were.

2 & 3.

> I may not like the complexity of DTS, but I can at least now see the
> argument for it.

DTS is a pretty ugly format to parse at runtime. Config string is
better, but needs a few small changes to align it with the device tree
data model, as well as a general spec clarification. FDT is also
better for parsing, but has a moderate disadvantage of being a
typeless binary format.

I'm on the fence about whether I prefer FDT or an evolved config
string, but I'm quite sure we want the device tree data model (a tree,
each node of which has key/value properties).

I can't seem to find any links for Harvest right now, but if you can
spare 8 KB of _text_ and 100k instructions of startup time, FDT and
config-string can both be parsed with no heap and very little
data/stack; so if you're doing runtime configuration of the memory map
at all it might still be worth using config-string.

It's possible to design nonhierarchical systems that are marginally
simpler, but IMO it would be a bad use of time to standardize anything
simpler than FDT/config string.

> Thanks for any thoughts.

-s

ron minnich

Jan 12, 2017, 2:49:28 PM
to Stefan O'Rear, Don A. Bailey, RISC-V ISA Dev
Having had to write parsers for all these various formats over the years, I still think config string is the one with room to grow and change cleanly, in part because it's not overly specified.

I'm reminded, since I'm having to deal with it again, of just how many tables have grown up around the PC: ACPI, SMBIOS, _MP_, $PIR, ... it never ends. The problem was that the very specific definition of these tables meant they were very hard to extend and change. _MP_ had a version in it, for instance, and it never had more than two versions, with only version 1.4 really widely used. Binary format inflexibility leads to a proliferation of tables.

The config string is nice in that I can extend it in a way that doesn't break what came before, because of the {} scope it introduces. For example, to communicate coreboot info, I take the config string I have, and append
coreboot{...} information to it. Done. Either your kernel can parse that extra info or it can't, but I don't break kernels that cannot. Further, since config strings don't have embedded pointers in them to other parts of the config string, they're position independent: ah, joy!

I don't see that it would be all that hard to write a config string parser in assembly, especially if we take the approach of making it very structured. "string" does not imply "disorganized" or "free form". It just means humans can read it :-). 

We all did that parsing in the early Unix days, for pretty complex stuff -- the first Unix compilers were not written in C -- so I believe it's not a problem for assembly.

I also don't even see config string as ruling out other formats; it might even contain them:
FDT{len=123,blob=<binary data>}
where FDT means an fdt, and has the two key-value pairs, len and blob.
I'm not saying I like it: every vendor who's created binary tables has made a giant mess of it at some point. 

Now, personally, I would have preferred that the config string be a text protobuf or some similar simple format, but I can live with it as is.

ron

Don A. Bailey

Jan 12, 2017, 2:55:07 PM
to ron minnich, Stefan O'Rear, RISC-V ISA Dev
I still prefer the format we discussed in the last thread. Simple strings are, as we all agree, very easy to write a parser for. More importantly, they are clear to the naked eye and require little interpretation. 

However, I did not realize that DTS is an evolving standard. Now that I know, even though I don't *like* DTS, and raised my concern regarding 1275-style solutions, it is easier for me to adopt DTS if I know there are enough organizations to keep it relatively clean and up-to-date. So, if chip implementations will use DTS specifically, I can see the value in that. If it isn't going to be DTS specifically, but some <variant>, then I would really prefer config strings in the model discussed in the prior thread. 

This is why I'd love to hear from people that are actually implementing the chips themselves. What are they actually doing? DTS requires more space and descriptiveness per "node", whereas config strings require less complexity and memory to implement. 

As Ron has asserted, it would be great to get a standard chosen here. But, I haven't observed any commentary. Is it all behind closed doors? 

Best,

Don A. Bailey
Founder / CEO
Lab Mouse Security


--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+unsubscribe@groups.riscv.org.
To post to this group, send email to isa...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/isa-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/CAP6exY%2B2kV-mDsSeVreLiA7-YpUfaRbBXkt38_ofDHce0xEjBQ%40mail.gmail.com.

Stefan O'Rear

Jan 12, 2017, 3:31:46 PM
to Don A. Bailey, ron minnich, RISC-V ISA Dev
On Thu, Jan 12, 2017 at 11:55 AM, Don A. Bailey <do...@securitymouse.com> wrote:
> However, I did not realize that DTS is an evolving standard. Now that I
> know, even though I don't *like* DTS, and raised my concern regarding 1275

I'd like you to confirm that you understand the following:

* Device Tree is an _abstract_ model of data, and does not correspond
to a specific syntax
* IEEE 1275 provides an API for Device Tree but not a syntax
* DTS is one syntax for Device Tree
* FDT is a different syntax for Device Tree
* Config string is a third syntax for Device Tree (once we fix a few
small issues)

RISC-V is not going to use DTS as a runtime format, ever. I will
personally make sure of that. It is far too complicated.

I could do either FDT or Config String. Note that both of these are
presentations of Device Tree, so I can say "the device tree" without
committing to FDT-or-configstring.

-s

Don A. Bailey

Jan 12, 2017, 3:39:29 PM
to Stefan O'Rear, ron minnich, RISC-V ISA Dev
For those interested, when I refer to DTS I am referring to the DeviceTree Specification as described in the following document:
https://github.com/devicetree-org/devicetree-specification-released/blob/master/devicetree-specification-v0.1-20160524.pdf

This document is very explicit about the syntax and structure of objects that compose a DT. I do not consider this an abstract guidance document. 

Furthermore, a newer version of the specification is expected in the near future. 

Thanks,

Don A. Bailey
Founder / CEO
Lab Mouse Security


Stefan O'Rear

Jan 12, 2017, 3:45:07 PM
to ron minnich, Don A. Bailey, RISC-V ISA Dev
On Thu, Jan 12, 2017 at 11:23 AM, ron minnich <rmin...@gmail.com> wrote:
> I don't see that it would be all that hard to write a config string parser
> in assembly, especially if we take the approach of making it very
> structured. "string" does not imply "disorganized" or "free form". It just
> means humans can read it :-).

Doing this right will require some careful language constraints, in
particular on numeric formats… accepting all of the "strtod" grammar
would probably be a mistake.

> I also don't even see config string as ruling out other formats; it might
> even contain them:
> FDT{len=123,blob=<binary data>}
> where FDT means an fdt, and has the two key-value pairs, len and blob.
> I'm not saying I like it: every vendor who's created binary tables has made
> a giant mess of it at some point.

Unless I'm misunderstanding you, that would seem to require that every
kernel include a config-string parser _and_ a FDT parser, which
doubles the text footprint and seriously hurts use cases like Don's.
I'd rather have converters that can be used at design time and then
feed the kernel config string _or_ FDT but not both?

Stefan O'Rear

Jan 12, 2017, 3:49:40 PM
to Don A. Bailey, ron minnich, RISC-V ISA Dev
On Thu, Jan 12, 2017 at 12:39 PM, Don A. Bailey <do...@securitymouse.com> wrote:
> For those interested, when I refer to DTS I am referring to the DeviceTree
> Specification as described in the following document:
> https://github.com/devicetree-org/devicetree-specification-released/blob/master/devicetree-specification-v0.1-20160524.pdf

That's not what DTS means.

DTS = Device Tree Source, as described in chapter 6 of the PDF.
**this will not be used in RISC-V boot protocols**

FDT = Flattened Device Tree, as described in
https://www.kernel.org/doc/Documentation/devicetree/booting-without-of.txt

Config string = I think you know already

All of them are presentations of the Device Tree data model.

-s

Don A. Bailey

Jan 12, 2017, 4:10:22 PM
to Stefan O'Rear, ron minnich, RISC-V ISA Dev
DTS is referenced in chapter 8, not chapter 6, of the privilege specification. Unless you're reading an alternative to the latest 1.9.1. 

The DTS they reference is, as far as I can tell, the same DTS that I am referencing. Albeit, the DeviceTree Specification that I'm referencing is written as DTSpec and I shorten it to DTS, which is used colloquially among the embedded groups I hang out with. 

That said, it is, presumably, the same thing, as the header /dts-v1/ is the same as used in the DTSpec "DTS". 

So, yes, this is what DTS means. I'd appreciate it if we stopped redirecting the conversation toward off-topic remarks. I'd like to hold off conversation until someone implementing <a spec> comes along and makes a statement. 

Thanks.

Don A. Bailey
Founder / CEO
Lab Mouse Security


Karsten Merker

Jan 12, 2017, 4:24:16 PM
to Don A. Bailey, RISC-V ISA Dev
On Tue, Jan 10, 2017 at 01:31:26PM -0700, Don A. Bailey wrote:

> Over lunch, I was taking a quick review of a previous email thread on Config
> Strings in RISC-V. I'm now thinking about it in the context of DTS. Are those
> implementing RISC-V cores using DTS as the model for configuration strings? 
>
> I made the joke in the last thread that I was concerned we were going in the
> direction of 1275. Ironically, the DTS specification seems to cite 1275 in its
> lineage. Though, I'm actually starting to understand why DTS is becoming more
> adopted. 
>
> If you are using (or just pro-) DTS, can you describe why? 

As some background information for people who read isa-dev but
not sw-dev: we have had a rather extensive discussion about
config string and device-tree on sw-dev a while ago that shows
many of the pro- and contra-arguments for the various options:

- https://groups.google.com/a/groups.riscv.org/forum/?_escaped_fragment_=topic/sw-dev/ItrMfAqtHHU#!topic/sw-dev/ItrMfAqtHHU
- https://groups.google.com/a/groups.riscv.org/forum/?_escaped_fragment_=topic/sw-dev/XQeRD-yC6A8#!topic/sw-dev/XQeRD-yC6A8

The result of this discussion was that the embedded hardware
description (vulgo "config string") should use the device-tree data
model (including the bindings, i.e. names and types of the
properties as specified for device-tree). Whether the information
following this data model should be placed on the SoC as a textual
representation (e.g. dts) or as a binary representation (e.g.
dtb) is something that one can argue about - both have pros and
cons, but please don't come up with yet another incompatible data
model. The SBI definitely needs to pass the hardware description
to the operating system in the form of a device-tree (please refer
to the links above for a large number of reasons why this is the
only reasonable option) and it doesn't make sense to have a
different data model from hardware->SBI than from SBI->operating
system. Having different representations (i.e. textual from
hardware->SBI and binary from SBI->operating system) isn't that
much of a problem as long as the data model stays the same, but
using different data models inevitably leads to problems when
converting from one data model to the other, so by all means
avoid falling into that trap.

Regards,
Karsten
--
Per §28(4) of the German Federal Data Protection Act, I object to the
use and transfer of my personal data for purposes of advertising or
market and opinion research.

Michael Clark

Jan 12, 2017, 4:36:48 PM
to Don A. Bailey, Stefan O'Rear, ron minnich, RISC-V ISA Dev
On 13 Jan 2017, at 10:10 AM, Don A. Bailey <do...@securitymouse.com> wrote:

DTS is referenced in chapter 8, not chapter 6, of the privilege specification. Unless you're reading an alternative than the latest 1.9.1. 

The DTS they reference is, as far as I can tell, the same DTS that I am referencing. Albeit, the DeviceTree Specification that I'm referencing is written as DTSpec and I shorten it to DTS, which is used colloquially among the embedded groups I hang out with. 

That said, it is, presumably, the same thing, as the header /dts-v1/ is the same as used in the DTSpec "DTS". 

So, yes, this is what DTS means. I'd appreciate it if we stopped redirecting the conversation toward off-topic remarks. I’d like to hold off conversation until someone implementing <a spec> comes along and makes a statement. 

I think a large part of the discussion was focused around reuse of existing driver code in well known kernels: 

linux-4.6.2$ find . -name '*.c' | xargs egrep '(of_read_number|of_read_addr|of_property_read_u32|of_property_read_bool|of_property_read_string)' | wc -l
    3014


The config string model could for example be plumbed into an existing property tree mechanism in various kernels to avoid rewriting drivers.

For new drivers, it’s very much a question of how wisely the config mechanism is used, e.g. a base address can communicate enough information for a well-designed device, and it is typically all that is needed for most devices.

One example is the OpenPIC/MPIC interrupt controller which is implemented in KVM. It’s explicitly called out as a design feature that the number of IRQs and processors as well as other features can be reflected from the device aperture. See section 2.1 Feature Highlights.

http://www.mess.org/_media/datasheets/chrp/19725c_opic_spec_1.2_oct95.pdf

The granularity needs to be balanced. It’s appropriate for synthesis time register parameters to be exposed via the IO aperture of the device rather than out-of-band in a config string. Too many attributes to describe one device in the config string and things will quickly become unwieldy.

It seems clear we need a way to put 128-bit addresses in the config string, and ASCII hex is probably the most future proof. If we look at the history of PC interfaces for querying memory and peripherals (disk, RAM) we see a continual redesign of binary interfaces to accommodate larger addresses. e.g. 20-bit, 32-bit, 48-bit (SATA), 64-bit, etc.. I think encoding memory addresses in the config string is a wise use of a mechanism that we know will be future-proof in this regard.


Samuel Falvo II

Jan 12, 2017, 4:46:30 PM
to Don A. Bailey, Stefan O'Rear, ron minnich, RISC-V ISA Dev
On Thu, Jan 12, 2017 at 1:10 PM, Don A. Bailey <do...@securitymouse.com> wrote:
> DTS is referenced in chapter 8, not chapter 6, of the privilege
> specification. Unless you're reading an alternative than the latest 1.9.1.


He's referring to chapter 6 of the same Device Tree Specifications
document you linked to. First sentence of the first paragraph, to be
specific.

--
Samuel A. Falvo II

Michael Clark

Jan 12, 2017, 4:46:54 PM
to RISC-V ISA Dev, Stefan O'Rear, ron minnich, Don A. Bailey

On 13 Jan 2017, at 10:36 AM, Michael Clark <michae...@mac.com> wrote:

One example is the OpenPIC/MPIC interrupt controller which is implemented in KVM. It’s explicitly called out as a design feature that the number of IRQs and processors as well as other features can be reflected from the device aperture. See section 2.1 Feature Highlights.

http://www.mess.org/_media/datasheets/chrp/19725c_opic_spec_1.2_oct95.pdf


BTW I think the design of the PLIC register interface to read an IRQ number from a register and write it to the register is a simpler, better and more scalable design than the OpenPIC/MPIC IRQ bitvectors. The PLIC design is very much appropriate for message signalled interrupts, and lets the hardware handle priority and scanning of a potentially much larger number of devices that would require onerous scanning of large bitvectors in software. I can see this design approach will allow the PLIC to scale.

I was primarily mentioning OpenPIC/MPIC not for its register layout (it uses a bitvector for claim). Clearly we are going to need bitvectors for enable and priority, but it would be nice if their register layout could be discovered via parameters in the MMIO aperture. We want to spend as little effort as possible plumbing the config string through the various kernels. When you write the code, once you find the device, it’s easier if you can reflect the parameters from the context you have, i.e. the initialisation routine can read the dimensions of the PLIC using a register or two in the config space.

I guess it depends on how easy it is to plumb through parameters.

Don A. Bailey

Jan 12, 2017, 4:53:38 PM
to Samuel Falvo II, Stefan O'Rear, ron minnich, RISC-V ISA Dev
Ah, thanks Samuel. I couldn't quite grok what was being stated there. 



Don A. Bailey
Founder / CEO
Lab Mouse Security


Stefan O'Rear

Jan 12, 2017, 4:54:11 PM
to Don A. Bailey, ron minnich, RISC-V ISA Dev
On Thu, Jan 12, 2017 at 1:10 PM, Don A. Bailey <do...@securitymouse.com> wrote:
> DTS is referenced in chapter 8, not chapter 6, of the privilege
> specification. Unless you're reading an alternative than the latest 1.9.1.

Not the privilege spec, the device tree specs.

DTS is https://rawcdn.githack.com/devicetree-org/devicetree-specification-released/master/devicetree-specification-v0.1-20160524.pdf#page=46

https://github.com/torvalds/linux/blob/master/Documentation/devicetree/booting-without-of.txt#L1098-L1100

Both of these documents indicate Device Tree Source, _not_ Device Tree
Specification.

> So, yes, this is what DTS means. I'd appreciate it if we stopped redirecting
> the conversation toward off-topic remarks.

I would too, but on-topic conversation requires mutual understanding.
I now know that you use "DTS" to mean what I mean by "Device Tree", so
we can make this work.

> I'd like to hold off conversation
> until someone implementing <a spec> comes along and makes a statement.

My next big project is going to be a RISC-V hypervisor, as an acid
test of the S-mode design and so that I can make more informed
proposals on how to change it.

-s

Don A. Bailey

Jan 12, 2017, 4:57:58 PM
to Stefan O'Rear, ron minnich, RISC-V ISA Dev
Yeah, sorry, I thought you were pointing to the RISC-V specs and misunderstood what you were trying to say. 

Thanks,

Don A. Bailey
Founder / CEO
Lab Mouse Security


Stefan O'Rear

Jan 12, 2017, 5:01:23 PM
to Michael Clark, RISC-V ISA Dev, ron minnich, Don A. Bailey
On Thu, Jan 12, 2017 at 1:46 PM, Michael Clark <michae...@mac.com> wrote:
> BTW I think the design of the PLIC register interface to read an IRQ number
> from a register and write it to the register is a simpler, better and more
> scalable design than the OpenPIC/MPIC IRQ bitvectors. The PLIC design is
> very much appropriate for message signalled interrupts, and lets the
> hardware handle priority and scanning of a potentially much larger number of
> devices that would require onerous scanning of large bitvectors in software.
> I can see this design approach will allow the PLIC to scale.

My main complaint with the PLIC as currently specified is that it
requires 3 MMIO accesses for every interrupt delivered, which seems
problematic for virtualization (once the PLIC's self-virtualization
abilities are exhausted).

I'd be interested in a longer discussion of the tradeoffs and design
parameters around interrupt handling.

-s

Michael Clark

Jan 12, 2017, 5:50:02 PM
to Stefan O'Rear, RISC-V ISA Dev, ron minnich, Don A. Bailey
A message signalled interrupt approach is certainly going to be better than scanning large bit vectors as used in traditional PICs.


If there is an MMIO register, as there is in the current PLIC, then there need to be either two or three accesses, with the third access amortising the cost of processing more than one interrupt inside one trap (interrupts signalled separately would require 4 accesses each):

- check sign on cause, use alternate sparse interrupt jump table
- read IRQ number, until zero (2 accesses)
- <process interrupt>
- write IRQ number for End Of Interrupt (1 access)

Two points worth raising:

- potential for using a single cause vector with one entry each for timer, software and external interrupts, removing the sign bit from async exceptions
- tighter coupling of the PLIC and/or software interrupts could use the (2^32 - 11) unused causes, injecting interrupts into the cause CSR


I can see the benefit in software of removing the sign check, having a single trap jump table, and potentially using “cause” to hold the message signal number. I however don’t know what constraints this would imply on the hardware side, i.e. injecting an async interrupt trap cause into a CSR that may be renamed with in-flight synchronous traps (page faults, etc.). This is the right place to trim cycles: if a few cycles can be removed from each of the next quintillion RISC-V interrupts, it may well add up to a serious amount of processing time.

Injecting into “cause” would dictate a queuing mechanism, or risk lost interrupts, which is a problem for edge-triggered interrupts.


I have some notes about mideleg (potential removal). I was wondering whether mideleg, hideleg and sideleg can be removed. If there was a single cause vector then medeleg could be used (for the first 32 trap causes).

I’ll make an attempt at some justifications:

- The multiplexed timer interrupt uses M-mode software-based delegation, i.e. setting HTIP, STIP or UTIP
- If the mideleg mechanism is used, and there is a single clock per hart, M-mode would be starved of timer interrupts and no longer able to multiplex them
- Similar rationale applies to software interrupts: if there is a single source per hart, they need to be multiplexed by M-mode
- There do not yet exist 4 mode-specific timer compare registers, nor 4 mode-specific IPI registers, as the causes would lead one to believe
- External interrupt priority is handled by the PLIC, and it presumably will not use the mideleg mechanism as it will handle a large vector of interrupts
- Delegation can be achieved by M-mode simply not setting MTIE, MSIE or MEIE; if mstatus and a lower-privilege {H,S,U}{T,S,E}IE flag are set, then the interrupt is effectively delegated
- Only one cause per interrupt source is required, as the mode is also M-mode for mtimecmp and mipi

The only use for the delegation registers is that they allow interrupts to avoid an interrupt-mask check at a higher privilege level; however, this complicates the logic. Given one interrupt source for each type (timer, IPI, PLIC), interrupt delegation can be performed by the higher privilege level simply by leaving interrupts masked so they get delivered to the next lower privilege level.

Michael Clark

Jan 12, 2017, 6:05:59 PM
to Stefan O'Rear, RISC-V ISA Dev, ron minnich, Don A. Bailey

On 13 Jan 2017, at 11:49 AM, Michael Clark <michae...@mac.com> wrote:

- check sign on cause, use alternate sparse interrupt jump table
- read IRQ number, until zero (2 accesses)
- <process interrupt>
- write IRQ number for End Of Interrupt (1 access)

Oh right; I see what you mean. We could use a bit to indicate it’s the last interrupt in this interrupt amortisation scheme. Cycles on the processor are cheaper than cycles on the bus.

Bruce Hoult

Jan 12, 2017, 8:00:53 PM
to Stefan O'Rear, Don A. Bailey, ron minnich, RISC-V ISA Dev
Any opinion on dtb? Is it any easier to parse? It's not hard to find editors/compilers/decompilers.



Ilan Pardo

Jan 26, 2017, 11:09:32 AM
to RISC-V ISA Dev, sor...@gmail.com, rmin...@gmail.com, do...@securitymouse.com


On Friday, January 13, 2017 at 12:50:02 AM UTC+2, michaeljclark wrote:

> On 13 Jan 2017, at 11:01 AM, Stefan O'Rear <sor...@gmail.com> wrote:
>
> On Thu, Jan 12, 2017 at 1:46 PM, Michael Clark <michae...@mac.com> wrote:
>> BTW I think the design of the PLIC register interface to read an IRQ number
>> from a register and write it to the register is a simpler, better and more
>> scalable design than the OpenPIC/MPIC IRQ bitvectors. The PLIC design is
>> very much appropriate for message signalled interrupts, and lets the
>> hardware handle priority and scanning of a potentially much larger number of
>> devices that would require onerous scanning of large bitvectors in software.
>> I can see this design approach will allow the PLIC to scale.
>
> My main complaint with the PLIC as currently specified is that it
> requires 3 MMIO accesses for every interrupt delivered, which seems
> problematic for virtualization (once the PLIC's self-virtualization
> abilities are exhausted).
>
> I'd be interested in a longer discussion of the tradeoffs and design
> parameters around interrupt handling.

> A message signalled interrupt approach is certainly going to be better than scanning large bit vectors as used in traditional PICs.
 
Message Signaled Interrupts are better for scalability as well: the message defines the target core and vector. 
Within a core there could be a limited number of "vectors" (256, as on x86?), so they can be tracked as pending bits in HW.
HW can expose the index of the highest-priority pending bit as the "cause", to be used as an index into a jump table of handlers.
I agree the simpler model is to separate the vector from other exceptions.

Virtualizing MSIs is simply remapping the messages, in the same manner as memory addresses.

Samuel Falvo II

Jan 26, 2017, 11:23:20 AM
to Ilan Pardo, RISC-V ISA Dev, Stefan O'Rear, ron minnich, Don A. Bailey
On Thu, Jan 26, 2017 at 8:09 AM, Ilan Pardo <pardo...@gmail.com> wrote:
> Message Signaled Interrupt is better form scalability as well. Message
> defines target core and vector.
> Within core there could be limited amount of "vectors" (256 as x86?) so can
> be tracked in pending bits in HW.
> HW can expose index of "highest priority pending bit location" as "cause" to
> be used as a jump table to handler.


This was my original point, but was pretty much summarily shot down.
MSI is a feature of the *bus*, not necessarily of the software running
on a given core.

Can anyone give a valid use-case of a computer having thousands of
uniquely identifiable interrupt sources, each requiring its own
handler? I can't think of any; even zArchitecture gets by with only a
handful of *classes* of interrupts, and I can't think of a more I/O
bound architecture than that.

64 pending interrupt flags is **A** **LOT** of interrupt pending
flags. The telco equipment chips I encountered back when I worked on
the HIPP series of chips at Hifn used only a single interrupt pin and
a bit-vector of pending flags. I'm positive the 32-bit register had
more than a few unused bits.

Michael Clark

unread,
Jan 26, 2017, 4:11:24 PM1/26/17
to Samuel Falvo II, Ilan Pardo, RISC-V ISA Dev, Stefan O'Rear, ron minnich, Don A. Bailey

> On 27 Jan 2017, at 5:23 AM, Samuel Falvo II <sam....@gmail.com> wrote:
>
> On Thu, Jan 26, 2017 at 8:09 AM, Ilan Pardo <pardo...@gmail.com> wrote:
>> Message signalled interrupts are better for scalability as well. The message
>> defines the target core and vector.
>> Within a core there could be a limited number of "vectors" (256, as on x86?),
>> so they can be tracked as pending bits in HW.
>> HW can expose the index of the highest-priority pending bit as the "cause",
>> to be used as an index into a jump table of handlers.
>
>
> This was my original point, but was pretty much summarily shot down.
> MSI is a feature of the *bus*, not necessarily of the software running
> on a given core.

I think reading the highest-priority interrupt number from a per-hart cause register is a good idea.

> Can anyone give a valid use-case of a computer having thousands of
> uniquely identifiable interrupt sources, each requiring its own
> handler? I can't think of any; even zArchitecture gets by with only a
> handful of *classes* of interrupts, and I can't think of a more I/O
> bound architecture than that.

Yes. An example is virtualised infrastructure in data centers. A distinct interrupt source is required to allow a device to raise an interrupt on a specific hart via a per-hart interrupt enable bit vector. Imagine a present-day 4 x 25GbE network card with 64 circular buffers on a 64-core system, with one circular buffer assigned to each core. One device now requires 64 distinct interrupt 'numbers'.

> 64 pending interrupt flags is **A** **LOT** of interrupt pending
> flags. The telco equipment chips I encountered back when I worked on
> the HIPP series of chips at Hifn used only a single interrupt pin and
> a bit-vector of pending flags. I'm positive the 32-bit register had
> more than a few unused bits.


64 pins is a lot of interrupt pins; however, with message signalled interrupts (~32 pins and a valid/ready bus protocol for interrupts), the same number of pins can express 2^30 interrupt sources.

I see that even in 1995, OpenPIC was specifying 2048 interrupt sources, well before commodity multi-core systems were common; and we now have RISC-V systems with more than 1024 cores.

Reading a prioritised per-hart cause or MMIO register is a better way to read interrupts than scanning a large pending bit vector in software. We would need a count-trailing-zeros instruction to do the latter quickly in software.

64 bits would be fine for scanning a bit vector, but if the bit vector were 2048 bits then reading the highest-priority interrupt number is going to be more efficient.
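For comparison, the software-scanning approach being argued against could be sketched like this in C (assumptions: a 2048-bit pending vector held as 64-bit words, lower index = higher priority, and the GCC/Clang `__builtin_ctzll` builtin standing in for a count-trailing-zeros instruction):

```c
#include <stdint.h>

#define NUM_SOURCES 2048            // e.g. a 2048-bit pending vector
#define NUM_WORDS (NUM_SOURCES / 64)

// Return the index of the lowest set bit across the vector, or -1 if
// nothing is pending. This is the work the claim register would do in
// hardware; in software it costs up to NUM_WORDS loads per interrupt.
int highest_priority_pending(const uint64_t pending[NUM_WORDS]) {
    for (int w = 0; w < NUM_WORDS; w++)
        if (pending[w])
            return w * 64 + __builtin_ctzll(pending[w]);
    return -1;
}
```

The loop makes the scaling problem visible: at 64 sources it is one load and one ctz, but at 2048 sources the worst case is 32 loads before the first set bit is found.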

Jacob Bachmeyer

unread,
Jan 26, 2017, 6:39:37 PM1/26/17
to Samuel Falvo II, Ilan Pardo, RISC-V ISA Dev, Stefan O'Rear, ron minnich, Don A. Bailey
Samuel Falvo II wrote:
> On Thu, Jan 26, 2017 at 8:09 AM, Ilan Pardo <pardo...@gmail.com> wrote:
>
>> Message signalled interrupts are better for scalability as well. The message
>> defines the target core and vector.
>> Within a core there could be a limited number of "vectors" (256, as on x86?),
>> so they can be tracked as pending bits in HW.
>> HW can expose the index of the highest-priority pending bit as the "cause",
>> to be used as an index into a jump table of handlers.
>>
>
>
> This was my original point, but was pretty much summarily shot down.
> MSI is a feature of the *bus*, not necessarily of the software running
> on a given core.
>
> Can anyone give a valid use-case of a computer having thousands of
> uniquely identifiable interrupt sources, each requiring its own
> handler? I can't think of any; even zArchitecture gets by with only a
> handful of *classes* of interrupts, and I can't think of a more I/O
> bound architecture than that.

If nothing else, this could be done as a rather extreme performance
optimization: move the ISR dispatch entirely into hardware. But if you
are going to spend the silicon area to support this, I would suggest
going the rest of the way and making the ISRs ordinary C functions. At
that point, you have made a new privileged architecture and I would also
suggest dedicated interrupt-handling harts while you are at it.

On a more practical note, having a range of cause codes available for
virtual interrupts would be very nice. The channel I/O model I envision
would allow a supervisor to specify a "virtual interrupt number" for any
readiness or completion notification. The specified virtual interrupt
would then turn up in the cause CSR when the S-mode interrupt handler is
invoked. The supervisor can assign virtual interrupts to I/O channels
any way it likes, from one interrupt number per I/O transaction down to
a single I/O interrupt number for everything, or anything in between.

-- Jacob

Samuel Falvo II

unread,
Jan 26, 2017, 6:50:30 PM1/26/17
to Jacob Bachmeyer, Ilan Pardo, RISC-V ISA Dev, Stefan O'Rear, ron minnich, Don A. Bailey
So now, instead of performing a boolean test to see if an IRQ has
occurred, you've replaced that logic with (at a minimum) a range test.
Doesn't seem reasonable to me.

But that's just me, I guess. I still think this is more complex than
it needs to be.

Michael Clark

unread,
Jan 26, 2017, 7:53:26 PM1/26/17
to jcb6...@gmail.com, Samuel Falvo II, Ilan Pardo, RISC-V ISA Dev, Stefan O'Rear, ron minnich, Don A. Bailey
Cause injection is interesting, especially for virtualisation.

“External” could be reserved for simple 1-pin interrupt line systems, e.g. zscale RV32I. A full system with an integrated PLIC, or virtualisation software, could use cause to inject the software or hardware message signalled interrupt number, potentially making use of all bits of the cause register. In the IPI case, the interrupting processor typically wants to send a message like TLB_SHOOTDOWN or RESCHEDULE, so values below 16 are for hardware exceptions and values above are for hardware/virtual interrupts. The line becomes blurred when an injected interrupt number comes from a PLIC or from software emulating a hardware interrupt.

The problem is that “cause” has normal register semantics, whereas the PLIC is a device that holds state and can queue while the target CPU is busy with interrupts masked. The PLIC’s MMIO claim address can return a different interrupt number each time it is read, i.e. it dequeues the highest-priority interrupt first. In this read-interrupt-number model, part of the interrupt processing has been moved into hardware, i.e. there is no need for __builtin_ctz(pending) to find the first set bit in a large pending bitfield (assuming the lowest value is the highest priority).

But if the cause CSR is used instead of MMIO, how does one handle queuing a software interrupt if an ISR is already running on the target hart? i.e. cause can’t be changed while interrupts are masked. Instead of holding a pending bit from a source hart, which requires 2^n bits for n harts, the problem suddenly becomes much harder: the endpoint needs to spin and attempt redelivery. This ends up moving interrupt processing into software. Or the PLIC has to keep state somehow.

It is at least an interesting thought experiment to think of the PLIC claim register being accessed via the cause CSR and providing a mechanism for virtualisation software to inject into cause.

So one writes to the PLIC via MMIO, it keeps state, but interrupts emerge, one at a time via mcause?
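The claim/complete pattern described above might be sketched as follows (a hedged C sketch; the claim register's location comes from the platform memory map, so it is passed in as a pointer here rather than hard-coded, and the names are illustrative):

```c
#include <stdint.h>

// Each read of the claim register is described as dequeuing the
// highest-priority pending interrupt; a value of 0 means none pending.
static inline uint32_t plic_claim(volatile uint32_t *claim_reg) {
    return *claim_reg;
}

// Writing the claimed number back signals completion to the PLIC.
static inline void plic_complete(volatile uint32_t *claim_reg, uint32_t irq) {
    *claim_reg = irq;
}

// Drain all pending sources: claim, dispatch, complete, repeat.
void drain_interrupts(volatile uint32_t *claim_reg,
                      void (*handlers[])(void)) {
    uint32_t irq;
    while ((irq = plic_claim(claim_reg)) != 0) {
        handlers[irq]();
        plic_complete(claim_reg, irq);
    }
}
```

Note that the queuing state lives entirely in the device behind the claim register; the cause-CSR variant discussed above would need that state to live somewhere else.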

Michael Clark

unread,
Jan 26, 2017, 10:46:25 PM1/26/17
to Samuel Falvo II, Jacob Bachmeyer, Ilan Pardo, RISC-V ISA Dev, Stefan O'Rear, ron minnich, Don A. Bailey
On 27 Jan 2017, at 12:50 PM, Samuel Falvo II <sam....@gmail.com> wrote:

So now, instead of performing a boolean test to see if an IRQ has
occurred, you've replaced that logic with (at a minimum) a range test.
Doesn't seem reasonable to me.

But that's just me, I guess.  I still think this is more complex than
it needs to be.

For a 64-bit vector of interrupts there is no win for the claim messaging approach over a pending bitfield, other than the bitfield scan that ctz will do in 1 cycle. It’s probably more circuit complexity for the message interface than for pin-based interrupts. In fact, if the claim messaging approach is changed to set the high bit for the last message (versus reading zero), then it’s equal to one read and one write acknowledge on a 64-bit bitfield.

But if there are going to be 512 or more IRQs, then a pending bitfield requires reading 8 x 64-bit words, which is 8x the overhead, so the message-based approach is a win. Interrupt acknowledge is on the fast path. However, if the interrupt controller's pending mask is coherent with the cache and its words end up in L2, then the acknowledge overhead might just be in the noise, as writes to an integrated interrupt controller should be much faster than writes to DRAM. 512 bits is one cache line.

If we consider IOV (I/O virtualisation): hardware designers are putting 128 transmit and 128 receive queues into present-day IOV devices, so when RISC-V is in the datacenter, the PLIC will be connected to single devices with 128 sources, inside compute and storage nodes with several of these IOV devices (several multi-queue NICs and SAS HBAs per system).

I could envisage 512 interrupts being required in one system… which means providing for more…

It shares a similar circuit to __builtin_ctz, so scanning in software would eliminate a duplicated function in silicon… unless the interrupts were principally message signalled; then the message signalled interface makes much more sense, as interrupts could be queued as words. The density for the number of sources is better for a small queue of words if only a few sources have interrupts pending. The bitfield approach doesn’t support queueing.

If someone wants 16,384 IRQs… then it’s clear cut. :-D


Jacob Bachmeyer

unread,
Jan 27, 2017, 8:50:13 PM1/27/17
to Samuel Falvo II, Ilan Pardo, RISC-V ISA Dev, Stefan O'Rear, ron minnich, Don A. Bailey
Samuel Falvo II wrote:
> So now, instead of performing a boolean test to see if an IRQ has
> occurred, you've replaced that logic with (at a minimum) a range test.
> Doesn't seem reasonable to me.
>
> But that's just me, I guess. I still think this is more complex than
> it needs to be.
>

In the model I envision, the supervisor assigns virtual interrupt
numbers and can choose virtual interrupt numbers that permit dispatching
through a jump table (mask; add to table base pointer; jump). If the
supervisor assigns 11-bit virtual interrupt numbers, that could be as
little as four instructions: ANDI, LOAD-POINTER (to get the table base
address), ADD, JAL (probably "linking" into x0). If more than 2048
virtual interrupts are needed, a few more instructions are required to
form a larger mask, but most supervisors should be able to make do with
2048 virtual interrupts. Interrupt vs. synchronous trap is still a
boolean test on the sign bit in *cause. I would prefer that virtual
interrupts be allocated a similarly large space within interrupt cause
codes; perhaps virtual interrupts are indicated by the
second-most-significant bit in *cause, and the rest of the register is a
virtual interrupt number when the top two bits are set?
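A minimal C sketch of the mask-and-dispatch sequence described in this paragraph (the 11-bit width and the table layout are the message's own assumptions, not a fixed spec; the function names are hypothetical):

```c
#include <stdint.h>
#include <stddef.h>

#define VIRQ_MASK 0x7ffu  // 11-bit virtual interrupt numbers -> 2048 entries

typedef void (*virq_handler_t)(uint64_t cause);
static virq_handler_t virq_table[VIRQ_MASK + 1];

// Mask the cause down to the virtual interrupt number and dispatch
// through the table. This is the ANDI / load-table-base / ADD / JAL
// sequence from the text, written in C; the masked index is returned
// for illustration.
uint64_t dispatch_virq(uint64_t cause) {
    uint64_t idx = cause & VIRQ_MASK;
    if (virq_table[idx] != NULL)
        virq_table[idx](cause);
    return idx;
}
```

Because the mask discards the high bits, the same dispatch works whether or not the top two "this is a virtual interrupt" bits are set in *cause.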

To clarify, virtual interrupts are only used with SBI channel I/O and
are not actual hardware interrupts, although I expect that they would
usually be reflections of hardware interrupts in practice. Instead, the
SEE saves S-mode state, sets up register values for a virtual interrupt,
loads {m,h}epc with stvec, and executes {M,H}RET to "return" to the
S-mode trap entry point. I am still working on how S-mode acknowledges
virtual interrupts and how the SEE regains control and restores the
saved S-mode state.

Virtual interrupts could also be useful in user mode, even on systems
lacking the U extension, since a supervisor could emulate URET and the
relevant CSRs in its illegal instruction handler.

-- Jacob

Reply all
Reply to author
Forward
0 new messages