mtime/mtimecmp and timer interrupt delegation


Michael Clark

Nov 19, 2016, 10:42:43 PM
to RISC-V ISA Dev
Hi,

I have a question about timer interrupt delegation. First - background:

- There are 4 timer interrupt causes: M_TIMER, H_TIMER, S_TIMER and U_TIMER
- There are 4 timer interrupt enable flags: mie.MTIE, mie.HTIE, mie.STIE, mie.UTIE
- There are 4 timer interrupt pending flags: mip.MTIP, mip.HTIP, mip.STIP, mip.UTIP
- There are 4 timer interrupt delegation flags: mideleg.M_TIMER, mideleg.H_TIMER, mideleg.S_TIMER, mideleg.U_TIMER

I have attempted to implement trap delegation (medeleg) and timer interrupt delegation (mideleg) as per priv-spec-1.9.1 with mtime/mtimecmp as MMIO devices (including U mode interrupts).

The question I have is about the defined semantics of timer interrupt delegation and which timer interrupt enable flags should be tested during timer interrupt delivery.

Below is the logic I have implemented. The cause transitions depending on the timer interrupt delegation flags. Is this the intention? (there is only one timer)

if mstatus.MIE=1 and mideleg.M_TIMER=0 and mie.MTIE=1
    then
        set mip.MTIP=1
        set mcause = signbit | M_TIMER
        set pc = mtvec

if mstatus.MIE=1 and mideleg.M_TIMER=1 and mideleg.H_TIMER=0 and mie.HTIE=1
    then
        set mip.HTIP=1
        set hcause = signbit | H_TIMER
        set pc = htvec

if mstatus.MIE=1 and mideleg.M_TIMER=1 and mideleg.H_TIMER=1 and mideleg.S_TIMER=0 and mie.STIE=1
    then
        set mip.STIP=1
        set scause = signbit | S_TIMER
        set pc = stvec

if mstatus.MIE=1 and mideleg.M_TIMER=1 and mideleg.H_TIMER=1 and mideleg.S_TIMER=1 and mie.UTIE=1
    then
        set mip.UTIP=1
        set ucause = signbit | U_TIMER
        set pc = utvec

In priv-spec-1.9, Section 3.1.14 Machine Timer Registers (mtime and mtimecmp) specifies only MTIE; however, the interrupt delegation mechanism is defined along with the timer interrupt causes for each privilege level. I am just wondering if my interpretation matches the intent of the specification.

Here is the sample code:


Cheers,
Michael.

Stefan O'Rear

Nov 20, 2016, 3:56:08 AM
to Michael Clark, RISC-V ISA Dev
On Sat, Nov 19, 2016 at 7:42 PM, Michael Clark <michae...@mac.com> wrote:
> In priv-spec-1.9, Section 3.1.14 Machine Timer Registers (mtime and
> mtimecmp) only specify MTIE is specified, however the interrupt delegation
> mechanism is defined along with the timer interrupt causes for each
> privilege level. Just wondering if my interpretation matches the intent of
> the specification.

My interpretation:

* the hardware timer causes M_TIMER interrupt, which is delivered to
M-mode and can be delegated, but always delegates as M_TIMER
* the other three "timers" are actually software interrupts triggered
by the corresponding bits of MIP, e.g. if M-mode code sets mip.STIP,
then a S_TIMER exception will immediately(!) be pending in M-mode. If
mideleg.S_TIMER and hideleg.S_TIMER are both set, then the S_TIMER
interrupt will not be taken until the hart is in S-mode
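
A minimal C sketch of this reading, modelling mip and mie as plain variables rather than real CSRs. The struct and function names are hypothetical; the bit positions (STIP = 5, MTIP = 7) follow the mip layout in priv-spec-1.9.1. Masking MTIE after forwarding is an assumption added for the sketch, so that the handler does not immediately re-enter; nothing in the thread states it.

```c
#include <stdint.h>

/* Bit positions in mip/mie per priv-spec-1.9.1. */
#define MIP_STIP (UINT32_C(1) << 5)
#define MIP_MTIP (UINT32_C(1) << 7)

/* Hypothetical software model of one hart's interrupt CSRs. */
struct hart_csrs {
    uint32_t mip;
    uint32_t mie;
};

/*
 * What an M-mode trap handler might do on a machine timer interrupt
 * under this reading: forward it to S-mode by raising mip.STIP.
 * Returns 1 if the timer was forwarded, 0 if no timer was pending.
 */
int m_forward_timer_to_s(struct hart_csrs *h)
{
    if (!(h->mip & MIP_MTIP))
        return 0;
    h->mip |= MIP_STIP;   /* S_TIMER is now pending */
    h->mie &= ~MIP_MTIP;  /* assumption: mask MTIE until mtimecmp is reprogrammed */
    return 1;
}
```

In this model the S_TIMER interrupt only exists because software raised it, which is the distinction from a hardware-routed timer.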

-s

Michael Clark

Nov 21, 2016, 1:07:05 PM
to Stefan O'Rear, RISC-V ISA Dev
This is interesting. I wasn’t aware that M-mode had to run code to dispatch S-mode interrupts. If these interrupts are dispatched in hardware things will be faster… however it may necessitate more than one timer compare register in the case that more than one mode is implemented…

S-mode is able to set the sie.STIE bit (through the sie masked alias of mie) and the timer is an MMIO page, so M-mode shouldn’t need to be involved in the S-mode fast path. We’ll have to figure this out… I will proceed confidently with the model I am working on.

I’m not suggesting what I have implemented is right or wrong as the privileged spec is still evolving, but I’m now thinking about direct dispatch of interrupts to specific modes based on the delegation registers. I guess in the model you propose there are several cycles in M-mode to set mip.STIP and then MRET. Perhaps it is only a small overhead. In any case it’s worthwhile to explore the solution space. I like the xRET model as there is no need to allow for a one cycle delay before actually enabling interrupts as there is with the STI IRET model (STI waits for the next instruction to complete before enabling interrupts).

The way I have rigged the model I am working on is that S-mode doesn’t need to involve M-mode (to set the interrupt pending flag) or to use SBI for timer interrupt setup. I guess this is by design in the specification. However, we may need 2 or more timer compare registers, each with an associated priority/privilege level, to control dispatch directly to a specific privilege level, because if the one timer is delegated to S-mode then there is no way for M-mode to receive timer interrupts. M-mode would become completely passive in the single timer model with S-mode delegation. I don’t know how expensive a timer compare register is per privilege level.

The model I am thinking of has one timer compare register for the minimum M-mode-only case, two timer compare registers for the M-mode plus S-mode protected memory operating system case, and potentially 3 or 4 timer compare registers for the extended H-mode and U-mode use case. Each of the up to 4 timer compare registers in the MMIO device has an associated privilege level (or priority) for the interrupt it raises to the core.

SMM must have its own timer independent from the per-core Local APIC on Intel (thinking aloud). Of course each core has its own timer, so in the SMP case there needs to be an aperture with timer compare registers for each core (and potentially for 1, 2, 3 or 4 priority levels). mtime and mtimecmp could be in a LLIC (Local Level Interrupt Controller) versus a distinct MMIO device. The LLIC could contain the local time and the local timer compare registers (1-4). The current mtime and mtimecmp model does not consider SMP.

Regarding direct interrupt delivery versus M-mode interposition: the PLIC spec also alludes to direct privilege level delivery, so what I have may not be completely wrong, or only partly wrong. The case may be different in a virtualised environment where H-mode wants to choose which S-mode instance receives the interrupt; however, in the model I am working on, this could be achieved by delegating the interrupts to H-mode, with H-mode then setting the correct S-mode context before setting SxIP.

"Interrupt notifications generated by the PLIC appear in the meip/heip/seip/ueip bits of
the mip/hip/sip/uip registers for M/H/S/U modes respectively”

That wording implies that at least “external interrupt” pending bits for each mode can be set directly by the PLIC, so I see no reason why this should not be the case for timers and software interrupts. The CSR machinery is there to configure the delegation and access the interrupt enable bits from masked aliases of mie and mip e.g. {m,h,s,u}ip and {m,h,s,u}ie and there is provision for setting the priority in the PLIC.



A PLIC-like Interrupt Controller

So I have now implemented a PLIC-like interrupt controller and have had similar issues with my understanding of riscv interrupt routing. I have given the PLIC interrupt delegation and cause mapping symmetrical treatment to timers and software interrupts. i.e. cause is translated to S_EXTERNAL if hardware interrupts are delegated to S-mode. I also consider the priority setting in the PLIC when delivering interrupts. I reasoned from the distinct external interrupt causes defined for M,H,S,U. I was not aware that lower privilege levels may need M-mode to dispatch interrupts (set flags and call MRET, potentially also ping ponging through H-mode), however this may be one compatible model. i.e. as long as the SxIP flag is eventually set and stvec for the correct context is invoked then it can be either soft or hard dispatch.

The MMIO devices (or each interrupt source) should have a priority/privilege level associated with them i.e. the LLIC timecmp registers should potentially be mode specific. The privilege level/priority of the interrupt should be wired in the config string to describe the routing so we know which privilege level the interrupt is associated with. M-mode interrupts in my mind are more like SMM interrupts, and H-mode delegation versus S-mode delegation is for the case where the Hypervisor is to receive the interrupt and delegate to the specific S-mode context. There are four external interrupt bits that can be set by the PLIC.

So the question is: will M-mode software dispatch always be the way H, S, and U external interrupts are triggered, by software and not by the processor directly? Or is this implementation defined? I see this as implementation defined until it is standardised.

I understand that a minimal implementation of RV32I with a hardware TLB could perform a large amount of the privileged spec in M-mode software, i.e. walking page tables, handling interrupt delegation with many of the CSRs in RAM emulated via trap, etc. However, I imagine some implementations may choose to spend more area if this saves cycles in critical paths, while others may have only one bit and emulate much of the mode machinery in software. The trap delegation and privilege level routing seems to be something that could be done in hardware in some implementations: instead of having M-mode software set mip.STIP, we can have the hardware set it based on a privilege level associated with an interrupt source. They are not necessarily mutually exclusive. I am working on an emulator, so in my case I am setting the bit outside of the machine rather than in an M-mode trap handler, so this can be seen as virtual hardware dispatch versus M-mode software dispatch.

From reading the spec, it seems that the PLIC should be able to deliver interrupts to any privilege level however {m,h,s,u}_external delegation is somewhat global, so I don’t know how the PLIC priorities, privilege levels and delegation will interwork. I do however have a prototype I am working on.


Privilege Level and Interrupt Routing

In addition to priority/privilege level for direct dispatch there is also the issue of multiprocessor interrupt routing, which is hinted at in the example config strings but not yet explicitly documented in the spec. I understand that the PLIC-like interrupt controller will likely have a NODE x HART x IRQ matrix to specify which harts are enabled to receive which IRQs (I am using Linux IRQ terminology here) so this is what I have implemented. e.g. a 4 node 64 IRQ interrupt router that supports 16 hardware threads would have a 4x16x64 bit matrix (512 bytes) to allow masking individual IRQs to the set of cores that can receive interrupts. I have a parameterisable model that is self describing (dimensions of the vectors and offsets are in MMIO config registers). This is a scalable model as the same driver can work with different scales.
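
As a rough illustration of the sizing above, the sketch below computes the matrix size and looks up one enable bit. The row-major, LSB-first bit layout and both function names are assumptions made for the example; the thread does not specify how the bits are ordered.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for a NODE x HART x IRQ matrix with one bit per entry. */
size_t enable_matrix_bytes(size_t nodes, size_t harts, size_t irqs)
{
    return (nodes * harts * irqs + 7) / 8;
}

/*
 * Assumed layout (not from the thread): row-major, one row of `irqs`
 * bits per (node, hart) pair, least significant bit first.
 */
int irq_enabled(const uint8_t *matrix, size_t harts, size_t irqs,
                size_t node, size_t hart, size_t irq)
{
    size_t bit = (node * harts + hart) * irqs + irq;
    return (matrix[bit / 8] >> (bit % 8)) & 1;
}
```

With nodes=4, harts=16 and irqs=64, enable_matrix_bytes gives the 512 bytes quoted above.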

Of course if the MMIO space for the PLIC starts with the dimensions of these various tables, then a good driver can adapt to different scales without need for separate devices or messy and complicated config string parsing. The config string can be reduced to just storing the base (and size) for each MMIO device aperture and the hardware is documented like any other device. I guess I have the idea of hardware that is as much as possible self-describing (too much time looking at things like OpenCL which has a topology reflection API). e.g. if you have a core with 4 harts, you have a read-only MMIO aperture on the LLIC that has this constant wired (thinking from a device driver model here). The config string is of course necessary for things that can’t be reflected on. i.e. where the synthesis can embed the parameters of its model into read-only MMIO regions.

I don’t have this exactly yet (I have the spec model for mtimecmp and the spike model for ipi), but I am considering collapsing timecmp and ipi into a LLIC and just placing the base address in the config, as the MMIO region would be self-describing. rtc and mtimecmp are replaced by the LLIC, and it would have the IPI software interrupt bit-vector, perhaps even one per privilege level (model specific).

platform { vendor meta; arch anarch128; };
ram { 0 { @ 0x80000000:0xBFFFFFFF; }; };
plic { @ 0x40001000; };
uart { @ 0x40002000; };
core { 0 { 0 { isa rv128imafdc; llic 0x40000000; }; }; };


Interrupt Priority

I also have mapped privilege level (M=3,H=2,S=1,U=0) to interrupt priority with M=3 being the highest interrupt priority, for masking lower priority interrupts when running in higher privilege level modes. i.e. if an interrupt is delegated to S-mode, and the priority for a specific IRQ is set to S=1, then a pending interrupt on that IRQ will not be triggered when the processor is in M mode even if M mode enables interrupts. The spec isn’t specific as to whether priority maps to privilege level however it seems intuitive. This way we don’t have to worry about S-mode priority interrupt sources triggering while we are in an M-mode handler. We can control timer and interrupts in S-mode as they are MMIO. IPI is also MMIO. An IPI MMIO device for 64 harts would be an 8-byte aperture and would trigger S_SOFTWARE in the harts for which bits were set in the MMIO region. I believe SMM is somewhat like this on x86 and it must have a timer interrupt source independent from the Local APIC.
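
The IPI aperture described above could be modelled roughly as follows. This is only a sketch: the struct, the function name and the choice of a single 64-bit store are hypothetical, while SSIP = bit 1 follows the mip layout in priv-spec-1.9.1.

```c
#include <stdint.h>

#define MIP_SSIP  (UINT32_C(1) << 1)  /* supervisor software interrupt pending */
#define NUM_HARTS 64

/* Hypothetical per-hart mip values held by an emulator. */
struct ipi_model {
    uint32_t mip[NUM_HARTS];
};

/* Handle a 64-bit store to the 8-byte IPI aperture: bit n targets hart n. */
void ipi_mmio_write(struct ipi_model *m, uint64_t mask)
{
    for (int hart = 0; hart < NUM_HARTS; hart++) {
        if ((mask >> hart) & 1)
            m->mip[hart] |= MIP_SSIP;  /* raises S_SOFTWARE in that hart */
    }
}
```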


Prototype PLIC

At the moment I have a prototype PLIC with a self describing MMIO config that only needs a base address in the config string. The PLIC dimensions are in read-only registers at the start of the MMIO region so the drivers can adapt to the number of IRQs and harts, so a common device driver will support many scales. I have a NUMA node_id x hart_id enabled matrix.


A PLIC configured for 64 IRQs, 1 pending, 2 priority bits per IRQ (4 priority levels mapping to M,H,S,U) and an enabled matrix for 4 nodes and 16 harts per node is under 4K. It’s likely easier for kernel code to use self-describing read-only registers at the start of the PLIC MMIO region because then the driver can adapt to many scales easily and it only needs a base address. This is somewhat similar to PCI config space. It also means one device spec can cover many scales.

/* PLIC config registers */

u32 num_irqs;
u32 num_nodes;
u32 num_harts;
u32 num_priorities;
u32 pending_offset;
u32 pending_length;
u32 priority0_offset;
u32 priority0_length;
u32 priority1_offset;
u32 priority1_length;
u32 enabled_offset;
u32 enabled_length;

/* PLIC data registers */

u32 pending[irq_words];
u32 priority0[irq_words];
u32 priority1[irq_words];
u32 enabled[enabled_words];

For IRQS=64, nodes=4, harts=16 we get the following size:

num_irqs         64
num_nodes        4
num_harts        16
pending_offset   48
pending_length   8
priority0_offset 56
priority0_length 8
priority1_offset 64
priority1_length 8
enabled_offset   72
enabled_length   512
total_size       584

For IRQS=64, nodes=1, harts=4 we get the following size:

num_irqs         64
num_nodes        1
num_harts        4
pending_offset   48
pending_length   8
priority0_offset 56
priority0_length 8
priority1_offset 64
priority1_length 8
enabled_offset   72
enabled_length   32
total_size       104
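
The two size tables can be reproduced from the dimensions alone. The sketch below assumes the twelve u32 config registers listed earlier sit at the start of the region and that num_irqs is a multiple of 32; the struct and function names are hypothetical.

```c
#include <stdint.h>

#define PLIC_NUM_CONFIG_REGS 12  /* the twelve u32 config registers */

struct plic_layout {
    uint32_t pending_offset, pending_length;
    uint32_t priority0_offset, priority0_length;
    uint32_t priority1_offset, priority1_length;
    uint32_t enabled_offset, enabled_length;
    uint32_t total_size;
};

/* Compute byte offsets and lengths; assumes num_irqs is a multiple of 32. */
struct plic_layout plic_compute_layout(uint32_t num_irqs, uint32_t num_nodes,
                                       uint32_t num_harts)
{
    struct plic_layout l;
    uint32_t irq_bytes = num_irqs / 8;  /* one bit per IRQ */

    l.pending_offset   = PLIC_NUM_CONFIG_REGS * 4;
    l.pending_length   = irq_bytes;
    l.priority0_offset = l.pending_offset + l.pending_length;
    l.priority0_length = irq_bytes;
    l.priority1_offset = l.priority0_offset + l.priority0_length;
    l.priority1_length = irq_bytes;
    l.enabled_offset   = l.priority1_offset + l.priority1_length;
    l.enabled_length   = num_nodes * num_harts * irq_bytes;
    l.total_size       = l.enabled_offset + l.enabled_length;
    return l;
}
```

A driver could run the same arithmetic as a sanity check against the offsets it reads back from the config registers.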


It made sense to split the 4 priorities into two 1-bit arrays as it allows for simplified boolean logic for priority-based dispatch. This is “implementation defined”.

 * highest M (11)           (priority0[i] & priority1[i])
 *         H (10,11)        (priority0[i])
 *         S (01,10,11)     (priority0[i] | priority1[i])
 * lowest  U (00,01,10,11)
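
The table above translates fairly directly into word-wide boolean operations. A sketch for one 32-IRQ word, taking priority0 as the high bit and priority1 as the low bit as the table implies; the enum and function name are hypothetical.

```c
#include <stdint.h>

enum priv_mode { MODE_U = 0, MODE_S = 1, MODE_H = 2, MODE_M = 3 };

/*
 * Return the pending IRQs in one 32-bit word whose priority is at or
 * above the hart's current mode. priority0 holds the high bit and
 * priority1 the low bit of each IRQ's 2-bit priority.
 */
uint32_t deliverable_irqs(uint32_t pending, uint32_t priority0,
                          uint32_t priority1, enum priv_mode mode)
{
    switch (mode) {
    case MODE_M: return pending & priority0 & priority1;    /* 11 only */
    case MODE_H: return pending & priority0;                /* 10, 11 */
    case MODE_S: return pending & (priority0 | priority1);  /* 01, 10, 11 */
    default:     return pending;                            /* any priority */
    }
}
```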


Although I imagine the PLIC is going to get somewhat more complex now that the wording of the spec has changed from level triggered interrupts to including other types such as edge triggered and message signalled interrupts <http://yarchive.net/comp/linux/edge_triggered_interrupts.html>, I am now thinking of a case where the interrupt router is choosing between nodes and harts that are at different privilege levels, and whether the current privilege level of each core is asserted on wires to the router. I think we should ignore the NUMA case for the short-to-medium term as it will be a distraction (just keep node_id where applicable). Consider an S priority interrupt choosing between harts that are in various modes {M, H, S or U}: the interrupt router has the constraint of choosing a hart that will accept the interrupt, as a hart in a high priority state will have more interrupts masked. There is also the wording about preferring a hart that is in WFI, so WFI state would also be asserted on a wire and visible to the PLIC.

Mapping privilege level to interrupt priority made sense to me so I chose 4 priorities. If we are running in M mode, and we set MIE, we only want to receive highest priority interrupts. An interrupt shouldn’t be able to lower the privilege level. Of course most often we will be receiving S mode interrupts while in U mode, where the interrupt raises the privilege level, but we can’t lower the privilege level, i.e. deliver a U-mode interrupt to a core in M-mode. It will be masked until the core returns to (a specific) U-mode context.


Console IO

In any case I am getting relatively close to having something that will be able to boot riscv-linux or riscv-freebsd but I need to complete IO. I was actually drafting another email with questions about riscv-linux branches which I will ask later. I intend to implement a 16550 UART emulation. This would be similar to the Xilinx AXI 16550 (which is similar to a NS16550) however it appears that the Xilinx uses 32-bit words for each of the 8 x 8-bit UART registers so it has an unusual 32-byte aperture versus the 8-byte aperture for a typical 16550. I may go with the normal 16550 layout (perhaps I can parameterise the width for compatibility. The upper 24-bits are ignored by the Xilinx 16550). The timer in the emulator is as per riscv spec, however the PLIC interface is somewhat custom as the spec is not very specific about the PLIC register layout… yet… I don’t know if 16550 is proprietary however there seems to be many generic implementations of 16550-like uarts.
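
Parameterising the register width could be as simple as a shift applied to the register index. A sketch, where reg_shift and the function names are hypothetical and the register indices are the usual 16550 ones:

```c
#include <stdint.h>

/* A few standard 16550 register indices (0..7). */
enum { UART_RBR_THR = 0, UART_IER = 1, UART_LSR = 5, UART_NUM_REGS = 8 };

/*
 * reg_shift = 0 gives the classic byte-wide layout (8-byte aperture);
 * reg_shift = 2 gives one 32-bit word per register (32-byte aperture).
 */
uint32_t uart_reg_offset(uint32_t reg, uint32_t reg_shift)
{
    return reg << reg_shift;
}

/* Total aperture size in bytes for the chosen register spacing. */
uint32_t uart_aperture_size(uint32_t reg_shift)
{
    return (uint32_t)UART_NUM_REGS << reg_shift;
}
```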

In any case I can document the layout of the prototype I am working on. I believe if we have well defined interfaces, then there is much less need for an SBI interface. Implementors may standardise on the defined MMIO apertures for the documented platform device registers, including IPI (whose mask register could in fact move into an LLIC as I mention above). I also like the idea of base address registers and config registers with dimensions for devices like the PLIC. This is somewhat like using PCI config space for device variants (different buffer sizes, etc). It may make the drivers easier to write and the config string will be smaller and easier to parse. As long as synthesis can embed a constant into a read-only MMIO aperture (ROM) then this should not be too big of a problem?


Status Update

I spent a bit of time documenting what I’m doing in relation to RISC-V:


I have previously had this idea of a riscv binary translator. QEMU is a very well established multi-platform translator and it brings a wealth of emulated hardware support to ISA targets such as RISC-V, however I am specifically interested in a monomorphic transform from RISC-V to i786. This goes against the QEMU model, which is n to n, not 1 to 1 unidirectional, so I expect the results will be quite different to QEMU TCG. TCG micro-ops map more closely to RISC ops. The focus on the interpreter is because we obviously need to return to the interpreter from a translated trace or region. There is nearly enough infrastructure in place to work on bintrans proper… or maybe I am procrastinating.



Employment

BTW. In a strange twist of fate, I am no longer in paid employment as of last week, which is kind of weird, and is also a bit of a problem as it may put an end to this hobby (enthusiasm?). Spending time on RISC-V, while enjoyable, is not going to feed me. Does anyone need a hand with anything? It doesn’t need to be RISC-V specific. Likely C, C++, OpenCL, OpenGL. Not CSS and HTML for restaurant websites, which is the kind of work I had for my day job. New Zealand is a primary producer so most IT here revolves around the meat and dairy industry. It is kind of depressing if one is interested in high tech. There is no longer any high tech in New Zealand since we removed all trade barriers in the 70’s and 80’s. It moved off-shore. I grew up in New Zealand’s Silicon Valley, Waihi (well, more of a Tube Valley), but this industry has now disappeared from our country. Good if one is a farmer or likes nice wine and cheese. http://www.waihi.org.nz/about-us/history-and-heritage/the-pye-story/


So I have timers, interrupts, paging and pretty much the entire privilege spec implemented, but no IO. There is no IO in the privileged spec and HTIF is non-standard. PMAs need work too.


$ make test-emulate
...
build/linux_x86_64/bin/riscv-test-emulate -S -m -O -p build/riscv64-unknown-elf/bin/test-m-mmio-timer
seed: 603f23136d73df789d70d3959150d2d129a69b107f562645a592d5c32abfebdae27892607202ee984c8dcdbc8d9650d7e57a2278b2a43b6e9727b39bd29ddb80
mmap-elf :0000000000011000-00000000000110f8 +R+X
soft-mmu :0000000000011000-00000000000110f8 ROM0 (0x7f4797a6e000-0x7f4797a6e0f8) +MAIN+R+X
soft-mmu :0000000080000000-00000000c0000000 RAM0 (0x7f4757a6e000-0x7f4797a6e000) +MAIN+R+W+X
soft-mmu :0000000000001000-0000000000002000 BOOT (0x0000-0x1000) +MAIN+R
soft-mmu :0000000040000000-0000000040000010 TIME (0x0000-0x0010) +IO+R+W
soft-mmu :0000000040001000-0000000040001008 MIPI (0x0000-0x0008) +IO+R+W
soft-mmu :0000000040002000-0000000040002848 PLIC (0x0000-0x0848) +IO+R+W
soft-mmu :0000000040003000-0000000040003008 UART (0x0000-0x0008) +IO+R+W
time_mmio:0x0000 -> 0xc2d374d8ea1b5 ← (reading from mtime)
time_mmio:0x0008 <- 0xc2d3789296bb5 ← (writing to mtimecmp)
TRAP     :breakpoint pc:0x110f4 badaddr:0x110f4
pdid     :0000000000000000 mode     :0000000000000003
mvendorid:0000000000000000 marchid  :0000000000000000 mimpid   :0000000000000000
misa     :800000000000112d mhartid  :0000000000000000 mstatus  :0000000000001880
medeleg  :0000000000000000 mideleg  :0000000000000000 mip      :0000000000000080
mtvec    :00000000000110dc mscratch :00000000bffffff8 mie      :0000000000000080
mepc     :00000000000110d8 mcause   :8000000000000007 mbadaddr :0000000000000000
mbase    :0000000000000000 mibase   :0000000000000000 mdbase   :0000000000000000
mbound   :0000000000000000 mibound  :0000000000000000 mdbound  :0000000000000000
stvec    :0000000000000000 sscratch :0000000000000000 sptbr    :0000000000000000
sepc     :0000000000000000 scause   :0000000000000000 sbadaddr :0000000000000000
cycle    :00000000019bfcc6 instret  :00000000019bfcc6 time     :000c2d378a20ff47
pc       :00000000000110f4 fcsr     :0000000000000000
ra       :6f6cab6f620aa9ac
sp       :00000000bfeffff8 gp       :f6a124338c5e5727 tp       :65f515a915c7eef6
t0       :000000003b9aca00 t1       :d627ea5bf572575b t2       :1f8981ad0d9ca7f8
s0       :c36821c14bd02020 s1       :8b27478c603b7b31 a0       :00000dabadabad00
a1       :000c2d3789296bb5 a2       :30eb2649c82a5234 a3       :facb9af44f34caa3
a4       :92430ec2d839c12a a5       :492b6d9174d5fbea a6       :7075019124fe1fb7
a7       :94d068c8fd1abb3c s2       :c4600b77bd3b567b s3       :ed011e7d8682459c
s4       :cd646897265e220d s5       :7e1830a3aa568aa0 s6       :075ca8ed1b6947d3
s7       :c937783c6a7b39a9 s8       :336f0de92193496b s9       :66797e681d9a14cc
s10      :2436a289861f4d52 s11      :0f07f219951c8624 t3       :d4a30e6182a38722
t4       :83505a50fc5b9264 t5       :25669be20e4653b6 t6       :11a6cc83b7425f5c

Samuel Falvo II

Nov 21, 2016, 1:18:43 PM
to Michael Clark, Stefan O'Rear, RISC-V ISA Dev
On Mon, Nov 21, 2016 at 10:06 AM, Michael Clark <michae...@mac.com> wrote:
> This is interesting. I wasn’t aware that M-mode had to run code to dispatch
> S-mode interrupts.


It doesn't have to, as long as mideleg and medeleg registers are set
accordingly.

For example, let's pretend we want to delegate the timer interrupt to
S-mode through M-mode. As a simplification, let's further pretend
that NO H-mode exists.

If mideleg[7] is set, sideleg[5] is clear, AND a timer interrupt
happens, the processor will take the interrupt directly in S-mode,
without any intervention in M-mode at all. The delegation bits are,
so far as the wording in the current spec is written, interpreted
like so (use a fixed-width font to see the table):

M H S
0 x x   Interrupt/trap handled in M-mode.
1 0 x   Interrupt/trap handled in H-mode.
1 1 0   Interrupt/trap handled in S-mode.
1 1 1   Interrupt/trap handled in U-mode.

However, please note that the delegation bits are aligned with their
interrupt pending bits; for this reason, the three bits that the CPU
would look at are mideleg[7], hideleg[6], and sideleg[5], when looking
at the timer interrupt.
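
The table can be written as a short chain of tests. This sketch encodes only the reading given in this message (mideleg[7], hideleg[6], sideleg[5]), which the thread has not settled as the intended semantics; the enum and function name are hypothetical.

```c
#include <stdint.h>

enum priv_mode { MODE_U = 0, MODE_S = 1, MODE_H = 2, MODE_M = 3 };

/*
 * Walk the delegation chain for the timer interrupt as in the table:
 * each level either handles the interrupt or passes it one level down.
 */
enum priv_mode timer_target_mode(uint32_t mideleg, uint32_t hideleg,
                                 uint32_t sideleg)
{
    if (!((mideleg >> 7) & 1)) return MODE_M;
    if (!((hideleg >> 6) & 1)) return MODE_H;
    if (!((sideleg >> 5) & 1)) return MODE_S;
    return MODE_U;
}
```

For a system without H-mode, a caller of this sketch would pass hideleg with bit 6 set so that the H level is skipped.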

--
Samuel A. Falvo II

Stefan O'Rear

Nov 21, 2016, 2:28:28 PM
to Samuel Falvo II, Michael Clark, RISC-V ISA Dev
On Mon, Nov 21, 2016 at 10:18 AM, Samuel Falvo II <sam....@gmail.com> wrote:
> On Mon, Nov 21, 2016 at 10:06 AM, Michael Clark <michae...@mac.com> wrote:
>> This is interesting. I wasn’t aware that M-mode had to run code to dispatch
>> S-mode interrupts.
>
>
> It doesn't have to, as long as mideleg and medeleg registers are set
> accordingly.
>
> For example, let's pretend we want to delegate the timer interrupt to
> S-mode through M-mode. As a simplification, let's further pretend
> that NO H-mode exists.
>
> If mideleg[7] is set, sideleg[5] is clear, AND a timer interrupt
> happens, the processor will take the interrupt directly in S-mode,
> without any intervention in M-mode at all. The Delegation bits are,
> so far as the wording in the current spec are written, interpreted
> like so (used a fixed-width font to see the table):

sideleg[5] doesn't matter. What matters is sideleg[7], because you've
just delegated a "machine timer (7)" interrupt, which is a different
and distinguishable thing from handling "machine timer (7)" and
asserting "supervisor timer (5)".

AFAICT.

-s

Samuel Falvo II

Nov 21, 2016, 2:43:06 PM
to Stefan O'Rear, Michael Clark, RISC-V ISA Dev
On Mon, Nov 21, 2016 at 11:28 AM, Stefan O'Rear <sor...@gmail.com> wrote:
> sideleg[5] doesn't matter. What matters is sideleg[7], because you've
> just delegated a "machine timer (7)" interrupt, which is a different
> and distinguishable thing from handling "machine timer (7)" and
> asserting "supervisor timer (5)".

I may be remembering from v1.7, but any real documentation on this is
definitely missing from v1.9.1. ISTR that sideleg is a shadow
register of mideleg, but the M- and H-bits are masked off. You would
have no choice but to use bit 5. This seems to agree with several
supporting pieces of evidence:

- The spec *does* state that mideleg has the exact same bit layout as
the mip register, and,
- The spec *does* state that sip is a shadow of mip but with
higher-privilege bits inaccessible and hardwired 0.

From this, I can inductively reason that sideleg has the same bit
layout as sip, meaning access to M- and H-bits are inaccessible to
S-mode code to control delegation. It also makes more sense this way,
since otherwise S-mode code would be able to distinguish a hardware
direct S-mode trap and an M-mode delegated trap. I believe this
defeats virtualization. You'd want both to look identical to S-mode
code.
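
To make the shadow-register argument concrete, here is a sketch of sip as a masked view of mip. The mask value follows the mip bit layout in priv-spec-1.9.1 (U and S bits at positions 0, 1, 4, 5, 8, 9). The function names are hypothetical, and treating every visible bit as writable is a simplification for the example.

```c
#include <stdint.h>

/* U- and S-level software, timer and external bits: 0, 1, 4, 5, 8, 9. */
#define SIP_MASK UINT32_C(0x333)

/* Reading sip returns mip with the H- and M-level bits forced to 0. */
uint32_t sip_read(uint32_t mip)
{
    return mip & SIP_MASK;
}

/* Writing sip can only change the bits that are visible through sip. */
uint32_t sip_write(uint32_t mip, uint32_t value)
{
    return (mip & ~SIP_MASK) | (value & SIP_MASK);
}
```

If sideleg shares this layout, the same mask applies and bit 7 is simply not reachable from S-mode.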

Samuel Falvo II

Nov 21, 2016, 2:49:15 PM
to Stefan O'Rear, Michael Clark, RISC-V ISA Dev
On Mon, Nov 21, 2016 at 11:43 AM, Samuel Falvo II <sam....@gmail.com> wrote:
> On Mon, Nov 21, 2016 at 11:28 AM, Stefan O'Rear <sor...@gmail.com> wrote:
>> sideleg[5] doesn't matter. What matters is sideleg[7], because you've
>
> I may be remembering from v1.7, but any real documentation on this is

On a somewhat tangentially related topic, the fact that we're having
this very discussion is THE reason why I object to having M-class
registers hold bits for H/S/U-class fields. Shadow registers are
great when you're extending an architecture to cover new things, but
for what we're using them for here, I think they're a mistake.
They're obnoxiously hard to document, and even when documented, can
lead to questions of semantics like what we're having here.

mstatus, sstatus, ustatus, mip, hip, sip, uip, etc. should be distinct
registers with compatible layouts, sure, but which are backed by
separate registers in their HDL implementation. This not only yields
benefits in the standardization process, but also minimizes
documentation effort by allowing re-use of concepts more easily, and
greatly simplifies the HDL used to implement the logic.

But, I digress; I won't go further on this topic on this subject heading.

Michael Clark

Nov 21, 2016, 3:26:10 PM
to Samuel Falvo II, Stefan O'Rear, RISC-V ISA Dev
This is how I interpret it, however it seemed rational to send cause S_TIMER to S-mode so I translate the cause as I walk the delegation registers.

I can test against Spike… I’m getting close to the point where I have a comparative model… once I’ve added IO (a real virtual 16550 UART via MMIO versus HTIF CSR IO).

In fact I’m only doing 16550 for compatibility. It has a tiny and broken FIFO model. I intend to implement circular buffer IO for higher bandwidth devices.

1 interrupt per 8-bits is pretty inefficient.

Allen J. Baum

Nov 21, 2016, 7:29:58 PM
to Samuel Falvo II, Stefan O'Rear, Michael Clark, RISC-V ISA Dev
I disagree from a cost point of view. CSRs are very expensive to implement, even in the case where we would have 2 CSRs with one implemented bit each, vs. one that has 2 implemented bits.

If I can implement one CSR instead of 4, and merely condition the reading and writing based on a fixed mask generated by register address - that's a huge win.
This is an implementation trick whose architectural visibility is that a read or write to a single register is required to set all bits for some function instead of one per priv level. It does mean that we have 4 bits o

Typically, I also like to separate the C vs. S in CSR. That is, readonly status bits in one CSR, control bits in another. That's hard to do sometimes, since some status bits need to be cleared (e.g. int pending) and rarely written (typically for HW debug only). Intel CSRs have different bit types, e.g. RO, RW, RW1C (read, write 1 to clear), and even RW0C. RW are typically control; the rest are good for status registers, since you can specify exactly which bits you want to clear with a mask containing 1s in the correct positions. This lets you read status, clear it, then read it again to see if bits had been set in the meantime (or set again after clearing), so as not to require atomic ops.

If I remember, RW0C typically covers the entire register, not individual bits, so you would read the status, then clear it by writing zeros.
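
A small model of the RW1C behaviour described above; the function name is hypothetical.

```c
#include <stdint.h>

/* RW1C: writing 1 to a bit clears it; writing 0 leaves it unchanged. */
uint32_t rw1c_write(uint32_t status, uint32_t written)
{
    return status & ~written;
}
```

Software reads the status, then writes back exactly the value it read. A bit that hardware set between the read and the write is not in the written mask, so it stays pending for the next read.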


--
**************************************************
* Allen Baum tel. (908)BIT-BAUM *
* 248-2286 *
**************************************************

Michael Clark

Nov 21, 2016, 7:41:11 PM
to Stefan O'Rear, Samuel Falvo II, RISC-V ISA Dev
Well, with one timer the model is broken if interpreted in this way, as then H,M,S_TIMER interrupts have no way to be generated other than by software, which seems a bit wrong to me.

The way I have modelled it is that S-mode can use the timer (it just sees the MMIO segment and not the names, so mtime and mtimecmp don’t exist), and S-mode gets S_TIMER interrupts. This seems right.

In my case, it’s whether it’s code outside or inside of the machine simulation that does the delegation. I am simulating an S-mode that doesn’t depend on M-mode or SBI.

The spec needs to evolve to allow multiple hardware timers and have priority and privilege somehow linked. I alluded to those alternatives. I think there is a hardware option.

If I have 2 timers I’d wire one to M and one to S.

Michael Chapman

Nov 22, 2016, 2:28:40 AM
to isa...@groups.riscv.org

What we did many years ago for registers with mixed status/control was to use two user visible register bits per real HW bit.
When read you would get the value 01 or 10 for set or clear respectively.

For writing we used:-
    01 -> set
    10 -> clear
    00 -> don't touch
    11 -> toggle

This enabled any combination of bits to be set/cleared/toggled or left alone (important when the bit could potentially be changed by HW) when writing thus avoiding any read/modify/write concurrency issues and it also enabled easy testing for any combination of 0, 1 or don't care when reading with an ANDI and BEQ.
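
A sketch of that encoding for a 16-bit hardware register exposed as 32 visible bits. Placing pair i at visible bits 2i+1..2i is an assumption, as are the names.

```c
#include <stdint.h>

/* Write operations, one 2-bit code per hardware bit. */
enum { OP_KEEP = 0x0, OP_SET = 0x1, OP_CLEAR = 0x2, OP_TOGGLE = 0x3 };

/* Read: each hardware bit appears as 01 (set) or 10 (clear). */
uint32_t pair_read(uint16_t hw)
{
    uint32_t out = 0;
    for (int i = 0; i < 16; i++)
        out |= (uint32_t)(((hw >> i) & 1) ? OP_SET : OP_CLEAR) << (2 * i);
    return out;
}

/* Write: apply one operation per hardware bit and return the new value. */
uint16_t pair_write(uint16_t hw, uint32_t value)
{
    for (int i = 0; i < 16; i++) {
        uint16_t bit = (uint16_t)(1u << i);
        switch ((value >> (2 * i)) & 0x3) {
        case OP_SET:    hw = (uint16_t)(hw | bit);  break;
        case OP_CLEAR:  hw = (uint16_t)(hw & ~bit); break;
        case OP_TOGGLE: hw = (uint16_t)(hw ^ bit);  break;
        default:        break;  /* OP_KEEP: leave the bit alone */
        }
    }
    return hw;
}
```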

Michael Chapman

kr...@berkeley.edu

Nov 22, 2016, 11:05:18 AM
to Michael Chapman, isa...@groups.riscv.org

Fun, but we should try and avoid needing this, if only to reduce
hardware cost and context-switch overhead (another reason to see
multiple privilege levels in one CSR).
The atomic set/clear bit CSR instructions help reduce the need for
this, while reducing number of CSRs.

Krste
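
For reference, the effect of the atomic set/clear CSR instructions (CSRRS and CSRRC) can be modelled as below. This is a software model of the instruction semantics only; the function signatures are hypothetical.

```c
#include <stdint.h>

/* CSRRS: return the old CSR value and set the bits given in mask. */
uint64_t csrrs(uint64_t *csr, uint64_t mask)
{
    uint64_t old = *csr;
    *csr = old | mask;
    return old;
}

/* CSRRC: return the old CSR value and clear the bits given in mask. */
uint64_t csrrc(uint64_t *csr, uint64_t mask)
{
    uint64_t old = *csr;
    *csr = old & ~mask;
    return old;
}
```

Because each instruction changes only the masked bits in one step, software needs no separate read/modify/write sequence to flip a single enable or pending bit.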