Debian/riscv64: rdcycle causing Illegal instruction

Mathieu Malaterre

Sep 1, 2022, 2:36:57 AM
to RISC-V SW Dev
Hi there,

We are trying to understand what happened recently in one of the Debian packages: highway.

It turns out that the following code (*) triggers an illegal instruction (SIGILL) on particular arch/kernel combinations.

It works fine on:
- QEMU running a 5.18.16-1 kernel
- HiFive Unleashed running a 5.10.28 kernel
- HiFive Unmatched running a 5.16.14-1 kernel
- PolarFire Icicle running a 5.18.14-1 kernel

However, it produces a SIGILL on:
- HiFive Unmatched running a 5.18.14-1 kernel
- HiFive Unmatched running a 5.18.16-1 kernel

I'd like to ask this community to test the code below on a HiFive Unmatched with a non-Debian 5.18.x kernel.

Thanks for your report

(*)
$ cat t.c
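#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t t;

    /* Read the user-level cycle counter; raises SIGILL if user access is disabled. */
    asm volatile("rdcycle %0" : "=r"(t));

    /* PRIu64 matches uint64_t; the original %ld expects a signed long. */
    printf("cycles: %" PRIu64 "\n", t);
    return 0;
}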

Tommy Murphy

Sep 1, 2022, 2:46:45 AM
to Mathieu Malaterre, RISC-V SW Dev
Could this be related to the recent separation of Zicsr (and Zifencei) from the base I (integer) extension and the implementation of that change in the relevant tools (GCC/LLVM et al.)?

Previously, compiling with, say, -march=rv64gc and linking with rv64gc libs was sufficient to allow code to use CSR access instructions but at some point, this changed to -march=rv64gc_Zicsr (and usually -march=rv64gc_Zicsr_Zifencei) and linking with rv64gc_Zicsr (or rv64gc_Zicsr_Zifencei) libs.

Mathieu Malaterre

Sep 1, 2022, 2:55:41 AM
to RISC-V SW Dev, tommy_...@hotmail.com, Mathieu Malaterre
Here is how gcc is configured on Debian/riscv64:

$ gcc --verbose t.c
[...]
gcc version 12.2.0 (Debian 12.2.0-1)
COLLECT_GCC_OPTIONS='-v' '-march=rv64imafdc_zicsr_zifencei' '-mabi=lp64d' '-misa-spec=20191213' '-march=rv64imafdc_zicsr_zifencei' '-dumpdir' 'a-'
[...]

So I suspect this is fine.

Tommy Murphy

Sep 1, 2022, 3:01:44 AM
to Mathieu Malaterre, RISC-V SW Dev, Mathieu Malaterre
Yes, that looks correct, so it most likely rules out my hypothesis.

Is it possible to debug and see the $mepc, $mcause and $mtval at the point at which the fault occurs? That might shed some light on the reason for this issue.
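
Editorial sketch: the M-mode CSRs (mepc, mcause, mtval) cannot be read from user space, but a SIGILL handler can at least report the address of the trapping instruction. The following is a minimal example assuming Linux and POSIX `sigaction`; the helper names `format_hex`, `put` and `install_sigill_reporter` are hypothetical and not taken from any project mentioned in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Format v as 16 hex digits into buf (needs 17 bytes); async-signal-safe. */
static void format_hex(uint64_t v, char *buf)
{
    static const char digits[] = "0123456789abcdef";
    for (int i = 15; i >= 0; i--) {
        buf[i] = digits[v & 0xf];
        v >>= 4;
    }
    buf[16] = '\0';
}

/* Write n bytes to stderr; errors are ignored because we are about to exit. */
static void put(const char *s, size_t n)
{
    if (write(STDERR_FILENO, s, n) < 0) {
        /* nothing useful can be done inside a signal handler */
    }
}

/* On SIGILL, print the faulting address reported in si_addr and exit. */
static void sigill_reporter(int sig, siginfo_t *info, void *ctx)
{
    static const char msg[] = "SIGILL at pc=0x";
    char hex[17];
    (void)sig;
    (void)ctx;
    format_hex((uint64_t)(uintptr_t)info->si_addr, hex);
    put(msg, sizeof msg - 1);
    put(hex, 16);
    put("\n", 1);
    _exit(1);
}

/* Install the handler; returns 0 on success, -1 on failure (like sigaction). */
static int install_sigill_reporter(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = sigill_reporter;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGILL, &sa, NULL);
}
```

Calling `install_sigill_reporter()` at the start of `main()` in t.c would print the address of the trapping instruction, which can then be compared against a disassembly of the binary.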

Palmer Dabbelt

Sep 1, 2022, 5:08:47 PM
to tommy_...@hotmail.com, mathieu....@gmail.com, sw-...@groups.riscv.org
On Wed, 31 Aug 2022 23:46:39 PDT (-0700), tommy_...@hotmail.com wrote:
> Could this be related to the recent separation of Zicsr (and Zifencei) from the base I (integer) extension and the implementation of that change in the relevant tools (GCC/LLVM et al.)?
>
> Previously, compiling with, say, -march=rv64gc and linking with rv64gc libs was sufficient to allow code to use CSR access instructions but at some point, this changed to -march=rv64gc_Zicsr (and usually -march=rv64gc_Zicsr_Zifencei) and linking with rv64gc_Zicsr (or rv64gc_Zicsr_Zifencei) libs.

That just changes which instructions the assembler will consider valid;
it doesn't change what the hardware does. IIRC the SiFive systems have
always trapped into machine mode on rdtime/rdcycle, even those that
predate the csr/timer/counter split.

My first guess here would be that something in the firmware has changed,
as Linux just assumes those are either supported by the hardware or
emulated in M-mode. These ISA refactorings have broken a handful of
uABI bits like this; we should really have code in the kernel to handle
them independently of the firmware.

atish patra

Sep 1, 2022, 6:42:55 PM
to Palmer Dabbelt, tommy_...@hotmail.com, mathieu....@gmail.com, sw-...@groups.riscv.org
I just replied to the kernel thread [1], but it has not shown up on the infradead server yet, so I cannot share the link.
I will just reiterate my response here:

The breakage happened because the new PMU driver enables only the TM bit by default instead of all three (CY, TM, IR).
The change was intentional, for security reasons: a rogue process with permanent access to the cycle and
instruction counts of the entire kernel could mount side-channel attacks.

However, I agree that we can't break user space. I was not aware that there are already users of rdcycle
in user space. I will send a patch to restore the original behavior by enabling the CY and IR bits in scounteren.
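
Editorial sketch: the three bits discussed here are the low bits of `scounteren`. Per the RISC-V privileged specification, bit 0 (CY) gates `cycle`, bit 1 (TM) gates `time`, and bit 2 (IR) gates `instret` for user mode. The small example below illustrates what "TM only" versus "all three" means; the macro and function names are illustrative and are not taken from the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in scounteren, per the RISC-V privileged specification. */
#define COUNTEREN_CY (UINT32_C(1) << 0) /* cycle   (rdcycle)   */
#define COUNTEREN_TM (UINT32_C(1) << 1) /* time    (rdtime)    */
#define COUNTEREN_IR (UINT32_C(1) << 2) /* instret (rdinstret) */

/* True if scounteren lets user mode execute rdcycle without trapping
 * (assuming mcounteren also permits the access). */
static bool user_can_rdcycle(uint32_t scounteren)
{
    return (scounteren & COUNTEREN_CY) != 0;
}

/* True if scounteren lets user mode execute rdtime without trapping. */
static bool user_can_rdtime(uint32_t scounteren)
{
    return (scounteren & COUNTEREN_TM) != 0;
}

/* True if scounteren lets user mode execute rdinstret without trapping. */
static bool user_can_rdinstret(uint32_t scounteren)
{
    return (scounteren & COUNTEREN_IR) != 0;
}
```

With only TM set (value 0x2), `user_can_rdtime` holds but `user_can_rdcycle` does not, which matches the SIGILL seen with the test program; with all three set (value 0x7), all three reads are allowed.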
 
--
You received this message because you are subscribed to the Google Groups "RISC-V SW Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sw-dev+un...@groups.riscv.org.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/mhng-80c535b2-e797-4d75-bba9-a529a96e43e2%40palmer-ri-x1c9.


--
Regards,
Atish

Tommy Murphy

Sep 1, 2022, 7:09:30 PM
to atish patra, Palmer Dabbelt, mathieu....@gmail.com, sw-...@groups.riscv.org

How does that explain the different results with kernel 5.18.x on different platforms? From the original post...

Tommy Murphy

Sep 1, 2022, 7:12:51 PM
to atish patra, Palmer Dabbelt, mathieu....@gmail.com, sw-...@groups.riscv.org
Sorry, just noticed that it's presumably due to different SBI PMU implementations on different platforms.

atish patra

Sep 1, 2022, 7:14:46 PM
to Tommy Murphy, Palmer Dabbelt, mathieu....@gmail.com, sw-...@groups.riscv.org
Can you copy and paste the boot log? It will indicate which PMU driver is being used.
I suspect the PolarFire/QEMU setups are running an older OpenSBI version, which led to the legacy driver being selected, and that preserved the old behavior.


--
Regards,
Atish

Tommy Murphy

Sep 1, 2022, 7:26:51 PM
to atish patra, Palmer Dabbelt, mathieu....@gmail.com, sw-...@groups.riscv.org
Sorry, I didn't report the issue and don't have the wherewithal to do any testing.

Paul Walmsley

Sep 1, 2022, 10:36:28 PM
to RISC-V SW Dev, tommy_...@hotmail.com, pal...@dabbelt.com, mathieu....@gmail.com, sw-...@groups.riscv.org, atis...@gmail.com
If I recall correctly, other architectures don't allow direct access to their cycle counters from userspace for security reasons. Any reason why RISC-V shouldn't follow the same approach?

- Paul

atish patra

Sep 1, 2022, 10:44:25 PM
to Paul Walmsley, RISC-V SW Dev, mathieu....@gmail.com, pal...@dabbelt.com, tommy_...@hotmail.com
On Thu, Sep 1, 2022 at 7:36 PM Paul Walmsley <paul.w...@sifive.com> wrote:
If I recall correctly, other architectures don't allow direct access to their cycle counters from userspace for security reasons. Any reason why RISC-V shouldn't follow the same approach?

That was the intention behind disabling the access in the PMU driver. At that time, I didn’t find any users. Obviously I was wrong 😂. But is that a sufficient reason to break existing user space?



--
Regards,
Atish

Anup Patel

Sep 2, 2022, 2:49:24 AM
to atish patra, Paul Walmsley, RISC-V SW Dev, mathieu....@gmail.com, pal...@dabbelt.com, tommy_...@hotmail.com
On Fri, Sep 2, 2022 at 8:14 AM atish patra <atis...@gmail.com> wrote:
>
>
>
> On Thu, Sep 1, 2022 at 7:36 PM Paul Walmsley <paul.w...@sifive.com> wrote:
>>
>> If I recall correctly, other architectures don't allow direct access to their cycle counters from userspace for security reasons. Any reason why RISC-V shouldn't follow the same approach?
>
>
> That was the intention behind disabling the access in the PMU driver. At that time, I didn’t find any users. Obviously I was wrong 😂. But is that a sufficient reason to break existing user space?

As already pointed out, we can't compromise security by giving all
apps unrestricted access to the cycle counter. We already have the
Linux perf subsystem for managing counters, so apps should always use
Linux perf syscalls.
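
Editorial sketch: counting cycles through the perf syscall interface could look roughly like the following. It assumes Linux with kernel headers installed and uses `perf_event_open(2)` via `syscall()`; the helper names `open_cycle_counter`, `count_cycles` and `spin` are hypothetical. Whether the call succeeds depends on the platform's PMU support and on the system's perf permission settings, so every failure is reported as -1 rather than treated as fatal.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a user-space-only CPU cycle counter for the calling thread.
 * Returns a file descriptor, or -1 if perf is unavailable or not permitted. */
static int open_cycle_counter(void)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof attr;
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.disabled = 1;       /* start stopped; enabled explicitly below */
    attr.exclude_kernel = 1; /* count user-mode cycles only */
    attr.exclude_hv = 1;
    /* pid = 0, cpu = -1: this thread on any CPU; no group leader, no flags. */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Example workload (hypothetical): a short busy loop. */
static void spin(void)
{
    volatile unsigned sink = 0;
    for (unsigned i = 0; i < 100000; i++)
        sink += i;
}

/* Count cycles spent in fn(); returns 0 on success, -1 on any failure. */
static int count_cycles(void (*fn)(void), uint64_t *cycles)
{
    int fd = open_cycle_counter();
    if (fd < 0)
        return -1;
    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    fn();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    ssize_t n = read(fd, cycles, sizeof *cycles);
    close(fd);
    return n == (ssize_t)sizeof *cycles ? 0 : -1;
}
```

A caller would use `count_cycles(spin, &c)` and fall back to a coarser timer when it returns -1. Each measurement costs several system calls, which is the overhead that users of raw rdcycle are trying to avoid.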

IMO, the package "highway" directly accessing the cycle counter should
be fixed instead of fixing the Linux SBI PMU driver.

Looking further into the highway project
(https://github.com/google/highway), it seems this project is using
"rdcycle" to track timer ticks, which is totally wrong. The use
of "rdcycle" should instead be replaced with "rdtime" in this project.
(See line 156 of
https://github.com/google/highway/blob/master/hwy/nanobenchmark.cc)
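
Editorial sketch of the suggested replacement, not the actual highway patch: a tick reader that uses `rdtime` on RV64 and falls back to the POSIX monotonic clock elsewhere. The function name `read_ticks` is hypothetical. Note that `rdtime` counts at the platform's fixed timebase frequency rather than in CPU clock cycles, and the fallback path returns nanoseconds, so the units differ between the two paths.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Read a monotonically increasing tick count. */
static inline uint64_t read_ticks(void)
{
#if defined(__riscv) && __riscv_xlen == 64
    uint64_t t;
    /* rdtime reads the `time` CSR, which is gated by the TM bit, not CY. */
    __asm__ volatile("rdtime %0" : "=r"(t));
    return t;
#else
    /* Other targets (and RV32): nanoseconds from the monotonic clock. */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
#endif
}
```

Elapsed time is then the difference between two `read_ticks()` calls; converting `rdtime` ticks to seconds requires the platform's timebase frequency.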

Regards,
Anup




Mathieu Malaterre

Sep 2, 2022, 2:53:33 AM
to Anup Patel, atish patra, Paul Walmsley, RISC-V SW Dev, pal...@dabbelt.com, tommy_...@hotmail.com
On Fri, Sep 2, 2022 at 8:49 AM Anup Patel <an...@brainfault.org> wrote:
> As already pointed out, we can't compromise security by having all
> apps unrestricted access to the cycle counter. We already have the
> Linux perf subsystem for managing counters so apps should always use
> Linux perf syscalls.
>
> IMO, the package "highway" directly accessing the cycle counter should
> be fixed instead of fixing the Linux SBI PMU driver.
>
> Further investigating the highway project
> (https://github.com/google/highway), it seems this project is using
> "rdcycle" to track timer ticks which is totally wrong. Instead the use
> of "rdcycle" should be replaced with "rdtime" in this project.
> (Refer, line 156 of
> https://github.com/google/highway/blob/master/hwy/nanobenchmark.cc)

Note that quite a few projects are already using rdcycle in user space:

* https://codesearch.debian.net/search?q=%22rdcycle+%250%22

Anyway if you believe rdtime is the right fix, we can fix one project
at a time...




--
Mathieu

atish patra

Sep 2, 2022, 3:00:06 AM
to Mathieu Malaterre, Anup Patel, Paul Walmsley, RISC-V SW Dev, pal...@dabbelt.com, tommy_...@hotmail.com
On Thu, Sep 1, 2022 at 11:53 PM Mathieu Malaterre <mathieu....@gmail.com> wrote:
Pay attention that quite a few projects are using rdcycle in user-space already:

* https://codesearch.debian.net/search?q=%22rdcycle+%250%22

Anyway if you believe rdtime is the right fix, we can fix one project
at a time...


It would be good to continue this discussion on the kernel mailing list, as more folks watch that.
I will copy and paste the responses so far.
 
--
Regards,
Atish

Sean Halle

Sep 3, 2022, 9:36:20 AM
to atish patra, Mathieu Malaterre, Anup Patel, Paul Walmsley, RISC-V SW Dev, pal...@dabbelt.com, tommy_...@hotmail.com

I am not in the kernel group, so I am responding here.

The proto-runtime project requires direct access to rdcycle in userspace for high-performance parallel code. Going through perf or making a system call cripples performance. At a minimum, we need a low-barrier mechanism for users to enable rdcycle for their own process. Once enabled, the process executes the instruction directly (no overhead).

Thanks,

Sean


Anup Patel

Sep 3, 2022, 11:55:23 AM
to Sean Halle, atish patra, Mathieu Malaterre, Paul Walmsley, RISC-V SW Dev, pal...@dabbelt.com, tommy_...@hotmail.com
On Sat, Sep 3, 2022 at 7:06 PM Sean Halle <sean...@gmail.com> wrote:
>
>
> I am not in the kernel group, so I am responding here.
>
> The proto-runtime project requires direct access to rdcycle in userspace for high-performance parallel code. Going through perf or making a system call cripples performance. At a minimum, we need a low-barrier mechanism for users to enable rdcycle for their own process. Once enabled, the process executes the instruction directly (no overhead).

To address this, we will have a sysctl interface (just like ARM)
to explicitly enable direct user-space access to HPM counters.
The corresponding kernel patch can be found here:
https://www.spinics.net/lists/kernel/msg4490809.html

Please note that configuring/initializing the counter still needs to
be done via Linux perf syscalls.
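
Editorial sketch: a program could check such a knob before attempting direct counter reads. The thread does not name the sysctl file; the path below is an assumption that mirrors arm64's `kernel.perf_user_access`, and `read_sysctl_int` is a hypothetical helper.

```c
#include <stdio.h>

/* Assumed location of the knob, mirroring arm64's kernel.perf_user_access;
 * the thread itself does not name the file. */
#define PERF_USER_ACCESS_PATH "/proc/sys/kernel/perf_user_access"

/* Returns the integer stored in `path`, or -1 if it cannot be read. */
static int read_sysctl_int(const char *path)
{
    FILE *f = fopen(path, "r");
    int value;
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

A result of -1 from `read_sysctl_int(PERF_USER_ACCESS_PATH)` would mean the running kernel does not expose the file; the meaning of any other value should be taken from that kernel's own documentation.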

Regards,
Anup