What is the purpose of the cycle/mcycle CSR?


Rishiyur Nikhil

Sep 8, 2017, 6:07:19 PM
to RISC-V ISA Dev
I'm unclear about the purpose of the 'cycle/mcycle' CSR.
Can it ever be used in a portable way?

In different implementations, cycles will advance at various rates.
Even on a single implementation, cycles may advance at varying rates,
for power management.

Simulators, emulators, and implementations in asynchronous logic do not
even have a meaningful notion of a "cycle"; they have to fake it somehow.

So, what property of cycle/mcycle can software rely upon to do anything useful?

One candidate property may be that it is monotonic and non-decreasing.
But we cannot even rely on it increasing, i.e., it can be stuck for arbitrary durations
(even infinitely) at a constant value, e.g., in asynchronous logic, simulators and emulators.

If so, how can software ever use it in a meaningful and portable way?

Can someone please shed some light on this?

Thanks,

Nikhil

kr...@berkeley.edu

Sep 8, 2017, 6:40:33 PM
to Rishiyur Nikhil, RISC-V ISA Dev

Anyone tuning performance code will be very upset if we remove the
cycle counter. It is the simplest widget that gives the biggest gain
for code tuning.

Its desirable properties are that it is monotonic, proportional to
performance (on almost all systems over short time intervals), and
cheap to access. Functional simulators and asynch systems can
increment it once per instruction. Emulators will have the same clock
cycles as the RTL they're emulating.
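
As a rough illustration of the fallback described above, a functional simulator with no real clock can simply advance the cycle counter once per retired instruction. The sketch below is a toy model in Python, not any real simulator's code; the names `ToyISS`, `step`, and `rdcycle` are invented for illustration.

```python
MASK64 = (1 << 64) - 1  # the counters are architecturally 64 bits wide


class ToyISS:
    """Toy instruction-set simulator that fakes 'cycles' as retired instructions."""

    def __init__(self):
        self.cycle = 0    # models the cycle/mcycle CSR
        self.instret = 0  # models the instret/minstret CSR

    def step(self):
        """Retire one (unspecified) instruction and advance both counters."""
        self.instret = (self.instret + 1) & MASK64
        # Assumption: with no real clock, count one "cycle" per instruction.
        self.cycle = (self.cycle + 1) & MASK64

    def rdcycle(self):
        """Return the current value of the modeled cycle counter."""
        return self.cycle


sim = ToyISS()
before = sim.rdcycle()
for _ in range(5):
    sim.step()
after = sim.rdcycle()
print(after - before)  # 5
```

Under this scheme the counter never goes backwards and advances whenever the program makes progress, which is the property performance-tuning code depends on.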

Another aspect is that it and the instructions-retired counter are a
subset of more complete performance counters. Code tuning itself is
not portable, but to the extent possible, we'd like to have a common
interface to common performance analysis components.

Krste
| --
| You received this message because you are subscribed to the Google Groups
| "RISC-V ISA Dev" group.
| To unsubscribe from this group and stop receiving emails from it, send an email
| to isa-dev+u...@groups.riscv.org.
| To post to this group, send email to isa...@groups.riscv.org.
| Visit this group at https://groups.google.com/a/groups.riscv.org/group/isa-dev/
| .
| To view this discussion on the web visit https://groups.google.com/a/
| groups.riscv.org/d/msgid/isa-dev/
| CAAVo%2BP%3DD09DqqaE1JO3CvP4a5KRk8%2B0vUnKFUmPPmyjLOTF2%2BA%40mail.gmail.com.

Rishiyur Nikhil

Sep 11, 2017, 11:19:42 AM
to Krste Asanovic, RISC-V ISA Dev
Ok, I think I understand.

You're saying that, yes, rdcycle is difficult to use in a portable manner,
but it is immensely useful in certain fundamentally non-portable situations,
such as platform-specific performance tuning.

Thanks,

Nikhil


Samuel Falvo II

Sep 11, 2017, 11:28:43 AM
to Rishiyur Nikhil, Krste Asanovic, RISC-V ISA Dev
It can also be used more or less portably by treating the
numbers returned from rdcycle in an opaque manner. Just as they say
you can never truly compare CPUs on clock speed alone, you can never
compare two RISC-V implementations on rdcycle alone.

What you can do, however, is measure algorithmic impact. Algorithmic
impact tends to have measurable performance hits or improvements which
can be measured in percentages. So the absolute values returned by
rdcycle may not hold much significance; however, when used to compute
percentages, it can still deliver valuable information.
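
A minimal sketch of this opaque, relative use, written in Python for readability. The `read_cycle` parameter is a hypothetical stand-in for whatever reads the counter on a given platform (for example, a thin wrapper around `rdcycle`); the helper names are illustrative and not part of any RISC-V API.

```python
MASK64 = (1 << 64) - 1


def measure(read_cycle, work):
    """Return the counter delta across one call to work(), modulo 2**64."""
    start = read_cycle()
    work()
    end = read_cycle()
    return (end - start) & MASK64


def percent_change(baseline, candidate):
    """Relative change of candidate versus baseline, in percent.

    A negative result means the candidate took fewer cycles.
    """
    if baseline == 0:
        raise ValueError("baseline delta is zero; the counter did not advance")
    return 100.0 * (candidate - baseline) / baseline


def make_fake_counter(ticks):
    """Build a fake counter that returns the given readings in order."""
    it = iter(ticks)
    return lambda: next(it)


# Made-up readings: the baseline takes 800 ticks, the candidate 600.
baseline = measure(make_fake_counter([1000, 1800]), lambda: None)
candidate = measure(make_fake_counter([5000, 5600]), lambda: None)
print(percent_change(baseline, candidate))  # -25.0
```

Only the ratio is interpreted; the absolute deltas are never compared across machines. The zero-baseline check covers the case of a counter that is stuck at a constant value.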

For this reason, my CPUs implement rdcycle using a 64-bit up-counter
that ticks on every CPU clock tick, even though the CPU can take
numerous cycles to execute a single instruction. If/when I get around
to implementing rdinstret, then you can use this to calculate the
average number of CPU cycles per instruction, at which point you can
then reasonably start comparing RISC-V implementations against each
other.
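
The cycles-per-instruction figure is just the ratio of the two counter deltas taken over the same interval. A small sketch with illustrative function and parameter names; the sample numbers are made up:

```python
def cycles_per_instruction(cycle_delta, instret_delta):
    """Average CPI over an interval, given deltas of rdcycle and rdinstret."""
    if instret_delta <= 0:
        raise ValueError("no instructions retired in the interval")
    return cycle_delta / instret_delta


# Made-up example: 1200 cycles elapsed while 400 instructions retired.
cpi = cycles_per_instruction(1200, 400)
print(cpi)  # 3.0
```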



--
Samuel A. Falvo II

Rishiyur Nikhil

Sep 11, 2017, 11:53:19 AM
to Samuel Falvo II, Krste Asanovic, RISC-V ISA Dev
True.

But it also seems legal for rdcycle always to return the same constant value (e.g., 0),
(e.g., in an ISS or asynchronous-logic implementation)
in which case it would be quite useless on those platforms.
Hence my question of whether there is any assumption/expectation of monotonic increase in finite time.

Rgds,

Nikhil



kr...@berkeley.edu

Sep 11, 2017, 3:57:25 PM
to Rishiyur Nikhil, Krste Asanovic, RISC-V ISA Dev

>>>>> On Mon, 11 Sep 2017 11:19:09 -0400, Rishiyur Nikhil <nik...@bluespec.com> said:
| Ok, I think I understand.
| You're saying that, yes, rdcycle is difficult to use in a portable manner,
| but it is immensely useful in certain fundamentally non-portable situations,
| such as platform-specific performance tuning.

I would have phrased it differently.

rdcycle is easy to use in a portable manner for performance tuning on
almost all real systems in practice. It is difficult to use portably
for a few real systems and some simulations.

Krste

kr...@berkeley.edu

Sep 11, 2017, 4:11:25 PM
to Rishiyur Nikhil, Samuel Falvo II, Krste Asanovic, RISC-V ISA Dev

>>>>> On Mon, 11 Sep 2017 11:52:47 -0400, Rishiyur Nikhil <nik...@bluespec.com> said:

| True.
| But it also seems legal for rdcycle always to return the same constant value
| (e.g., 0),
| (e.g., in an ISS or asynchronous-logic implem)
| in which case it would be quite useless on those platforms.

And so you would question why the designers of such systems would
choose such an implementation. Even in asynchronous implementations you can
count completion tokens or some other measure of progress.

| Hence my question of whether there is any assumption/expectation of monotonic
| increase in finite time.

To be useful, yes.

Krste

Guy Lemieux

Sep 11, 2017, 4:23:53 PM
to Krste Asanovic, Rishiyur Nikhil, RISC-V ISA Dev
Nikhil, I like your question and I don't think you've been given a
straight answer.

The spec says: "which holds a count of the number of clock cycles
executed by the processor core [...] The underlying 64-bit counter
should never overflow in practice. The rate at which the cycle counter
advances will depend on the implementation and operating environment."

The spec does not say "successive reads to rdcycle will be
monotonically increasing", but then it does not say "successive reads
to rdcycle may return the same value" either. Hence, the spec is, to
some degree, ambiguous.

Since the spec doesn't say which clock to use, it isn't clear if a
very slow clock can be used (not strictly increasing between reads),
or if the counter can be left unimplemented (always returns the same
value).

The discussion part of the spec clarifies that these counters are
mandated to be available (as 64b, possibly doing the 32 MSB in
software). However, the discussion isn't part of the formal spec....
it shouldn't be adding mandatory requirements in the discussion. The
discussion reads: "We mandate these basic counters be provided in all
implementations as they are essential for [...analysis and
optimization...]. [...] We required the counters be 64 bits wide, even
on RV32, as otherwise it is very difficult for software to determine
if values have overflowed."
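
One way to read "doing the 32 MSB in software" is to keep only a 32-bit hardware counter and let software maintain the upper half by watching for wraparound. The sketch below models that idea in Python. It is an assumption about how such a scheme might look, not something the spec prescribes, and it only works if the low word is sampled at least once per wrap period. `SoftExtendedCounter` and `sample` are hypothetical names.

```python
MASK32 = (1 << 32) - 1


class SoftExtendedCounter:
    """Extend a 32-bit hardware counter to 64 bits in software."""

    def __init__(self):
        self.high = 0      # software-maintained upper 32 bits
        self.last_low = 0  # most recent hardware reading

    def sample(self, low):
        """Fold a new 32-bit hardware reading into the 64-bit value."""
        low &= MASK32
        if low < self.last_low:
            # The low word wrapped since the previous sample.
            self.high = (self.high + 1) & MASK32
        self.last_low = low
        return (self.high << 32) | low


ctr = SoftExtendedCounter()
print(hex(ctr.sample(0xFFFF_FFF0)))  # 0xfffffff0
print(hex(ctr.sample(0x0000_0010)))  # 0x100000010
```

If two wraps occur between samples, the model silently loses one, which is the overflow-detection difficulty the quoted discussion refers to.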

If the counter is expected to be implemented, and always monotonic,
then the spec should be clear, and it should not rely upon the
discussion.

Note that 64b counters are quite costly in area in small, embedded
FPGA implementations, increasing area by about 30%, from 550 LUTs to
700 LUTs. I gave a scatterplot of the area penalty in my 3rd RISC-V
Workshop presentation (attached).

Guy
2016-01-05 VectorBlox ORCA RISC-V DEMO.pdf