Hi!
This is feedback from IAR on the proposed Zfa extension.
Summary:
- Overall, this is a good proposal.
- It is unclear what the "quiet" comparison instructions correspond
to in C.
- The constants selected for FLI.i don't seem to match real-world
use, based on a statistical analysis we have conducted.
- Finally, a new instruction to scale floating-point values is
proposed.
--------------------
* Overall
We give a "thumbs up" to this extension, as it fixes a number of
problems with the current FPU instructions, and it improves code
size.
In particular, accessing the upper bits of a 64-bit floating-point
value on RV32 will simplify library functions.
--------------------
* No assembler syntax
The assembler syntax of the new instructions is omitted.
For most instructions, this is not a problem. However, for FLI.S, it
is. There is a footnote that states that it should accept "min",
"inf", and "nan" and the rest as decimal constants.
However, it is unclear whether it should accept something like
"FLI.S fa0, 16" (i.e. index number 16) or "FLI.S fa0, 1.0". The
latter is easier to read, but it might become hard to ensure that the
tools can handle this case properly.
--------------------
* Unclear encodings
Unlike the unprivileged RISC-V specification, this specification
doesn't provide tables for the encoding. Instead, the encoding is
expressed like
"These instructions are encoded like their FMIN and FMAX
counterparts, but with instruction bit 13 set to 1."
On one hand, this provides enough information to implement hardware
and tools. On the other hand, it makes the specification hard to
read and thus makes any implementation process more error-prone.
--------------------
* When should FLEQ.S be used?
It is not clear when a compiler should choose the "quiet"
comparison instructions over the original comparison instructions.
I suggest that the specification explain which C language constructs
are suited for these instructions.
If the new instructions are expected to be used through builtin
functions, those functions should be specified either in this
specification or in a companion specification. In the latter case,
that specification should be ratified along with the Zfa
specification.
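
To make the question concrete, here is a minimal sketch of one
plausible mapping. It is an assumption for discussion, not something
the Zfa text states: the C relational operators keep using the
original (signaling) comparisons, while the C99 <math.h> comparison
macros such as islessequal(), which must not raise "invalid" for
quiet NaN operands, would be candidates for the quiet instructions.
The function names le_signaling and le_quiet are made up for this
example.

```c
#include <math.h>
#include <stdbool.h>

/* Relational operator: under IEEE 754 / C Annex F semantics this
 * raises the "invalid" exception if either operand is a NaN.
 * Assumed (not confirmed by the specification) to keep mapping to
 * the original FLE.D. */
bool le_signaling(double a, double b)
{
    return a <= b;
}

/* C99 comparison macro: same boolean result as <=, but it does not
 * raise "invalid" for quiet NaN operands.
 * Assumed (not confirmed by the specification) to be a candidate
 * for FLEQ.D. */
bool le_quiet(double a, double b)
{
    return islessequal(a, b);
}
```

Both functions return the same value for every input, including
NaN (where the result is false). The only difference is the effect
on the exception flags, which is why the choice between the two
instruction families is invisible in most C code.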
--------------------
* The constants of the FLI.i instructions
In my experience, the constants in table 25.1 don't represent the
most commonly used floating-point constants.
However, just relying on gut feeling isn't a good way to design
processor instructions.
Instead, I instrumented the IAR compiler and built a large body of
code. The table below is the head of the resulting list of constants,
sorted by number of occurrences (it combines both 32- and 64-bit
types).
Clearly, I see a different pattern compared to the constants provided
in the Zfa suggestion. (Of course, it would be possible to refine this
analysis further by investigating the context of the constants, e.g.
if the "-1.0" is used for addition, in which case it could be
rewritten as a subtraction.)
On the other hand, constructing a non-fractional floating-point value
is easy in RISC-V assembly, for example:
addi a0, zero, 123
fcvt.d.w fa0, a0
Whereas it is a lot harder to create a more complex value, especially
for types larger than 32 bits.
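
As a rough illustration of which constants fall into the "easy"
category, the sketch below checks whether a double can be produced by
the two-instruction addi + fcvt.d.w sequence above, i.e. whether it is
an integer within the 12-bit signed immediate range of ADDI. The
function name fits_addi_fcvt is hypothetical, and the check ignores
other sequences (such as lui + fcvt.d.w) that a compiler might also
use.

```c
#include <math.h>
#include <stdbool.h>

/* True if `value` can be materialized as
 *     addi     a0, zero, imm
 *     fcvt.d.w fa0, a0
 * which requires an integer in the ADDI immediate range
 * [-2048, 2047]. Hypothetical helper, for illustration only. */
bool fits_addi_fcvt(double value)
{
    /* Written so that NaN fails the range test as well. */
    if (!(value >= -2048.0 && value <= 2047.0))
        return false;
    /* Converting integer zero yields +0.0, never -0.0. */
    if (value == 0.0 && signbit(value))
        return false;
    /* In range, so the cast is well defined; equality means there
     * is no fractional part. */
    return (double)(int)value == value;
}
```

With this check, values from the appendix such as 123.0 and -90.0
qualify, while 1.234 (fractional) and 3600.0 (outside the immediate
range) do not.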
Of the proposed constants, there are some that have very little or
no presence in my statistical material:
256 -- 4
2^15 -- 2 (-2^15 occurs 9 times)
2^-8 -- 2
1.25 -- 1 (-1.25 occurs 7 times)
2^16 -- 0
2^-7 -- 0
2^-15 -- 0
2^-16 -- 0
0.3125 -- 0
0.375 -- 0
0.4375 -- 0
0.625 -- 0
0.875 -- 0
1.75 -- 0
0.0625 -- 0
Based on the statistics, I suggest that we add the following
constants:
5.0
10.0
6.0
60.0
120.0
pi/2 (or -pi/2) (which seems to be used a lot more than pi)
Note: There is no need for 0.0, as this can be created using:
fcvt.d.w fa0, zero
Reservation: The material used might not be representative of
real-world applications, as it contains a lot of code specifically
used to test compilers. However, I believe that this provides a
better selection than the GCC standard library, which I understand
was used as the base for the proposed Zfa constants.
Appendix: Statistical analysis of floating-point constants.
("*" = Part of the proposed Zfa extension.)
1.0 | 1466 *
2.0 | 615 *
3.0 | 530 *
4.0 | 413 *
-1.0 | 391 *
10.0 | 333
5.0 | 296
0.5 | 238 *
6.0 | 208
120.0 | 201
60.0 | 164
70.0 | 147
7.0 | 143
9.0 | 134
150.0 | 132
Infinity | 130 *
-2.0 | 128
16.0 | 115 *
17.0 | 111
130.0 | 110
12.0 | 109
18.0 | 107
1280.0 | 102
3600.0 | 99
11.0 | 91
90.0 | 83
NaN | 83 *
-3.0 | 81
8.0 | 81 *
-90.0 | 79
2700.0 | 77
-4.0 | 75
14.0 | 73
-5.0 | 67
25.0 | 67
30.0 | 64
1000.0 | 64
1.234 | 63
20.0 | 63
26.0 | 62
2.2 | 59
1.100000023841858 | 59
-1.7363228039172371 | 56
10000000000.0 | 56
-60.0 | 56
1.1 | 53
5.0e-324 | 52
1.0e+35 | 49
99.0 | 48
100.0 | 48
1.0000000000000001e-35 | 48
5.4324 | 47
13.0 | 46
-0.5 | 45
-6.0 | 44
110.0 | 44
-Infinity | 41
0.25 | 41 *
28.0 | 41 *
21.0 | 40
162.0 | 38
3.3 | 38
1.5 | 37
15.0 | 37
0.0 | 36
123.0 | 35
22.0 | 35
80.0 | 34
180.0 | 34
23.0 | 33
31.0 | 28
--------------------
* Suggestion: Add a "scaling" instruction (mul/div by power-of-two)
One operation that I often see in floating-point based code is when a
value is scaled up or down by a power of two, where the scale value
is provided as an immediate.
For example, to multiply fa0 by 8.0, we could use:
SCALEUP.D fa0, fa0, 3
From an implementation point of view, this is almost trivial to
implement in hardware, as a scaling corresponds to an adjustment of
the "exp" field, with some logic associated with overflow (when
scaling up) and underflow (when scaling down).
In addition, this is useful when constructing constants.
fli.d fa0, 1.0 # Note: Unsure about the syntax.
scaledown.d fa0, fa0, 14 # Creates 2 ^ -14
This could reduce the need for powers of two in the FLI.i
instruction.
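
The exponent-field adjustment described above can be sketched in
software. The following is a rough model, not a specification, of what
the proposed SCALEUP.D might compute for binary64 values. The name
scaleup_model is hypothetical, round-to-nearest is assumed for the
overflow case, and zeros, subnormals, infinities and NaNs take a slow
path instead of touching the exponent field.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical software model of the proposed SCALEUP.D:
 * returns x * 2^n by adding n to the biased exponent field. */
double scaleup_model(double x, unsigned n)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    uint64_t exponent = (bits >> 52) & 0x7FF;

    /* Zero, subnormal, infinity, NaN: the exponent field cannot
     * simply be incremented, so multiply by 2.0 repeatedly. */
    if (exponent == 0 || exponent == 0x7FF) {
        for (unsigned i = 0; i < n; i++)
            x *= 2.0;
        return x;
    }

    if (exponent + n >= 0x7FF) {
        /* Overflow: signed infinity (assumes round-to-nearest). */
        bits = (bits & 0x8000000000000000ULL) | 0x7FF0000000000000ULL;
    } else {
        /* Normal case: only the exponent field changes. */
        bits += (uint64_t)n << 52;
    }
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For a normal input the whole operation is a single addition on the
exponent field, which is the basis for the claim that the hardware
cost is small. The special cases are where the overflow and underflow
logic mentioned above would be needed.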
IAR Systems AB
Box 23051, Strandbodgatan 1
SE-750 23 Uppsala, Sweden
www.iar.com
> - The constants selected for FLI.i don't seem to match real-world
> use, based on a statistical analysis we have conducted.
The idea was to encode values that would require no more than a few gates to synthesize, rather than requiring a lookup table (which pi/2, for example, would need). That meant a few bits of exponent and significand come directly from the instruction. Given that constraint, I believe the proposal was based on static statistics gathered from libc and some other sources. Since FLI is primarily for code size rather than performance, static statistics were most appropriate.
> - Finally, a new instruction to scale floating-point values is
> proposed.
Please consider implementing this in a compiler and gathering some statistics. Another place where exponent scaling is useful is in conversion from integer to floating-point (e.g. for switching from integer DSP to FP).
-Earl