Defined behaviour for upper bits in FD register file after downwards conversions

Michael Clark

Sep 3, 2016, 7:29:37 AM

to RISC-V ISA Dev

Hi All,

I have a question about whether the values of the upper bits are defined in the FD register file after any of the double-to-single narrowing or move operations, i.e.:

- FCVT.S.D, FCVT.S.W, FCVT.S.WU (on RV{32,64}FD)

- FCVT.S.L and FCVT.S.LU, FMV.S.X (on RV64FD)

What should the state of the upper bits be after any of these operations?

a) sign extend into the upper bits of a 64-bit register file?

b) zero extend into the upper bits of a 64-bit register file?

c) upper bits hold their previous bit values?

d) upper bits values are undefined?

The C pseudo code for the instructions I am working on uses C cast notation for sign and zero extension, i.e. assigning s32() to a 64-bit signed type has the well-defined behaviour of sign extending. The integer registers are all maximal-width signed integers.

The sign extension versus zero extension recently caught me out on the 32-bit logical shift instructions on RV64, which counterintuitively (to me) sign extend for unsigned types. It seems all narrower integer operations sign extend even if they are unsigned (logical). I used Alex’s port of the riscv-tests for the Linux ABI <https://github.com/arsv/riscv-qemu-tests> to test them, as I haven’t implemented the privileged spec yet.

These all pass the tests (and the rest of the integer ISA):

slliw "Shift Left Logical Immediate Word" "rd = s32(u32(rs1) << imm)"
srliw "Shift Right Logical Immediate Word" "rd = s32(u32(rs1) >> imm)"
sraiw "Shift Right Arithmetic Immediate Word" "rd = s32(rs1) >> imm"
sllw "Shift Left Logical Word" "rd = s32(u32(rs1) << (rs2 & 0b11111))"
srlw "Shift Right Logical Word" "rd = s32(u32(rs1) >> (rs2 & 0b11111))"
sraw "Shift Right Arithmetic Word" "rd = s32(rs1) >> (rs2 & 0b11111)"

The integer pseudo code now sign extends to the wider type in the integer register file. The notation gets away with simple assignment to rd because rd is the largest signed type, so sign extension happens naturally. I could potentially add sugar aliases such as sign_extend<32>(val) to make it more human-readable; however, the cast behaviour of C is "well defined".
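
As a minimal sketch of the cast notation described above, the immediate word shifts can be written as plain C functions. The function names and the int64_t register model are illustrative assumptions, not part of the spec. Note that converting an out-of-range value to int32_t and right-shifting a negative value are implementation-defined in C; the sketch assumes the usual two's-complement wrap and arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: an RV64 integer register is a signed 64-bit value. */
typedef int64_t xreg;

/* slliw: shift the low 32 bits left, then sign-extend bit 31 on return. */
static xreg slliw(xreg rs1, unsigned shamt)
{
    assert(shamt < 32); /* imm[5] != 0 would be an illegal instruction */
    return (int32_t)((uint32_t)rs1 << shamt);
}

/* srliw: logical shift of the low 32 bits, result still sign-extended. */
static xreg srliw(xreg rs1, unsigned shamt)
{
    assert(shamt < 32);
    return (int32_t)((uint32_t)rs1 >> shamt);
}

/* sraiw: arithmetic shift of the low 32 bits (assumes >> on a negative
   int32_t is an arithmetic shift, as on mainstream compilers). */
static xreg sraiw(xreg rs1, unsigned shamt)
{
    assert(shamt < 32);
    return (int32_t)rs1 >> shamt;
}
```

The surprising case is a logical shift by zero: srliw(-1, 0) yields -1 rather than 0x00000000FFFFFFFF, because the 32-bit result is sign-extended on the way back into the register.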

However, narrowing conversions in the float register file are another story. A narrowing conversion may leave garbage in the upper 32 bits, which could be read back out with fmv.x.d on RV64FD, and could result in undefined values on RV32FD when performing an fcvt.s.d (convert double to single) followed by an fmv.x.d (move double to integer register).

fcvt.s.w "FP Convert Word to Float (SP)" "f32(frd) = f32(s32(rs1))"
fcvt.s.wu "FP Convert Word Unsigned to Float (SP)" "f32(frd) = f32(u32(rs1))"
fmv.s.x "FP Move from Integer Register (SP)" "u32(frd) = u32(rs1)"
fcvt.s.l "FP Convert Double Word to Float (SP)" "f32(frd) = f32(s64(rs1))"
fcvt.s.lu "FP Convert Double Word Unsigned to Float (SP)" "f32(frd) = f32(u64(rs1))"
fcvt.s.d "FP Convert DP to SP" "f32(frd) = f32(f64(frs1))"
fmv.x.d "FP Move to Integer Register (DP)" "rd = s64(frs1)"

For example, fcvt.s.d currently assigns to a float in a union of a float, a double, and integer words, with padding in the correct endian order for the platform (so I can implement fmv). I don’t know an easy way to zero extend a float once it is a float, so the upper bits are currently defined as "c) the upper bits hold their previous bit values". At least, any other behaviour is not easily expressible in simple C. Some hardware may in fact have a different policy for the upper bits in the register file upon narrowing conversions.

fcvt.s.d "FP Convert DP to SP" "f32(frd) = f32(f64(frs1))"

and fmv.s.x could be made to sign extend very easily, as it accesses an integer alias in the float/double register file union

fmv.s.x "FP Move from Integer Register (SP)" "sx(frd) = u32(rs1)"

and fmv.x.d (on RV32) will currently overflow the integer register, truncating the binary representation of the double to 32 bits of the mantissa.

fmv.x.d "FP Move to Integer Register (DP)" "rd = s64(frs1)"
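
The upper-bit policies for fcvt.s.d can be sketched by modelling the register as a raw 64-bit pattern rather than a union, which sidesteps the endianness question and makes zero extension expressible. This is only an illustration under stated assumptions: the freg type and function names are hypothetical, only policies (b) and (c) are shown, and the host is assumed to use IEEE 754 binary32/binary64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == 4 && sizeof(double) == 8,
              "IEEE 754 binary32/binary64 host assumed");

/* Hypothetical 64-bit FP register holding a raw bit pattern. */
typedef struct { uint64_t bits; } freg;

/* Bit-exact reinterpretation helpers (memcpy avoids aliasing problems). */
static uint32_t f32_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static double bits_to_f64(uint64_t u)
{
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}

/* Policy (c): write the single into bits 31:0, keep bits 63:32 unchanged. */
static void fcvt_s_d_keep_upper(freg *frd, const freg *frs1)
{
    uint32_t lo = f32_to_bits((float)bits_to_f64(frs1->bits));
    frd->bits = (frd->bits & UINT64_C(0xFFFFFFFF00000000)) | lo;
}

/* Policy (b): zero-extend the single into the full 64-bit register. */
static void fcvt_s_d_zero_upper(freg *frd, const freg *frs1)
{
    frd->bits = f32_to_bits((float)bits_to_f64(frs1->bits));
}
```

Working on the bit pattern rather than on a float lvalue is what makes the choice of policy explicit: each function states exactly what happens to bits 63:32.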

I think Krste just answered FMV.X.D. It should narrow to float on RV32IFD?

In any case, I have so far been able to express a large portion of the ISA as simple C pseudo code; however, there are a few edge cases with FP. I still need to implement fcsr and frm, and I have to suppress OS divide-by-zero exceptions and so on...

http://www.cplusplus.com/reference/cfenv/

Michael

Andrew Waterman

Sep 3, 2016, 6:35:32 PM

to Michael Clark, RISC-V ISA Dev

On Sat, Sep 3, 2016 at 4:29 AM, Michael Clark <michae...@mac.com> wrote:
> Hi All,
>
> I have a question about whether the value of the upper bits are defined in
> the FD register file, for any of the double to single size narrowing or move
> operations: i.e.
>
> - FCVT.S.D, FCVT.S.W, FCVT.S.WU (on RV{32,64}FD)
> - FCVT.S.L and FCVT.S.LU, FMV.S.X (on RV64FD)
>
> What should the state of the upper bits be after any of these operations?
>
> a) sign extend into the upper bits of a 64-bit register file?
> b) zero extend into the upper bits of a 64-bit register file?
> c) upper bits hold their previous bit values?
> d) upper bits values are undefined?

Interpreting a float32 as a float64 has implementation-defined
behavior (and not just with respect to the upper 32 bits).

The only guarantee is that spilling a float32 as a float64 (FSD or
FMV.X.D), then reloading it as a float64 (FLD or FMV.D.X), preserves
the original float32 value.

>
> The C pseudo code instructions I am working on use C cast notation for sign
> and zero extending. i.e. assigning s32() to a 64-bit signed type has the
> well-defined behaviour of sign extending. The integer register are all
> maximal width signed integers.
>
> The sign extension versus zero extension recently caught me out on the
> 32-bit logical shift instructions on RV64 which counter intuitively (to me)
> sign extend for unsigned types. It seems all narrower integer operations
> will sign extend even if they are unsigned (logical). I used Alex’s port of

The reason that all *w instructions sign-extend the result is to help
maintain the ABI invariant that all 32-bit integer types are
sign-extended in a wider register. This makes conversion from int32
to uint32, or int32 to [u]int64, a no-op.
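
A small sketch of why that sign-extension rule makes these conversions free, assuming a hypothetical model in which an RV64 register is an int64_t and every 32-bit value, signed or unsigned, is kept with bit 31 copied into bits 63:32. The helper names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* The sketch relies on uint32_t -> int32_t conversion wrapping
   (implementation-defined in C, true on mainstream compilers). */
static_assert((int32_t)0x80000000u < 0, "two's-complement wrap assumed");

/* Hypothetical model of an RV64 integer register. */
typedef int64_t xreg64;

/* Register image of a 32-bit value: bit 31 is replicated into bits 63:32,
   whether or not the C type is signed. */
static xreg64 reg_of_i32(int32_t v)  { return v; }
static xreg64 reg_of_u32(uint32_t v) { return (int32_t)v; }

/* int32 -> int64: the register already holds the widened value. */
static int64_t i32_reg_to_i64(xreg64 r) { return r; }

/* int32 -> uint32: both types share one register image, so nothing to do. */
static xreg64 i32_reg_to_u32_reg(xreg64 r) { return r; }

/* By contrast, uint32 -> uint64 does need work: bits 63:32 must be cleared. */
static uint64_t u32_reg_to_u64(xreg64 r)
{
    return (uint64_t)r & UINT64_C(0xFFFFFFFF);
}
```

In this model the two no-op conversions compile to nothing, while widening an unsigned 32-bit value is the one case that costs instructions.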

FMV.X.D and FMV.D.X don't exist on RV32.

>
> In any case, I have so far been able to express a large portion of the ISA
> as simple C pseudo code however there are a few edge cases with FP and I
> still need to implement fcsr and frm and I have to suppress OS Divide By
> Zero exceptions and what not...
>
> http://www.cplusplus.com/reference/cfenv/
>
> Michael
>


Michael Clark

Sep 3, 2016, 7:39:11 PM

to Andrew Waterman, RISC-V ISA Dev

Hi Andrew,

On 4 Sep 2016, at 10:35 AM, Andrew Waterman <wate...@eecs.berkeley.edu> wrote:

> On Sat, Sep 3, 2016 at 4:29 AM, Michael Clark <michae...@mac.com> wrote:
> > Hi All,
> >
> > I have a question about whether the value of the upper bits are defined in
> > the FD register file, for any of the double to single size narrowing or move
> > operations: i.e.
> >
> > - FCVT.S.D, FCVT.S.W, FCVT.S.WU (on RV{32,64}FD)
> > - FCVT.S.L and FCVT.S.LU, FMV.S.X (on RV64FD)
> >
> > What should the state of the upper bits be after any of these operations?
> >
> > a) sign extend into the upper bits of a 64-bit register file?
> > b) zero extend into the upper bits of a 64-bit register file?
> > c) upper bits hold their previous bit values?
> > d) upper bits values are undefined?
>
> Interpreting a float32 as a float64 has implementation-defined
> behavior (and not just with respect to the upper 32 bits).
>
> The only guarantee is that spilling a float32 as a float64 (FSD or
> FMV.X.D), then reloading it as a float64 (FLD or FMV.D.X), preserves
> the original float32 value.

“Implementation defined” is a good way to explain this; a reference manual for an implementation can then elaborate.

Thanks.

> > The C pseudo code instructions I am working on use C cast notation for sign
> > and zero extending. i.e. assigning s32() to a 64-bit signed type has the
> > well-defined behaviour of sign extending. The integer register are all
> > maximal width signed integers.
> >
> > The sign extension versus zero extension recently caught me out on the
> > 32-bit logical shift instructions on RV64 which counter intuitively (to me)
> > sign extend for unsigned types. It seems all narrower integer operations
> > will sign extend even if they are unsigned (logical). I used Alex’s port of
>
> The reason that all *w instructions sign-extend the result is to help
> maintain the ABI invariant that all 32-bit integer types are
> sign-extended in a wider register. This makes conversion from int32
> to uint32, or int32 to [u]int64, a no-op.

It makes sense; however, I wrote the pseudo code for those instructions from the spec and wasn’t reading closely enough.

I went back to the spec and found a mention of “signed”. It was an innocent mistake. It’s very hard to fault the Base ISA spec. It’s very detailed.

"

SLLIW, SRLIW, and SRAIW are RV64I-only instructions that are analogously defined but operate
on 32-bit values and produce signed 32-bit results. SLLIW, SRLIW, and SRAIW generate an illegal
instruction exception if imm[5] ≠ 0.

"

I did actually look at Spike’s code for sign injection. It has been an interesting process, i.e. testing the spec from the wording versus the reference implementation.

The observation is that a normative notation which is explicit about sign and zero extension will be interesting.

The bulk of the ALU instructions are XLEN width instructions so this is not relevant.

Oh right. I was getting confused between instruction selection in the other thread and the behaviour that was on my mind at the time.

In summary, one can conform with the spec if the “implementation defines” the behaviour of the upper bits, which is different from “undefined”.

Thanks again,

Michael

Jacob Bachmeyer

Sep 3, 2016, 8:09:17 PM

to Michael Clark, Andrew Waterman, RISC-V ISA Dev

Michael Clark wrote:
> In summary, one can conform with the spec if the “implementation
> defines” the behaviour of the upper bits, which is different from
> “undefined”.

The only architectural requirement is that FSW of a narrower float as an
FLEN-width float followed by an FLD as an FLEN-width float of the same
value must produce the value that was originally in the register. This
is necessary to support spilling callee-saved FP registers and context
switches without extra state to encode what is actually *in* the FP
registers.

As far as portable code is concerned, "implementation defined" *is*
"undefined behavior", since relying on it would tie the code to a
particular implementation. As far as an implementation is concerned,
"implementation defined" means "do whatever is convenient".

One option is to essentially only support FLEN-width floats by
"unpacking" narrower floats into the correct fields in the FPU register
file during FLD and "packing" them again for FSW. Arithmetic would mask
the "excess" bits. Conversion from double-precision to single-precision
would simply adjust the exponent and mask off some less-significant bits
from the mantissa, replacing the entire value if needed to express an
overflow/underflow condition. Most values could be quickly converted by
hardware, while overflow/underflow/denormal results could cause an
implementation-specific trap to M-mode to produce the correct result value.

Andrew Waterman

Sep 3, 2016, 8:39:17 PM

to jcb6...@gmail.com, Michael Clark, RISC-V ISA Dev

That is a reasonable implementation strategy and is what Alpha and
POWER6 do. It's worse than just masking off the extra precision for
single-precision arithmetic, though, as single-precision rounding may
increment the upper bits of the significand. The effect for POWER6 is
that single-precision arithmetic is higher latency than double.

Our original intent in leaving this implementation-defined was to
permit an internal recoding that uses an extra exponent bit to keep
subnormal numbers in normalized form. This simplifies the handling of
subnormals in hardware, reducing the temptation to trap on the
uncommon cases.
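
The point about rounding can be illustrated in software. The sketch below is not a model of any particular hardware: it assumes an IEEE 754 host in the default round-to-nearest mode, the function names are hypothetical, and it simply compares masking off the 29 excess fraction bits of a double with properly rounding to single precision.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == 4 && sizeof(double) == 8,
              "IEEE 754 binary32/binary64 host assumed");

/* Masking only: clear the 29 low fraction bits that binary32 cannot hold
   (52 fraction bits in a double minus 23 in a single). */
static double mask_to_single(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    bits &= ~((UINT64_C(1) << 29) - 1);
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Correct narrowing: round to binary32 in the current rounding mode,
   then widen back to double (widening is exact). */
static double round_to_single(double d)
{
    return (double)(float)d;
}
```

For d = 2 - 2^-30, masking gives 2 - 2^-23, whereas rounding to nearest carries through every retained significand bit and gives exactly 2.0, changing the exponent as well.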

Stefan O'Rear

Sep 3, 2016, 8:44:02 PM

to Andrew Waterman, jcb6...@gmail.com, Michael Clark, RISC-V ISA Dev

On Sat, Sep 3, 2016 at 5:38 PM, Andrew Waterman
<wate...@eecs.berkeley.edu> wrote:
> Our original intent in leaving this implementation-defined was to
> permit an internal recoding that uses an extra exponent bit to keep
> subnormal numbers in normalized form. This simplifies the handling of
> subnormals in hardware, reducing the temptation to trap on the
> uncommon cases.

It's worth noting that using traps to handle subnormals is a known
security footgun <https://cseweb.ucsd.edu/~dkohlbre/papers/subnormal.pdf>,
and to the extent that we can discourage implementations from doing
that and provide viable alternatives, we probably should.

-s
