Instruction for total ordering of floating point values


Simon Ochsenreither

Sep 4, 2021, 5:51:05 PM
to RISC-V ISA Dev
Hi everyone,

the RISC-V instruction set contains 6 instructions (FEQ, FLT, FLE in both F and D standard extensions) that implement the comparison predicate (§5.11 of the IEEE754 spec) for floating point numbers.

The IEEE754 spec also defines a total order predicate in §5.10, but RISC-V does not provide instructions for it.

While RISC-V can't and shouldn't provide instructions for everything, I think this is similar to the floating-point sign-injection operations: these operations could be implemented by moving values to x registers, a few bit operation instructions there and a move back to f registers, but these operations were deemed important enough to exist on their own.
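To make the comparison concrete, here is a minimal C sketch of how a sign-injection operation could be emulated with integer bit operations. The helper name `fsgnj_emulated` is hypothetical, and the `memcpy` calls stand in for the moves between f and x registers; this only illustrates the idea and is not how any particular compiler or library does it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: emulate FSGNJ.D in integer registers.
   The result takes its sign bit from b and all other bits from a. */
static double fsgnj_emulated(double a, double b)
{
    uint64_t abits, bbits, rbits;
    double r;
    const uint64_t sign = UINT64_C(1) << 63;

    memcpy(&abits, &a, sizeof abits);          /* f -> x move */
    memcpy(&bbits, &b, sizeof bbits);          /* f -> x move */
    rbits = (abits & ~sign) | (bbits & sign);  /* splice sign of b onto a */
    memcpy(&r, &rbits, sizeof r);              /* x -> f move */
    return r;
}
```

For example, `fsgnj_emulated(1.5, -2.0)` yields `-1.5`.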

Would a proposal to add 3 instructions to the F/D opcode family – tentatively named "FEQO", "FLTO" and "FLEO" – be welcomed?

Thank you!

Simon

Jim Wilson

Sep 7, 2021, 6:45:28 PM
to Simon Ochsenreither, RISC-V ISA Dev
On Sat, Sep 4, 2021 at 2:51 PM Simon Ochsenreither <simon.och...@gmail.com> wrote:
> Would a proposal to add 3 instructions to the F/D opcode family – tentatively named "FEQO", "FLTO" and "FLEO" – be welcomed?

This subject has come up before, multiple times.  Last one I remember is when someone asked why the compiler was emitting multiple instructions for a simple FP comparison.  And that is because we don't have a full set of FP comparison instructions.  I would certainly like to see a full set of FP comparison instructions.  I don't know if the additional 3 you are suggesting are sufficient.  Someone should do some compiler experimenting to see what missing FP compare instructions we would get the most benefit from.  On the other side are people who don't think that NaNs are important enough to waste opcode space on.

I found this which is the last one I remember

Jim

Simon Ochsenreither

Sep 8, 2021, 3:58:59 PM
to RISC-V ISA Dev, Jim Wilson, RISC-V ISA Dev
Hi Jim,

thank you for the reference! I agree that the signaling/quiet distinction is probably not worth the complexity, and many of the mentioned comparison operations can probably be emulated rather trivially.

What I can't seem to find though is any discussion on ordering instructions, which cannot be emulated by comparison instructions (I searched for "total", "order" and "5.10", but came up empty on the mailing list you linked to).

Though if interest is low or the discussion is expected to be lengthy, I'd be respectful of everyone's time and work on other things instead.

Thanks,

Simon

Jim Wilson

Sep 8, 2021, 5:12:57 PM
to Simon Ochsenreither, RISC-V ISA Dev
On Wed, Sep 8, 2021 at 12:59 PM Simon Ochsenreither <simon.och...@gmail.com> wrote:
> thank you for the reference! I agree that the signaling/quiet distinction is probably not worth the complexity, and many of the mentioned comparison operations can probably be emulated rather trivially.
>
> What I can't seem to find though is any discussion on ordering instructions, which cannot be emulated by comparison instructions (I searched for "total", "order" and "5.10", but came up empty on the mailing list you linked to).

I had to check the IEEE FP standard to see what totalorder is.  Looks like the ISO C standard hasn't caught up and doesn't support it yet.  I would be surprised if any target has an instruction for that.

The flto/etc instructions you mentioned are for unordered/ordered compares, and we can easily emulate those with a few extra instructions.  If that is all you need, then you can already compute totalorder using the existing instructions.  It will just require more instructions than necessary if we had a full set of FP compare operations.  Also see the fclass instruction which solves the other half of the problem.

RISC-V does support all of the comparison functions required by the IEEE FP standard.  It just requires more than one instruction to compute some of them.

Jim

Simon Ochsenreither

Sep 9, 2021, 4:06:31 PM
to RISC-V ISA Dev, Jim Wilson, RISC-V ISA Dev
Hi Jim,

> The flto/etc instructions you mentioned are for unordered/ordered compares
> and we can easily emulate those with a few extra instructions.

I think "total order" is a misleading name as it sounds as if it was related to unordered/ordered comparisons, but it really isn't.

> If that is all you need, then you can already compute totalorder using the existing instructions.
> It will just require more instructions than necessary if we had a full set of FP compare operations.
> Also see the fclass instruction which solves the other half of the problem.

I'm not sure this is possible, as unordered/ordered comparisons indiscriminately put NaNs first/last,
and I'm also pretty sure they treat 0.0 and -0.0 as equal (The total order is specified as the following: -qNaN -sNaN -Inf -1024.0 -1.0 -0.0 0.0 1.0 1024.0 Inf sNaN qNaN.)

The only implementations I have ever seen do not use floating point comparisons, but move the float to gp registers and do bit shifts and integer comparisons there.
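For illustration, the integer-register approach described above might look like the following C sketch. It is not the glibc code and not the code linked in this thread; the helper names `double_from_bits`, `total_order_key` and `total_order_le` are hypothetical, and the sketch assumes IEEE 754 binary64 doubles with two's-complement 64-bit integers. The key computation is branch-free, and the only comparison is the final integer one.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a double from an exact bit pattern,
   handy for constructing specific NaNs and infinities. */
static double double_from_bits(uint64_t bits)
{
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Hypothetical helper: map a double to a signed integer key whose
   ordinary integer order matches the total order listed above. */
static int64_t total_order_key(double x)
{
    int64_t bits;
    memcpy(&bits, &x, sizeof bits);              /* f -> x move */
    uint64_t sign = (uint64_t)bits >> 63;        /* 1 if the sign bit is set */
    /* All 63 low bits set for negative inputs, zero otherwise. */
    int64_t mask = (int64_t)((0 - sign) >> 1);
    /* Flipping the magnitude bits of negative inputs makes larger
       magnitudes compare lower; non-negative inputs are unchanged. */
    return bits ^ mask;
}

/* Returns true if a is ordered before or equal to b in the total order. */
static bool total_order_le(double a, double b)
{
    return total_order_key(a) <= total_order_key(b);
}
```

With this sketch, `total_order_le(-0.0, 0.0)` is true while `total_order_le(0.0, -0.0)` is false, and a positive quiet NaN such as `double_from_bits(0x7FF8000000000000)` sorts after every finite value and after `Inf`.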

I would really like to see an example of a sequence of instructions that operates only on the f registers.
If it were possible, I imagine it would take so many branches to make float comparisons implement the total ordering that moving the floats to gp registers would be cheaper.
(The code shown is branch-free up to the point of the integer comparison at the end.)

> RISC-V does support all of the comparison functions required by the IEEE FP standard.
> It just requires more than one instruction to compute some of them.

This is kind of my motivation, as the cost of moves between x and f registers probably varies considerably between core designs, whereas an instruction that works on f registers directly avoids those moves.

Thanks,

Simon

Jim Wilson

Sep 9, 2021, 4:51:23 PM
to Simon Ochsenreither, RISC-V ISA Dev
On Thu, Sep 9, 2021 at 1:06 PM Simon Ochsenreither <simon.och...@gmail.com> wrote:
> I'm not sure this is possible, as unordered/ordered comparisons indiscriminately put NaNs first/last,
> and I'm also pretty sure they treat 0.0 and -0.0 as equal (The total order is specified as the following: -qNaN -sNaN -Inf -1024.0 -1.0 -0.0 0.0 1.0 1024.0 Inf sNaN qNaN.)

which is why I mentioned that the fclass instruction solves the other half of the problem.  But the glibc solution using integer operations may still be the faster solution.

Jim

Simon Ochsenreither

Sep 9, 2021, 6:36:02 PM
to RISC-V ISA Dev, Jim Wilson, RISC-V ISA Dev
Hi Jim,

> which is why I mentioned that the fclass instruction solves the other half of the problem.

I think this would only deal with the 0.0 vs -0.0 case; I guess some additional FSGNJ magic would be required on top of that for splitting the NaNs into negative and positive groups.

> But the glibc solution using integer operations may still be the faster solution.

Yes, that was the starting assumption I wanted to improve upon.

Thanks,

Simon

MitchAlsup

Sep 26, 2021, 9:37:53 PM
to RISC-V ISA Dev, Simon Ochsenreither
On Saturday, September 4, 2021 at 4:51:05 PM UTC-5 Simon Ochsenreither wrote:
> Hi everyone,
>
> the RISC-V instruction set contains 6 instructions (FEQ, FLT, FLE in both F and D standard extensions) that implement the comparison predicate (§5.11 of the IEEE754 spec) for floating point numbers.

Be careful here:

     if( x > y )
          then-clause
     else
          else-clause

has NaNs go to the else clause, while if you invert the comparison:

     if( x <= y )
          else-clause
     else
          then-clause

has the NaNs go to the then-clause--this is not the desired flow of control.
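The difference between the two forms can be checked with a small C sketch. The function names are hypothetical; `isunordered` is the standard C99 macro from `<math.h>`. The sketch assumes the compiler follows IEEE 754 comparison semantics (for example, no fast-math style options).

```c
#include <math.h>
#include <stdbool.h>

/* Original form: the then-clause runs only when x > y is true,
   so any NaN operand sends control to the else-clause. */
static bool then_taken_original(double x, double y)
{
    return x > y;
}

/* Inverted form: the then-clause runs whenever x <= y is false,
   which includes every case where x or y is NaN. */
static bool then_taken_inverted(double x, double y)
{
    return !(x <= y);
}

/* What the inverted form actually computes, spelled out:
   "greater than, or unordered". */
static bool then_taken_inverted_spelled_out(double x, double y)
{
    return x > y || isunordered(x, y);
}
```

For ordinary numbers all three functions agree; with a NaN operand, `then_taken_original` returns false while the other two return true.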

> The IEEE754 spec also defines a total order predicate in §5.10, but RISC-V does not provide instructions for it.

This is expected to be a "seldom used" component of the standard. Not that it should be inefficient in any way,
it is just not expected to be used very often.

Mitch

Simon Ochsenreither

Oct 9, 2021, 4:38:55 PM
to RISC-V ISA Dev, MitchAlsup, Simon Ochsenreither
> This is expected to be a "seldom used" component of the standard. Not that it should be inefficient in any way,
> it is just not expected to be used very often.

Yes, that was my question.