--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+unsubscribe@groups.riscv.org.
To post to this group, send email to isa...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/isa-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/CA%2BwH294PKh%2BWaeY_Oj%3Dwb-7_wmKcJ0Ou%3DSvAdF8CJhw0%3DbmNow%40mail.gmail.com.
On 22/03/2017, Allen J. Baum <allen...@esperantotech.com> wrote:
> The ideal format from a HW perspective is to right-justify the exponent (and
> converting from excess 127 to/from excess 1023, which is a 3 bit decrement)
> and to left-justify the mantissa (with trailing zeroes) to the wider format
> - which is effectively converting it to the wider format. That's cheap,
> easy, and allows the FPU to handle either single or double with little extra
> logic.
On 23/03/2017, Roger Espasa <roger....@esperantotech.com> wrote:
> Encoding SP in more than 32b might cause troubles to folks wishing to build
> cheap SIMD on top of the existing Fregs. And the issue described by Alex
> will get worse when those SIMD extensions also include integer data types.
> And this will pop up again in the definition of the vector extension. So
> closing the hole in a way that an SP value stored with FSD ends up in the
> low 32b of the memory location in exactly IEEE format seems a good choice
> to me.
Well, it seems we have a dilemma. But I have an idea:
If we unpack the values, the exponent in the more-precise format is
never all 0s or all 1s.
If we choose a zero-padding or NaN-encoding scheme, the exponent in the
more-precise format is exactly all 0s or all 1s.
IEEE-compliant FPUs must make a distinction when storing single precision values with a store double.
On Thu, Mar 23, 2017 at 7:31 PM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
> Andrew Waterman wrote:
>>
>> On Wed, Mar 22, 2017 at 11:43 PM, Allen J. Baum
>> <allen...@esperantotech.com> wrote:
> I agree that widening narrower values may be the best way to store them in
> the FP register file, but the issue here is standardizing what to do when
> FSQ is executed on a register holding a single-precision value if the
> processor distinguishes that case. I seem to recall previous discussions on
Yeah. If the values are stored in the regfile in the widest supported
format, then it is clear what happens when a float32 or float64 is
FSQ'd: it's stored to memory as a float128 that represents the same
value. Easy to specify, and cheap to implement if employing recoding.
If you're using a single FPU to implement double, single, and packed single, then unpacking has 3 formats instead of two.

I don't think I understand the requirement that IEEE-compliant FPUs must make a distinction when storing single-precision values with a store double. What does that mean? That someone examining the bits in memory can determine whether the value stored was originally a single rather than a double? Strictly speaking, the only way to do that is by reserving NaN values, and I'd have to read the spec carefully to see if that was legal. I also don't know what the spec says about loading a single and adding it to a double or vice versa - a similar issue.

At first glance the HW cost doesn't seem to be much worse than zero filling, though the recursive encoding makes me nervous.
On Friday, 24 March 2017 00:58:21 UTC-4, andrew wrote:

On Thu, Mar 23, 2017 at 7:31 PM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
> Andrew Waterman wrote:
>>
>> On Wed, Mar 22, 2017 at 11:43 PM, Allen J. Baum
>> <allen...@esperantotech.com> wrote:
> I agree that widening narrower values may be the best way to store them in
> the FP register file, but the issue here is standardizing what to do when
> FSQ is executed on a register holding a single-precision value if the
> processor distinguishes that case. I seem to recall previous discussions on
Yeah. If the values are stored in the regfile in the widest supported
format, then it is clear what happens when a float32 or float64 is
FSQ'd: it's stored to memory as a float128 that represents the same
value. Easy to specify, and cheap to implement if employing recoding.
Some hardware implementations that employ this approach internally (one max size fits all)
provide a post-operation of readjusting to the lower precision, and therefore execute lower-precision
formats _slower_ than the full higher precision, at a higher cost in active circuits (e.g. adding across 52 bits vs. 23).
RISC-V has distinct instructions for each precision, so an implementation can optimize for lower precision.
An implementation that is lower-precision aware can have lower-precision values already positioned to optimize for the expected precision of the operation.
The NaN encoding provides the information of the precision at load time.
Whereas, encoding as lower_precision_value in higher_precision_format does not know the intended precision until a subsequent float instruction is decoded.
> this list pointing out that IEEE-compliant FPUs must make that distinction,
> so we are left with standardizing a way to encode a width tag and a narrower
> value into a wider value.
>
> I favor Alex Bradbury's proposal to encode the narrower value inside a wider
> NaN and offer a suggestion to extend that recursively if the widths are not
> adjacent. A half-precision value can be encoded into a single-precision NaN
> encoded into a double-precision NaN encoded into a quad-precision NaN and
> the whole bundle unpacked into the half-precision value, converted to
> internal quad-precision with a tag indicating half-precision format, upon
> FLQ if the implementation so chooses.
>
> -- Jacob
>
The NaN encoding provides the information of the precision at load time.
Whereas, encoding as lower_precision_value in higher_precision_format does not know the intended precision until a subsequent float instruction is decoded.
* Come up with a new NaN-based encoding.
* A double-precision NaN is represented as a value where the exponent is all
1s and the 52-bit significand is non-zero. This is a huge encoding space,
and a standard encoding could easily be chosen
* There is an advantage in that debug tools could determine with a high
degree of certainty whether the dumped state from a floating point register
is holding a value that is meant to be interpreted as a single-precision
float
* Similar encodings could be used to represent a half precision value in a
float register, or a double in a quad register
* There's perhaps more flexibility for eagerly recoding what seems to be a
single-precision float to a different internal representation upon an fld
(rather than on-demand when a single precision operation is performed).
However, for IEEE compliance any such value would still need to act as a NaN
when used in a double precision operation.
I argue that what matters above all else is that one of these options is
chosen and used consistently. It's worth noting that an implementation is
still free to use a different internal recoding; it would just need to support
serialising and deserialising to the standard encoding that is chosen.
"FSD and FMV.X.D should be defined to create the same implementation-defined values as each other, and FLD and FMV.D.X should restore them equivalently. In particular, FSD followed by LD and FMV.D.X should properly recreate the single-precision value, as should FMV.X.D followed by SD and FLD."
## Backwards compatibility impact
I believe this change can be made in a backwards compatible way (i.e. all
standards-compliant RISC-V software would continue to work on a newer
revision). It also seems likely there is still time to specify this change
and have it adopted before any RISC-V FPU implementations are available in
shipping systems.
## Other related issues
* Substantially more minor, but at least a recommendation for encoding quiet vs
signalling NaNs would be useful. A similar issue does exist here, in that a
signalling NaN might be interpreted as non-signalling after a context switch
to a different RISC-V implementation. I expect the potential impact of this
issue is far, far lower than what is described above
* An RV64IFD system hoping to support Q would, as it stands, have to break ABI
compatibility when Q is in use. Is the cost of adding yet another ABI worth
it, or can it be avoided?
## Conclusion/summary
* Leaving the encoding of lower precision values in higher precision fp
registers appears to give more microarchitectural freedom, but in reality this
is an architecturally visible property that 'leaks' and causes potential
issues in use cases that RISC-V community should care about (e.g.
migration on a heterogeneous cluster)
* The RISC-V community would benefit from standardising on a single externally
visible encoding ('serialisation'), and doing so quickly
a) expand into exponent and mantissa of larger type
b) right-justify in mantissa of larger type with unspecified encoding for the remaining bits
c) right-justify in mantissa of larger type with specified recursive type encoding for the larger type MSB mantissa NaN (all 1's)
On Fri, Mar 24, 2017 at 5:16 PM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
> Allen J. Baum wrote:
>>
>> - ensure IEEE Standards compliance?
>> * I'm a little unsure of which rule we may currently be violating,
>> beside perhaps not having a standard.
>> There was a suggestion last night that ( if I interpreted it
>> correctly, far from a sure thing) it was a requirement for someone who
>> examined the bits stored by a wider format store to be able to determine
>> that the value stored was actually a narrow format.
>> If that is indeed the case, then right justify, fill with NaN is the
>> only legal option, as far as I can tell.
>
>
> That was me misremembering a response to a suggestion I had made that an
> implementation could implement only FLEN-width floats, "unpack" narrower
> floats to FLEN upon LOAD-FP, and "pack" them again for STORE-FP. Masking
> the "excess" bits in arithmetic is then required, but can make
> single-precision latency greater than double-precision latency due to an
> implicit FCVT.S.D after every operation. See message-id
> <CA++6G0AvAWSOcuCOenHecUMRKqAozrWPmqvGrt8ex...@mail.gmail.com> for the
> response I misremembered.
On Fri, Mar 24, 2017 at 6:44 PM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
> Allen Baum wrote:
I see only two viable possibilities:
* Embed float32 in a float64 of the same numeric value (but we still
need to decide what happens to NaN payloads; left-aligning them with
zero bits on the right probably requires the fewest additional wires)
* Embed float32 in float64 by adding 1 bits on the left.
-s
I'm not convinced this "internally tracked format information" is a meaningful thing to talk about this way, and you seem to be misapprehending the meaning of "undefined behavior."
Undefined behavior is there _to allow for implementation freedom in handling it_, when a valid program would never encounter it. As a result:
- If it's undefined behavior, that "internally tracked format information" is actually entirely unnecessary
- If you define "mismatched operations" as "raises Illegal Operation" it is no longer undefined behavior, and actually _significantly constrains implementations_
For example, if mismatched operations (aside from FLD and FMV) are UB, then using the wrong-precision "fadd" may result in incorrect rounding - which is fine, so long as using the _correct-precision_ "fadd" results in the correct answer.
Such an implementation would _never_ need "internally tracked format information" - it might use a rounding step after performing a double computation, or it might use a separate single-precision unit, or any other implementation under the sun. It'd simply select behavior based on the format given _in the instruction_.
By comparison, if mismatched operations are _erroneous_, then it _must_ store that information - forcing complexity on implementations - in order to _compare_ it against the information in the instruction.
I personally prefer the "store float32 as a float64 of the same value" representation. It's friendlier to the outside world, and doesn't invent a quirky new format. In an implementation with the area to expend on separate single/double units it has no performance downsides (and area is sufficiently available that it's being used on GPUs). In area-constrained implementations, it's going to have the same performance impact as _any_ approach to supporting both float and double with the same unit.
> > (but we still
> > need to decide what happens to NaN payloads; left-aligning them with
> > zero bits on the right probably requires the fewest additional wires)
> >
> > * Embed float32 in float64 by adding 1 bits on the left.
>
> As mentioned elsewhere, this mandates tracking size in the implementation.
> Ideally we would not wish to impose this is on an implementation that
> otherwise would not care to do so.
As above, no it does not.
> The question is if we mandate a relatively minor obligation on F32-in-F64
> to retain the internal state of other implementations.
>
> By the way, I believe Stefan O'Rear's suggestion of all 1 -NaN is not only
> cute, but elegant, I cannot think of a better value for a specific NaN.
Er, the "all 1 -NaN" _is_ "embed float32 in float64 by adding 1 bits on the left".
It just so happens that "the left" includes the sign bits, mantissa, etc, and thus results in a negative NaN.
>
> -s
>
>
> Some additional thoughts:
> (The float load and store and fmv (pseudo op) of course will not raise the
> mismatch flag because they explicitly handle this use case.
> As I read the fmv it propagates the size information as well as the
> existing format's data).
>
> I believe the specs should be explicit when operations and data-size
> mismatch; I thought I saw it stated at one time, but it is certainly implied.
Clearly marking it as undefined behavior would be good. Making it _defined erroneous_ behavior is a very different beast.
On Saturday, 25 March 2017 10:54:09 PDT Alex Elsayed wrote:
> On Friday, 24 March 2017 20:52:56 PDT David Horner wrote:
<snip>
> > By the way, I believe Stefan O'Rear's suggestion of all 1 -NaN is not only
> > cute, but elegant, I cannot think of a better value for a specific NaN.
>
> Er, the "all 1 -NaN" _is_ "embed float32 in float64 by adding 1 bits on the
> left".
>
> It just so happens that "the left" includes the sign bits, mantissa, etc,
> and thus results in a negative NaN.
Gah, s/bits/bit/ and s/mantissa/exponent/. That'll teach me to send email right after waking up :/
<clip>
>> operation) would be treated as a NaN. Perhaps narrower values could be seen
>> as quiet NaNs by wider operations, while wider values would be seen as
>> signaling NaNs by narrower operations?
>>
>
> I do not see the purpose of complicating it in that way.
>
I see it as a means for software to use the FCLASS.x instructions in an
RVFDQ implementation to quickly determine what width of value is in an
FP register using at most two executions of FCLASS.x.
-- Jacob
- If you define "mismatched operations" as "raises Illegal Operation" it is no longer undefined behavior, and actually _significantly constrains implementations_
-NaN(0xFFFFFFFFFxxxx) // HP in DP NaN
-NaN(0xFFFFFxxxxxxxx) // SP in DP NaN
Note, if the RISC-V specification did not impose the guarantee on FSD for "a floating-point register holds a single-precision value" that the single-precision value would be restored, then the onus would be on the user program to ensure the operating system and libraries were aware of each register's current "precision"/size.
However, RISC-V has made the stipulation and there appears to be only the two approaches to encode.
On Sat, Mar 25, 2017 at 3:06 PM, David Horner <ds2h...@gmail.com> wrote:
>
>
>
> However, RISC-V has made the stipulation and there appears to be only the
> two approaches to encode.
>
Arithmetic instructions in RISC-V do not cause traps based upon
argument values. Adding value-dependent traps for this one case is an
undue burden. And I'd argue that making the trapping behavior
optional defeats the purpose of closing this specification hole.
So, I conclude that trapping on mismatched values should not factor
into this discussion.
On 26 Mar 2017, at 11:30 PM, kr...@berkeley.edu wrote:

* Why not NaN encoding?
The alternative proposal was to encode n values within m NaN encoding
space. This is undesirable as it encroaches upon NaN encoding space,
which could affect/interact with other uses of NaN encoding space (NaN
payload propagation, JIT boxing, etc.). In addition, if the
conversion to wider type is mandated, it can be used by software
during conversion between types, whereas NaN encoding requires
separate explicit conversion instruction. Also, for many optimized
FPU implementations the wider encoding is natural.
At 6:35 AM -0700 4/3/17, kr...@berkeley.edu wrote:
>
>Operations on wider types:
>
>If a register containing a wider m-bit operand is used as input to a
>narrower n-bit operation, the result should be as if the lower n-bits
>of the external m-bit representation were used. This operation should
>never occur in correct code, so can be handled by taking a slow M-mode
>trap to fix up the results.
(assuming a 32-bit floating op on registers with 64-bit floats in them)
So, does this imply that 32-bit operations should enforce that the upper 32 bits of input for all operands are all 1s, and trap if they are not, as well as setting the upper 32 bits to all 1s in the result?
If so, I don't see where we need to do any tagging at all (though it might be simpler than checking that the upper 32 bits are all 1s).
On Apr 11, 2017, at 6:17 AM, Bruce Hoult <br...@hoult.org> wrote:

"The cost to a recoded implementation is primarily in checking if the
upper bits of a narrower operation represent a legal NaN-boxed value."

NON-recoded, Shirley?
Maybe worth explicitly stating:

ANY use of an n-bit value by a non-n-bit operation (whether wider or narrower) will result in that operand being treated as a qNaN?
On Apr 11, 2017, at 6:36 AM, Bruce Hoult <br...@hoult.org> wrote:

> Maybe worth explicitly stating: ANY use of an n-bit value by a non-n-bit operation (whether wider or narrower) will result in that operand being treated as a qNaN?

As written, anything <= n bits is treated as expected - I think it actually confuses the issue to call out narrower boxed values as being treated as qNaN (their representation will ensure that).
On 12 Apr 2017, at 11:17 AM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:

On another note, this change seems to define the result of FSGNJ[NX].D on single-precision inputs: the sign bit read from a single-precision input is always 1. The sign bit of a NaN value is officially meaningless, and FSGNJ*.D can easily produce results that have positive sign bits but are otherwise boxed narrower values. Should these also be valid boxed values, or should they be considered NaN in any width?