Fixed-width floating-point types

Ray Hamel

Feb 23, 2017, 1:55:54 PM
to ISO C++ Standard - Future Proposals
C99, C++11 and later introduced standardized fixed-width integer types such as int32_t, provided in the headers stdint.h/cstdint and inttypes.h/cinttypes. However, although float and double are conventionally implemented with IEC 559 binary32 and binary64 format and semantics, respectively, and long double often has the same representation as double, none of this is specified by the standard.
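
As a rough illustration of what can be checked today, the sketch below uses std::numeric_limits to test whether float and double look like binary32 and binary64 on the current platform. The helper name looks_like_binary is hypothetical and only for illustration; the check covers format (width, radix, significand precision) and the implementation's own is_iec559 claim, not the full arithmetic semantics.

```cpp
#include <climits>
#include <limits>

// Hypothetical helper (not from any standard or proposal): true if T claims
// IEC 559 conformance and has the given storage width and significand
// precision (the precision counts the implicit leading bit).
template <typename T>
constexpr bool looks_like_binary(int width_bits, int significand_bits) {
    return std::numeric_limits<T>::is_iec559
        && std::numeric_limits<T>::radix == 2
        && static_cast<int>(sizeof(T) * CHAR_BIT) == width_bits
        && std::numeric_limits<T>::digits == significand_bits;
}

// These fail to compile on a platform where the conventional mapping does not hold.
static_assert(looks_like_binary<float>(32, 24),
              "float is not IEC 559 binary32 on this platform");
static_assert(looks_like_binary<double>(64, 53),
              "double is not IEC 559 binary64 on this platform");
```

A check like this only detects the conventional mapping; it cannot provide a conforming type where the platform lacks one, which is the gap the types below are meant to fill.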

There is therefore a need for floating-point types with well-defined format and semantics, which could be provided in existing headers such as cfloat or cmath, or a new header perhaps called float, stdfloat or floattypes.

32-bit and 64-bit base-2 floating-point numbers with fixed semantics, such that arithmetic expressions resulting in these types may not be transformed in any way that alters the result of the expression, even if no precision is lost. The set of values of type f64_t must be a superset of the set of values of type f32_t.

f32_t
f64_t

IEC 559 binary32/64 format and semantics.

f32_iec559_t
f64_iec559_t

32-bit and 64-bit base-2 floating-point numbers whose sets of values are supersets of the non-subnormal subsets of IEC 559 binary32/64, respectively. Arithmetic expressions resulting in these types may produce any non-subnormal value at least as precise as the one provided by IEC 559 binary floating-point arithmetic. Support for subnormal values of these types is implementation-defined.

f32_fast_t
f64_fast_t

Equivalent to f32_iec559_t and f64_iec559_t, but trap on arithmetic errors (inexact, underflow, overflow, divide-by-zero, invalid), or when they hold the values +/-NAN or +/-INFINITY.

f32_trap_t
f64_trap_t

Equivalent to f32_trap_t and f64_trap_t, but throw a C++ exception instead of trapping.

f32_throw_t
f64_throw_t

In addition, implementations may optionally provide any or all of the following types for fixed-width binary floating-point numbers with standardized semantics, …

f8_t
f16_t
f80_t
f128_t
f256_t
f16_iec559_t
f128_iec559_t
f256_iec559_t
f8_fast_t
f16_fast_t
f80_fast_t
f128_fast_t
f256_fast_t
f8_trap_t
f16_trap_t
f80_trap_t
f128_trap_t
f256_trap_t
f8_throw_t
f16_throw_t
f80_throw_t
f128_throw_t
f256_throw_t

… and may also optionally provide any or all of the following types for fixed-width decimal floating-point numbers with standardized semantics.

dec32_t
dec64_t
dec128_t
dec32_iec559_t
dec64_iec559_t
dec128_iec559_t
dec32_fast_t
dec64_fast_t
dec128_fast_t
dec32_trap_t
dec64_trap_t
dec128_trap_t
dec32_throw_t
dec64_throw_t
dec128_throw_t

Matthew Woehlke

Feb 23, 2017, 2:50:59 PM
to std-pr...@isocpp.org
On 2017-02-23 13:55, Ray Hamel wrote:
> C99, C++11 and later introduced standardized fixed-width integer types such
> as int32_t provided in the headers stdint.h/cstdint and inttypes.h/cinttypes.
> However, although the float and double data types conventionally are
> implemented with, respectively, IEC 559 binary32 and binary64 format and
> semantics, and long double as a typedef of double, this is not specified by
> the standard.
>
> There is therefore a need for floating-point types with well-defined format
> and semantics, which could be provided in existing headers such as cfloat or
> cmath, or a new header perhaps called float, stdfloat or floattypes.

This seems reasonable to me at least. Please however name them e.g.
float32_t, for consistency:

int, short, long -> intN_t
float, double -> floatN_t

Given the int types are in cstdint, I guess the float types should be in
cstdfloat. This suggests however that you should talk to WG14 first...

Certainly the _fast variations make sense. I'm not sure how practical
the others are. Possibly _ieee754 makes sense to the extent that these
can either exist if floating point uses IEEE 754, or not exist
otherwise. Do non-IEEE compilers have a way of declaring an IEEE float
even if plain `float` is something else? For similar reasons, I'm not
sure if _trap, etc. variants are possible.

If you're going to have _fast, you should probably consider having
_least, also, especially if you have float80 variants, since a platform
might have float_least80_t -> float128_t but not have float80_t.
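
A minimal sketch of how a _least selection could be expressed with today's standard types follows. The alias template float_least_t is hypothetical, and it compares storage width via sizeof, which includes padding (for example, an 80-bit x87 long double is commonly stored in 96 or 128 bits), so it is only an approximation of the idea.

```cpp
#include <climits>
#include <cstddef>
#include <type_traits>

// Hypothetical alias template: the narrowest of float, double and long double
// whose storage width is at least Bits. Storage width includes any padding,
// so this does not inspect the actual floating-point format.
template <std::size_t Bits>
using float_least_t =
    std::conditional_t<(sizeof(float) * CHAR_BIT >= Bits), float,
    std::conditional_t<(sizeof(double) * CHAR_BIT >= Bits), double,
    std::conditional_t<(sizeof(long double) * CHAR_BIT >= Bits), long double,
                       void>>>;  // void: no standard type is wide enough
```

On a platform with 32-bit float and 64-bit double, float_least_t<32> is float and float_least_t<64> is double; what float_least_t<80> names depends on how the platform stores long double.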

--
Matthew

Ray Hamel

Feb 23, 2017, 4:07:15 PM
to ISO C++ Standard - Future Proposals, mwoehlk...@gmail.com
On Thursday, February 23, 2017 at 2:50:59 PM UTC-5, Matthew Woehlke wrote:
> This seems reasonable to me at least. Please however name them e.g.
> float32_t, for consistency:
>
>   int, short, long -> intN_t
>   float, double -> floatN_t
 
I was unsure of which way to go on that; I thought if I used the whole word "float" the longer types would start to get a bit ungainly. I also think that floatN_t might be somewhat harder to visually distinguish from plain float than intN_t is from int. What do you think?

static inline constexpr const volatile int _t_n;
static inline constexpr const volatile int8_t n;
static inline constexpr const volatile float _t_f;
static inline constexpr const volatile float8_t f;

Agreed that floatN_t would be more consistent — s/f(\d)/float\1/g on the OP unless others think differently.

> Given the int types are in cstdint, I guess the float types should be in
> cstdfloat. This suggests however that you should talk to WG14 first...
 
Do they have a public mailing list? I wasn't able to find one. In any case, since this is far from a formal proposal, any feedback I could get here or there is useful.

> Certainly the _fast variations make sense. I'm not sure how practical
> the others are. Possibly _ieee754 makes sense to the extent that these
> can either exist if floating point uses IEEE 754, or not exist
> otherwise. Do non-IEEE compilers have a way of declaring an IEEE float
> even if plain `float` is something else?

I'm not sure. I know IEEE 754 is possible to implement in software and there are compilers that do so for platforms that don't support it in hardware. I refer to it as IEC 559 here because that's how it's referred to in the standard.
 
> For similar reasons, I'm not sure if _trap, etc. variants are possible.

IEEE 754 specifies that those five floating-point exceptions be recorded in the floating-point status word, so those variants should be trivial to implement (possible at the library level with C++, would require compiler support with C). But if not they could also be made optional.
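
To make the library-level idea concrete, here is a rough sketch built on the standard <cfenv> status-flag functions. The function checked_div is a hypothetical name used only for illustration. The sketch deliberately ignores the inexact flag, which almost every operation raises, and it uses volatile objects to keep the compiler from folding or moving the division across the flag calls, because support for #pragma STDC FENV_ACCESS varies between compilers.

```cpp
#include <cfenv>
#include <stdexcept>

// Hypothetical helper: divide x by y and throw if the operation raised a
// floating-point exception other than "inexact".
double checked_div(double x, double y) {
    // volatile forces the loads, the division and the store to happen at run
    // time, between the two flag calls.
    volatile double a = x;
    volatile double b = y;

    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double result = a / b;

    if (std::fetestexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW | FE_UNDERFLOW)) {
        throw std::domain_error("floating-point exception raised by division");
    }
    return result;
}
```

A real _throw type would presumably wrap every arithmetic operator in this way, at an obvious cost in speed; this covers a single operation only.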
 
> If you're going to have _fast, you should probably consider having
> _least, also, especially if you have float80 variants, since a platform
> might have float_least80_t -> float128_t but not have float80_t.

Good idea.

- Ray 

Nicol Bolas

Feb 23, 2017, 6:35:09 PM
to ISO C++ Standard - Future Proposals
One of the problems you're going to run into here is that the specialized integer types were inherited from C99. They were based on existing practice among a number of compilers. GCC, MSVC, and others had built-in sized integer types. And these new integers fit into a standardization paradigm that included "extended integers": implementation-defined integer types that were not the standard integer types. The standard explicitly allowed implementations to add them.

What you're doing here is creating something from whole cloth, with no existing practice among a number of compilers. Nobody has `float32_t` typedefs out there or something similar. There is no standard framework for implementation-defined floating point types.

Also, the only standard-defined integer type whose implementation is fixed to a specific set of rules is the `(u)intXX_t` series of integers; they are required to be two's complement. And they're also optional. If a platform doesn't support two's complement, they just don't implement them.

What you seem to want is for compilers to implement a bunch of floating point types with required behavior. To force compilers to implement a bunch of different potential floating-point types.

Jonathan Müller

Feb 24, 2017, 3:43:22 AM
to std-pr...@isocpp.org
On 23.02.2017 20:50, Matthew Woehlke wrote:
>
> Certainly the _fast variations make sense.

Note that they are already available under a different name:
std::float_t is an alias for a floating-point type at least as wide as float,
std::double_t is an alias for a floating-point type at least as wide as double.
Each is meant to be the most efficient type that satisfies that constraint.
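
For reference, a small sketch of how these aliases can be inspected. Both come from <cmath>, and the FLT_EVAL_METHOD macro from <cfloat> indicates which types they name (0: float and double, 1: both double, 2: both long double, other values: implementation-defined). The constant eval_method is just an illustrative name.

```cpp
#include <cfloat>
#include <cmath>
#include <limits>

// The aliases are guaranteed to be at least as wide as float and double.
static_assert(sizeof(std::float_t) >= sizeof(float),
              "float_t must be at least as wide as float");
static_assert(std::numeric_limits<std::double_t>::digits
                  >= std::numeric_limits<double>::digits,
              "double_t must be at least as precise as double");

// Illustrative constant: the evaluation method the implementation reports.
constexpr int eval_method = FLT_EVAL_METHOD;
```

Unlike the proposed _fast types, these aliases do not fix a width or relax any semantics; they only name whatever type the implementation evaluates float and double expressions in.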

Marc

Feb 25, 2017, 12:44:49 PM
to ISO C++ Standard - Future Proposals
 
GCC 7 (in C mode) supports _Float32 and related types, based on ISO/IEC TS 18661-3:2015. You may want to discuss the relation to that TS in your proposal.

https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01290.html