Hi Everybody,
I'm experimenting with an rv32iamf core and I'm not sure whether the behaviour below is by design. In essence, I'm fine with the lower precision of single-precision floats and want to avoid the software implementation of double precision.
The whole codebase uses floats, but when I call functions such as powf or sqrtf, they still convert the float to double internally and work on the double for a few moments before returning to float again, which I don't want.
It then uses the software implementation to work on the double.
I think it's using the correct multilib (with the correct ABI):
I forgot to add that powf often calls __extendsfdf2, the float-to-double converter, which is what would be used with an -msoft-float approach (which I am trying to avoid):
https://gcc.gnu.org/onlinedocs/gccint/Soft-float-library-routines.html
Of course, other soft-float add/sub/div implementations are then pulled in as well.
Regards,
Anton
From: Anton Krug [mailto:anton...@microsemi.com]
Sent: Friday, December 8, 2017 3:08 PM
To: RISC-V SW Dev <sw-...@groups.riscv.org>
Subject: [sw-dev] Experimenting with a floating point
--
You received this message because you are subscribed to the Google Groups "RISC-V SW Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to
sw-dev+un...@groups.riscv.org.
To post to this group, send email to sw-...@groups.riscv.org.
Visit this group at
https://groups.google.com/a/groups.riscv.org/group/sw-dev/.
To view this discussion on the web visit
https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/00396c6ea14a4be39d39d5d43306c074%40microsemi.com.
Thank you, Jim, for the detailed explanation. Yes, in the case of an exception, performance is the least of my worries. And yes, it's newlib.
I'm worried that something else is happening as well: I have seen that the software addition, subtraction and division routines were included:
__divdf3
__adddf3
__subdf3
Are these required for the exceptions, or is something else happening?
I tried the following:
_LIB_VERSION_TYPE _LIB_VERSION = _POSIX_; // and tried _IEEE_
But the divdf3 calls are still present. If I understand correctly, divdf3 is pulled in at compile time, while the global variable only affects behaviour at run time.
Anton
On Fri, Dec 8, 2017 at 8:32 PM, Anton Krug <anton...@microsemi.com> wrote:
> I'm worried if there is something else happening as well, I have seen that
> the software add, subtraction and division were included:
>
> __divdf3
For pow (x, y), if you have a negative x, a non-integral y, and the
result is a NaN, then you get a domain error. We need a double NaN
for the struct exception retval field, which is generated by doing a
double 0.0/0.0 operation. There may be a better way to do this for a
single-float-only target, but this is exception code so probably not
critical.
> __adddf3
> __subdf3
These are both called from rint. For pow (x, y), if you have a
negative x, and the result is an infinity, then we need a double
HUGE_VAL for the struct exception retval field, which is provided by a
macro that calls a compiler builtin, except that this needs to be
positive if y*0.5 is an integer, and negative if y*0.5 is a
non-integer, so rint is called to check to see if y*0.5 is an integral
value or not. But we do have an rintf function, so this one is
fixable by calling rintf instead of rint. It appears that powf is the
only place where this mistake is made with rint/rintf. The other
float functions that use rint appear to be correctly calling rintf.
Fixing this gets rid of the adddf3 and subdf3 calls. This is
exception handling code though, so probably not performance critical.
And that reminds me that we could replace the double 0.0/0.0 with
another compiler builtin to generate a NaN. Fixing this gets rid of
the divdf3 calls.
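As a rough illustration of the two ways to obtain a double NaN, the sketch below contrasts a run-time division with the GCC/Clang builtin __builtin_nan. The function names are hypothetical, and the volatile is only an assumption used here to make sure the division is not folded away; on a target without hardware double support the division would typically become a call to a soft-float routine such as __divdf3.

```c
#include <math.h>

/* Run-time 0.0/0.0: the divide is really executed, so a target without
   a double-precision FPU needs a library routine for it. */
static double nan_by_division(void)
{
    volatile double zero = 0.0;  /* volatile: keep the division at run time */
    return zero / zero;
}

/* Compiler builtin (GCC and Clang): the NaN is a constant, so no
   double-precision arithmetic is performed at run time. */
static double nan_by_builtin(void)
{
    return __builtin_nan("");
}
```

Both functions return a quiet NaN; only the first one needs a double division to get there.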
However, we will still be left with the extendsfdf2 calls, because of
the struct exception fields. That is much more work to get rid of.
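The reason the conversions remain can be sketched as follows. The struct below is a hypothetical stand-in that follows the traditional SVID layout with double fields; it is not copied from newlib's headers, and the type code is a placeholder.

```c
/* Hypothetical mirror of an SVID-style exception record; the point is
   only that the argument and result fields are doubles. */
struct exception_sketch {
    int type;
    const char *name;
    double arg1;
    double arg2;
    double retval;
};

/* Filling the record from float arguments forces float-to-double
   conversions, which is __extendsfdf2 when double is done in software. */
static void fill_exception(struct exception_sketch *exc, float x, float y)
{
    exc->type = 1;            /* placeholder value, for illustration only */
    exc->name = "powf";
    exc->arg1 = (double)x;    /* widening conversion */
    exc->arg2 = (double)y;    /* widening conversion */
    exc->retval = 0.0;        /* constant store, no arithmetic involved */
}
```

As long as the record keeps double fields, every float wrapper that fills it has to widen its arguments, whatever else is changed in the error path.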
> I tried the following:
> _LIB_VERSION_TYPE _LIB_VERSION = _POSIX_; // and tried _IEEE_
>
> But still the divdf3 are still present, If I understand it correctly the
> divdf3 is added there on compile time, while the global variable affects
> functionality at runtime only.
This is run-time only, so yes, the divdf3 and friends will still be
there in the binary.
Since I was looking at disassembled code, I noticed that matherr is
just two instructions, one to load 0 into the return value register,
and one to return. So the default version does nothing. This is just
a hook provided so that programmers can override it if they want to do
something more interesting on error, but it is unlikely that many
newlib users are defining this hook. It looks like we are doing a lot
of work for very little benefit. If someone cared enough about this,
they could add a configure option to newlib to disable the matherr
support. This would get rid of the extendsfdf2 calls, at the expense
of losing some SVID/XOPEN compatibility. This would be a moderate
size project though, as there are an awful lot of matherr calls in
newlib, and every single one would have to be tested if someone
changed this. It might be reasonable to try adding ISO C 99 fenv
support as a replacement, but that makes it an even bigger project.
Jim
It's a problem in a resource-constrained (e.g. memory-constrained) embedded environment: the program has to carry the double overhead even though it only does float arithmetic, which arguably bloats the program size unnecessarily.
Thank you very much, that was pretty fast.
I will try to do a build today.
I was applying the patch by hand: