[LLVMdev] Optimize away sqrt in simple cases?

Erkki Lindpere

unread,

Apr 23, 2013, 4:12:58 PM4/23/13

to llv...@cs.uiuc.edu

hi!

I'm using LLVM 3.2. Does it optimize away llvm.pow(llvm.sqrt(x), 2) to `x` with any settings? I tried with llc -O3, but that didn't do it.

Would be nice to write |v|² in my language ('v' being a 2D vector say and |...| and ...² being two separate infix "operators") -- when I could compare squares of lengths as well as lengths, and know that the sqrt is optimized away.

Erkki

Owen Anderson

unread,

Apr 23, 2013, 4:26:19 PM4/23/13

to Erkki Lindpere, llv...@cs.uiuc.edu

That's a pretty seriously unsafe floating point optimization. It could be done in fast-math mode, but I doubt we currently do it.

--Owen

> _______________________________________________
> LLVM Developers mailing list
> LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Duncan Sands

unread,

Apr 23, 2013, 4:29:22 PM4/23/13

to llv...@cs.uiuc.edu

Hi Erkki,

On 23/04/13 22:12, Erkki Lindpere wrote:
> hi!
>
> I'm using LLVM 3.2. Does it optimize away llvm.pow(llvm.sqrt(x), 2) to `x` with
> any settings? I tried with llc -O3, but that didn't do it.

the most powerful optimizations are done by the "opt" tool, not llc. That said,
I don't think any pass does this. It would at least require "fast math".

Ciao, Duncan.

>
> Would be nice to write |v|² in my language ('v' being a 2D vector say and |...|
> and ...² being two separate infix "operators") -- when I could compare squares
> of lengths as well as lengths, and know that the sqrt is optimized away.
>
> Erkki
>
>

Christoph Grenz

unread,

Apr 23, 2013, 6:33:46 PM4/23/13

to llv...@cs.uiuc.edu

Hello,

Am Dienstag, 23. April 2013, 13:26:19 schrieb Owen Anderson:
> That's a pretty seriously unsafe floating point optimization. It could be
> done in fast-math mode, but I doubt we currently do it.

I just saw this thread and wonder why it's "seriously" unsafe? I see only one
problematic corner case.

For x >= 0.0 the result cannot lose precision.
For x = NaN the result is the same. sqrt(NaN) = NaN; pow(NaN,2) = NaN
For x = +inf the result is the same. sqrt(inf) = inf; pow(inf,2) = inf
For negative numbers llvm.sqrt(x) by definition has undefined behavior and so
returning x from llvm.pow(llvm.sqrt(x),2) is acceptable.
The same should be true for -inf.

Only the result for -0.0 changes from 0.0 to -0.0.

Is there something I overlooked?

> --Owen

Christoph

Owen Anderson

unread,

Apr 23, 2013, 7:01:07 PM4/23/13

to Christoph Grenz, llv...@cs.uiuc.edu

On Apr 23, 2013, at 3:33 PM, Christoph Grenz <christo...@grenz-bonn.de> wrote:

> Hello,
>
> Am Dienstag, 23. April 2013, 13:26:19 schrieb Owen Anderson:
>> That's a pretty seriously unsafe floating point optimization. It could be
>> done in fast-math mode, but I doubt we currently do it.
>
> I just saw this thread and wonder why it's "seriously" unsafe? I see only one
> problematic corner case.
>
> For x >= 0.0 the result cannot lose precision.

This is not true. The mathematically correct result for sqrt might not be a representable value in floating point, so rounding may occur between the two steps. In that case, pow2(sqrt(x)) != x.

--Owen

Stephen Lin

unread,

Apr 23, 2013, 10:15:45 PM4/23/13

to Owen Anderson, llvmdev

> This is not true. The mathematically correct result for sqrt might not be a representable value in floating point, so rounding may occur between the two steps. In that case, pow2(sqrt(x)) != x.
>
> --Owen

I think what Christoph is saying is that x will always be at least as
accurate as pow2(sqrt(x)), so it's only unsafe in so far as one's code
is actually depending on an imprecise result.

It's not an "as-if" optimization though, so it definitely would
require "fast math" or something similar; however, it seems like it
would be on the more benign side among optimizations that apply
mathematical identities that are not necessarily identities on floats,
as far as I can tell...

-Stephen

Owen Anderson

unread,

Apr 24, 2013, 1:50:51 AM4/24/13

to Stephen Lin, llvmdev

On Apr 23, 2013, at 7:15 PM, Stephen Lin <sw...@post.harvard.edu> wrote:

>> This is not true. The mathematically correct result for sqrt might not be a representable value in floating point, so rounding may occur between the two steps. In that case, pow2(sqrt(x)) != x.
>

> I think what Christoph is saying is that x will always be at least as
> accurate as pow2(sqrt(x)), so it's only unsafe in so far as one's code
> is actually depending on an imprecise result.

Giving more-than-expected precision can be just as bad for the user as less. It tends to come up in situations where the optimization would break some symmetry, the same way that aggressively forming FMAs can break user code. Consider this example:

float foo(float a, float b) {
return pow2(a) - pow2(sqrt(b));
}

float bar(float c) {
return foo(sqrt(b), b);
}

The author *should* be able to assume that for any positive c, the only possible output values are Inf and zero. However, if we apply the pow2/sqrt peephole, suddenly non-zero finite outputs are possible. The equivalent example for FMA formation is x*x - x*x. If we convert that to "fma x, x, (-x*x)", you can get a non-zero finite result.

It boils down to the fact that giving excess precision in some-places-but-not-others can lead to bad behavior.

--Owen

Stephen Lin

unread,

Apr 24, 2013, 2:53:40 AM4/24/13

to Owen Anderson, llvmdev

On Wed, Apr 24, 2013 at 1:50 AM, Owen Anderson <resi...@mac.com> wrote:
>
> On Apr 23, 2013, at 7:15 PM, Stephen Lin <sw...@post.harvard.edu> wrote:
>
>>> This is not true. The mathematically correct result for sqrt might not be a representable value in floating point, so rounding may occur between the two steps. In that case, pow2(sqrt(x)) != x.
>>
>> I think what Christoph is saying is that x will always be at least as
>> accurate as pow2(sqrt(x)), so it's only unsafe in so far as one's code
>> is actually depending on an imprecise result.
>
> Giving more-than-expected precision can be just as bad for the user as less. It tends to come up in situations where the optimization would break some symmetry, the same way that aggressively forming FMAs can break user code. Consider this example:
>
> float foo(float a, float b) {
> return pow2(a) - pow2(sqrt(b));
> }
>
> float bar(float c) {
> return foo(sqrt(b), b);
> }
>
> The author *should* be able to assume that for any positive c, the only possible output values are Inf and zero. However, if we apply the pow2/sqrt peephole, suddenly non-zero finite outputs are possible. The equivalent example for FMA formation is x*x - x*x. If we convert that to "fma x, x, (-x*x)", you can get a non-zero finite result.
>
> It boils down to the fact that giving excess precision in some-places-but-not-others can lead to bad behavior.
>
> --Owen

OK, that makes sense--just clarifying what I thought Christoph meant.
In any case, maybe it ought to be at least an option for "fast math"
or in some other setting where mathematical identities that may gain
or lose precision are taken advantage of. (I know GCC's version of
"fast math" allows lots of things that are potentially more unsafe
than this; I'm not familiar of the state of clang's equivalent).

Stephen

Owen Anderson

unread,

Apr 24, 2013, 3:41:17 AM4/24/13

to Stephen Lin, llvmdev List

On Apr 24, 2013, at 12:36 AM, Stephen Lin <sw...@post.harvard.edu> wrote:

By the way, I definitely believe you that this isn't a 100% safe
optimization, but I'm curious, is this really guaranteed here that c
will not be a non-zero finite value? I was under the impression that
FPU state could lead to slightly different results in these kinds of
cases, so one should almost never rely on the result of a computation
being exactly zero (outside of some simple primitive cases).

That's a common misconception. Floating point computations are deterministic and a function only of their inputs (and a couple of modal switches like rounding mode).

--Owen

Reed Kotler

unread,

Apr 24, 2013, 4:00:51 AM4/24/13

to LLVM Developers Mailing List

On 04/23/2013 10:50 PM, Owen Anderson wrote:
>
> On Apr 23, 2013, at 7:15 PM, Stephen Lin <sw...@post.harvard.edu> wrote:
>
>>> This is not true. The mathematically correct result for sqrt might not be a representable value in floating point, so rounding may occur between the two steps. In that case, pow2(sqrt(x)) != x.
>>
>> I think what Christoph is saying is that x will always be at least as
>> accurate as pow2(sqrt(x)), so it's only unsafe in so far as one's code
>> is actually depending on an imprecise result.
>
> Giving more-than-expected precision can be just as bad for the user as less. It tends to come up in situations where the optimization would break some symmetry, the same way that aggressively forming FMAs can break user code. Consider this example:
>
> float foo(float a, float b) {
> return pow2(a) - pow2(sqrt(b));
> }
>
> float bar(float c) {
> return foo(sqrt(b), b);
> }
>
> The author *should* be able to assume that for any positive c, the only possible output values are Inf and zero. However, if we apply the pow2/sqrt peephole, suddenly non-zero finite outputs are possible. The equivalent example for FMA formation is x*x - x*x. If we convert that to "fma x, x, (-x*x)", you can get a non-zero finite result.
>
> It boils down to the fact that giving excess precision in some-places-but-not-others can lead to bad behavior.
>
> --Owen
>

Not sure exactly what the C/C++ standard says here but usually it's
allowed to give more precision.

It's also allowed for math to work as real math and not as limited
precision math.

If your code is requiring otherwise, then therein is the problem.

Can't arcsin(sqrt(2)/2) == pi/4 ?

I think so.

When you are debugging a calculator that's what you want.
People spend a lot of time trying to get the math library to give the
expected results when known equations combining various operations are
used. That is part of the sanity check of making sure that you have a
good math library implementation.

Scientists using the math library want the math to be as close to real
math as possible.

That was part of the point of the IEEE floating point standard.

sin (n * pi) == 0 etc.

Back in the day I knew W. Kahan (IEEE FP inventor) and I know that this
was his belief.

Renato Golin

unread,

Apr 24, 2013, 6:09:27 AM4/24/13

to Reed Kotler, LLVM Developers Mailing List

On 24 April 2013 09:00, Reed Kotler <rko...@mips.com> wrote:

When you are debugging a calculator that's what you want.

I thought that most calculators used fixed-point or arbitrary precision math...

Scientists using the math library want the math to be as close to real math as possible.

Scientists that don't understand FP well write bad scientific code. Computers can't do math, sorry.

Relying on a supposedly identical sequence of FP operations to generate the same output is one thing, relying on computers doing abstract math is a bit of a jump...

I know of a few examples where well written crude models easily outperform (in speed and quality) in-depth badly written models.

cheers,

--renato

Christoph Grenz

unread,

Apr 25, 2013, 6:12:54 PM4/25/13

to llv...@cs.uiuc.edu

Am Dienstag, 23. April 2013, 22:50:51 schrieben Sie:
> [...]

> Giving more-than-expected precision can be just as bad for the user as less.
> It tends to come up in situations where the optimization would break some
> symmetry, the same way that aggressively forming FMAs can break user code.

> [...]

>
> It boils down to the fact that giving excess precision in
> some-places-but-not-others can lead to bad behavior.

Ok, I didn't think about excess precision resulting in problems.
Now it's clear to me that fast-math is neccessary for this optimization.

BTW: Is there a way to only get fast-math optimizations that don't change
behaviour on NaNs and Infs? I think it would be an interesting possibility to
allow arithmetic tranformations but only those preserving NaN and infinity
bahavior to catch errors/corner cases.

At least the pow(sqrt(x),2)=>x optimization falls into this category as I
wrote before.

> --Owen

Christoph

Warren Hunt

unread,

Apr 25, 2013, 6:22:52 PM4/25/13

to Christoph Grenz, llv...@cs.uiuc.edu

I'd like to second that it would be nice to have an option for fast-math optimizations that don't change NaN and inf. When I was working on rendering systems, optimizations that subtly changed precision were fine but we relied on inf behavior.

-Warren

Reply all

Reply to author

Forward