performance of math operations

Gerhard

Feb 9, 2000, 3:00:00 AM

I think I could find a way to normalize a vector v1 without the sqrt
calculation.
It might be possible to take an arbitrary vector v2 of length one (e.g.
(1,0,0)), compute the angle between v1 and v2 (which involves an arc cosine),
and rotate v2 by this angle (which involves sin and cos). The rotation axis
for this operation would have to be the normal of v1 and v2 (their cross
product), I think. I hope this normal doesn't have to be normalized,
otherwise I'd have a problem.
Ok, so we replaced the sqrt by an acos, a cos, a sin and some additional
mults and adds.

Does this make sense? How slow is the sqrt really?

Another question: I could in one of my calculations replace a divide by
three mults. Same thing, does it make sense, how slow is divide compared to
mult?

Thank you

Gerhard

Hans-Bernhard Broeker

Feb 9, 2000, 3:00:00 AM

Gerhard <ger...@postkasten.de> wrote:
> I think I could find a way to normalize a vector v1 without the sqrt
> calculation.

[...]

> Ok, so we replaced the sqrt by an acos, a cos, a sin and some additional
> mults and adds.

> Does this make sense?

No. Each of cos, sin and acos is usually even slower than sqrt.
sqrt() may not be terribly fast, sure, but there's not much you can do
about that, either.
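
As a side note on the proposed scheme, and not something stated in the thread: the angle would normally come from the dot product, and with v2 of unit length the standard identity is

```latex
\cos\theta
  = \frac{v_1 \cdot v_2}{\lVert v_1 \rVert \, \lVert v_2 \rVert}
  = \frac{v_1 \cdot v_2}{\lVert v_1 \rVert},
\qquad
\lVert v_1 \rVert = \sqrt{v_{1x}^2 + v_{1y}^2 + v_{1z}^2}
```

So the argument handed to acos already needs the length of v1, which is the very sqrt the scheme was meant to avoid.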

> How slow is the sqrt really?

> Another question: I could in one of my calculations replace a divide by
> three mults. Same thing, does it make sense, how slow is divide compared to
> mult?

Depends on the particular hardware. The only way to find out is to
time it, on your platform.
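
A minimal sketch of such a timing test, assuming a hosted C99 compiler with the standard clock() function. The names time_divides, time_multiplies and sink are made up for this illustration and are not from the thread. The volatile operands are there only to keep the compiler from folding the arithmetic away at build time.

```c
#include <time.h>

/* Hypothetical helpers for illustration only. */
static volatile float sink;   /* receives the result so the loop is not discarded */

/* CPU seconds spent on n float divides (plus loop and add overhead). */
double time_divides(long n)
{
    volatile float x = 3.0f, d = 7.0f;   /* volatile: reloaded every pass */
    float acc = 0.0f;
    clock_t start = clock();
    for (long i = 0; i < n; i++)
        acc += x / d;
    clock_t stop = clock();
    sink = acc;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}

/* CPU seconds spent on n float multiplies with the same overhead. */
double time_multiplies(long n)
{
    volatile float x = 3.0f, r = 1.0f / 7.0f;
    float acc = 0.0f;
    clock_t start = clock();
    for (long i = 0; i < n; i++)
        acc += x * r;
    clock_t stop = clock();
    sink = acc;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling both with the same large n (say, 100 million) and comparing the two results gives a rough ratio for one machine and one set of compiler flags. The loop and add overhead is shared, so the measured difference understates the gap between the raw instructions. A sqrt call could be timed the same way.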

--
Hans-Bernhard Broeker (bro...@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Jason Dorie

Feb 23, 2000, 3:00:00 AM

Gerhard <ger...@postkasten.de> wrote in message
news:38a16cfa$1...@news.ginko.net...

> I think I could find a way to normalize a vector v1 without the sqrt
> calculation.

There are a couple of ways you can speed this up. First, most people
normalizing a vector for the first time do this:

VectLen = sqrt(v.x*v.x + v.y*v.y + v.z*v.z);
v.x = v.x / VectLen;
v.y = v.y / VectLen;
v.z = v.z / VectLen;

Every processor I've ever seen is faster at doing multiplies than divides.
So instead of the above code which uses 3 divides, try this:

InverseVectLen = 1.0f / sqrt(v.x*v.x + v.y*v.y + v.z*v.z);
v.x = v.x * InverseVectLen;
v.y = v.y * InverseVectLen;
v.z = v.z * InverseVectLen;

This is a lot faster, but it can still be better. The 1.0f / sqrt()
calculation is the nasty one. Fortunately there are some really good
approximations out there. Assuming you're running on a machine which
supports the IEEE 754 floating-point standard (most common processors do,
e.g. Intel and Motorola), and you're using single-precision arithmetic
(32 bit, instead of 64 bit), try this:

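#include <stdint.h>
#include <string.h>

/* Approximates 1/sqrt(v) for positive, finite single-precision v. */
float InvSqrt(float v)
{
    float x2 = v * 0.5f;
    float y = v;
    int32_t i;                      /* exactly 32 bits; long is 64 bits on many platforms */

    memcpy(&i, &y, sizeof i);       /* reinterpret the float's bits as an integer */
    i = 0x5f3759df - (i >> 1);      /* bit-level first guess at 1/sqrt(v) */
    memcpy(&y, &i, sizeof y);       /* back to float */

    y = y * (1.5f - (x2 * y * y));  /* first Newton-Raphson refinement */
    y = y * (1.5f - (x2 * y * y));  /* second Newton-Raphson refinement */
    return y;
}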

Now, the normalize code looks like this:

InverseVectLen = InvSqrt(v.x*v.x + v.y*v.y + v.z*v.z);
v.x = v.x * InverseVectLen;
v.y = v.y * InverseVectLen;
v.z = v.z * InverseVectLen;
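
Put together as one compilable unit, the steps above might look like the sketch below. The Vec3 type, the NormalizeFast name and the zero-length guard are assumptions added for illustration and are not from the post. The bit reinterpretation uses memcpy on a 32-bit integer so that it does not depend on the width of long.

```c
#include <stdint.h>
#include <string.h>

typedef struct { float x, y, z; } Vec3;   /* hypothetical vector type */

/* Approximate 1/sqrt(v) for positive, finite v (two refinement steps). */
static float InvSqrt(float v)
{
    float x2 = v * 0.5f;
    float y = v;
    int32_t i;

    memcpy(&i, &y, sizeof i);        /* float bits -> integer */
    i = 0x5f3759df - (i >> 1);       /* first guess */
    memcpy(&y, &i, sizeof y);        /* integer bits -> float */

    y = y * (1.5f - (x2 * y * y));
    y = y * (1.5f - (x2 * y * y));
    return y;
}

/* Scales *v to approximately unit length.
   Returns 1 on success, 0 if the vector has zero length (left unchanged). */
int NormalizeFast(Vec3 *v)
{
    float lenSq = v->x * v->x + v->y * v->y + v->z * v->z;
    if (lenSq <= 0.0f)
        return 0;                    /* assumption: caller handles this case */

    float inv = InvSqrt(lenSq);
    v->x *= inv;
    v->y *= inv;
    v->z *= inv;
    return 1;
}
```

The guard is a design choice: without it a zero vector simply stays zero, which may or may not be what the caller expects.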

It runs really fast. ;-) The method is iterative: each y = y*(1.5f...) line
is one Newton-Raphson refinement of the initial guess. If it's not fast
enough, you can remove one or both of those lines, but the result will be
less accurate. If you want more accuracy, add more of them, but it will be
slower. Enjoy!
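
To see the speed/accuracy trade-off concretely, here is a small sketch with the number of refinement lines turned into a parameter. InvSqrtSteps and RelError are hypothetical names introduced for this illustration. The error is measured against inputs whose exact answer is known (for v = 4 the true value is 0.5), so no library sqrt is needed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical variant: same first guess, 'steps' refinement passes. */
float InvSqrtSteps(float v, int steps)
{
    float x2 = v * 0.5f;
    float y = v;
    int32_t i;

    memcpy(&i, &y, sizeof i);        /* float bits -> integer */
    i = 0x5f3759df - (i >> 1);       /* first guess */
    memcpy(&y, &i, sizeof y);        /* integer bits -> float */

    while (steps-- > 0)
        y = y * (1.5f - (x2 * y * y));   /* one Newton-Raphson pass */
    return y;
}

/* Relative error of 'approx' against a known exact value. */
float RelError(float approx, float exact)
{
    float e = (approx - exact) / exact;
    return e < 0.0f ? -e : e;
}
```

For v = 4, working the arithmetic by hand suggests the raw guess is off by roughly 3%, one pass brings that to roughly 0.2%, and two passes to a few parts per million. Treat those as ballpark figures for that one input rather than guaranteed bounds.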

Jason Dorie
djmid...@email.com