Math and Math Optimization


GGulati

Dec 6, 2011, 10:08:35 PM
to MonoXNA
The Matrix class should be heavily micro-optimized, including manually
inlining the static method bodies into the instance methods (i.e.,
myMatrix.Multiply should NOT call Matrix.Multiply, and op_Multiply
should call neither of the two). This also means that Matrix.Divide
should be manually inlined, because the generated IL for the
Matrix.Multiply method is over 32 bytes - that is, over the maximum
size of IL code that the JITter will inline. Essentially, don't
optimize for the code cache, optimize for runtime speed (since nothing
in the Matrix class should have branching).
This should not break anything and should instead make everything a
few percentage points faster. Profiling on several computers would
show whether or not the intended optimization actually worked.
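
To make the manual inlining concrete, here is a minimal sketch. It uses a hypothetical 2x2 struct (Matrix2, not a type from the patch or from XNA) so the arithmetic stays short. The operator repeats the multiply arithmetic instead of forwarding to the static method, which is the pattern described above; whether this beats a forwarding call on a given JIT is something only profiling can confirm.

```csharp
// Hypothetical 2x2 matrix standing in for the real 4x4 Matrix (assumption for brevity).
public struct Matrix2
{
    public float M11, M12, M21, M22;

    public Matrix2(float m11, float m12, float m21, float m22)
    {
        M11 = m11; M12 = m12; M21 = m21; M22 = m22;
    }

    // Static version: the arithmetic is written out in full.
    public static Matrix2 Multiply(Matrix2 a, Matrix2 b)
    {
        Matrix2 r;
        r.M11 = a.M11 * b.M11 + a.M12 * b.M21;
        r.M12 = a.M11 * b.M12 + a.M12 * b.M22;
        r.M21 = a.M21 * b.M11 + a.M22 * b.M21;
        r.M22 = a.M21 * b.M12 + a.M22 * b.M22;
        return r;
    }

    // Operator version: the same arithmetic is duplicated by hand rather than
    // calling Multiply, so nothing depends on the JIT deciding to inline the call.
    public static Matrix2 operator *(Matrix2 a, Matrix2 b)
    {
        Matrix2 r;
        r.M11 = a.M11 * b.M11 + a.M12 * b.M21;
        r.M12 = a.M11 * b.M12 + a.M12 * b.M22;
        r.M21 = a.M21 * b.M11 + a.M22 * b.M21;
        r.M22 = a.M21 * b.M12 + a.M22 * b.M22;
        return r;
    }
}
```

The cost is duplicated code that has to be kept in sync by hand, so unit tests that compare the two paths are worth having.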

Changes:
- Optimized basic Matrix math (add, subtract, scalar multiply, matrix
multiply, scalar divide, inequality comparison)
- Optimized basic Vector2/3/4 math (cross product, reflect, distance
and length calculation, inequality comparison)
- Known failure: the Vector3 Clamp unit test fails at test #6, since #6
is a "wicked" clamp - the min and max arguments are passed backwards.
Compensating for this case requires an extra "if" check in the
MathHelper.Clamp function, which would significantly slow it down.
- Optimized MathHelper functions
- Eliminated implicit casts to double in CatmullRom interpolation
(basically, it said 0.5 instead of 0.5f and so on)
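
Two of the items above can be sketched in a few lines. This is an illustration under assumptions, not the code from the patch: MathSketch is a hypothetical class name, CatmullRom uses the standard Catmull-Rom polynomial with float literals throughout, and Clamp is a plain two-comparison version with no check for swapped bounds.

```csharp
public static class MathSketch
{
    // Every literal carries the f suffix, so no operand is promoted to double.
    // Writing 0.5 instead of 0.5f would make the expression double and
    // require a cast back to float.
    public static float CatmullRom(float v1, float v2, float v3, float v4, float t)
    {
        float t2 = t * t;
        float t3 = t2 * t;
        return 0.5f * (2.0f * v2
            + (v3 - v1) * t
            + (2.0f * v1 - 5.0f * v2 + 4.0f * v3 - v4) * t2
            + (3.0f * v2 - v1 - 3.0f * v3 + v4) * t3);
    }

    // Assumes min <= max. With the bounds swapped, the result depends only on
    // the order of the comparisons; no extra "if" corrects for it.
    public static float Clamp(float value, float min, float max)
    {
        return value < min ? min : (value > max ? max : value);
    }
}
```

With this particular comparison order, a call with swapped bounds such as Clamp(5f, 10f, 0f) returns the min argument, because the first comparison already succeeds.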


Recommended Changes:
- MathHelper.ToRadians and MathHelper.ToDegrees: Lose the double
precision. Marginal benefit.
- MathHelper.WrapAngle: I suppose that this is supposed to give you an
angle between 0 and TwoPi radians. Why not use a modulo and then an if
check? You lose the expensive casting and a branch. Mind you, modulo
is not a terribly fast instruction, but it's faster than calling a
method with two casts.
    angle = angle % TwoPi;
    return angle < 0.0f ? angle + TwoPi : angle;
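
Wrapped up as a complete function, the suggestion might look like the sketch below. AngleSketch is a hypothetical class name, and the 0 to TwoPi target range is the assumption stated above. XNA's own documentation describes WrapAngle as reducing the angle to between -Pi and Pi, so matching that behavior would need a different final adjustment.

```csharp
public static class AngleSketch
{
    public const float TwoPi = (float)(System.Math.PI * 2.0);

    // Wraps an angle in radians into the range 0 to TwoPi.
    // In C#, the sign of a float remainder follows the dividend, so a
    // negative input leaves a negative remainder that needs one correction.
    public static float WrapAngle(float angle)
    {
        angle = angle % TwoPi;
        return angle < 0.0f ? angle + TwoPi : angle;
    }
}
```

One edge case to keep in mind: for a very small negative input, angle + TwoPi can round up to exactly TwoPi in single precision, so the upper bound is not strictly exclusive.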

Suggestions for the Future:
- Implement your own Math.Sqrt function that uses floats instead of
doubles, to save on the expensive casts. Add it to MathHelper or
wherever.
  - You'd have to profile this change and see if it makes a difference,
  since System.Math.Sqrt is actually just an extern.
- For Vector2/3/4.Transform(VectorN[] sourceArray, ref Matrix matrix,
VectorN[] destinationArray): Am I missing something here? It
seems obvious to just loop and transform each vector.
  - Also, inline the Transform overload calls as soon as you finalize
  the implementations.
- For Vector2/3/4.Transform(VectorN value, Quaternion rotation): Why
not use the Matrix.CreateFromQuaternion method to create a Matrix and
then call VectorN.Transform(VectorN, Matrix)? It's slow, but at
least it runs.
- Various other functions can be altered so they don't create a new
VectorN or Matrix and instead reuse the arguments (which
are copied onto the stack and so are safe to reuse).
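
The "just loop and transform each vector" idea for the array overloads could look roughly like this. Vec2 and Affine2 are hypothetical stand-ins (not the MonoXNA types) so that the sketch compiles on its own; Affine2 keeps only the matrix elements that a 2D position transform uses, under the usual row-vector convention with the translation in M41 and M42.

```csharp
using System;

// Hypothetical minimal types so the example is self-contained.
public struct Vec2
{
    public float X, Y;
    public Vec2(float x, float y) { X = x; Y = y; }
}

public struct Affine2
{
    public float M11, M12, M21, M22, M41, M42;
}

public static class TransformSketch
{
    // Transforms every element of sourceArray and writes the results to
    // destinationArray, with the per-vector math inlined in the loop body.
    public static void Transform(Vec2[] sourceArray, ref Affine2 matrix, Vec2[] destinationArray)
    {
        if (sourceArray == null) throw new ArgumentNullException("sourceArray");
        if (destinationArray == null) throw new ArgumentNullException("destinationArray");
        if (destinationArray.Length < sourceArray.Length)
            throw new ArgumentException("destinationArray is too small.");

        for (int i = 0; i < sourceArray.Length; i++)
        {
            // Copy the element first so the same array can be passed as both
            // source and destination.
            Vec2 v = sourceArray[i];
            destinationArray[i].X = v.X * matrix.M11 + v.Y * matrix.M21 + matrix.M41;
            destinationArray[i].Y = v.X * matrix.M12 + v.Y * matrix.M22 + matrix.M42;
        }
    }
}
```

Passing the matrix by ref avoids copying it on every call, and writing into a caller-supplied array avoids any allocation inside the loop.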

Patch file:
http://monoxna.googlecode.com/issues/attachment?aid=270020000&name=mathOptimizationPass1.patch&token=yLU55LG_GTMsVAGiZ-n3js3V6FA%3A1323226232601
