Hey all, I'm working on a physics engine for an Android game, and I want the highest possible frame rate out of it. To that end I switched from OpenGL's matrix multiplication function to EJML for transforming the vertices of my objects' collision frames. These are not the same as the OpenGL shader vertices, so I'm not computing them on the GPU; they are points along the outline of every object in my game that are used for collision detection, and they have to be updated by matrix multiplication every frame.
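To make the per-frame update concrete, here is a minimal plain-Java sketch of transforming outline points by a 4x4 matrix. The class name CollisionFrameTransform, the row-major matrix layout, and the packing of each point as four doubles (x, y, z, w) are assumptions made for illustration. None of this is EJML or OpenGL API, and android.opengl.Matrix in particular uses column-major arrays.

```java
// Hypothetical helper, not part of EJML or OpenGL: applies a row-major 4x4
// transform to an array of homogeneous points stored as x,y,z,w quadruples.
final class CollisionFrameTransform {

    /**
     * Computes out = M * p for every point p, where m holds the 4x4 matrix in
     * row-major order and points/out hold n points of 4 doubles each.
     */
    static void transform(double[] m, double[] points, double[] out) {
        if (m.length != 16 || points.length % 4 != 0 || out.length != points.length) {
            throw new IllegalArgumentException("expected a 4x4 matrix and matching point arrays");
        }
        for (int i = 0; i < points.length; i += 4) {
            double x = points[i], y = points[i + 1], z = points[i + 2], w = points[i + 3];
            out[i]     = m[0]  * x + m[1]  * y + m[2]  * z + m[3]  * w;
            out[i + 1] = m[4]  * x + m[5]  * y + m[6]  * z + m[7]  * w;
            out[i + 2] = m[8]  * x + m[9]  * y + m[10] * z + m[11] * w;
            out[i + 3] = m[12] * x + m[13] * y + m[14] * z + m[15] * w;
        }
    }

    public static void main(String[] args) {
        // Translation by (2, 3, 0): in this row-major layout the last column holds the offset.
        double[] m = {
            1, 0, 0, 2,
            0, 1, 0, 3,
            0, 0, 1, 0,
            0, 0, 0, 1
        };
        double[] points = {1, 1, 0, 1,   4, 5, 0, 1};
        double[] out = new double[points.length];
        transform(m, points, out);
        System.out.println(java.util.Arrays.toString(out));
    }
}
```

The output array is passed in rather than allocated inside transform, so a game loop could reuse the same buffer every frame.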
Anyway, I was running some benchmarks to see how much of a speedup I could get with EJML (all I'm using right now is MatrixMatrixMult.mult_small), and I noticed that the first few matrix multiply calls are quite slow, although still 5x faster than my OpenGL multiplyMM function. On later calls to the same multiplication function I get an almost 10x speedup. Here is an example of running the same multiplication 10 times in a loop (time is in ms, for a 100000x4 matrix multiplied by a 4x4 matrix):
time = 23
time = 4
time = 4
time = 3
time = 3
time = 3
time = 3
time = 3
time = 3
time = 3
The problem is that when I put this MatrixMatrixMult.mult_small call into my game loop, I only get the performance of the first iteration (time = 23 ms), presumably due to CPU caching.
So I was wondering: is it possible to somehow get the faster multiplication on the first iteration? Or is this simply not possible, and I should break up my multiplications in some other way to take advantage of caching? It would be nice to have the faster time (3 ms), because that would let me run almost 10x as many colliding objects in my engine.
Here is the code I'm using to benchmark:
import org.ejml.data.DenseMatrix64F;
import org.ejml.alg.dense.mult.*;

public class MatrixTest {
    public static void main(String[] args) {
        // 4x4 identity transform, given in row-major order.
        DenseMatrix64F A = new DenseMatrix64F(4, 4, true, 1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1);

        // 100000 points, one per row, filled with random data.
        int length = 100000;
        DenseMatrix64F B = new DenseMatrix64F(length, 4);
        double[] data = new double[length * 4];
        for (int i = 0; i < data.length; i++) {
            data[i] = Math.random();
        }
        B.setData(data);

        for (int i = 0; i < 10; i++) {
            DenseMatrix64F C = new DenseMatrix64F(length, 4);
            long startTime = System.nanoTime();
            // mult_small(a, b, c) computes c = a * b, so B and A are the
            // inputs and C receives the result.
            MatrixMatrixMult.mult_small(B, A, C);
            long endTime = System.nanoTime();
            long duration = (endTime - startTime);
            System.out.println("time = " + duration / 1000000);
            //double[] d = new double[10000000] ;
            //for(int j=0;j<d.length;j++)
            //    d[j] +=i+j ;
            //System.out.println(d[100000]) ;
        }
        /*
        A.print();
        System.out.println();
        A.print("%e");
        System.out.println();
        A.print("%10.2f");
        */
    }
}
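
One way to check whether the slow first call is really a caching effect would be to run a number of untimed warm-up multiplications before starting the clock. If the first timed call is then already fast, the cost more likely comes from JVM warm-up (JIT compilation, class loading) than from the CPU cache. The sketch below is only an illustration under assumptions: plain arrays stand in for the EJML matrices so it has no dependencies, the names WarmupBenchmark, multiply and timeOnce are made up, and the warm-up count of 20 is arbitrary. Behaviour on Android's runtime may differ from a desktop JVM.

```java
// Sketch only: plain arrays stand in for the EJML matrices so the example has
// no dependencies. WarmupBenchmark, multiply and timeOnce are hypothetical names.
final class WarmupBenchmark {

    /** out (n x 4) = b (n x 4) times a (4 x 4), all stored row-major. */
    static void multiply(double[] b, double[] a, double[] out) {
        for (int i = 0; i < b.length; i += 4) {
            for (int j = 0; j < 4; j++) {
                out[i + j] = b[i] * a[j] + b[i + 1] * a[4 + j]
                           + b[i + 2] * a[8 + j] + b[i + 3] * a[12 + j];
            }
        }
    }

    /** Runs one multiplication and returns the elapsed time in nanoseconds. */
    static long timeOnce(double[] b, double[] a, double[] out) {
        long start = System.nanoTime();
        multiply(b, a, out);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int rows = 100000;
        double[] a = {1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1};
        double[] b = new double[rows * 4];
        for (int i = 0; i < b.length; i++) {
            b[i] = Math.random();
        }
        double[] out = new double[rows * 4]; // allocated once and reused

        // Untimed warm-up calls give the JIT a chance to compile multiply().
        for (int i = 0; i < 20; i++) {
            multiply(b, a, out);
        }

        // Timed calls, printed with fractional milliseconds.
        for (int i = 0; i < 10; i++) {
            long nanos = timeOnce(b, a, out);
            System.out.println("time = " + nanos / 1_000_000.0 + " ms");
        }
    }
}
```

In a game, the equivalent would be to run the multiplication a few times during loading, before the first frame that matters, and to reuse the output matrix instead of allocating a new one per call.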