I'm so lost here. mapsum is about 20 times slower than a naive FOR loop (roughly 120 to 150 s versus 6 s). What am I doing wrong?
## TOY SETUP: (I tried variations on this, with similar results)
## x is a vector. energy is a scalar function of x. We wish to sum energy over several x's, collected as the columns of a matrix called X.
##
## We explore several ways:
## (a) Lucky break: exploiting that X can be directly multiplied as a matrix. (1.6s)
## (b) A FOR loop over every column x of X. Takes about 4 times longer. (6s)
## (c) mapsum: takes ANOTHER 20 times longer(!!????) (130s!)
## I would have thought that (c) would save significant time over (b). Instead it takes about 20 times longer than (b). Do you see what I am doing wrong below?
## (In each case, we get the same answer for energy.)
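## Why all three should agree (a sketch in plain notation, with x_i the i-th column of X):
##   y(x)      = tanh(w * tanh(w * tanh(w * tanh(w * tanh(w * x)))))
##   energy(x) = sum_k y_k(x)^2
##   Energy(X) = sum_{k,i} Y_{k,i}^2 = sum_{i=1..n} energy(x_i)
## because w * X multiplies each column of X separately and tanh acts elementwise,
## so column i of Y is exactly y(x_i).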
import casadi.*; ## assumes CasADi is already on the Octave path
tic;
m=601;
n = 599;
ssx=MX.sym('x',m,1); ## symbolic x.
ssX = MX.sym('X', m,n);
ssw = MX.sym('w', m,m);
## Define a somewhat complicated y. Define a Y for the entire X matrix as well.
ccy = 0+ssx;
ccY = 0+ssX;
for ii=1:5; ## five rounds of y <- tanh(w * y)
  ccy = ssw * ccy;
  ccy = tanh(ccy);
  ccY = ssw * ccY;
  ccY = tanh(ccY);
endfor
## actual values:
vvX = randn(m, n);
vvw = randn(m, m);
energy = sumsqr(ccy);
Energy = sumsqr(ccY);
fenergy = Function('fenergy', {ssx, ssw}, {energy});
fEnergy = Function('fEnergy', {ssX, ssw}, {Energy});
## Now, let's measure energy in several ways.
#################################
## (a) Direct matrix way, by definition.
tic;
vEnergy1 = full(fEnergy(vvX, vvw));
tocmy(" done Using matrix.", vEnergy1); ## 1.6s
#################################
## (b) A sum over individual energies:
tic;
vEnergy2=0;
for ii=1:n;
  venergy = full(fenergy(vvX(:,ii), vvw));
  vEnergy2 += venergy;
endfor
tocmy("done Using individual sums.", vEnergy2); ## 6s
#################################
## (c1) Using mapsum, but directly on values!
tic;
symbolicenergy3 = fenergy.mapsum({vvX, vvw}){1};
## The result is still an MX expression over constants. Wrap it in a Function to convert it to an actual number.
dummy = MX.sym('dummy');
fenergy3 = Function('dummy',{dummy},{symbolicenergy3});
venergy3a = full(fenergy3(0));
tocmy(" done Using mapsum on values.", venergy3a); ## 150s
## (c2) Using mapsum, but on ssX
tic;
symbolicenergy3 = fenergy.mapsum({ssX, ssw}){1};
fenergy3 = Function('f',{ssX,ssw},{symbolicenergy3});
venergy3 = full(fenergy3(vvX, vvw));
tocmy(" done Using mapsum on symbols.", venergy3); ## 120s
#################################
(Using CasADi 3.4.0 on Octave 4.2.2.)
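
Note that `tocmy` in the script above is my own timing helper, not a CasADi or Octave built-in, so the script will not run without it. A minimal stand-in, assuming it only needs to print a label, the computed value, and the time elapsed since the last `tic` (the argument names and output format here are illustrative, not the original helper):

```octave
## Hypothetical stand-in for the tocmy helper called in the script above.
## Assumption: it reports a label, the computed value, and the seconds
## elapsed since the most recent tic.
function tocmy(label, value)
  elapsed = toc;  ## seconds since the last tic
  printf("%s value = %.6g, elapsed = %.2f s\n", label, value, elapsed);
endfunction
```

Define it in the script before its first use, or save it as `tocmy.m` somewhere on the Octave path.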