Mapsum is 10 times slower than a naive FOR loop. Any ideas what I'm doing wrong?

121 views
Skip to first unread message

Dave Goel

unread,
Apr 1, 2018, 5:46:54 PM4/1/18
to CasADi
I'm so lost here. Mapsum is 10 times slower than a naive FOR loop. What am I doing wrong?


## TOY SETUP: (I tried variations on this, with similar results)

## x is a vector. Energy is a scalar function of x. We wish to mapsum energy over several x's, collected in a matrix called X.
## 
## We explore several ways:
## (a) Lucky break: Exploiting that X can be direcly multiplied as a matrix.  (1.6s)
## (b) A FOR loop over every x within X. Takes 10 times longer.  (6s)
## (c) mapsum: Takes ANOTHER 20 times longer(!!????)  (130s!)


## I would have thought that (c) would save significant time over (b). Instead it takes 10 times longer than b. Do you see what I am doing wrong below? 

## (In each case, we get the same answer for energy).

tic;


m=601;
n = 599;
ssx=MX.sym('x',m,1); ## symbolic x.
ssX = MX.sym('X', m,n);
ssw = MX.sym('w', m,m);

## Define a somewhat complicated y. Define a Y for the entire X matrix as well. 
ccy = 0+ssx;
ccY = 0+ssX;
for ii=1:5;  ## where E
  ccy = ssw * ccy;
  ccy = tanh(ccy);
  ccY = ssw * ccY;
  ccY = tanh(ccY);
endfor

## actual values:
vvX = randn(m, n);
vvw = randn(m, m);



energy = sumsqr(ccy);
Energy = sumsqr(ccY);


fenergy = Function('fenergy', {ssx, ssw}, {energy});
fEnergy = Function('fEnergy', {ssX, ssw}, {Energy});


## Now, let's measure energy in several ways. 


#################################

## (a) Direct matrix way, definiton. 
tic;
vEnergy1 = full(fEnergy(vvX, vvw));
tocmy(" done Using matrix.", vEnergy1); ## 1.6s




#################################
## (b) A sum over individual energies:
tic;
vEnergy2=0;
for ii=1:n;
  venergy = full(fenergy(vvX(:,ii), vvw));
  vEnergy2 += venergy;
endfor
tocmy("done Using individual sums.", vEnergy2); ## 



#################################
## (c1) Using mapsum, but directly on values!
tic;
symbolicenergy3 = fenergy.mapsum({vvX, vvw}){1}; 
## This is an expression over numbers. convert it to an actual  number. 
dummy = MX.sym('dummy');
fenergy3 = Function('dummy',{dummy},{symbolicenergy3});
venergy3a = full(fenergy3(0));
tocmy(" done Using mapsum on values.", venergy3a); ## 150s


## (c2) Using mapsum, but on ssX
tic;
symbolicenergy3 = fenergy.mapsum({ssX, ssw}){1};
fenergy3 = Function('f',{ssX,ssw},{symbolicenergy3});
venergy3 = full(fenergy3(vvX, vvw));
tocmy(" done Using mapsum on symbols.", venergy3); ## 120s


#################################



(using casadi 3.4.0 on 4.2.2)

Dave Goel

unread,
Apr 2, 2018, 2:01:12 AM4/2/18
to CasADi
Resolution: mapsum works better when applied to only one argument:

## UPDATE: Added sections (c3), (c4), (d1) and (d2)

## So, after pulling my hair out over this, I finally had some luck! See the new sections (c3), (c4) and (d)s below.

## (c3): Instead of applying mapsum to f(two arguments), I first "hardcode" the numerical value of the second argument, make a new function with only one argument. Apply mapsum there, and it's now magically much faster!

## (d) repeats the same but uses map. And, then sums them manually. And, just like map, it's exponentially worse if you apply it to two arguments, namely. ([matrix of various x's], a single w). Instead, best to first create a new function that "hardcodes w" so that it takes only x, and then map over it.  Note that map's exponential case is still twice as fast as that of mapsum's exponential case. 


## Here are the full results:

## TOY SETUP: (I tried variations on this, with similar results)

## x is a vector. Energy is a scalar function of x. We wish to mapsum energy over several x's, collected in a matrix called X.
## 
## We explore several ways:
## (a) Lucky break: Exploiting that X can be direcly multiplied as a matrix.  (1.6s)
## (b) A FOR loop over every x within X. Takes 4 times longer.  (6s)
## (c) mapsum: Takes ANOTHER 20 times longer(!!????)  (130s!)
## (c3/4) MAPSUM WITH ONE ARGUMNENT: 2s (ALMOST AS GOOD AS (A))!!
## (d2) map on one argument, plus sum: 2s. 
## (d1) map on two arguments: 

## I would have thought that (c) would save significant time over (b). Instead it takes 10 times longer than b. Do you see what I am doing wrong below? 

## (In each case, we get the same answer for energy).

## Here are the printouts from running the code below (the number on the right is the same energy they each discover)

## Oc:01:50:19_11710_6> __casadimapsumExample006
## [01:50:49_11710 __casadimapsumExample006(70)]   1.588s:  TOC:  (a) done Using matrix. 348014.002
## [01:50:55_11710 __casadimapsumExample006(84)]  5.7632s:  TOC:  (b) done Using for-loop summation outside casadi. 348014.002
## [01:50:57_11710 __casadimapsumExample006(100)]  2.0934s:  TOC:  (d2) done Using map of 1 arg, then sum. 348014.002
## [01:50:59_11710 __casadimapsumExample006(112)]   2.076s:  TOC:  (c3) done Using mapsum on ONE argument. 348014.002
## [01:51:01_11710 __casadimapsumExample006(123)]  2.0917s:  TOC:  (c4) done Using mapsum on ONE argument, applied directly to numerical 348014.002
## [01:52:12_11710 __casadimapsumExample006(130)]  71.374s:  TOC:  (d1) done Using map of 2 args, then sum. 348014.002
## [01:54:20_11710 __casadimapsumExample006(141)]  127.98s:  TOC:  (c1) done Using mapsum on values. 348014.002
## [01:56:27_11710 __casadimapsumExample006(149)]  126.67s:  TOC:  (c2) done Using mapsum on symbols. 348014.002



casadimainenable;


tic;
m=601;
n = 599;

## m=201;
## n =199;


ssx=MX.sym('x',m,1); ## symbolic x.
ssX = MX.sym('X', m,n);
ssw = MX.sym('w', m,m);

## Define a somewhat complicated y. Define a Y for the entire X matrix as well. 
ccy = 0+ssx;
ccY = 0+ssX;
for ii=1:5;  ## 
  ccy = ssw * ccy;
  ccy = tanh(ccy);
  ccY = ssw * ccY;
  ccY = tanh(ccY);
endfor

## actual values:
vvX = randn(m, n);
vvw = randn(m, m);



energy = sumsqr(ccy);
Energy = sumsqr(ccY);


fenergy = Function('fenergy', {ssx, ssw}, {energy});
fEnergy = Function('fEnergy', {ssX, ssw}, {Energy});


## Now, let's measure energy in several ways. 


#################################

## (a) Direct matrix way, definiton. 
tic;
vEnergy1 = full(fEnergy(vvX, vvw));
tocmy("(a) done Using matrix.", vEnergy1); ## 1.6s





#################################
## (b) A sum over individual energies:
tic;
vEnergy2=0;
for ii=1:n;
  venergy = full(fenergy(vvX(:,ii), vvw));
  vEnergy2 += venergy;
endfor
tocmy("(b) done Using for-loop summation outside casadi.", vEnergy2); ## 



#################################




## (d2) Using map and then manual sum.  But, first hardcode vvw. 
tic;
cenergy4 = fenergy(ssx, vvw);
fenergy4 = Function('fenergy4', {ssx}, {cenergy4});
fEnergyMap = fenergy4.map(n);
vEnergy4map =cdfevalfull(fEnergyMap, vvX);
vEnergy4 = sum(vEnergy4map,2);
tocmy("(d2) done Using map of 1 arg, then sum.", vEnergy4); ## 120s

#################################


## (c3) Using mapsum, but hardcode ssw. Thus, using ONLY one argument. 
tic;
ccenergy = fenergy(ssx, vvw);  ## Now, it is a function of ONLY one argument 
fcenergy = Function('energy',{ssx}, {ccenergy});
Energyc = fcenergy.mapsum({ssX}){1};
FcEnergy = Function('f',{ssX},{Energyc});
venergy3 = full(FcEnergy(vvX));
tocmy("(c3) done Using mapsum on ONE argument.", venergy3); ## 120s


## (c4) Using mapsum, but hardcode ssw. Furthermore, apply mapsum on vvX, not ssX.
tic;
ccenergy = fenergy(ssx, vvw);  ## Now, it is a function of ONLY one argument 
fcenergy = Function('energy',{ssx}, {ccenergy});
Energyc = fcenergy.mapsum({vvX}){1};
dummy = MX.sym('dummy');
FcEnergy = Function('f',{dummy},{Energyc});
venergy3 = full(FcEnergy(0));
tocmy("(c4) done Using mapsum on ONE argument, applied directly to numerical", venergy3); ## 120s

## (d1) Using map and then manual sum.  Does NOT work unles you repmat vvw too!
tic;
fEnergyMap = fenergy.map(n);
vEnergy4map =cdfevalfull(fEnergyMap, vvX, repmat(vvw,1,n));
vEnergy4 = sum(vEnergy4map,2);
tocmy("(d1) done Using map of 2 args, then sum.", vEnergy4); ## 120s



## (c1) Using mapsum, but directly on values!
tic;
symbolicenergy3 = fenergy.mapsum({vvX, vvw}){1}; 
## This is an expression over numbers. convert it to an actual  number. 
dummy = MX.sym('dummy');
fenergy3 = Function('dummy',{dummy},{symbolicenergy3});
venergy3a = full(fenergy3(0));
tocmy("(c1) done Using mapsum on values.", venergy3a); ## 150s


## (c2) Using mapsum, but on ssX
tic;
symbolicenergy3 = fenergy.mapsum({ssX, ssw}){1};
fenergy3 = Function('f',{ssX,ssw},{symbolicenergy3});
venergy3 = full(fenergy3(vvX, vvw));
tocmy("(c2) done Using mapsum on symbols.", venergy3); ## 120s

## (using casadi 3.4.0 on 4.2.2)

Reply all
Reply to author
Forward
0 new messages