OPTI nonlinear solvers execution time

166 views
Skip to first unread message

Abdo Mohamed

unread,
Dec 25, 2017, 7:14:48 AM12/25/17
to OPTI Toolbox Forum
Greetings,

I am using OPTI in order to study different algorithms being used in the problem of solving nonlinear equations [specifically in circuits and semi-conductor simulators]. I am running many test cases in order to compare some of the algorithms included in OPTI and those included in other circuit and device simulators [mainly open sourced E.g. spice based simulators]. The main point of comparison is the simulation time. I do have few questions regarding this point

  • Can i rely on the execution time reported by any of the algorithms here, or the Matlab wrapper introduces overhead 
I am mainly interested in comparing the simulation time [which includes solving a huge set of sparse non-linear equations] both on the spice simulators and the Matlab interface for OPTI, 
  • would such comparison be fair or i should build nativly any of the listed algorithms and run them outside the Matlab environment ?

Thanks

Jonathan Currie

unread,
Dec 26, 2017, 1:47:57 AM12/26/17
to OPTI Toolbox Forum
Hi,

The execution time reported by OPTI (i.e. within the 'Time' field of the output statistics) is recorded as follows (for e.g. NL2SOL):

t = tic;
% Run NL2SOL
[x, fval, exitflag, iter, feval] = nl2sol(fun,grad,x0,ydata,lb,ub,opts);

%Collect Results
info.Iterations = iter;
info.FuncEvals = feval;
info.Time = toc(t);

Noting it uses MATLAB's tic and toc functions (https://au.mathworks.com/help/matlab/ref/tic.html) around the call to the actual MEX file. Therefore the timing can be basically assumed to be just the MEX interface, and not any MATLAB setup/wrapper code. However, because the MEX interface will call MATLAB functions (e.g. fun and grad), then this does execute MATLAB code. Whether these calls are faster or slower than compiled code (e.g. written in C/C++) really depends on your application. You would have to profile these separately to get a feel for this difference.

Also remember most SNLE algorithms supplied with OPTI are for dense problems only. OPTI does provide a means to supply sparse derivatives (http://www.inverseproblem.co.nz/OPTI/index.php/Probs/SNLE#SparseDerivs), but then the problem is solved as an NLP (generally using IPOPT), which may not be the intended comparison.

Jonathan



--
You received this message because you are subscribed to the Google Groups "OPTI Toolbox Forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to opti-toolbox-forum+unsub...@googlegroups.com.
To post to this group, send email to opti-toolbox-forum@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Elbehery

unread,
Dec 27, 2017, 2:09:39 PM12/27/17
to opti-tool...@googlegroups.com


Thanks so much for your time.

In case someone is interested, i have actually ran few tests to compare the four following scenarios

  1. build MKLTRNLS nativly and measure the running time
  2. run the same function [with the Matlab implementation] over OPTI  and measure the time
  3. run the same function using Matlab fsolve without any sparse Jacobin optimization
  4. run the same function using Matlab fsolve and explicitly provide the Jacobin [not the pattern but the functions that evaluate it] and use the parallel Jacobin evaluator used in the Matlab optimization toolbox

I am using the Extended Powell singular function taken from [1] with 4000 unknowns and here are the results


For 1


I have manipulated the code at the examples provided and increased n = 4000; m=4000; it took almost 14 seconds for the code to obtain the roots




For 2

It took almost 70 seconds to execute the Matlab implementation of the function

function [F, J]  = ex_pow(x)
n=length(x);
F=zeros(1,length(x));
i=1:1:(length(x)/4);
F(4 * i - 3) = x(4 * i - 3) + 10.0 * x(4 * i - 2);
F(4 * i - 2) = 2.2360679774998 * (x(4 * i - 1) - x(4 * i));
F(4 * i - 1) = (x(4 * i - 2) - 2.0 * x(4 * i - 1)) .* (x(4 * i - 2) - 2.0 * x(4 * i - 1));
F(4 * i) = 3.1622776601684 * (x(4 * i - 3) - x(4 * i)) .* (x(4 * i - 3) - x(4 * i));
%The Jacobian part
if nargout > 1
    c=1;C=sparse(4*i-3, 4*i-3,c,n,n);
    c=10;C1=sparse(4*i-3, 4*i-2,c,n,n);
    d=2.2360679774998;D=sparse(4*i-2, 4*i-1,d,n,n);
    d=-2.2360679774998;D1=sparse(4*i-2, 4*i,d,n,n);
    d=2*(x(4*i-2)-2*x(4*i-1));D2=sparse(4*i-1, 4*i-2,d,n,n);
    d=2*(x(4*i-2)-2*x(4*i-1))*(-2);D3=sparse(4*i-1, 4*i-1,d,n,n);

    d=3.1622776601684*2*(x(4*i-3)-2*x(4*i));D4=sparse(4*i, 4*i-3,d,n,n);
    d=-3.1622776601684*2*(x(4*i-3)-2*x(4*i));D5=sparse(4*i, 4*i,d,n,n);
    J=C+C1+D+D1+D2+D3+D4+D5;
end



For 3

It took Matlab fsolve about 19 seconds to solve without any kind of optimization


For 4

Using the following options


It took Matlab 0.1 seconds to solve about 140x speedup !




Seems like Intel MKL does not support this kind of optimization for sparse problems [I have concluded this after reading on their website Does it?] They also don't support any kind of parallel optimization unlike Matlab


Again, thanks so much for your time and this beautiful project.

Jonathan Currie

unread,
Dec 27, 2017, 4:22:05 PM12/27/17
to Elbehery, OPTI Toolbox Forum
Hi,

Many thanks for posting your results, very interesting! A couple of points:

1) I don't think your Jacobian is implemented correctly. I used OPTI's Derivative Checker (https://inverseproblem.co.nz/OPTI/index.php/Advanced/Opts#derivCheck) and it reported a few errors. I've attached my implementation below.

2) OPTI won't recognise gradients supplied via the second output argument of an objective callback. You must supply them as a separate function via the 'grad' argument of OPTI.

3) As I stated above, sparse SNLEs are not supported by any of the supplied OPTI SNLE solvers. For your particular example problem, the size of the Jacobian is 4000x4000 (i.e 16 million dense elements), yet it only has 8000 non-zero elements. This is why a sparse solver is (sparse fsolve) is so much faster. This is a *very* important consideration in your comparison, especially if your problems are sparse (generally < 5 to 10% non-zero elements, your example problem is 0.05%, thus very sparse).

4) You can use IPOPT as a sparse SNLE solver, example posted below. On my (rather performance limited laptop), this solves the 4000 element problem in 0.6 seconds.


Generally the main computation time is taken in the factorization of linear system solved at each iteration for these sorts of problems (i.e. very simply objective + Jacobian). You can prove this by removing the sparsity structure (nlJacStr) from the example below and then trying to solve it, it will take significantly longer! This shows the linear system solver is having to do a lot more work with many more elements (as we'd expect). 


Jonathan


%% Extended Powell Example - Sparse
clc
n = 4000;
% Equations to solve
nleq = @(x) ex_pow(x);
% Jacobian (Sparse)
nlJac = @(x) ex_pow_jac_sp(x);

% Starting guess
x0 = repmat([3,-1,0,1]',n/4,1);

% Jacobian Sparsity Structure (*very* important) - note careful choice of x0 so all possible elements are non-zero
nlJacStr = @() spones(nlJac(x0));

% Build and Solve
opts = optiset('display','off','derivCheck','on','solver','ipopt');
optObj = opti('nleq',nleq,'nlJac',nlJac,'nlJacstr',nlJacStr,'x0',x0,'opts',opts)

[~,f,~,i] = solve(optObj)



function F  = ex_pow(x)
n=length(x);
% F=zeros(1,length(x));
i=1:1:n/4;
F(4 * i - 3) = x(4 * i - 3) + 10.0 * x(4 * i - 2);
F(4 * i - 2) = 2.2360679774998 * (x(4 * i - 1) - x(4 * i));
F(4 * i - 1) = (x(4 * i - 2) - 2.0 * x(4 * i - 1)) .* (x(4 * i - 2) - 2.0 * x(4 * i - 1));
F(4 * i) = 3.1622776601684 * (x(4 * i - 3) - x(4 * i)) .* (x(4 * i - 3) - x(4 * i));



function J  = ex_pow_jac_sp(x)
n=length(x);
i = 1:4:n;
J = sparse(i,i,1,n,n) + ...
    sparse(i,i+1,10,n,n) + ...
    sparse(i+1,i+2,sqrt(5),n,n) + ...
    sparse(i+1,i+3,-sqrt(5),n,n) + ...
    sparse(i+2,i+1,2*x(i+1) - 4*x(i+2),n,n) + ...
    sparse(i+2,i+2,8*x(i+2) - 4*x(i+1),n,n) + ...
    sparse(i+3,i,sqrt(10)*(2*x(i) - 2*x(i+3)),n,n) + ...
    sparse(i+3,i+3,-sqrt(10)*(2*x(i) - 2*x(i+3)),n,n);




On Thu, Dec 28, 2017 at 8:09 AM, Elbehery <abdo.m...@gmail.com> wrote:

Thanks so much for your time.

In case someone is interested, i have actually ran few tests to compare the four following scenarios

  1. build MKLTRNLS nativly and measure the running time
  2. run the same function [with the Matlab implementation] over OPTI  and measure the time
  3. run the same function using Matlab fsolve without any sparse Jacobin optimization
  4. run the same function using Matlab fsolve and explicitly provide the Jacobin [not the pattern but the functions that evaluate it] and use the parallel Jacobin evaluator used in the Matlab optimization toolbox

I am using the Extended Powell singular function taken from [1] with 4000 unknowns and here are the results


For 1


I have manipulated the code at the examples provided and increased n = 4000; m=4000; it took almost 14 seconds for the code to obtain the roots




For 2

It took almost 180 seconds to execute the Matlab implementation of the function

Seems like Intel MKL does not support this kind of optimization for sparse problem [I have concluded this after reading on their website Does it?] They also don't support any kind of parallel optimization unlike Matlab

To unsubscribe from this group and stop receiving emails from it, send an email to opti-toolbox-forum+unsubscribe@googlegroups.com.
To post to this group, send email to opti-tool...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Elbehery

unread,
Dec 27, 2017, 5:20:55 PM12/27/17
to opti-tool...@googlegroups.com
Thanks so much for your time and efforts really appreciated. 
Reply all
Reply to author
Forward
0 new messages