Is there any way, or a function, to perform interpolation or smoothing of data such that the output is monotonically increasing, even when some portion of the data does not support it?
x=[0.1;0.2;0.3;0.4;0.5;0.6;0.7;0.8;0.9;1;];
y=[183.6992;347.6811;184.4308;199.1481;308.3963;453.1903;490.1048;558.374;613.4362;623.4727;];
I would prefer the dipping part to simply be ignored during the interpolation, if possible.
Thanks for any suggestions and hints.
I believe this kind of problem can be rewritten as a least-squares minimization (fit) with linear constraints.
Bruno
Yes, it can, almost. The problem is that least squares may cause some of the data points to not be fit exactly, even where the curve and the data truly are monotone in that vicinity.
Or, if you use one knot per data point, then a least squares fit can be trivially written using lsqlin, but it may be slow if you have a large number of points, since there will be one constraint for each pair of consecutive data points. For example, on the given curve, try this...
x=[0.1;0.2;0.3;0.4;0.5;0.6;0.7;0.8;0.9;1;];
y=[183.6992;347.6811;184.4308;199.1481;308.3963;453.1903;490.1048;558.374;613.4362;623.4727;];
n = length(x);
C = eye(n); % least squares objective: fit each point as closely as possible
D = y;
% Monotonicity constraints: yhat(i) - yhat(i+1) <= 0 for each consecutive pair
A = diag(ones(n,1),0) - diag(ones(n-1,1),1);
A(end,:) = [];
b = zeros(n-1,1);
opts = optimset('lsqlin');
opts.LargeScale = 'off';
opts.Display = 'none';
yhat = lsqlin(C,D,A,b,[],[],[],[],[],opts);
plot(x,y,'ro',x,yhat,'b-*')
If you like, you could now use interp1 (pchip) to do a monotone interpolation of the fitted results.
How did it do?
yhat
yhat =
183.7
243.75
243.75
243.75
308.4
453.19
490.1
558.37
613.44
623.47
See that it has missed a total of 3 points. All of the others have been fit exactly, but those three were ambiguous. How is lsqlin to know which point was in error? So it has adjusted 3 consecutive points by the smallest aggregate amount to yield a monotone curve.
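For what it's worth, the constrained L2 fit described above is classical isotonic regression, which the pool-adjacent-violators algorithm computes directly, without an LP/QP solver. Here is a sketch in Python with plain numpy as an independent cross-check (the function name is mine):

```python
import numpy as np

def pava(y):
    """Pool-adjacent-violators: the L2-closest nondecreasing sequence to y."""
    # Each block stores [sum of pooled values, count]; violators are merged.
    blocks = []
    for v in y:
        blocks.append([v, 1])
        # Merge while the newest block's mean is below the previous block's mean.
        while len(blocks) > 1 and \
                blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    out = []
    for total, count in blocks:
        out.extend([total / count] * count)
    return np.array(out)

y = np.array([183.6992, 347.6811, 184.4308, 199.1481, 308.3963,
              453.1903, 490.1048, 558.374, 613.4362, 623.4727])
yhat = pava(y)
# Points 2-4 are pooled to their mean, 243.7533..., matching the lsqlin output.
```

The three ambiguous points come out pooled to a common value, exactly as lsqlin found.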
Yes, usually the human eye is better able to see that it was actually point #2 here that was in error, but that will not always be the case.
HTH,
John
No, it is not doable, but it depends upon exactly what you mean by "doable". How will you choose to interpolate non-monotonic data so that the result is monotone?
Interpolation means that you will reproduce the data points exactly. So if your data is not monotone, then an interpolant cannot possibly be monotonic.
The best you can do is something like what pchip does. Pchip will not introduce new local extrema into the curve where the data was monotone. The result will be as monotone as the data is.
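The pchip behavior described here has a direct scipy counterpart: scipy.interpolate.PchipInterpolator implements the same shape-preserving cubic as MATLAB's interp1(...,'pchip'). A small sketch on already-monotone data (the x/y values below are illustrative, not from the thread):

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# Monotone sample data (e.g. y values that have already been corrected).
x = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
y = np.array([183.70, 243.75, 243.76, 308.40, 453.19])

p = PchipInterpolator(x, y)
xf = np.linspace(0.1, 0.5, 401)
yf = p(xf)
# pchip reproduces the data exactly and, since the data is monotone,
# the interpolant is monotone everywhere between the points as well.
```

On non-monotone data the interpolant would dip with the data; that is why the correction step has to come first.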
Or, you can do as I did in my last post, where I showed how to use constrained least squares to find the minimal change to your data points such that the result is monotone.
These are the only two alternatives I'd suggest. There are of course others, subtle variations on the above themes, but not much different.
John
Bruno
Here are the monotonic fit results for three strategies (the columns are, respectively, the raw data, the L1, the Linfinity, and the L2 fits).
[ydata yl1 yinf yl2]
ans =
183.6992 183.6992 184.0256 183.7000
347.6811 197.5389 273.2680 243.7500
184.4308 197.5389 273.2680 243.7500
199.1481 199.1481 275.0011 243.7500
308.3963 308.3963 323.7812 308.4000
453.1903 453.1903 449.2267 453.1900
490.1048 490.1048 499.5431 490.1000
558.3740 558.3740 561.7567 558.3700
613.4362 613.4362 612.5059 613.4400
623.4727 623.4727 647.3502 623.4700
Bruno (who likes the l1)
Bruno
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% testmonotonic.m
% Solve the problem submitted in the following thread
% http://www.mathworks.com/matlabcentral/newsreader/view_thread/238438
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Data
% Pete Shere's (OP) test data
x=[0.1;0.2;0.3;0.4;0.5;0.6;0.7;0.8;0.9;1];
y=[183.6992;347.6811;184.4308;...
199.1481;308.3963;453.1903;...
490.1048;558.374;613.4362;...
623.4727];
% L2 solution, using lsqlin, solved by John D'Errico
yl2 = [183.7
243.75
243.75
243.75
308.4
453.19
490.1
558.37
613.44
623.47 ];
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Engine starts here, basic reshape
x = x(:); % x does not actually play any role in the problem; it is simply
% the corresponding abscissa for the y data
y = y(:);
n = length(y);
if length(x)~=n
error('x and y must have the same length');
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% L1 fitting, BL's preference
M = eye(n);
% L1 system with slack variables u and v
% y(i) = yfit(i) + u(i) - v(i) for all i
% u>=0
% v>=0
% min sum(u+v) = sum|y-yfit|
Aeq = [M eye(n,n) -eye(n,n)];
beq = y(:);
c = [zeros(1,size(M,2)) ones(1,2*n)].';
%
LB = [-inf(1,size(M,2)) zeros(1,2*n)].';
% no upper bounds at all.
UB = inf(size(LB)); % [];
% Monotonic requirement: yfit(i) <= yfit(i+1)
A = diag(ones(1,n),0) - diag(ones(1,n-1),1);
A(end,:) = [];
A(end,3*n) = 0; % pad A with zero columns for the slack variables u and v
b = zeros(n-1,1);
% Solve LP
% Minimizing (for x in R^n): f(x) = c'*x, subject to
% A*x <= b (LE)
% Aeq*x = beq (EQ)
% LB <= x <= UB (BD).
if isempty(which('linprog'))
% linprog is not available: fall back on BL's solver, i.e. write an MPS
% file and invoke PCx to get "sol", the solution of the above LP problem
mpsfile = 'L1_1DReg.mps';
Contain = BuildMPS(A,b,Aeq,beq,c,LB,UB);
OK = SaveMPS(mpsfile,Contain);
if ~OK
disp('Cannot write mps file');
return
end
[OK outfile] = invokePCx(mpsfile);
if ~OK
disp('Cannot invoke PCx');
return
end
[OK sol] = readPCxoutput(outfile);
if ~OK
disp('Cannot read PCx outfile');
return
end
else % Matlab linprog
% Adjust optional LP options at your preference, see
% http://www.mathworks.com/access/helpdesk/help/toolbox/optim/ug/linprog.html
% http://www.mathworks.com/access/helpdesk/help/toolbox/optim/ug/optimset.html
% BL has not checked because he does not have the right toolbox
LPoption = {};
% LPoption = {optimset(...)}
sol = linprog(c,A,b,Aeq,beq,LB,UB,[],LPoption{:});
end
%%% End of linprog solution
% Discard slack variables
sol = sol(1:size(M,2));
yl1 = M*sol;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Linfinity fitting
M = eye(n);
% Linfinity system with slack variables u and v
% yfit(i) - y(i) <= u for all i
% y(i) - yfit(i) <= v for all i
% u>=0 and v>=0
% min u+v (largest overshoot plus largest undershoot of yfit)
LB = [-inf*ones(1,n) 0 0].';
UB = inf(size(LB)); % [];
A_LP=[+M -ones(n,1) zeros(n,1); ...
-M zeros(n,1) -ones(n,1)];
b_LP=[+y;...
-y];
f = [zeros(1,n) 1 1].';
% Monotonic requirement: yfit(i) <= yfit(i+1)
A = diag(ones(1,n),0) - diag(ones(1,n-1),1);
A(end,:) = [];
A(end,n+2) = 0; % pad A with a zero column for each of u and v
b = zeros(n-1,1);
AA = [A_LP; ...
A];
bb = [b_LP; ...
b];
% Solve LP
% Minimizing (for x in R^n): f(x) = f'*x, subject to
% AA*x <= bb (LE)
% LB <= x <= UB (BD).
if isempty(which('linprog'))
% linprog is not available: fall back on BL's solver, i.e. write an MPS
% file and invoke PCx to get "SOL", the solution of the above LP problem
mpsfile='Linf_1DReg.mps';
Contain=BuildMPS(AA,bb,[],[],f,LB,UB,'Linf_1DReg');
OK=SaveMPS(mpsfile,Contain);
if ~OK
disp('Cannot write mps file');
return
end
[OK outfile]=invokePCx(mpsfile);
if ~OK
disp('Cannot invoke PCx');
return
end
[OK solinf]=readPCxoutput(outfile);
if ~OK
disp('Cannot read PCx outfile');
return
end
else % Matlab linprog
% Adjust optional LP options at your preference, see
% http://www.mathworks.com/access/helpdesk/help/toolbox/optim/ug/linprog.html
% http://www.mathworks.com/access/helpdesk/help/toolbox/optim/ug/optimset.html
% BL has not checked because he does not have the right toolbox
LPoption = {};
% LPoption = {optimset(...)}
solinf = linprog(f,AA,bb,[],[],LB,UB,[],LPoption{:});
end
yinf=solinf(1:n);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Graphic output
fig = figure(1); clf(fig);
ax = axes('Parent',fig);
plot(ax,x,y,'or',...
x,yl1,'b',...
x,yinf,'g', ...
x,yl2, 'c');
legend(ax,'data','l_1','l_{inf}','l_2','Location','NorthWest');
[y yl1 yinf yl2]
%% End of the script
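The L1 formulation in the script above (slack variables u, v with yfit + u - v = y, minimizing sum(u+v) subject to yfit being nondecreasing) translates directly to scipy.optimize.linprog; here is a sketch of the same LP for readers without MATLAB or PCx:

```python
import numpy as np
from scipy.optimize import linprog

y = np.array([183.6992, 347.6811, 184.4308, 199.1481, 308.3963,
              453.1903, 490.1048, 558.374, 613.4362, 623.4727])
n = len(y)
I = np.eye(n)

# Variables: [yfit (n), u (n), v (n)]; minimize sum(u) + sum(v).
c = np.concatenate([np.zeros(n), np.ones(2 * n)])
# Equality constraints: yfit + u - v = y.
A_eq = np.hstack([I, I, -I])
# Monotonicity: yfit[i] - yfit[i+1] <= 0, padded with zeros for u and v.
D = (np.eye(n) - np.eye(n, k=1))[:-1]
A_ub = np.hstack([D, np.zeros((n - 1, 2 * n))])
b_ub = np.zeros(n - 1)
bounds = [(None, None)] * n + [(0, None)] * (2 * n)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y, bounds=bounds)
yl1 = res.x[:n]
l1_cost = np.abs(y - yl1).sum()  # minimal total correction, about 163.25 here
```

Note that the L1 optimum is degenerate for this data (several monotone curves achieve the same total correction of about 163.25), so the particular yl1 returned may differ from Bruno's while being equally optimal.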
There is an IMPORTANT caveat there.
Bruno has written a piecewise linear approximant. This is NOT interpolation, because interpolation reproduces the data exactly. When your data is not monotone, then a monotone function cannot interpolate the data.
You will see different results from a monotone cubic approximant too. See my SLM tools for both a linear and a cubic approximant. (It looks like Bruno has used a solution that I posted some time ago, before I had chosen to post SLM.)
http://www.mathworks.com/matlabcentral/fileexchange/24443
Note that the constraints for a monotone cubic are a bit nastier than just those one would employ for a monotone linear approximant. In fact, there are several ways one can implement a class of constraints for a monotone cubic. (Outside the scope of this discussion.)
While in the past I have offered an L1 or Linfinity solver when I've written a tool like SLM, I've rarely found it to be in demand. It is not difficult to add to SLM, though, if users wanted it.
Now, why did Bruno see different results from the various choices of approximant used?
An infinity norm solver will minimize the MAXIMUM residual. This is a terrible choice for problems where you have an outlier or outliers in the data, since all that matters is a reduction of the MAXIMUM residual. Worse, if you use an infinity norm solver to solve this problem, then it need not reproduce the data points even when the data is truly monotone! See that the L1 and L2 solvers will at least reproduce the data points in those parts of the curve where the data was monotone.
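The minimize-the-MAXIMUM-residual fit described here can also be written as a small LP, with a single bound t on |y - yfit|. (Bruno's script instead minimizes the largest overshoot plus the largest undershoot, a slightly different objective, so the numbers differ.) A Python sketch of the single-t version:

```python
import numpy as np
from scipy.optimize import linprog

y = np.array([183.6992, 347.6811, 184.4308, 199.1481, 308.3963,
              453.1903, 490.1048, 558.374, 613.4362, 623.4727])
n = len(y)
I = np.eye(n)
tcol = np.ones((n, 1))

# Variables: [yfit (n), t (1)]; minimize t subject to |y - yfit| <= t.
c = np.concatenate([np.zeros(n), [1.0]])
# yfit - t <= y and -yfit - t <= -y together encode |y - yfit| <= t.
A_res = np.vstack([np.hstack([I, -tcol]), np.hstack([-I, -tcol])])
b_res = np.concatenate([y, -y])
# Monotonicity rows, padded with a zero column for t.
D = (np.eye(n) - np.eye(n, k=1))[:-1]
A_ub = np.vstack([A_res, np.hstack([D, np.zeros((n - 1, 1))])])
b_ub = np.concatenate([b_res, np.zeros(n - 1)])
bounds = [(None, None)] * n + [(0, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
yinf, t_star = res.x[:n], res.x[-1]
# The binding pair is points 2 and 3: t* = (347.6811 - 184.4308) / 2
```

Since only t is penalized, the fitted values at the well-behaved points are free to wander anywhere within t of the data, which is exactly the "need not reproduce the data points" behavior noted above.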
At the other extreme is an L1 solver. Here the sum of absolute residuals is minimized. So a single outlier may often be ignored when it is inconsistent with the remainder of the data. There is no assurance that this will always yield a better result, however. In fact, I can give an example where the L1 solver need not generate as good a solution as the L2 (lsqlin) solver.
I'll argue that the best solution would use my SLM tools to yield a result that is both smooth and has all the properties that you know must exist in the underlying relationship.
Again, I'll consider adding an L1 solver to SLM if there are requests. This is easy enough to do.
John
Yes, you are right. I should call it "monotonic regression", "interpolation when-you-can", or whatever name fits a process that corrects the data as little as possible to make it monotonic.
Bruno
Bruno
To some extent this is true, at least if you set the knots of the spline to coincide with the data points.
However, it is arguably better to use a monotone cubic model in the first place. Even if the knots are still coincident with all of the data points, this will still result in a C2 cubic. Thus you will get a monotone approximant that both fits the data as well as possible and is C2. Since pchip is only C1, this may be an issue. And application of pchip to a general set of monotone points will not result in a C2 curve. It will still be only C1.
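The C1-versus-C2 distinction is easy to check numerically: pchip's second derivative generally jumps at the knots, while an ordinary interpolating cubic spline's does not. A small scipy sketch on a generic monotone data set of my own choosing:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator, CubicSpline

x = np.arange(6.0)
y = np.array([0.0, 1.0, 3.0, 8.0, 9.0, 9.5])  # monotone sample data

# Second derivatives of each piecewise cubic.
p2 = PchipInterpolator(x, y).derivative(2)
s2 = CubicSpline(x, y).derivative(2)

# Size of the second-derivative jump at an interior knot.
knot, eps = 3.0, 1e-8
pchip_jump = abs(p2(knot + eps) - p2(knot - eps))
spline_jump = abs(s2(knot + eps) - s2(knot - eps))
# pchip is only C1 (a visible jump); the interpolating spline is C2 (no jump).
```

Of course, as noted below, the C2 spline gains its smoothness by giving up the guarantee of monotonicity.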
I once did a minor survey, and I found that people could indeed see the difference between a C1 and a C2 cubic fit. Sometimes that difference is subtle, but it is very much visible.
And it is not adequate to make the data points monotone and then run a standard interpolating spline through them, since there is no way to assure that the spline will still be truly monotone.
Finally, use of a tool like SLM allows you to build many other constraints into the curve fit. Must it pass through a specific point? Or perhaps it must have a zero slope at one end of the curve? You can also control the smoothness of the curve by a judicious choice of the knots with SLM.
John
Would it be possible for you to write out the functional form of the linear and cubic Hermite polynomials? I want to see how slm.coef is used to evaluate f(x).
Thanks a lot in advance,
Pete
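For reference, the textbook cubic Hermite segment form is standard, and shape-preserving cubics like pchip are built on it. How SLM actually lays out slm.coef is not shown in this thread, so treat the helper below (the function name is mine) as the generic formula only:

```python
def hermite_segment(x, xk, xk1, yk, yk1, dk, dk1):
    """Evaluate the cubic Hermite polynomial on [xk, xk1] with endpoint
    values yk, yk1 and endpoint slopes dk, dk1."""
    h = xk1 - xk
    t = (x - xk) / h            # normalized coordinate in [0, 1]
    h00 = 2*t**3 - 3*t**2 + 1   # basis weighting yk
    h10 = t**3 - 2*t**2 + t     # basis weighting h*dk
    h01 = -2*t**3 + 3*t**2      # basis weighting yk1
    h11 = t**3 - t**2           # basis weighting h*dk1
    return h00*yk + h10*h*dk + h01*yk1 + h11*h*dk1

# The linear form is simply: f(x) = yk + (x - xk) * (yk1 - yk) / h.
```

By construction the segment reproduces the endpoint values and slopes, which is what makes piecewise Hermite fits C1 across the knots.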