You CANNOT do this with multiple exponential components.
The log of a sum is NOT transformable to the sum of the logs.
John
fminspleas is on the FEX.
Note that sums of exponentials are notoriously ill-conditioned
for estimation. Expect to have a nasty time of it if you have
any noise at all in your data.
John
A nonoptimal quick-answer approach (sketched below):
1. log
2. polyfit
3. exp
4. residual
5. repeat on residual
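In MATLAB, roughly (a sketch only, assuming the data and each
residual stay positive so the log is defined; t and y here stand
for your time and data vectors):
r = y(:);                             % start from the raw data
amp = zeros(2,1); rate = zeros(2,1);
for k = 1:2                           % one pass per exponential term
    p = polyfit(t(:), log(r), 1);     % steps 1-2: log, then polyfit
    rate(k) = p(1);                   % slope is the rate
    amp(k)  = exp(p(2));              % step 3: exp recovers the amplitude
    r = r - amp(k)*exp(rate(k)*t(:)); % step 4: residual
end                                   % step 5: the loop repeats on the residual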
Hope this helps.
Greg
No. This is not an adequate method for estimation of
a multi-component exponential model. Fitting the
residuals will not yield a reasonable estimate of the
second component, since exponential components are
highly correlated with each other. You will just get
crapola for all of the parameters.
Use a nonlinear regression tool to fit all components
at once. Best is to use a partitioned solver, since this
greatly reduces the dimensionality of the space to be
investigated.
John
I agree completely provided the tool is available. My
impression from the OP's description was that none is available.
Recall that I did warn that such a quick "solution" was suboptimal.
How much so depends on the signal and on the definition of "good enough".
Obviously the ratio of the successive term variances would be a useful
indicator.
QUESTION:
What is the fundamental difference between this "iteratively fit the
residual" strategy and the strategy used in STEPWISEFIT?
>Best is to use a partitioned solver, since this
> greatly reduces the dimensionality of the space to be
> investigated.
What is a partitioned solver?
Greg
Basically, it's one that eliminates variables. If you want to minimize a function
f(x,y) and you are able to efficiently evaluate
F(x)=argmin_y(f(x,y))
then you can reduce the original problem to the minimization of
g(x)=f(x,F(x))
which is better because now you are only searching over the lower dimensional space of x.
In the case of exponential fitting, for example,
f(x,y)= norm(exp(t(:)*x(:).')*y(:) - data)
y=F(x) is obtainable by doing an LSQ minimization.
I meant to say, "by doing a *linear* LSQ minimization"
Let me apologize again for a bum steer. I took a look at
x = A1*exp(-1*t)+A2*exp(-2*t)+A3*exp(-4*t)
for
A = [ 1, 1/sqrt(10), 1/10 ]
and
A = [ 1/10, 1/sqrt(10), 1 ]
For the first step of fitting to a single exponential,
too much of the function was modeled by the
first term in both cases. So, as John had predicted,
the 2nd and 3rd terms are very, very, very poorly
estimated.
> QUESTION:
>
> What is the fundamental difference between this "iteratively fit the
> residual" strategy and the strategy used in STEPWISEFIT?
The shapes of the basis functions are fixed and known,
not determined by the fitting process.
I don't understand how this would be accomplished.
Are you able to give a simple example?
> then you can reduce the original problem to the minimization of
>
> g(x)=f(x,F(x))
>
> which is better because now you are only searching over the lower dimensional space of x.
>
> In the case of exponential fitting, for example,
>
> f(x,y)= norm(exp(t(:)*x(:).')*y(:) - data)
> y=F(x) is obtainable by doing an LSQ minimization.
I don't see it.
Greg
The problem is that exponentials will be VERY highly
correlated. Extremely so.
So when you estimate a single term, it has parts of the
other terms mixed in. That first term estimate will be
poor, therefore the remainder of the terms will also be
poorly estimated.
Such a scheme will work only when the terms are
orthogonal to each other. To be most accurate, these
terms must be orthogonal over the actual set of data
points, a different thing (slightly) than being orthogonal
as functions.
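A quick illustration of just how correlated these terms are (the
grid of data points below is only an assumption for the demo):
t = linspace(0,1,101)';
E = [exp(-1*t), exp(-2*t), exp(-4*t)];   % three exponential terms
corrcoef(E)      % off-diagonal correlations are all very near 1
E.'*E            % Gram matrix over the data points: nowhere near diagonal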
Stepwisefit will choose which terms to insert based on an
iterative scheme, but will still build the model from the
chosen terms in ONE regression. This is very different
from a scheme wherein the terms are estimated separately.
>
> >Best is to use a partitioned solver, since this
> > greatly reduces the dimensionality of the space to be
> > investigated.
>
> What is a partitioned solver?
Imagine that we have a problem where the model is
Y = a1*F1(X,b1,b2,...) + a2*F2(X,b1,b2,...)
There could be additional terms of course, I'm just being
lazy here, economical of keystrokes.
Here a1, a2, and the variables b1,b2... are unknown
parameters. X and Y are vectors of data points. F1, F2, ...
are known functional forms: if we know the values
of the b variables and X, we can compute their values. You
COULD use a simple nonlinear regression, where all of
these parameters are treated equally as unknowns. And
it might converge. Of course, you need to supply
initial estimates for all of the unknowns.
Alternatively, suppose you had initial guesses only for
the unknowns b1,b2,... ? Call these parameters the
intrinsically nonlinear parameters.
So if we knew b1,b2,... then we could now estimate a1,a2
using a simple LINEAR regression. Those parameters can
be determined using a non-iterative scheme, and conditional
on the values of the b variables, the values of the variables
a1,a2 are exactly known. I think you should see that if
we now iterate purely on the unknowns b1,b2,... estimating
the values of a1,a2 using the linear regression, then when
those nonlinear parameters have converged, the a variables
will also be the correct, optimal values.
See that we never need to explicitly iterate on a1,a2. Those
variables are conditionally linear variables. In effect, we
have partitioned the problem into two sets of parameters.
This is known in some quarters as a partitioned solver, and
by others as separable nonlinear least squares.
By any name, this scheme will be much more robust than
a full iterative solver on all parameters. It does not require
starting values for the linear parameters. It will converge
more quickly, and be more stable than will the fully
iterative scheme.
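A bare-bones sketch of the scheme (the two-term model, the data,
and the starting guess below are invented purely for illustration):
X = linspace(0,3,50)';                   % data sites (made up)
Y = 2*exp(-0.7*X) + 0.5*exp(-3*X);       % made-up measurements
F = @(b) [exp(-b(1)*X), exp(-b(2)*X)];   % F1, F2, given the nonlinear b's
afun = @(b) F(b)\Y;                      % a1,a2 from ONE linear regression
rnorm = @(b) norm(F(b)*afun(b) - Y);     % residual depends only on b
b = fminsearch(rnorm, [1; 2]);           % iterate only on b1,b2
a = afun(b);                             % the conditionally linear a's fall out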
This scheme is the basis for the fminspleas and pleas codes
I have posted on the FEX for several years now. Pleas can
be found as part of the optimization tips and tricks.
fminspleas uses fminsearch as the underlying optimizer,
pleas uses lsqnonlin if I recall properly.
The basic scheme goes back at least 40 years.
John
> For the first step of fitting to a single exponential,
> too much of the function was modeled by the
> first term in both cases. So, as John had predicted,
> the 2nd and 3rd terms are very, very, very poorly
> estimated.
Long long ago, in a galaxy far far away, I once did
a test like this:
X = linspace(0,1,101)';
% model: Y = a1*exp(b1*X)
% Assume that a1 = b1 = 1, in this example
Jac1 = [exp(1*X),1*X.*exp(1*X)];
% How well is this model behaved? Look at the singular values of Jac1.
svd(Jac1)
ans =
21.715
3.9466
%% Now consider a model with two such terms.
Jac2 = [exp(1*X),1*X.*exp(1*X),exp(2*X),2*X.*exp(2*X)];
% How well is this model behaved?
svd(Jac2)
ans =
96.356
13.546
0.98869
0.02682
%% Three terms?
Jac3 = [exp(1*X),1*X.*exp(1*X),exp(2*X), ...
2*X.*exp(2*X),exp(3*X),3*X.*exp(3*X)];
% How well is this model behaved?
svd(Jac3)
ans =
239.42
21.841
3.4019
0.21896
0.0055045
0.00019412
As you can see, adding new terms rapidly drives the
condition number of the Jacobian matrix to a bad place.
The partitioned solver, since it works on only half as
many parameters, is better behaved. And smaller
problems with smaller search spaces are simpler
problems to solve.
John
In the above, if x is fixed at some known value x0 then f(x,y) becomes
f(x0,y)=norm(A*y(:)-data)
where the matrix A is exp(t(:)*x0(:).'). The minimization of this with respect to y is
simply y=A\data, or in other words
F(x0)=exp(t(:)*x0(:).')\data
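Put together, a minimal sketch of the whole reduction (t, data, and
the starting rates below are stand-ins, made up for the demo):
t = (0:0.05:2)';                      % sample times (assumed)
data = 3*exp(-1*t) + 1*exp(-5*t);     % made-up data
gfun = @(x) norm(exp(t*x(:).')*(exp(t*x(:).')\data) - data);  % g(x) = f(x,F(x))
x = fminsearch(gfun, [-2; -4]);       % search only over the rates in x
y = exp(t*x(:).')\data;               % F(x): the linear coefficients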