Simon:
> If J has full rank, this means that none of the 8 columns is linearly
> dependent
> on another column or, equivalently, no parameter is linearly dependent
> on another parameter.
Yes.
> " You would generally choose the (normalized) eigenvectors of the J^T J
> you have as the unit directions and the new set of parameters are then
> multipliers for these directions. "
>
> Is it reasonable to choose \sum_{i=1}^8 ev_i (where the ev_i are the
> normalized eigenvectors)
> as the new set of parameters?
The sum you show is only one vector. You need 8 basis vectors.
Think of it this way: You have 8 parameters, which I'm going to call
a,b,c,d,e,f,g,h. The parameter space is 8-dimensional, and the solution
of the parameter estimation problem is a point in this parameter space
which I'm going to call
x^*
and you choose to represent this point in 8-space via
x^* = a^* e_a + b^* e_b + ...
where
e_a=(1,0,0,0,0,0,0,0)
e_b=(0,1,0,0,0,0,0,0)
etc.
But you could equally well have chosen to represent x^* via
x^* = A^* E_1 + B^* E_2 + ...
where
E_1 = ev_1 / \sqrt{lambda_1}
E_2 = ev_2 / \sqrt{lambda_2}
etc.
(or some such) and then you are seeking the coefficients A, B, C, ...
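
To make the scaling left open above a bit more concrete, here is one possible choice, written out as a sketch. It assumes J^T J = V \Lambda V^T with the orthonormal eigenvectors ev_i as the columns of V and all eigenvalues lambda_i > 0 (which holds when J has full rank). The symbols p (old parameter vector), q (new coefficients A, B, ...), E, V, \Lambda and \tilde J are notation introduced only for this illustration:

```latex
p = \sum_{i=1}^{8} q_i E_i = E q,
\qquad E = V \Lambda^{-1/2}
\quad\text{i.e.}\quad E_i = \frac{ev_i}{\sqrt{\lambda_i}}

\tilde J = J E \quad\text{(chain rule: Jacobian with respect to } q\text{)}

\tilde J^T \tilde J
  = \Lambda^{-1/2} V^T \left( J^T J \right) V \Lambda^{-1/2}
  = \Lambda^{-1/2} \Lambda \Lambda^{-1/2}
  = I
```

With this choice the new J^T J is the identity at the point where the eigenvectors were computed. For a nonlinear problem J changes with the parameters, so this holds only approximately away from that point.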
> " Of course, all of this requires computing the eigenvalues and
> eigenvectors of the current J^T J once."
>
> If I get the gist correctly, I first run my dealii program to compute J
> like I do it right now,
> thence compute the eigenvectors, and based on them the new set of
> parameters.
> Finally, I run my dealii program again to compute the new J with the new
> parameters.
> Correct?
Yes, something like this.
> I did not read about such a scaling technique in the context of
> parameter estimation so far.
> Do you have any reference where the procedure is described?
Sorry, I don't. But I have a test problem for you: Try to minimize the
function
f(x,y) = 10000*(x-y)^2 + (x+y)^2
It has a unique minimum, but the x,y-values along the valley of this
function are highly correlated, so the problem is very ill-conditioned
and a steepest descent method will converge only slowly. The eigenvectors
of the Hessian of this function give you the uncorrelated coordinate
directions:
ev_1 = 1/sqrt(2) [1 1]
ev_2 = 1/sqrt(2) [1 -1]
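
For reference, the Hessian and its eigenvalues can be checked by hand. The symbols H, lambda_1, lambda_2 and \kappa are labels added here for the sketch, not something from the thread:

```latex
f(x,y) = 10001\,x^2 - 19998\,xy + 10001\,y^2

H = \begin{pmatrix} 20002 & -19998 \\ -19998 & 20002 \end{pmatrix}

H\, ev_1 = 4\, ev_1, \qquad H\, ev_2 = 40000\, ev_2

\kappa(H) = \frac{\lambda_2}{\lambda_1} = \frac{40000}{4} = 10^4
```

So ev_1 points along the flat valley and ev_2 across it, and the two curvatures differ by a factor of 10^4.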
If you introduce variables
A = (x-y)*sqrt(10000)
B = (x+y)*sqrt(1)
your objective function will become
f(A,B) = A^2 + B^2
which is much easier to minimize. Once you have the solution A^*, B^*,
you can back out to x^*, y^*.
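
If you want to try this test problem quickly, the following is a minimal sketch in plain Python. It is not taken from any library: the helper names (grad_xy, grad_ab, to_ab, to_xy, gradient_descent), the starting point, and the tolerance are all made up for illustration. Note also that it uses the simplest variant of the method, gradient descent with a fixed step 1/L (L being the largest Hessian eigenvalue, 40000 in x,y and 2 in A,B), rather than a line search:

```python
import math

def grad_xy(x, y):
    """Gradient of f(x,y) = 10000*(x-y)^2 + (x+y)^2."""
    d = 20000.0 * (x - y)
    s = 2.0 * (x + y)
    return (d + s, s - d)

def grad_ab(a, b):
    """Gradient of f(A,B) = A^2 + B^2."""
    return (2.0 * a, 2.0 * b)

def to_ab(x, y):
    """A = (x-y)*sqrt(10000), B = (x+y)*sqrt(1)."""
    return (100.0 * (x - y), x + y)

def to_xy(a, b):
    """Invert to_ab: back out x, y from A, B."""
    return (0.5 * (b + a / 100.0), 0.5 * (b - a / 100.0))

def gradient_descent(grad, p, step, tol=1e-6, max_iter=1_000_000):
    """Fixed-step gradient descent; returns (point, iterations used)."""
    for k in range(max_iter):
        g = grad(*p)
        if math.hypot(g[0], g[1]) < tol:
            return p, k
        p = (p[0] - step * g[0], p[1] - step * g[1])
    return p, max_iter

start = (1.0, 0.0)  # arbitrary starting point (assumption)

# Original variables: step limited by the largest eigenvalue, 40000.
(x_star, y_star), it_xy = gradient_descent(grad_xy, start, step=1.0 / 40000.0)

# Rescaled variables: Hessian is 2*I, so step = 1/2.
(a_star, b_star), it_ab = gradient_descent(grad_ab, to_ab(*start), step=0.5)
x_back, y_back = to_xy(a_star, b_star)

print("x,y variables:", it_xy, "iterations ->", (x_star, y_star))
print("A,B variables:", it_ab, "iterations ->", (x_back, y_back))
```

In the A,B variables the Hessian is a multiple of the identity, so the fixed step lands on the minimum right away. In the x,y variables the step size is dictated by the steep direction, and the iteration then crawls along the valley.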