I've been struggling with this problem for a while and I can't seem to find much about it. I have a large number of 3x3 matrices which represent linear operators. I'm interested in the properties of these operators and have thus decomposed them into eigenvectors and eigenvalues. Most of these matrices are decomposable over the real field (have real eigenvalues) and these are useful to me, but a small minority have complex eigenvalues, which are of no use to me. So my question is: given a linear operator with complex eigenvalues, how does one find the nearest linear operator with real eigenvalues? Some measure of nearness is important as well.
I reckon this would require some use of the MATLAB function 'norm', but I can't seem to make any progress on this problem. Any suggestions, pointers to relevant texts (as this may be an obscure maths problem), or pointers to relevant MATLAB functions would be greatly appreciated.
Cheers,
Gary
The simple answer is to find the nearest symmetric matrix.
However, there are some non-symmetric matrices with real
eigenvalues.
>> B = rand(3)
B =
0.39223 0.70605 0.046171
0.65548 0.031833 0.097132
0.17119 0.27692 0.82346
>> eig(B)
ans =
-0.49599
1.0481
0.69536
My immediate thought would be to solve it as an
fmincon optimization problem.
>> A = randn(3)
A =
0.55253 0.085931 -1.0616
1.1006 -1.4916 2.3505
1.5442 -0.7423 -0.6156
>> eig(A)
ans =
0.082915
-0.81879 + 1.5839i
-0.81879 - 1.5839i
Thus, find the minimal perturbation to the 3x3 matrix A,
such that eig(A + delta) is entirely real.
delta will be a 3x3 matrix, with easy to choose starting
values.
>> delta0 = (A + A.')/2 - A;
>> obj = @(delta) sum(delta(:).^2);
I might also have chosen to use norm as a measure here.
Regardless, our initial value for the additive perturbation is
the matrix that sends A to a symmetric matrix.
Supply a set of nonlinear equality constraints as an m-file.
% =============================
function [C,Ceq] = mynonlcon(delta,A)
% nonlinear constraints for fmincon eigenvalue problem
C = [];
Ceq = imag(eig(A + delta));
% =============================
Set some options.
>> opts = optimset('fmincon');
>> opts.Algorithm = 'interior-point';
And solve the problem.
>> delta = fmincon(obj,delta0,[],[],[],[],[],[],@(delta) mynonlcon(delta,A),opts)
delta =
-0.041618 0.3092 -0.35434
0.71895 -0.31954 0.15115
-0.70376 0.48807 0.36116
How does this compare to delta0?
>> delta0
delta0 =
0 0.50734 1.3029
-0.50734 0 -1.5464
-1.3029 1.5464 0
Clearly delta0 is bigger.
>> norm(delta)
ans =
1.1796
>> norm(delta0)
ans =
2.0848
>> eig(A + delta0)
ans =
0.8651
-0.38934
-2.0304
>> eig(A)
ans =
0.082915
-0.81879 + 1.5839i
-0.81879 - 1.5839i
Just a quick thought. There may be better solutions, but this
seems to work.
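The same optimization can be sketched with SciPy (a rough transliteration, with scipy.optimize.minimize and SLSQP standing in for fmincon; the eigenvalue constraint is collapsed here into a single sum-of-squares equality, which shares the non-smoothness caveat raised later in the thread):

```python
# SciPy sketch: minimize the squared Frobenius norm of delta subject to
# eig(A + delta) being real, starting from the symmetrizing perturbation.
import numpy as np
from scipy.optimize import minimize

A = np.array([[0.55253,  0.085931, -1.0616],
              [1.1006,  -1.4916,    2.3505],
              [1.5442,  -0.7423,   -0.6156]])

def obj(d):
    return np.sum(d**2)                 # squared Frobenius norm of delta

def imag_parts(d):
    # equality constraint: imaginary parts of eig(A + delta) must vanish
    return np.sum(np.imag(np.linalg.eigvals(A + d.reshape(3, 3)))**2)

delta0 = ((A + A.T) / 2 - A).ravel()    # symmetrizing start, already feasible
res = minimize(obj, delta0, method='SLSQP',
               constraints=[{'type': 'eq', 'fun': imag_parts}])
delta = res.x.reshape(3, 3)
```

Since A + delta0 is symmetric, the starting point already satisfies the constraint, and the solver only has to shrink the perturbation from there.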
John
[V,D] = eig(A);
A1 = real(V*real(D)/V);
The eigenvalues of A1 should be the same as the real parts of A's eigenvalues, and if the imaginary parts were small, then the norm of A-A1 should be small. If the eigenvalues of A were real, then A and A1 should be equal (except for roundoff error).
Note: The outer 'real' operator is only there to remove imaginary parts that are due to roundoff error. The inner 'real' operator produces the essential change here.
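For readers outside MATLAB, the same two lines can be rendered in NumPy (a transliteration, using John's example matrix from above; the eigenvector scaling returned by eig may differ, but A1 does not depend on it):

```python
# NumPy rendering of Roger's construction: keep A's eigenvectors,
# replace each eigenvalue by its real part.
import numpy as np

A = np.array([[0.55253,  0.085931, -1.0616],
              [1.1006,  -1.4916,    2.3505],
              [1.5442,  -0.7423,   -0.6156]])

w, V = np.linalg.eig(A)                 # A = V @ diag(w) @ inv(V)
A1 = np.real(V @ np.diag(np.real(w)) @ np.linalg.inv(V))
# eig(A1) equals real(eig(A)); the outer real() only strips roundoff
```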
Roger Stafford
I conjecture Roger's solution is actually optimal in the sense that it minimizes the spectral norm, norm(A-A1,2); however I can't see an easy proof (I haven't tried hard either). Roger, what is your opinion?
Bruno
abs(det(A-A1)) = abs(det(V*D*inv(V) - V*real(D)*inv(V)))
= abs(det(V*i*imag(D)*inv(V)))
= abs(det(i*imag(D)))
= abs(det(imag(D))) = prod(diag(abs(imag(D))))
V does have an inverse because we are talking about a matrix with three distinct roots, one real and two complex conjugates.
That this is optimal in some sense remains purely conjectural on my part. It just "feels" good.
Roger Stafford
I guess the question is, what metric will you use to define
small?
John
John, I just wonder: in your code, would you be able to change the Frobenius norm to the spectral norm and see if the residual matrix you get is smaller than Roger's?
Thanks,
Bruno
Thanks very much guys, you have been most helpful. I used Roger's solution as it was a bit simpler and more intuitive. I could in any case measure empirically how well the matrix performed relative to the one with complex eigenvalues, as each matrix was a linear transformation between two sets of measured variables (representing a dynamic I was interested in), so I could see how well the new matrix predicted the measured values relative to the old matrix. In each case the difference was negligible, so leaving proofs and such aside, for my particular problem it worked a treat. Thanks again.
Cheers,
Gary
Just thought I'd point out that this is a non-differentiable constraint, so there are certain risks using fmincon. Perhaps only on a set of measure zero, though...
Also, even though Gary has a solution he's happy with, I'll just throw out one more idea. If A is the given matrix with Jordan decomposition
A=inv(S)*J*S
then one could use the solution
Anew=inv(S)*real(J)*S
One can easily show that this minimizes the distance metric
Dist=@(B) norm(S*(A-B)*inv(S),'fro')
subject to the real eigenvalues constraint. Don't know why this metric would be worse than any other. Unfortunately, I haven't come across a way to automate Jordan decomposition in MATLAB...
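For what it's worth, the construction can be prototyped symbolically (a SymPy sketch; MATLAB's jordan lives in the Symbolic Math Toolbox, and sympy's jordan_form plays the same role here — note sympy returns A == P*J*inv(P), so S in the notation above corresponds to inv(P)):

```python
# SymPy sketch of the Jordan-based variant, on the example matrix
# that appears later in the thread; eigenvalues (1 ± i*sqrt(3))/2 and 0.
import sympy as sp

A = sp.Matrix([[1, 1, 0], [-1, 0, 0], [0, 0, 0]])
P, J = A.jordan_form()                  # A == P * J * P.inv()
Anew = P * J.applyfunc(sp.re) * P.inv() # real parts of the Jordan blocks
# For this A, Anew simplifies to diag(1/2, 1/2, 0).
```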
> Unfortunately, I haven't come across a way to automate Jordan decomposition in MATLAB...
============
And that's possibly because it's numerically unstable. But anyway, any similarity transform which exposes the eigenvalues as entries of the matrix could be the basis for this approach.
> V does have an inverse because we are talking about a matrix with three distinct roots, one real and two complex conjugates.
>
> That this is optimal in some sense remains purely conjectural on my part. It just "feels" good.
=======
In that case, it minimizes Dist(A,B) = norm(inv(V)*(A-B)*V,'fro')
The usual way people do optimization is: one first defines the norm, then minimizes it. This logic is not to be turned upside-down. If one picks a specific solution in the admissible set, then arranges the definition of the norm (a definition that depends on the input) so that this solution is the projection, that's not a rigorous method, because one does not really know what is being minimized, and it does not allow the various bound estimates. Furthermore, the eigenvectors can be chosen arbitrarily up to scaling, and that can also change the norm. Almost everything is floating in this norm. That's not good.
Bruno
I would like to have proved that the Frobenius norm, or else some other standard norm, of just the difference A-B was minimized in a way that is independent of A and B, but depending only on their difference.
Roger Stafford
>
> I would like to have proved that the Frobenius norm, or else some other standard norm, of just the difference A-B was minimized in a way that is independent of A and B, but depending only on their difference.
I would think your solution minimizes the 2-norm rather than the Frobenius norm. I do not have an exact proof, but below is the reason why I think it's true:
If we take the Rayleigh–Ritz ratio at x, a normalized eigenvector of A associated with a complex eigenvalue (B being your solution, so A-B is the residual matrix):
A*x = lambda*x, lambda a complex eigenvalue
|x| = 1
then
x'*(A-B)*x / (x'*x) = i*imag(lambda)
This *seems* to be the best one can do if the eigenvalues of B are real. Thus my conjecture is that the 2-norm is minimized (since the 2-norm is bounded below by the Rayleigh–Ritz ratio).
Bruno
Also if S is nearly singular, norm(inv(S)*M*S,'fro') can be very much larger than norm(M,'fro') and therefore perhaps of no particular interest in minimizing.
========
I don't see that, Roger. For instance, for any M of the form c*eye(3), the two norms coincide exactly. Regardless, if S is nearly singular, perhaps it means that you need a more demanding criterion than norm(M,'fro') to get a meaningful approximation of A anyway.
> I would like to have proved that the Frobenius norm, or else some other standard norm, of just the difference A-B was minimized in a way that is independent of A and B, but depending only on their difference.
==========
No, I don't think that's what we want to do. Nor do I agree that commutativity is necessarily of value here. Again, it's reasonable to suppose that different A might require different degrees of approximation for the task at hand and therefore that you have to make the norm adaptive to A.
Suppose, for example, that the task was instead to approximate the function
1/x, with rational numbers y substituted for various points x. Suppose you had some criterion
abs(1/x - 1/y) <= const.
Clearly, the distance measure you would be trying to control is
abs(x-y)/x^2, to account for the greater sensitivity of 1/x near x = 0.
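A tiny numeric illustration of that sensitivity claim (the sample points are arbitrary):

```python
# First-order check: abs(1/x - 1/y) is approximately abs(x - y)/x**2
# for y near x, so a fixed tolerance on the former pins y ever more
# tightly as x shrinks.
x, y = 0.1, 0.1001          # sample points chosen just for illustration
exact = abs(1/x - 1/y)
approx = abs(x - y) / x**2
```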
Bruno, although there are "floating" things in this norm, i.e., the scaling of the eigenvectors, the solution is invariant to them. Perhaps it's worth pointing out that the above norm minimization can be rewritten as the minimization of the following cost function, for any choice of the eigenvector matrix V:
Dist(B) = norm( inv(V)*B*V - D ,'fro')
Moreover, with the change of variables Z=inv(V)*B*V
Dist(Z) = norm( Z - D, 'fro')
you always obtain the solution Z=real(D). Because Z is diagonal, the corresponding B is invariant to how V was scaled columnwise.
>
> Bruno, although there are "floating" things in this norm, i.e., the scaling of the eigenvectors, the solution is invariant to them.
But that's even more disturbing. There is a family of "norms" that depends on the scaling and on the input, whereas the solution does not depend on the scaling. This solution cannot be *characterized* as the minimum of a "clean" cost function. The minimum of the norm is merely a property.
I'll caricature here, but I could equally well claim:
B = V*real(D)*inv(V), the solution proposed by Roger, minimizes the cost function
norm(Ap-B, 'fro'), where Ap := V*real(D)*inv(V) and [V,D] = eig(A).
That claim is of course certainly true, but it has no intrinsic value. In a way you are doing more or less the same thing: you define a norm that is floating.
Bruno
I don't follow that. So what if a solution happens to be the minimum of multiple norms? We know that given an invertible linear equation y=A*x, the solution x=inv(A)*y can be obtained by minimizing ||A*x-y|| for any norm whatsoever. Does that invalidate any of those norms?
> I'll caricature here, but I could equally well claim:
>
> B = V*real(D)*inv(V), the solution proposed by Roger, minimizes the cost function
>
> norm(Ap-B, 'fro'), where Ap := V*real(D)*inv(V) and [V,D] = eig(A).
>
> That claim is of course certainly true, but it has no intrinsic value. In a way you are doing more or less the same thing: you define a norm that is floating.
=========
That's a pretty strong caricature. My cost function is an actual norm of A-B. Your example on the other hand measures the distance between B and something else entirely...
A = %given matrix
1 1 0
-1 0 0
0 0 0
B1 = %Roger's solution
0.5000 0.0000 0
0.0000 0.5000 0
0 0 0
B = %alternative to Roger's solution
0.9000 0 0
-2.0000 -0.1000 0
0 0 0
Clearly both B and B1 have real eigenvalues, but one can directly verify that B achieves lower norms, both Frobenius and spectral:
>> NormsB1 =[norm(A-B1,'fro'), norm(A-B1)]
NormsB1 =
1.5811 1.5000
>> NormsB = [norm(A-B,'fro'), norm(A-B)],
NormsB =
1.4213 1.1000
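The figures above can be reproduced in a few lines (a NumPy transliteration, not the original session):

```python
# Numeric check of the counterexample: B has a real spectrum by
# inspection (it is lower triangular), yet it is strictly closer
# to A than Roger's B1 in both the Frobenius and spectral norms.
import numpy as np

A  = np.array([[ 1.0,  1.0, 0.0], [-1.0,  0.0, 0.0], [0.0, 0.0, 0.0]])
B1 = np.array([[ 0.5,  0.0, 0.0], [ 0.0,  0.5, 0.0], [0.0, 0.0, 0.0]])
B  = np.array([[ 0.9,  0.0, 0.0], [-2.0, -0.1, 0.0], [0.0, 0.0, 0.0]])

fro1, spec1 = np.linalg.norm(A - B1, 'fro'), np.linalg.norm(A - B1, 2)
fro,  spec  = np.linalg.norm(A - B,  'fro'), np.linalg.norm(A - B,  2)
```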
Ah!!! good finding.
Bruno