
sqrtm() of a matrix.


Abed M. Hammoud, D.Sc.
Mar 10, 1997

Hello,

I am trying to decide on the best algorithm to use to calculate the square
root of a matrix. My first attempt at the problem was the following:

For a positive semidefinite matrix A:

1- [E,D] = eig(A); sqrtm(A) = E * sqrt(D) * E', where D is a diagonal matrix
and sqrt(D) is formed by taking the square root of the diagonal entries in D.

Another, equivalent, way is to use:

2- [U,S,V] = svd(A); sqrtm(A) = U * sqrt(S) * V'

3- Then I looked at the book by Golub and Van Loan; under matrix square
root they suggest the following method:

   Use the Cholesky decomposition to write A as A = G G',
   then use the SVD of G, i.e., [u,s,v] = svd(G); the square root of
   A is then sqrtm(A) = u * s * u'
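
(For concreteness, here is a small MATLAB sketch of all three candidates on a
random symmetric positive definite test matrix; the matrix and names are just
for illustration:)

   n = 5;
   B = randn(n);  A = B*B' + eye(n);      % random SPD test matrix
   [E,D] = eig(A);   S1 = E*sqrt(D)*E';   % method 1
   [U,S,V] = svd(A); S2 = U*sqrt(S)*V';   % method 2
   G = chol(A)';                          % method 3: A = G*G', G lower triangular
   [u,s,v] = svd(G); S3 = u*s*u';
   disp([norm(S1*S1-A) norm(S2*S2-A) norm(S3*S3-A)])   % all small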

So my question is: which is the most accurate and best way of doing
the sqrtm operation? Note that MATLAB uses a different method than
the above (I think). Also note that the above methods do not in
general yield the same result.

Thanks for any pointers and help,

- Abed

--
Abed M. Hammoud, D.Sc. ab...@oregano.wustl.edu
IntellX L.L.C. Office: (314) 935-7269
Washington University in St. Louis Fax: (314) 935-7500
Institute of Biomedical Engineering http://oregano.wustl.edu/~abed

Pascal Gahinet
Mar 10, 1997

Option 3 does not seem to have any advantage over 1 and 2, and it is
more expensive, since you need the extra Cholesky factorization.

Now, 2 may be dangerous if one eigenvalue becomes negative
due to rounding errors (then U and V won't match).

Maybe the best solution is:

[u,t] = schur(A);
e = diag(t); % eigenvalues of A
if any(e<0), error('A is not positive definite'), end
sqrtA = u * diag(sqrt(e)) * u';

- pascal


Abed M. Hammoud, D.Sc. wrote:
>
> Hello,
>
> I am trying to decide on the best algorithm to use to calculate the square
> root of a matrix. My first attempt at the problem was the following:
>
> For a positive semidefinite matrix A:
>
> 1- [E,D] = eig(A); sqrtm(A) = E * sqrt(D) * E', where D is a diagonal matrix
> and sqrt(D) is formed by taking the square root of the diagonal entries in D.
>
> Another, equivalent, way is to use:
>
> 2- [U,S,V] = svd(A); sqrtm(A) = U * sqrt(S) * V'
>
> 3- Then I looked at the book by Golub and Van Loan; under matrix square
> root they suggest the following method:
>
>    Use the Cholesky decomposition to write A as A = G G',
>    then use the SVD of G, i.e., [u,s,v] = svd(G); the square root of
>    A is then sqrtm(A) = u * s * u'
>

Cleve Moler
Mar 10, 1997

Abed M. Hammoud, D.Sc. <ab...@oregano.wustl.edu> wrote:
>
> I am trying to decide on the best algorithm to use to calculate the square
> root of a matrix. My first attempt at the problem was the following:
>
> For a positive semidefinite matrix A:
>
> 1- [E,D] = eig(A); sqrtm(A) = E * sqrt(D) * E', where D is a diagonal matrix
> and sqrt(D) is formed by taking the square root of the diagonal entries in D.
>
> Another, equivalent, way is to use:
>
> 2- [U,S,V] = svd(A); sqrtm(A) = U * sqrt(S) * V'
>
> 3- Then I looked at the book by Golub and Van Loan; under matrix square
> root they suggest the following method:
>
>    Use the Cholesky decomposition to write A as A = G G',
>    then use the SVD of G, i.e., [u,s,v] = svd(G); the square root of
>    A is then sqrtm(A) = u * s * u'
>
> So my question is: which is the most accurate and best way of doing
> the sqrtm operation? Note that MATLAB uses a different method than
> the above (I think). Also note that the above methods do not in
> general yield the same result.
>
> Thanks for any pointers and help,
>
> - Abed

Hi.

If your matrix is symmetric and positive definite, then your method 1,
based on eigenvalues and eigenvectors, is the best choice. Your method 2
is the same as method 1 for symmetric, positive definite A because V = U
in this case. But use of "svd", instead of "eig", does not take advantage
of symmetry and is more work. I've never seen your method 3, can't
find it in Golub and Van Loan, and don't believe it will work.

For matrices that are not symmetric, or not positive definite, the
computation of matrix square roots is a much more difficult, and
fascinating, topic. Even the 1-by-1 case is a challenge. The square
root might not exist, and it almost certainly isn't unique.
My brief advice is: don't try it at home without MATLAB.

-- Cleve Moler
mo...@mathworks.com

Robert H. Berman
Mar 11, 1997

You might need to worry about the uniqueness of any "matrix square root."

Do you mean a square root in the sense A = G'*G, or in the sense A = G*G?
Many folks find the first preferable.

Even then, the 2x2 identity matrix [1,0; 0,1] has at least 4 square roots:

   [1,0; 0,1], [-1,0; 0,1], [1,0; 0,-1], and [-1,0; 0,-1]

In fact, for any orthogonal matrix U, U'*U = I, which makes uniqueness a
problem for square roots of I (i.e., there are more than 4 such roots).

Also, more generally, if G is a square root of A in this sense, then U*G is
also a square root of A, since (U*G)'*(U*G) = G'*G = A.
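
(A quick MATLAB illustration of this orthogonal freedom, using the A = G'*G
definition; the matrices are just examples:)

   A = [2 1; 1 2];
   G = chol(A);                        % G'*G = A
   theta = pi/5;                       % any rotation angle will do
   U = [cos(theta) -sin(theta); sin(theta) cos(theta)];
   H = U*G;                            % another root in the G'*G sense
   disp([norm(G'*G - A)  norm(H'*H - A)])   % both ~ eps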

Hope this helps.

--
-------------------------------------------------------------
Robert H. Berman Tel: 617-646-4550
Macsyma Inc. Fax: 617-646-3161
20 Academy Street
Arlington, MA 02174-6463 Email: ber...@macsyma.com
USA URL: http://www.macsyma.com
-------------------------------------------------------------

Abed M. Hammoud, D.Sc. <ab...@oregano.wustl.edu> wrote in article
<33242B...@oregano.wustl.edu>...
> Hello,
>
> I am trying to decide on the best algorithm to use to calculate the square
> root of a matrix. My first attempt at the problem was the following:
>
> For a positive semidefinite matrix A:
>
> 1- [E,D] = eig(A); sqrtm(A) = E * sqrt(D) * E', where D is a diagonal matrix
> and sqrt(D) is formed by taking the square root of the diagonal entries in D.
>
> Another, equivalent, way is to use:
>
> 2- [U,S,V] = svd(A); sqrtm(A) = U * sqrt(S) * V'
>
> 3- Then I looked at the book by Golub and Van Loan; under matrix square
> root they suggest the following method:
>
>    Use the Cholesky decomposition to write A as A = G G',
>    then use the SVD of G, i.e., [u,s,v] = svd(G); the square root of
>    A is then sqrtm(A) = u * s * u'
>
> So my question is: which is the most accurate and best way of doing
> the sqrtm operation? Note that MATLAB uses a different method than
> the above (I think). Also note that the above methods do not in
> general yield the same result.
>
> Thanks for any pointers and help,
>
> - Abed
>

George Mazurkevitch
Mar 11, 1997

Hi! There are some efficient iterative methods. Look, for example, at:

Higham, N.J.: Newton's method for the matrix square root. Math. Comp.,
vol. 46, 1986, pp. 537-549.

Higham, N.J.: Computing real square roots of a real matrix.
Linear Algebra Appl., vol. 88/89, 1987, pp. 405-430.

Lakic, S.: An iterative method for the computation of a matrix
inverse square root. ZAMM (Z. angew. Math. Mech.), vol. 75, no. 11,
1995, pp. 867-873.
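
(The classical Newton iteration analyzed in the first paper above, as a rough
sketch; Higham shows this unscaled form can be numerically unstable, so treat
it as illustrative only:)

   A = hilb(4) + eye(4);     % an arbitrary well-conditioned SPD test matrix
   X = A;                    % X0 = A
   for k = 1:25
       X = (X + X\A)/2;      % Newton step: X_{k+1} = (X_k + inv(X_k)*A)/2
   end
   disp(norm(X*X - A))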

George.

Jesse Bennett
Mar 11, 1997

In article <5g2hqd$m...@acoma.mathworks.com>,
mo...@mathworks.com (Cleve Moler) writes:

> I've never seen your method 3, can't
> find it in Golub and Van Loan, and don't believe it will work.

Third edition, sec. 4.2.10, p. 149. Summarizing from the text:

A = GG'
G = USV' is the SVD of G
X = USU'

A = GG' = (USV')(USV')' = U(S^2)U' = (USU')(USU') = X^2

=> X is a square root of A
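
(A quick numerical check, as a sketch; chol(A)' gives the lower-triangular
factor with A = G*G':)

   A = hilb(5);
   G = chol(A)';             % A = G*G'
   [U,S,V] = svd(G);
   X = U*S*U';
   disp(norm(X*X - A))       % small, up to rounding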

Looks good to me.

--Jesse

G. W. Stewart
Mar 12, 1997

In article <5g2hqd$m...@acoma.mathworks.com>,
Cleve Moler <mo...@mathworks.com> wrote:
#Abed M. Hammoud, D.Sc. <ab...@oregano.wustl.edu> wrote:
#>
#> I am trying to decide on the best algorithm to use to calculate the square
#> root of a matrix. My first attempt at the problem was the following:
#>
#> For a positive semidefinite matrix A:
#>
#> 1- [E,D] = eig(A); sqrtm(A) = E * sqrt(D) * E', where D is a diagonal matrix
#> and sqrt(D) is formed by taking the square root of the diagonal entries in D.
#>
#> Another, equivalent, way is to use:
#>
#> 2- [U,S,V] = svd(A); sqrtm(A) = U * sqrt(S) * V'
#>
#> 3- Then I looked at the book by Golub and Van Loan; under matrix square
#> root they suggest the following method:
#>
#>    Use the Cholesky decomposition to write A as A = G G',
#>    then use the SVD of G, i.e., [u,s,v] = svd(G); the square root of
#>    A is then sqrtm(A) = u * s * u'
#>
#> So my question is: which is the most accurate and best way of doing
#> the sqrtm operation? Note that MATLAB uses a different method than
#> the above (I think). Also note that the above methods do not in
#> general yield the same result.
#>
#> Thanks for any pointers and help,
#>
#> - Abed
#
#Hi.
#
#If your matrix is symmetric and positive definite, then your method 1,
#based on eigenvalues and eigenvectors, is the best choice. Your method 2
#is the same as method 1 for symmetric, positive definite A because V = U
#in this case. But use of "svd", instead of "eig", does not take advantage
#of symmetry and is more work. I've never seen your method 3, can't
#find it in Golub and Van Loan, and don't believe it will work.
#
#For matrices that are not symmetric, or not positive definite, the
#computation of matrix square roots is a much more difficult, and
#fascinating, topic. Even the 1-by-1 case is a challenge. The square
#root might not exist, and it almost certainly isn't unique.
#My brief advice is: don't try it at home without MATLAB.
#
# -- Cleve Moler
# mo...@mathworks.com

For positive semidefinite matrices, all three methods work in
principle, the first being the most efficient (however, you have
to set to zero any negative eigenvalues that rounding error might
produce).

It is worth noting that many applications can be recast to use
a Cholesky decomposition instead of the square root.
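
(For instance, here is a sketch of whitening data against an illustrative
covariance matrix with a triangular solve instead of sqrtm:)

   A = [4 2; 2 3];           % example SPD covariance matrix
   R = chol(A);              % R'*R = A
   y = randn(2,10000);       % samples with identity covariance
   x = R'*y;                 % correlated samples: cov(x) ~ A
   w = R'\x;                 % whitened with a triangular solve: cov(w) ~ I
   disp(cov(w'))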

Cleve was referring to indefinite and nonsymmetric matrices.
Even MATLAB (at home, at the office, and even on the beach) can
have problems. But I did find something funny. When I run
the script

small = 1.e-5;
for n = 2:5
   a = small*diag(randn(n,1)) + diag(ones(n-1,1),1);   % perturbed Jordan block
   b = sqrtm(a);
   c = a^.5;
   disp([norm(b*b-a), norm(c*c-a)])
end

I get the following output (error messages deleted)

2.2204e-16 5.5511e-17
1.5172e-02 2.2555e-10
1.4740e+04 4.3826e-06
1.2471e+06 8.3380e+01

The exponential function, which I believe is based on the
eigendecomposition, gives the accuracy one deserves. But sqrtm blows
it.

Pete Stewart

John Hench
Mar 12, 1997

Don't forget:

@article{Bjorck83,
author= {A. Bj\"{o}rck and S.J. Hammarling},
title= {A Schur Method for the Square Root of a Matrix},
journal= {Lin. Alg. Appl.},
volume= {52/53},
year= {1983},
pages= {127-140}
}

and the related paper:

@TechReport{Kenney95,
author = "C.S. Kenney and A.J. Laub",
title = "A {Schur-Fr\'{e}chet} Algorithm for Computing the
Logarithm of a Matrix",
institution = "University of California, Santa Barbara",
year = "1995",
number = "CCEC-95-0714",
address = "Department of Electrical and Computer Engineering,
University of California, Santa Barbara CA 93106",
month = "July"
}
--
-------------------------------------------------
Dr. J.J. Hench
Dept. of Mathematics, Univ. of Reading, England
Institute of Informatics and Automation, Prague
-------------------------------------------------

Bernard Danloy
Mar 13, 1997

In article <5g2hqd$m...@acoma.mathworks.com>, mo...@mathworks.com (Cleve
Moler) wrote:

> Abed M. Hammoud, D.Sc. <ab...@oregano.wustl.edu> wrote:
> >
> > I am trying to decide on the best algorithm to use to calculate the
> > square root of a matrix. ...
> >
> > 3- Then I looked at the book by Golub and Van Loan; under matrix
> > square root they suggest the following method:
> >
> >    Use the Cholesky decomposition to write A as A = G G',
> >    then use the SVD of G, i.e., [u,s,v] = svd(G); the square root of
> >    A is then sqrtm(A) = u * s * u'
> ...
> If your matrix is symmetric and positive definite, then your method 1,
> based on eigenvalues and eigenvectors, is the best choice. Your method 2
> is the same as method 1 for symmetric, positive definite A because V = U
> in this case. But use of "svd", instead of "eig", does not take advantage
> of symmetry and is more work. I've never seen your method 3, can't
> find it in Golub and Van Loan, and don't believe it will work.
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
> ...
> -- Cleve Moler
> mo...@mathworks.com

That method 3 is clearly unusual and complicated, but it is obviously right.

Bernard Danloy

John Hench
Mar 14, 1997

Pascal Gahinet wrote:

> Maybe the best solution is:
>
> [u,t] = schur(A);
> e = diag(t); % eigenvalues of A
> if any(e<0), error('A is not positive definite'), end
> sqrtA = u * diag(sqrt(e)) * u';
>
> - pascal

For those who wish to implement the algorithm above,
keep in mind that it assumes A is symmetric.

-- John

PS. Hi Pascal, how's it going?

Cleve Moler
Mar 14, 1997

Abed M. Hammoud, D.Sc. <ab...@oregano.wustl.edu> started this discussion with:

> I am trying to decide on the best algorithm to use to calculate the square
> root of a matrix. My first attempt at the problem was the following:
>
> For a positive semidefinite matrix A:
>
> 1- [E,D] = eig(A); sqrtm(A) = E * sqrt(D) * E', where D is a diagonal matrix
> and sqrt(D) is formed by taking the square root of the diagonal entries in D.
>
> Another, equivalent, way is to use:
>
> 2- [U,S,V] = svd(A); sqrtm(A) = U * sqrt(S) * V'
>
> 3- Then I looked at the book by Golub and Van Loan; under matrix square
> root they suggest the following method: use the Cholesky decomposition to
> write A as A = G G', then use the SVD of G, i.e., [u,s,v] = svd(G); the
> square root of A is then sqrtm(A) = u * s * u'
>
> So my question is: which is the most accurate and best way of doing
> the sqrtm operation. ...

I responded with:

> If your matrix is symmetric and positive definite, then your method 1,
> based on eigenvalues and eigenvectors, is the best choice. Your method 2
> is the same as method 1 for symmetric, positive definite A because V = U
> in this case. But use of "svd", instead of "eig", does not take advantage
> of symmetry and is more work. I've never seen your method 3, can't
> find it in Golub and Van Loan, and don't believe it will work.
> ...

Several other people joined the discussion, including my long-time
colleague Pete Stewart.

Here are some additional thoughts on the overall problem, and then
a response to Pete's "funny" results.

First of all, what is a matrix square root? If A is any square matrix,
then MATLAB's notion of a square root of A is any matrix S which
satisfies

S*S = S^2 = A

There are no transposes involved in this definition. But, as has been
pointed out, there are other matrices which act like a square root.
Any matrix X which satisfies X'*X = A, or X*X' = A might also be thought
of as a square root. If such an X exists, and A is real, then A must be
symmetric and, in fact, positive definite. Such matrices have a Cholesky
decomposition.

The classic Cholesky decomposition is lower triangular (see Householder,
The Theory of Matrices in Numerical Analysis)

L*L' = A

But, it was Pete's idea with LINPACK to use an upper triangular decomposition
so that the Cholesky algorithm would be column oriented.

R'*R = A

MATLAB inherited this convention; our Cholesky decomposition is upper
triangular.

So, we have at least three different candidates for square root -- S, L and R.

If A is diagonal, with positive diagonal entries, we can take the positive
scalar square roots of the diagonal entries. The resulting diagonal
matrix could be called S, L or R -- all three are the same in this case.
But we could also put minus signs in front of any of the diagonal entries
and obtain other matrices which are square roots of A.

If A is symmetric and positive definite, but not diagonal, then S, L and R
are not the same. It is possible, and reasonable, to also make S symmetric
and positive definite. But L and R certainly aren't symmetric -- they
are triangular. Here's an example.

>> A = hilb(4)

A =
    1.0000    0.5000    0.3333    0.2500
    0.5000    0.3333    0.2500    0.2000
    0.3333    0.2500    0.2000    0.1667
    0.2500    0.2000    0.1667    0.1429

>> S = sqrtm(A)

S =
    0.9115    0.3390    0.1927    0.1310
    0.3390    0.3529    0.2475    0.1807
    0.1927    0.2475    0.2411    0.2086
    0.1310    0.1807    0.2086    0.2226

>> R = chol(A)

R =
    1.0000    0.5000    0.3333    0.2500
         0    0.2887    0.2887    0.2598
         0         0    0.0745    0.1118
         0         0         0    0.0189

>> L = R'

L =
    1.0000         0         0         0
    0.5000    0.2887         0         0
    0.3333    0.2887    0.0745         0
    0.2500    0.2598    0.1118    0.0189

Each of S, L and R satisfies its defining equation to within roundoff error.

For symmetric, positive definite A, MATLAB computes sqrtm(A) with the
first algorithm that Hammoud mentions:

[V,D] = eig(A)
S = V*diag(sqrt(diag(D)))*V'

Hammoud's second algorithm is a more expensive way of computing the same
thing. I was unsure about Hammoud's third algorithm for three reasons:
I looked in the second, not the third, edition of Golub and Van Loan;
there is a confusion between the lower and upper Cholesky decompositions;
and the algorithm doesn't generalize to nonsymmetric or indefinite matrices.

Now let's look at Pete's "funny" results. His matrices have some random
entries, but one of them might be

A =
  -1.4410e-05   1.0000e+00            0            0
            0   5.7115e-06   1.0000e+00            0
            0            0  -3.9989e-06   1.0000e+00
            0            0            0   6.9000e-06

This matrix is not symmetric, certainly not positive definite, and is
dangerously close to:

A =
     0     1     0     0
     0     0     1     0
     0     0     0     1
     0     0     0     0

This is an instance of sqrtm's worst nightmare. Its square root does not
exist. Complex numbers and infinities don't help. There simply is no
matrix S for which S*S is equal to A. (If there were, then S^8 = A^4 = 0
would make S nilpotent, so S^4 = 0; but then A^2 = S^4 = 0, contradicting
the fact that A^2 is nonzero.)

As Pete's parameter "small" gets even smaller, his matrices approach
ones for which sqrtm must fail. The results don't exist. Even when
small = 1.e-5, sqrtm knows it's in trouble and prints out warning
messages which Pete has omitted.

The sqrtm function is based on a clever observation known as Parlett's
algorithm. It does involve eigenvalues. In contrast, the exponential
function, expm, does not use eigenvalues or Parlett's algorithm.
You can see the details with

type sqrtm
type expm1

It is true that sqrtm could be improved by using a block form of Parlett's
algorithm. But I'm not sure if that would make Pete happy. It's
impossible to compute things which don't exist. It's difficult to
compute things which almost don't exist.

-- Cleve Moler
mo...@mathworks.com

zippy
Mar 14, 1997

Cleve Moler wrote:

> Here are some additional thoughts on the overall problem, and then
> a response to Pete's "funny" results.
>
> First of all, what is a matrix square root? If A is any square matrix,
> then MATLAB's notion of a square root of A is any matrix S which
> satisfies
>
> S*S = S^2 = A
>
> There are no transposes involved in this definition. But, as has been
> pointed out, there are other matrices which act like a square root.
> Any matrix X which satisfies X'*X = A, or X*X' = A might also be thought
> of as a square root. If such an X exists, and A is real, then A must be
> symmetric and, in fact, positive definite. Such matrices have a Cholesky
> decomposition.

[snip]


> So, we have at least three different candidates for square root -- S, L and R.

[snip]


> If A is symmetric and positive definite, but not diagonal, then S, L and R
> are not the same. It is possible, and reasonable, to also make S symmetric
> and positive definite.


This brings up a long-standing question I have had about
square roots of covariance matrices (which are symmetric and positive
definite). Obviously the square root is not unique. However:

a) If we also specify that the square root must be symmetric and positive
definite, does *that* guarantee uniqueness?

b) Does it matter? The problem where this arose was in the computation of a
Gauss-Markov (optimal) solution to the stochastic matrix problem Ax = y + n
(where n is noise, and the covariance matrices Rnn and Rxx are known).

The standard technique is to apply the SVD to the scaled matrix

   \tilde{A} = Rnn^{-1/2} * A * Rxx^{1/2}

and proceed. Now, in all of the derivations I have seen for this, no one
seems to comment on the sqrtm uniqueness issue, but it's not obvious
to me that it is unimportant.
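
(A sketch of that scaling step in MATLAB, with made-up covariance matrices;
for an SPD argument, sqrtm returns the symmetric positive definite root:)

   m = 3; n = 4;
   A = randn(m,n);
   M = randn(m); Rnn = M*M' + m*eye(m);   % made-up SPD noise covariance
   N = randn(n); Rxx = N*N' + n*eye(n);   % made-up SPD signal covariance
   Atil = sqrtm(Rnn) \ A * sqrtm(Rxx);    % Rnn^{-1/2} * A * Rxx^{1/2}
   [U,S,V] = svd(Atil);                   % ... and proceed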


Rich.


--
Rich Pawlowicz
Oceanography, Dept. of Earth and Ocean Sciences, Univ. of British Columbia,
6270 University Blvd., Vancouver, B.C. CANADA V6T 1Z4
email: ri...@ocgy.ubc.ca ph: (604) 822-1356 fax: (604) 822-6091

John Hench
Mar 15, 1997

Cleve Moler wrote:

> It is true that sqrtm could be improved by using a block
> form of Parlett's algorithm. But I'm not sure if that
> would make Pete happy. It's impossible to compute things
> which don't exist. It's difficult to compute things which
> almost don't exist.

You might think about coding up Kenney's and
Laub's Schur-Frechet method for computing
functions of matrices, including expm() and
logm(), etc. It has the nice property that it
doesn't break down for multiple e-vals as
Parlett's algorithm does, plus it is miles more
accurate than anything else out there, especially
for these sketchy problems. BTW, Bjorck and
Hammarling's method for sqrtm() happens to be a
Schur-Frechet method, and so produces nice
results as well.
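
(For the curious, a bare-bones sketch of the Bjorck-Hammarling Schur
recurrence; illustrative only, with none of the blocking or the care with
near-zero divisors that a real implementation needs:)

   function X = sqrtm_schur(A)
   % Sketch of the Bjorck-Hammarling Schur method for X*X = A.
   [Q,T] = schur(A);
   [Q,T] = rsf2csf(Q,T);          % complex Schur form: T upper triangular
   n = size(T,1);
   S = zeros(n);
   for j = 1:n
       S(j,j) = sqrt(T(j,j));
       for i = j-1:-1:1
           k = i+1:j-1;           % S(i,k)*S(k,j) is 0 when k is empty
           S(i,j) = (T(i,j) - S(i,k)*S(k,j)) / (S(i,i) + S(j,j));
       end
   end
   X = Q*S*Q';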

References:

@article{Bjorck83,
author= {A. Bj\"{o}rck and S.J. Hammarling},
title= {A Schur Method for the Square Root of a Matrix},
journal= {Lin. Alg. Appl.},
volume= {52/53},
year= {1983},
pages= {127-140}
}

@TechReport{Kenney95,
author = "C.S. Kenney and A.J. Laub",
title = "A {Schur-Fr\'{e}chet} Algorithm for Computing the
Logarithm of a Matrix",
institution = "University of California, Santa Barbara",
year = "1995",
number = "CCEC-95-0714",
address = "Department of Electrical and Computer Engineering,
University of California, Santa Barbara CA 93106",
month = "July"
}

Kenney and Laub's paper has been expanded to
include the matrix exponential, btw. I guess you
might have to write to one of the authors to get
the latest version.

Cheers, John

--

Ron Shepard
Mar 15, 1997

In article <5gc1cs$r...@acoma.mathworks.com>, mo...@mathworks.com (Cleve
Moler) wrote:

[...]


> But we could also put minus signs in front of any of the diagonal entries
> and obtain other matrices which are square roots of A. [...]

If the eigenvalues are distinct, then there are 2^n choices for these
signs for a (symmetric positive semidefinite) matrix of dimension n,
leading to 2^n distinct matrix square roots. If there are degeneracies
among the nonzero eigenvalues, then there are also continuous families
of matrix square roots, associated with (real) rotations within the
degenerate subspace(s).

If one is interested in being able to perform formal manipulations of the
matrix elements of the square roots, then a specific choice must be made
of these 2^n possibilities. In my own work in chemistry involving this
situation, it was easiest to choose positive diagonal entries, which is
equivalent to choosing the square root matrix to be also symmetric and
positive definite. This allows, for example, the matrix derivatives of A
(w.r.t. some set of parameters upon which the matrix elements of A depend)
to be used to define a consistent series expansion of SQRT(A).
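
(A tiny MATLAB illustration of the 2^n sign choices, for n = 2 and an
arbitrary SPD test matrix:)

   A = [2 1; 1 2];
   [V,D] = eig(A);  d = sqrt(diag(D));
   signs = [1 1; 1 -1; -1 1; -1 -1];      % the 2^2 sign patterns
   for k = 1:4
       S = V*diag(signs(k,:)'.*d)*V';     % one square root per pattern
       disp(norm(S*S - A))                % each residual ~ eps
   end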

$.02 -Ron Shepard
