an optimization problem

Roy

unread,

Oct 9, 2006, 11:39:39 PM10/9/06

to

Let Y be a constant matrix, let X be an unknown matrix with full column
rank.
Let f = det(X'X)^{-m/2} + etr (Y (X'X)^{-1} Y').
Here X' denotes the matrix transpose, det(.) is the matrix determinant,
X^{-1} is the inverse of X.
I am interested in finding matrix X that maximize the function f.
Anyone knows how to do that?

Thanks a lot
Roy

Roy

unread,

Oct 9, 2006, 11:41:50 PM10/9/06

to

And etr(.) is the exponential trace function etr(.) = exp( trace(.) ).

eltetot

unread,

Oct 10, 2006, 2:47:11 AM10/10/06

to

Hello,

Please correct me if I'm wrong.

1. Are you dealing with real matrices?
2. It is enough to deal with X'X instead of X
3. X'X is a symmetric and regular matrix
4. Since it is symmetrix, it is diagonalisable with an orthonormal
similarity transformation
5. In fact, what you have is an optimisation problem of N variables:
D = Diag(x1,...,xn)

f = det(D)^{-m/2} + etr (Y D^{-1} Y')

I suggest you to apply derivatives then to find the optimum.

BR, Tamas

Duncan Muirhead

unread,

Oct 10, 2006, 4:45:24 AM10/10/06

to

If B = Y'*Y and X'*X = c*I for positive c c then
f = 1/pow( c, m*dim/2) + exp( Tr(B)/c)
As c->0 f->+inf
Duncan

Robert Israel

unread,

Oct 10, 2006, 2:12:56 PM10/10/06

to

In article <1160462831....@i42g2000cwa.googlegroups.com>,

eltetot <elt...@freemail.hu> wrote:
>Hello,
>
> Please correct me if I'm wrong.
>
>1. Are you dealing with real matrices?
>2. It is enough to deal with X'X instead of X
>3. X'X is a symmetric and regular matrix
>4. Since it is symmetrix, it is diagonalisable with an orthonormal
>similarity transformation
>5. In fact, what you have is an optimisation problem of N variables:
>D = Diag(x1,...,xn)
>
>f = det(D)^{-m/2} + etr (Y D^{-1} Y')

Not quite: if X'X = U D U', then
etr(Y (X'X)^(-1) Y') = etr(Y U D^(-1) U' Y')
and the dependence on U can't be removed.

Robert Israel isr...@math.ubc.ca
Department of Mathematics http://www.math.ubc.ca/~israel
University of British Columbia Vancouver, BC, Canada

Robert Israel

unread,

Oct 10, 2006, 2:14:40 PM10/10/06

to

In article <1160451706.1...@i3g2000cwc.googlegroups.com>,

The second term is nonnegative, but the first term can be arbitrarily
large. So f is unbounded, and there is no maximum.

Roy

unread,

Oct 10, 2006, 2:39:30 PM10/10/06

to

Ooops, I am really sorry I made a big typing mistake. The function
should be
f = det(X'X)^{-m/2} * etr ( -Y (X'X)^{-1} Y').
Both X and Y are real matrices. X has full column rank. So is this
function has a maximum?

Thanks
Roy

Robert Israel

unread,

Oct 10, 2006, 3:08:56 PM10/10/06

to

In article <1160505570....@i3g2000cwc.googlegroups.com>,
Roy <roy...@hotmail.com> wrote:

>Ooops, I am really sorry I made a big typing mistake. The function
>should be
>f = det(X'X)^{-m/2} * etr ( -Y (X'X)^{-1} Y').
>Both X and Y are real matrices. X has full column rank. So is this
>function has a maximum?

Take logarithms, noting that det(A) = etr(ln(A)) for
positive-definite matrices A. Writing Z = (X'X)^(-1), we have

ln f = m/2 tr(ln(Z)) - tr(Y Z Y')

If Y is invertible, this should have a maximum. But if not, it
isn't: if x is any vector with Y x = 0, let Z be a positive-definite
matrix for which x is an eigenvector with eigenvalue t. Then as
t increases (with the other eigenvalues staying fixed) the second
term does not change, while the first term goes to infinity.

Roy

unread,

Oct 10, 2006, 3:59:42 PM10/10/06

to

Robert Israel wrote:
> In article <1160505570....@i3g2000cwc.googlegroups.com>,
> Roy <roy...@hotmail.com> wrote:
>
> >Ooops, I am really sorry I made a big typing mistake. The function
> >should be
> >f = det(X'X)^{-m/2} * etr ( -Y (X'X)^{-1} Y').
> >Both X and Y are real matrices. X has full column rank. So is this
> >function has a maximum?
>
> Take logarithms, noting that det(A) = etr(ln(A)) for
> positive-definite matrices A. Writing Z = (X'X)^(-1), we have

How do you define ln(A)? natural log of each element of A?
If it is true, then etr(ln(A)) is the product of the diagonal elements
of A.
det(A) is the product of all eigenvalues (characteristic roots) of A.
How could these two be the same?

>
> ln f = m/2 tr(ln(Z)) - tr(Y Z Y')
>
> If Y is invertible, this should have a maximum.

Do you mean Y should have full column rank?

Robert Israel

unread,

Oct 10, 2006, 4:37:58 PM10/10/06

to

In article <1160510382.2...@h48g2000cwc.googlegroups.com>,

Roy <roy...@hotmail.com> wrote:
>Robert Israel wrote:
>> In article <1160505570....@i3g2000cwc.googlegroups.com>,
>> Roy <roy...@hotmail.com> wrote:
>>
>> >Ooops, I am really sorry I made a big typing mistake. The function
>> >should be
>> >f = det(X'X)^{-m/2} * etr ( -Y (X'X)^{-1} Y').
>> >Both X and Y are real matrices. X has full column rank. So is this
>> >function has a maximum?
>>
>> Take logarithms, noting that det(A) = etr(ln(A)) for
>> positive-definite matrices A. Writing Z = (X'X)^(-1), we have
>
>How do you define ln(A)? natural log of each element of A?

No, it's using the functional calculus. Explicitly, if A (which in
this case is positive definite) can be diagonalized as A = U D U',
where U is orthogonal and D is diagonal, then ln(A) = U C U'
where C is diagonal with diagonal entries C_{ii} = ln(D_{ii}).

Roy

unread,

Oct 10, 2006, 5:16:05 PM10/10/06

to

Robert Israel wrote:
> In article <1160510382.2...@h48g2000cwc.googlegroups.com>,
> Roy <roy...@hotmail.com> wrote:
> >Robert Israel wrote:
> >> In article <1160505570....@i3g2000cwc.googlegroups.com>,
> >> Roy <roy...@hotmail.com> wrote:
> >>
> >> >Ooops, I am really sorry I made a big typing mistake. The function
> >> >should be
> >> >f = det(X'X)^{-m/2} * etr ( -Y (X'X)^{-1} Y').
> >> >Both X and Y are real matrices. X has full column rank. So is this
> >> >function has a maximum?
> >>
> >> Take logarithms, noting that det(A) = etr(ln(A)) for
> >> positive-definite matrices A. Writing Z = (X'X)^(-1), we have
> >
> >How do you define ln(A)? natural log of each element of A?
>
> No, it's using the functional calculus. Explicitly, if A (which in
> this case is positive definite) can be diagonalized as A = U D U',
> where U is orthogonal and D is diagonal, then ln(A) = U C U'
> where C is diagonal with diagonal entries C_{ii} = ln(D_{ii}).

Oh, thank you for reminding me of this. I am not good at matrix
calculus at all. So if Y is invertible, how to solve the maximum
problem:

ln f = m/2 tr(ln(Z)) - tr(Y Z Y')

Any hints?

Robert Israel

unread,

Oct 10, 2006, 6:09:05 PM10/10/06

to

In article <1160514965.4...@c28g2000cwb.googlegroups.com>,

Write f = (det Z)^(m/2) etr(-Y Z Y').
Let Z = Y^(-1) R (Y')^(-1). Then
f = det(Y)^(-m) det(R)^(m/2) etr(-R), so we need to maximize
det(R)^(m/2) etr(-R) over positive-definite matrices R.
Now if R has eigenvalues lambda_j (counted by multiplicity)
det(R)^(m/2) etr(-R) = product_j lambda_j^(m/2) exp(-lambda_j).
The function g(z) = z^(m/2) exp(-z) has its maximum for z >= 0
at z = m/2, so the maximum of f is attained when all
lambda_j = m/2. Thus we can take R = m/2 I,
Z = m/2 Y^(-1)(Y')^(-1) = m/2 (Y' Y)^(-1),
and since I defined Z = (X' X)^(-1), we may as well take
X = sqrt(2/m) Y.

Roy

unread,

Oct 11, 2006, 6:01:05 PM10/11/06

to

Thank you very much! This is very neat. All I want to say is Math is
beautiful!
I have a follow-up question. What if some columns of X is already
known and fixed, and some others are unknown? In that case, we want to
find the value of those unknowns to maximize the function.

Roy