Support for Non-symmetric Eigenvalues, SVD, Norm

Mark Stone

Jan 20, 2014, 3:26:45 PM
to alg...@googlegroups.com
Can you please tell me the current status of support in algopy for forward- or reverse-mode first derivatives (gradient or Jacobian) and forward-over-reverse Hessians of functions of:
1) eigenvalues of a non-symmetric matrix, possibly with multiplicity > 1
1a) sum of k largest magnitude eigenvalues of non-symmetric matrix
1b) sum of k smallest magnitude eigenvalues of non-symmetric matrix
2) singular values of a non-symmetric matrix
3) L^p norm for various p in [1, infinity]; note: differentiation is with respect to the matrix argument, not with respect to p. Of course, for p = 2, this is the same as the maximum singular value.

Thanks.

Sebastian Walter

Jan 21, 2014, 5:05:52 AM
to alg...@googlegroups.com
Hello Mark,

1) EIG:  Algopy has support for the symmetric, but not for the general eigenvalue decomposition.
I could add support for the general eigenvalue decomposition rather quickly, with the following restrictions:
* only first-order forward, first-order reverse and forward-over-reverse, but not higher-order forward derivatives
* only for the case of distinct eigenvalues
* only for diagonalizable matrices

For the other cases: 1a) and 1b) may be well-defined even if the eigenspaces are not unique or the matrix is not diagonalizable.
However, one would have to make a careful analysis first.
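
For reference, a minimal sketch of the symmetric case that already works today (my own example, not from the thread; it uses the usual UTPM init_jacobian/extract_jacobian pattern and assumes algopy.eigh returns the eigenvalue vector and eigenvector matrix):

import numpy
import algopy

def sym_eigenvalues(x):
    # pack the 6 independent entries of a 3x3 symmetric matrix
    A = algopy.zeros((3, 3), dtype=x)
    A[0, 0], A[1, 1], A[2, 2] = x[0], x[1], x[2]
    A[0, 1] = A[1, 0] = x[3]
    A[0, 2] = A[2, 0] = x[4]
    A[1, 2] = A[2, 1] = x[5]
    l, Q = algopy.eigh(A)      # eigenvalues l (ascending), eigenvectors Q
    return l

x = algopy.UTPM.init_jacobian(numpy.array([3., 2., 1., 0.1, 0.2, 0.3]))
J = algopy.UTPM.extract_jacobian(sym_eigenvalues(x))   # d(eigenvalues)/dx, shape (3, 6)
print(J)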

2) SVD: higher-order derivatives in the forward mode should work, but it is not very well tested. The reverse mode should also work for distinct singular values, but I'll have to update a couple of lines of code to make it work.
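
In the meantime, a forward-mode workaround that only needs the symmetric eigendecomposition algopy already ships (a sketch of mine, valid where the singular values are distinct and nonzero): sigma_i(A) = sqrt(lambda_i(A^T A)).

import numpy
import algopy

def singular_values(x, m=5, n=3):
    # rebuild the m x n matrix from a flat vector of its entries
    A = algopy.zeros((m, n), dtype=x)
    for i in range(m):
        for j in range(n):
            A[i, j] = x[i * n + j]
    l, Q = algopy.eigh(algopy.dot(A.T, A))   # eigenvalues of A^T A, ascending
    return algopy.sqrt(l)                    # sigma_i = sqrt(lambda_i)

x = algopy.UTPM.init_jacobian(numpy.random.rand(15))
J = algopy.UTPM.extract_jacobian(singular_values(x))   # d(sigma)/d(vec A), shape (3, 15)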

3) L^p norms: they are non-differentiable at many points since they contain max(.,.) and |.| operations, so I'd recommend avoiding their use in any code that gets differentiated.

cheers,
Sebastian

Mark Stone

Jan 21, 2014, 6:46:00 AM
to alg...@googlegroups.com
Thanks.

How about L^p for 1 < p < infinity, with p = 2 being a particularly common choice, which, as said above, equals the maximum singular value?

With regard to the various situations in my list for which the first derivative doesn't exist at some (generally small number of) points, could algopy provide derivatives which are correct except at those points, and if so, is there any way to communicate a warning? It turns out that it is often possible to blast through those points with a nonlinear optimizer intended for smooth functions, so having derivatives of an eigenvalue which are correct except where the multiplicity is greater than 1 might still be useful, for instance for optimizing matrix entries with respect to an eigenvalue criterion. I agree on the need for diagonalizability, though. You should also often be able to blast through singular values with multiplicity greater than 1 (for which the "derivative" might not be correct), and there are no worries about diagonalizability with the SVD.

Sebastian Walter

Jan 21, 2014, 7:12:36 AM
to alg...@googlegroups.com
On Tue, Jan 21, 2014 at 12:46 PM, Mark Stone <stochasti...@yahoo.com> wrote:
Thanks.

How about L^p for 1 < p < infinity, with p = 2 being a particularly common choice, which, as said above, equals the maximum singular value?

One can implement functions for the L^p norms rather easily: they are just a couple of lines of code. Having the user implement them also makes the user aware of the non-differentiability and the issues involved.
So I'd refrain from adding them to AlgoPy.
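
For concreteness, one reading of "a couple of lines of code" is an entrywise (vector-style) p-norm built from smooth primitives; the induced matrix 2-norm Mark mentions would instead go through the maximum singular value. This is a sketch of mine, assuming UTPM supports real-valued exponents via **; it writes |x|^p as (x^2)^(p/2) to avoid abs(), so for 1 < p < infinity it is differentiable wherever no entry is exactly zero.

import numpy
import algopy

def entrywise_pnorm(x, p):
    # ||x||_p = (sum_i |x_i|^p)^(1/p), with |x_i|^p written as (x_i^2)^(p/2)
    return algopy.sum((x * x) ** (p / 2.0)) ** (1.0 / p)

x = algopy.UTPM.init_jacobian(numpy.array([1.0, -2.0, 3.0]))
g = algopy.UTPM.extract_jacobian(entrywise_pnorm(x, 3.0))   # gradient, shape (3,)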


With regard to the various situations in my list for which the first derivative doesn't exist at some (generally small number of) points, could algopy provide derivatives which are correct except at those points, and if so, is there any way to communicate a warning? It turns out that it is often possible to blast through those points with a nonlinear optimizer intended for smooth functions, so having derivatives of an eigenvalue which are correct except where the multiplicity is greater than 1 might still be useful, for instance for optimizing matrix entries with respect to an eigenvalue criterion. I agree on the need for diagonalizability, though. You should also often be able to blast through singular values with multiplicity greater than 1 (for which the "derivative" might not be correct), and there are no worries about diagonalizability with the SVD.

The singular values are directionally differentiable, so the forward mode will always return a useful result. 

About the reverse mode: I believe one can adapt the SVD and symmetric EVD to make a best effort when singular values or eigenvalues coalesce -- at the expense that the result could be numerical "noise".
If the subsequent computation takes sums or products of the eigenvalues, the result may even turn out to be correct, since the overall computational procedure is differentiable. I would have to look into that, though.

The same goes for the general eigenvalue decomposition.

Could you elaborate on your use case?

Sebastian

Mark Stone

Jan 21, 2014, 12:35:44 PM
to alg...@googlegroups.com

Use case: Some or all elements of a matrix are optimization (decision) variables for a nonlinear optimization problem. The objective function and/or constraints may involve eigenvalues, singular values, or norms, among other things, possibly embedded in more complicated expressions. This should be a perfect opportunity to perform AD to provide the gradient, Jacobian, and maybe Hessian for use by the optimizer.
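
To make that concrete, a hypothetical sketch (mine, not Mark's actual code) along those lines: minimize the largest eigenvalue of a symmetric matrix over its independent entries with a smooth optimizer, feeding it an algopy forward-mode gradient. As discussed above, the gradient is only valid where the largest eigenvalue is simple; BFGS often blasts through the occasional coalescence anyway.

import numpy
import algopy
from scipy.optimize import minimize

N = 4

def largest_eigenvalue(x):
    # x holds the N*(N+1)/2 independent entries of a symmetric N x N matrix
    A = algopy.zeros((N, N), dtype=x)
    k = 0
    for i in range(N):
        for j in range(i, N):
            A[i, j] = A[j, i] = x[k]
            k += 1
    l, Q = algopy.eigh(A)    # eigenvalues in ascending order
    return l[-1]

def f_and_grad(x):
    ux = algopy.UTPM.init_jacobian(x)
    y = largest_eigenvalue(ux)
    # zeroth Taylor coefficient = value, extract_jacobian = gradient
    return y.data[0, 0], algopy.UTPM.extract_jacobian(y)

x0 = numpy.random.rand(N * (N + 1) // 2)
res = minimize(f_and_grad, x0, jac=True, method='BFGS')
print(res.x, res.fun)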

Sebastian Walter

Jan 24, 2014, 8:59:32 AM
to alg...@googlegroups.com
Hello Mark,

I've added support for the singular value and eigenvalue decompositions.
Could you provide a self-contained, prototypical example for testing purposes?

There are a few things to keep in mind, though:
1) the code is not very well tested right now
2) numpy.linalg.eig returns either real-valued OR complex-valued factors that correspond to the right eigenvectors.
I'm not 100% sure what implications the complex values may have:
Algopy supports complex-valued computations to some degree, but there is only official support/unit tests for real-valued computations. 
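
(An illustrative sketch of the kind of self-contained, prototypical test case being requested -- mine, not from the thread. It is written in plain numpy; the algopy version would swap numpy.linalg.eig for the new routine once its interface is settled. The objective is the sum of the k largest-magnitude eigenvalues of a non-symmetric matrix, with a central-difference gradient as the reference for the AD result.)

import numpy

def f(x, n=4, k=2):
    A = x.reshape(n, n)                       # all n*n entries are variables
    w = numpy.linalg.eig(A)[0]                # possibly complex eigenvalues
    w = w[numpy.argsort(-numpy.abs(w))]       # sort by decreasing magnitude
    return numpy.abs(w[:k]).sum()             # real-valued objective

def fd_gradient(x, h=1e-7):
    # central-difference reference gradient for comparison with AD
    g = numpy.zeros_like(x)
    for i in range(x.size):
        e = numpy.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

numpy.random.seed(0)
x0 = numpy.random.rand(16)
print(f(x0))
print(fd_gradient(x0))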

cheers,
Sebastian
