Generalized Singular Value Decomposition


Frida Kosofsky

Aug 4, 2024, 10:02:17 PM
to throsseademe
In linear algebra, the generalized singular value decomposition (GSVD) is the name of two different techniques based on the singular value decomposition (SVD). The two versions differ because one version decomposes two matrices (somewhat like the higher-order or tensor SVD) and the other version uses a set of constraints imposed on the left and right singular vectors of a single-matrix SVD.

The generalized singular value decomposition (GSVD) is a matrix decomposition of a pair of matrices which generalizes the singular value decomposition. It was introduced by Van Loan [1] in 1976 and later developed by Paige and Saunders,[2] whose version is the one described here. In contrast to the SVD, the GSVD simultaneously decomposes a pair of matrices with the same number of columns. The SVD and the GSVD, as well as some other possible generalizations of the SVD,[3][4][5] are extensively used in the study of the conditioning and regularization of linear systems with respect to quadratic semi-norms. In the following, let 𝔽 = ℝ or 𝔽 = ℂ.
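As a concrete numerical illustration of this definition, a GSVD of a pair (A, B) with B^T B positive definite can be computed from the generalized symmetric eigenproblem A^T A w = λ (B^T B) w, whose eigenvalues are λ_i = (c_i/s_i)^2. The following NumPy/SciPy sketch is illustrative only (the random matrices and variable names are assumptions, and production code should use the dedicated LAPACK GSVD routines, which also handle rank-deficient B):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))   # m x n
B = rng.standard_normal((4, 3))   # p x n; B^T B must be positive definite here

# Generalized symmetric eigenproblem: A^T A w = lam * (B^T B) w, lam_i = (c_i/s_i)^2.
# eigh normalizes W so that W^T (B^T B) W = I.
lam, W = eigh(A.T @ A, B.T @ B)
s = 1.0 / np.sqrt(1.0 + lam)      # s_i, using c_i^2 + s_i^2 = 1
c = np.sqrt(lam) * s              # c_i

V = B @ W                         # orthonormal columns: V^T V = I
U = A @ W * (s / c)               # orthonormal columns: U^T U = I
X = np.linalg.inv(W * s).T        # nonsingular: A = U diag(c) X^T, B = V diag(s) X^T
```

The reconstruction identities A = U diag(c) X^T and B = V diag(s) X^T can be verified directly with np.allclose.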


It has also found application in estimating the spectral decompositions of linear operators when the eigenfunctions are parameterized with a linear model, i.e. a reproducing kernel Hilbert space.[13]


The weighted version of the generalized singular value decomposition (GSVD) is a constrained matrix decomposition with constraints imposed on the left and right singular vectors of the singular value decomposition.[14][15][16] This form of the GSVD is a direct extension of the SVD. Given the SVD of an m×n real or complex matrix M, M = UΣV*.


The weighted form of the GSVD is so named because, with the correct selection of weights, it generalizes many techniques (such as multidimensional scaling and linear discriminant analysis).[17]
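To make the weighted construction concrete: with positive definite row and column weight matrices (here diagonal, for simplicity; the specific weights are hypothetical), the weighted GSVD reduces to a plain SVD of the rescaled matrix, after which the generalized singular vectors are obtained by undoing the scaling:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 3))
Wr = np.diag([1.0, 2.0, 0.5, 1.5])   # row weights (constraints on left vectors)
Wc = np.diag([1.0, 0.25, 4.0])       # column weights (constraints on right vectors)

# SVD of the weighted matrix (np.sqrt equals the matrix square root for diagonals)
Mt = np.sqrt(Wr) @ M @ np.sqrt(Wc)
U, d, Vt = np.linalg.svd(Mt, full_matrices=False)

# Generalized singular vectors: P^T Wr P = I and Q^T Wc Q = I, with M = P diag(d) Q^T
P = np.linalg.inv(np.sqrt(Wr)) @ U
Q = np.linalg.inv(np.sqrt(Wc)) @ Vt.T
```

With identity weights this is the ordinary SVD, which is the sense in which the weighted form extends it.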


Perform a generalized singular value decomposition on A and B. The outputs include orthogonal U and V, a nonsingular X, and diagonal matrices C and S. Because A is rank deficient, the first diagonal element of C is zero.


Perform a generalized singular value decomposition on A and B. The outputs include orthogonal U and V, a nonsingular X, and diagonal matrices C and S. In this situation, the nonzero diagonal of C is diag(C,2).


The behavior change is that in all of these cases q is now equal to the numerical rank of [A; B]. The numerical rank is calculated from the QR factorization of [A; B]. This change ensures that nonzero elements of C and S are uniquely determined.
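The numerical-rank computation described here can be sketched as follows. The matrices are hypothetical, with A rank deficient as in the example above; the tolerance formula is a common convention, not necessarily the one MATLAB uses internally:

```python
import numpy as np
from scipy.linalg import qr

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])        # rank 1: the second row is twice the first
B = np.array([[1.0, 0.0],
              [0.0, 0.0]])        # rank 1
M = np.vstack([A, B])             # the stacked matrix [A; B] has rank 2

# QR with column pivoting makes |diag(R)| nonincreasing,
# so counting diagonal entries above a tolerance estimates the rank.
_, R, _ = qr(M, pivoting=True)
tol = max(M.shape) * np.finfo(float).eps * abs(R[0, 0])
q = int(np.sum(np.abs(np.diag(R)) > tol))   # numerical rank of [A; B]
```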


I have studied singular value decomposition (SVD) and solved a few numerical examples to understand it. Now I am studying the generalized singular value decomposition (GSVD). I followed this link to grasp the concept of the GSVD, but I haven't been able to understand it, and I have not been able to find any numerical example based on the GSVD.


There are several papers about the GSVD; the most useful ones (for me) were Paige and Saunders, Chu et al., Van Loan, and Kågström, and for an example application see Kempf et al. (free). A reference book is Matrix Computations by Golub and Van Loan.


MATLAB has a gsvd function to perform the generalised SVD. Since 2013, I think, there has been a lot of discussion on the GitHub pages regarding putting it in scipy, and some pages have code that I can use, such as here, which is super complicated for a novice like me to get running.


I also found LJWilliams' GitHub page with an implementation. It is of no use as it has a lot of bugs when transferred to Python 3. I attempted to correct the simple ones, such as the assert and print statements, but it quickly gets complicated.


Also, this is what I get with the LJWilliams implementation once the print and assert statements are corrected. The code looks complicated, and I am not sure spending time on it is the best thing to do! Also, some people have reported issues on the same GitHub page, and I am not sure whether they are fixed or connected.


If you want to work from the LJWilliams implementation on GitHub, there are a couple of bugs. However, to understand the technique fully, I'd probably recommend having a go at implementing it yourself. I looked up what Octave (the free-software MATLAB equivalent) does, and their "code is a wrapper to the corresponding Lapack dggsvd and zggsvd routines.", which is what scipy should do IMHO.


I'll post up the bugs I found, but I'm not going to post the code in full working order, because I'm not sure how that stands with regard to copyright, given the copyrighted MATLAB implementation from which it is translated.


Caveat: I am not an expert on the generalised SVD and have approached this only from the perspective of debugging, not of whether the underlying algorithm is correct. I have had this working on your original random arrays and the test case already present in the Python file.


The generalized SVD is a significant thing not to have, and I am amazed it's not implemented in numpy or scipy directly! In fact, the regular SVD and the generalized eigenvalue decomposition are special cases of the GSVD! You may want to check the generalized eigenvalue solution, which is doable with scipy.linalg.eig. My understanding of LAPACK is that it is written in Fortran 77. There is a C++ implementation of LAPACK, but they do say that it is not a complete implementation, and I am not sure whether they have the generalized SVD implemented or not. Whoever implements LAPACK and Numerical Recipes in Python will be a hero :-)
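For the generalized eigenvalue problem mentioned above, scipy.linalg.eig does accept a second matrix. A minimal example, with diagonal matrices chosen (purely for illustration) so the generalized eigenvalues are easy to read off:

```python
import numpy as np
from scipy.linalg import eig

# Solve A v = lam * B v; with diagonal A and B the generalized
# eigenvalues are just the ratios of the diagonal entries.
A = np.diag([2.0, 6.0])
B = np.diag([1.0, 2.0])
lam, v = eig(A, B)     # eigenvalues are returned as complex numbers
# here lam contains 2.0 and 3.0 (in some order)
```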




We describe a comparative mathematical framework for two genome-scale expression data sets. This framework formulates expression as superposition of the effects of regulatory programs, biological processes, and experimental artifacts common to both data sets, as well as those that are exclusive to one data set or the other, by using generalized singular value decomposition. This framework enables comparative reconstruction and classification of the genes and arrays of both data sets. We illustrate this framework with a comparison of yeast and human cell-cycle expression data sets.


Copyright: 2011 Ponnapalli et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


It was shown that the GSVD provides a mathematical framework for sequence-independent comparative modeling of DNA microarray data from two organisms, where the mathematical variables and operations represent biological reality [7], [8]. The variables, significant subspaces that are common to both or exclusive to either one of the datasets, correlate with cellular programs that are conserved in both or unique to either one of the organisms, respectively. The operation of reconstruction in the subspaces common to both datasets outlines the biological similarity in the regulation of the cellular programs that are conserved across the species. Reconstruction in the common and exclusive subspaces of either dataset outlines the differential regulation of the conserved relative to the unique programs in the corresponding organism. Recent experimental results [9] verify a computationally predicted genome-wide mode of regulation that correlates DNA replication origin activity with mRNA expression [10], [11], demonstrating that GSVD modeling of DNA microarray data can be used to correctly predict previously unknown cellular mechanisms.


We now define a higher-order GSVD (HO GSVD) for the comparison of N datasets. The datasets are tabulated as real matrices D_i, each with full column rank, with different row dimensions and the same column dimension n, where there exists a one-to-one mapping among the columns of the matrices. Like the GSVD, the HO GSVD is an exact decomposition, i.e., each matrix is exactly factored as D_i = U_i Σ_i V^T, where the columns of U_i and V have unit length and are the left and right basis vectors respectively, and each Σ_i is diagonal and positive definite. Like the GSVD, the matrix V is identical in all factorizations. In our HO GSVD, the matrix V is obtained from the eigensystem of the arithmetic mean S of all pairwise quotients of the matrices A_i = D_i^T D_i, or equivalently of all A_i A_j^{-1}, i ≠ j.
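This construction can be transcribed directly into NumPy. The sketch below is illustrative only: the matrix sizes are arbitrary, the explicit inverses take no care about conditioning, and the eigensystem may carry a negligible imaginary part in floating point even though it is real in exact arithmetic for full-column-rank D_i:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
D = [rng.standard_normal((m, n)) for m in (6, 7, 8)]   # full column rank (a.s.)
A = [Di.T @ Di for Di in D]                            # real, symmetric, positive definite
N = len(D)

# Arithmetic mean of all pairwise quotients A_i A_j^{-1}, i != j
S = sum(A[i] @ np.linalg.inv(A[j])
        for i in range(N) for j in range(N) if i != j) / (N * (N - 1))

lam, V = np.linalg.eig(S)          # columns of V are the shared right basis vectors

for Di in D:
    B = Di @ np.linalg.inv(V).T    # = U_i @ diag(sigma_i)
    sigma = np.linalg.norm(B, axis=0)   # positive diagonal of Sigma_i
    Ui = B / sigma                      # unit-length left basis vectors
    # exact factorization: D_i = U_i diag(sigma_i) V^T
```

Note that the factorization is exact by construction for any nonsingular V; what makes this an HO GSVD is that the same V, taken from the eigensystem of S, appears in every factorization.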


Let the SVD of the matrices D_i, appended along the columns axis, be

D = (D_1^T, D_2^T, …, D_N^T)^T = U Σ V^T. (6)

Since the matrices D_i are real and with full column rank, it follows from the SVD of D that the symmetric matrices A_i = D_i^T D_i are real and positive definite, and their inverses exist. It then follows from Equations (5) and (6) that S is similar to the real matrix of Equation (7), and the eigenvalues of S equal the eigenvalues of that similar matrix.


A sum of real, symmetric and positive definite matrices is also real, symmetric and positive definite; therefore, its eigensystem (8) is real, with orthogonal eigenvectors and positive eigenvalues. Without loss of generality, let the eigenvectors be orthonormal. It follows from the similarity of S with the matrix of Equation (7) that the eigensystem of S can be written as S V = V Λ, with real and nonsingular V, where Λ = diag(λ_k) with λ_k real for all k.


Proof. From Equation (12) and the Cauchy–Schwarz inequality, an eigenvalue λ_k of S equals its minimum lower bound if and only if the corresponding eigenvector v_k is also an eigenvector of each pairwise quotient A_i A_j^{-1} [37], where, from Equation (13), the corresponding eigenvalue equals one. (14)


Following Equations (14) and (15), where v_k corresponds to a minimum eigenvalue λ_k, and since the eigenvectors are orthonormal, we obtain Equation (16), with zeroes in the kth row and the kth column of the matrix above everywhere except for the diagonal element. Thus, an eigenvalue λ_k of S satisfies λ_k = 1 if and only if the corresponding left basis vectors are orthonormal to all other vectors in U_i.


An eigenvalue λ_k of S satisfies λ_k = 1 if and only if the corresponding right basis vector v_k is a generalized singular vector of all pairwise GSVD factorizations of the matrices D_i and D_j, with equal corresponding generalized singular values, for all i and j ≠ i.
