Bayesian Factor Analysis

1,174 views
Skip to first unread message

Rick Farouni

unread,
Apr 27, 2015, 11:21:58 AM4/27/15
to stan-...@googlegroups.com


Hi Everyone,

I wrote a blog post about fitting a Bayesian factor analysis model in Stan. Stan code is provided. Here  is the link:


As always, feedback is very welcome.

Thanks,
Rick Farouni

Mauricio Garnier-Villarreal

unread,
Apr 28, 2015, 12:34:14 PM4/28/15
to stan-...@googlegroups.com
Hi Rick

I see you use the covariance matrix method like SEM program (lavaan, Mplus). For Bayesian factor models I prefer to use the data model structure, estimating the factor scores, like Lee and Song (2012) recommend (http://www.amazon.com/Advanced-Bayesian-Structural-Equation-Modeling/dp/0470669527/ref=sr_1_2?ie=UTF8&qid=1430238405&sr=8-2&keywords=bayesian+structural+equation+modeling), this method is more flexible like making it easy to do interactions between factors. 

For example a model like that would be

data{
int N; // sample size
int P; // number of variables
int D; // number of dimensions
vector[P] X[N]; // data matrix of order [N,P]
}

parameters{
vector[P] b; // intercepts
vector<lower=0>[P] lam; // factor loadings
vector[D] FS[N]; // factor scores, matrix of order [N,D]
corr_matrix[D] Rho; // correlation matrix between factors
vector<lower=0>[P] var_p; // residual variance for each variable
}

transformed parameters{
vector[D] M; // factor means
vector<lower=0[D] Sd_d; // factor standard deviations
vector[P] mu[N]; // predicted values
matrix[D,D] Ld; // cholesky decomposition of the covariance matrix between factors

for (m in 1:D) {
M[m] <- 0;
Sd_d[m] <- 1;}

Ld <- diag_matrix(Sd_d) * cholesky_decompose(Rho);

for(i in 1:N){
mu[i,1] <- b[1] + lam[1]*FS[i,1];
mu[i,2] <- b[2] + lam[2]*FS[i,1];
mu[i,3] <- b[3] + lam[3]*FS[i,1];
mu[i,4] <- b[4] + lam[4]*FS[i,2];
mu[i,5] <- b[5] + lam[5]*FS[i,2];
mu[i,6] <- b[6] + lam[6]*FS[i,2];
mu[i,7] <- b[7] + lam[7]*FS[i,3];
mu[i,8] <- b[8] + lam[8]*FS[i,3];
mu[i,9] <- b[9] + lam[9]*FS[i,3];
}
}

model{

b ~ normal(0, 100);
lam ~ normal(0.4,var_p);
var_p ~ cauchy(0,2.5);

for(i in 1:N){
X[i] ~ normal(mu[i],var_p);
FS[i] ~ multi_normal_cholesky(M, Ld);
}

}


Bye

- Mauricio Garnier-Villarreal

Rick Farouni

unread,
Apr 28, 2015, 1:07:37 PM4/28/15
to stan-...@googlegroups.com
Hi Mauricio,

If you instead write  Q<-L*Phi*L'+diag_matrix(psi)  and add a sampling statement for Phi in the model block, then you can model  correlations among the factors. I'll do that for the next blog post. The factor analysis model I used  assumes that the factor scores are nuisance variables of no interest to the researcher. This can be the case in a confirmatory analysis  setting when the focus is on finding out which items (i.e. observed variables) are associated with which factor. If the focus on the other hand is on determining the latent factor scores (e.g. ability), then one would need  to incorporate the latent variables in the model, just as you did in the code you provided. The drawback is increased computation time. 

Rick,

Bob Carpenter

unread,
Apr 29, 2015, 10:32:30 AM4/29/15
to stan-...@googlegroups.com
We also have a built-in function for you:

multiply_lower_tri_self_transpose(L) = L * L'

when L is lower triangular. We built it for just this purpose
and it'll be faster.

You also probably don't want to create the diagonal matrix
from psi

Q <- L * L' + diag_matrix(psi);

just adding the values to the diagonal in a loop
would be faster. I don't see a way to modify the Cholesky
factor easily to get the same effect --- that would be
even more efficient if it were possible.

- Bob
> --
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Rick Farouni

unread,
Apr 29, 2015, 11:17:14 AM4/29/15
to stan-...@googlegroups.com
Hi Bob,


I tried to use multiply_lower_tri_self_transpose but I got an error regarding  variable type. My guess is that the error is due to declaring L as a cholesky_factor_cov type, instead of a matrix. If I don't diagnoalize the matrix I would have to use two for loops and a comment describing what the for loops imply. I thought that additional code optimization wasn't necessary since the sampling for this model was surprisingly really fast. 

Also, on page 87 (section 7.4) of the Stan manual, the code provided assumes a square lower triangular matrix L. Is that a mistake or is there a full-rank matrix factorization step required (i.e. A=B*C) that need to be added to the final model code?

Thanks,
Rick Farouni

Bob Carpenter

unread,
Apr 29, 2015, 8:43:31 PM4/29/15
to stan-...@googlegroups.com

> On Apr 30, 2015, at 1:17 AM, Rick Farouni <rfar...@gmail.com> wrote:
>
> Hi Bob,
>
>
> I tried to use multiply_lower_tri_self_transpose but I got an error regarding variable type.
> My guess is that the error is due to declaring L as a cholesky_factor_cov type, instead of a matrix.

I see --- that's not a square matrix. We need to generalize
the multiply_lower_tri_self_transpose function, so I added
an issue:

https://github.com/stan-dev/stan/issues/1434

> If I don't diagnoalize the matrix I would have to use two for loops and a comment describing what the for loops imply. I thought that additional code optimization wasn't necessary since the sampling for this model was surprisingly really fast.

Good point --- we'd always recommend clarity over optimization.
But to add to a diagonal, all you need is:

Q <- L * L';

for (i in 1:rows(Q))
Q[i,i] <- Q[i,i] + psi[i];

But as you say, clarity first if you don't need the speed.!

> Also, on page 87 (section 7.4) of the Stan manual, the code provided assumes a square lower triangular matrix L. Is that a mistake or is there a full-rank matrix factorization step required (i.e. A=B*C) that need to be added to the final model code?

I'm not sure what you mean. I don't see an A, B or C there --- that
code's just to build a square Cholesky factor for a covariance matrix
with a unit diagonal. Should it be non-square?

- Bob

Rick Farouni

unread,
Apr 29, 2015, 10:25:15 PM4/29/15
to stan-...@googlegroups.com


On Wednesday, April 29, 2015 at 8:43:31 PM UTC-4, Bob Carpenter wrote:

> On Apr 30, 2015, at 1:17 AM, Rick Farouni <rfar...@gmail.com> wrote:
>
> Hi Bob,
>
>
> I tried to use multiply_lower_tri_self_transpose but I got an error regarding  variable type.
> My guess is that the error is due to declaring L as a cholesky_factor_cov type, instead of a matrix.

I see --- that's not a square matrix.  We need to generalize
the multiply_lower_tri_self_transpose function, so I added
an issue:

https://github.com/stan-dev/stan/issues/1434

> If I don't diagnoalize the matrix I would have to use two for loops and a comment describing what the for loops imply. I thought that additional code optimization wasn't necessary since the sampling for this model was surprisingly really fast.

Good point --- we'd always recommend clarity over optimization.
But to add to a diagonal, all you need is:

Q <- L * L';

for (i in 1:rows(Q))
  Q[i,i] <- Q[i,i] + psi[i];

But as you say, clarity first if you don't need the speed.!

> Also, on page 87 (section 7.4) of the Stan manual, the code provided assumes a square lower triangular matrix L. Is that a mistake or is there a full-rank matrix factorization step required (i.e. A=B*C) that need to be added to the final model code?

I'm not sure what you mean.  I don't see an A, B or C there --- that
code's just to build a square Cholesky factor for a covariance matrix
with a unit diagonal.  Should it be non-square?

Correct me if I am wrong but since Factor analysis is mainly a dimensionality reduction technique, the n x p  loading matrix L will have n larger than p, so the rank of L is at most p, (when L has full column rank) and the n x n matrix LL' would have the same rank of L and  so isn't positive definite. That is , if we consider L*L' as a covariance matrix, then it would be rank deficient and won't have a Cholesky factor.  As for A=B*C, I was referring to this rank decomposition  result.

Ben Goodrich

unread,
Apr 30, 2015, 12:53:31 AM4/30/15
to stan-...@googlegroups.com
On Wednesday, April 29, 2015 at 10:25:15 PM UTC-4, Rick Farouni wrote:
Correct me if I am wrong but since Factor analysis is mainly a dimensionality reduction technique, the n x p  loading matrix L will have n larger than p, so the rank of L is at most p, (when L has full column rank) and the n x n matrix LL' would have the same rank of L and  so isn't positive definite. That is , if we consider L*L' as a covariance matrix, then it would be rank deficient and won't have a Cholesky factor.  As for A=B*C, I was referring to this rank decomposition  result.

If the reduced matrix is L * L', then its Cholesky factor is L conceptually. Nevertheless, Stan can't numerically distinguish between rank deficient and indefinite, so cholesky_decompose() would throw an exception.

BTW, if you are thinking of the model as EFA with the factors integrated out, you can use the likelihood that R uses:

> stats:::factanal.fit.mle
function (cmat, factors, start = NULL, lower = 0.005, control = NULL,
   
...)
{
   
FAout <- function(Psi, S, q) {
        sc
<- diag(1/sqrt(Psi))
       
Sstar <- sc %*% S %*% sc
        E
<- eigen(Sstar, symmetric = TRUE)
        L
<- E$vectors[, 1L:q, drop = FALSE]
        load
<- L %*% diag(sqrt(pmax(E$values[1L:q] - 1, 0)),
            q
)
        diag
(sqrt(Psi)) %*% load
   
}
   
FAfn <- function(Psi, S, q) {
        sc
<- diag(1/sqrt(Psi))
       
Sstar <- sc %*% S %*% sc
        E
<- eigen(Sstar, symmetric = TRUE, only.values = TRUE)
        e
<- E$values[-(1L:q)]
        e
<- sum(log(e) - e) - q + nrow(S)
       
-e
   
}
...

That FAfn() function returns the log likelihood so you can express priors on the uniquenesses and not worry about the unidentified loadings. However, it assumes S is a correlation matrix, which makes more sense when doing maximum likelihood (which is invariant) than MCMC.

Ben

Rick Farouni

unread,
May 1, 2015, 11:07:50 PM5/1/15
to stan-...@googlegroups.com
Thanks Ben!  Great suggestion, I'll try to implement it. 
Reply all
Reply to author
Forward
0 new messages