QR decomposition in NIMBLE

80 views
Skip to first unread message

Bert van der Veen

unread,
Sep 14, 2023, 3:21:16 AM9/14/23
to nimble-users
Hello all,

I am looking to orthogonalize the columns of a rectangular matrix in nimble (sampling on the stiefel manifold), and wonder if there is any possibility of using Eigen's QR decomposition in nimble?

My preference goes to combining this with HMC, so use of SVD/eigen decompositions is out of the question, since AD does not seem to be supported for those in nimble (?).

Kind regards,
Bert

Perry de Valpine

unread,
Sep 14, 2023, 11:01:21 AM9/14/23
to Bert van der Veen, nimble-users
Hi Bert,

Thanks for the question. This is an interesting case. If you didn't want to use HMC (BTW a new release of nimbleHMC is coming up on CRAN during this week), requiring AD (differentiation), you could either use nimbleRcall or nimbleExternalCall to get to a QR decomposition. But those won't give you AD support. Currently the only matrix decomposition supported for AD is Cholesky, so if you could use that for your purposes, it is there. An alternative would be to code your own QR as a nimbleFunction. That seems a bit ugly to have to do but perhaps it is feasible?

Best wishes,
Perry



--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nimble-users/db8cfd8a-b86d-49e0-b38f-f309617c1b27n%40googlegroups.com.

Bert van der Veen

unread,
Sep 14, 2023, 11:29:10 AM9/14/23
to nimble-users
Thanks Perry!

My first try was QR via cholesky, which does OK but requires extra operations I am not (too) fond of. I did code my own QR algorithm (not nearly as fast as eigen I suspect), but I cannot get the nimble to compile my model when I use it. Code below, do you spot any potential issues? My suspicion is that AD might not appreciate iterating over the columns of A and replacing them. There is no clean way to avoid that though, I think. Note that the below returns Q multiplied by the diagonals of R, i.e., the output does not have unit norm columns.

Kind regards,
Bert

nimQR.u <- nimbleFunction(
  run = function(A = double(2), p  = integer(), n= integer()) {
    for (k in 2:p) {
      for (i in 1:(k-1)) {
        A[, k] <- A[, k] - inprod(A[,i],A[,k]) / inprod(A[, i],A[, i]) * A[, i]
      }
    }
    return(A)
    returnType(double(2))
  } )

Perry de Valpine

unread,
Sep 14, 2023, 11:48:09 AM9/14/23
to Bert van der Veen, nimble-users
Hi Bert,

If you have buildDerivs=TRUE in the model and are calling this from the model, you need to include buildDerivs=TRUE (or buildDerivs = list(...)) in the nimbleFunction as well.

One of the hassles writing a nimbleFunction with buildDerivs=TRUE is telling it what variables should not have derivatives tracked. All arguments from a model should be double, even if they will in fact be integers. Also for-loop indices sometimes need to be noted specifically as not having derivatives tracked. Replacing elements of A should not be the problem, I don't think.

There are two ways to say not to track derivatives, ADbreak and the list format of the buildDerivs argument. However, for arguments from a model, only ADbreak will work.

I don't have time to test this at the moment, but here is something like what I think you'll want.


nimQR.u <- nimbleFunction(
  run = function(A = double(2), p_in  = double(), n_in= double()) {
    p <- ADbreak(p_in)
     n <- ADbreak(n_in)

    for (k in 2:p) {
      for (i in 1:(k-1)) {
        A[, k] <- A[, k] - inprod(A[,i],A[,k]) / inprod(A[, i],A[, i]) * A[, i]
      }
    }
    return(A)
    returnType(double(2))
  } ,
buildDerivs = list(run = list(ignore = c("k", "i")))
)

A final alternative is to create k and i trivially as definitely integers (which never have derivatives tracked) before using them as for-loop indices, which would look like:


nimQR.u <- nimbleFunction(
  run = function(A = double(2), p_in  = double(), n_in= double()) {
    p <- ADbreak(p_in)
     n <- ADbreak(n_in)
    k <- 0L
    i <- 0L

    for (k in 2:p) {
      for (i in 1:(k-1)) {
        A[, k] <- A[, k] - inprod(A[,i],A[,k]) / inprod(A[, i],A[, i]) * A[, i]
      }
    }
    return(A)
    returnType(double(2))
  } ,
buildDerivs = TRUE
)

We tried to explain this stuff in the User Manual, but the AD system is still new.

If you still can't get it going, please send a reduced reproducible example.

Perry



Bert van der Veen

unread,
Sep 14, 2023, 1:31:26 PM9/14/23
to nimble-users
Thanks Perry. Silly me, I should have had a better look at the usual manual. 

I still cannot get it going, with note " couldn’t deduce template parameter ‘NRows’" (but no such parameter in my model). Unfortunately I will have to come back to this another day (potentially with reduced reproducible example). 

Cheers.

Bert van der Veen

unread,
Sep 14, 2023, 3:07:57 PM9/14/23
to nimble-users
Addin "p" to the ignore argument in buildDerivs did the trick. So, the resulting function is:

nimQR.u <- nimbleFunction(
  run = function(A = double(2), p  = double()) {

    for (k in 2:p) {
      A[, k] <- A[, k]
      for (i in 1:(k-1)) {
        A[, k] <- A[, k] - inprod(A[,i], A[,k]) / inprod(A[, i], A[, i]) * A[, i]
      }
    }
    return(A)
    returnType(double(2))
  } ,
buildDerivs = list(run = list(ignore = c("k", "i","p")))
)

Kind regards,
Bert

Perry de Valpine

unread,
Sep 14, 2023, 3:13:04 PM9/14/23
to Bert van der Veen, nimble-users
Great, thanks for sharing the solution.

Reply all
Reply to author
Forward
0 new messages