Bayesian Adaptive Regression Splines

24 views
Skip to first unread message

Kavir Narsai

unread,
11:08 AM (6 hours ago) 11:08 AM
to nimble-users
I note nimble is capable of implementing rjmcmc for the purpose of variable selection however, is it possible to use nimble to implement bayesian adaptive regression splines whereby one could randomly vary the number and placement of knots in a regression spline fit?

Perry de Valpine

unread,
11:30 AM (6 hours ago) 11:30 AM
to Kavir Narsai, nimble-users
That's an interesting question and idea. The RJMCMC system provided with nimble is pretty basic, set up for one-to-one pairs of indicators (0 or 1) and coefficients. I am not sure but off the top of my head I suspect what you would need is more involved. In that case, you could consider writing your own RJ samplers to work with this specific kind of situation.


On Mon, Jul 22, 2024 at 8:08 AM Kavir Narsai <kavir...@gmail.com> wrote:
I note nimble is capable of implementing rjmcmc for the purpose of variable selection however, is it possible to use nimble to implement bayesian adaptive regression splines whereby one could randomly vary the number and placement of knots in a regression spline fit?

--
You received this message because you are subscribed to the Google Groups "nimble-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nimble-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nimble-users/d9457b09-d95c-470e-8c2a-ab8349b56b06n%40googlegroups.com.

Kavir Narsai

unread,
12:27 PM (5 hours ago) 12:27 PM
to nimble-users
Hi Perry

Noted, thank you for the feedback. Yes, I suppose one of the main elements to cater for is the possibility of the size of the design matrix and associated number of coefficients potentially changing at each iteration. I see that dynamic indexing is not allowed directly in the nimbleCode but is there a way to use a nimbleFunction to enable the indexing of the coefficients and design matrix to change?

Thanks 

Perry de Valpine

unread,
12:34 PM (5 hours ago) 12:34 PM
to Kavir Narsai, nimble-users
This is somewhat open-ended and it might help if you can sketch out more about what you would like to do. Dynamic indexing is supported in nimble models, so if you want to say more of what you are seeing about that, please do. For a model that conceptually changes dimension, you will need to implement the maximum sized model and then use model code and/or nimbleFunctions to make use of subsets of it as needed. There is a new feature (as of nimble 1.2.0) that allows nimbleFunctions with setup code to be used in model code. This opens the possibility of user-defined functions and distributions that retain their own internal variables across calls (i.e. they are class objects). That might be useful for example if you need to update internal variables about a design matrix or basis functions as you (have a sampler) change the number or placement of knots. A good way to get support would be if you can abstract your problem down to a toy example that involves the main implementation difficulties you want to solve.
HTH
Perry


Kavir Narsai

unread,
1:05 PM (4 hours ago) 1:05 PM
to nimble-users
Hi Perry

Below is a simple example in which I would like to fit a bayesian adaptive regression spline using nimble: (Note the nimble code below is meant to illustrate ideally what I would like to achieve but I'm not sure how to specify it correctly using nimbleFunctions:
<R code

## LOAD PACKAGES ---------
library(data.table)
library(splines2)
library(nimble)

## SIMULATE DATA ---------
set.seed(123)
dt <- data.table(x = seq(0.1, 10, 0.1))
betas <- c(20, 30, 50)

dt[, y := nsp(x, knots = 5, intercept = T)%*%betas + rnorm(100, 0, 0.5)
]

## ATTEMPT TO PREPARE BARS IN NIMBLE

## Specify nsp code in R
Build_nsp <- function(x, K) {
  nsp(x, knots = K, intercept = T)
}

## Convert R function to a function that can be called in nimble
Rnsp <- nimbleRcall(function(x = double(1), K = double(1)) {},
                    Rfun = 'Build_nsp',
                    returnType = double(2))

nim_spCode <- nimbleCode({
 
  ## Determine number of coefficients based
  ## on number of knots
  p <- KNum + 2
 
  ## Prep spline design matrix
  B[1:N,1:p] <- Rnsp(x[1:N], K = K)
 
  for (i in 1:N) {
    fit[i] <- inprod(B[i, 1:p], betas[1:p])
  }
 
  ## Simulate the number of knots
  ## such that it is very dependent on
  ## the data
  Alpha[1:Kmax] <- nimRep(0.01, Kmax)
  Theta[1:Kmax] ~ ddirch(Alpha[1:Kmax])
  KNum ~ dcat(Theta[1:Kmax])
 
  ## Attempt to make the knot position random
  for (i in 1:KNum){
    K[i] ~ dunif(0.1, 9.9)
  }
 
  for (i in 1:p) {
    betas[i] ~ dnorm(0, sd = 1)
  }
 
  for (i in 1:N) {
    y[i] ~ dnorm(fit[i], sd = 1)
  }
 
}
)


y_m  <- mean(dt$y)
y_sd <- sd(dt$y)

sp_Const <- list(N = dim(dt)[1], Kmax = 3)
sp_Data <- list(y = (dt$y - y_m) / y_sd,
                x = dt$x)
sp_Inits <- list(betas = rep(0, 3),  KNum = 2 )


sp_mod <- nimbleModel(code      = nim_spCode  ,
                      name      = "sp_mod"    ,
                      constants = sp_Const    ,
                      data      = sp_Data     ,
                      inits     = sp_Inits   )


The error received when running the nimbleModel is as follows: "Defining model

Error in getSymbolicParentNodesRecurse(x, constNames, indexNames, nimbleFunctionNames,  :

  Dynamic indexing found in a vector of indices, 1:p. Only scalar indices, such as 'idx' in 'x[idx]', can be dynamic. One can instead use dynamic indexing in a vector of indices inside a nimbleFunction."

I have been able to bring in a natural cubic spline (provided by the splines2 package) into the nimble code using the nimbleRcall. So it is possible to make a given number of knots random. However, I'm not sure how to enable the number of knots (KNum in the R code above) to vary as well.

Thank you for your assistance Perry.

Best regards
Kavir 

Daniel Turek

unread,
4:13 PM (1 hour ago) 4:13 PM
to Kavir Narsai, nimble-users, Perry de Valpine
Kavir, this is great work.  I think you only need to push a few things into nimbleFunctions, and the problem will resolve.

Please take a look at the code below, I'll highlight the changes so you can see them.  I think I got the indexing right, but please double-check that the maths look correct.

Cheers,
Daniel


## Specify nsp code in R
Build_nsp <- function(x, K) {
  nsp(x, knots = K, intercept = T)
}

## Convert R function to a function that can be called in nimble
Rnsp <- nimbleRcall(function(x = double(1), K = double(1)) {},
                    Rfun = 'Build_nsp',
                    returnType = double(2))

calc_fit <- nimbleFunction(
    run = function(x = double(1), K = double(1), betas = double(1), KNum = double(), N = double()) {

        p <- KNum + 2
        B <- Rnsp(x[1:N], K = K[1:KNum])
        fitArray <- B %*% betas[1:p]
        returnType(double(1))
        return(fitArray[1:N,1])
    }
)


nim_spCode <- nimbleCode({
   
    ## Determine number of coefficients based
    ## on number of knots
    ###p <- KNum + 2   ## remove p

   
    ## Prep spline design matrix
    ##B[1:N,1:p] <- Rnsp(x[1:N], K = K)     ## remove B
   
    ###for (i in 1:N) {
    ###    fit[i] <- inprod(B[i, 1:p], betas[1:p])    ## remove
    ###}
    f[1:N] <- calc_fit(x[1:N], K[1:Kmax], betas[1:(Kmax+2)], KNum, N)    ## new

   
    ## Simulate the number of knots
    ## such that it is very dependent on
    ## the data
    Alpha[1:Kmax] <- nimRep(0.01, Kmax)
    Theta[1:Kmax] ~ ddirch(Alpha[1:Kmax])
    KNum ~ dcat(Theta[1:Kmax])
   
    ## Attempt to make the knot position random
    ##for (i in 1:KNum){    ## remove
    for (i in 1:Kmax){      ## instead, define K[i] up to max size of Kmax

        K[i] ~ dunif(0.1, 9.9)
    }
   
    ##for (i in 1:p) {            ## remove
    for (i in 1:(Kmax+2)) {       ## instead, define betas[i] up to max size of Kmax+2

        betas[i] ~ dnorm(0, sd = 1)
    }
   
    for (i in 1:N) {
        y[i] ~ dnorm(fit[i], sd = 1)
    }
   
}
)

y_m  <- mean(dt$y)
y_sd <- sd(dt$y)

sp_Const <- list(N = dim(dt)[1], Kmax = 3)
sp_Data <- list(y = (dt$y - y_m) / y_sd, x = dt$x)
sp_Inits <- list(betas = rep(0,5),  KNum = 2,  ## change initial values for betas to be length = 5
                 K = runif(3, 0.1, 9.9))       ## also add initial values for K[1:Kmax]



sp_mod <- nimbleModel(code      = nim_spCode  ,
                      name      = "sp_mod"    ,
                      constants = sp_Const    ,
                      data      = sp_Data     ,
                      inits     = sp_Inits   )

Perry de Valpine

unread,
4:43 PM (1 hour ago) 4:43 PM
to Daniel Turek, Kavir Narsai, nimble-users
And to follow up further:

If you want to be sampling knots by position and also by RJ in and out of the model, you'll have to consider whether the existing RJ is satisfactory or not. You could use it for the beta[i] elements, but then the K[i] would still be mixing and following their prior if not being used in the model. Alternatively you could look at the RJ and make one that jointly samples both beta[i] and K[i] in and out of the model.

And if you want to separate computation of the nsp from computation of the fitArray, you could split those steps once you verify if Daniel's changes do what you want. I am guessing the nsp is a bit costly, and as written that will be done even if one of the beta[i] but not the K[i] is changed by a sampler.

Perry

Reply all
Reply to author
Forward
0 new messages