data in the objective function


Jane Sullivan - NOAA Federal

Jun 21, 2024, 6:06:38 PM
to TMB Users, Nathan Vaughan - NOAA Affiliate, Ben Williams - NOAA Federal
RTMB users,

One interesting feature of RTMB is that only the parameter list is passed to the objective function and MakeADFun. When building an R library with RTMB, if the data list and objective function are built in separate R functions, the data list is no longer available to the proper RTMB environment and the package fails (data not found). 

We were able to find a workaround for this issue by first assigning the data within the data function to the global environment `dat <<- data` then removing it after MakeADFun using `rm(dat,envir=globalenv())`.
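
The failure can be reproduced without RTMB, because it comes from R's lexical scoping: a function looks up free variables where it was defined, not where it is called. Below is a minimal base-R sketch of the problem and of the global-assignment workaround. The names `nll`, `fit_local`, and `fit_global` are hypothetical stand-ins, not the functions from our package, and the sketch assumes no object called `dat` already exists in the global environment.

```r
## Objective defined at top level (as in a package); `dat` is a free variable
nll <- function(parms) sum((dat$y - parms$mu)^2)

## `dat` is local to fit_local(), so nll() cannot see it
fit_local <- function(data) {
  dat <- data
  nll(list(mu = 2))
}
res <- tryCatch(fit_local(list(y = c(1, 2, 3))),
                error = function(e) conditionMessage(e))
print(res)   ## an error message saying that `dat` was not found

## Workaround from this post: put `dat` in the global environment, then remove it
fit_global <- function(data) {
  dat <<- data
  on.exit(rm(dat, envir = globalenv()))
  nll(list(mu = 2))
}
val <- fit_global(list(y = c(1, 2, 3)))
print(val)
```

The second version works only because the global environment sits on the lookup path of `nll`, which is also why it can silently pick up an unrelated `dat` left over from an earlier session.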

This seems questionable. Has anyone else encountered this problem? Do you have a better solution?

Here is the original code where the data and objective function are in the same R function:
Here is the modified code where they are split into separate functions using the changes described above:

Sincerely,
Jane Sullivan and Nathan Vaughan

James Thorson

Jun 21, 2024, 6:34:49 PM
to Jane Sullivan - NOAA Federal, TMB Users, Nathan Vaughan - NOAA Affiliate, Ben Williams - NOAA Federal
I've found Kasper's solution here to be sufficient for developing packages, and it avoids assigning to the global environment. However, I'm still concerned about unexpected precedence and conflicts, where an object that happens to exist in the global environment affects a function in ways that aren't intended.

Definitely an interesting design topic.


Jane Sullivan - NOAA Federal

Jun 21, 2024, 8:27:30 PM
to James Thorson, TMB Users, Nathan Vaughan - NOAA Affiliate, Ben Williams - NOAA Federal
I really appreciate the response, Jim, and the pointer to your previous example.

I implemented the environment() solution on a separate branch and everything looks good. I only used the environment() portion because I couldn't get local() from rtmbTest to work for our example. I either got NaNs or an error about unnamed lists, probably because rtmbTest doesn't use getAll(). Maybe it doesn't matter?
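
For readers who have not seen the pattern, here is a minimal base-R sketch of what re-pointing a function's environment can look like. It is an illustration under assumptions, not the code from rtmbTest or from our branch: `attach_data` is a hypothetical helper, and the least-squares objective is a toy.

```r
## Objective defined at top level; `dat` is resolved through its environment
nll <- function(parms) sum((dat$y - parms$mu)^2)

## Hypothetical helper: return a copy of f whose environment holds `dat`.
## The old environment becomes the parent, so everything f could see
## before (e.g. package functions) is still reachable.
attach_data <- function(f, data) {
  environment(f) <- list2env(list(dat = data), parent = environment(f))
  f
}

nll_with_data <- attach_data(nll, list(y = c(1, 2, 3)))
out <- nll_with_data(list(mu = 2))
print(out)

## Assumed usage: nll_with_data would then be passed to RTMB::MakeADFun()
## together with the parameter list; nothing is written to globalenv().
```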

More generally, I'm curious from a design perspective why the data object is not an optional input to the objective function. It seems like a very intentional design choice, and I'd really appreciate understanding the rationale behind it.

Thank you for your amazing work!

Jane


Kasper Kristensen

Jun 22, 2024, 7:10:37 AM
to TMB Users
The reason for not adding a data argument to RTMB::MakeADFun is that it's not needed. R closures conveniently combine function and data as a self-contained object.
An explicit data argument to MakeADFun would require extra assumptions on how to pass data to the user function. That would be more restrictive than the current design.
If you want an explicit data argument to your objective function you can do:

## Data as explicit argument
nll <- function(parms, data) { ... }
cmb <- function(f, d) function(p) f(p, d)
RTMB::MakeADFun(cmb(nll, data), parms)

Sure, RTMB could be made to do this behind the scenes, and allow 'MakeADFun(nll, parms, data=data)', but I consider this a small gain at the cost of being more restrictive.
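
As a runnable illustration of the pattern above, here is a small sketch with a toy objective. The least-squares `nll`, the `dat` list, and the parameter values are made up for the example; only `cmb` is taken from the snippet above. The RTMB call is guarded so the base-R part runs even where RTMB is not installed.

```r
## Toy objective with an explicit data argument (assumption: least squares)
nll <- function(parms, data) sum((data$y - parms$mu)^2)

## Combinator: fix the data, return a function of the parameters only
cmb <- function(f, d) function(p) f(p, d)

dat   <- list(y = c(1, 2, 3))
parms <- list(mu = 2)

f <- cmb(nll, dat)
direct  <- nll(parms, dat)
closure <- f(parms)
print(c(direct = direct, closure = closure))   ## the two values agree

## With RTMB available, the closure is the function that gets taped
if (requireNamespace("RTMB", quietly = TRUE)) {
  obj <- RTMB::MakeADFun(f, parms, silent = TRUE)
  print(obj$fn(obj$par))
}
```

The data travel inside the closure `f`, so MakeADFun never needs to know how the user function expects to receive them.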

Andrea Havron

Jun 22, 2024, 10:56:34 AM
to Kasper Kristensen, TMB Users
Hi Kasper,

I have a somewhat related question. I am working on adding one-step-ahead residual calculations into our modeling project, FIMS. Currently, we have a stripped down .cpp file (see here) and data are passed into the model through the Rcpp interface. I was thinking I had to pass in a data list to the cpp file in order to declare the data indicator vector, 'keep', which requires the arguments 'keep' and 'y', with 'y' being the data vector. See here for our prototype example.

Outside of RTMB, is there a way to implement one-step-ahead residuals through Rcpp that would circumvent the need to pass the data list to the cpp file? Is there an example in RTMB I could look at to work this out? Or is this not possible with standalone TMB models?

Thanks,
Andrea

Mollie Brooks

Aug 6, 2024, 11:47:48 AM
to TMB Users
This is sort of a follow-up about RTMB and the data, but about using it to predict with new data. I'm basing my attempt partly on Kasper's suggestion above, but also on the function glmmTMB::glmmTMB.predict, which I don't fully understand. I'm doing something wrong and wondered if anyone can help me figure it out. I'm using the example from the RTMB documentation, but changing it to have a data argument so that I can make predictions for my "newdata" object. The predicted values I get (in black in the attached graph) are completely unrealistic.

library(RTMB)
library(ggplot2)
data(ChickWeight)

parameters <- list(
  mua=0,          ## Mean slope
  sda=1,          ## Std of slopes
  mub=0,          ## Mean intercept
  sdb=1,          ## Std of intercepts
  sdeps=1,        ## Residual Std
  a=rep(0, 50),   ## Random slope by chick
  b=rep(0, 50)    ## Random intercept by chick
)

nll <- function(parms, data) {
  getAll(data, parms, warn=FALSE)
  ## Optional (enables extra RTMB features)
  weight <- OBS(weight)
  ## Initialize joint negative log likelihood
  nll <- 0
  ## Random slopes
  nll <- nll - sum(dnorm(a, mean=mua, sd=sda, log=TRUE))
  ## Random intercepts
  nll <- nll - sum(dnorm(b, mean=mub, sd=sdb, log=TRUE))
  ## Data
  predWeight <- a[Chick] * Time + b[Chick]
  nll <- nll - sum(dnorm(weight, predWeight, sd=sdeps, log=TRUE))
  ## Get predicted weight uncertainties
  ADREPORT(predWeight)
  ## Return
  nll

}
cmb <- function(f, d) function(p) f(p, d)

obj <- RTMB::MakeADFun(cmb(nll, ChickWeight), parameters, random=c("a", "b"))

opt <- nlminb(obj$par, obj$fn, obj$gr)

newdata <- data.frame(Time=20:30, Diet=3, Chick=10, weight=0)

oldPar <- opt$par

H <- optimHess(oldPar,obj$fn,obj$gr)

newObj <- RTMB::MakeADFun(cmb(nll, newdata), parameters, random=c("a", "b"))

newObj$fn(oldPar)  ## call once to update internal structures

sdr <- sdreport(newObj, oldPar, hessian.fixed=H, getReportCovariance=TRUE)
newdata$predWeight <- as.list(sdr, "Est", report=TRUE)$predWeight


ggplot(ChickWeight, aes(Time, weight, colour=Diet))+
  geom_point()+geom_smooth()+
  geom_line(data=newdata, aes(x=Time, y=predWeight), colour="black")

ggsave("extrapolate weights.png", height=3, width=5)
RTMB predict newdata.R
extrapolate weights.png

Kasper Kristensen

Aug 6, 2024, 12:18:03 PM
to TMB Users
The glmmTMB::predict function actually augments the old data with the new data to make predictions. The newdata should have an empty response, so something like:

newdata <- data.frame(Time=20:30, Diet=3, Chick=10, weight=NA)
augdata <- rbind(ChickWeight, newdata)

The NA response can be handled in RTMB using the modified data likelihood:

nll <- nll - sum(dnorm(weight, predWeight, sd=sdeps, log=TRUE), na.rm=TRUE)

You can then build the likelihood with the augmented dataset:

newObj <- RTMB::MakeADFun(cmb(nll, augdata), parameters, random=c("a", "b"))

With these minor changes I think your script should give reasonable predictions?
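
To see why the NA rows leave the fit unchanged, here is a small base-R sketch with made-up numbers (the vectors below are illustrative, not ChickWeight values). A row with a missing response produces an NA log-density, and `na.rm=TRUE` drops that term, so only observed rows contribute to the likelihood while `predWeight` is still computed for every row.

```r
weight     <- c(10, 12, NA)   ## last row is a prediction-only row
predWeight <- c(9, 13, 20)    ## prediction exists for all rows

ll_terms <- dnorm(weight, predWeight, sd = 1, log = TRUE)
print(is.na(ll_terms))        ## only the prediction-only row is NA

## Likelihood from the augmented data vs. from the observed rows alone
nll_aug <- -sum(ll_terms, na.rm = TRUE)
nll_obs <- -sum(dnorm(weight[1:2], predWeight[1:2], sd = 1, log = TRUE))
print(all.equal(nll_aug, nll_obs))
```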

Mollie Brooks

Aug 6, 2024, 1:16:05 PM
to Kasper Kristensen, TMB Users
Thanks Kasper! That seems to have worked. I knew about the data augmentation in glmmTMB::predict, but thought there might be a simpler way. Oh well.  Details are attached in case someone needs it in the future. 

cheers,
Mollie
RTMB predict newdata.R