Fitting SCR models using RTMB


Rishika Chopara

Jun 30, 2025, 1:39:46 AM
to TMB Users

I am currently trying to use RTMB to fit SCR (spatial capture-recapture) models. I am using the following objective function: 

library(RTMB)   # DataEval(), ADREPORT()
library(rlang)  # env_get()

objective.fn = function(params) {

  # --------------------------------------

  # Pulling in 'data.vec': 'data.env' is an environment created outside this
  # function, and DataEval() fetches its current contents when the tape is
  # evaluated, which is what lets the data be swapped between fits
  getDat = function(null) {
    print("Get data")
    env_get(data.env, "data")
  }
  # Zero-length vector built from a parameter (an AD object) to pass to DataEval()
  empty = rep(params$beta0, 0)
  data.vec = DataEval(getDat, empty)

  # --------------------------------------

  # Unpacking parameters
  beta0 = params[["beta0"]]
  logit.g0 = params[["logit.g0"]]
  log.sigma = params[["log.sigma"]]
  g0 = exp(logit.g0)/(1 + exp(logit.g0))
  sigma = exp(log.sigma)

  # Unpacking data ('data' is the fixed data list found in the enclosing
  # environment; it does not change between fits)
  nocc = data[["n.occasions"]]
  mask.dists = data[["mask.dists"]]
  n.traps = ncol(mask.dists)
  n.mask = nrow(mask.dists)
  mask = data[["mask"]]
  a = attr(mask, "area")

  # Unpacking 'data.vec' ('M' is the augmentation size, also taken from the
  # enclosing environment)
  animal.indicator = data.vec[1:M]
  capt.hist = matrix(data.vec[-(1:M)], ncol = n.traps)
  n = sum(animal.indicator)   # number of observed animals

  # --------------------------------------

  ## Negative log-likelihood calculations

  # Density at each mask point (homogeneous density)
  D = rep(exp(beta0), n.mask)

  # Detection probability calculations
  mask.probs = g0 * exp( -mask.dists^2 / (2 * sigma^2) )
  # Probability an animal at each mask point avoids every trap on all occasions
  p.avoid = apply(1 - mask.probs, 1, prod)^nocc
  p.det = 1 - p.avoid

  # E[n]: expected number of detected animals (cell area x density x detection
  # probability, summed over the mask)
  esa.mult.D = sum(a * D * p.det)

  # Contribution of each capture history to the log-likelihood
  mask.integrand = exp(log(mask.probs) %*% t(capt.hist) +
                       log(1 - mask.probs) %*% t(nocc - capt.hist) +
                       log(D) - log(esa.mult.D))
  log.lik.contr = log(colSums(mask.integrand))

  # Zero out contributions from the artificial (unobserved) animals, then sum
  log.f.capt = sum(log.lik.contr * animal.indicator)

  # Log-likelihood contribution for n
  log.f.n = dpois(n, esa.mult.D, log=TRUE)

  # Overall log-likelihood
  ll = log.f.n + log.f.capt

  # Calculating the negative log-likelihood
  nllike = -ll

  # --------------------------------------

  ADREPORT(g0)
  ADREPORT(sigma)
  nllike
}

I have written this objective function so that I can reuse the output from MakeADFun() to fit my model to multiple data sets. I understand that, when updating data on an existing tape, I must work with a numeric vector of fixed length. However, the data I want to update (the capture history matrix) often changes dimension: for each data set, the number of observed animals determines the number of rows in the matrix.

To get around this, I select a value of M large enough that the probability of observing more than M animals is essentially 0. I then fit my model to a capture history matrix with M rows, where the last M-n rows represent artificial, unobserved animals that I add to the data manually.

Specifically, the first M elements of the vector 'data.vec' form an indicator vector, and the remaining elements hold the corresponding capture history matrix. This matrix has M rows, where the last M-n rows are all-zero capture histories. Rows representing observed animals correspond to a value of 1 in the indicator vector, and the artificial rows correspond to 0. I use the indicator vector in my calculations so that the unobserved animals contribute nothing to the final value of the log-likelihood.
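For example, the packing and one-off setup could look something like this (a minimal sketch rather than my exact code; 'capt' stands for a simulated capture history with n rows and n.traps columns, 'params.start' for starting values, and pack.data is a hypothetical helper):

library(RTMB)

# Hypothetical helper: pad a capture history ('capt', n rows x n.traps columns)
# out to M rows with all-zero histories, and prepend the indicator vector
# (1 = observed animal, 0 = artificial animal)
pack.data = function(capt, M) {
  n = nrow(capt)
  capt.aug = rbind(capt, matrix(0, nrow = M - n, ncol = ncol(capt)))
  # as.numeric() flattens column by column, matching how the objective
  # function rebuilds the matrix with matrix(..., ncol = n.traps)
  c(rep(1, n), rep(0, M - n), as.numeric(capt.aug))
}

# Environment that the DataEval() call in the objective function reads from
data.env = new.env()
assign("data", pack.data(capt, M), envir = data.env)

# Tape once; the fixed data list 'data' and the augmentation size 'M' are
# picked up from the enclosing environment by the objective function
obj = MakeADFun(objective.fn, params.start)
opt = nlminb(obj$par, obj$fn, obj$gr)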


I have two main questions:

1) Are there ways I can make my objective function more efficient?

2) I would like to fit my SCR model to many simulated data sets. The approach described above is my best attempt at providing the data as a vector of fixed length, so that I do not have to re-run MakeADFun() for every data set. Does this seem like a reasonable approach, or are there other recommended solutions?


Let me know if I can clarify anything. Thank you in advance! 

Mollie Brooks

Aug 8, 2025, 8:23:30 AM
to Rishika Chopara, TMB Users
Hi Rishika,

Sorry this question slipped by over the summer. Maybe you fixed it already.

I might be missing some of the complexity that is needed, but it seems like things could be simplified. I made some comments below.



On 30 Jun 2025, at 07.39, 'Rishika Chopara' via TMB Users <tmb-...@googlegroups.com> wrote:


I am currently trying to use RTMB to fit SCR (spatial capture-recapture) models. I am using the following objective function: 

objective.fn = function(params) {

Instead of the unpacking parts below, why not store all the objects in lists named "dat" and "par" and then use getAll(dat, par) near the top of the objective function? That seems to be the standard approach from what I've seen.

  # --------------------------------------

  # Pulling in 'data.vec'
  getDat = function(null) {
    print("Get data")
    env_get(data.env, "data")
  }
  empty = rep(params$beta0, 0)
  data.vec = DataEval(getDat, empty)

  # --------------------------------------

  # Unpacking parameters
  beta0 = params[["beta0"]]
  logit.g0 = params[["logit.g0"]]
  log.sigma = params[["log.sigma"]]
  g0 = exp(logit.g0)/(1 + exp(logit.g0))
  sigma = exp(log.sigma)

  # Unpacking data
  nocc = data[["n.occasions"]]
  mask.dists = data[["mask.dists"]]
  n.traps = ncol(mask.dists)
  n.mask = nrow(mask.dists)
  mask = data[["mask"]]
  a = attr(mask, "area")

Would it be possible to organize the data outside of the objective function, so that animal.indicator and capt.hist are separate objects in the "dat" list I suggested above?
I looked at my own project where I use the same objective function for many data sets, and it looks like I always rerun MakeADFun with every new data set. Maybe someone else will chime in and suggest something else.
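For example, the same likelihood could be written roughly like this (an untested sketch based on your function, not something I have run; 'dat' would hold n.occasions, mask.dists, mask, capt.hist, and animal.indicator, and 'par' the three parameters):

library(RTMB)

objective.fn = function(par, dat) {
  getAll(dat, par, warn = FALSE)   # list elements become local variables

  # Transformed parameters
  g0 = exp(logit.g0) / (1 + exp(logit.g0))
  sigma = exp(log.sigma)

  # Dimensions and mask area taken straight from the data objects
  n.traps = ncol(mask.dists)
  n.mask = nrow(mask.dists)
  a = attr(mask, "area")

  # Homogeneous density at each mask point
  D = rep(exp(beta0), n.mask)

  # Detection probabilities
  mask.probs = g0 * exp(-mask.dists^2 / (2 * sigma^2))
  p.det = 1 - apply(1 - mask.probs, 1, prod)^n.occasions

  # E[n]
  esa.mult.D = sum(a * D * p.det)

  # Capture-history contributions; artificial rows are zeroed via the indicator
  mask.integrand = exp(log(mask.probs) %*% t(capt.hist) +
                       log(1 - mask.probs) %*% t(n.occasions - capt.hist) +
                       log(D) - log(esa.mult.D))
  log.f.capt = sum(log(colSums(mask.integrand)) * animal.indicator)

  # Contribution for n
  log.f.n = dpois(sum(animal.indicator), esa.mult.D, log = TRUE)

  ADREPORT(g0)
  ADREPORT(sigma)
  -(log.f.n + log.f.capt)
}

# Re-taping for each data set, which is what I do in my own projects:
obj = MakeADFun(function(p) objective.fn(p, dat), par)
opt = nlminb(obj$par, obj$fn, obj$gr)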

Best wishes,
Mollie





Rishika Chopara

Aug 12, 2025, 5:29:22 AM
to TMB Users
Thank you for your reply, I appreciate it! 

So far, I haven't organised the data outside of the objective function with 'animal.indicator' and 'capt.hist' as separate objects, because it seems that any data I want to update when reusing the tape must be provided as a numeric vector of fixed length. Each time I fit the model, I want to update both 'animal.indicator' and 'capt.hist', so I provide them together in the same fixed-length vector 'data.vec'. Feel free to let me know if I can clarify this explanation :)
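Concretely, the reuse I am hoping for looks something like this (a minimal sketch; pack.data is the hypothetical packing helper from my earlier sketch, and sim.capt.histories is a placeholder list of simulated capture histories):

# Reuse the same 'obj' (and tape) for every simulated data set: only the
# contents of 'data.env' change between fits, so MakeADFun() is not re-run
fits = lapply(sim.capt.histories, function(capt) {
  assign("data", pack.data(capt, M), envir = data.env)
  nlminb(obj$par, obj$fn, obj$gr)
})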

Mollie Brooks

Aug 18, 2025, 6:26:24 AM
to Rishika Chopara, TMB Users
Hi Rishika,

I changed an existing example to show that the data input can be a list with vectors of different sizes (see below). Was this the limiting factor in how you organised your data? The only thing I couldn't put in the list in my experience so far was a formula (but I found a different way around that... a separate topic).

cheers,
Mollie

library(RTMB)
data(ChickWeight)

indx1 = 1:500
dat1 = list(weight = ChickWeight[indx1,]$weight,
            Time = ChickWeight[indx1,]$Time,
            Chick = ChickWeight[indx1,]$Chick,
            covariate = 5) # list item with different length just to show flexibility
indx2 = 20:578 # another list with objects of different length from above
dat2 = list(weight = ChickWeight[indx2,]$weight,
            Time = ChickWeight[indx2,]$Time,
            Chick = ChickWeight[indx2,]$Chick,
            covariate = 4) # list item with different length just to show flexibility
parameters = list(
  mua = 0,          ## Mean slope
  sda = 1,          ## Std of slopes
  mub = 0,          ## Mean intercept
  sdb = 1,          ## Std of intercepts
  sdeps = 1,        ## Residual Std
  a = rep(0, 50),   ## Random slope by chick
  b = rep(0, 50),   ## Random intercept by chick
  beta = 0          ## Coefficient on covariate
)

objective.fn = function(parms, dat) {
  getAll(dat, parms, warn = FALSE)
  ## Optional (enables extra RTMB features)
  weight <- OBS(weight)
  ## Initialize joint negative log likelihood
  nll <- 0
  ## Random slopes
  nll <- nll - sum(dnorm(a, mean = mua, sd = sda, log = TRUE))
  ## Random intercepts
  nll <- nll - sum(dnorm(b, mean = mub, sd = sdb, log = TRUE))
  ## Data
  predWeight <- a[Chick] * Time + b[Chick] + beta * covariate
  nll <- nll - sum(dnorm(weight, predWeight, sd = sdeps, log = TRUE))
  ## Get predicted weight uncertainties
  ADREPORT(predWeight)
  ## Return
  nll
}

cmb = function(f, d) function(p) f(p, d)

obj = MakeADFun(cmb(objective.fn, dat1), parameters, random = c("a", "b"))
opt = nlminb(obj$par, obj$fn, obj$gr)
opt$par

obj = MakeADFun(cmb(objective.fn, dat2), parameters, random = c("a", "b"))
opt = nlminb(obj$par, obj$fn, obj$gr)
opt$par

Mollie Brooks

Aug 18, 2025, 6:43:14 AM
to Rishika Chopara, TMB Users
P.S. Since your problem has a matrix in the data, here’s a version with that.


# (library(RTMB), data(ChickWeight), and 'parameters' as in the previous message)

indx1 = 1:500
dat1 = list(weight = ChickWeight[indx1,]$weight,
            Time = ChickWeight[indx1,]$Time,
            Chick = ChickWeight[indx1,]$Chick,
            covariate = 5, # list item with different length just to show flexibility
            mat = matrix(1, nrow = 5, ncol = 4)
)
indx2 = 20:578 # another list with objects of different length from above
dat2 = list(weight = ChickWeight[indx2,]$weight,
            Time = ChickWeight[indx2,]$Time,
            Chick = ChickWeight[indx2,]$Chick,
            covariate = 4, # list item with different length just to show flexibility
            mat = matrix(1, nrow = 4, ncol = 4)
)

objective.fn = function(parms, dat) {
  getAll(dat, parms, warn = FALSE)
  ## Optional (enables extra RTMB features)
  weight <- OBS(weight)
  ## Initialize joint negative log likelihood
  nll <- 0
  ## Random slopes
  nll <- nll - sum(dnorm(a, mean = mua, sd = sda, log = TRUE))
  ## Random intercepts
  nll <- nll - sum(dnorm(b, mean = mub, sd = sdb, log = TRUE))
  ## Data
  predWeight <- a[Chick] * Time + b[Chick] + beta * covariate + mat[1, 1]
  nll <- nll - sum(dnorm(weight, predWeight, sd = sdeps, log = TRUE))
  ## Get predicted weight uncertainties
  ADREPORT(predWeight)
  ## Return
  nll
}