Specifying multiple time-varying covariates as a data frame in secr.fit


James Helferich

Mar 1, 2023, 11:12:49 AM
to secr
Hi All! I am working with a multisession SCR dataset of rattlesnake data: 40 sessions, each with between 6 and 9 occasions. I have been able to fit a model with the time- (occasion-) varying covariate "search minutes" by specifying time covariates directly in secr.fit (a method recommended in a previous thread for handling a varying number of occasions between sessions by padding with NAs).
However, in addition to search minutes, I would also like to add the time-varying covariate "gravidness", representing the percent of captures gravid in each occasion. By mimicking the input of the 'timevaryingcov' function, I was able to get a model to start by specifying it like this. [The code below is a test with only 2 sessions; m is simply a pre-specified constant.]

My question is essentially to confirm that this is the correct way to specify it, particularly because the model does not run when I put "g0 ~ sea + grv" and I must instead put "g0 ~ tcov[[1]] + tcov[[2]]", and I could not find any examples of multiple separate sets of time-varying covariates in the help PDFs or in previously asked questions. I am not specifying the effort as actual usage data because it is standardized around 0, and I believe having "negative" usage was the cause of previous errors. Thanks so much for your time and help!

Best,
James Helferich

Murray Efford

Mar 1, 2023, 12:51:22 PM
to secr
Hello James
Welcome to the group. Unfortunately, the code was missing from your post as it came through (at least on my screen). Can you try again?
Murray

James Helferich

Mar 1, 2023, 3:48:46 PM
to secr
Hi Murray, my apologies; it looked OK when I sent it. I have re-copied my original message below, and it should hopefully come through properly this time. Thank you!!


Hi All! I am working with a multisession SCR dataset of rattlesnake data: 40 sessions, each with between 6 and 9 occasions. I have been able to fit a model with the time- (occasion-) varying covariate "search minutes" by specifying time covariates directly in secr.fit (a method recommended in a previous thread for handling a varying number of occasions between sessions by padding with NAs).

> sapply(searchmin, round, 1)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
 [1,]  4.6 -0.4 -0.4 -0.4 -0.3 -0.3  4.3 -0.3 -0.3   4.2  -0.3  -0.4   4.6  -0.4  -0.4  -0.4
 [2,] -0.2 -0.4 -0.3 -0.2 -0.4 -0.1  0.3 -0.1 -0.3  -0.4  -0.2  -0.3  -0.2  -0.4  -0.3  -0.2
 [3,]  0.0 -0.4 -0.3  3.6 -0.3 -0.2  0.1 -0.3 -0.3   0.2  -0.2  -0.4   0.0  -0.4  -0.3   3.6
 [4,] -0.2  0.0 -0.2 -0.3 -0.4 -0.3  0.0 -0.3 -0.4  -0.2  -0.1  -0.3  -0.2   0.0  -0.2  -0.3
 [5,] -0.1  0.2 -0.3 -0.2 -0.3 -0.4  0.1 -0.1 -0.4  -0.2  -0.2  -0.3  -0.1   0.2  -0.3  -0.2
 [6,] -0.2 -0.2 -0.4 -0.1 -0.3 -0.3 -0.2 -0.2  0.1  -0.2  -0.3  -0.4  -0.2  -0.2  -0.4  -0.1
 [7,]   NA   NA   NA   NA   NA -0.4   NA   NA   NA    NA    NA  -0.2    NA    NA    NA    NA
 [8,]   NA   NA   NA   NA   NA -0.3   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA
 [9,]   NA   NA   NA   NA   NA -0.1   NA   NA   NA    NA    NA    NA    NA    NA    NA    NA
      [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28] [,29] [,30] {{etc.}}

#D~1, p0~Effort, sig~1
fit.Effort <- secr.fit(CH, mask = masklist, timecov = searchmin,  
                       model = c(D ~ 1, g0 ~ tcov, sigma ~ 1),
                       trace = NULL, verify = TRUE)


{{Output}}
> #D~1, p0~Effort, sig~1
> fit.Effort <- secr.fit(CH, mask = masklist, timecov = searchmin,  
+                        model = c(D ~ 1, g0 ~ tcov, sigma ~ 1),
+                        trace = NULL, verify = TRUE)
Checking data
Preparing detection design matrices
Preparing density design matrix
Finding initial parameter values...
Initial values  D = 0.26054, g0 = 0.00051, sigma = 140.02857
Maximizing likelihood...
Eval     Loglik        D       g0  g0.tcov    sigma


However, in addition to search minutes, I would also like to add the time-varying covariate "gravidness", representing the percent of captures gravid in each occasion. By mimicking the input of the 'timevaryingcov' function, I was able to get a model to start by specifying it like this. [The code below is a test with only 2 sessions; m is simply a pre-specified constant.]

###Test with only session 1 and 18
tv1 <- list(sea = list(c(4.56, -0.23, -0.001, -0.25, -0.06, -0.19, NA, NA, NA),    ##Session 1
                       c(-0.31, -0.14, -0.23, -0.28, -0.38, -0.32, -0.37, -0.27, -0.15)),     ##Session 18
            grv = list(c(m, m, m, m, m, m, NA, NA, NA),      ##Session 1
                       c(2.89, 0.69, m, m, m, m, m, m, m)))       ##Session 18

#D~1, p0~Effort+Gravid, sig~1
fit.Effortt <- secr.fit(CHt, mask = list(mask1, mask18), timecov = tv1,  
                        model = c(D ~ 1, g0 ~ tcov[[1]] + tcov[[2]], sigma ~ 1),
                        trace = NULL, verify = TRUE)

{{Output}}
> #D~1, p0~Effort+Gravid, sig~1
> fit.Effortt <- secr.fit(CHt, mask = list(mask1, mask18), timecov = tv1,
+                         model = c(D ~ 1, g0 ~ tcov[[1]] + tcov[[2]], sigma ~ 1),
+                         trace = NULL, verify = TRUE)
Checking data
Preparing detection design matrices
Preparing density design matrix
Finding initial parameter values...
Initial values  D = 0.25369, g0 = 0.00051, sigma = 140.02857
Maximizing likelihood...
Eval     Loglik            D           g0 g0.tcov[[1]] g0.tcov[[2]]        sigma

My question is essentially to confirm that this is the correct way to specify it, particularly because the model does not run when I put "g0 ~ sea + grv" and I must instead put "g0 ~ tcov[[1]] + tcov[[2]]", and I could not find any examples of multiple separate sets of time-varying covariates in the help PDFs or in previously asked questions. I am not specifying the effort as actual usage data because it is standardized around 0, and I believe having "negative" usage was the cause of previous errors. Thanks so much for your time and help!

Best,
James Helferich


Murray Efford

Mar 3, 2023, 9:50:48 PM
to secr
Hi James

It's hard for me to follow your logic. What is searchmin (a matrix or a data.frame?), and what are its dimensions? The variation in (scaled) search effort seems extreme. Why do you want 'gravidness' to be a population-wide covariate (an indicator of breeding vs non-breeding season)?

Which previous post? Time-varying trap covariates are almost certainly not relevant. My guess is that time covariates will work once you present timecov with a list of data.frames, each with columns 'sea' and 'grv', and use those names in the model. However, if you must have (possibly negative) effort varying by trap, occasion, and session, then you have a large, messy problem. It's easier to help if you provide a simple reproducible example.
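To illustrate the structure I have in mind (a sketch only; the values and occasion counts below are made up, not taken from your data):

```r
library(secr)

# Hypothetical timecov: one data.frame per session, each with columns
# 'sea' and 'grv' and one row per occasion in that session
timevary <- list(
    data.frame(sea = c(4.56, -0.23, -0.01),        # Session 1, 3 occasions
               grv = c(0.20, 0.50, 0.50)),
    data.frame(sea = c(-0.31, -0.14, -0.23, -0.28), # Session 2, 4 occasions
               grv = c(2.89, 0.69, 0.50, 0.50))
)

# Then refer to the column names directly in the model, e.g.
# secr.fit(CH, mask = masklist, timecov = timevary,
#          model = c(D ~ 1, g0 ~ sea + grv, sigma ~ 1))
```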

Murray

James Helferich

Mar 6, 2023, 3:18:08 AM
to secr
Hi Murray,

Thanks so much for getting back to me. I realize I probably should have explained my analysis and study design more fully to avoid confusion. I am working with four years of data that I split into 3 sessions per year (representing 3 different ecologically relevant periods). Because we are using full likelihood and would like to evaluate covariates on the spatial distribution of density, we could not use individual covariates (such as age), so we further split Adult Males, Adult Females, Juveniles, and Neonates into separate sessions. That is why gravidness applies to the whole session (while still varying by occasion): it is the percent of all captures in an Adult Female session that were gravid, rather than an indicator of whether an individual is gravid.

Our effort (searchmin) information is also not trap-varying, but occasion-varying. Each trap is actually the center of a cell in a sampling grid, and the entire grid was visually surveyed at least once per occasion. While most of the time searches were conducted by one or two people, some weeks/occasions had influxes of volunteers to help with visual surveys, causing the extreme variation in search effort. That is one reason we scaled effort in the first place.

The post I was referring to, in which you used NAs to add a time-varying covariate to a multisession model with different numbers of occasions per session, is here (https://groups.google.com/g/secrgroup/c/i7EQZ-cTrjE/m/r8CYDSUNAgAJ). I made a simple example of my current problem (specifying multiple occasion-varying covariates), including the solutions I have already tried. I did discover that the method I used in my original post would not work. The example shows fitting a null model, as well as my steps to successfully fit a model with a single time-varying covariate. However, as I mentioned, I am still stuck on specifying multiple time-varying covariates. The example code is copied below and also attached as a txt file (along with the necessary files for the edf, tdfs, and masks). There are three sessions with varying numbers of occasions but the same mask. One additional but related question: does secr internally standardize covariates? I could not find any info on this, but I know some other programs/packages do. Thank you so much for all your time and help; I appreciate both your help on this and your contributions to all the previous posts on the group, which I have learned a lot from.

library(secr)
#Load files and make secr objects
edf <- read.csv("testofedf.csv")
mask <- read.csv("testofmask.csv")

tdf1 <- read.traps(file = "testoftdf.txt", detector = "count")
tdf2 <- read.traps(file = "testoftdf.txt", detector = "count")
tdf3 <- read.traps(file = "testoftdf.txt", detector = "count")

tdflist <- list(tdf1, tdf2, tdf3)

mask1 <- read.mask(data=mask, spacing = 20, header = TRUE)
mask2 <- read.mask(data=mask, spacing = 20, header = TRUE)
mask3 <- read.mask(data=mask, spacing = 20, header = TRUE)

masklist <- list(mask1, mask2, mask3)

#Make Capture history
CH <- make.capthist(captures = edf, traps = tdflist,
                    fmt = "trapID", noccasions = c(4, 5, 4),
                    bysession = TRUE)
verify(CH)

#Fit initial null model, this works
fit <- secr.fit(CH, mask = masklist,
                model = c(D ~ 1, g0 ~ 1, sigma ~ 1),
                trace = NULL, verify = TRUE, start = c(2, -3, 4))


summary(fit)


#Use NAs to add a single occasion-varying covariate and fit the model (this works)
var1 <- list(c(1, 4, 3, 3, NA), #Session 1
             c(3, 2, 1, 1, 4),#Session 2
             c(4, 1, 4, 2, NA))# Session 3
sapply(var1, round, 1)


fit.2 <- secr.fit(CH, mask = masklist, timecov = var1,
                  model = c(D ~ 1, g0 ~ tcov, sigma ~ 1),
                  trace = NULL, verify = TRUE, start = c(2, -3, 0, 4))


summary(fit.2)



#Try to add two different occasion-varying covariates (var1 and var2) to the model
#These are some of the different methods I have tried, but none of them runs
timevary <- list(data.frame(var1 = c(1, 2, 3, 3, NA), var2 = c(10, 12, 13, 10, NA)), #Session 1
                 data.frame(var1 = c(3, 2, 1, 1, 4), var2 = c(15, 9, 11, 12, 13)), #Session 2
                 data.frame(var1 = c(4, 1, 4, 2, NA), var2 = c(13, 13, 12, 12, NA))) #Session 3



#timevary <- list(list(var1 = c(1, 2, 3, 3, NA), var2 = c(10, 12, 13, 10, NA)), #Session 1
#                 list(var1 = c(3, 2, 1, 1, 4), var2 = c(15, 9, 11, 12, 13)),   #Session 2
#                 list(var1 = c(4, 1, 4, 2, NA), var2 = c(3, 13, 12, 12, NA)))  #Session 3


#timevary <- list(var1 = list(c(1, 2, 3, 3, NA), c(3, 2, 1, 1, 4), c(4, 1, 4, 2, NA)),
#                 var2 = list(c(10, 12, 13, 10, NA), c(15, 9, 11, 12, 13), c(3, 13, 12, 12, NA)))



fit.3 <- secr.fit(CH, mask = masklist, timecov = timevary,
                  model = c(D ~ 1, g0 ~ var1, sigma ~ 1),
                  trace = NULL, verify = TRUE, start = c(2, -3, 0, 4))


Best,

James
testofedf.csv
testofmask.csv
testoftdf.txt
Rscript-test.txt

Murray Efford

Mar 6, 2023, 1:16:21 PM
to secr
Hello James

Thanks for taking the trouble to lay this out so clearly. I was distracted by your 'time-varying' terminology - that term has been kept in 'secr' for trap covariates that vary over time.

I think the problem is embarrassingly simple. The default in secr >= 4.0 is to collapse multi-occasion data to a single occasion when possible (binary or count proximity data). That should not be allowed when occasion-level detail is modelled, but it seems to have got past the check (I'll look to fix that). You can suppress the 'collapse to single occasion' rule by setting the details argument fastproximity = FALSE. This works for me:

fit.3a <- secr.fit(CH, mask = masklist, timecov = timevary,
    model = c(D ~ 1, g0 ~ var1, sigma ~ 1),
    trace = NULL, verify = TRUE, start = c(2, -3, 0, 4),
    details = list(fastproximity = FALSE))

Incidentally, I suspect padding with NA is not needed when you specify timecov in the full list-of-data.frames form, so this works for me:

timevary <- list(data.frame(var1 = c(1, 2, 3, 3), var2 = c(10, 12, 13, 10)),       #Session 1
                 data.frame(var1 = c(3, 2, 1, 1, 4), var2 = c(15, 9, 11, 12, 13)), #Session 2
                 data.frame(var1 = c(4, 1, 4, 2), var2 = c(13, 13, 12, 12)))       #Session 3

Murray

P.S. I debugged this by first checking the construction-of-design-matrices step internal to secr.fit:
secr.design.MS (CH, models = list(D = ~1, g0 = ~var1, sigma = ~1), timecov = timevary)
which gave no error. (Note formulae are in a slightly modified standard form).

James Helferich

Mar 8, 2023, 3:53:09 PM
to secr
Hi Murray!

Thanks so much! That works on my end and it appears to be working when applied to my full set of actual data. I also am able to now use it without padding with NAs. I appreciate the help!

Best,

James

James Helferich

Mar 15, 2023, 5:33:54 PM
to secr
Hi Murray!

I wanted to send a quick follow-up question now that I have gotten a model to run (using fastproximity = FALSE and the newest version of secr, 4.5.11, and without padding the timecovs with NAs). The estimates all look reasonable; however, the output shows the SE of the sigma parameter as NaN, and a negative variance in the variance-covariance matrix. It also shows NaN for the error of detection probability, but only when search effort is exceptionally high:


(g0.sea is search effort)
Beta parameters (coefficients)
              beta     SE.beta        lcl        ucl
D       -0.6107564 0.036300859 -0.6819048 -0.5396080
g0      -5.6923115 0.006167382 -5.7043994 -5.6802237
g0.sea   0.3718115 0.001860112  0.3681658  0.3754573
sigma    3.7178216         NaN        NaN        NaN

Variance-covariance matrix of beta parameters
                   D            g0        g0.sea         sigma
D       1.317752e-03 -1.275211e-05  3.352004e-05  1.542862e-05
g0     -1.275211e-05  3.803660e-05 -4.603059e-05  1.411959e-06
g0.sea  3.352004e-05 -4.603059e-05  3.460017e-06 -2.726363e-07
sigma   1.542862e-05  1.411959e-06 -2.726363e-07 -1.062926e-05

Fitted (real) parameters evaluated at base levels of covariates
session = 1, sea = 4.56
       link     estimate SE.estimate       lcl       ucl
D      log    0.54294002  0.01971568 0.5056529 0.5829767
g0     logit  0.01804192         NaN       NaN       NaN
sigma  log   41.17460323         NaN       NaN       NaN
session = 2, sea = -0.44
       link     estimate  SE.estimate         lcl        ucl
D      log    0.54294002 1.971568e-02 0.505652908 0.58297671
g0     logit  0.00285476 2.533537e-05 0.002805532 0.00290485
sigma  log   41.17460323          NaN         NaN        NaN

(.....sessions 3 through 40)


The estimates themselves are actually similar to those I got when I was previously running models with an older version of secr (when I was only trying to specify one occasion-varying covariate and was padding the sessions with NAs, as described in my original post). I was confused as to why this new version would converge on such "good" estimates but also have an issue calculating variance. I have seen in previous posts here that using the default "Newton-Raphson" method (which I did) can cause NaNs, so could the issue be with the method? I am waiting on a second, more parameterized version to finish, which may provide additional insight depending on whether it also has NaNs. I do expect differences in sigma between sessions, and I have models that hypothesize a session effect in my candidate set. With that, my second question is: could a more parameterized model with a better fit actually help variance estimation? Thanks so much for your help and time!

Best,

James

Murray Efford

Mar 16, 2023, 4:42:48 PM
to secr
James
Extreme or missing variance estimates are usually taken to indicate a failure of model fitting. This happens when the model is not identifiable with the particular data, or when the numerical maximization gets into trouble (sometimes solved by specifying starting values other than the defaults). However, it can also happen that only the variance estimates are wrong; then, good estimates can sometimes be obtained by recalculating them with a different algorithm. So I suggest you start by running secr.fit() with the same data and arguments, but adding method = "none" and start = nameoffittedmodel. This will recompute the variance-covariance matrix at the previous estimates (using fdHess from package nlme; the likelihood is evaluated a few times around these values).
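In code, that looks something like this (a sketch; the object and argument names are placeholders for your own, and all data arguments should match the original call):

```r
# Recompute only the variance-covariance matrix at the previous estimates,
# without re-maximizing; 'fit.3a' stands in for the previously fitted model
fit.3a.var <- secr.fit(CH, mask = masklist, timecov = timevary,
    model = c(D ~ 1, g0 ~ var1, sigma ~ 1),
    details = list(fastproximity = FALSE),
    method = "none", start = fit.3a)
```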
Murray

James Helferich

Mar 24, 2023, 12:57:23 PM
to secr
Thanks Murray! That seems to work!

James

James Helferich

Apr 12, 2023, 12:03:18 PM
to secr

Hi Murray,

Another question in relation to using method = "none" with the starting values of the original model to recompute the variance. It has worked great, but I have one model where the recomputed standard error is essentially 0 (or very close to it) for every parameter (see below for the original and recomputed estimates). Is this an indication that there is actually an error in convergence (the beta values do not change between the original and the re-fit), or is this a case where running it again may work? Interestingly, in the original model the SEs that were NaN were all for the g0 parameters and covariates, and I was wondering if that could somehow affect the refit. Thank you so much!

Original:
Beta parameters (coefficients)
                     beta     SE.beta         lcl         ucl
D             -0.40190515 0.094006805 -0.58615510 -0.21765520
D.DemAF       -0.25472472 0.108302921 -0.46699455 -0.04245490
D.DemJ         0.16396944 0.127798903 -0.08651180  0.41445069
D.DemN         0.68288027 0.142628877  0.40333280  0.96242773
g0            -5.18830132         NaN         NaN         NaN
g0.sea         0.34004321         NaN         NaN         NaN
g0.Sexnum     -0.53747023         NaN         NaN         NaN
sigma          3.68597353 0.008785317  3.66875463  3.70319244
sigma.Dem2AM  -0.11212146 0.007013705 -0.12586807 -0.09837485
sigma.Dem2JN   0.02043132 0.038427690 -0.05488557  0.09574821

Re-fit using method = "none":
Beta parameters (coefficients)
                     beta      SE.beta         lcl         ucl
D             -0.40190515 1.029478e-19 -0.40190515 -0.40190515
D.DemAF       -0.25472472 2.977890e-20 -0.25472472 -0.25472472
D.DemJ         0.16396944 3.650670e-20  0.16396944  0.16396944
D.DemN         0.68288027 1.875241e-20  0.68288027  0.68288027
g0            -5.18830132 5.188303e-08 -5.18830142 -5.18830122
g0.sea         0.34004321 3.400433e-09  0.34004320  0.34004322
g0.Sexnum     -0.53747023 5.374704e-09 -0.53747024 -0.53747022
sigma          3.68597353 3.685975e-08  3.68597346  3.68597361
sigma.Dem2AM  -0.11212146 1.121215e-09 -0.11212146 -0.11212145
sigma.Dem2JN   0.02043132 1.409254e-19  0.02043132  0.02043132

Best,
James

Murray Efford

Apr 14, 2023, 3:22:37 PM
to secr
James
I can offer sympathy, but not much insight. There seems to be a real problem with that model, so either the data were inadequate (for the complexity of the given model) or there was a numerical problem. Numerical problems can sometimes be solved by messing around with alternative algorithms (method = 'Nelder-Mead'?) or starting values.
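For example (a sketch only; substitute your own data, model, and plausible starting values):

```r
# Re-fit with an alternative maximization algorithm and explicit
# starting values; all names here stand in for your own objects
fit.NM <- secr.fit(CH, mask = masklist, timecov = timevary,
    model = c(D ~ 1, g0 ~ var1, sigma ~ 1),
    details = list(fastproximity = FALSE),
    method = "Nelder-Mead", start = c(2, -3, 0, 4))
```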
Murray
P.S. Technically method = 'none' _only_ recomputes the variance-covariance matrix, so 're-fitting' is not accurate despite the re-use of secr.fit().