dredge {MuMIn} for unmarkedFitPCO object (as from pcountOpen)

560 views
Skip to first unread message

simone santoro

unread,
Jan 25, 2016, 6:20:26 AM1/25/16
to unmarked
Hi all,

I have an object of class unmarkedFitPCO that I obtain by running a model like this:
Model1 <- pcountOpen(~var1, ~var2, ~1, ~var3, data= data$umf, dynamics= "trend",K=150)

It works fine (I am using it on simulated data and it works very well).
However, it would be very helpful for me using the 'dredge' function {MuMIn} that in the past I have already used on an unmarkedFit object (from a 'pcount' analysis).
This is the error message I get:
> dredge(Model1)
Error in dimnames(deps) <- list(ret, ret) : 
  length of 'dimnames' [1] not equal to array extent

Any help would be very appreciated,
Ciao

Simone

simone santoro

unread,
Jan 26, 2016, 6:22:23 PM1/26/16
to unmarked
Hi again,

I have been doing several trials to understand in which context the 'dredge' function does not work with models of this class.
This is what I get with the formula specific for a population growth model (a parameter for the population growth rate) that has this argument: dynamics = "trend".
In fact:
m1<- pcountOpen(~1, ~1, ~1, ~1, data= data$umf,K=30,dynamics="trend")
> dredge(m2)
Fixed terms are "lam(Int)", "gamTrend(Int)", "p(Int)", "NA(Int)" and "NA(Int)"
Error in names(zarg) <- formula.arg : 
  'names' attribute [4] must be the same length as the vector [3]
However, if I try with the 'normal' Dail & Madsen model (i.e. without the dynamics="trend"), it seems to work OK:
> m2<- pcountOpen(~1, ~1, ~1, ~1, data= data$umf,K=30)
> dredge(m2)
Fixed terms are "lam(Int)", "gamConst(Int)", "omega(Int)", "p(Int)" and "NA(Int)"
Global model call: pcountOpen(lambdaformula = ~1, gammaformula = ~1, omegaformula = ~1, 
    pformula = ~1, data = data$umf, K = 30)
---
Model selection table 
  lam(Int) gmC(Int) omg(Int)  p(Int) NA(Int) df   logLik   AICc delta weight
1    1.516  -0.6702    3.825 -0.8391       +  4 -909.005 1826.9     0      1
Models ranked by AICc(x) 

Have anyone an idea on how to fix it? I have to run quite a few models on simulated data under several scenarios and it would be really helpful to have the 'dredge' function working.
Bye,

Simone

Jeffrey Royle

unread,
Jan 26, 2016, 6:36:06 PM1/26/16
to unma...@googlegroups.com
Dear Simone,
 of course the dredge function is not part of unmarked and so it might be useful to direct this to the creator of the MuMIn package as I think it's probably an issue with dredge which perhaps isn't up-to-date with the last release of unmarked which includes the Hostetler and Chandler extensions of the DM model.
 If the creators of the dredge function can't fix it, you may have to debug this yourself (in which case a posting of the solution to the unmarked list would be appreciated!).

 If anyone can fix this problem and post a solution and working R script example then I'll give them a free Applied Hierarchical Models book. (be sure the contact me personally after the fact in case I miss postings on the email list).

regards
andy


--
You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Richard Schuster

unread,
Jan 27, 2016, 9:03:28 AM1/27/16
to unma...@googlegroups.com, simones...@gmail.com
Hi Simone,

The reason why dredge is not working is this:

the standard pcountOpen has four parts to its model formula and four estimates if you look at say:

summary(m2)

of your example

pcountOpen with dynamics="trend" also has four parts to its model formula, but only three estimates looking at

summary(m1)

Instead of 'Recruitment' and 'Apparent Survival' you only get 'Growth Rate'.
That confuses dredge, because it expects 4 estimates from the supplied formula that has 4 parts, but it only finds 3 estimates.

As for a fix: What parts of the model would you like to dredge? I am not sure how dredge works with the standard pcountOpen, but I am assuming it only dredges one part at a time (e.g. the lambda formula)?
I could create a very crude function where you would supply your null model (e.g. m2, or a model with multiple terms) a vector of terms to be added and specify the part of the model you want to dredge over.
If that's of interest to you, could you send me some example data, ideally including covariates and setup code so I can test this?

Best regards,
Richard

Drew Ricketts

unread,
Jan 27, 2016, 9:04:21 PM1/27/16
to unmarked, simones...@gmail.com
I don't think this issue has to do with the dynamics argument.  I've had similar problems as Simone using the default settings for trend.

Drew Ricketts

unread,
Jan 27, 2016, 9:05:11 PM1/27/16
to unmarked, simones...@gmail.com
meant to say default settings for dynamics...

Richard Schuster

unread,
Jan 27, 2016, 10:29:50 PM1/27/16
to unma...@googlegroups.com
Hi Drew,

Do you recall the error message you got?
I am pretty sure that the one that Simone had was related to what I described. At least that's what I assumed after looking at the MuMIn source code.
If you have example data (and code) we could test this though.

Cheers,
Richard
This email has been sent from a virus-free computer protected by Avast.
www.avast.com

Andrew Ricketts

unread,
Jan 27, 2016, 11:09:15 PM1/27/16
to unma...@googlegroups.com
Hi Richard,

Thanks for all of the help you’ve been giving folks on this group.  All of you gurus are an awesome resource for those of us with fewer skills and/or less knowledge.

Here’s a simple example using the Alder Flycatcher data that is on the web somewhere (should also be attached):

> alfl.data <- read.csv("alfl0506.csv", row.names=1)
> str(alfl.data)
'data.frame': 49 obs. of  20 variables:
 $ alfl1_05: int  1 2 0 1 0 3 0 0 0 0 ...
 $ alfl2_05: int  0 3 1 0 0 1 0 0 1 0 ...
 $ alfl3_05: int  1 0 1 1 0 1 0 0 2 0 ...
 $ alfl1_06: int  0 0 1 1 0 0 0 0 1 1 ...
 $ alfl2_06: int  1 0 0 0 0 0 0 0 0 0 ...
 $ alfl3_06: int  1 0 0 1 0 0 0 0 0 0 ...
 $ struct  : num  5.45 4.75 14.7 5.05 4.15 9.75 9.6 15.7 9.2 7.75 ...
 $ woody   : int  6 1 7 6 2 8 4 0 4 3 ...
 $ time1_05: num  8.68 9.43 8.25 7.77 9.57 9.1 8.6 8.12 7.63 9.92 ...
 $ time2_05: num  8.73 7.4 6.7 6.23 9.55 9.12 8.62 7.92 7.43 5.67 ...
 $ time3_05: num  5.72 7.58 7.62 7.17 5.73 9.12 6.72 8.07 7.6 9.72 ...
 $ date1_05: int  6 20 20 20 8 8 8 1 1 1 ...
 $ date2_05: int  25 32 32 32 27 27 27 27 27 27 ...
 $ date3_05: int  34 54 47 47 36 36 36 36 36 36 ...
 $ time1_06: num  8.75 6.38 7.07 5.75 10 9.55 9.02 6.13 5.63 7.93 ...
 $ time2_06: num  5.25 8.12 7.07 7.53 7.37 6.87 6.38 9.27 8.83 7.53 ...
 $ time3_06: num  5.12 6.87 6.2 5.75 5.75 ...
 $ date1_06: int  8 13 13 13 12 12 12 12 12 12 ...
 $ date2_06: int  26 37 37 37 28 28 28 28 28 28 ...
 $ date3_06: int  44 48 48 48 46 46 46 46 46 46 ...
> ### Load Package ###
> library(unmarked)
Loading required package: reshape
Loading required package: lattice
Loading required package: Rcpp
> ### Make Unmarked PCO dataframe ###
> alfl.umf <- unmarkedFramePCO(y=alfl.data[,1:6],
+     siteCovs=alfl.data[,c("woody", "struct")],
+ # we could have yearlySiteCovs here.
+     obsCovs=list(time=alfl.data[,c("time1_05", "time2_05", "time3_05",
+                                    "time2_06", "time3_06", "time3_06")],
+                  date=alfl.data[,c("date1_05", "date2_05", "date3_05",
+                                    "date2_06", "date3_06", "date3_06")]),
+     numPrimary=2)
> # Standardize covariates after making the UMF
> siteCovs(alfl.umf) <- scale(siteCovs(alfl.umf))
> obsCovs(alfl.umf) <- scale(obsCovs(alfl.umf))
> summary(alfl.umf)
unmarkedFrame Object

49 sites
Maximum number of observations per site: 6 
Mean number of observations per site: 6 
Number of primary survey periods: 2 
Number of secondary survey periods: 3 
Sites with at least one detection: 42 

Tabulation of y observations:
   0    1    2    3 <NA> 
 183   76   25   10    0 

Site-level covariates:
     woody             struct         
 Min.   :-1.5749   Min.   :-1.793802  
 1st Qu.:-0.5887   1st Qu.:-0.797115  
 Median :-0.0956   Median :-0.002653  
 Mean   : 0.0000   Mean   : 0.000000  
 3rd Qu.: 0.6441   3rd Qu.: 0.618471  
 Max.   : 2.3699   Max.   : 3.247416  

Observation-level covariates:
      time               date        
 Min.   :-1.96447   Min.   :-2.3602  
 1st Qu.:-0.80321   1st Qu.:-0.5701  
 Median :-0.04802   Median : 0.1012  
 Mean   : 0.00000   Mean   : 0.0000  
 3rd Qu.: 0.81723   3rd Qu.: 0.9217  
 Max.   : 2.38455   Max.   : 1.5930  
> library(MuMIn)
(afl.global.p <- pcountOpen(~1, ~1, ~1, ~date+time+struct, alfl.umf,
+                    K=30))

Call:
pcountOpen(lambdaformula = ~1, gammaformula = ~1, omegaformula = ~1, 
    pformula = ~date + time + struct, data = alfl.umf, K = 30)

Abundance:
 Estimate    SE    z P(>|z|)
     0.59 0.218 2.71 0.00673

Recruitment:
 Estimate    SE    z P(>|z|)
    0.863 0.417 2.07  0.0386

Apparent Survival:
 Estimate   SE      z P(>|z|)
    -1.35 2.63 -0.515   0.607

Detection:
            Estimate    SE     z  P(>|z|)
(Intercept)   -1.234 0.297 -4.16 3.18e-05
date          -0.790 0.145 -5.45 4.96e-08
time          -0.281 0.122 -2.30 2.12e-02
struct         0.218 0.140  1.55 1.21e-01

AIC: 537.5536 
> dredge(afl.global.p)
Fixed terms are "lam(Int)", "gamConst(Int)", "omega(Int)", "p(Int)" and "NA(Int)"
Error in matchCoef(fit1, all.terms = allTerms, beta = betaMode, allCoef = TRUE) : 
  'm1' is not nested within 'm2'
In addition: Warning message:
In model.frame.default(gamformula, transCovs, na.action = NULL) :
  object is not a matrix (model 1 skipped)



The other thing I’ve run into is that dredge will run the intercept only model, but not the additional models models with covariates



global.lambda <-pcountOpen(lambdaformula = ~woody+struct, gammaformula = ~1, omegaformula = ~1, 
+     pformula = ~1, data = alfl.umf, K = 30)
> global.lambda

Call:
pcountOpen(lambdaformula = ~woody + struct, gammaformula = ~1, 
    omegaformula = ~1, pformula = ~1, data = alfl.umf, K = 30)

Abundance:
            Estimate    SE    z  P(>|z|)
(Intercept)    1.779 0.413 4.31 1.63e-05
woody          0.476 0.113 4.22 2.47e-05
struct         0.130 0.103 1.26 2.08e-01

Recruitment:
 Estimate    SE    z P(>|z|)
     1.09 0.549 1.98  0.0479

Apparent Survival:
 Estimate    SE      z P(>|z|)
   -0.742 0.938 -0.791   0.429

Detection:
 Estimate    SE     z  P(>|z|)
    -2.33 0.427 -5.46 4.63e-08

AIC: 556.5234 
> dredge(global.lambda)
Fixed terms are "lam(Int)", "gamConst(Int)", "omega(Int)", "p(Int)" and "NA(Int)"
Global model call: pcountOpen(lambdaformula = ~woody + struct, gammaformula = ~1, 
    omegaformula = ~1, pformula = ~1, data = alfl.umf, K = 30)
---
Model selection table 
  lam(Int) gmC(Int) omg(Int) p(Int) NA(Int) df   logLik  AICc delta weight
1     1.54   0.4094   -0.173  -1.92       +  4 -284.959 578.8     0      1
Models ranked by AICc(x) 

No warning messages, but it doesn’t seem to have “dredged” the global model I gave it…

alfl0506.csv
ATT00001.htm

Richard Schuster

unread,
Jan 27, 2016, 11:24:44 PM1/27/16
to unma...@googlegroups.com
Thanks very much for the example Drew.
Can't really take credit for all the work Richard Chandler is doing here, I just try to chip in from time to time.

Going to give this a try tomorrow and let you know what I find. I have been talking to Simone about adopting one of my dredge like functions to
work with pcountOpen. I initially wrote it to work with occu, but have started to adopt it for other unmarked functions as well.

Cheers,
Richard S.
--
You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

simone santoro

unread,
Jan 28, 2016, 6:17:47 AM1/28/16
to unmarked
Hi all,

First, thanks to all of you. I forgot to mention the post (that I had read) by Drew Richetts on another problem with the dredge function for a model pcountOpen, sorry for that.
Below you can find a reproducible code for my case (attached the files with data). These are simulated data sets, that's why are so nice...

> counts<- read.table("exampleCounts.txt")# counts
> scv<- read.table("exampleScv.txt")# site covariates, there are two: A1 that will be tested on initial abundance (first parameter), and KR that will be tested on detection probability (fourth and last parameter). 
> yscv<- read.table("exampleYscv.txt")# yearly-site covariates, I will test this on the intrinsic growth rate that, when dynamics= "trend", is the second parameter in the pcountOpen model.
> yscv<- list(A2= matrix(rep(yscv[,1],nsites),nsites,nprimary,byrow=T))
> nsites<- 20
> nprimary<- 10 # no. of primary periods
> umf<- unmarkedFramePCO(y= counts, siteCovs= scv, yearlySiteCovs= yscv, numPrimary= nprimary)
> summary(umf)
unmarkedFrame Object

20 sites
Maximum number of observations per site: 10 
Mean number of observations per site: 10 
Number of primary survey periods: 10 
Number of secondary survey periods: 1 
Sites with at least one detection: 20 

Tabulation of y observations:
   0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20   21   22   23   24   25   26   27   28   33   35   36 <NA> 
   1    3    5    5   13   11   13   12   10   14   13   18    5   11    8    7   11    7    6    7    2    2    1    2    2    3    2    2    1    1    1    1    0 

Site-level covariates:
       A1              KR        
 Min.   :3.926   Min.   :-2.753  
 1st Qu.:4.021   1st Qu.:-2.611  
 Median :4.235   Median :-2.493  
 Mean   :4.235   Mean   :-2.465  
 3rd Qu.:4.441   3rd Qu.:-2.338  
 Max.   :4.597   Max.   :-2.198  

Yearly-site-level covariates:
       A2          
 Min.   :-0.48984  
 1st Qu.:-0.24796  
 Median :-0.08357  
 Mean   : 0.02882  
 3rd Qu.: 0.46201  
 Max.   : 0.64999  
> m1<- pcountOpen(~A1, ~A2, ~1, ~KR, data= umf, dynamics= "trend",K=max(data$y)*4)
> m1

Call:
pcountOpen(lambdaformula = ~A1, gammaformula = ~A2, omegaformula = ~1, 
    pformula = ~KR, data = umf, K = max(data$y) * 4, dynamics = "trend")

Abundance:
            Estimate    SE      z P(>|z|)
(Intercept)   0.0737 1.408 0.0524  0.9582
A1            0.7037 0.326 2.1557  0.0311

Growth Rate:
            Estimate     SE      z  P(>|z|)
(Intercept)  0.00271 0.0138  0.196 8.45e-01
A2           1.10462 0.0723 15.282 1.00e-52

Detection:
            Estimate    SE    z P(>|z|)
(Intercept)     1.80 1.377 1.30  0.1921
KR              1.17 0.566 2.06  0.0394

AIC: 1085.666 
> dredge(m1)
Error in dimnames(deps) <- list(ret, ret) : 
  length of 'dimnames' [1] not equal to array extent


El lunes, 25 de enero de 2016, 12:20:26 (UTC+1), simone santoro escribió:
exampleCounts.txt
exampleScv.txt
exampleYscv.txt

Richard Schuster

unread,
Jan 28, 2016, 11:03:54 AM1/28/16
to unma...@googlegroups.com
Hi everyone,

I have done some testing and think the attached function (f.AICc.pcountOpen) will do the trick to dredge pcountOpen models.
Below (also attached) a working example
using Simone Santoro's simulated data.
Please give it a try and let me know if that's what you were looking for or if you need any clarification or changes made.
At present the dredge part does only run over one type (lambda, gamma, omega or p) at a time.

Cheers,
Richard S


#######################################################
library(unmarked)
library(AICcmodavg)

setwd("D:/Work/R/pcountOpen")

source("f.AIC_cut.pcountOpen.sig.used.16.01.28.R")


counts<- read.table("exampleCounts.txt")# counts
scv<- read.table("exampleScv.txt")# site covariates, there are two: A1 that will be tested on initial
#abundance (first parameter), and KR that will be tested on detection probability (fourth and last parameter).
names(scv) <- c("SC1","SC2")

yscv<- read.table("exampleYscv.txt")# yearly-site covariates, I will test this on the intrinsic growth rate
#that, when dynamics= "trend", is the second parameter in the pcountOpen model.

nsites<- 20
nprimary<- 10 # no. of primary periods
yscv.2<- list(A1= matrix(rep(yscv[,1]+rnorm(10),nsites),nsites,nprimary,byrow=T),
            A2= matrix(rep(yscv[,1]+rnorm(10),nsites),nsites,nprimary,byrow=T),
            A3= matrix(rep(yscv[,1]+rnorm(10),nsites),nsites,nprimary,byrow=T))

yocv <- list(O1= matrix(rep(yscv[,1]+rnorm(10),nsites),nsites,nprimary,byrow=T),
            O2= matrix(rep(yscv[,1]+rnorm(10),nsites),nsites,nprimary,byrow=T),
            O3= matrix(rep(yscv[,1]+rnorm(10),nsites),nsites,nprimary,byrow=T))

scv$SC3 <- scv$SC1 + rnorm(20)
scv$SC4 <- scv$SC2 + rnorm(20)
scv$SC5 <- scv$SC1 + rnorm(20)

umf<- unmarkedFramePCO(y= counts, siteCovs= scv, obsCovs = yocv, yearlySiteCovs= yscv.2, numPrimary= nprimary)
str(umf)

(m1<- pcountOpen(~SC1, ~A1, ~1, ~1, data= umf, dynamics= "trend",K=50))


######################################################
#dredge function
#need to provide the starting model,
#blocks = character vector of covariates to test
#formlst = which formula should be dredged (1=lambda, 2=gamma, 3=omega, 4=p)
#dredge = should all possible combinations be tested
#if dredge = FALSE, one can specify AICcut and at each model iteration only models
#within that number are retained as base models for the next iteration
fmd <- f.AICc.pcountOpen(start.model=m1, blocks=names(scv)[-1], formlst=1, dredge=TRUE)

#Not run
#Alternative function specification
#fmd <- f.AICc.pcountOpen(start.model=m1, blocks=names(scv)[-1], formlst=1, AICcut = 2, dredge=TRUE)

str(fmd,max.level=1)
#List of 4
# $ model = start model
# $ modlst = all models that were fit
# $ fitlst = equivalent to modlst, but from unmarked's fitList function
# $ modsel = unmarked modsel table

#full unmarked modsel table
fmd$modsel@Full

#averaged coefficient example
modavg(fmd$modlst,"SC2",parm.type="lambda")

#only retain models within cutoff AIC from top model
fmd.2 <- unm.subset(fmd,cutoff=2)
modavg(fmd.2,"SC2",parm.type="lambda")

exampleCounts.txt
exampleScv.txt
exampleYscv.txt
f.AIC_cut.pcountOpen.sig.used.16.01.28.R
Simone.Santoro.R

Richard Schuster

unread,
Jan 29, 2016, 10:10:50 AM1/29/16
to unmarked
Attached a slightly updated function script.
Thanks to Drew for pointing out the problem with the previous version.

Best,
Richard S
f.AIC_cut.pcountOpen.sig.used.16.01.28.R

simone santoro

unread,
Jan 30, 2016, 11:05:35 AM1/30/16
to unmarked
Dear Richard,

I have been trying your function and it works pretty well. Thanks for that!
In my case I am going to evaluate the model selection performance - under the same set of candidate models - over a huge number of simulated data sets.
In particular, I will have 64 candidate models that represent all the possible comibinations of model structures nested to a global model like this:
(global.model<- pcountOpen(~SC1+SC2, ~A1+A2, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
(m2<- pcountOpen(~SC1, ~A1+A2, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
(m3<- pcountOpen(~SC2, ~A1+A2, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
(m4<- pcountOpen(~1, ~A1+A2, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
(m5<- pcountOpen(~SC1+SC2, ~A1, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
(m6<- pcountOpen(~SC1, ~A1, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
(m7<- pcountOpen(~SC2, ~A1, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
(m8<- pcountOpen(~1, ~A1, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
...
(m64<- pcountOpen(~1, ~1, ~1, ~1, data= umf, dynamics= "trend",K=50)

Therefore, if for a given scenario I have built 1000 simulated data sets I will need to get 1000 model selection tables with a given Akaike weight for each model of the set of candidates. Then I will compute the median weight (and sd) for each model in the set. I have previously used dredge over a pcount model and it was able to to all the combinations among the parameter types (in that case lambda and p). However, if I have understood well your function is able to run all the combinations within each parameter type, isn't it?
In my opinion my case is pretty specific and your solution is good for most of cases so that I think you are a meritorious candidate to get the Andy Royle's book!
I have contacted Kamil Burton who is the author and maintainer of the MuMIn package to ask about this.
I will keep you informed on the progress.
Ciao,

Simone

Richard Schuster

unread,
Jan 31, 2016, 11:52:55 PM1/31/16
to unma...@googlegroups.com
Hi Simone,

Thanks for testing the function.

I think if you 'only' have 64 candidate models you would not really need dredge or the function to create the model selection tables. What I would do is create all possible combinations as you outlined below and just change the data in a loop for the 1000 data sets (or parallelize the loop or split things up into groups). I would create a selection tables for each iteration in the loop, save to a list and use that in post-processing.

You are correct the function currently only allows for 'dredge' for one parameter type at a time. It would be fairly straight forward to extend this, but would require some time and I am not sure when I could get to this. Hopefully the MuMIn package can be updated soon so it supports pcountOpen dredge as well.

Cheers,
Richard

--
You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Reply all
Reply to author
Forward
0 new messages