dredge {MuMIn} for unmarkedFitPCO object (as from pcountOpen)

simone santoro

unread,

Jan 25, 2016, 6:20:26 AM1/25/16

to unmarked

Hi all,

I have an object of class unmarkedFitPCO that I obtain by running a model like this:

Model1 <- pcountOpen(~var1, ~var2, ~1, ~var3, data= data$umf, dynamics= "trend",K=150)

It works fine (I am using it on simulated data and it works very well).

However, it would be very helpful for me using the 'dredge' function {MuMIn} that in the past I have already used on an unmarkedFit object (from a 'pcount' analysis).

This is the error message I get:

> dredge(Model1)

Error in dimnames(deps) <- list(ret, ret) :

length of 'dimnames' [1] not equal to array extent

Any help would be very appreciated,

Ciao

Simone

simone santoro

unread,

Jan 26, 2016, 6:22:23 PM1/26/16

to unmarked

Hi again,

I have been doing several trials to understand in which context the 'dredge' function does not work with models of this class.

This is what I get with the formula specific for a population growth model (a parameter for the population growth rate) that has this argument: dynamics = "trend".

In fact:

m1<- pcountOpen(~1, ~1, ~1, ~1, data= data$umf,K=30,dynamics="trend")

> dredge(m2)

Fixed terms are "lam(Int)", "gamTrend(Int)", "p(Int)", "NA(Int)" and "NA(Int)"

Error in names(zarg) <- formula.arg :

'names' attribute [4] must be the same length as the vector [3]

However, if I try with the 'normal' Dail & Madsen model (i.e. without the dynamics="trend"), it seems to work OK:

> m2<- pcountOpen(~1, ~1, ~1, ~1, data= data$umf,K=30)

> dredge(m2)

Fixed terms are "lam(Int)", "gamConst(Int)", "omega(Int)", "p(Int)" and "NA(Int)"

Global model call: pcountOpen(lambdaformula = ~1, gammaformula = ~1, omegaformula = ~1,

pformula = ~1, data = data$umf, K = 30)

---

Model selection table

lam(Int) gmC(Int) omg(Int) p(Int) NA(Int) df logLik AICc delta weight

1 1.516 -0.6702 3.825 -0.8391 + 4 -909.005 1826.9 0 1

Models ranked by AICc(x)

Have anyone an idea on how to fix it? I have to run quite a few models on simulated data under several scenarios and it would be really helpful to have the 'dredge' function working.

Bye,

Simone

Jeffrey Royle

unread,

Jan 26, 2016, 6:36:06 PM1/26/16

to unma...@googlegroups.com

Dear Simone,

of course the dredge function is not part of unmarked and so it might be useful to direct this to the creator of the MuMIn package as I think it's probably an issue with dredge which perhaps isn't up-to-date with the last release of unmarked which includes the Hostetler and Chandler extensions of the DM model.

If the creators of the dredge function can't fix it, you may have to debug this yourself (in which case a posting of the solution to the unmarked list would be appreciated!).

If anyone can fix this problem and post a solution and working R script example then I'll give them a free Applied Hierarchical Models book. (be sure the contact me personally after the fact in case I miss postings on the email list).

regards

andy

--
You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Richard Schuster

unread,

Jan 27, 2016, 9:03:28 AM1/27/16

to unma...@googlegroups.com, simones...@gmail.com

Hi Simone,

The reason why dredge is not working is this:

the standard pcountOpen has four parts to its model formula and four estimates if you look at say:

summary(m2)

of your example

pcountOpen with dynamics="trend" also has four parts to its model formula, but only three estimates looking at

summary(m1)

Instead of 'Recruitment' and 'Apparent Survival' you only get 'Growth Rate'.
That confuses dredge, because it expects 4 estimates from the supplied formula that has 4 parts, but it only finds 3 estimates.

As for a fix: What parts of the model would you like to dredge? I am not sure how dredge works with the standard pcountOpen, but I am assuming it only dredges one part at a time (e.g. the lambda formula)?
I could create a very crude function where you would supply your null model (e.g. m2, or a model with multiple terms) a vector of terms to be added and specify the part of the model you want to dredge over.
If that's of interest to you, could you send me some example data, ideally including covariates and setup code so I can test this?

Best regards,
Richard

Drew Ricketts

unread,

Jan 27, 2016, 9:04:21 PM1/27/16

to unmarked, simones...@gmail.com

I don't think this issue has to do with the dynamics argument. I've had similar problems as Simone using the default settings for trend.

Drew Ricketts

unread,

Jan 27, 2016, 9:05:11 PM1/27/16

to unmarked, simones...@gmail.com

meant to say default settings for dynamics...

Richard Schuster

unread,

Jan 27, 2016, 10:29:50 PM1/27/16

to unma...@googlegroups.com

Hi Drew,

Do you recall the error message you got?
I am pretty sure that the one that Simone had was related to what I described. At least that's what I assumed after looking at the MuMIn source code.
If you have example data (and code) we could test this though.

Cheers,
Richard

This email has been sent from a virus-free computer protected by Avast.
www.avast.com

Andrew Ricketts

unread,

Jan 27, 2016, 11:09:15 PM1/27/16

to unma...@googlegroups.com

Hi Richard,

Thanks for all of the help you’ve been giving folks on this group. All of you gurus are an awesome resource for those of us with fewer skills and/or less knowledge.

Here’s a simple example using the Alder Flycatcher data that is on the web somewhere (should also be attached):

> alfl.data <- read.csv("alfl0506.csv", row.names=1)

> str(alfl.data)

'data.frame': 49 obs. of 20 variables:

$ alfl1_05: int 1 2 0 1 0 3 0 0 0 0 ...

$ alfl2_05: int 0 3 1 0 0 1 0 0 1 0 ...

$ alfl3_05: int 1 0 1 1 0 1 0 0 2 0 ...

$ alfl1_06: int 0 0 1 1 0 0 0 0 1 1 ...

$ alfl2_06: int 1 0 0 0 0 0 0 0 0 0 ...

$ alfl3_06: int 1 0 0 1 0 0 0 0 0 0 ...

$ struct : num 5.45 4.75 14.7 5.05 4.15 9.75 9.6 15.7 9.2 7.75 ...

$ woody : int 6 1 7 6 2 8 4 0 4 3 ...

$ time1_05: num 8.68 9.43 8.25 7.77 9.57 9.1 8.6 8.12 7.63 9.92 ...

$ time2_05: num 8.73 7.4 6.7 6.23 9.55 9.12 8.62 7.92 7.43 5.67 ...

$ time3_05: num 5.72 7.58 7.62 7.17 5.73 9.12 6.72 8.07 7.6 9.72 ...

$ date1_05: int 6 20 20 20 8 8 8 1 1 1 ...

$ date2_05: int 25 32 32 32 27 27 27 27 27 27 ...

$ date3_05: int 34 54 47 47 36 36 36 36 36 36 ...

$ time1_06: num 8.75 6.38 7.07 5.75 10 9.55 9.02 6.13 5.63 7.93 ...

$ time2_06: num 5.25 8.12 7.07 7.53 7.37 6.87 6.38 9.27 8.83 7.53 ...

$ time3_06: num 5.12 6.87 6.2 5.75 5.75 ...

$ date1_06: int 8 13 13 13 12 12 12 12 12 12 ...

$ date2_06: int 26 37 37 37 28 28 28 28 28 28 ...

$ date3_06: int 44 48 48 48 46 46 46 46 46 46 ...

>

> ### Load Package ###

> library(unmarked)

Loading required package: reshape

Loading required package: lattice

Loading required package: Rcpp

>

> ### Make Unmarked PCO dataframe ###

> alfl.umf <- unmarkedFramePCO(y=alfl.data[,1:6],

+ siteCovs=alfl.data[,c("woody", "struct")],

+ # we could have yearlySiteCovs here.

+ obsCovs=list(time=alfl.data[,c("time1_05", "time2_05", "time3_05",

+ "time2_06", "time3_06", "time3_06")],

+ date=alfl.data[,c("date1_05", "date2_05", "date3_05",

+ "date2_06", "date3_06", "date3_06")]),

+ numPrimary=2)

>

> # Standardize covariates after making the UMF

> siteCovs(alfl.umf) <- scale(siteCovs(alfl.umf))

> obsCovs(alfl.umf) <- scale(obsCovs(alfl.umf))

> summary(alfl.umf)

unmarkedFrame Object

49 sites

Maximum number of observations per site: 6

Mean number of observations per site: 6

Number of primary survey periods: 2

Number of secondary survey periods: 3

Sites with at least one detection: 42

Tabulation of y observations:

0 1 2 3 <NA>

183 76 25 10 0

Site-level covariates:

woody struct

Min. :-1.5749 Min. :-1.793802

1st Qu.:-0.5887 1st Qu.:-0.797115

Median :-0.0956 Median :-0.002653

Mean : 0.0000 Mean : 0.000000

3rd Qu.: 0.6441 3rd Qu.: 0.618471

Max. : 2.3699 Max. : 3.247416

Observation-level covariates:

time date

Min. :-1.96447 Min. :-2.3602

1st Qu.:-0.80321 1st Qu.:-0.5701

Median :-0.04802 Median : 0.1012

Mean : 0.00000 Mean : 0.0000

3rd Qu.: 0.81723 3rd Qu.: 0.9217

Max. : 2.38455 Max. : 1.5930

> library(MuMIn)

(afl.global.p <- pcountOpen(~1, ~1, ~1, ~date+time+struct, alfl.umf,

+ K=30))

Call:

pcountOpen(lambdaformula = ~1, gammaformula = ~1, omegaformula = ~1,

pformula = ~date + time + struct, data = alfl.umf, K = 30)

Abundance:

Estimate SE z P(>|z|)

0.59 0.218 2.71 0.00673

Recruitment:

Estimate SE z P(>|z|)

0.863 0.417 2.07 0.0386

Apparent Survival:

Estimate SE z P(>|z|)

-1.35 2.63 -0.515 0.607

Detection:

Estimate SE z P(>|z|)

(Intercept) -1.234 0.297 -4.16 3.18e-05

date -0.790 0.145 -5.45 4.96e-08

time -0.281 0.122 -2.30 2.12e-02

struct 0.218 0.140 1.55 1.21e-01

AIC: 537.5536

> dredge(afl.global.p)

Fixed terms are "lam(Int)", "gamConst(Int)", "omega(Int)", "p(Int)" and "NA(Int)"

Error in matchCoef(fit1, all.terms = allTerms, beta = betaMode, allCoef = TRUE) :

'm1' is not nested within 'm2'

In addition: Warning message:

In model.frame.default(gamformula, transCovs, na.action = NULL) :

object is not a matrix (model 1 skipped)

The other thing I’ve run into is that dredge will run the intercept only model, but not the additional models models with covariates

global.lambda <-pcountOpen(lambdaformula = ~woody+struct, gammaformula = ~1, omegaformula = ~1,

+ pformula = ~1, data = alfl.umf, K = 30)

> global.lambda

Call:

pcountOpen(lambdaformula = ~woody + struct, gammaformula = ~1,

omegaformula = ~1, pformula = ~1, data = alfl.umf, K = 30)

Abundance:

Estimate SE z P(>|z|)

(Intercept) 1.779 0.413 4.31 1.63e-05

woody 0.476 0.113 4.22 2.47e-05

struct 0.130 0.103 1.26 2.08e-01

Recruitment:

Estimate SE z P(>|z|)

1.09 0.549 1.98 0.0479

Apparent Survival:

Estimate SE z P(>|z|)

-0.742 0.938 -0.791 0.429

Detection:

Estimate SE z P(>|z|)

-2.33 0.427 -5.46 4.63e-08

AIC: 556.5234

> dredge(global.lambda)

Fixed terms are "lam(Int)", "gamConst(Int)", "omega(Int)", "p(Int)" and "NA(Int)"

Global model call: pcountOpen(lambdaformula = ~woody + struct, gammaformula = ~1,

omegaformula = ~1, pformula = ~1, data = alfl.umf, K = 30)

---

Model selection table

lam(Int) gmC(Int) omg(Int) p(Int) NA(Int) df logLik AICc delta weight

1 1.54 0.4094 -0.173 -1.92 + 4 -284.959 578.8 0 1

Models ranked by AICc(x)

No warning messages, but it doesn’t seem to have “dredged” the global model I gave it…

alfl0506.csv

ATT00001.htm

Richard Schuster

unread,

Jan 27, 2016, 11:24:44 PM1/27/16

to unma...@googlegroups.com

Thanks very much for the example Drew.
Can't really take credit for all the work Richard Chandler is doing here, I just try to chip in from time to time.

Going to give this a try tomorrow and let you know what I find. I have been talking to Simone about adopting one of my dredge like functions to work with pcountOpen. I initially wrote it to work with occu, but have started to adopt it for other unmarked functions as well.

Cheers,
Richard S.

--
You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

simone santoro

unread,

Jan 28, 2016, 6:17:47 AM1/28/16

to unmarked

Hi all,

First, thanks to all of you. I forgot to mention the post (that I had read) by Drew Richetts on another problem with the dredge function for a model pcountOpen, sorry for that.

Below you can find a reproducible code for my case (attached the files with data). These are simulated data sets, that's why are so nice...

> counts<- read.table("exampleCounts.txt")# counts

> scv<- read.table("exampleScv.txt")# site covariates, there are two: A1 that will be tested on initial abundance (first parameter), and KR that will be tested on detection probability (fourth and last parameter).

> yscv<- read.table("exampleYscv.txt")# yearly-site covariates, I will test this on the intrinsic growth rate that, when dynamics= "trend", is the second parameter in the pcountOpen model.

> yscv<- list(A2= matrix(rep(yscv[,1],nsites),nsites,nprimary,byrow=T))

> nsites<- 20

> nprimary<- 10 # no. of primary periods

> umf<- unmarkedFramePCO(y= counts, siteCovs= scv, yearlySiteCovs= yscv, numPrimary= nprimary)

> summary(umf)

unmarkedFrame Object

20 sites

Maximum number of observations per site: 10

Mean number of observations per site: 10

Number of primary survey periods: 10

Number of secondary survey periods: 1

Sites with at least one detection: 20

Tabulation of y observations:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 33 35 36 <NA>

1 3 5 5 13 11 13 12 10 14 13 18 5 11 8 7 11 7 6 7 2 2 1 2 2 3 2 2 1 1 1 1 0

Site-level covariates:

A1 KR

Min. :3.926 Min. :-2.753

1st Qu.:4.021 1st Qu.:-2.611

Median :4.235 Median :-2.493

Mean :4.235 Mean :-2.465

3rd Qu.:4.441 3rd Qu.:-2.338

Max. :4.597 Max. :-2.198

Yearly-site-level covariates:

A2

Min. :-0.48984

1st Qu.:-0.24796

Median :-0.08357

Mean : 0.02882

3rd Qu.: 0.46201

Max. : 0.64999

>

> m1<- pcountOpen(~A1, ~A2, ~1, ~KR, data= umf, dynamics= "trend",K=max(data$y)*4)

> m1

Call:

pcountOpen(lambdaformula = ~A1, gammaformula = ~A2, omegaformula = ~1,

pformula = ~KR, data = umf, K = max(data$y) * 4, dynamics = "trend")

Abundance:

Estimate SE z P(>|z|)

(Intercept) 0.0737 1.408 0.0524 0.9582

A1 0.7037 0.326 2.1557 0.0311

Growth Rate:

Estimate SE z P(>|z|)

(Intercept) 0.00271 0.0138 0.196 8.45e-01

A2 1.10462 0.0723 15.282 1.00e-52

Detection:

Estimate SE z P(>|z|)

(Intercept) 1.80 1.377 1.30 0.1921

KR 1.17 0.566 2.06 0.0394

AIC: 1085.666

> dredge(m1)

Error in dimnames(deps) <- list(ret, ret) :

length of 'dimnames' [1] not equal to array extent

El lunes, 25 de enero de 2016, 12:20:26 (UTC+1), simone santoro escribió:

exampleCounts.txt

exampleScv.txt

exampleYscv.txt

Richard Schuster

unread,

Jan 28, 2016, 11:03:54 AM1/28/16

to unma...@googlegroups.com

Hi everyone,

I have done some testing and think the attached function (f.AICc.pcountOpen) will do the trick to dredge pcountOpen models.
Below (also attached) a working example using Simone Santoro's simulated data.
Please give it a try and let me know if that's what you were looking for or if you need any clarification or changes made.
At present the dredge part does only run over one type (lambda, gamma, omega or p) at a time.

Cheers,
Richard S

#######################################################
library(unmarked)
library(AICcmodavg)

setwd("D:/Work/R/pcountOpen")

source("f.AIC_cut.pcountOpen.sig.used.16.01.28.R")

counts<- read.table("exampleCounts.txt")# counts
scv<- read.table("exampleScv.txt")# site covariates, there are two: A1 that will be tested on initial

#abundance (first parameter), and KR that will be tested on detection probability (fourth and last parameter).
names(scv) <- c("SC1","SC2")

yscv<- read.table("exampleYscv.txt")# yearly-site covariates, I will test this on the intrinsic growth rate

#that, when dynamics= "trend", is the second parameter in the pcountOpen model.

nsites<- 20
nprimary<- 10 # no. of primary periods

yscv.2<- list(A1= matrix(rep(yscv[,1]+rnorm(10),nsites),nsites,nprimary,byrow=T),
            A2= matrix(rep(yscv[,1]+rnorm(10),nsites),nsites,nprimary,byrow=T),
            A3= matrix(rep(yscv[,1]+rnorm(10),nsites),nsites,nprimary,byrow=T))

yocv <- list(O1= matrix(rep(yscv[,1]+rnorm(10),nsites),nsites,nprimary,byrow=T),
            O2= matrix(rep(yscv[,1]+rnorm(10),nsites),nsites,nprimary,byrow=T),
            O3= matrix(rep(yscv[,1]+rnorm(10),nsites),nsites,nprimary,byrow=T))

scv$SC3 <- scv$SC1 + rnorm(20)
scv$SC4 <- scv$SC2 + rnorm(20)
scv$SC5 <- scv$SC1 + rnorm(20)

umf<- unmarkedFramePCO(y= counts, siteCovs= scv, obsCovs = yocv, yearlySiteCovs= yscv.2, numPrimary= nprimary)
str(umf)

(m1<- pcountOpen(~SC1, ~A1, ~1, ~1, data= umf, dynamics= "trend",K=50))

######################################################
#dredge function
#need to provide the starting model,
#blocks = character vector of covariates to test
#formlst = which formula should be dredged (1=lambda, 2=gamma, 3=omega, 4=p)
#dredge = should all possible combinations be tested
#if dredge = FALSE, one can specify AICcut and at each model iteration only models
#within that number are retained as base models for the next iteration
fmd <- f.AICc.pcountOpen(start.model=m1, blocks=names(scv)[-1], formlst=1, dredge=TRUE)

#Not run
#Alternative function specification
#fmd <- f.AICc.pcountOpen(start.model=m1, blocks=names(scv)[-1], formlst=1, AICcut = 2, dredge=TRUE)

str(fmd,max.level=1)
#List of 4
# $ model = start model
# $ modlst = all models that were fit
# $ fitlst = equivalent to modlst, but from unmarked's fitList function
# $ modsel = unmarked modsel table

#full unmarked modsel table
fmd$modsel@Full

#averaged coefficient example
modavg(fmd$modlst,"SC2",parm.type="lambda")

#only retain models within cutoff AIC from top model
fmd.2 <- unm.subset(fmd,cutoff=2)
modavg(fmd.2,"SC2",parm.type="lambda")

exampleCounts.txt

exampleScv.txt

exampleYscv.txt

f.AIC_cut.pcountOpen.sig.used.16.01.28.R

Simone.Santoro.R

Richard Schuster

unread,

Jan 29, 2016, 10:10:50 AM1/29/16

to unmarked

Attached a slightly updated function script.
Thanks to Drew for pointing out the problem with the previous version.

Best,
Richard S

f.AIC_cut.pcountOpen.sig.used.16.01.28.R

simone santoro

unread,

Jan 30, 2016, 11:05:35 AM1/30/16

to unmarked

Dear Richard,

I have been trying your function and it works pretty well. Thanks for that!

In my case I am going to evaluate the model selection performance - under the same set of candidate models - over a huge number of simulated data sets.

In particular, I will have 64 candidate models that represent all the possible comibinations of model structures nested to a global model like this:

(global.model<- pcountOpen(~SC1+SC2, ~A1+A2, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
(m2<- pcountOpen(~SC1, ~A1+A2, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
(m3<- pcountOpen(~SC2, ~A1+A2, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
(m4<- pcountOpen(~1, ~A1+A2, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
(m5<- pcountOpen(~SC1+SC2, ~A1, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
(m6<- pcountOpen(~SC1, ~A1, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
(m7<- pcountOpen(~SC2, ~A1, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
(m8<- pcountOpen(~1, ~A1, ~1, ~O1+O2, data= umf, dynamics= "trend",K=50)
...
(m64<- pcountOpen(~1, ~1, ~1, ~1, data= umf, dynamics= "trend",K=50)

Therefore, if for a given scenario I have built 1000 simulated data sets I will need to get 1000 model selection tables with a given Akaike weight for each model of the set of candidates. Then I will compute the median weight (and sd) for each model in the set. I have previously used dredge over a pcount model and it was able to to all the combinations among the parameter types (in that case lambda and p). However, if I have understood well your function is able to run all the combinations within each parameter type, isn't it?

In my opinion my case is pretty specific and your solution is good for most of cases so that I think you are a meritorious candidate to get the Andy Royle's book!

I have contacted Kamil Burton who is the author and maintainer of the MuMIn package to ask about this.
I will keep you informed on the progress.

Ciao,

Simone

Richard Schuster

unread,

Jan 31, 2016, 11:52:55 PM1/31/16

to unma...@googlegroups.com

Hi Simone,

Thanks for testing the function.

I think if you 'only' have 64 candidate models you would not really need dredge or the function to create the model selection tables. What I would do is create all possible combinations as you outlined below and just change the data in a loop for the 1000 data sets (or parallelize the loop or split things up into groups). I would create a selection tables for each iteration in the loop, save to a list and use that in post-processing.

You are correct the function currently only allows for 'dredge' for one parameter type at a time. It would be fairly straight forward to extend this, but would require some time and I am not sure when I could get to this. Hopefully the MuMIn package can be updated soon so it supports pcountOpen dredge as well.

Cheers,
Richard

--

You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply all

Reply to author

Forward