Double observer sampling, detection probability using gmultmix

Mike - DUC

Aug 10, 2011, 6:27:55 PM
to unmarked
Greetings,
I have waterfowl data collected by two observers per site, using a
dependent double observer sampling method (i.e., the second observer
recorded only those birds not recorded by the primary observer). The
sites were surveyed in a single year with up to 4 visits per year. I
would like to estimate species-specific detection probabilities for
each observer using all visits. From my understanding, I can use the
gmultmix function to estimate the detection probability; however, I
receive an error (see below) when I run the following code.

Any advice on how I can fix the problem would be greatly appreciated.

Thanks,
Mike

The code I have so far is:
library("unmarked")
data.y <- as.matrix(read.csv("Obs.csv"))
data.obsCovs <- read.csv("ObsCovs.csv")
data.siteCovs <- read.csv("SiteCovs.csv")
umfGMM <- unmarkedFrameGMM(y=data.y, siteCovs=data.siteCovs,
obsCovs=data.obsCovs, numPrimary=1, type="double")
(m1 <- gmultmix(~1,~1,~1, data=umfGMM))

I get the following error when I try to run the gmultmix function:
Error in cp[, t, 1:R] <- do.call(piFun, list(p[, t, ])) :
number of items to replace is not a multiple of replacement length
In addition: Warning message:
In cbind(X.long.na, Xdet.long.na) :
number of rows of result is not a multiple of vector length (arg 2)

The summary of the unmarked frame GMM is:
unmarkedFrame Object

45 sites
Maximum number of observations per site: 2
Mean number of observations per site: 2
Number of primary survey periods: 1
Number of secondary survey periods: 2
Sites with at least one detection: 43

Tabulation of y observations:
    0    1    2    3    4    5    6    7    8    9   10   11   13   15   16
   13    2   10   13    7   10    1    2    1    2    1    1    1    2    1
   18   20   22   24   26   27   29   35   36   37   38   43 <NA>
    3    2    1    4    2    2    2    1    2    1    2    1    0

Site-level covariates:
      Site       SCov1       SCov2               SCov3               SCov4               SCov5
 Min.   : 1.00   A:12   Min.   :0.003744   Min.   :0.0000000   Min.   :0.00e+00   Min.   :0.000000
 1st Qu.:18.00   H:26   1st Qu.:0.009936   1st Qu.:0.0000000   1st Qu.:0.00e+00   1st Qu.:0.003312
 Median :35.00   P: 3   Median :0.021744   Median :0.0000000   Median :0.00e+00   Median :0.010944
 Mean   :37.73   R: 4   Mean   :0.035600   Mean   :0.0005632   Mean   :3.20e-06   Mean   :0.014358
 3rd Qu.:56.00          3rd Qu.:0.051840   3rd Qu.:0.0001440   3rd Qu.:0.00e+00   3rd Qu.:0.018720
 Max.   :84.00          Max.   :0.128880   Max.   :0.0060480   Max.   :1.44e-04   Max.   :0.070704

Observation-level covariates:
Observer
B:45
M:45

Richard Chandler

Aug 11, 2011, 8:19:17 AM
to unma...@googlegroups.com
Hi Mike,

That sure is a worthless error message -- I'll make a note to fix it. It's hard to know exactly what the problem is without knowing the dimensions of the matrices you imported. Can you show the results of:

str(data.y)
str(data.obsCovs)
str(data.siteCovs)

data.y should have 8 columns and one row per site. The order of the columns should be something like: "obs1visit1, obs2visit1, obs1visit2, etc..."  

data.siteCovs should also have one row per site and each column should be a covariate.

There are several ways to format obsCovs, but let's ignore that for the moment.

Once your data is set up correctly, you need to specify numPrimary=4.
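A minimal sketch of the column layout described above, using made-up counts for 3 sites, 4 visits, and 2 observers (the object names and the simulated values are assumptions for illustration only):

```r
# Hypothetical counts: 3 sites, 2 observers per visit, 4 visits.
n_sites  <- 3
n_obs    <- 2
n_visits <- 4

set.seed(1)
counts <- array(rpois(n_sites * n_obs * n_visits, lambda = 5),
                dim = c(n_sites, n_obs, n_visits))

# Flatten to one row per site, with observer varying fastest within visit:
# obs1visit1, obs2visit1, obs1visit2, obs2visit2, ...
y <- matrix(as.vector(counts), nrow = n_sites, ncol = n_obs * n_visits)
colnames(y) <- paste0("obs", rep(seq_len(n_obs), times = n_visits),
                      "visit", rep(seq_len(n_visits), each = n_obs))

# Sanity checks before building the unmarked frame: one row per site,
# and the number of columns is a multiple of the number of visits.
stopifnot(nrow(y) == n_sites, ncol(y) %% n_visits == 0)
print(colnames(y))
```

With this layout, the number of columns divided by numPrimary gives the number of counts per visit, which is a quick check to run on data.y before calling unmarkedFrameGMM.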

Another issue that needs to be addressed is that type="double" assumes that you used the "independent" double observer method. I'll make sure to add that to the documentation. Since you want the "dependent" method, you will have to supply a user-defined "piFun". I can help with that, but could you first try to run your code again with the data formatted as described above?

Richard

_____________________________________
Richard Chandler, post-doc
USGS Patuxent Wildlife Research Center
301-497-5643




Mike - DUC

Aug 11, 2011, 12:13:48 PM
to unmarked
Hi Richard

Thanks for your quick reply.

With the project area that I am using to test the gmultmix function,
we have 2 primary periods, so a total of 4 columns in data.y. I think
the data is organized properly, please see the output below.

I'll definitely need a little help with the "piFun" function when we
get there. I am still unsure of the purpose of the function.

Thanks,
Mike

> summary(umfGMM)
unmarkedFrame Object

45 sites
Maximum number of observations per site: 4
Mean number of observations per site: 4
Number of primary survey periods: 2
Number of secondary survey periods: 2
Sites with at least one detection: 45

Tabulation of y observations:
    0    1    2    3    4    5    6    7    8    9   10   11   12   13
   16    1   19    4    9   13    3    3    7   10    5    5    8    2
   14   16   17   18   19   20   21   23   24   26   27   28   29   30
    3    4    1    4    4    6    5    4    5    2    1    1    9    5
   31   33   34   35   36   37   38   40   45   46   47  100 <NA>
    1    3    2    2    2    3    2    1    2    1    1    1    0

Site-level covariates:
    GRID_ID      Type_2009        OW                  AB                  MF                  MM
 Min.   : 1.00   A:12       Min.   :0.003744   Min.   :0.0000000   Min.   :0.00e+00   Min.   :0.000000
 1st Qu.:18.00   H:26       1st Qu.:0.009936   1st Qu.:0.0000000   1st Qu.:0.00e+00   1st Qu.:0.003312
 Median :35.00   P: 3       Median :0.021744   Median :0.0000000   Median :0.00e+00   Median :0.010944
 Mean   :37.73   R: 4       Mean   :0.035600   Mean   :0.0005632   Mean   :3.20e-06   Mean   :0.014358
 3rd Qu.:56.00              3rd Qu.:0.051840   3rd Qu.:0.0001440   3rd Qu.:0.00e+00   3rd Qu.:0.018720
 Max.   :84.00              Max.   :0.128880   Max.   :0.0060480   Max.   :1.44e-04   Max.   :0.070704

Observation-level covariates:
LastName
Bi:54
Fr :36
Ma :90

> str(data.y)
int [1:45, 1:4] 36 7 9 40 31 20 20 12 33 21 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:4] "BP1_Primary" "BP1_Secondary" "BP2_Primary" "BP2_Primary.1"

> str(data.obsCovs)
'data.frame': 180 obs. of 1 variable:
$ LastName: Factor w/ 3 levels "Bi","Fr",..: 1 3 3 2 3 3 1 2 1 2 ...

> str(data.siteCovs)
'data.frame': 45 obs. of 6 variables:
$ GRID_ID : int 1 2 3 6 7 8 12 13 15 16 ...
$ Type_2009 : Factor w/ 4 levels "A",..: 2 2 2 2 2 2 2 2 2 2 ...
$ OW : num 0.01613 0.00374 0.06264 0.06178 0.03686 ...
$ AB : num 0 0 0 0 0 ...
$ MF : num 0.000144 0 0 0 0 0 0 0 0 0 ...
$ MM: num 0.01598 0.03154 0.00346 0 0.00331 ...

Richard Chandler

Aug 11, 2011, 1:43:34 PM
to unma...@googlegroups.com
So there were only 2 visits, not 4? In that case, please try the code below, which includes an appropriate "piFun" and "obsToY" matrix.

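# Multinomial cell probabilities for dependent double observer sampling.
# p is an M x 2 matrix: column 1 = primary observer, column 2 = secondary.
depDoubleObs <- function(p) {
    M <- nrow(p)
    pi <- matrix(NA, M, 2)
    pi[,1] <- p[,1]              # detected by the primary observer
    pi[,2] <- p[,2]*(1-p[,1])    # missed by primary, detected by secondary
    return(pi)
}

# Map observation covariates to counts: one 2 x 2 block per primary period.
obsToY <- matrix(1, 2, 2)
numPrimary <- 2
obsToY <- kronecker(diag(numPrimary), obsToY)

# No "type" argument here; obsToY and piFun are supplied instead.
umfGMM <- unmarkedFrameGMM(y=data.y, siteCovs=data.siteCovs,
    obsCovs=data.obsCovs, numPrimary=2, obsToY=obsToY,
    piFun="depDoubleObs")
(m1 <- gmultmix(~1,~1,~1, data=umfGMM))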



I realize that the obsToY matrix is confusing, but the great thing about this function and multinomPois() is that users can write their own function for computing the multinomial cell probabilities. Thus you can tailor the function to your study design, as we are doing here. obsToY just tells the function how the covariates "map" to the counts. There are more details about this scattered around the help files.
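To make the two pieces more concrete, here is a small base R sketch that evaluates the dependent double observer cell probabilities for made-up detection probabilities and prints the block-diagonal obsToY matrix. The values in p are assumptions chosen only for illustration.

```r
# Cell probabilities for the dependent double observer design:
# observer 1 detects a bird with probability p1; observer 2 records only
# birds that observer 1 missed, so that cell is p2 * (1 - p1).
depDoubleObs <- function(p) {
  M <- nrow(p)
  pi <- matrix(NA, M, 2)
  pi[, 1] <- p[, 1]
  pi[, 2] <- p[, 2] * (1 - p[, 1])
  pi
}

# Hypothetical detection probabilities for two sites (rows),
# primary observer in column 1 and secondary observer in column 2.
p <- matrix(c(0.6, 0.5,
              0.8, 0.4), nrow = 2, byrow = TRUE)
cell_probs <- depDoubleObs(p)
print(cell_probs)

# Probability that a bird is missed by both observers at each site.
p_missed <- 1 - rowSums(cell_probs)
print(p_missed)

# obsToY for two primary periods: block diagonal, so covariates recorded
# in one period map only to the counts from that same period.
obsToY <- kronecker(diag(2), matrix(1, 2, 2))
print(obsToY)
```

The zero blocks in obsToY are what keep period 1 covariates from being matched to period 2 counts.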

Keep us posted.

Richard



Mike - DUC

Aug 11, 2011, 3:36:41 PM
to unmarked
Hi Richard,

Thank you for all your help. Adding in the functions allowed the
program to run with no problems.

I have been using type="double", assuming that it defines the
functions for dependent double observer sampling. Is my assumption
wrong?
Based on the documentation, setting type equal to either "removal" or
"double" will create the appropriate functions. Should I not specify
the type?

Thanks,
Mike

Richard Chandler

Aug 11, 2011, 4:08:41 PM
to unma...@googlegroups.com
Hi Mike,

Glad it is working now. The confusion stems from the fact that when you use type="double", it fits the independent double observer method, not the dependent one. I will change this so that you can use "type" for either method.

Richard





Nicholas Masto

Aug 17, 2018, 2:45:40 PM
to unmarked
Hello,

This post is several years old, but I'm encountering a similar issue and get the same error message when attempting to run my null model.  I also have duck data in which two independent observers count ducks from an airplane and each observer is considered a separate "visit" (i.e., y=data[1:2]).  My code so far looks like this.  Any assistance in solving this error would be greatly appreciated.

Regards,
Nick

data=data.frame(read.csv('Jan2018.ducks.csv'))
data[is.na(data)]=0

#specify categorical variables, scale and center numerical variables
factors=c("groupsize.1", "groupsize.2", "stratum")
data[factors]=lapply(data[factors], factor)

#assign the scaled values back; calling scale() alone discards the result
data[6:7]=scale(data[6:7])
data[10:16]=scale(data[10:16])

#specifying survey abundance columns
ducks.y=data[,1:2]

#setting up site covariates 
rice=data[,12]            #managed, broken, inland ricefields (hectares)
open=data[,13]          #permanent lakes, rivers, ponds (hectares)
scrub=data[,14]         #forested and emergent wetlands (hectares)
estuarine=data[,15]   #deep and shallow estuarine wetlands (hectares)
transect=data[,16]     #transect length (kilometers)
stratum=data[,17]      #stratum

siteCovs=data.frame(rice=rice, open=open, scrub=scrub, estuarine=estuarine, transect=transect, stratum=stratum)

#setting up obs covariates (covariates that should influence detection)
#note here that we made a matrix 

date=as.matrix(data[,6:7])                #date
groupsize=as.matrix(data[,7:8])        #group size (categorical, 1-5)
ricefield=as.matrix(data[8:9])            #rice field (contin)

obsCovs=list(date=date, groupsize=groupsize, ricefield=ricefield)

#setting up the unmarked Frame to give the formula layout for each model
#type=ind.doub.observer
#numPrimary=indicates population closure 
ducks.unmked=unmarkedFrameGMM(y=ducks.y, siteCovs = siteCovs, obsCovs = obsCovs, type = "double", numPrimary = 1)

#null model and model selection to choose mixture
ducks.m0.P=gmultmix(~1, ~1, ~1, data=ducks.unmked, mixture="P")
ducks.m0.NB=gmultmix(~1, ~1, ~1, data=ducks.unmked, mixture="NB")

Error in cp[, t, 1:R] <- do.call(piFun, list(p[, t, ])) : 
  number of items to replace is not a multiple of replacement length
In addition: Warning message:
In cbind(X.long.na, Xdet.long.na) :
  number of rows of result is not a multiple of vector length (arg 2)

Richard Chandler

Aug 19, 2018, 8:30:12 AM
to Unmarked package
Hi Nick,

If the counts are independent, you would need to use a binomial N-mixture model instead of a multinomial N-mixture model. Take a look at pcount().
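A rough simulation of what a binomial N-mixture model assumes about such data: every observer independently counts the same N birds at a site, each with their own detection probability. All parameter values and names below are made up for illustration.

```r
set.seed(42)

n_sites <- 50
lambda  <- 10                          # hypothetical mean abundance per site
p_obs   <- c(front = 0.7, rear = 0.5)  # hypothetical detection probabilities

# Latent abundance at each site.
N <- rpois(n_sites, lambda)

# Each observer's count is an independent binomial draw from the same N.
y <- sapply(p_obs, function(p) rbinom(n_sites, size = N, prob = p))

# No count can exceed the true abundance at its site.
stopifnot(all(y <= N))

print(head(cbind(N = N, y)))

# Crude check: average count divided by average abundance per observer.
print(colMeans(y) / mean(N))
```

A sites-by-observers count matrix like y is the kind of input that unmarkedFramePCount() and pcount() work with, with the observer columns playing the role of repeated visits.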

Best,
Richard


--
You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
Richard Chandler
Associate Professor
Wildlife Ecology and Management
Warnell School of Forestry and Natural Resources
University of Georgia

Nicholas Masto

Aug 19, 2018, 12:52:15 PM
to unma...@googlegroups.com
Hi Richard,

I appreciate the quick response! 

I've explored the pcount function; however, I thought that even though the front- and rear-seat aerial observers keep their counts independent from one another, this is still considered a double observer method that would produce a multinomial detection distribution.  There are the probabilities that neither observer detects birds, that front- and rear-seat observers detect the same number of birds, that the front-seat observer detects more birds than the rear-seat observer, and vice versa: p(0,0), p(1,1), p(1,0), and p(0,1).

Am I misunderstanding, and should I be using pcount() and explicitly adding an 'observer' covariate to the detection process of the model?  Or perhaps instead of gmultmix() I should use multinomPois() because there are no secondary sampling periods in my design.  I did not choose this function because I wanted to determine whether my data fit a Poisson or negative binomial abundance distribution, and thought that I could specify closed sampling with the numPrimary=1 argument.

Thank you for your responses and keeping up with this helpful google forum.  NM

--
Nick M. Masto
Clemson University M.S. Candidate '19
Kennedy Waterfowl and Wetlands Conservation Center
(864) 580-8003

Richard Chandler

Aug 19, 2018, 8:34:09 PM
to Unmarked package
Hi Nick,

The jargon around double observer sampling can be a bit confusing. In the 'independent double observer' method, the two observers record data independently, but then they reconcile their counts so that they can determine if an animal was detected by observer 1 but not observer 2, observer 2 but not observer 1, or both observers. In this case, you wind up with 3 counts at each site. The 'dependent double observer' method is similar, except observer 2 only records detections that observer 1 missed. Either way, these types of data can be modeled with a multinomial N-mixture model.
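As an illustration of why the reconciled independent design yields 3 counts per site, here is a base R sketch of the three cell probabilities. The function name indDoubleObs and the numeric values are assumptions for this example, not code from the thread.

```r
# Cell probabilities for the independent double observer design
# with reconciled counts. p has one row per site and two columns
# (observer 1, observer 2).
indDoubleObs <- function(p) {
  pi <- matrix(NA, nrow(p), 3)
  pi[, 1] <- p[, 1] * (1 - p[, 2])  # seen by observer 1 only
  pi[, 2] <- p[, 2] * (1 - p[, 1])  # seen by observer 2 only
  pi[, 3] <- p[, 1] * p[, 2]        # seen by both observers
  pi
}

# Hypothetical detection probabilities at a single site.
p <- matrix(c(0.6, 0.5), nrow = 1)
cells <- indDoubleObs(p)
print(cells)

# Probability of being detected by at least one observer.
p_any <- rowSums(cells)
print(p_any)

# Simulate the three reconciled counts for a site with 20 birds;
# the fourth multinomial cell (missed by both) is never observed.
set.seed(7)
draw <- rmultinom(1, size = 20, prob = c(cells[1, ], 1 - p_any))
y_site <- draw[1:3, 1]
print(y_site)
```

Because this function returns three columns, the count matrix needs three columns per primary period. If y has only two, the cell probabilities cannot line up with the data, which may be what the "number of items to replace" error earlier in the thread reflects.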

I may have misunderstood your original email, but I was under the impression that, in your study, the two observers did not reconcile their counts. In that case, the two counts are independent and a binomial N-mixture model could be used. 

Were the counts reconciled in your study?

Richard



Nicholas Masto

Aug 20, 2018, 9:19:23 AM
to unma...@googlegroups.com
Richard,

This is most helpful.  No, we chose to forgo reconciliation because a) it is rather difficult to reconcile "on the fly" and b) we thought that reconciling once we touched down might introduce more bias than it was worth.

Using pcount(), I can't obtain detection probabilities for each observer directly; however, I did use "observer" in the detection process of the binomial N-mixture model and obtained betas. It was clearly the best detection model via model selection.
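Since pcount() models detection on the logit scale, betas from a detection formula such as ~observer can be back-transformed into one detection probability per observer. A minimal sketch, assuming made-up coefficient values and the front-seat observer as the reference level:

```r
# Hypothetical logit-scale detection estimates (not from the thread).
beta_intercept <- 0.4    # assumed: logit detection for the front-seat observer
beta_rear      <- -0.9   # assumed: rear-seat effect relative to front seat

# Inverse logit gives detection probability for each observer.
p_front <- plogis(beta_intercept)
p_rear  <- plogis(beta_intercept + beta_rear)

print(round(c(front = p_front, rear = p_rear), 3))
```

For a fitted model, predict() with type="det" and a newdata data frame containing the observer levels should give the same back-transformed estimates along with standard errors.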

In those models, however, I've also received many NaNs, which from my understanding is a computation problem when psi=~1.  Is this solvable, or is that just how my data fit the model?


Richard Chandler

Aug 20, 2018, 9:44:50 AM
to Unmarked package
Hi Nick,

With only 2 replicates per site, you won't be able to account for many sources of variation in detection probability in a binomial N-mixture model. There are many reasons why you might see NaNs, but I'm afraid I don't have time to help diagnose the issue. Perhaps others on the list could help.

Richard
  
