Discarded Observations

Kristin Broms

Sep 3, 2010, 1:33:02 PM
to unmarked

Hi,

I am getting a warning message about missing (observational) covariate
data when I run my occupancy models, and it's baffling me as to why.
I imagine it's an easy fix, but I need a second set of eyes to see my
problem.

For the site covariates, my data set has 6 habitat types that are
incorporated as 5 indicator variables (0,1). These habitat types
affect both occupancy and detection, and occu() has no problem
creating this model:

fm3 <- occu(~AV + Fynbos + Grass + NKB + SKB ~AV + Fynbos + Grass +
NKB + SKB, unmarked.dat)

For the observational/survey-specific covariates, I have one indicator
variable of whether or not it is breeding season (I'm looking at bird
data), and another variable indicating year (90, 91, or 92). When I
run a model that includes either or both of these survey-specific
variables:
fm4 <- occu(~AV + Fynbos + Grass + NKB + SKB + j ~AV + Fynbos + Grass
+ NKB + SKB, unmarked.dat)

I get the following warning messages, and the estimates do not match
those from Presence:
Warning messages:
1: In .local(umf, ...) :
Some observations have been discarded because corresponding
covariates were missing.FALSE
2: In .local(umf, ...) :
47 sites have been discarded because of missing data.FALSE

(Yes, a data set that includes multiple seasons and/or multiple years
should be looked at using a multi-season model, but I am just playing
around with the models to start and so am sticking to the simpler
models to get a better feel for the data structures/ error messages,
etc.)

The same data and model do run in Presence without any (visible)
errors. And looking at the rows/sites that are discarded in occu(),
there is no apparent missing data.

Thank you for any help! And thank you for creating the "unmarked"
package!

~Kristin


Kristin Broms
Pre-doctoral Candidate
Quantitative Ecology and Resource Management
University of Washington
Seattle, WA 98195

Jeffrey Royle

Sep 3, 2010, 11:21:42 PM
to unma...@googlegroups.com
hi Kristin,
the function is just telling you that some of your covariate data are
missing, which is OK, provided that it is true. In your covariate
data set, are there some missing values?
The function should produce sensible and valid results in the
presence of missing covariate data.
regards,
andy

rcha...@eco.umass.edu

Sep 4, 2010, 7:17:24 AM
to unma...@googlegroups.com
Hi Kristin,

Could you post the results of:

summary(unmarked.dat)

and maybe:

show(unmarked.dat)


Also, it sounds like you coded the dummy variables by hand, which is
unnecessary. It's much easier and less error prone to import the data
with a single column for habitat type. R will treat this as a factor
with a level for each habitat type and you could use:

occu(~habitat.type ~habitat.type, unmarked.dat)
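
To see what R does with a factor in a formula, here is a small base-R sketch. The habitat labels are made up for illustration (they reuse the names from this thread but are not the actual data), and model.matrix() is used only to display the resulting indicator columns; it is not a claim about how occu() builds its design matrix internally.

```r
# Hypothetical habitat labels for six sites (illustrative, not the thread's data)
habitat.type <- factor(c("AV", "Fynbos", "Grass", "NKB", "Sav", "SKB"))

# model.matrix() expands the factor into an intercept plus one 0/1 column
# for every level except the first, which becomes the reference level
X <- model.matrix(~ habitat.type)
colnames(X)
dim(X)  # 6 rows, 6 columns (intercept + 5 indicators)

# relevel() chooses a different reference level, e.g. "Sav", which matches
# hand-coded indicators for AV, Fynbos, Grass, NKB and SKB
habitat.sav <- relevel(habitat.type, ref = "Sav")
X.sav <- model.matrix(~ habitat.sav)
colnames(X.sav)
```

With six levels the factor yields five indicator columns, so the hand-coded and factor-based versions should describe the same model as long as the reference level is the same.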


Hope this helps,
Richard

--
Richard Chandler
UMass Amherst
Natural Resources Conservation
nrc.umass.edu/index.php/people/graduate-students/chandler-richard/

Kristin Broms

Sep 7, 2010, 3:19:00 PM
to unmarked
Hi,

Thank you for the help. It is good to know that I don't have to code
the dummy variables. It is already coded for this data set, but I
will keep that in mind for the next analysis.

I have looked at the data information for the discarded sites, and I
do not notice anything unusual for these rows, i.e. I see no missing
data. I thought that maybe I had a habitat type/ breeding combination
for which there were only 1's (species always present) or always 0's
(always absent), but that doesn't seem to be the case either.
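
A check like this can be made systematic rather than done by eye. The sketch below uses a toy wide-format data frame (the name all.toy and its values are hypothetical; only the y.* and j.* column naming follows the data in this thread) and flags sites where a survey was done but its covariate is missing. Note that a check on the wide table only finds values that are genuinely missing; it cannot detect covariates that are present but supplied in the wrong order.

```r
# Toy wide-format data: 3 sites, 2 visits (hypothetical values)
all.toy <- data.frame(
  y.1 = c(1, 0, 0), y.2 = c(0, NA, 1),
  j.1 = c(1, 1, NA), j.2 = c(0, NA, 1)
)

# Pull detections and the covariate into site x visit matrices
y.mat <- as.matrix(all.toy[, c("y.1", "y.2")])
j.mat <- as.matrix(all.toy[, c("j.1", "j.2")])

# TRUE where a survey was done but its covariate is missing
lost <- !is.na(y.mat) & is.na(j.mat)

# Sites with at least one such visit (here, site 3 only)
bad.sites <- which(rowSums(lost) > 0)
bad.sites
```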

Here is the summary data:
> summary(unmarked.dat)
unmarkedFrame Object

424 sites
Maximum number of observations per site: 10
Mean number of observations per site: 6.19
Sites with at least one detection: 246

Tabulation of y observations:
0 1 <NA>
1638 985 1617

Site-level covariates:
      AV             Fynbos          Grass            NKB             Sav              SKB
Min.   :0.00000  Min.   :0.0000  Min.   :0.0000  Min.   :0.0000  Min.   :0.0000  Min.   :0.00000
1st Qu.:0.00000  1st Qu.:0.0000  1st Qu.:0.0000  1st Qu.:0.0000  1st Qu.:0.0000  1st Qu.:0.00000
Median :0.00000  Median :0.0000  Median :0.0000  Median :0.0000  Median :0.0000  Median :0.00000
Mean   :0.04245  Mean   :0.1675  Mean   :0.1368  Mean   :0.3915  Mean   :0.1840  Mean   :0.07783
3rd Qu.:0.00000  3rd Qu.:0.0000  3rd Qu.:0.0000  3rd Qu.:1.0000  3rd Qu.:0.0000  3rd Qu.:0.00000
Max.   :1.00000  Max.   :1.0000  Max.   :1.0000  Max.   :1.0000  Max.   :1.0000  Max.   :1.00000

Observation-level covariates:
    time              j                yr              id
Min.   : 1.0  Min.   :   0.0000  Min.   :  90.00  Min.   :  1.0
1st Qu.: 3.0  1st Qu.:   0.0000  1st Qu.:  90.00  1st Qu.:106.8
Median : 5.5  Median :   1.0000  Median :  90.00  Median :212.5
Mean   : 5.5  Mean   :   0.5806  Mean   :  90.42  Mean   :212.5
3rd Qu.: 8.0  3rd Qu.:   1.0000  3rd Qu.:  91.00  3rd Qu.:318.2
Max.   :10.0  Max.   :   1.0000  Max.   :  92.00  Max.   :424.0
              NA's   :1617.0000  NA's   :1617.00
>

"j" is the breeding information and "yr" is the year information. As
I mentioned before, I'm treating these as observation-level covariates
for now while I get a better feel for the models. I get the same
warning message when either "j" or "yr" or both are included as
covariates on detection.

Here is the data from some of the discarded sites:
> all[sites.rm, 2:27]
    y.1 y.2 y.3 y.4 y.5 y.6 y.7 y.8 y.9 y.10 AV Fynbos Grass NKB Sav
206   1   0   1   0   1  NA  NA  NA  NA   NA  0      0     1   0   0
217   0  NA  NA  NA  NA  NA  NA  NA  NA   NA  0      0     0   1   0
220   0  NA  NA  NA  NA  NA  NA  NA  NA   NA  0      0     0   1   0
248   0   1  NA  NA  NA  NA  NA  NA  NA   NA  0      0     1   0   0
249   0   1   0  NA  NA  NA  NA  NA  NA   NA  0      0     0   1   0
251   1   1  NA  NA  NA  NA  NA  NA  NA   NA  0      0     0   0   1
259   0   0   1  NA  NA  NA  NA  NA  NA   NA  0      0     0   0   1
263   0   0   0   0   0   1   0   0   0    0  0      0     0   0   1
291   1   1   1   1   1   1   1   1  NA   NA  0      0     1   0   0
293   1   1   0   0   0   0   0   1   1    1  0      0     0   0   1
304   1   1   1   1   0  NA  NA  NA  NA   NA  0      0     0   0   1
305   0   0  NA  NA  NA  NA  NA  NA  NA   NA  0      0     0   0   0
322   0   0  NA  NA  NA  NA  NA  NA  NA   NA  0      0     0   1   0
325   0   0   0   0  NA  NA  NA  NA  NA   NA  0      0     0   1   0
326   0  NA  NA  NA  NA  NA  NA  NA  NA   NA  0      0     0   1   0
331   0   0   0  NA  NA  NA  NA  NA  NA   NA  0      0     0   1   0
    SKB j.1 j.2 j.3 j.4 j.5 j.6 j.7 j.8 j.9 j.10
206   0   1   1   0   1   1  NA  NA  NA  NA   NA
217   0   1  NA  NA  NA  NA  NA  NA  NA  NA   NA
220   0   1  NA  NA  NA  NA  NA  NA  NA  NA   NA
248   0   0   1  NA  NA  NA  NA  NA  NA  NA   NA
249   0   0   0   0  NA  NA  NA  NA  NA  NA   NA
251   0   1   1  NA  NA  NA  NA  NA  NA  NA   NA
259   0   1   0   1  NA  NA  NA  NA  NA  NA   NA
263   0   1   0   0   1   1   1   1   1   1    1
291   0   1   0   1   0   0   0   0   0  NA   NA
293   0   0   1   1   0   0   0   0   0   1    1
304   0   0   0   1   0   0  NA  NA  NA  NA   NA
305   1   1   0  NA  NA  NA  NA  NA  NA  NA   NA
322   0   1   0  NA  NA  NA  NA  NA  NA  NA   NA
325   0   1   0   1   0  NA  NA  NA  NA  NA   NA
326   0   1  NA  NA  NA  NA  NA  NA  NA  NA   NA
331   0   1   0   0  NA  NA  NA  NA  NA  NA   NA

Sorry that the formatting isn't visible!

Thank you again. I am going to play around with subsets of the data,
and maybe that will help shed some light on my problem. But please let me
know if you have an idea of where the warning is coming from.

~Kristin

rcha...@eco.umass.edu

Sep 7, 2010, 8:06:18 PM
to unma...@googlegroups.com
Uh oh, it looks like a bug. I get the same using:

y <- matrix(c(1,1,NA,0), 2)
x <- matrix(c(-1,3,NA,2), 2)
umf <- unmarkedFrameOccu(y=y, obsCovs=list(x=x))
summary(occu(~1~x, umf))


I know that this was not around previously, so it might be related to
a new R version. What version are you using? I'm on 2.11.1. I'll try
to fix this soon. Thanks for the report.

Kristin Broms

Sep 7, 2010, 8:22:14 PM
to unmarked

I'm using the same version, 2.11.1.

Sorry that this means there's a bug! Thanks for looking into it.

Kristin

rcha...@eco.umass.edu

Sep 7, 2010, 8:41:27 PM
to unma...@googlegroups.com
Wait, what am I talking about, ignore that last message. I meant to test this:

y <- matrix(c(1,1,NA,0), 2)
x <- matrix(c(-1,3,NA,2), 2)
umf <- unmarkedFrameOccu(y=y, obsCovs=list(x=x))

summary(occu(~x~1, umf))

which does not incorrectly remove sites. Could you send me the file
created by:

umf <- as(unmarked.dat, "data.frame")
dump("umf", "umfDump.R")

Thanks,

rcha...@eco.umass.edu

Sep 12, 2010, 2:07:44 PM
to unma...@googlegroups.com
Kristin and I figured this out off-list. It was not a bug, so no need to fret.

One thing to remember when formatting observation-level covariates is
that they need to be in site-major order. This is only an issue if
supplying a data.frame to obsCovs. I find it easier to use a list of
matrices instead. Here is an example to help clarify this.

y <- matrix(c(1,1,NA,0), 2)
x <- matrix(c(-1,3,NA,2), 2)

(umf1 <- unmarkedFrameOccu(y=y, obsCovs=list(x=x)))
(umf2 <- unmarkedFrameOccu(y=y, obsCovs=data.frame(x=as.vector(t(x)))))

all.equal(umf1, umf2)


Either of the above methods is correct. However, this is incorrect
because x is not transposed before converting to a vector:

unmarkedFrameOccu(y=y, obsCovs=data.frame(x=as.vector(x)))
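
The effect of the ordering can be seen in base R alone, without fitting anything. This is a minimal sketch that assumes, as described above, that the rows of an obsCovs data.frame are matched to visits in site-major order; the names y.long, x.right and x.wrong are introduced here only for illustration.

```r
y <- matrix(c(1, 1, NA, 0), 2)   # rows = sites, columns = visits
x <- matrix(c(-1, 3, NA, 2), 2)

y.long  <- as.vector(t(y))  # site-major: site 1's visits, then site 2's
x.right <- as.vector(t(x))  # same order as y.long
x.wrong <- as.vector(x)     # column by column, i.e. visit-major

# Visits that were surveyed but end up paired with a missing covariate
n.lost.right <- sum(!is.na(y.long) & is.na(x.right))
n.lost.wrong <- sum(!is.na(y.long) & is.na(x.wrong))
n.lost.right  # 0
n.lost.wrong  # 1
```

In the wrong ordering, the NA that belongs to site 1's unsurveyed second visit lands on site 2's first visit, which was surveyed. That is the kind of mismatch that makes observations look as if their covariates were missing even though the wide table shows none.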


Richard
