gdistsamp warning message - detection variables

Rachel Field

Nov 17, 2014, 3:02:04 PM
to unma...@googlegroups.com
Hello All, 

I have been struggling with the following analysis in unmarked gdistsamp: 

I conducted repeated 10 min point counts over one season. I have 50 sites (100-m radius @ 10-m detection intervals) replicated over 3 surveys, and each individual observation (n = 1108) is recorded at discrete distance intervals (i.e., the number of observations for each site during each survey is not always equal). Habitat variables were measured once for each site (n = 50), and detection covariates (e.g., julian date, time of day @ start of count, wind speed, observer) were measured for each site on each survey (n = 150). 

I wish to test the effect(s) of various habitat metrics on songbird abundance/density, to include detection covariate(s) in my models, and to account for repeated measures in my design. I think that 'gdistsamp' is most appropriate for this.

I have followed Chandler's 'Distance sampling analysis in unmarked (2011)' and everything seems to work until I add detection covariates (using gdistsamp, before adding any abundance/density habitat predictors). At that point, running my models produces the warning: "In lambda * A : longer object length is not a multiple of shorter object length".

Questions:
(a) Am I using the appropriate fitting function (i.e., distsamp vs. gdistsamp vs. pcount vs. ???) 
(b) Why am I getting this warning message? 

Here is my code: 

************START CODE*************

#data (rows) at the individual observations level (n=1108) (site/survey detection covariates repeated for each individual at each site/survey, n=150)
dists <-read.csv("file/path/pointcounts.csv")     
 
#data (rows) at the site (n = 50) level
covs <- read.csv("file/path/habitat.csv")         

#example of covariates (to be used as detection covariates) 
jdate<-(dists$day.julian) 
daytime<-(dists$time.hour.num)

#example of habitat variables (to be used as density/abundance predictors)
tcov<-(covs$tree.cover)
FHD<-(covs$foliage.heigh.diversity)
covspoint <- covs$point

head(dists, 1108) 

#'point' contains character+numerical site names (e.g., 'sweco03') 
levels(dists$point) <- c(levels(dists$point), "sweco03") 

levels(dists$point) 

yDat <- formatDistData(dists, distCol = "distance.m", transectNameCol = "point", dist.breaks = c(0,10,20,30,40,50,60,70,80,90,100), occasionCol = "survey")

yDat

covs <- data.frame(tcov, FHD, row.names = covspoint)

#individual observations were recorded at 10-m distance intervals to 100-m; numPrimary = 3 surveys over the season 
umf <-unmarkedFrameGDS(y = as.matrix(yDat), siteCovs = covs, survey = "point", dist.breaks = c(0,10,20,30,40,50,60,70,80,90,100), unitsIn = "m", numPrimary = 3) 
summary(umf) 

#to determine the best detection function 
hn_Null <- gdistsamp (~1, ~1, ~1, umf, keyfun = "halfnorm", output = "density", unitsOut = "ha") 
haz_Null <-gdistsamp (~1, ~1, ~1, umf, keyfun = "hazard")               #lowest AIC
uni_Null <- gdistsamp (~1, ~1, ~1, umf, keyfun = "uniform") 
exp_Null <- gdistsamp (~1, ~1, ~1, umf, keyfun = "exp")

#to test the fit of detection covariates 
m1 <- gdistsamp (~1, ~1, ~jdate, umf, keyfun = "hazard", output="density", mixture="P") 
m2 <- gdistsamp (~1, ~1, ~daytime, umf, keyfun = "hazard", output="density", mixture="P") 

#When I attempt to run these models, I get the "In lambda * A : longer object length is not a multiple of shorter object length" warning message. 

*********END CODE*********

I think this may have something to do with the fact that the detection covariates have not been reformatted as a data.frame. If so, could someone please give me some advice on how to produce such a data frame so that its row names match those of both the individual observation-level data (row n = 1108) and the site-level habitat data (row n = 50)?
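
The warning text itself is R's generic vector-recycling warning. The sketch below uses base R only, with hypothetical stand-ins for the objects in the post (`site_data`, and a free-standing `jdate` of length 1108). It shows one plausible route to a length mismatch: a formula variable that is not a column of the supplied data is looked up in the surrounding environment and keeps its own length. Whether this is exactly what happens inside gdistsamp is an assumption, not something confirmed in this thread.

```r
# Hypothetical stand-ins for the objects in the post
site_data <- data.frame(tree.cover = runif(50))  # one row per site
jdate <- rnorm(1108)                             # one value per observation

# 'jdate' is not a column of site_data, so model.frame() finds the
# free-standing vector instead and the result has 1108 rows, not 50
mf <- model.frame(~ jdate, data = site_data)
n_rows <- nrow(mf)
n_rows

# Combining vectors whose lengths are not multiples of each other
# triggers R's recycling warning; capture its message here
msg <- tryCatch(
  {
    rep(1, n_rows) * rep(1, nrow(site_data))
    "no warning"
  },
  warning = function(w) conditionMessage(w)
)
msg
```

If this is the cause, the remedy is to store the covariate inside the unmarkedFrameGDS with the expected number of rows rather than leaving it as a separate vector.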

Thank you in advance, 
Rachel 

Jeffrey Royle

Nov 17, 2014, 10:19:10 PM
to unma...@googlegroups.com
hi Rachel,
Yes, you are using the correct function: gdistsamp() is for replicated distance sampling. You could also use distsamp() if you processed your data in a slightly different way (treating replicates as new pseudo-sites, which has been discussed on the list before).
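
A minimal base R sketch of the pseudo-site idea, on toy data. It assumes the wide count matrix has one row per site and its columns grouped survey by survey (all distance bins for survey 1, then all bins for survey 2, and so on); check that against your own formatDistData output. The object names (`y_wide`, `hab`) and the `_s1`, `_s2` suffixes are hypothetical.

```r
set.seed(1)
n_sites <- 3; n_bins <- 4; n_surveys <- 2   # toy sizes, not the real 50 / 10 / 3

# Wide counts: one row per site, columns = survey 1 bins, then survey 2 bins
y_wide <- matrix(rpois(n_sites * n_bins * n_surveys, lambda = 2),
                 nrow = n_sites,
                 dimnames = list(c("a1", "a2", "a3"), NULL))

# Cut out each survey's block of columns and relabel its rows as pseudo-sites
blocks <- lapply(seq_len(n_surveys), function(t) {
  cols <- ((t - 1) * n_bins + 1):(t * n_bins)
  block <- y_wide[, cols, drop = FALSE]
  rownames(block) <- paste0(rownames(y_wide), "_s", t)
  block
})
y_stacked <- do.call(rbind, blocks)   # (sites x surveys) rows, one column per bin

# Site-level habitat covariates are repeated once per survey, in the same order
hab <- data.frame(tree.cover = c(10, 40, 75), row.names = rownames(y_wide))
hab_stacked <- hab[rep(seq_len(n_sites), times = n_surveys), , drop = FALSE]
rownames(hab_stacked) <- rownames(y_stacked)

dim(y_stacked)
```

Stacking this way treats repeat visits to a site as independent sites, so it trades the repeated-measures structure for a simpler data layout.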

You seem to have a simple data formatting error with your covariate, which is hard to diagnose without looking at all of your data and rolling up my sleeves.
I don't see jdate and the other covariate in the unmarkedFrameGDS.
In any case, the unmarkedFrameGDS can hold several types of covariates (see the help file for unmarkedFrameGDS), and I think what you have here are "yearlySiteCovs", which should work if they have n=50 rows and 3 (number of replicates) columns. So if you can make this matrix holding the unique Julian date of each site x replicate sample, then it should run.
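
One way to build such a site x replicate matrix from a long table (one row per site and survey) is sketched below in base R. The table `det_long` and its column names are hypothetical toy data with 3 sites and 3 surveys; the rows are deliberately shuffled to show that the result does not depend on row order.

```r
# Toy long-format detection data: one row per site x survey
det_long <- data.frame(
  point      = rep(c("a1", "a2", "a3"), each = 3),
  survey     = rep(1:3, times = 3),
  day.julian = c(150, 164, 178, 151, 165, 179, 152, 166, 180),
  stringsAsFactors = FALSE
)
det_long <- det_long[c(5, 1, 9, 3, 7, 2, 8, 4, 6), ]  # shuffled on purpose

sites   <- sort(unique(det_long$point))
surveys <- sort(unique(det_long$survey))

# Empty sites x surveys matrix, filled by (row, column) index pairs
jd_mat <- matrix(NA_real_, nrow = length(sites), ncol = length(surveys),
                 dimnames = list(sites, as.character(surveys)))
idx <- cbind(match(det_long$point, sites), match(det_long$survey, surveys))
jd_mat[idx] <- det_long$day.julian

jd_mat
```

Any `NA` left in `jd_mat` would flag a site and survey combination that is missing from the long table.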

If it doesn't (after you've made that matrix), let me know...
regards
andy



siteCovs

Data frame of covariates that vary at the site level.

obsCovs

Data frame of covariates that vary within site-year-observation level.

numPrimary

Number of primary time periods (seasons in the multiseason model).

yearlySiteCovs

Data frame containing covariates at the site-year level.
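
The argument descriptions above imply some dimension bookkeeping that can be checked before building the frame. The helper below, `check_gds_dims`, is hypothetical and not part of unmarked. It encodes the assumption that, with M sites, T primary periods and J distance intervals, `y` has J * T columns, `siteCovs` has M rows, and a data-frame `yearlySiteCovs` has M * T rows.

```r
# Hypothetical helper: returns TRUE/FALSE for each dimension check
check_gds_dims <- function(y, siteCovs, yearlySiteCovs, numPrimary, dist.breaks) {
  M <- nrow(y)                    # number of sites
  J <- length(dist.breaks) - 1    # number of distance intervals
  c(
    y_cols              = ncol(y) == J * numPrimary,
    siteCovs_rows       = nrow(siteCovs) == M,
    yearlySiteCovs_rows = nrow(yearlySiteCovs) == M * numPrimary
  )
}

# Toy inputs shaped like the study in this thread: 50 sites, 10 bins, 3 surveys
breaks   <- seq(0, 100, by = 10)
y        <- matrix(0, nrow = 50, ncol = 30)
siteCovs <- data.frame(tree.cover = runif(50))
ysc_wide <- data.frame(j1 = rnorm(50), j2 = rnorm(50), j3 = rnorm(50))  # 50 rows
ysc_long <- data.frame(julian = rnorm(150))                             # 150 rows

res_wide <- check_gds_dims(y, siteCovs, ysc_wide, numPrimary = 3, dist.breaks = breaks)
res_long <- check_gds_dims(y, siteCovs, ysc_long, numPrimary = 3, dist.breaks = breaks)
res_wide
res_long
```

Under that assumption, a wide table with one row per site fails the last check, while a 150-row table passes.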



Rachel Field

Nov 18, 2014, 10:50:00 AM
to unma...@googlegroups.com
Hello Andy,

Thanks very much for your reply. I followed your advice, along with other coding advice in this listserve (titled: "[unmarked] using obsCovs in unmarkedFrame", post on Friday, February 8, 2013 2:44:19 PM UTC-7). Unfortunately, I now have new problems. My new code is as follows:

#########

library(unmarked)

#distance data (individual observations)
dists <-read.csv("Data/csv data/PointCountSONG.csv", colClasses=c("factor","factor","numeric","factor","factor","numeric","numeric"))

#site-level covariates (sites = 50)
covs <-read.csv("Data/csv data/Habitat.csv", colClasses=c("factor","factor","factor","factor","factor",rep("numeric",21)))

#site+survey-level (detection/'yearly') covariates (sites+surveys = 150)
det <-read.csv("Data/csv data/DetectionCovs_SURVxSITE_updated.csv",colClasses=c("factor","factor","factor","factor","factor",rep("numeric",15)))

head(dists, 1108)

levels(dists$point) <- c(levels(dists$point), "sweco03")
levels(dists$point)

yDat <- formatDistData(dists, distCol = "distance.m", transectNameCol = "point", dist.breaks = c(0,10,20,30,40,50,60,70,80,90,100), occasionCol = "survey")
yDat

umf <-unmarkedFrameGDS(y = as.matrix(yDat), siteCovs = covs, yearlySiteCovs = det, survey = "point", dist.breaks = c(0,10,20,30,40,50,60,70,80,90,100), unitsIn = "m", numPrimary = 3)
umf
#Msg: Error in data.frame(df, yscwide) : arguments imply differing number of rows: 50, 17 In addition: There were 20 warnings (use warnings() to see them)
warnings()
#Warning messages:
#1: In FUN(X[[20L]], ...) :
#  data length [50] is not a sub-multiple or multiple of the number of rows [17]
#NB: these error/warning messages are removed when I remove 'yearlySiteCovs' from the dataframe.

hn_Null <- gdistsamp(~1, ~1, ~1, umf, keyfun = "halfnorm", output = "density", unitsOut = "ha", K=2000)
haz_Null <- gdistsamp(~1, ~1, ~1, umf, keyfun = "hazard")
uni_Null <- gdistsamp(~1, ~1, ~1, umf, keyfun = "uniform")
exp_Null <- gdistsamp(~1, ~1, ~1, umf, keyfun = "exp")

hn_Null #AIC = 3082.192
haz_Null #2754.778 - lowest
uni_Null #4141.68
exp_Null #2846.203

#fit model for detection covariates
mNULL <-gdistsamp(~1, ~1, ~1, umf, keyfun = "hazard")
m1 <- gdistsamp(~1, ~1, ~day.julian.1+day.julian.2+day.julian.3, umf, keyfun = "hazard")
#Warning message:
#In gdistsamp(~1, ~1, ~day.julian.1 + day.julian.2 + day.julian.3,  :
#  Hessian is singular. Try using fewer covariates and supplying starting values.

#Try adding starting values. 
#Get a vector of starting values
tmp <- mNULL@estimates@estimates
starts <- c(tmp$lambda@estimates, tmp$phi@estimates, tmp$det@estimates, 0,0,0, tmp$scale@estimates)

#day.julian.# = days separated by survey
m1s <-gdistsamp(~1, ~1, ~day.julian.1+day.julian.2+day.julian.3, umf, keyfun = "hazard", starts=starts)
#Warning message:
#In sqrt(diag(vcov(obj))) : NaNs produced

##########END

I have checked my yearlySiteCovs data and there do not appear to be any errors. I will try to use distsamp with pseudo-site replicates as you suggested. However, I am curious to know what might be producing these error messages in the above code.

Thanks again,
Rachel

Rachel Field

Nov 19, 2014, 3:02:43 PM
to unma...@googlegroups.com
Andy found a solution to this problem:

In summary, detection variables (i.e., those measured at each site on each survey) can be imported from their own file, separate from the observation file (one row per individual distance observation) and the habitat file (one row per site). The detection file should have one row per site, with a separate column for each survey's measurement of each detection variable. For example, with 50 sites, three surveys, and a detection variable named var:

     site  var_1  var_2  var_3
1    a1    4.2    6.6    8.1
2    a2    3.7    3.4    5.5
3    a3    2.6    3.3    3.3
...
50   a50   5.1    4.9    2.2

See below for an example with code.

EXAMPLE: Bird point counts were conducted at 50 sites over 3 surveys (1 season), yielding a total of 1107 individual observations.

####EXAMPLE CODE

library(unmarked)

# Import distance data (row n=1107); classify columns appropriately (e.g., "factor", "numeric", etc.)
dists <-read.csv("filepath/Observations.csv", colClasses=c("factor","factor","numeric","factor","factor","numeric","numeric"))

# Import site-level covariates (row n=50)
covs <-read.csv("filepath/HabitatCovs.csv", colClasses=c("factor","factor","factor","factor","factor",rep("numeric",21)))

# Import site+survey-level (detection) covariates (row n=50)
det <-read.csv("filepath/DetectionCovss.csv",colClasses=c(rep("factor",5),rep("numeric",15)))

head(dists, 1108)

# NB: "sweco03" is a site name
levels(dists$point) <- c(levels(dists$point), "sweco03")
levels(dists$point)

yDat <- formatDistData(dists, distCol = "distance.m", transectNameCol = "point", dist.breaks = c(0,10,20,30,40,50,60,70,80,90,100), occasionCol = "survey")
yDat

#make a data frame of detection variables (n=150 rows, ordered site by site: site 1 surveys 1-3, site 2 surveys 1-3, ...) and standardize the numeric ones
tmp <- data.frame(
  observer = factor(t(det[,3:5])[1:150]),
  julian = scale(t(det[,6:8])[1:150]),
  time = scale(t(det[,9:11])[1:150]),
  temp = scale(t(det[,12:14])[1:150]),
  wind = scale(t(det[,15:17])[1:150]),
  cloud = scale(t(det[,18:20])[1:150])
)

umf <-unmarkedFrameGDS(y = as.matrix(yDat), siteCovs = covs, yearlySiteCovs = tmp, survey = "point", dist.breaks = c(0,10,20,30,40,50,60,70,80,90,100), unitsIn = "m", numPrimary = 3)
umf

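#model containing one detection variable
#NB: gdistsamp's formulas are, in order, lambdaformula, phiformula, pformula, so julian here
#enters the second formula (phi, availability); to model the distance-detection function
#instead, put the covariate in the third formula
m1 <-gdistsamp(~1, ~julian, ~1, umf, keyfun = "hazard") 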

####END (not run)
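
The `t(det[, cols])[1:150]` step in the example turns a sites x surveys block of columns into one long vector. The base R sketch below repeats that step on the first three rows of the toy table from this post, to confirm the resulting order (site 1 surveys 1 to 3, then site 2 surveys 1 to 3, and so on). The `key` data frame is a hypothetical aid for checking the order by eye.

```r
# Wide detection table: one row per site, one column per survey
det <- data.frame(
  site  = c("a1", "a2", "a3"),
  var_1 = c(4.2, 3.7, 2.6),
  var_2 = c(6.6, 3.4, 3.3),
  var_3 = c(8.1, 5.5, 3.3),
  stringsAsFactors = FALSE
)
n_sites   <- nrow(det)
n_surveys <- 3

# Transposing puts surveys in rows, so flattening column by column
# walks through each site's surveys in turn
wide     <- as.matrix(det[, c("var_1", "var_2", "var_3")])
var_long <- as.vector(t(wide))

# Label every value with its site and survey to check the order
key <- data.frame(
  site   = rep(det$site, each = n_surveys),
  survey = rep(seq_len(n_surveys), times = n_sites),
  var    = var_long
)
key
```

Note that calling `scale()` on the flattened vector, as the example does, standardizes against the mean and standard deviation across all sites and surveys together.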