This question is highly R focused, but I have seen other R questions on this forum and I am hoping someone here can help me get to the root of this problem.
I have a series of VHF radio-telemetry locations from animals that I am hoping to analyze in R. I have researched the different packages available for this and decided that using the ks package to select a plug-in bandwidth for all individuals is the most appropriate approach, based on previous research with my study species. This is reinforced in section 4.3 of this
I have written two R functions, initially to check that they give similar results, and then hopefully to select one for analyzing the home ranges of my study animals. The results from the two functions were not as similar as I hoped, and that is what brings me to this forum: to figure out why.
The first function relies entirely on the ks package. It uses the kde() function on a pre-defined grid, which extends beyond the locations by a multiple of the bandwidth, to estimate the utilization distribution (UD). The UD cells are then sorted from high to low density. I calculate the area of a grid cell, standardize the cell values so they sum to 1, and take their cumulative sum. I then count the number of cells needed to reach a cumulative volume of 0.95 or 0.50 (depending on whether I am interested in the 95% or 50% kde area) and multiply that number of cells by the cell area to get the home range area. Perhaps there is a flaw in my logic here that I am not seeing; if so, please alert me.
library(ks) #provides Hpi() and kde()
kde_ks.hpi <- function(filename, percentage, gridsize, multiplier){
data <- read.csv(file = filename)
x <- as.data.frame(data$X)
y <- as.data.frame(data$Y)
loc <- cbind(x,y)
data.h <-Hpi(loc)
x.grid.size<- (gridsize) #Set grid size as no. of nodes in the x direction
band.mult<- (multiplier)
x=seq(min(loc[,1])-band.mult*sqrt(data.h[1,1]), max(loc[,1]) + band.mult *sqrt(data.h[1,1]), length.out=x.grid.size)
y=seq(min(loc[,2])-band.mult*sqrt(data.h[2,2]), max(loc[,2])+ band.mult*sqrt(data.h[2,2]), by=(x[2]-x[1])) #sets out nodes in y axis, spaced the same as x axis
eval.pts <-expand.grid(x,y)
UD <-kde(loc,H=data.h,eval.points=eval.pts)
output<- data.frame(cbind(UD$eval.points[,1],UD$eval.points[,2],UD$estimate))
colnames(output)<- c("xcoord","ycoord","UD_ht")
cell.area <- (x[2]-x[1])*(y[2]-y[1])
grid.vol <- sum(cell.area*output[,3]) #check that grid UD vol is >=0.99
vol<-output[,3]/sum(output[,3]) #standardize volume to add to 1
output<-data.frame(cbind(output,vol))
output<-output[order(-output$vol),] #sort by descending volume
cumV<-cumsum(output$vol) #calculate cumulative UD volume
output<-data.frame(cbind(output,cumV)) #add cumulative vol to output
sub.UD<-subset(output,output$cumV<=(percentage/100)) #subset output by percent vol
cellcount <- nrow(sub.UD)
area <- cellcount*cell.area
KDE <- data.frame(c(area,grid.vol))
KDE
}
So as an example -
kde_ks.hpi("C01 .csv", percentage=50,gridsize =150,multiplier =3) #I will attach a stripped down version of this data file if anyone is interested in reproducing my results.
In this case - I found a 50% home range area of 3958 m^2 for this individual.
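To sanity-check the cell-counting logic independently of my data, here is a minimal sketch in base R that applies the same steps (sort by density, standardize, cumulative sum, count cells) to a standard bivariate normal density, where the area of the 50% region is known analytically as -2*pi*log(1 - 0.5). The function names `dens` and `isopleth_area`, the grid half-width of 5, and the two grid sizes are illustrative assumptions, not part of my actual functions.

```r
# Standard bivariate normal density (identity covariance): a known test case
dens <- function(x, y) exp(-(x^2 + y^2) / 2) / (2 * pi)

# Area of the isopleth holding `percentage` % of the volume, by cell counting
# (same logic as kde_ks.hpi, but on an analytic density instead of a kde)
isopleth_area <- function(percentage, n_nodes, half_width = 5) {
  x <- seq(-half_width, half_width, length.out = n_nodes)
  y <- x
  grid <- expand.grid(x = x, y = y)
  z <- dens(grid$x, grid$y)
  cell_area <- (x[2] - x[1]) * (y[2] - y[1])
  vol <- z / sum(z)                                # standardize volume to sum to 1
  cum_vol <- cumsum(sort(vol, decreasing = TRUE))  # cumulative volume, densest first
  n_cells <- sum(cum_vol <= percentage / 100)      # cells needed to reach the target
  n_cells * cell_area
}

exact_50 <- -2 * pi * log(1 - 0.5)  # analytic area of the 50% region
coarse <- isopleth_area(50, n_nodes = 51)
fine <- isopleth_area(50, n_nodes = 401)
print(c(exact = exact_50, coarse = coarse, fine = fine))
```

The point of the sketch is that the cell-counting area is a discretized approximation, so it depends on the cell size even when the density itself is known exactly. I would expect the same to hold for the kde-based version.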
My second function is largely based on code from the Spatial Ecology textbook that I referenced above. This code converts the bandwidth and kde from the ks package into rasters so that functions in the adehabitatHR package can be used to easily get UD volumes and areas. The difference between my code and the one in the book is that I define the grid using the same method as in the first function. Yet I don't get the same results, despite using the same grid and bandwidth as in the previous function. Below is the annotated code for that function:
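library(ks) #Hpi() and kde()
library(sp) #CRS(), SpatialPointsDataFrame(), GridTopology(), SpatialGrid()
library(raster) #raster(), xyFromCell(), setValues()
library(adehabitatHR) #estUD class, getvolumeUD(), getverticeshr()
CRS.SC <- CRS("+init=epsg:32616") #define study area projection first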
kde_ade.ks.hpi <- function(filename, percentage, gridsize, multiplier){
data <- read.csv(file = filename) #read in data
x <- as.data.frame(data$X)
y <- as.data.frame(data$Y)
xy <- c(x,y)
loc <- cbind(x,y)
data.h <-Hpi(loc) #define bandwidth using plug-in method
data.proj <- SpatialPointsDataFrame(xy, data, proj4string = CRS.SC)
boundingVals <- data.proj@bbox #get the bounding values of the animal locations
band.mult <- (multiplier) #define the value used to expand the evaluation grid
y.expand <- band.mult*sqrt(data.h[2,2]) #set the value used to expand the grid by a function of the multiplier and the bandwidth
x.expand <- band.mult*sqrt(data.h[1,1]) #note that this expansion is different in the x and y direction
deltaX <- as.integer(((boundingVals[1,2]) - (boundingVals[1,1])) + (2*x.expand)) #get the total length of the grid axis
deltaY <- as.integer(((boundingVals[2,2]) - (boundingVals[2,1])) + (2*y.expand))
x.grid.size <- (gridsize) #set the number of grid nodes in the X direction
gridRes <- deltaX/x.grid.size #determine the grid resolution, (i.e the size of one side of a cell)
y.grid.size <- deltaY/gridRes #determine the number of nodes in the Y direction using the same cell size as the x axis
boundingVals[2,1] <- boundingVals[2,1] - y.expand #min Y - expand the grid in each direction by the previously determined value
boundingVals[2,2] <- boundingVals[2,2] + y.expand #max Y
boundingVals[1,1] <- boundingVals[1,1] - x.expand #min X
boundingVals[1,2] <- boundingVals[1,2] + x.expand #max X
gridTopo <- GridTopology((boundingVals[,1]), c(gridRes,gridRes),c(x.grid.size,y.grid.size)) #Grid Topology object is basis for sampling grid (offset, cellsize, dim)
sampGrid <- SpatialGrid(gridTopo, proj4string = CRS.SC) #Using the Grid Topology and projection create a SpatialGrid class
sampSP <- as(sampGrid, "SpatialPixels") #Cast over to Spatial Pixels
sampRaster <- raster(sampGrid) #convert the SpatialGrid class to a raster
sampRaster[] <- 1 #set all the raster values to 1 such as to make a data mask
evalPoints <- xyFromCell(sampRaster, 1:ncell(sampRaster)) #Get the center points of the mask raster with values set to 1
hpikde <- kde(x=loc, H=data.h, eval.points=evalPoints) #Create the KDE using the evaluation points
hpikde.raster <- raster(sampRaster) #Create a template raster based upon the mask and then assign the values from the kde to the template
hpikde.raster <- setValues(hpikde.raster,hpikde$estimate)
hpikde.px <- as(hpikde.raster,"SpatialPixelsDataFrame") #Cast over to SPxDF
hpikde.ud <- new("estUD", hpikde.px) #create new estUD using the SPxDF
hpikde.ud@vol = FALSE #Assign values to a couple slots of the estUD
hpikde.ud@h$meth = "Plug-in Bandwidth"
hpikde.ud.vol <- getvolumeUD(hpikde.ud, standardize=TRUE) #Convert the UD values to volume using getvolumeUD from adehabitatHR and cast over to a raster
hpikde.ud.vol.raster <- raster(hpikde.ud.vol)
hpikde.vol <- getverticeshr(hpikde.ud, percent = percentage,ida = NULL, unin = "m", unout = "m2", standardize=TRUE) #Here we generate volume contours using the UD
hpikde.vol$area #Determine UD area at that contour
}
So for example:
kde_ade.ks.hpi("C01 .csv", percentage=50,gridsize =150,multiplier =3)
So in this case - I found a 50% kde home range area of 3965 m^2 for this individual. That is not a big difference from the other value of 3958 m^2, but I would love to know where it comes from, as the difference between the two functions varies depending on how I define the grid, multiplier, or percentage. At 99% the same data set produces values of 67182 m^2 and 67453 m^2 for the first and second methods, respectively. In this case the second method produces the larger value both times, but when I try this with a different animal, the first method gives me larger areas. What is going on here, and why is there a difference? Is one method better than the other? What do you all recommend for moving forward?
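One thing I can check directly is whether the two functions really end up with the same cell size. Reading the code again, the first function spaces `gridsize` nodes with seq(..., length.out=), which gives gridsize - 1 intervals, while the second truncates the extent with as.integer() and divides it by gridsize. A minimal sketch of just that arithmetic in base R, where the extent and expansion values are made up for illustration and are not taken from my data:

```r
# Hypothetical values standing in for the data extent and the Hpi-based expansion
x_min <- 500010.4
x_max <- 500310.9        # made-up easting range (m)
x_expand <- 3 * 12.7     # made-up multiplier * sqrt(H[1,1])
gridsize <- 150

# First function: `gridsize` nodes from seq(), i.e. gridsize - 1 intervals
x_nodes <- seq(x_min - x_expand, x_max + x_expand, length.out = gridsize)
res_first <- x_nodes[2] - x_nodes[1]

# Second function: extent truncated by as.integer(), then divided by gridsize
delta_x <- as.integer((x_max - x_min) + 2 * x_expand)
res_second <- delta_x / gridsize

print(c(first = res_first, second = res_second,
        cell_area_ratio = (res_second / res_first)^2))
```

With these made-up numbers the two resolutions are not equal, so the two grids are similar but not identical. The second function also evaluates the kde at raster cell centers from xyFromCell() rather than at the seq() nodes. I don't know whether this explains the whole difference in area, only that the grids are not the same.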
I am hoping to use these functions and alter them to produce shapefiles that I can then map in GIS, once my methodology is settled.