What is the correct way to aggregate individual ODs at the population level?

Danielle Berger

May 26, 2023, 12:40:13 AM
to ctmm R user group
Hi Chris,

I am trying to define seasonal habitat patches frequently used by bighorn sheep. As an initial step, I fit occurrence distributions (ODs) to individual/season/year GPS data folds. Now, I want to aggregate these ODs into a single composite raster layer for each season that represents something like population-level intensity of use for any given pixel. The caveat is that individuals can be included even if they contribute less than a full season of data. For example, some individuals will contribute two weeks of data in a season and others two months, which results in ODs with orders-of-magnitude differences in scaling among pixel values.

In the examples I could find in the literature, the standard approach seems to be to take either the sum or the arithmetic mean of ODs to create an aggregate OD. If I want to represent the average probability of use of a pixel by a sheep, given differing sampling durations between animals, it seems like I shouldn't just take the mean. However, I don't think it is as simple as taking a weighted mean, where the weights are the number of sampling days for each individual, because that doesn't correct the difference in scaling between ODs with different sampling periods.

I've considered normalizing all ODs between 0 and 1 and then taking a weighted average, but this seems to artificially compress variance. I've also thought about taking the log of all ODs to help deal with overdispersion, and then calculating a weighted geometric mean. Do you have any thoughts on the correct way to aggregate individual ODs at the population level, accounting for different durations of sampling?

Thanks!

Best,
Dani Berger

Christen Fleming

May 26, 2023, 7:01:09 PM
to ctmm R user group
Hi Dani,

It depends on what you want the aggregate population OD to correspond to. The individual OD corresponds to asking where the animal is located at a time sampled uniformly at random from the sampling period. For multiple individuals:
  1. Sampling a random individual and then sampling a random time from that individual's sampling period would correspond to the unweighted mean.
  2. Sampling a random time from the sampling period of all individuals and then sampling a random individual would correspond to the weighted mean, where the weights are proportional to the sampling period of each individual.
Option 2 makes more sense to me. The difference in scaling should come from a difference in either sampling period or sampling frequency. If the frequencies are the same, then shorter tracks should just have more probability mass in a smaller area, and the weights should account for that.
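
As a rough illustration of the two options, here is a minimal sketch in plain Python rather than ctmm. It assumes that every OD has already been put on the same grid and is stored as per-pixel probability mass summing to 1. The function name `aggregate_ods`, the 2x2 grids, and the sampling durations are all made up for the example.

```python
# Toy sketch of the two aggregation options. All names and numbers are
# hypothetical; each OD is a flat list of per-pixel probability mass on the
# SAME grid, and each OD sums to 1.

def aggregate_ods(ods, weights=None):
    """Return the (weighted) arithmetic mean of the ODs, pixel by pixel."""
    if weights is None:
        weights = [1.0] * len(ods)          # option 1: every individual counts equally
    total = float(sum(weights))
    norm = [w / total for w in weights]     # normalize weights to sum to 1
    n_pixels = len(ods[0])
    return [sum(w * od[i] for w, od in zip(norm, ods)) for i in range(n_pixels)]

od_short = [0.70, 0.30, 0.00, 0.00]  # individual tracked for 14 days (concentrated)
od_long = [0.10, 0.20, 0.30, 0.40]   # individual tracked for 60 days (spread out)
days = [14, 60]                      # sampling period of each individual

unweighted = aggregate_ods([od_short, od_long])              # option 1
weighted = aggregate_ods([od_short, od_long], weights=days)  # option 2

print(unweighted)
print(weighted)
```

Because each input sums to 1 and the weights are normalized, both results still sum to 1, so no rescaling of pixel values to a 0-1 range is needed. In the weighted version the short track contributes less to the total, which is how the weights account for its more concentrated probability mass.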

Best,
Chris