Variance proportion


Arthur Jacquemin

Mar 10, 2022, 5:14:47 AM
to distance-sampling
Hello,
I am looking for a way to decompose the variance of my overall density estimate, i.e. to determine what proportion of the squared coefficient of variation is explained by encounter rate, group size (I work with groups of birds), and the detection function.
The dht2 function seems to provide this information, but it only seems to work for stratified data, which is not my situation.

So I tried to calculate these variance proportions myself simply based on the formula:
cv(D)² = [cv(encounter rate)]² + [cv(detection function)]² + [cv(cluster size)]²

In the model output I already have cv(D), cv(ER) and cv(cluster size), so I thought I could easily solve for the missing cv(detection function).

But I have a major problem: cv(cluster size) is larger than cv(D), so the equation cannot hold.
I don't understand: is it really possible to have cv(cluster size) > cv(D)?

Here are my data:
cv(D) = 0.1576
cv(ER) = 0.1411
cv(cluster size) = 0.1603
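
The inconsistency can be checked numerically. The following is a minimal sketch in R, assuming the additive formula above holds exactly; the variable names are illustrative only. Solving for the squared cv of the detection function gives a negative number, which is impossible for a squared quantity.

```r
# Reported CVs from the model output
cv_D    <- 0.1576  # overall density
cv_er   <- 0.1411  # encounter rate
cv_size <- 0.1603  # cluster size

# Rearranged identity: cv(detection)^2 = cv(D)^2 - cv(ER)^2 - cv(size)^2
cv_det_sq <- cv_D^2 - cv_er^2 - cv_size^2

# TRUE here: the implied squared CV is negative, so the three reported
# values cannot all belong to the same additive decomposition
cv_det_sq < 0
```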

Thanks for your help,
Arthur

Eric Rexstad

Mar 10, 2022, 7:56:32 AM
to Arthur Jacquemin, distance-sampling
Arthur

There are several questions packed into this. I don't have an answer for cv(size) > cv(D), but let's start smaller. All elements needed to perform the variance component calculations can be extracted from summary(dsobject). I demonstrate with the wren_5min data set shipped with the Distance package (with a fabricated group size).

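library(Distance)
data("wren_5min")
# Fabricate a group size (1 + Poisson(1)) for every row; the draw is random,
# so results will differ slightly from the output below on each run
wren_5min$size <- rpois(nrow(wren_5min), 1) + 1
# Distances in metres, point transects (no effort units), density per hectare
cfac <- convert_units("meter", NULL, "hectare")
arthur <- ds(wren_5min, key = "hr", truncation = 110,
             transect = "point", convert_units = cfac)
gof_ds(arthur)
summary(arthur)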
Summary for distance analysis
Number of observations :  132
Distance range         :  0  -  110

Model : Hazard-rate key function
AIC   : 1167.512

Detection function parameters
Scale coefficient(s):  
            estimate         se
(Intercept) 4.195202 0.05797625

Shape coefficient(s):  
            estimate        se
(Intercept) 1.881361 0.2303151

                       Estimate          SE         CV
Average p             0.4594877  0.03454677 0.07518541
N in covered region 287.2764876 28.36284260 0.09873012

Summary for clusters

Summary statistics:
    Region Area CoveredArea Effort   n  k     ER     se.ER      cv.ER
1 Montrave 33.2    243.2849     64 132 32 2.0625 0.1901692 0.09220324

Abundance:
  Label Estimate      se        cv      lcl      ucl       df
1 Total 39.20333 4.66409 0.1189718 30.96093 49.64001 77.73582

Density:
  Label Estimate        se        cv       lcl      ucl       df
1 Total 1.180823 0.1404846 0.1189718 0.9325582 1.495181 77.73582

Summary for individuals

Summary statistics:
    Region Area CoveredArea Effort   n  k       ER     se.ER      cv.ER mean.size
1 Montrave 33.2    243.2849     64 261 32 4.078125 0.3934375 0.09647508  1.977273
     se.mean
1 0.08963513

Abundance:
  Label Estimate       se        cv      lcl      ucl      df
1 Total 77.51567 9.481117 0.1223123 60.80371 98.82094 73.6152

Density:
  Label Estimate        se        cv      lcl      ucl      df
1 Total 2.334809 0.2855758 0.1223123 1.831437 2.976534 73.6152

Expected cluster size
  Region Expected.S se.Expected.S cv.Expected.S
1  Total   1.977273    0.08496549    0.04297105
2  Total   1.977273    0.08496549    0.04297105


To find the variance components of encounter rate, detection function and group size, divide the squared CV of each component by the squared CV of the density of individuals (from the online workshop lecture on precision, slide 32):
For detection probability: 0.075^2 / 0.1223^2 = 0.376
For encounter rate: 0.0922^2 / 0.1223^2 = 0.568
For group size: 0.04297^2 / 0.1223^2 = 0.123

That should give you a first-order approximation of the sources of uncertainty in your density estimates.
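
The same arithmetic can be sketched in R. This is only an illustration: the CVs are typed in by hand from the printed summary above, and the helper name variance_proportions is made up for this example, not part of the Distance package.

```r
# CVs copied from the printed summary (wren example with fabricated sizes)
cv_p    <- 0.07518541  # detection probability ("Average p")
cv_er   <- 0.09220324  # encounter rate of clusters (cv.ER)
cv_size <- 0.04297105  # expected cluster size (cv.Expected.S)
cv_D    <- 0.1223123   # density of individuals

# Hypothetical helper: share of the total squared CV due to each component
variance_proportions <- function(cv_components, cv_total) {
  cv_components^2 / cv_total^2
}

props <- variance_proportions(
  c(detection = cv_p, encounter_rate = cv_er, group_size = cv_size),
  cv_D
)
round(props, 3)
sum(props)  # approximate shares; they need not sum to exactly 1
```

Using the unrounded CVs changes the third decimal slightly compared with the hand calculation.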

Double-check whether the CVs you are combining refer to density of groups or density of individuals; mixing the two is a possible source of the confusion.

Arthur Jacquemin

Mar 10, 2022, 12:52:09 PM
to distance-sampling
Thank you. Indeed, I had not realized that cv(detection function) was given in the output.
However, even after rerunning my analyses I still face the same problem: cv(cluster size) is higher than cv(D).

Here are the output results that might allow you to identify a problem (I work on the total area):


[Attached screenshots: summary1.PNG, summary2.PNG, summary3.PNG, summary4.PNG]

Eric Rexstad

Mar 11, 2022, 4:45:28 AM
to Arthur Jacquemin, distance-sampling
Thank you for the additional detail Arthur.  Given your interest is in the entire study area (total), perhaps you should change Region.Label to "everything" so stratum-specific estimates are not produced; those stratum-specific estimates are unreliable (especially for the last two strata).  I also encourage the use of the convert_units() function so the area is in units such as sq km rather than sq m; this way the encounter rates will be more interpretable.  I'm not sure how the CV(encounter rate) for the entire study area is computed; it might be weighted in some manner.
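
Both suggestions can be sketched as follows. This is an assumption-laden illustration, not Arthur's actual data: the data frame birds and its values are invented, and the unit choices (distances in metres, effort in kilometres, as for line transects) are guesses that would need to match the real survey.

```r
# Hypothetical data frame in the flatfile layout used by ds();
# all values are made up for illustration only
birds <- data.frame(
  Region.Label = c("Camargue", "Camargue", "StratumB"),
  Sample.Label = c("T1", "T2", "T3"),
  Effort       = c(2.0, 1.5, 3.0),
  distance     = c(12, 40, 25),
  size         = c(3, 1, 5)
)

# Collapse all strata into one region so only a study-area total is produced
birds$Region.Label <- "everything"

# Assumed units: distances in metres, effort in kilometres, area in sq km
if (requireNamespace("Distance", quietly = TRUE)) {
  cfac <- Distance::convert_units("meter", "kilometer", "square kilometer")
}
```

The resulting cfac would then be passed to ds() through its convert_units argument, as in the wren example earlier in the thread.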

The detail also flags up inconsistencies at the stratum-specific level. For example, for the Camargue stratum, CV(abundance of individuals) (0.173) < CV(abundance of groups) (0.209); that should not occur either.

If you are willing, perhaps share this data set and code for fitting the half normal detection function with log(group size) covariate with me off-list and I will give it further attention.



Arthur Jacquemin

Mar 14, 2022, 1:30:26 PM
to distance-sampling
Hello Eric
Did you receive my email with the data?
I can't find any trace of it, and I wonder whether it was actually sent.

Have a nice day

Eric Rexstad

Mar 14, 2022, 2:44:20 PM
to Arthur Jacquemin, distance-sampling
No Arthur, I did not receive an email with your data.  Send it along to my personal email address, not to the list.
