Calculating derived statistics as in BPA 2012 seems wrong?

412 views
Skip to first unread message

Tomas T.

unread,
Feb 20, 2025, 7:43:50 AMFeb 20
to hmecology: Hierarchical Modeling in Ecology

Hello all,

suppose you want to calculate correlation on two variables estimated by a model fitted in bayesian software. For example, a model that calculated two demographic parameters over years, and you have the posterior samples. How to do this?

Until now, I heavily used the method recommended in Bayesian Population Analysis using WinBUGS (Kery & Schaub 2012, code for Fig. 11-8 (Web appendix 1)), which calculates the pearson correlation coefficient for each posterior sample and then uses that as posterior distribution of the correlation. I thought this method is neat. Only now I realized this method is probably wrong?! It does take into account the uncertainty of the estimated parameters, but it completely ignores the uncertainty of the pearson correlation coefficient itself. So, if you have relatively precise estimates of the parameters, it can make very weak correlations appear very strong! I will illustrate it by an example below.

For me this realization is relatively serious, as I already used it in a published paper. Perhaps this is also case of the cited paper Schaub et al (2011)? I understand mistakes like this happen, I make them myself, but I think it would deserve some big red warning so that no more people make this mistake.

Have you used this method yourself? Or are you using a different one? How do you solve this? Below, I also proposed a remedy to this - instead of taking just the mean estimate of pearson, sample from its error distribution. I am not 100% sure this is statistically correct though.

Please let me know!

Best regards,

Tomas

PS: Here is a reproducible code that demonstrates the problem:

library(mvtnorm)

### generate two correlated vectors - true values
set.seed(123)
n <- 15
r <- 0.25 # pearson r
X <- rmvnorm(n, sigma = matrix(c(1, r, r, 1), 2))

cor.test(X[,1], X[,2])
# r = 0.1944498
# you see the CI is very wide: -0.3529724 to 0.6426943
# which means very weak correlation

### simulate the "MCMC samples" of these two vectors, with small uncertainty
samp <- 5000
A <- matrix(rnorm(n*samp, X[,1], 0.05), samp, byrow = TRUE)
B <- matrix(rnorm(n*samp, X[,2], 0.05), samp, byrow = TRUE)


### Calculate correlation coefficients as in BPA 2012, Code for Fig. 11-8 (Web appendix 1)
correl <- c()
for (i in 1:samp) {
correl[i] <- cor(A[i,], B[i,])
}
# Credible intervals of correlation coefficients
quantile(correl, c(0.05, 0.5, 0.95))

# the resultant CI is very narrow - i.e. the correlation suddenly seems super stong:
#        5%       50%       95%
# 0.1628700 0.1936207 0.2257177
#
# This is definitely wrong, as we know that the true correlation is very weak!




####################################################
#  Could this be a possible remedy? I am not sure:
#  to sample not just mean value, but from the probability distribution of each statistics
#  (it is also an approximation, since the distribution of Pearson is not exactly normal, see
# https://en.wikipedia.org/wiki/Pearson_correlation_coefficient#Using_the_exact_distribution

correl2 <- c()
for (i in 1:samp) {
st <- cor.test(A[i,], B[i,])
correl2[i] <- rnorm(1, st$estimate, diff(st$conf.int)/1.96/2)
}
quantile(correl2, c(0.05, 0.5, 0.95))

Matthijs Hollanders

unread,
Feb 20, 2025, 11:46:38 AMFeb 20
to hmecology: Hierarchical Modeling in Ecology
You're simulating very little random noise around your simulated X with your "MCMC samples", which is what's reflected in your "CI". Btw, if you want to directly estimate correlations, you can model the two (or more) time-varying demographic parameters as multivariate normal and estimate the correlation coefficient(s). 

Tomas T.

unread,
Feb 20, 2025, 12:36:17 PMFeb 20
to Matthijs Hollanders, hmecology: Hierarchical Modeling in Ecology

Dear Matthijs,

answering below:

On Thu, 20 Feb 2025 at 17:46, Matthijs Hollanders <matthijs....@gmail.com> wrote:
You're simulating very little random noise around your simulated X with your "MCMC samples", which is what's reflected in your "CI".

Of course, that was the intention and that's the point, to provide an example that would clearly demonstrate that the method is in principle wrong. Doesn't matter if it is realistic or not. It clearly demonstrates that the method ignores the variance of the correlation statistics, which leads to wrong inference.
 
Btw, if you want to directly estimate correlations, you can model the two (or more) time-varying demographic parameters as multivariate normal and estimate the correlation coefficient(s). 

True, you could do that, then, if you have e.g. 4 parameters, you would have to estimate the whole correlation matrix... I am not sure how much it would slow down the model fit though. And what would the change of the model mean - perhaps also the estimates might be different then?

Regards,

Tomas
 
--
*** Three hierarchical modeling email lists ***
(1) unmarked: for questions specific to the R package unmarked
(2) SCR: for design and Bayesian or non-bayesian analysis of spatial capture-recapture
(3) HMecology (this list): for everything else, especially material covered in the books by Royle & Dorazio (2008), Kéry & Schaub (2012), Kéry & Royle (2016, 2021) and Schaub & Kéry (2022)
---
You received this message because you are subscribed to the Google Groups "hmecology: Hierarchical Modeling in Ecology" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hmecology+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/hmecology/20b2720e-f495-4de3-b260-5cd6ee98b1fcn%40googlegroups.com.

Matthijs Hollanders

unread,
Feb 20, 2025, 4:03:38 PMFeb 20
to Tomas T., hmecology: Hierarchical Modeling in Ecology
Hey Tomas,

You're not interested in the frequentist 95% CI around the correlation coefficient of your sample, in this case the 2 vectors of length 15 that you simulated. The correlation coefficient is just a single value that you're computing for each MCMC sample from your posterior. Therefore, when you do this for each MCMC sample, you have draws from the posterior of the correlation between your time varying parameters.

Re: the correlation matrix, I don't think it would slow down too much. If you just model the varying rates as univariate normal distributions, this implies a prior distribution that there's 0 correlation between the parameters. If you model it multivariate normal, you can still express an uninformative prior over the correlation matrix, but at least your prior is that the rates may be correlated.

Hope this helps.

Matt

Tomas T.

unread,
Feb 21, 2025, 6:04:46 AMFeb 21
to Matthijs Hollanders, hmecology: Hierarchical Modeling in Ecology

Hi Matthijs,

You're not interested in the frequentist 95% CI around the correlation coefficient of your sample, in this case the 2 vectors of length 15 that you simulated. The correlation coefficient is just a single value that you're computing for each MCMC sample from your posterior.

No, wrong. Most of the people (that includes the cited references, and also other studies which used this method) are not interested just in a single value of the correlation coefficient. Most people are interested in inference, that is, the probability that the correlation coefficient is positive/negative/different from zero, which is what you can do with the CI of the correlation coefficient returned by cor.test() - and which you aim to do with your resultant posterior distribution. Or, practically equivalently, in frequentistic world, people are interested in p-value - what is the probability that they mistakenly reject the null hypothesis (correlation equals to zero) when it is true.
 
Therefore, when you do this for each MCMC sample, you have draws from the posterior of the correlation between your time varying parameters.

No. Using this method, you have posterior of the mean estimate of the correlation coefficient, which is useless, because you cannot use it for inference of the probability that the correlation coefficient is positive/negative/different from zero. Which is, again, what the cited references and other studies using this method aim to do.

The example I provided should clearly demonstrate how this method could lead to wrong inference and thus wrong conclusions.

Tomas

John C

unread,
Feb 21, 2025, 10:34:48 AMFeb 21
to hmecology: Hierarchical Modeling in Ecology
Hi Tomas,

I'm not sure the example you present is quite analogous to the situations where the technique is used. If I followed the code correctly, it first generates a small data set based on some known variance/covariance parameters, and then creates 5000 new datasets (based on different variance/covariance parameters). This is a little distinct from the focus of the question, but if you simulate 5000 datasets following X[,,i] <- rmvnorm(n, sigma = matrix(c(1, r, r, 1), 2)) and look at the distribution of those correlations, the quantiles are of comparable width to the CI returned by cor.test on the initial X.

But anyway, what the technique is used for in practice is estimating correlations between latent things (parameters, discrete latent quantities, etc.). I think a test that might better assess this would be to:
a) generate a set of parameters (with some known correlation) and quickly estimate their correlation/CI using cortest; 
b) generate data arising from these parameters (and maybe other nuisance things); 
c) fit an MCMC model to the data to estimate the parameters; 
d) then use the technique in question and compare the estimated correlation/CI to the generating correlation and the estimate from a). 
Then repeat many times. 

It's difficult for me to see how d) would generally be more precise than a) where the set of parameters are actually known. But maybe certain priors or certain weakly-identifiable models and similar situations might lead to some cases where d) and a) look different.

Best,

John

Matthijs Hollanders

unread,
Feb 21, 2025, 1:48:24 PMFeb 21
to Tomas T., hmecology: Hierarchical Modeling in Ecology
Hey Tomas,

I think I understand what you're saying. I believe one way of framing it is that you don't think the posterior distribution over the empirical correlation coefficient (i.e., computing the coefficient for each posterior draw) gives you adequate uncertainty over the correlation. This may be correct, but I'm not sure; in some ways it is a hacky solution. But like I said, if you are interested in formally modeling the correlation, assuming your rates (or link-function transformed versions thereof) are multivariate normal distributed will let you actually estimate those correlations, albeit under that assumed parametric model.

What I would suggest is simulating bivariate normal data, then analysing those in (1) a bivariate normal model where you estimate the correlation as a parameter (i.e., the exact data-generating process) and (2) a model with two univariate normal parameters and then estimating the correlation coefficient in the method described in BPA. That would be a better way to test this issue compared to the simulation you provided. This is roughly what John has suggested as well, I believe.

If you do end up doing this, I'd be keen to see the results.

Cheers,

Matt

hender...@gmail.com

unread,
Feb 21, 2025, 2:44:33 PMFeb 21
to hmecology: Hierarchical Modeling in Ecology
Hi all,

Tomas, this is an interesting point you’ve brought up. I thought I'd throw in my 2 cents.

If i'm correct, your example reflects the extremely small measurement error (sd = 0.05) assigned to the 15 data points. If you check the standard deviation of each point’s values across your 5000 MCMC draws (column-wise SDs of ~0.048–0.052), you’ll find almost no variation from one draw to another. In other words, if the model believes each point is pinned down with minimal measurement error, it will naturally infer that the correlation implied by those 15 points is quite certain.

Frequentistically, cor.test() on 15 data points yields a wide confidence interval for the sample correlation i.e., “if we repeatedly collected new datasets of size 15, how variable would the estimated correlation be?” By contrast, a Bayesian approach (like Kéry & Schaub) asks, “Given the model, data, and pzriors, how plausible are different correlation values?” If the model is told there’s almost no uncertainty in each observation, it makes sense how it could produce narrowly distributed posterior for the correlation, even with just 15 points.

I don't think your toy example necessarily show that the BPA 2012 approach is wrong. Instead, i think it just highlights that extremely small measurement error leads to overly confident posterior inferences. Once you allow more realistic measurement error (or other relevant variance components), the distribution of correlation values will widen and better reflect the true uncertainty.

RE: your concern about p‐values and the probability that a correlation is positive or negative. A Bayesian workflow does let you determine how often the posterior distribution of the correlation falls above/below zero. Unlike a single “mean correlation,” you get a distribution of correlation values —one for each posterior draw- which directly answers whether the correlation is likely positive, negative, or near zero. There’s no need for a separate “SE of correlation” inflation step, because the Bayesian draws already capture all uncertainty (provided the model itself is realistic..).

Finally, I like Matt's comment suggesting a more thorough simulation study!

Nick

John C

unread,
Feb 21, 2025, 6:24:48 PMFeb 21
to hmecology: Hierarchical Modeling in Ecology
Oh,  I see what Tomas means just playing around with this quickly. I've repeated the following a few times with different nData and so forth.  Yeah, seems like one should probably should use the correlation across the posterior of posterior-predicted parameters. Kind of like a true Monte Carlo assessment or parametric bootstrap or something.

John

###Start by simulating "parameters"

set.seed(8675309)
np <- 15

r <- 0.25 # pearson r
X <- rmvnorm(n, sigma = matrix(c(1, r, r, 1), 2))
X2<-plogis(X)

cor.test(X[,1], X[,2]) ###0.033, 95 %CI -0.49 - 0.54
cor.test(X2[,1], X2[,2]) ###0.042, 95 %CI -0.48 - 0.54

###Let's say X2[, 1] is survival probability,
###and X2[, 2] is...the probability of a successful nest or something.

nData<-1000
###we'll make the simplifying assumption that in every year,
###we observe an independent set of 100 birds that survive or do not
###and have a successful nest or not.

Data1<-matrix(NA, nData, np)
Data2<-matrix(NA, nData, np)

for (i in 1:np){
  Data1[, i]<-rbinom(nData, 1, X2[i, 1]) ##survived or not
  Data2[, i]<-rbinom(nData, 1, X2[i, 2]) ##nest or not
}

library(nimble)
blah<-nimbleCode({
  mu.phi~dlogis(0, 1)
  sigma.phi~dunif(0, 3)
  mu.gamma~dlogis(0, 1)
  sigma.gamma~dunif(0, 3)
 
  for (i in 1:15){
    logit(phi[i])~dnorm(mu.phi, sd=sigma.phi)
    logit(gamma[i])~dnorm(mu.gamma, sd=sigma.gamma)
    logit(simphi[i])~dnorm(mu.phi, sd=sigma.phi)
    logit(simgamma[i])~dnorm(mu.gamma, sd=sigma.gamma)
    for (j in 1:1000){
      Data1[j, i]~dbern(phi[i])
      Data2[j, i]~dbern(gamma[i])
    }
  }
})

Data=list(Data1=Data1, Data2=Data2)

Sim2<- nimbleModel(code = blah, name = "blah", constants = NULL,
                   data = Data, inits = NULL, calculate=FALSE)

SimMCMC2<-configureMCMC(Sim2, monitors=c("phi", "gamma", "simphi", "simgamma"), useConjugacy = FALSE)

SimRMCMC2<-buildMCMC(SimMCMC2)

SimCMCMMC2<-compileNimble(SimRMCMC2,Sim2)

samples<-runMCMC(mcmc=SimCMCMMC2$SimRMCMC2, niter=20000, nburnin = 10000, thin=20, nchains=1, samplesAsCodaMCMC=FALSE)

corMCMC<-rep(NA, 500)
for (i in 1:500){ ###number of stored MCMC samples
  corMCMC[i]<-cor.test(samples[i, 1:15], samples[i, 16:30])$estimate
}

quantile(corMCMC, probs=c(.025, .5, .975)) ###too narrow

corMCMC2<-rep(NA, 500)
for (i in 1:500){ ###number of stored MCMC samples
  corMCMC2[i]<-cor.test(samples[i, 31:45], samples[i, 46:60])$estimate
}

quantile(corMCMC2, probs=c(.025, .5, .975)) ###about right.

John Clare

unread,
Feb 21, 2025, 6:35:00 PMFeb 21
to John Clare, hmec...@googlegroups.com

Oh, I suppose the distinction is between finite sample inference on the correlation between parameters (the parameters were correlated in these years with data) and otherwise (over many many years not yet sampled we expect the parameters will be correlated as…).

On Feb 21, 2025, at 4:25 PM, 'John C' via hmecology: Hierarchical Modeling in Ecology <hmec...@googlegroups.com> wrote:

Oh,  I see what Tomas means just playing around with this quickly. I've repeated the following a few times with different nData and so forth.  Yeah, seems like one should probably should use the correlation across the posterior of posterior-predicted parameters. Kind of like a true Monte Carlo assessment or parametric bootstrap or something.

Manuel Marton Marc KERY

unread,
Feb 26, 2025, 12:52:17 PMFeb 26
to Matthijs Hollanders, Tomas T., hmecology: Hierarchical Modeling in Ecology
Dear Thomas,

thanks for your email. I am convinced that all is right in the example in the BPA, and something is wrong in your numerical example.

One of the greatest things about a Bayesian analysis with MCMC is the ease with which we can propagate errors into derived quantities: for each posterior draw of the constituents of such a function of one or more parameters, we simply apply that function, and out comes the posterior distribution of that function. This can then be summarized in the usual way, to get a point estimate and an uncertainty assessment with post SD or percentiles. Therefore, taking the posterior draws of two sets of parameters, such as juv and ad survival, and calculating the correlation for each draw of the two vectors, is a valid way of assessing the correlation between them. (Though a more elegant and rigorous perhaps would be to specify a multivariate normal for both and thereby directly estimate the correlation as a parameter, rather than as a derived quantity.)

The problem with your attempt to provide a simple analogy seems to be the "small uncertainty" that you assume around each point. Choosing such a very high precision as you do will completely drive the result that you get. If you make that "small uncertainty" ten times less small, then you get this posterior summary for the correlation:

        5%        50%        95% 
-0.1160775  0.1575047  0.4185037

Most of this has already been said by Matt (and perhaps better and more concisely 🙂...).

Best regards  --- Marc



De: hmec...@googlegroups.com <hmec...@googlegroups.com> en nombre de Matthijs Hollanders <matthijs....@gmail.com>
Enviado: jueves, 20 de febrero de 2025 22:03
Para: Tomas T. <tomas.t...@gmail.com>
Cc: hmecology: Hierarchical Modeling in Ecology <hmec...@googlegroups.com>
Asunto: Re: Calculating derived statistics as in BPA 2012 seems wrong?
 

Jim Baldwin

unread,
Feb 28, 2025, 11:02:08 AMFeb 28
to Tomas T., hmecology: Hierarchical Modeling in Ecology
I'm guessing that you've asked this question at https://stats.stackexchange.com/questions/661840/how-to-calculate-correlation-from-two-vectors-taking-into-account-their-sd-in?noredirect=1#comment1246882_661840 ?

Maybe that content provides some clarification to the issue you have.

Jim


--
*** Three hierarchical modeling email lists ***
(1) unmarked: for questions specific to the R package unmarked
(2) SCR: for design and Bayesian or non-bayesian analysis of spatial capture-recapture
(3) HMecology (this list): for everything else, especially material covered in the books by Royle & Dorazio (2008), Kéry & Schaub (2012), Kéry & Royle (2016, 2021) and Schaub & Kéry (2022)
---
You received this message because you are subscribed to the Google Groups "hmecology: Hierarchical Modeling in Ecology" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hmecology+...@googlegroups.com.

Tomas T.

unread,
Feb 28, 2025, 11:12:42 AMFeb 28
to Jim Baldwin, hmecology: Hierarchical Modeling in Ecology
Yes, this is my question. I asked this to see if it is possible to use metafor package to do a correlation, where you take the uncertainty of the input vectors into account (some kind of meta-analysis). I also wrote to the author of the package, so far no response. I wanted to add the metafor package to my simulation comparison study. But so far it seems like metafor package can't do it.

BTW, it's not "an issue I have". I am pointing out an issue in the BPA 2012 method.

Tomas

Jim Baldwin

unread,
Feb 28, 2025, 11:59:17 AMFeb 28
to Tomas T., hmecology: Hierarchical Modeling in Ecology
I'm all for asking the same question simultaneously on multiple forums but to be fair/respectful to the experts trying to answer those questions, they need to know what other input and even answers that have already occurred.  Even a suggestion from one forum might help someone in another forum get you an answer.

Jim

Tomas T.

unread,
Feb 28, 2025, 12:27:48 PMFeb 28
to Jim Baldwin, hmecology: Hierarchical Modeling in Ecology
Perhaps we have different perspectives, but I fail to see how asking on multiple forums without cross-linking is not fair/respectful. On the contrary - this is clearly a very confusing topic, so I think it is even better to not cross-link and ask independently, with a clean slate, to avoid complications and confusion.

Tomas

Michael Schaub

unread,
Mar 1, 2025, 4:37:29 PMMar 1
to Tomas T., Jim Baldwin, hmecology: Hierarchical Modeling in Ecology

Hi Tomas,

 

I think it is important to consider the context in which we have used the correlation before suggesting that BPA 2012 is wrong. We fitted an integrated population model and then did a retrospective population analysis to understand the demographic drivers in the population. To do this, we calculated the correlation coefficient between the estimated population growth rate and the estimated demographic rates. Since the population growth rate and the demographic rates had estimation errors, we calculated the posterior distribution of the correlation coefficient. Thus, instead of having a single value, we have a distribution that takes into account the uncertainty in the values used to calculate the correlation coefficient. This is useful, for example, to compare the correlation coefficient between growth rate and survival with that between growth rate and productivity, and to test whether one is greater than the other. Or to test whether the correlation coefficient differs from some fixed value. This limits the inference to the case study, i.e. we can say retrospectively what is the probability that one correlation coefficient was greater than the other, given the estimation uncertainty. If there were no estimation uncertainty, there would be no uncertainty in the correlation coefficients. One would be greater than the other. That is all. We are not making any inferences about other hypothetical populations, so we do not need such a confidence interval, which you can get from (e.g.) cor.test. Without this context, we are comparing apples and oranges.

 

By the way, another and more formal way of doing a retrospective analysis is the 'life table response experiment', one variant of which is a variance decomposition of the population growth rate (Koons et al. 2016, DOI: 10.1111/ele.12628).

 

Kind regards

 

Michael

Tomas T.

unread,
Mar 12, 2025, 10:16:22 AMMar 12
to Michael Schaub, Jim Baldwin, hmecology: Hierarchical Modeling in Ecology
Dear Michael, dear all,

thanks for your response, sorry for taking some time to respond.

I see contradiction in your answer. You speak about specific context which implies that you don't need to consider the uncertainty from (e.g.) cor.test, which is given by number of years over which you correlating. And yet you say that you "calculated posterior distribution of the correlation coefficient". I don't see anything context-specific on the definition of posterior distribution. The posterior is simply a distribution of the true value of the parameter (correlation coefficient) of a process that generated your data, in this case the correlated demographic variables. And this is how you interpret your results, in your example in BPA Fig 11-8 and the cited paper (DOI: 10.1007/s00442-011-2070-5) - you study the correlation as a parameter of a process that generated the population growth and e.g. survival in your population (e.g. "This indicates that demographic components impacting the arrival of new individuals into the populations were more important for their dynamics than demographic components affecting the loss of individuals."). Thus, both the definition of the posterior, as well as your inference, look at your data as one of possible outcomes of that process that generated the demographic parameters.

So, you say to have calculated the posterior distribution, which, as far as I know, has a clear and unambiguous definition. But I think I have clearly demonstrated that the distribution the BPA 2012 method is generating is clearly not the posterior distribution of the correlation coefficient. Did you have a chance to look at the simualtion study I sent in the other thread? (https://groups.google.com/g/hmecology/c/ORy4x8HQLmM/m/szBUY1KgAwAJ). Attaching here again. For example, how would you explain that the CI from the post hoc BPA method is way narrower than when you estimate the correlation directly in the model? See Figure 4 and compare "BPA2012" with "model_est". This figure shows that the BPA method produces way narrower CIs than the other methods, including when you correlate the true values of the demographic parameters. And the Figure 1, showing the coverage of 95% CIs, confirms that whereas the other methods are more or less the same and fine, having approx. 95% of the CIs covering the true value, the BPA method is having way low coverage, which is even worse with more precise estimates. Thus, the Figure 1 shows that the distribution estimated by the BPA method is way narrower than the posterior distribution of the correlation coefficient.

There are two components of uncertainty here - one is given by the number of individuals in each year, which drives the precision of the demographic estimates ("n_ind" in the attached study), and the other is given by the number of years. The BPA method completely ignores the uncertainty given by the number of years. Imagine you have only say 5 years. You cannot really say anything about correlation in that case, yet the BPA method could give you seemingly very precise "posteriors" of the correlation coefficient.

I have provided the evidence that the method doesn't work - with simulations, as you and Marc taught us. Do you have evidence that it works?

Best regards,

Tomas

PS: I was aware of the LTRE method, thanks :-) Seemed to me quite complicated though, so I went for correlations as they are way simpler. But indeed, LTRE is probably better and more sophisticated way to do it.

correlation on posterior distributions-v2.docx
correlation-on-posterior-samples6.R
an2.R

Matthijs Hollanders

unread,
Mar 23, 2025, 12:52:06 AMMar 23
to hmecology: Hierarchical Modeling in Ecology
I completely agree with Tomas that the BPA method does not give an estimate of the correlation coefficient between sets of parameters, as it ignores the "sample size" of the two components where the correlation is computed (i.e., number of years). Even though I think it's technically correct to say that the BPA methods gives some sort of posterior distribution over the empirical correlation between the vectors, I think this actually doesn't have any good statistical interpretation (but maybe it does?). I didn't look all that closely at Tomas's proposed method as I think it's bending over backwards to just fit the appropriate model in the first place, i.e., with a correlation matrix. However, I can see why a posthoc method would be useful if a model took a long time to run. Since ignoring the "sample size" of the correlation coefficient seems to yield overly precise estimates of the correlation, I think it's probably appropriate to update the BPA errata? And hopefully the method hasn't resulted in overconfident conclusions in the literature that have employed this method. Nice catch, Tomas!

Marc Kery

unread,
Mar 27, 2025, 7:06:26 AMMar 27
to hmecology: Hierarchical Modeling in Ecology
Dear Tomas and Matt, all,

OK, after the vacations, we give another try at mentally penetrating the issue and will report back to the group.

Best regards  ---- M & M




From: hmec...@googlegroups.com <hmec...@googlegroups.com> on behalf of Matthijs Hollanders <matthijs....@gmail.com>
Sent: Sunday, March 23, 2025 05:52
To: hmecology: Hierarchical Modeling in Ecology <hmec...@googlegroups.com>
Subject: Re: Calculating derived statistics as in BPA 2012 seems wrong?
 

Tim Lyons

unread,
Apr 11, 2025, 5:31:01 PMApr 11
to hmecology: Hierarchical Modeling in Ecology
Just curious if there is an update?

And to help muddy the waters further:

Explicitly estimating the correlation between parameters sounds like what is done typically when you're looking at the correlation  between structural parameters in a model. Example that comes to mind is in the MARK "Gentle Introduction" appendices. But in those cases, it's the correlation among parameters where the same data is estimating multiple parameters (like S/f in a Brownie recovery or f/Phi in a Pradel model). The MARK examples talks about "doing statistics on statistics" as problematic because of sampling covariance of the parameters.  Does the fact that in the IPM, each data set is only estimating a constituent part matter, or is it still doing "statistics on statistics"?  Am I way off here thinking there is any kind of conceptual relationship here?

Marc Kery

unread,
Apr 12, 2025, 1:19:37 AMApr 12
to Tim Lyons, hmecology: Hierarchical Modeling in Ecology
Dear Tim,

haven't had the time yet, sorry.

Re. your further point(s). First, I don't think it's relevant whether the parameters are estimated from exactly the same data or from multiple data sets with only partly or non-overlapping information content on the parameters. And second, the point about stats on stats is problematic only in analyses with maximum likelihood. In contrast, with Bayesian MCMC-based inference, uncertainty is propagated into all estimands in an analysis in a trivially easy manner --- that's one of the main selling points of MCMC (and therefore, of Bayesian analyses) in fact.

Best regards  --- Marc


Sent: Friday, April 11, 2025 23:31

Matthijs Hollanders

unread,
Apr 12, 2025, 5:55:21 PMApr 12
to Tim Lyons, hmecology: Hierarchical Modeling in Ecology
Hey Tim and others,

I think I thought of a good example of why the empirical correlation coefficient is inappropriate for the modeling task. 

Consider we have 10 groups (sites, surveys) and we want to model them hierarchically. We’d probably assume they come from a normal distribution on the scale of the link function and estimate the mean and SD as well as the group-level realisations. But now imagine you modeled the 10 groups as fixed effects instead, with independent priors. You could still compute an empirical mean and SD of the group-level parameters. But these are not the same as the mean and SD of the underlying distribution. As Tomas has pointed out, we don’t account for the sample size in these empirical computations. 

Matt

Reply all
Reply to author
Forward
0 new messages