Hi
I have compared the implementations of the R_hat (Gelman-Rubin) convergence statistic in kabuki (
https://github.com/hddm-devs/kabuki) and pymc. It looks like both have errors, so I also wrote my own implementation and compared all three on a few runs.
After correcting the errors I found and taking the square root of the return value of the pymc implementation, all three implementations gave the same results.
Kabuki seems to have the number of chains and the number of samples mixed up (see below). The pymc implementation matches the one in kabuki except for one part: if "V = s2 + B_over_n/m" is skipped, all three implementations produce equal results.
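For reference, these are the formulas my implementation computes, for m chains of n samples each, where s_j^2 is the sample variance of chain j, \bar{x}_j its mean, and \bar{x} the grand mean:

    W = \frac{1}{m} \sum_{j=1}^{m} s_j^2, \qquad
    B = \frac{n}{m-1} \sum_{j=1}^{m} (\bar{x}_j - \bar{x})^2

    \hat{V} = \frac{n-1}{n} W + \frac{1}{n} B, \qquad
    \hat{R} = \sqrt{\hat{V} / W}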
Does anyone have any experience using these?
Below is the code I used for comparing the different implementations. The comments are mostly mine, but the actual calculations are from the source code (except for the expressions marked # CORRECTED).
##########################################
import numpy as np

# kabuki/analyze.py, line 285 ("def R_hat")
def R_hat_kabuki(samples):
    #n, num_chains = samples.shape  # n = num_samples
    num_chains, n = samples.shape  # CORRECTED
    chain_means = np.mean(samples, axis=1)
    # Calculate between-sequence variance
    between_var = n * np.var(chain_means, ddof=1)
    chain_var = np.var(samples, axis=1, ddof=1)
    within_var = np.mean(chain_var)  # OK (= pymc W)
    marg_post_var = ((n - 1.) / n) * within_var + (1. / n) * between_var  # = pymc s2
    R_hat_sqrt = np.sqrt(marg_post_var / within_var)
    return R_hat_sqrt
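# Quick illustration of the shape mixup above (toy shapes of my choosing):
# with 4 chains of 1000 samples each, samples.shape == (4, 1000), so the
# original "n, num_chains = samples.shape" would set n = 4 and
# num_chains = 1000, i.e. chains and samples end up swapped.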
# pymc/diagnostics.py, line 450 ("def gelman_rubin")
def R_hat_pymc(x):
    if np.shape(x) < (2,):
        raise ValueError('Gelman-Rubin diagnostic requires multiple chains.')
    try:
        m, n = np.shape(x)
    except ValueError:
        # the original calls itself ("gelman_rubin"); renamed here so it runs
        return [R_hat_pymc(np.transpose(y)) for y in np.transpose(x)]
    # Calculate between-chain variance
    B_over_n = np.sum((np.mean(x, 1) - np.mean(x))**2) / (m - 1)
    # Calculate within-chain variances
    W = np.sum([(x[i] - xbar)**2 for i, xbar in enumerate(np.mean(x, 1))]) / (m * (n - 1))  # OK (= kabuki)
    # (over)estimate of variance
    s2 = W * (n - 1) / n + B_over_n  # = marg_post_var
    # Pooled posterior variance estimate
    #V = s2 + B_over_n/m
    V = s2  # CORRECTED
    # Calculate PSRF
    R = V / W
    return R
def R_hat_my_imp(samples):
    m, n = np.shape(samples)  # m = chains, n = samples
    # Variance of each chain (degrees of freedom = n - ddof)
    chain_var = np.var(samples, axis=1, ddof=1)
    # Within-chain variance (mean of the per-chain variances)
    W = 1. / m * np.sum(chain_var)
    # Chain means
    chain_means = np.mean(samples, axis=1)
    # Variance of the chain means (all chains have the same length,
    # so no weighting is needed)
    chain_means_var = np.var(chain_means, ddof=1)
    # Between-chain variance
    B = n * chain_means_var
    # Weighted average of within- and between-chain variance
    # (marginal posterior variance)
    Var_hat = (float(n - 1) / n) * W + B / n
    # Potential scale reduction factor
    R_hat = np.sqrt(Var_hat / W)
    return R_hat
##########################################
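For completeness, this is roughly how I ran the comparison (a minimal sketch: the seed, the number of chains and samples, and the per-chain offsets are arbitrary choices of mine; it assumes the three functions above are defined):

import numpy as np

np.random.seed(42)
# Four chains of 1000 draws each; rows are chains, columns are samples.
# Small per-chain offsets so R_hat ends up slightly above 1.
samples = np.random.randn(4, 1000) + 0.1 * np.arange(4)[:, None]

print(R_hat_kabuki(samples))         # corrected kabuki, returns sqrt(V/W)
print(np.sqrt(R_hat_pymc(samples)))  # corrected pymc returns V/W, so take the root
print(R_hat_my_imp(samples))
# With the corrections above, all three lines print the same value.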
Does my implementation of the Gelman-Rubin statistic look right?
Does anyone know what the extra B_over_n/m term that is added to the estimated variance in the pymc implementation is? Is it from a slightly different version of the Gelman-Rubin statistic, or is it simply a coding/implementation error? (I will also post this to the pymc forum.)
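For what it's worth, writing out what the uncorrected pymc code computes (before taking any square root) gives V/W with

    V_{\text{pymc}} = \frac{n-1}{n} W + \left(1 + \frac{1}{m}\right) \frac{B}{n}
                    = \hat{V} + \frac{B}{m n}

so the difference from the \hat{V} above is exactly that extra B/(mn) term.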
Thanks,
Øystein