Hi dadi-users!
Apologies if this is a very simple question, but I wanted to check if I've done the LRT for nested models correctly given two slightly different situations of a parameter on the boundary vs the interior of the parameter space.
For comparing the 2D models no_mig and sym_mig, I set m = 0 in the simple model and used (0.5,0.5) weights for a single parameter on the boundary (as in the example here:
https://dadi.readthedocs.io/en/latest/examples/YRI_CEU/YRI_CEU/). But when doing the LRT for sym_mig vs asym_mig, I've set m21 and m12 to be the same optimized value in the simple model i.e., m (sym_mig) before running it through the complex model function (asym_mig) and used weights (0,1) since the parameter is on the interior - is this correct? (code below). Also, if it's better to use the optimized params for the complex model in
dadi.Godambe.LRT_adjust- how would I do this for sym_mig vs asym_mig?
LRT for 'no_mig' and 'sym_mig' models:
data = dadi.Spectrum.from_file(AI_EP_2d_linked_sfs_folded.fs)
ns = data.sample_sizes
pts_l = [max(ns)+20, max(ns)+30, max(ns)+40]
func_ex_simple = dadi.Numerics.make_extrap_log_func(dadi.Demographics2D.no_mig)
func_ex_complex = dadi.Numerics.make_extrap_log_func(dadi.Demographics2D.sym_mig)
# Optimized parameters for simple model (nu1, nu2, T)
p0_simple = [1.24, 1.23, 0.36]
# Optimized parameters for the complex model (nu1, nu2, m, T)
p0_complex = [1.85, 1.80, 0.20, 1.27]
# Buffered parameters for the simple->complex model (set m = 0)
p0_simple_buffered =
[1.24, 1.23, 0, 0.36]
nested_indices = [2]
# Log-likelihood for simple model with optimized parameters
model_fit_simple = func_ex_simple(p0_simple, ns, pts_l)
ll_simple = dadi.Inference.ll_multinom(model_fit_simple, data)
# Log-likelihood for complex model with optimized parameters
model_fit_complex = func_ex_simple(p0_complex, ns, pts_l)
ll_complex = dadi.Inference.ll_multinom(model_fit_complex, data)
# Calculate adjusted D (using optimized params for simple model)
adj = dadi.Godambe.LRT_adjust(func_ex_complex, pts_l, boot_dir, p0_simple_buffered, data, nested_indices, multinom=True)
D_adj = adj*2*(ll_complex - ll_simple)
weights = (0.5,0.5) # For 1 nested parameter on boundary of parameter space (m = 0)
pval = dadi.Godambe.sum_chi2_ppf(D_adj, weights)
LRT for 'sym_mig' and 'asym_mig' models:
data = dadi.Spectrum.from_file(AI_EP_2d_linked_sfs_folded.fs)
ns = data.sample_sizes
pts_l = [max(ns)+20, max(ns)+30, max(ns)+40]
func_ex_simple = dadi.Numerics.make_extrap_log_func(dadi.Demographics2D.sym_mig)
func_ex_complex = dadi.Numerics.make_extrap_log_func(dadi.Demographics2D.asym_mig)
# Optimized parameters for simple model (nu1, nu2, m, T)
p0_simple = [1.85, 1.80, 0.20, 1.27]
# Optimized parameters for the complex model (nu1, nu2, m12, m21, T)
p0_complex = [1.66, 1.96, 0.32, 0.09, 1.24]
# Buffered parameters for the simple->complex model (set m12 and m21 = m?)
p0_simple_buffered =
[1.85, 1.80, 0.20, 0.20, 1.27]
nested_indices = [3]
# Log-likelihood for simple model with optimized parameters
model_fit_simple = func_ex_simple(p0_simple, ns, pts_l)
ll_simple = dadi.Inference.ll_multinom(model_fit_simple, data)
# Log-likelihood for complex model with optimized parameters
model_fit_complex = func_ex_simple(p0_complex, ns, pts_l)
ll_complex = dadi.Inference.ll_multinom(model_fit_complex, data)
# Calculate adjusted D
(using optimized params for simple model)
adj = dadi.Godambe.LRT_adjust(func_ex_complex, pts_l, boot_dir, p0_simple_buffered, data, nested_indices, multinom=True)
D_adj = adj*2*(ll_complex - ll_simple)
weights = (0, 1) # For 1 nested parameter on interior of parameter space (m != 0).
pval = dadi.Godambe.sum_chi2_ppf(D_adj, weights)
Thank you so much!
Cheers,
Skye