Hi Ryan,
I am using dadi-cli to generate my model data (successfully, I hope!).
However, since statistics is quite new to me, I have some questions about
the results. It would be very helpful to get some rules of thumb on how to
handle the following:
Briefly, I am working with pseudo-diploidized, subsampled, composite RADseq
data from which paralogues have been removed. I have tested many models, and
most yield results similar to the one appended here.
- Uncertainty in theta: For all my parameters except theta, I get reasonable
confidence intervals (CIs). Theta, however, always has very wide CIs. Is this
acceptable? I understand theta is used primarily to convert the results to
real-world values (times, population sizes, etc.), but can I still trust the
rest of my demographic parameters? Is there an identifiable reason why theta
is consistently so uncertain? I have seen this behavior even with datasets
containing paralogues, as well as with projected and unlinked datasets.
- Step sizes: The confidence intervals for the parameters differ by an order
of magnitude depending on the step size used. Is this normal? Should they be
approximately the same, or is it okay to select one step size and present
that as the result?
- Small CIs: Sometimes the confidence intervals are so narrow that they seem
biologically unrealistic. Is that okay? I’ve heard this can be due to a
'mathematical collapse' when a likelihood peak is too sharp, but I am unsure
how to interpret this.
- Model comparison: Is there a way to compare models generated by dadi-cli on
composite data using their log-likelihoods? I know it should be possible to
use CLAIC for non-nested models and an LRT for nested models, but I cannot
find a way to perform these in dadi-cli. Can these be derived from the GIM
(Godambe Information Matrix) results? If dadi-cli doesn’t support this
directly, what is the best alternative?
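
To make the model comparison question concrete, here is a minimal sketch of the calculation as I currently understand it, in plain Python rather than dadi-cli. It assumes the GIM-based adjustment factor and the effective number of parameters (which I take to be the trace of J times the inverse of H) have already been computed elsewhere, perhaps with something like dadi.Godambe.LRT_adjust in the Python API if I am reading the dadi documentation correctly. It also assumes the nested models differ by a single parameter that is not on a boundary. All function names and numbers below are hypothetical placeholders, not my results.

```python
import math


def adjusted_lrt(ll_simple, ll_complex, adjust):
    """Adjusted likelihood-ratio statistic for two nested models.

    `adjust` is the correction factor for composite likelihoods,
    assumed here to have been derived from the GIM beforehand.
    """
    return adjust * 2.0 * (ll_complex - ll_simple)


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variable with 1 df."""
    if x <= 0:
        return 1.0
    # For 1 df, P(Z^2 > x) = erfc(sqrt(x / 2)) with Z standard normal.
    return math.erfc(math.sqrt(x / 2.0))


def claic(ll, effective_params):
    """Composite-likelihood AIC: -2 * ll + 2 * effective_params.

    `effective_params` is assumed to be trace(J * H^-1) from the GIM,
    which replaces the plain parameter count of the ordinary AIC.
    """
    return -2.0 * ll + 2.0 * effective_params


# Hypothetical placeholder values, for illustration only.
stat = adjusted_lrt(ll_simple=-1250.0, ll_complex=-1240.0, adjust=0.5)
p_value = chi2_sf_1df(stat)
print("adjusted LRT statistic:", stat, "p-value:", p_value)

# Lower CLAIC would indicate the preferred model.
print("CLAIC model A:", claic(-1240.0, 6.5))
print("CLAIC model B:", claic(-1245.0, 4.0))
```

If this is roughly the right idea, my question is mainly where the adjustment factor and the trace term should come from when working with dadi-cli output.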
Any suggestions or guidance would
be greatly appreciated.
Thank you in advance and all the best,
Hana