Dear BEAST users,
I am currently evaluating whether my RNA virus dataset contains sufficient temporal signal for divergence time estimation in BEAST v2.6.7.
As an initial screening step, I examined the dataset using TempEst. I am working under an uncorrelated lognormal relaxed clock (UCLD) model. To further assess temporal signal, I am considering a Bayesian evaluation of temporal signal (BETS) using path sampling. In addition, I performed a cluster randomization test using TipDatingBeast, following Duchêne et al. (2015), but did not detect significant temporal signal in that analysis.
I would be very grateful for your advice on the following two questions, both of which are important for interpreting my dataset:
I would greatly appreciate any guidance on how to think about these two issues, as well as any relevant references or practical recommendations.
Best regards,
So Noguchi
Dear Lambo,
Thank you very much for your reply.
Yes, I can share the TempEst root-to-tip regression plot. I have attached a screenshot of the current result.
At present, the regression appears weak, with a slope of 1.1735E-2, a correlation coefficient of 0.2801, and an R² of 0.0785. The dated tip range is 28.84 years.
I would greatly appreciate any general comments on whether this level of signal would make you cautious about relying on tip-dating, or whether additional formal testing such as BETS would still be worth exploring.
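For anyone following the thread, here is a minimal sketch of how the root-to-tip statistics quoted above relate to one another. It is an ordinary least-squares fit of root-to-tip distance on sampling date, not TempEst's own code, and the dates and distances in it are made-up placeholders rather than my data. The function name `root_to_tip_regression` is hypothetical.

```python
def root_to_tip_regression(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance on sampling date."""
    n = len(dates)
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    # Sums of squares and cross-products around the means.
    s_tt = sum((t - mean_t) ** 2 for t in dates)
    s_dd = sum((d - mean_d) ** 2 for d in distances)
    s_td = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    slope = s_td / s_tt                 # distance units per year
    intercept = mean_d - slope * mean_t
    r = s_td / (s_tt * s_dd) ** 0.5     # Pearson correlation coefficient
    return {
        "slope": slope,
        "x_intercept": -intercept / slope,  # date at which the fitted distance is zero
        "r": r,
        "r_squared": r ** 2,
    }


# Placeholder values for illustration only (NOT the real dataset).
dates = [1992.0, 1995.5, 2001.2, 2008.0, 2015.3, 2020.8]
distances = [0.10, 0.21, 0.18, 0.30, 0.27, 0.41]
print(root_to_tip_regression(dates, distances))

# Consistency check on the reported values: R^2 is the square of r.
print(round(0.2801 ** 2, 4))  # 0.0785
```

With a simple linear fit, R² is just the squared correlation coefficient, so the reported 0.2801 and 0.0785 are consistent with each other; an R² this low means sampling date explains under 8% of the variance in root-to-tip distance.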
Best regards,
So Noguchi
Hi Sou,
Thanks for sharing this; it is very helpful. The plot shows time points with many samples that span a wide range of divergence. Were these samples actually collected on those days, or are you using an arbitrary date for sequences where you might not have the full date?
Also, out of curiosity, which virus are you looking at? If the samples really were collected on the same days, that would be surprising, so I am guessing it is the latter situation, where you are using an arbitrary date. If that is the case, I would suggest trying TreeTime, which has functionality to infer the true date. Doing so might give you a better idea of your temporal signal, and in BEAST you can set an offset for the tip dates to deal with the uncertainty.
Best
Lambo
Hi Lambo,
Thank you very much for your helpful comments.
To clarify, the dates in my dataset are based on the actual collection dates of the samples, not arbitrary or imputed dates. In addition, the dataset was screened for recombination, and recombinant sequences were removed before these analyses.
I would prefer not to specify the virus species at this stage, but it is not a human virus.
Thank you also for your suggestions regarding TreeTime and tip-date uncertainty.
Best regards,
So Noguchi
Hi Lambo,
Thank you very much for your suggestions. I have already checked the residuals and the possible outlier in the mid-1990s sample.
My main question is still about the interpretation of the dating analyses:
(1) whether it is acceptable to perform BETS with an informative prior on ucld.mean, and
(2) how to interpret a case where cluster-based DRT is negative but BETS supports temporal structure.
I would greatly appreciate any thoughts you may have on these points.
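To make question (2) concrete, here is a minimal sketch of the decision rule I have in mind for the date-randomization test. It assumes the stricter criterion as I understand it from Duchêne et al. (2015): temporal signal is supported only if the 95% HPD of the rate from the real dates does not overlap the 95% HPD from any randomized replicate. The function name and all interval values are hypothetical placeholders, not output from TipDatingBeast or from my analyses.

```python
def drt_supports_signal(real_hpd, randomized_hpds):
    """Return True if the real-data rate HPD overlaps none of the randomized HPDs.

    real_hpd:        (lower, upper) 95% HPD of the clock rate with true dates
    randomized_hpds: list of (lower, upper) 95% HPDs from date-randomized replicates
    """
    real_lo, real_hi = real_hpd
    for rand_lo, rand_hi in randomized_hpds:
        # Two closed intervals overlap when each starts before the other ends.
        if real_lo <= rand_hi and rand_lo <= real_hi:
            return False
    return True


# Hypothetical intervals in substitutions/site/year, for illustration only.
real = (8.0e-4, 1.6e-3)
replicates = [(1.0e-5, 3.0e-4), (2.0e-5, 9.0e-4), (5.0e-6, 2.5e-4)]
print(drt_supports_signal(real, replicates))  # False: the second replicate overlaps
```

Under a rule like this, a single overlapping replicate is enough for a negative result, which is part of why I am unsure how much weight it should carry against a positive BETS result.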
Best,
So Noguchi
Hi Kehinde,
Thank you for your interest.
At this stage, I am thinking of moving forward by progressively relaxing the prior constraint on ucld.mean and checking whether support for temporal structure remains stable across those settings. If the result is still supported under less restrictive priors, I would regard that as evidence that the inference is not being driven solely by the prior.
At the same time, because my cluster-based DRT did not support significant temporal signal, I would still interpret the result cautiously rather than treat it as definitive proof of strong temporal signal.
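As a rough sketch of the sensitivity check described above: for each prior setting on ucld.mean, I would take the path-sampling log marginal likelihood of the model with the real sampling dates and of the model with all tips assigned the same date, and compare the resulting log Bayes factors. All numbers and prior labels below are hypothetical placeholders, and the threshold of 5 log units is an adjustable assumption rather than a fixed rule.

```python
def log_bayes_factors(marginal_lls):
    """Log Bayes factor (dated minus undated model) for each prior setting.

    marginal_lls maps a prior label to a pair:
        (log marginal likelihood with sampling dates,
         log marginal likelihood with all tips given the same date)
    """
    return {label: dated - undated for label, (dated, undated) in marginal_lls.items()}


def support_is_stable(log_bfs, threshold=5.0):
    """True only if every prior setting favours the dated model by more than `threshold`."""
    return all(bf > threshold for bf in log_bfs.values())


# Hypothetical path-sampling estimates, for illustration only.
estimates = {
    "tight prior on ucld.mean": (-12345.6, -12360.1),
    "moderate prior on ucld.mean": (-12347.0, -12358.2),
    "diffuse prior on ucld.mean": (-12350.3, -12353.9),
}
log_bfs = log_bayes_factors(estimates)
for label, bf in log_bfs.items():
    print(f"{label}: log BF = {bf:.1f}")
print("stable across priors:", support_is_stable(log_bfs))
```

In this made-up example the support shrinks as the prior is relaxed and falls below the threshold under the most diffuse setting, which is the pattern that would suggest the result is driven largely by the prior.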
Best,
So Noguchi