Hello Emma,
If everything is done correctly (including using a maximum likelihood tree as input), the time range (or sampling window) of your data can be one of the main reasons for a low R2. However, a low R2 also indicates high rate variation among branches, in which case a relaxed clock is preferable. High R2 values, in turn, indicate that a strict clock is more applicable.
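To make the R2 point concrete, here is a minimal sketch of the root-to-tip regression statistic, assuming you have already extracted sampling dates and root-to-tip distances from your tree. The function name and the numbers are hypothetical and only for illustration; this is not TempEst's own code.

```python
from statistics import mean

def r_squared(dates, distances):
    """R^2 of a least-squares line fitted to root-to-tip distance vs. sampling date."""
    mx, my = mean(dates), mean(distances)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dates, distances))
    sxx = sum((x - mx) ** 2 for x in dates)
    syy = sum((y - my) ** 2 for y in distances)
    return (sxy * sxy) / (sxx * syy)

# Hypothetical data: sampling dates (decimal years) and root-to-tip distances (subs/site).
dates = [2010.0, 2012.0, 2014.0, 2016.0, 2018.0]
clock_like = [0.010, 0.014, 0.018, 0.022, 0.026]  # divergence grows linearly with time
noisy = [0.010, 0.022, 0.012, 0.026, 0.015]       # strong rate variation among lineages

print(r_squared(dates, clock_like))  # close to 1
print(r_squared(dates, noisy))       # much lower
```

The second data set covers the same sampling window as the first, yet its R2 is far lower, which is the situation where a relaxed clock is the more natural choice.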
Nevertheless, a wider sampling window might change nothing in the case of influenza. As far as I can see from the literature, the coefficient of rate variation for influenza data sets is above 1.0, so researchers applied a relaxed clock, as I expected (https://doi.org/10.1016/j.ympev.2019.01.019). When the coefficient of rate variation is high, R2 in TempEst tends to be low.
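For reference, the coefficient of rate variation is the standard deviation of the branch rates divided by their mean. A rough sketch with made-up branch rates follows; the function name is hypothetical, and I use the population standard deviation here as an assumption, which may differ slightly from what a particular program reports.

```python
from statistics import mean, pstdev

def coefficient_of_rate_variation(branch_rates):
    """Standard deviation of branch rates divided by their mean."""
    return pstdev(branch_rates) / mean(branch_rates)

# Hypothetical branch rates (subs/site/year), for illustration only.
near_strict = [1.0e-3, 1.1e-3, 0.9e-3, 1.0e-3]
highly_variable = [0.1e-3, 0.2e-3, 0.1e-3, 5.0e-3]

print(coefficient_of_rate_variation(near_strict))      # well below 1
print(coefficient_of_rate_variation(highly_variable))  # above 1
```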
Also, TempEst does not formally test for temporal signal in the data; it is just a tool for assessing how clock-like the data are. To test temporal signal reliably, you should use either a date-randomization (permutation) test, in which sampling dates are randomly permuted among sequences, or Bayesian evaluation of temporal signal (BETS):
https://beast.community/bets_tutorial.
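The permutation idea can be sketched roughly as follows. Please note the assumptions: a proper date-randomization test re-estimates the evolutionary rate (for example in BEAST) for every permuted data set, whereas this simplified sketch uses the root-to-tip correlation as a cheap stand-in statistic. The function names and the data are hypothetical.

```python
import random
from statistics import mean

def correlation(dates, distances):
    """Pearson correlation between sampling date and root-to-tip distance."""
    mx, my = mean(dates), mean(distances)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dates, distances))
    sxx = sum((x - mx) ** 2 for x in dates)
    syy = sum((y - my) ** 2 for y in distances)
    return sxy / (sxx * syy) ** 0.5

def date_permutation_p_value(dates, distances, n_perm=1000, seed=1):
    """Fraction of date permutations whose correlation is at least the observed one."""
    rng = random.Random(seed)
    observed = correlation(dates, distances)
    shuffled = list(dates)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # randomly reassign sampling dates to sequences
        if correlation(shuffled, distances) >= observed:
            hits += 1
    # Add 1 to both counts so the observed data set is included in the null distribution.
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: eight tips with dates in decimal years.
dates = [2010.0, 2011.0, 2012.0, 2013.0, 2014.0, 2015.0, 2016.0, 2017.0]
with_signal = [0.010, 0.012, 0.014, 0.016, 0.018, 0.020, 0.022, 0.024]
without_signal = list(reversed(with_signal))  # older samples are the most divergent

print(date_permutation_p_value(dates, with_signal))     # small p-value
print(date_permutation_p_value(dates, without_signal))  # p-value of 1
```

I use the signed correlation rather than R2 so that a strongly negative slope is not mistaken for temporal signal.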
Best,
Artem
On Tuesday, April 9, 2024 at 14:20:10 UTC+8, Emma Wang wrote: