RogueNaRok: can we exclude all taxa with rogue behaviour above a certain threshold?


Jamie Thompson

Jun 21, 2020, 4:56:27 AM
to raxml
Hiya,

I read a paper that identified rogues and excluded only those with a raw improvement above 0.5. I would like to do the same. However, I heard from someone else that this can't be done and that you have to exclude rogues in the order in which they appear in the output. Could you advise which is correct?

Best,
Jamie

Alexandros Stamatakis

Jun 21, 2020, 11:05:09 PM
to ra...@googlegroups.com
Hi Jamie,

You can do this if you use the web-server at https://rnr.h-its.org/ as
you will get an ordered list of taxa.

I am attaching a screenshot.

Alexis
> --
> You received this message because you are subscribed to the Google
> Groups "raxml" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to raxml+un...@googlegroups.com
> <mailto:raxml+un...@googlegroups.com>.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/raxml/d84b35ad-fb60-4552-89e0-e5c9c4111481o%40googlegroups.com
> <https://groups.google.com/d/msgid/raxml/d84b35ad-fb60-4552-89e0-e5c9c4111481o%40googlegroups.com?utm_medium=email&utm_source=footer>.

--
Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology

www.exelixis-lab.org
rnr-server.png

Grimm

Jun 22, 2020, 3:38:48 AM
to raxml
Morning Jamie,

 
> Could you advise which is correct?

There is no "correct" and "wrong" here. It depends on the purpose for which you want to filter the rogues. Whether an OTU goes rogue, and to what extent, can have many reasons (lack of discriminating signal, being primitive, i.e. acting like an actual ancestor, incomplete lineage sorting issues, and being the product of reticulation).

So, if your data are messy, producing a tree with poor branch support, and you don't have the luxury of worrying about the density of the tip sample (number and coverage of OTUs), using a threshold is a consistent way to do it. Since there are so many reasons for going rogue (see below), there is no actual fixed cut-off value, so one is always better off using more than one and seeing how the topology changes (e.g. in Alexis's example screenshot, 0.4 and 0.7 would be the cut-offs I'd go for rather than sticking to 0.5).
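To illustrate the idea of trying more than one cut-off, here is a minimal Python sketch. It assumes you have already collected each taxon's rawImprovement score into a dictionary; the function name, taxon names and scores are hypothetical and only serve as an illustration.

```python
def excluded_at_cutoffs(raw_improvement, cutoffs):
    """Map each cut-off to the sorted list of taxa whose score exceeds it."""
    return {
        c: sorted(t for t, score in raw_improvement.items() if score > c)
        for c in cutoffs
    }


# Hypothetical scores, for illustration only
scores = {"taxonA": 0.82, "taxonB": 0.55, "taxonC": 0.45, "taxonD": 0.10}
for cutoff, dropped in excluded_at_cutoffs(scores, [0.4, 0.5, 0.7]).items():
    print(cutoff, dropped)
```

Each resulting taxon set could then be pruned from the alignment and the trees compared, to see which parts of the topology are sensitive to the choice of cut-off.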

But if you want to assess how much individual rogues disturb the tree inference, run the inference iteratively, dropping the most roguish taxon in the list each time, and stop when the tree is clear (e.g. all branches supported by BS ~ 100).
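The iterative procedure could be organised roughly as follows. This is only a sketch: `min_support_without` is a hypothetical callback that you would have to write yourself (e.g. a wrapper that prunes the given taxa, re-runs your tree inference and bootstrap, and returns the lowest branch support); it is not part of RogueNaRok or RAxML. The default target of 100 is a placeholder for "BS ~ 100" and can be relaxed.

```python
from typing import Callable, List, Sequence, Tuple


def iterative_prune(
    ranked_rogues: Sequence[str],
    min_support_without: Callable[[List[str]], float],
    target: float = 100.0,
) -> Tuple[List[str], float]:
    """Drop rogues one at a time, most roguish first, until support reaches target.

    ranked_rogues: taxa ordered from most to least roguish.
    min_support_without: hypothetical callback that re-runs the inference
        without the given taxa and returns the lowest branch support found.
    """
    dropped: List[str] = []
    support = min_support_without(dropped)   # baseline with the full taxon set
    for taxon in ranked_rogues:
        if support >= target:
            break                            # tree is "clear", stop pruning
        dropped.append(taxon)
        support = min_support_without(dropped)
    return dropped, support
```

The returned list shows how many of the top-ranked rogues had to go before the tree cleared up, which is the information this approach is after.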

Which one is the better choice also depends mainly on the basic tree-likeness of the data. RogueNaRok tests the behaviour in direct relation to the data set and the (in)capacity of the data to find a tree. E.g. when there are only a few OTUs messing up an otherwise trivial tree, they will be very easy to spot in RogueNaRok's output (like in Alexis's example screenshot) and there would be no reason to think about applying a threshold. But if your matrix does not produce a very tree-like signal at all, i.e. you stuck your finger in a rogue's nest, RogueNaRok may come up with a more or less continuous list, and one may be better off comparing topologies using different cut-offs and increasingly reduced OTU sets. For instance, if you have a noisy matrix, a high cut-off threshold may filter for deep signal, while a low one will give you less biased terminal subtrees. The basic assumption here is that evolution becomes messier (less tree-like) the closer we get to the tips of the Tree of Life, but in the long run sorts out towards a coalescent.

Good de-roguing, Guido

PS: The principal tree-likeness of the matrix and its OTUs can be quickly assessed using Delta Values (Holland et al. 2002. Mol. Biol. Evol. 19:2051-2059); dist_stats is a simple programme to calculate them.

RAxML8 has an ML distance export using the optimised model (this doesn't seem to be implemented yet in RAxML-NG, which only offers topological distances). You can't read RAxML's output into dist_stats directly, because it is a list of pairwise distances, whereas the input for dist_stats has to be a distance matrix in extended PHYLIP format. Then compare how the iDVs, the "individual Delta Values", match RogueNaRok's scores (on the dist_stats page you will find links to papers that made use of Delta Values).
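The conversion from a pairwise list to a square matrix can be scripted. The sketch below assumes each input line holds two taxon names and one distance separated by whitespace, and that "extended PHYLIP" means a first line with the number of taxa followed by one whitespace-separated row per taxon with names of any length. Both are assumptions to check against your actual RAxML file and the dist_stats documentation; the function name is hypothetical.

```python
def pairs_to_phylip(lines):
    """Convert 'name1 name2 distance' lines into a square PHYLIP distance matrix."""
    names = []   # taxa in order of first appearance
    dist = {}    # (name1, name2) -> distance, stored in both directions
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue   # skip blank or malformed lines
        a, b, d = parts[0], parts[1], float(parts[2])
        for name in (a, b):
            if name not in names:
                names.append(name)
        dist[(a, b)] = dist[(b, a)] = d
    rows = [str(len(names))]
    for a in names:
        # Diagonal is zero; a missing pair raises KeyError rather than being guessed
        cells = ["0.000000" if a == b else f"{dist[(a, b)]:.6f}" for b in names]
        rows.append(a + " " + " ".join(cells))
    return "\n".join(rows) + "\n"
```

A typical use would be `open("matrix.phy", "w").write(pairs_to_phylip(open("distances.txt")))`, where both file names are placeholders.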

In a messy matrix, you'll have iDVs ≥ 0.25 forming a continuum; in a tree-like one, iDVs will typically be ≤ 0.15, with a few outliers identified as obvious rogues in the RogueNaRok output (the values are just very rough rules of thumb; Delta Values are affected by taxon sampling and the number of sequence patterns in the matrix).
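For readers who want to see what goes into these numbers, here is a minimal sketch of the Delta Value calculation as described by Holland et al. (2002): for every quartet of taxa the three pairwise distance sums are sorted, and the difference between the two largest is divided by the difference between the largest and the smallest. The individual value of a taxon is the mean over all quartets containing it. The nested-dictionary input `d[a][b]` and the function names are choices made for this illustration; this is not the dist_stats code.

```python
from itertools import combinations


def quartet_delta(d, x, y, u, v):
    """Delta for one quartet: 0 for perfectly tree-like distances, up to 1 otherwise."""
    sums = sorted(
        [d[x][y] + d[u][v], d[x][u] + d[y][v], d[x][v] + d[y][u]],
        reverse=True,
    )
    largest, middle, smallest = sums
    if largest == smallest:
        return 0.0   # all three sums equal: treated as 0 here by convention
    return (largest - middle) / (largest - smallest)


def individual_delta_values(d):
    """Average quartet delta per taxon, over all quartets containing that taxon."""
    taxa = sorted(d)
    totals = {t: 0.0 for t in taxa}
    counts = {t: 0 for t in taxa}
    for quartet in combinations(taxa, 4):
        q = quartet_delta(d, *quartet)
        for t in quartet:
            totals[t] += q
            counts[t] += 1
    return {t: totals[t] / counts[t] if counts[t] else 0.0 for t in taxa}
```

The loop visits every quartet, so the run time grows with the fourth power of the number of taxa; for large matrices a dedicated tool is the better choice.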

Jamie Thompson

Jun 22, 2020, 6:08:47 AM
to raxml
Hi Alexis,

Thanks for the reply, very useful.

Yesterday I ran RogueNaRok on the command line, following the "Basic rogue taxon search" from GitHub, and received the output. I then sorted it by rawImprovement in Excel and removed all the taxa above 0.5. This is where my concern lies, because I reordered the output by rawImprovement. The previous paper that used a 0.5 threshold (https://www.sciencedirect.com/science/article/abs/pii/S1055790313002947) only removed one taxon. If my set of taxa above 0.5 rawImprovement is not sequential in the RBIC column, should I not use a rawImprovement threshold?

Thanks,
Jamie
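The check Jamie describes can be done without re-sorting in Excel. The sketch below assumes the RogueNaRok result file is tab-separated with a header containing the columns `taxon` and `rawImprovement`, and that a first row with `NA` as taxon represents the unpruned state; please verify this against your own file. The function name is hypothetical. It returns the taxa above the threshold in their original output order and reports whether they form an unbroken run at the top of the list.

```python
import csv
import io


def rogues_above(table_text, threshold=0.5):
    """Return (selected taxa, whether they form an unbroken run at the top of the output)."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    rows = [r for r in reader if r["taxon"] != "NA"]   # skip the 'nothing pruned' row
    flags = [float(r["rawImprovement"]) > threshold for r in rows]
    selected = [r["taxon"] for r, keep in zip(rows, flags) if keep]
    # Unbroken run: once a row falls below the threshold, no later row exceeds it
    contiguous = flags == sorted(flags, reverse=True)
    return selected, contiguous


# Hypothetical table, for illustration only
example = (
    "num\ttaxNum\ttaxon\trawImprovement\tRBIC\n"
    "0\tNA\tNA\t0.0\t0.850\n"
    "1\t12\ttaxonA\t0.9\t0.860\n"
    "2\t7\ttaxonB\t0.3\t0.863\n"
    "3\t31\ttaxonC\t0.6\t0.869\n"
)
print(rogues_above(example))
```

If the second value is False, the threshold selection skips over taxa that appear earlier in the output, which is exactly the situation the question is about.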
