Zero-length branches bug

Brandon Seah

Mar 4, 2024, 6:12:12 AM
to Phylogenetic Placement
Hello,

My question relates to this open issue on epa-ng here: https://github.com/pierrebarbera/epa-ng/issues/38

When the reference tree has branches of length zero, epa-ng incorrectly reports all those branches as having length 0.1053605157 in the output jplace file.

I understand that having identical sequences doesn't make sense from the perspective of the placement algorithm, since they don't provide any new information and may also split the placement weights.

From a biological perspective, though, they can be meaningful. I'd like to use phylogenetic placement both to classify metabarcode sequences and to evaluate limitations in the reference database, a use case where placement methods have an advantage over conventional classifiers. The reference DB limitations I'd like to diagnose include cases where more than one species has an identical reference sequence, so that those species cannot be distinguished from each other with this barcode locus.
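To make the diagnosis concrete, here is a minimal sketch of how groups of identical reference sequences could be listed before building the tree. It assumes the references are available as FASTA text and that the taxon label is the header line; the function names `parse_fasta` and `identical_groups` are hypothetical helpers, not part of epa-ng or gappa.

```python
from collections import defaultdict


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def identical_groups(fasta_text):
    """Return sorted lists of headers that share exactly the same sequence.

    Comparison is case-insensitive; gap characters are compared as-is,
    so aligned input should come from the same alignment.
    """
    groups = defaultdict(list)
    for header, seq in parse_fasta(fasta_text):
        groups[seq.upper()].append(header)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


example = ">sp_A\nACGT\n>sp_B\nacgt\n>sp_C\nACGA\n"
groups = identical_groups(example)
```

Each returned group corresponds to a set of leaves that would be joined by zero-length branches in the reference tree.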

However, to do so, I'd need to be able to represent these identical sequences (with differing taxonomy) in the jplace tree before passing it to downstream tools like gappa assign.

Has anyone else encountered this situation or found a workaround? I'd like to avoid a hacky search-and-replace in the jplace file if possible. My current solution is to stick with standard RAxML, but it is slower than epa-ng.
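For reference, the search-and-replace stopgap could be sketched roughly as below. It assumes the jplace file is JSON whose `tree` field is a Newick string with edge numbers in curly braces, and that the placeholder appears verbatim as `0.1053605157`; `zero_placeholder_branches` is a hypothetical helper. It would also zero any branch that genuinely has that length, and it does not touch the per-placement length fields, which is part of why it is only a stopgap.

```python
import json
import re

# Value reported in the epa-ng output; assumed to be printed exactly like this.
PLACEHOLDER = "0.1053605157"


def zero_placeholder_branches(jplace_text, placeholder=PLACEHOLDER):
    """Set branches whose length equals the placeholder to 0.0.

    Only the Newick string in the "tree" field is edited; placements
    are left unchanged. Returns (patched_json_text, replacement_count).
    """
    doc = json.loads(jplace_text)
    # Match ":<placeholder>" only when it is directly followed by "{edge_num}".
    pattern = re.compile(r":" + re.escape(placeholder) + r"(?=\{\d+\})")
    doc["tree"], count = pattern.subn(":0.0", doc["tree"])
    return json.dumps(doc), count


example = json.dumps({
    "tree": "((A:0.1053605157{0},B:0.1053605157{1}):0.02{2},C:0.3{3}){4};",
    "placements": [],
    "fields": [],
    "version": 3,
    "metadata": {},
})
patched, count = zero_placeholder_branches(example)
```

Comparing `count` against the number of zero-length branches in the original reference tree is a cheap sanity check that no genuine branch was altered.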

Thank you!
Brandon