Which database to use?

Karolina Ferreira Rodrigues

Dec 18, 2024, 5:48:21 PM
to ptp-species-...@googlegroups.com
Dear all, I have a question about what the dataset should look like when generating the input tree for the bPTP and GMYC methods. I have noticed that some studies use the entire dataset, while others use only unique sequences (one representative per haplotype). I'm still learning how to use these methods, so I would be very grateful for any help.
--
Karolina F. Rodrigues
Licentiate degree in Biological Sciences
Master's student in Aquatic Ecology and Fisheries

Universidade Federal do Pará
Programa de Pós-graduação em Ecologia Aquática e Pesca (PPGEAP)
Grupo de Investigação Biológica Integrada (GIBI)

Alexandros Stamatakis

Dec 19, 2024, 4:53:42 PM
to ptp-species-...@googlegroups.com
I assume there is no clear-cut answer to this. Personally, I would
include only unique sequences, i.e., no exactly identical sequences,
as it makes absolutely no sense mathematically to include them.

I guess you will need to experiment a bit.
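
To produce the unique-sequence version of a dataset for such a comparison, here is a minimal Python sketch. It assumes the input is an aligned FASTA file already read into a string, and it treats two sequences as identical only if they match character for character after upper-casing, so sequences that differ only in gaps or ambiguity codes are kept as distinct. The function names `read_fasta` and `collapse_identical` are made up for this illustration and do not come from PTP, GMYC, or any other tool.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def collapse_identical(records):
    """Keep one representative per distinct sequence.

    Returns (unique, members): `unique` is a list of (name, sequence)
    pairs, one per distinct upper-cased sequence, in order of first
    appearance; `members` maps each representative name to the names
    of all input sequences that share its sequence.
    """
    groups = {}
    for name, seq in records:
        groups.setdefault(seq.upper(), []).append(name)
    unique = [(names[0], seq) for seq, names in groups.items()]
    members = {names[0]: names for names in groups.values()}
    return unique, members


# Toy alignment: "b" is an exact duplicate of "a" apart from letter case.
example = ">a\nACGT\n>b\nacgt\n>c\nACGA\n"
unique, members = collapse_identical(read_fasta(example))
for name, seq in unique:
    print(f">{name}\n{seq}")
```

The `members` mapping makes it possible to trace each retained haplotype back to all the original specimens once the delimitation result is available.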

Personally, I would also infer the tree on the dataset (or rather MSA)
version that has the lowest phylogenetic difficulty and pass that one
on, as this indicates that the phylogenetic signal will be strongest.

The phylogenetic difficulty can easily be computed with this tool here:

https://academic.oup.com/mbe/article/39/12/msac254/6832260

Alexis

--
Alexandros (Alexis) Stamatakis

ERA Chair, Institute of Computer Science, Foundation for Research and
Technology - Hellas
Research Group Leader, Heidelberg Institute for Theoretical Studies
Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology

www.biocomp.gr (Crete lab)
www.exelixis-lab.org (Heidelberg lab)

Andréa Franco

Jan 18, 2025, 9:40:33 AM
to Alexandros Stamatakis, ptp-species-...@googlegroups.com
Hi everyone,
Regarding the dataset used as input for species delimitation models: I empirically tested which type of dataset is most suitable. PTP works with an input tree built either from a balanced dataset (the same or a similar number of sequences per clade) or from haplotypes only. It can be useful to compare the results of the PTP analysis using both types of dataset. One tip that applies across different models is to include in your input dataset sequences from at least one species closely related to the clade or taxa of interest. This closely related species must have a well-resolved taxonomy. The study was published in the Journal of Eukaryotic Microbiology: https://doi.org/10.1111/jeu.12986
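
As a rough illustration of what balancing a dataset could look like in practice, the Python sketch below caps the number of sequences kept per clade by random subsampling. This is only one possible reading of "balanced" and is not necessarily the procedure used in the study linked above. The function name `balance_by_clade`, the `clade_of` mapping from sequence name to a provisional clade label, and the cap value are all assumptions made for the example.

```python
import random
from collections import defaultdict


def balance_by_clade(clade_of, max_per_clade, seed=0):
    """Return sequence names with at most `max_per_clade` names per clade.

    `clade_of` maps each sequence name to a provisional clade label.
    Clades at or below the cap are kept whole; larger clades are
    randomly subsampled. A fixed seed keeps the selection reproducible.
    """
    by_clade = defaultdict(list)
    for name, clade in clade_of.items():
        by_clade[clade].append(name)

    rng = random.Random(seed)
    kept = []
    for clade in sorted(by_clade):
        names = sorted(by_clade[clade])
        if len(names) > max_per_clade:
            names = rng.sample(names, max_per_clade)
        kept.extend(names)
    return sorted(kept)


# Toy example: clade "A" is over-represented compared with clade "B".
clade_of = {"a1": "A", "a2": "A", "a3": "A", "a4": "A", "b1": "B", "b2": "B"}
print(balance_by_clade(clade_of, max_per_clade=2))
```

Running the delimitation on both the balanced set and the haplotype-only set then gives the two results to compare.
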
Best regards,
Andréa
