e-values on homology searches


lucilam...@gmail.com

Jun 9, 2021, 11:50:17 AM
to TransDecoder-users
Hi Brian,

First of all, thank you so much for this great tool and for your technical support!

I have a question regarding the homology searches. I assembled my transcriptome with Trinity and ran the first step of TransDecoder, and found that my longest_orfs.pep database is quite big (~100K predicted ORFs). Because of that, I'd like to restrict the hits reported by the homology searches to avoid non-significant results. For blastp, I set -evalue to 1e-50. But hmmscan has several threshold parameters and I am a bit confused about which would be the best strategy. I have tried running it as follows:

hmmscan --cpu 8 --domtblout pfam.domtblout -E 1e-50 Pfam-A.hmm longest_orfs.pep > pfam.log

with default inclusion thresholds and default --domE value. Do you think that it would be better to set other thresholds besides -E?

Thank you so much for your help.
Best,
Lucila.

Brian Haas

Jun 9, 2021, 12:31:42 PM
to lucilam...@gmail.com, TransDecoder-users
Hi Lucila,

The blast and Pfam searches as part of TransDecoder are really aimed at controlling false negatives rather than false positives, but I suppose you could do some additional filtering to eliminate ORFs that you deem not worthy based on your criteria.

For blastp, we tend to use e-values of 1e-5. Note, for faster searches you could try DIAMOND (diamond blastp).
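
As a rough sketch of what those two suggestions might look like on the command line: the database name uniprot_sprot.fasta, the thread counts, and the output file names below are assumptions for illustration, not something stated in this thread. The last function, filter_blast_evalue, is a hypothetical helper showing one way to apply a stricter cutoff afterwards without rerunning the search.

```shell
# Permissive blastp search at the 1e-5 cutoff mentioned above.
# Assumes the protein database was already formatted with makeblastdb.
# Wrapped in a function so nothing runs until it is called.
run_blastp() {
    blastp -query longest_orfs.pep -db uniprot_sprot.fasta \
        -max_target_seqs 1 -outfmt 6 -evalue 1e-5 -num_threads 8 \
        > blastp.outfmt6
}

# Roughly equivalent search with DIAMOND (builds its own database first).
run_diamond() {
    diamond makedb --in uniprot_sprot.fasta -d uniprot_sprot
    diamond blastp -q longest_orfs.pep -d uniprot_sprot \
        --outfmt 6 --evalue 1e-5 --max-target-seqs 1 --threads 8 \
        -o blastp.outfmt6
}

# Hypothetical helper: keep only tabular (outfmt 6) rows whose e-value,
# stored in column 11, is at or below the given cutoff.
# Usage: filter_blast_evalue CUTOFF FILE
filter_blast_evalue() {
    awk -F'\t' -v max="$1" '($11 + 0) <= (max + 0)' "$2"
}
```

Something like `filter_blast_evalue 1e-50 blastp.outfmt6 > blastp.strict.outfmt6` would then give a stricter hit list while the permissive file stays available.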

For Pfam searches, I hear hmmsearch is way faster than hmmscan, so you might try that instead too. Wrt e-value cutoffs, I usually stick to the defaults. Pfam searches usually leverage their built-in, domain-specific thresholds for reporting purposes, so it's not generally a one-e-value-fits-all situation like with blast (afaik).
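
A minimal sketch of the hmmsearch variant, reusing the file names from the original command (the CPU count is an assumption). In HMMER, -E is the per-sequence reporting threshold and --domE the per-domain one, and the optional --cut_ga flag applies Pfam's curated per-family gathering thresholds instead of a single e-value. The filter_domtblout function is a hypothetical helper for tightening both levels after the fact. Note that the target and query columns in the domtblout file are swapped between hmmscan and hmmsearch, so it is worth checking that downstream steps accept the hmmsearch layout.

```shell
# hmmsearch takes the HMM database first and the sequence file second,
# and does not require the database to be prepared with hmmpress.
# Default thresholds are used here; adding --cut_ga would apply Pfam's
# gathering thresholds instead.
# Wrapped in a function so nothing runs until it is called.
run_hmmsearch() {
    hmmsearch --cpu 8 --domtblout pfam.domtblout \
        Pfam-A.hmm longest_orfs.pep > pfam.log
}

# Hypothetical helper: keep domtblout rows whose full-sequence E-value
# (column 7) and per-domain independent E-value (column 13) are both at
# or below the given cutoffs. Comment lines starting with '#' are dropped.
# Usage: filter_domtblout SEQ_EVALUE DOM_EVALUE FILE
filter_domtblout() {
    awk -v seq_e="$1" -v dom_e="$2" '
        /^#/ { next }
        ($7 + 0) <= (seq_e + 0) && ($13 + 0) <= (dom_e + 0)
    ' "$3"
}
```

For example, `filter_domtblout 1e-10 1e-5 pfam.domtblout` would keep only domains with i-Evalue at or below 1e-5 on sequences scoring at or below 1e-10 overall.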

hope this helps,

~b

--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 

lucilam...@gmail.com

Jun 9, 2021, 2:28:57 PM
to TransDecoder-users
Thank you so much Brian! You were very helpful.
Best,
Lucila
