[maker-devel] Improving BUSCO stats

Kyungyong Seong

May 25, 2021, 1:19:16 PM
to maker...@yandell-lab.org
Hi 

The BUSCO statistics obtained from my genome seem decent, with 97.3% completeness (-m geno). However, I am having trouble generating genome annotation sets that show comparable BUSCO completeness (-m prot). Currently, completeness is around 88%, and iterative MAKER annotation is not significantly increasing this value.

I started with prot2genome and cdn2genome alignments. I then trained AUGUSTUS on the BUSCO gene sets and SNAP on the good-quality predicted gene models, and ran MAKER without prot/cdn2genome. The third run used the newly trained AUGUSTUS and SNAP, which only increased the BUSCO completeness by 2%. I had assumed that single-copy orthologs would be well supported by evidence and relatively easy to predict, so I am not quite sure what is happening. Would you have any advice?

Thank you!
Kyungyong




Carson Holt

Jun 8, 2021, 4:10:47 PM
to Kyungyong Seong, maker...@yandell-lab.org
It may be insufficient evidence. You can scan the rejected Augustus/Snap models for known protein domains using InterProScan and add those with domain hits back to the final set (model rescue). See this paper: https://www.yandell-lab.org/publications/pdf/maker_current_protocols.pdf (Basic Protocol 5). If model rescue does not improve it, then you may have genes split across short contigs. In that case there is not enough sequence for the gene predictors to call a model, but there is enough to generate a BUSCO match. If that’s the case, you would have to improve the assembly to recover the models.
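
As a rough illustration of the selection step in model rescue, here is a minimal Python sketch that picks out the rejected models with at least one domain hit. It assumes InterProScan was run with tab-separated (TSV) output, where column 1 is the protein ID and column 4 is the analysis name (e.g. Pfam). The function name `ids_with_domain`, the model IDs, and the example rows are made up for illustration; this is not the procedure from the protocol itself, which uses MAKER's own accessory scripts.

```python
import csv
import io


def ids_with_domain(tsv_handle, analyses=("Pfam",)):
    """Return the IDs of proteins with at least one hit from the given analyses.

    Assumes InterProScan TSV layout: column 1 = protein ID, column 4 = analysis.
    """
    keep = set()
    reader = csv.reader(tsv_handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 4:
            continue  # skip blank or malformed lines
        if row[3] in analyses:
            keep.add(row[0])
    return keep


# Made-up example rows (hypothetical model IDs), only to show the input shape.
example = (
    "snap_model_1\tmd5a\t310\tPfam\tPF00069\tProtein kinase domain\t12\t280\t1.2E-40\tT\t08-06-2021\n"
    "snap_model_1\tmd5a\t310\tCoils\tCoil\tCoil\t290\t305\t-\tT\t08-06-2021\n"
    "augustus_model_7\tmd5b\t120\tCoils\tCoil\tCoil\t5\t30\t-\tT\t08-06-2021\n"
)

rescued = ids_with_domain(io.StringIO(example))
print(sorted(rescued))
```

The resulting ID list would then be used to pull the matching models out of the rejected set before adding them back to the annotation. Restricting to Pfam by default is a design choice of this sketch, since low-complexity or coiled-coil hits alone are weak support for a gene model.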

—Carson


_______________________________________________
maker-devel mailing list
maker...@yandell-lab.org
http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org