[maker-devel] Missing genes in lift-over with est2genome


Lior Glick

Apr 21, 2020, 9:09:25 AM
to Maker Mailing List
Hello,
I am using MAKER to annotate a plant genome assembly. A high-quality reference genome and annotation exist for another variety of the same species, so my first step is to lift over the reference genes to my genome. I do this by setting est2genome=1 and providing MAKER with the reference cDNA (transcriptome). No other evidence is provided and no gene prediction is performed. Repeat masking is done using the reference repeat library.
When checking the results, I found that many reference genes are missing from the lift-over output. However, if I BLAST the sequences of these genes myself, I get good matches. I even see these matches when I look at the BLAST results buried in the MAKER data_store.
For example, a transcript of length 1077 got a match of length 855 with 100% identity and no gaps. The bit score was 1709 and the E-value 0. This looks like a pretty good match, but it does not appear in the final MAKER results (GFF/FASTA).
Why is this happening? Are there some cutoffs that are not satisfied? If so, what are they and how can they be configured?

Thanks,
Lior

Carson Holt

Apr 23, 2020, 1:43:50 PM
to Lior Glick, Maker Mailing List
There are percent cutoffs for the est2genome algorithm that you can set in the maker_bopts.ctl file. Additionally, MAKER will give the alignment but not produce a gene model if it can’t translate through the est2genome alignment (i.e. stop codons in the assembly). I believe the cutoff is 50%. If you add est_forward=1 to the maker_opts.ctl file, names will be copied from the alignment source and the score column in the GFF3 will be the percent match to the original transcript.

—Carson
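
As a rough illustration of how a percent cutoff could explain the example earlier in the thread (a 1077 bp transcript with an 855 bp match), here is a minimal Python sketch. It assumes that coverage is simply aligned length divided by transcript length, and the function names and the two trial cutoffs (0.5 and 0.8) are hypothetical choices for illustration; the values MAKER actually applies are whatever is set in your maker_bopts.ctl, and MAKER's own calculation may differ.

```python
def alignment_coverage(aligned_length: int, query_length: int) -> float:
    """Fraction of the query transcript covered by the alignment.

    Assumption: coverage = aligned length / query length.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    return aligned_length / query_length


def passes_cutoff(aligned_length: int, query_length: int, min_coverage: float) -> bool:
    """True if the alignment covers at least min_coverage of the query."""
    return alignment_coverage(aligned_length, query_length) >= min_coverage


if __name__ == "__main__":
    # Numbers from the example in the thread: 855 aligned bases of 1077.
    coverage = alignment_coverage(855, 1077)
    print(f"coverage = {coverage:.3f}")

    # Hypothetical cutoffs, chosen only to show how the outcome changes.
    for cutoff in (0.5, 0.8):
        verdict = "kept" if passes_cutoff(855, 1077, cutoff) else "dropped"
        print(f"cutoff {cutoff:.2f}: {verdict}")
```

Under these assumptions the hit covers a little under 80% of the transcript, so it would pass a 50% cutoff but not an 80% one, even at 100% identity. Comparing this figure with the coverage settings in maker_bopts.ctl may be a quick first check for other missing genes.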

> _______________________________________________
> maker-devel mailing list
> maker...@yandell-lab.org
> http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org

Lior Glick

Apr 30, 2020, 12:21:13 PM
to Carson Holt, Maker Mailing List
Thanks Carson - your answer was very helpful.
Another question related to the lift-over process, if I may.
I want to take the resulting GFF and pass it to another MAKER run, where I provide additional, lower-confidence evidence (ESTs and proteins). I'm not sure which option to use, though. Following this helpful post, I tried using pred_gff and model_gff, but both produced fusion genes where genes lie very close to one another (see attached picture), even with the correct_est_fusion parameter enabled. It looks like the only way to take the lifted-over genes "as-is" would be to use other_gff, but I gather that this option was not really intended for genes. Would you recommend this usage? Am I missing something?
Thank you!

On Thu, Apr 23, 2020 at 20:43, Carson Holt <cars...@gmail.com> wrote:
fusion.png