Hello,
I am using MAKER to annotate a plant genome assembly. A high-quality reference genome and annotation exist for another variety of the same species, so my first step is to lift the reference genes over to my genome. I do this by setting est2genome=1 and providing MAKER with the reference cDNA (transcriptome) as EST evidence. No other evidence is provided and no gene prediction is performed. Repeat masking is done using the reference repeat library.
When checking the results, I found that many reference genes are missing from the lift-over output. However, if I BLAST the sequences of these genes myself, I get good matches. I can even see these matches in the BLAST results stored in the MAKER data_store.
For example, a transcript of length 1077 has a match of length 855 with 100% identity and no gaps. The bit score was 1709 and the E-value 0. This looks like a pretty good match, but it does not appear in the final MAKER results (GFF/FASTA).
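To put a number on this example, here is a minimal Python sketch that computes what fraction of the transcript the alignment covers, using the lengths quoted above. The function names and the 0.8 threshold are hypothetical and only for illustration; I do not know which cutoffs MAKER actually applies.

```python
def alignment_coverage(query_length: int, aligned_length: int) -> float:
    """Return the fraction of the query transcript covered by the alignment."""
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    return aligned_length / query_length


def passes_cutoff(coverage: float, min_coverage: float) -> bool:
    """Check a coverage value against a minimum-coverage threshold."""
    return coverage >= min_coverage


# Values from the example: transcript length 1077, match length 855.
coverage = alignment_coverage(query_length=1077, aligned_length=855)

# HYPOTHETICAL threshold, chosen only to illustrate the check;
# it is not taken from my MAKER configuration.
example_min_coverage = 0.8

print(f"coverage = {coverage:.3f}")
print(f"passes a {example_min_coverage} coverage cutoff: "
      f"{passes_cutoff(coverage, example_min_coverage)}")
```

So the match covers a little under 80% of the transcript, which means a coverage-based filter could reject it even though identity is 100%.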
Why is this happening? Are there some cutoffs that are not satisfied? If so, what are they and how can they be configured?
Thanks,
Lior