I am trying to use EVM to resolve fragmentation in the annotation of a conifer species (Douglas fir). My inputs are a BRAKER GFF with 45334 genes, a protein alignment GFF with 33884 genes, and transcript evidence from MAKER (13794 genes) and Illumina (23410 genes), which includes partial genes. EVidenceModeler returns only 25902 genes and has shortened many introns and genes. I have attached an Excel file comparing gene statistics from both files. I am wondering why so many of my genes (20357) are being removed even though they do not overlap any of the new EVidenceModeler genes, and why many of my genes appear to be shortened. I am currently giving the gene predictions a weight of 4 and the protein and transcript evidence a weight of 6. Is it necessary to use PASA to get EVM to work for my data?
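In case it helps anyone reproduce the removed-gene count, here is a minimal sketch of one way to count input genes that overlap no output gene. It assumes tab-separated GFF3 with `gene` feature rows and ignores strand. The function names are made up for illustration, and this is not necessarily the exact method behind the 20357 figure.

```python
from collections import defaultdict


def read_genes(gff_lines):
    """Return {seqid: [(start, end), ...]} for rows whose feature type is 'gene'."""
    genes = defaultdict(list)
    for line in gff_lines:
        # Skip blank lines and GFF comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 5 and cols[2] == "gene":
            genes[cols[0]].append((int(cols[3]), int(cols[4])))
    return genes


def count_unmatched(input_genes, output_genes):
    """Count input genes that overlap no output gene on the same sequence.

    Strand is ignored (an assumption). The comparison is a simple
    all-against-all scan per sequence, which is slow but easy to check.
    """
    unmatched = 0
    for seqid, intervals in input_genes.items():
        others = output_genes.get(seqid, [])
        for start, end in intervals:
            # Two closed intervals overlap when each starts before the other ends.
            if not any(s <= end and start <= e for s, e in others):
                unmatched += 1
    return unmatched
```

With real files this would be called as `count_unmatched(read_genes(open("braker.gff3")), read_genes(open("evm.out.gff3")))`, where both file names are placeholders.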
Thank you for any help,
-Alyssa