Hi Brian,
Happy New Year! When calling Launch_PASA_pipeline.pl in "gene_overlap mode", I used "Saccharomyces_cerevisiae.R64-1-1.108.gff3" as the corresponding annotation (i.e., the input to argument --annots).
However, we just realized that "Saccharomyces_cerevisiae.R64-1-1.108.gff3", an official Ensembl annotation for S. cerevisiae, includes features that span whole chromosomes. These appear as giant chromosome-wide blocks when viewing the gff3 in IGV and other genome browsers (please see attached screenshot). This concerns us because, in gene_overlap mode, every transcript overlaps a whole-chromosome feature, so all transcripts might be clustered regardless of the percent value supplied to --gene_overlap. Do you know if this is the case? If so, it would appear that I need to edit the Ensembl gff3 to remove the whole-chromosome annotations, and perhaps other features as well. In that case, do you have an opinion on which features I should retain in the gff3? I was thinking of removing everything except the mRNA annotations. Or maybe that doesn't really matter, at least compared with the whole-chromosome annotations.
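For concreteness, the kind of edit I have in mind could look something like the Python sketch below. It assumes the whole-chromosome rows can be identified by the value "chromosome" in the third (type) column of the gff3, which I have not confirmed covers every case. The function names and the output file name are my own placeholders and are not part of PASA.

```python
def drop_feature_types(gff3_lines, excluded_types=frozenset({"chromosome"})):
    """Yield GFF3 lines, skipping features whose type (column 3) is excluded.

    Assumption: whole-chromosome rows carry the type "chromosome".
    """
    for line in gff3_lines:
        # Pass comments, directives (e.g. ##gff-version) and blank lines through.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 3 (index 2) holds the feature type in GFF3.
        if len(fields) >= 3 and fields[2] in excluded_types:
            continue
        yield line


def filter_gff3_file(in_path, out_path, excluded_types=frozenset({"chromosome"})):
    """Write a copy of in_path to out_path without the excluded feature types."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(drop_feature_types(src, excluded_types))


# Example call (the output file name is a placeholder):
# filter_gff3_file("Saccharomyces_cerevisiae.R64-1-1.108.gff3",
#                  "Saccharomyces_cerevisiae.R64-1-1.108.no_chrom.gff3")
```

Other feature types could be dropped the same way by adding them to excluded_types, once it is clear which ones should go.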
Anyway, thank you,
Kris