GTF vs. GFF

Bishwa Kiran

Jun 11, 2017, 9:36:52 AM

to rna-star

Hi Alex,
I have received an updated GFF file for my organism, but the reference genome has not changed. So I am debating whether I should rerun my analyses with the new GFF (I used a GTF in the previous run). I had some questions and have tried to answer them from my own bioinformatics experience. Could you verify my answers and give some suggestions?

     1) Is there any difference in outcome when using GTF vs. GFF?

         I think there will be no difference in alignment, since STAR aligns to the reference genome first and, if a transcriptome BAM is needed, then matches the genomic alignments against the gene/exon boundaries in the GTF/GFF file. But would there be any advantage in using GTF over GFF?

     2) Also, how much difference in the genome alignments should we expect if we use two different versions of the GTF/GFF with the same reference genome and the same RNA-seq data?
         - Because STAR aligns the data to the reference genome first, I expect to see no difference (or very little) when using the same reference genome and RNA-seq data but a different GTF.
         - But I expect to see changes at the GTF/GFF boundaries if 'SJ.out.tab' files are supplied during 2nd-pass mapping. Still, the final SJ.out.tab boundaries should be the same, because it is the same RNA-seq data.

Alexander Dobin

Jun 12, 2017, 3:36:10 PM

to rna-star

Hi Bishwa,

STAR uses the splice junction information in the GTF/GFF files to improve alignments of reads spliced through annotated junctions.

In particular, you will see significantly increased sensitivity for splices with short overhangs. The best way to see it is to run the mapping both without and with annotations, and compare the number of splices reported in the Log.final.out files.
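The comparison described above could be scripted roughly as follows. This is only a minimal sketch: it assumes the usual `label |<tab>value` layout of Log.final.out and the "Number of splices:" rows found there. The function names and the sample numbers are made up for illustration and do not come from a real run.

```python
def parse_splice_counts(log_text):
    """Collect the 'Number of splices: ...' rows of a Log.final.out as {label: count}."""
    counts = {}
    for line in log_text.splitlines():
        if "|" not in line:
            continue  # section headers and blank lines carry no value
        key, value = (part.strip() for part in line.split("|", 1))
        if key.startswith("Number of splices:") and value.isdigit():
            counts[key.split(":", 1)[1].strip()] = int(value)
    return counts


def compare_splice_counts(log_a, log_b):
    """Return {label: count_b - count_a} for every splice row seen in either log."""
    a, b = parse_splice_counts(log_a), parse_splice_counts(log_b)
    return {label: b.get(label, 0) - a.get(label, 0) for label in sorted(set(a) | set(b))}


# Illustrative excerpts with invented numbers (not real STAR output).
LOG_WITHOUT_ANNOTATION = (
    "          Number of splices: Total |\t1000\n"
    "          Number of splices: Annotated (sjdb) |\t0\n"
)
LOG_WITH_ANNOTATION = (
    "          Number of splices: Total |\t1200\n"
    "          Number of splices: Annotated (sjdb) |\t1100\n"
)

if __name__ == "__main__":
    for label, delta in compare_splice_counts(LOG_WITHOUT_ANNOTATION, LOG_WITH_ANNOTATION).items():
        print(f"{label}: {delta:+d}")
```

With real runs, the two strings would be replaced by the contents of each run's Log.final.out, for example `open("with_annot/Log.final.out").read()` (a hypothetical path).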

The difference between the two annotations will be determined by how many more junctions are annotated in the new GFF vs old GTF, and how many reads map to those junctions.
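To get a feel for how many junctions differ between the two annotations before rerunning anything, one could derive the junction set from the exon rows of each file and diff them. The sketch below is an assumption-laden illustration, not how STAR itself builds its junction database: it groups exons by the GTF `transcript_id` attribute or the GFF3 `Parent` attribute, and the helper names and toy records are hypothetical. On the GTF vs. GFF question, as far as I know the STAR manual says that GFF3 input needs `--sjdbGTFtagExonParentTranscript Parent` at genome generation, since the default tag is `transcript_id`.

```python
import re
from collections import defaultdict

GTF_TRANSCRIPT = re.compile(r'transcript_id "([^"]+)"')
GFF3_PARENT = re.compile(r"(?:^|;)Parent=([^;]+)")


def junctions(annotation_text):
    """Return the set of (chrom, intron_start, intron_end, strand) implied by exon rows."""
    exons = defaultdict(list)
    for line in annotation_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue
        match = GTF_TRANSCRIPT.search(cols[8])
        if match:
            parents = [match.group(1)]
        else:
            match = GFF3_PARENT.search(cols[8])
            parents = match.group(1).split(",") if match else []
        for transcript in parents:
            exons[(cols[0], cols[6], transcript)].append((int(cols[3]), int(cols[4])))
    found = set()
    for (chrom, strand, _transcript), spans in exons.items():
        spans.sort()
        # An intron spans the gap between two consecutive exons of one transcript.
        for (_, left_end), (right_start, _) in zip(spans, spans[1:]):
            if right_start > left_end + 1:
                found.add((chrom, left_end + 1, right_start - 1, strand))
    return found


def exon_row(start, end, attributes):
    """Build one toy exon record on chr1, plus strand (illustration only)."""
    return "\t".join(["chr1", "toy", "exon", str(start), str(end), ".", "+", ".", attributes])


# Toy annotations: the "new" GFF3 adds a third exon to the same transcript.
OLD_GTF = "\n".join(exon_row(s, e, 'gene_id "g1"; transcript_id "t1";')
                    for s, e in [(100, 200), (300, 400)])
NEW_GFF = "\n".join(exon_row(s, e, f"ID=exon{i};Parent=t1")
                    for i, (s, e) in enumerate([(100, 200), (300, 400), (500, 600)], 1))

if __name__ == "__main__":
    old, new = junctions(OLD_GTF), junctions(NEW_GFF)
    print("only in new:", sorted(new - old))
    print("only in old:", sorted(old - new))
```

The size of the "only in new" set gives an upper bound on where alignments can change; how much they actually change still depends on how many reads cross those junctions.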

Cheers

Alex

Bishwa Kiran

Jun 13, 2017, 4:13:47 PM

to rna-star

Thanks very much. I think I will have to rerun my whole analysis. The new GTF/GFF is significantly improved: a good number of new genes have been added, and some old annotations have been removed.

Thanks,
