GFF3


Nate Hansen

Apr 22, 2025, 4:39:39 PM
to Biociphers
Hello,

I have been trying to run MAJIQ on a more recent human Ensembl build, but I have been running into issues with the conversion and with making sure the annotation is in the form MAJIQ expects.

I know you guys already have hg19 available for download since it was in one of the papers, but I was wondering if you also have hg38 available to be downloaded somewhere as this would make running majiq on the updated build much easier.

Thanks, 
Nate

Nate Hansen

Apr 23, 2025, 2:58:36 PM
to Biociphers
Following up on this, I do not see a script available that converts either a gtf or gff3 from ensembl to the specific format required for MAJIQ. I am trying to run against GRCh38. Please advise, thank you!

San Jewell

Jun 4, 2025, 11:46:05 AM
to Biociphers
Hi Nate,

In general you should be fine with using the gff3 that ensembl provides (i.e. https://ftp.ensembl.org/pub/release-114/gff3/homo_sapiens/Homo_sapiens.GRCh38.114.gff3.gz)

Some annotations do produce warnings like "Error, incorrect gff. exon doesn't have valid transcript"; however, this does not mean the build will produce invalid results. Rather, a few exons don't have a proper transcript specified in the annotation, so those exons will not be used downstream. For example, I see that the ensembl annotation has approximately 40 orphan exons. The rest of the annotation is usable, so this should generally not be a cause for concern. Let me know if that answers your question.

Thanks,
-San
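
For readers who want to see which exons are affected, here is a minimal Python sketch that lists exons whose Parent does not match any ID defined in the same GFF3. It is not MAJIQ's own check: the "orphan" definition used here (no Parent, or a Parent that never appears as an ID) is an assumption, and the names parse_attributes and find_orphan_exons are made up for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# (seqid, start, end, parent IDs) for one exon record
Exon = Tuple[str, int, int, List[str]]


def parse_attributes(field: str) -> Dict[str, str]:
    """Parse the ninth GFF3 column (key=value pairs separated by ';')."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def find_orphan_exons(lines: Iterable[str]) -> List[Exon]:
    """Return exons whose Parent is missing or not defined as an ID in the file.

    Assumption: this is only an approximation of what the warning refers to.
    """
    ids = set()
    exons: List[Exon] = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        attrs = parse_attributes(cols[8])
        if "ID" in attrs:
            ids.add(attrs["ID"])
        if cols[2] == "exon":
            parent = attrs.get("Parent", "")
            parents = parent.split(",") if parent else []
            exons.append((cols[0], int(cols[3]), int(cols[4]), parents))
    # An exon is kept as an orphan if none of its parents were ever defined.
    return [exon for exon in exons if not any(p in ids for p in exon[3])]
```

To run it on the compressed Ensembl file, pass an open text handle, e.g. find_orphan_exons(gzip.open(path, "rt")) after importing gzip, where path points at the downloaded .gff3.gz.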

Nate Hansen

Jun 24, 2025, 1:52:34 PM
to Biociphers
Hi San,

Thank you for the response. I have been using the gff3 that ensembl provides, but my bam files label the chromosomes with a "chr" prefix, whereas the gff3 file does not contain that prefix, and I am identifying 0 LSVs. Is that difference in the chromosome naming the reason why no LSVs are identified?

Thanks,
Nate

Nate Hansen

Jun 24, 2025, 2:32:25 PM
to Biociphers
Hi San,

It looks like that was the issue: once I matched the chromosome names up, everything ran smoothly.

Thanks,
Nate
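
For anyone hitting the same mismatch, here is a minimal Python sketch of one way to make the GFF3 chromosome names match "chr"-style BAM files. The thread does not say how the names were actually matched up, so this is only an illustration. It assumes the BAMs use UCSC-style names (chr1..chr22, chrX, chrY, chrM) and leaves scaffolds and patches untouched; the names rename_seqid and add_chr_prefix are hypothetical.

```python
from typing import Iterable, Iterator

# Assumption: only the primary chromosomes need renaming.
PRIMARY = {str(i) for i in range(1, 23)} | {"X", "Y"}


def rename_seqid(seqid: str) -> str:
    """Map an Ensembl-style seqid to a UCSC-style one where the mapping is simple."""
    if seqid in PRIMARY:
        return "chr" + seqid
    if seqid == "MT":
        return "chrM"  # assumption: the BAM names the mitochondrion chrM
    return seqid  # scaffolds and patches are left unchanged


def add_chr_prefix(lines: Iterable[str]) -> Iterator[str]:
    """Yield GFF3 lines with the seqid column (and sequence-region directives) renamed."""
    for line in lines:
        if line.startswith("##sequence-region"):
            parts = line.split()
            if len(parts) >= 2:
                parts[1] = rename_seqid(parts[1])
            yield " ".join(parts) + "\n"
        elif line.startswith("#") or not line.strip():
            yield line  # other directives, comments, and blank lines pass through
        else:
            cols = line.split("\t")
            cols[0] = rename_seqid(cols[0])
            yield "\t".join(cols)
```

Typical use would be to read the Ensembl file with gzip.open(path, "rt") and write the yielded lines to a new .gff3 file. Before rebuilding, it is worth comparing the first column of the result against the reference names in the BAM header (for example from samtools view -H) to confirm that they agree.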
