Unexpected extra exons in MAJIQ 2.5 output and UCSC link error


Alex PG

Oct 28, 2025, 11:03:24 AM
to Biociphers
Hi San and Biociphers community, 

I’ve been running MAJIQ version 2.5 on a small dataset consisting of six samples: three pairs of normal colon tissue and polyp tissue. I used the comprehensive gene annotation (CHR) GFF3 file from Gencode: https://www.gencodegenes.org/human/ 

In the VOILA output, I noticed that several genes show more exons than are present in the annotation file.
For example (attached), in one gene [ABHD8] that should have 5 exons, VOILA displays 8. 
According to IGV (attached), 2 of the additional exons seem to be part of a different gene [MRPL34]. I have searched for MRPL34 in the VOILA output, but this gene is not included. How do you suggest interpreting this?

This pattern appears to persist across other genes as well. Interestingly, I didn’t encounter this issue when running the workshop example dataset.

Do you have any suggestions on how to interpret these extra exons or how to handle this discrepancy?

Additionally, when I try to access the UCSC link from the output, I receive the following error:
Internal Server Error: The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.
Do you have any advice on how to fix or troubleshoot this issue?

Thanks in advance for your help! And sorry if this is too basic, I am still new to MAJIQ.

All the best,
Alex 
voila_1.png
igv_1.png

Matthew Gazzara

Oct 28, 2025, 12:20:32 PM
to Biociphers
Hi Alex,

Thanks for your interest. Transcriptome annotations are always confusing, so no worries there. 

Is your RNA-seq data stranded or unstranded? In the early steps of the MAJIQ build, reads from overlapping genes are assigned to positive (e.g. MRPL34) and/or negative (e.g. ABHD8) strand transcripts. If the data is strand specific, you can tell MAJIQ this in the build step and it will assign reads more accurately. If it's unstranded, things are a bit more tricky, and I believe the algorithm has changed slightly between v2 and v3. MRPL34 likely does not show up in your output because there are no LSVs detected for this gene.

As for the number of exons, are you using the "gencode.v49.annotation.gff3" file? Knowing this can help me figure out where your extra exon annotations may be coming from. In the first steps, MAJIQ collapses the various transcript versions of a gene to form a splicegraph. In the "Annotation" drop-down menu you can see the individual transcripts MAJIQ saw and collapsed to make your splicegraph. The one displayed (ENST00000247706.4) looks like the canonical / MANE version of ABHD8, and it does have the five exons you would expect.

From the v49 file specifically I see a transcript that includes what's labeled as exon 4 in your Voila image (ENST00000594194.1). If you select this transcript from the annotation drop down, it should include the "exon 4" region. I don't see what is labeled as exon 1 and exon 2 in your splicegraph, however. These exons and their junctions being grayed out indicates they were present in the annotation, but not in your data. They should not be affecting the quantification of anything. 
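To illustrate why a collapsed splicegraph can show more exons than any single transcript, here is a minimal Python sketch. It is not MAJIQ's actual algorithm: it simply takes the union of exon intervals across the transcripts of one gene and merges overlapping ones. The GFF3 records, transcript names (tx_A, tx_B), and coordinates are hypothetical toy values, not the real ABHD8 annotation.

```python
from collections import defaultdict

# Toy GFF3 exon records (hypothetical; tab-separated, 9 columns as in GFF3).
GFF3_LINES = [
    "chr1\ttoy\texon\t100\t200\t.\t-\t.\tParent=tx_A;gene_id=gene_X",
    "chr1\ttoy\texon\t300\t400\t.\t-\t.\tParent=tx_A;gene_id=gene_X",
    "chr1\ttoy\texon\t100\t200\t.\t-\t.\tParent=tx_B;gene_id=gene_X",
    "chr1\ttoy\texon\t250\t280\t.\t-\t.\tParent=tx_B;gene_id=gene_X",
    "chr1\ttoy\texon\t300\t400\t.\t-\t.\tParent=tx_B;gene_id=gene_X",
]


def parse_attributes(field):
    """Turn a GFF3 attribute column ('k1=v1;k2=v2') into a dict."""
    return dict(item.split("=", 1) for item in field.strip().split(";") if item)


def exons_by_transcript(lines):
    """Collect the set of (start, end) exon intervals for each transcript."""
    per_tx = defaultdict(set)
    for line in lines:
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        attrs = parse_attributes(cols[8])
        per_tx[attrs["Parent"]].add((int(cols[3]), int(cols[4])))
    return per_tx


def merge_intervals(intervals):
    """Merge overlapping intervals into a sorted, non-overlapping list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


per_tx = exons_by_transcript(GFF3_LINES)
gene_exons = merge_intervals(set().union(*per_tx.values()))

for tx, exons in sorted(per_tx.items()):
    print(tx, len(exons), "exons")
print("collapsed gene model:", len(gene_exons), "exons")
```

In this toy case tx_A has two exons, while the collapsed gene model has three, because tx_B contributes an extra exon. Running the same kind of count on the real GFF3, filtered to one gene, would show which annotated transcripts contribute each exon in the splicegraph.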

Finally, the 8-read red junction from "Normal Combined" / 16-read red junction from "Polyp Combined" in the splicegraph is likely from MRPL34 on the other strand (see highlighted regions in the attached screenshot). If you give me the exact coordinates, I can confirm.

Again, if your RNA-seq data is strand specific, the reads could be better assigned in this case. If it is not, it's possible that v3 read assignment handles unstranded data better.

Finally, regarding the UCSC link error, let me see if @San can make any sense of that.

All the best,
Matt Gazzara
Screen Shot 2025-10-28 at 12.17.44 PM.png

San Jewell

Oct 29, 2025, 12:17:22 PM
to Biociphers
Hi Alex, 

For the UCSC link, can you paste the link here? (Right click, copy link.) In general, a 500 error means there is an issue on UCSC's end, not with your request, but it's also possible that some weirdness in the way voila formats the link triggers an incorrect 500 response. (An error in the request should produce a 400 response, but a lot of web apps accidentally mess this up.)
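As a small illustration of the 4xx vs. 5xx distinction, here is a Python sketch using only the standard library. The function name classify_http_status is made up for this example and is not part of voila or UCSC.

```python
from http import HTTPStatus


def classify_http_status(code: int) -> str:
    """Say which side an HTTP status code nominally blames."""
    if 200 <= code <= 299:
        return "success"
    if 400 <= code <= 499:
        return "client error (problem with the request)"
    if 500 <= code <= 599:
        return "server error (problem on the server side)"
    return "other"


for code in (400, 500):
    print(code, HTTPStatus(code).phrase, "->", classify_http_status(code))
```

The classification only reflects what the server claims. As noted above, a badly formed request can still come back as a 500 if the web app does not validate its input.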

Thanks, 
-San

Alex PG

Nov 3, 2025, 11:51:09 AM
to Biociphers
Hi Matt and San,

Sorry for the delay! I have realised that my reply from last week was not saved. 

Thank you very much for your helpful reply!

To answer your questions:

The RNA-seq data is stranded. I’ve updated the configuration file to include the strandness information (see attached). However, the results now look very different. Could you clarify the correct way to specify strandness in the configuration file?

I used gencode.v49.annotation.gff3 (the comprehensive gene annotation, chr version, from gencodegenes.org/human). Is this the correct annotation file, or would you recommend a different one?

I was also wondering what it means when splicing events in the splicegraph don’t show any read counts (for example, exon 1 and exon 2 in the voila_1.png file I initially attached)?

Regarding the red junctions, here are the exact coordinates: chr19:17,292,854–17,296,716. I’ve attached a screenshot of the coordinates to confirm we’re referring to the same region.
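For anyone who wants to reuse these coordinates in a script, here is a minimal Python sketch that parses a browser-style region string (with thousands separators and either a hyphen or an en dash) into plain integers. The helper name parse_region is hypothetical.

```python
import re

# Accepts 'chrom:start-end' with optional commas and '-' or an en dash.
REGION_RE = re.compile(r"^(\w+):([\d,]+)[-\u2013]([\d,]+)$")


def parse_region(region: str):
    """Return (chrom, start, end) as (str, int, int) from a region string."""
    match = REGION_RE.match(region.strip())
    if match is None:
        raise ValueError(f"unrecognised region: {region!r}")
    chrom, start, end = match.groups()
    return chrom, int(start.replace(",", "")), int(end.replace(",", ""))


chrom, start, end = parse_region("chr19:17,292,854\u201317,296,716")
print(chrom, start, end, "span:", end - start + 1, "bp")
```

The span is computed as inclusive on both ends, which matches how the region is displayed in IGV and the UCSC browser.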

Sorry for all the questions, but could you please let me know where I can find more information on how the final list of genes is generated (since it only includes around 200 genes/LSVs)? Also, which output file should contain the splicing events that show the most statistically significant differences between the two groups?

The UCSC link also doesn’t seem to work when I use Majiqlopedia. Here’s an example link:
https://tools.biociphers.org/majiqlopedia_normal/generate_ucsc_link?gene_id=gene%3AENSG00000158467
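To check what this link actually sends, here is a short Python sketch that splits the URL and decodes its query string with the standard library. It only inspects the link text and makes no request to the server.

```python
from urllib.parse import parse_qs, urlsplit

link = (
    "https://tools.biociphers.org/majiqlopedia_normal/"
    "generate_ucsc_link?gene_id=gene%3AENSG00000158467"
)

parts = urlsplit(link)
params = parse_qs(parts.query)  # percent-encoding (%3A -> ':') is decoded here
gene_id = params["gene_id"][0]

print("host:", parts.netloc)
print("path:", parts.path)
print("gene_id:", gene_id)
```

The decoded gene_id is gene:ENSG00000158467, i.e. the Ensembl gene ID with a "gene:" prefix, which is how IDs appear in the ID attribute of Ensembl-style GFF3 files.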

Thank you so much for all your help! I really appreciate it!

All the best,
Alex


coordinates.png
majiq_build_config_batch1_reverse.ini