Numbers of rMATS Events reported tend to bias toward Exon Skipping Events

X L

Jan 13, 2026, 10:09:00 AM
to rMATS User Group
Dear rMATS users and developers,

I’ve been using rMATS to analyze alternative splicing events from RNA-seq data, and I’ve consistently observed that the number of total and significant skipped exon (SE) events is by far the highest, while intron retention (RI) events are consistently the lowest (example report attached).

From the literature, this pattern appears common—many studies using rMATS also report SE as the predominant event type. However, I’m puzzled by the low frequency of RI events. Even in datasets generated from poly(A)-selected (poly-dT enriched) mRNA, it’s not uncommon to observe reads mapping within introns or spanning exon–intron boundaries (e.g., reads overlapping splice sites), which would seem to support the detection of intron retention.

I’m wondering: does this skew toward SE and away from RI reflect biological reality, or could it reflect an algorithmic bias in rMATS (e.g., in how RI events are defined, filtered, or detected)?

I’d greatly appreciate any insights or references you might have on this topic.

Best regards,
Xiao
summary.txt

kutsc...@gmail.com

Jan 14, 2026, 8:58:20 AM
to rMATS User Group
rMATS will only detect retained intron events that are mostly annotated in the --gtf. The detection of RI events is described in this issue: https://github.com/Xinglab/rmats-turbo/issues/17

For rMATS to detect an RI event, it basically requires an annotated transcript in which one of the exons is actually the intron combined with its two flanking exons. Having some reads that map in the intron region isn't enough for rMATS to detect an RI event (but it will count those reads if the event is detected).

rMATS can detect novel SE events as long as the individual exons are annotated. Then rMATS just needs to see reads that support the different junctions.
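
To picture the SE case, here is a toy sketch in Python. It is not rMATS code. It assumes exons are 1-based inclusive (start, end) pairs from one gene and strand, that a junction is a (last upstream base, first downstream base) pair seen in reads, and that all three junctions (two inclusion, one skipping) must be observed. That last requirement is an assumption of this sketch; the thread does not spell out the exact rule. The function name is hypothetical.

```python
def find_se_candidates(exons, junctions):
    """Return (upstream, skipped, downstream) exon triples supported by junctions.

    exons: iterable of (start, end) annotated exons, 1-based inclusive.
    junctions: set of (donor_end, acceptor_start) pairs observed in reads.
    """
    ordered = sorted(set(exons))
    events = []
    for up in ordered:
        for skipped in ordered:
            # The skipped exon must lie downstream of the upstream exon
            # and be joined to it by an observed junction.
            if skipped[0] <= up[1] or (up[1], skipped[0]) not in junctions:
                continue
            for down in ordered:
                if down[0] <= skipped[1]:
                    continue
                inclusion = (skipped[1], down[0]) in junctions
                skipping = (up[1], down[0]) in junctions
                if inclusion and skipping:
                    events.append((up, skipped, down))
    return events


annotated = [(100, 200), (300, 400), (500, 600)]
observed = {(200, 300), (400, 500), (200, 500)}
print(find_se_candidates(annotated, observed))
```

In this toy model no transcript has to contain the skipping junction in the annotation: the exons are annotated individually and the junction evidence comes from the reads.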

Eric

X L

Jan 14, 2026, 9:58:09 AM
to rMATS User Group
Hi, Eric,

Thank you very much for your reply and for the reference to the post.

Regarding GTF-annotated intron retention events, do you mean the "retained_intron" transcript type specified in GTF files? For example, I can see these lines in my GTF:

chr7 HAVANA gene 98214624 98252232 . - . gene_id "ENSG00000205356.10"; gene_type "protein_coding"; gene_name "TECPR1"; level 2; hgnc_id "HGNC:22214"; havana_gene "OTTHUMG00000154273.7";
chr7 HAVANA transcript 98215516 98233842 . - . gene_id "ENSG00000205356.10"; transcript_id "ENST00000490842.5"; gene_type "protein_coding"; gene_name "TECPR1"; transcript_type "retained_intron"; transcript_name "TECPR1-217"; level 2; transcript_support_level "1"; hgnc_id "HGNC:22214"; havana_gene "OTTHUMG00000154273.7"; havana_transcript "OTTHUMT00000334663.1";
chr7 HAVANA exon 98215516 98215730 . - . gene_id "ENSG00000205356.10"; transcript_id "ENST00000490842.5"; gene_type "protein_coding"; gene_name "TECPR1"; transcript_type "retained_intron"; transcript_name "TECPR1-217"; exon_number 16; exon_id "ENSE00001899749.1"; level 2; transcript_support_level "1"; hgnc_id "HGNC:22214"; havana_gene "OTTHUMG00000154273.7"; havana_transcript "OTTHUMT00000334663.1";
......
If I understand you correctly, only those "retained_intron" transcript types can be detected in rMATS (--novelSS disabled).

Thanks,

Xiao

X L

Jan 14, 2026, 10:36:48 AM
to rMATS User Group

Hi Eric,

I have just noticed that even with GTF files lacking transcript_type annotations (e.g., no "retained_intron" specified), rMATS can still identify intron retention events (with --novelSS disabled).

From reading your post, you mentioned that rMATS requires these definitions to detect RI events:

  1. The exon to the left of the intron
  2. The exon to the right of the intron
  3. An exon which starts at the left boundary of the left exon and ends at the right boundary of the right exon
  4. A splice junction from the left exon to the right exon
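
The four criteria above can be sketched as a small check over annotated exon coordinates. This is only an illustration in Python, not the rMATS implementation. It assumes exons are (start, end) pairs from one gene and strand, and it simplifies criterion 4 by treating consecutive exons within any annotated transcript as the splice junction. The function name and the example coordinates are made up.

```python
def find_annotated_ri(transcripts):
    """Return RI candidates that satisfy the four criteria from annotation alone.

    transcripts: dict mapping transcript_id -> list of (start, end) exons.
    """
    all_exons = {exon for exons in transcripts.values() for exon in exons}

    # Criteria 1, 2 and 4 (simplified): a left and right exon that are
    # consecutive in some transcript, i.e. joined by an annotated junction.
    junction_pairs = set()
    for exons in transcripts.values():
        ordered = sorted(exons)
        for left, right in zip(ordered, ordered[1:]):
            junction_pairs.add((left, right))

    events = []
    for left, right in sorted(junction_pairs):
        # Criterion 3: one exon that starts exactly where the left exon
        # starts and ends exactly where the right exon ends.
        spanning = (left[0], right[1])
        if spanning in all_exons:
            events.append({"left": left, "right": right, "ri_exon": spanning})
    return events


exact = {"tx_spliced": [(100, 200), (300, 400)], "tx_retained": [(100, 400)]}
offset = {"tx_spliced": [(100, 200), (300, 400)], "tx_offset": [(150, 400)]}
print(find_annotated_ri(exact))   # one candidate
print(find_annotated_ri(offset))  # none: the spanning exon starts at 150, not 100
```

The second example is the situation raised in this message: an exon that overlaps the intron but whose boundaries do not match the flanking exons fails the exact-match test.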

I'm wondering if criterion 3 might be too strict. In some cases, there are exons that overlap with introns from other transcripts, but their boundaries don't exactly match the flanking exons of those introns. Under the current criterion, these would not be considered intron retention events.

I was thinking that even without criterion 3, we could still detect intron retention based on the other criteria, plus an additional requirement: detecting reads that span exon-intron boundaries (i.e., reads crossing exon-5'ss-intron or intron-3'ss-exon junctions). Would this be a viable alternative approach?
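
As a rough sketch of the boundary-read idea in Python: the function below counts unspliced aligned blocks that cross an exon-intron boundary with a minimum overhang on each side. Everything here is hypothetical. The function name, the 8-base overhang and the plain (start, end) read representation are assumptions for illustration, and none of it corresponds to an existing rMATS option.

```python
def count_boundary_reads(read_blocks, boundary, min_overhang=8):
    """Count contiguous aligned blocks that span a boundary.

    read_blocks: list of (start, end) 1-based inclusive aligned blocks.
    boundary: last base on the left side of the boundary (the last exon
        base at a 5' splice site, or the last intron base at a 3' splice
        site, reading left to right on the reference).
    min_overhang: bases required on each side of the boundary.
    """
    count = 0
    for start, end in read_blocks:
        covers_left = start <= boundary - min_overhang + 1
        covers_right = end >= boundary + min_overhang
        if covers_left and covers_right:
            count += 1
    return count


# Toy layout: left exon 100-200, intron 201-299, right exon 300-400.
reads = [(150, 249), (195, 294), (101, 200), (190, 210), (250, 349)]
print(count_boundary_reads(reads, boundary=200))  # exon -> intron side
print(count_boundary_reads(reads, boundary=299))  # intron -> exon side
```

A real implementation would also have to handle spliced reads, strand and multi-mapping, which this sketch ignores.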

Thanks,
Xiao

kutsc...@gmail.com

Jan 15, 2026, 9:02:59 AM
to rMATS User Group
The only keys that rMATS uses from the last column of the GTF (the key-value pairs) are gene_id, transcript_id, and gene_name. rMATS doesn't look at the transcript_type.
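
To illustrate, a minimal Python sketch that pulls just those three keys out of a GTF attribute column and ignores everything else, including transcript_type. This is not the rMATS parser. The function name is made up, and the simple split on ";" assumes no quoted value contains a semicolon.

```python
USED_KEYS = ("gene_id", "transcript_id", "gene_name")


def parse_used_attributes(attribute_column):
    """Extract only gene_id, transcript_id and gene_name from GTF column 9."""
    found = {}
    for field in attribute_column.strip().split(";"):
        field = field.strip()
        if not field:
            continue
        key, _, value = field.partition(" ")
        if key in USED_KEYS:
            # Keep the first occurrence and drop the surrounding quotes.
            found.setdefault(key, value.strip().strip('"'))
    return found


attrs = (
    'gene_id "ENSG00000205356.10"; transcript_id "ENST00000490842.5"; '
    'gene_type "protein_coding"; gene_name "TECPR1"; '
    'transcript_type "retained_intron"; level 2;'
)
print(parse_used_attributes(attrs))
```

Under this view, a transcript tagged "retained_intron" matters only through its exon coordinates, not through the tag.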

Criterion 3, which requires a large exon that includes both smaller exons and the intron, is quite strict. I don't think we're going to change the rMATS behavior. You can force rMATS to detect the RI events that you want by adding transcripts to the GTF. Alternatively, you could try https://github.com/Xinglab/siri, which is specifically for retained introns and accounts for some overlap of introns with exons.
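
A rough Python sketch of what adding such a transcript could look like: it writes a transcript line and a single exon line spanning from the start of the left exon to the end of the right exon. The function name, the "custom" source label, the IDs and the coordinates are all placeholders. Whether a one-exon transcript like this is enough for a particular dataset is an assumption to verify against the rMATS output.

```python
def make_ri_transcript_lines(chrom, strand, left_exon, right_exon,
                             gene_id, transcript_id, gene_name,
                             source="custom"):
    """Build GTF lines for a transcript whose one exon spans the intron.

    left_exon, right_exon: (start, end) in 1-based inclusive GTF coordinates.
    """
    if left_exon[1] >= right_exon[0]:
        raise ValueError("left exon must end before the right exon starts")
    start, end = left_exon[0], right_exon[1]
    attrs = (f'gene_id "{gene_id}"; transcript_id "{transcript_id}"; '
             f'gene_name "{gene_name}";')
    lines = []
    for feature in ("transcript", "exon"):
        fields = [chrom, source, feature, str(start), str(end),
                  ".", strand, ".", attrs]
        lines.append("\t".join(fields))
    return lines


# Placeholder coordinates and IDs, not taken from any real annotation.
for line in make_ri_transcript_lines("chr1", "+", (1000, 1200), (1500, 1700),
                                     "GENE_X", "GENE_X_RI1", "GENE_X"):
    print(line)
```

The gene_id should match the existing gene so that the new exon is grouped with the flanking exons, and the lines can be appended to a copy of the original GTF.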

Eric

X L

Jan 15, 2026, 5:21:33 PM
to rMATS User Group
Hi Eric,

Thank you for the clarification. I had initially understood that to detect intron retention, the "large exon" region needed to encompass both flanking exons and the intervening intron, with coordinates precisely matching the start of the upstream exon and the end of the downstream exon.

I have tried enabling the --novelSS parameter, which significantly increased the number of RI events detected (along with other event types; RI event counts are still the lowest). However, I've noticed that for these novel events in the rMATS report, the flanking exon coordinates often don't align with the reference GTF annotation, which is expected since they're labeled as novel.

I'm attaching a screenshot showing one such case, where rMATS fails to report an obviously significant RI event under default settings. When I enable --novelSS, it detects the event, but the riExonStart_0base and riExonEnd coordinates (indicated by the red bar in the attached screenshot) show that the riExonEnd doesn't correspond to an annotated exon boundary in the GTF.

While I agree that novel splicing events exist and some intron retention may be involved in novel events, in this case, the intron is located between well-annotated flanking exons, so most of the retention events here shouldn't be classified as novel.

Thank you,
Xiao

Screenshot 2026-01-15 at 5.05.47 PM.png