Re: [rMATS] Any suggestion to improve the analysis


Theo

Jan 26, 2025, 5:53:33 AM

to Praveenkumar R, rMATS User Group

Others are far more qualified to answer this, but I was wondering about a few things.
I am a bit confused about what additional information total RNA will provide for splicing events.

Did you perform ribosomal RNA depletion on your total RNA?

Which species is this?

On Fri, Jan 24, 2025 at 3:24 PM Praveenkumar R <barryp...@gmail.com> wrote:

I conducted paired-end total RNA sequencing in triplicate for both wild-type and drug-treated conditions. Each wild-type BAM file contains ~200 million mapped reads (aligned with STAR in two-pass mode, read length 65 nt), while the treatment BAM files have ~400 million mapped reads. I analyzed splicing events, focusing on exon skipping, using rMATS-turbo (comparison-1).

To assess reproducibility, I randomly downsampled the reads in each BAM file to 90%, 80%, and 50% and repeated the splicing analysis (comparison-2). I compared the skipped exons identified in comparison-1 and comparison-2 but still observed many skipped exons.

Lastly, I generated a reference annotation from the wild-type data with StringTie, StringTie merge, and GffCompare (-R -Q -M) and reanalyzed with rMATS. However, I noticed that many reads were discarded.

What could be causing these discrepancies or read loss?
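One way to put a number on how well the two comparisons agree is to key each skipped-exon event by its coordinates and compute the overlap of the significant sets. The following is a minimal sketch, not something from the original analysis: the column names are assumed to match the rMATS skipped-exon output table, the FDR cutoff of 0.05 is an arbitrary choice, and the rows are made up for illustration.

```python
import csv
import io

# Columns assumed to identify one skipped-exon event in an rMATS SE table.
KEY_COLUMNS = ("chr", "strand", "exonStart_0base", "exonEnd",
               "upstreamEE", "downstreamES")


def significant_events(handle, fdr_cutoff=0.05):
    """Return the set of event keys whose FDR is at or below the cutoff."""
    events = set()
    for row in csv.DictReader(handle, delimiter="\t"):
        if float(row["FDR"]) <= fdr_cutoff:
            events.add(tuple(row[col] for col in KEY_COLUMNS))
    return events


def jaccard(a, b):
    """Shared events divided by all events seen in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Made-up rows for illustration only; real input would be the SE output files.
HEADER = "\t".join(KEY_COLUMNS + ("FDR",))
full_table = "\n".join([
    HEADER,
    "chr1\t+\t100\t200\t50\t300\t0.01",
    "chr1\t+\t500\t600\t450\t700\t0.02",
    "chr2\t-\t900\t950\t800\t1000\t0.50",
])
down_table = "\n".join([
    HEADER,
    "chr1\t+\t100\t200\t50\t300\t0.03",
    "chr1\t+\t500\t600\t450\t700\t0.20",
    "chr3\t+\t10\t60\t5\t90\t0.01",
])

full_events = significant_events(io.StringIO(full_table))
down_events = significant_events(io.StringIO(down_table))
overlap = jaccard(full_events, down_events)
print(f"shared={len(full_events & down_events)} jaccard={overlap:.2f}")
```

Running the same comparison for each downsampling level (90%, 80%, 50%) would show whether agreement drops off gradually with depth or is already low at 90%.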

Example:

USED: 213349495

NOT_PAIRED: 0

NOT_NH_1: 0

NOT_EXPECTED_CIGAR: 2501633

NOT_EXPECTED_READ_LENGTH: 0

NOT_EXPECTED_STRAND: 0

EXON_NOT_MATCHED_TO_ANNOTATION: 188082831

JUNCTION_NOT_MATCHED_TO_ANNOTATION: 24397005

CLIPPED: 0

TOTAL_FOR_BAM: 428330964

WT_1.bam
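To see where the reads go, the counts above can be turned into percentages of TOTAL_FOR_BAM. This is a small sketch that only parses the text pasted above; the function name is hypothetical and nothing here calls rMATS itself.

```python
# Read outcome counts copied from the WT_1.bam example above.
READ_OUTCOMES = """\
USED: 213349495
NOT_PAIRED: 0
NOT_NH_1: 0
NOT_EXPECTED_CIGAR: 2501633
NOT_EXPECTED_READ_LENGTH: 0
NOT_EXPECTED_STRAND: 0
EXON_NOT_MATCHED_TO_ANNOTATION: 188082831
JUNCTION_NOT_MATCHED_TO_ANNOTATION: 24397005
CLIPPED: 0
TOTAL_FOR_BAM: 428330964
"""


def parse_read_outcomes(text):
    """Parse 'NAME: count' lines into a dict of integers."""
    counts = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counts[name.strip()] = int(value.strip())
    return counts


counts = parse_read_outcomes(READ_OUTCOMES)
total = counts.pop("TOTAL_FOR_BAM")

# Percentage of all alignments in each category, largest first.
percentages = {name: 100 * n / total for name, n in counts.items()}
for name, pct in sorted(percentages.items(), key=lambda kv: -kv[1]):
    if pct > 0:
        print(f"{name:<36}{pct:5.1f}%")
```

For this file the categories add up exactly to TOTAL_FOR_BAM, so about half of the alignments are used and nearly all of the rest fall under EXON_NOT_MATCHED_TO_ANNOTATION.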

My questions:

1.     Does rMATS take the library depth and the transcript expression level into account?

2.     How well can rMATS handle an annotation file that was built with GffCompare?

3.     Is there any quality check that I can do, or any room for improvement?


Praveenkumar R

Jan 26, 2025, 8:25:27 AM

to rMATS User Group

Hi,

Thanks for your reply. Sorry, I forgot to add that information.

1. I understand your concern, but total RNA lets you capture almost all possible splicing events.

The advantage is that you are not restricted to mature transcripts. That is also the disadvantage.

2. Yes, all the samples were ribo-depleted.

If you need more clarification, please feel free to ask me.

Thanks

kutsc...@gmail.com

Jan 27, 2025, 9:56:20 AM

to rMATS User Group

1. rMATS doesn't consider the total library depth. For each splicing event, rMATS counts the reads that support the inclusion isoform (IJC_SAMPLE_1, IJC_SAMPLE_2) and the reads that support the skipping isoform (SJC_SAMPLE_1, SJC_SAMPLE_2). It then runs a statistical test on those counts to get a p-value for that event.

2. From https://ccb.jhu.edu/software/stringtie/gffcompare.shtml
> -M discard (ignore) single-exon transfrags and reference transcripts (i.e. consider only multi-exon transcripts)

The rMATS output shows that about 44% of the alignments were filtered for EXON_NOT_MATCHED_TO_ANNOTATION. Based on that description, I think the output GTF from GffCompare -M might not have any transcripts in regions where the alignments don't cover a splice junction. Since your reads are only 65 nt, most of them might not include a junction.
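To make point 1 concrete, here is a rough sketch of how an inclusion level can be derived from the per-event counts. It assumes the length-normalized ratio of inclusion to skipping reads, with the effective lengths taken from the IncFormLen and SkipFormLen columns of the rMATS output; the function name and the numbers are illustrative only, and this is not the rMATS code itself.

```python
def inclusion_level(ijc, sjc, inc_form_len, skip_form_len):
    """Length-normalized fraction of reads supporting the inclusion isoform.

    ijc, sjc: inclusion and skipping read counts for one sample.
    inc_form_len, skip_form_len: effective lengths of the two isoforms
    (assumed to correspond to IncFormLen and SkipFormLen).
    Returns None when the event has no supporting reads.
    """
    inc = ijc / inc_form_len
    skip = sjc / skip_form_len
    if inc + skip == 0:
        return None
    return inc / (inc + skip)


# Illustrative numbers: the same event at 1x and at 10x sequencing depth.
shallow = inclusion_level(ijc=30, sjc=10, inc_form_len=2, skip_form_len=1)
deep = inclusion_level(ijc=300, sjc=100, inc_form_len=2, skip_form_len=1)
print(shallow, deep)
```

Only the reads assigned to the event enter the ratio, so scaling both counts by the same factor leaves the estimate unchanged. Depth still matters indirectly: with more reads per event the statistical test has more evidence to work with.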

Eric

Praveenkumar R

Jan 27, 2025, 10:50:19 AM

to rMATS User Group

Hi,

Yeah, you are right. The read length is a bit short for splicing analysis, but I have decent depth.