Different read length rMATS.3.2.4

Adrian Johnson

Aug 10, 2016, 12:40:26 PM

to rMATS User Group

Hello MATS group:

I have used MATS before (on non-strand-specific runs), am fairly comfortable using it, and have published results with it.

However, our facility is now producing strand-specific runs, and I see different read lengths in the TopHat2-aligned BAM files.

The lengths range from 60 to 130 bp.

I have two questions:

1. Can rMATS.3.2.4 deal with different read lengths?

2. If not, what should I do?

Should I trim the reads to 100 bp at the FASTQ level, align them, and then use the BAM files with the read length option set to 100?

Could anyone advise?

Thanks a lot.

-Adrian

Juw Won Park

Sep 2, 2016, 5:46:04 PM

to rMATS User Group

Hello,

As you described in your 2nd option, you can trim the reads using the trimFastq.py script that comes with rMATS.
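To illustrate what fixed-length trimming involves, here is a minimal sketch in Python. It is not the trimFastq.py script itself, whose interface is not shown in this thread; the function name `trim_fastq` is hypothetical, and dropping reads shorter than the target length is an assumption made here so that every output read has the same length.

```python
def trim_fastq(lines, length):
    """Trim each 4-line FASTQ record to `length` bases.

    Assumption (not taken from rMATS): reads shorter than `length`
    are dropped so that all remaining reads have a uniform length.
    """
    out = []
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            header, seq, plus, qual = record
            if len(seq) >= length:
                # Cut the sequence and its quality string to the same length.
                out.extend([header, seq[:length], plus, qual[:length]])
            record = []
    return out


example = ["@r1", "ACGTACGT", "+", "IIIIIIII",
           "@r2", "ACG", "+", "III"]
trimmed = trim_fastq(example, 5)
```

With the two example records above, only `@r1` is long enough to survive, and it is cut to its first 5 bases. For paired-end data the two mates would need to be filtered together so the files stay in sync, which this sketch does not handle.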

Thank you,

Juw Won

Yaoi T

Sep 3, 2016, 12:47:28 AM

to rMATS User Group

Hi, Juw Won,

I have analyzed our RNA-seq data, which were all generated at the same facility with read lengths of about 100 bp.

For those data, Adrian's 2nd option has worked in an analysis using rMATS-3.2.5 with STAR-2.5.2.

Recently I have been trying to compare our experimental data with several datasets fetched from SRA.

In one curious experiment among them, the data are paired-end (PE) but the read length is short: 36 bp!

On the face of it, comparing them with our trimmed data succeeded, and many AS events were reported.

I am sure, however, that these results are less reliable, for the following reason.

Afterwards, I trimmed my own RNA-seq data to 36 bp, ran rMATS-3.2.5 on them, and compared the output with the result from the same data trimmed to 90 bp (the data were 2 groups, each with 3 biological replicates).

As a result, the number of detected AS events differed between the two runs, and there were few AS events in common.

I guess that, for Adrian's 2nd option to succeed, we need to understand the adequate read length and the -a option setting for rMATS.

Moreover, I think the STAR option settings probably need tuning in some cases.

The rMATS releases have not included option settings for the aligner. That is very convenient for analyzing RNA-seq data generated at the same facility.

On the other hand, to make use of the huge amount of data in SRA and elsewhere, rMATS users need to understand the alignment steps and the rMATS program better.

If you could provide such information, I would be happy, as would other users who are new to bioinformatics.

Of course, simply pointing us to good, reliable, informative sites would be enough.

Thanks,

Yaoi T

On Saturday, September 3, 2016 at 6:46:04 AM UTC+9, Juw Won Park wrote:

Juw Won Park

Sep 9, 2016, 5:56:03 PM

to rMATS User Group

Hello,

rMATS can run in 2 modes (with FASTQ files or with BAM files). In the 2nd mode, users can use the aligner of their choice (including STAR, HISAT, or TopHat2) with any options they prefer. Since the mapping can be done in parallel, this should also give a good speedup.

For the adequate read length, I would examine the read length distribution using a tool such as FastQC, then select a length that covers >95% of the total reads.
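The length selection described above could be sketched roughly as follows in Python. This is only one reading of "covers >95% of the total reads": it is assumed here to mean the longest length L such that more than 95% of reads are at least L bases long, so that trimming to L keeps more than 95% of them. The function names `read_lengths` and `pick_trim_length` are hypothetical and are not part of rMATS or FastQC.

```python
from collections import Counter


def read_lengths(fastq_lines):
    """Yield the sequence length of each record in a 4-line FASTQ stream."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the second line of each record is the sequence
            yield len(line.strip())


def pick_trim_length(lengths, coverage=0.95):
    """Return the longest length that more than `coverage` of reads reach.

    Assumption: a read "covers" a length L if it is at least L bases long.
    """
    counts = Counter(lengths)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads given")
    kept = 0
    # Walk from the longest length down, accumulating the reads that are
    # at least this long, and stop once enough reads are covered.
    for length in sorted(counts, reverse=True):
        kept += counts[length]
        if kept / total > coverage:
            return length
    return min(counts)


# Toy distribution: 50 reads of 130 bp, 46 of 100 bp, 4 of 60 bp.
toy = [130] * 50 + [100] * 46 + [60] * 4
chosen = pick_trim_length(toy)
```

In the toy distribution, only 50% of reads reach 130 bp, while 96% reach 100 bp, so 100 is the length chosen. On real data the lengths would come from something like `read_lengths(open("sample.fastq"))`, where the file name is a placeholder.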