Beta release of rmats-turbo: a much faster and slimmer version of rMATS!

Yi Xing

May 9, 2017, 1:22:45 PM
to rmats-us...@googlegroups.com
Dear rMATS Users:

We are happy to announce the beta release of rMATS-turbo: a much faster and slimmer version of rMATS. rMATS-turbo achieves a significant gain in computational speed and data storage efficiency. Compared to rMATS 3.2.5, the counting procedure is 20-100 times faster, the statistical test is 300-500 times faster, and the size of the intermediate files is ~1000 times smaller. We are currently releasing rMATS-turbo for beta testing as a stand-alone Docker container. Feedback and bug reports are welcome.


Yours,
Yi Xing

Brian Cole

May 10, 2017, 12:43:32 PM
to rMATS User Group
Great work!  Were some of the bits recoded into a lower-level language?

Exciting to see rMATS continue development. 

Thanks,
Brian S. Cole, PhD
Institute for Biomedical Informatics
University of Pennsylvania Perelman School of Medicine

YI XING

May 10, 2017, 12:46:05 PM
to Brian Cole, rMATS User Group
Definitely. The entire program was completely rewritten in C and Cython.

--
You received this message because you are subscribed to the Google Groups "rMATS User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rmats-user-group+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/rmats-user-group/dd043fc1-03a2-4ed3-9a81-f113c2ea82bb%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Yi Xing

May 10, 2017, 8:11:27 PM
to rMATS User Group
Attached is a speed test of rMATS-turbo vs rMATS 3.2.5, running on a single computing node. 
rmats-turbo-speed-test.jpg

shi Ya

May 12, 2017, 1:53:31 AM
to rMATS User Group
Dear Yi Xing and rMATS users

I failed to run the new version using FASTQ files (I have not tried BAM files).
Running the following command returned an error message that I cannot understand.
The same FASTQ files and GTF file were used for an analysis with rMATS 3.2.5, which succeeded.

$ sudo docker run -v /tools/rMATSturbo/testData:/data rmats:turbo01 --s1 /data/s1.txt --s2 /data/s2.txt --gtf /data/genes.gtf -t paired --readLength 90 --bi /data/mm10 --cstat 0.001 --od /data/output
[sudo] password for shi: 
mapping the first sample
Traceback (most recent call last):
  File "/rmats-turbo/rmats.py", line 302, in <module>
    main()
  File "/rmats-turbo/rmats.py", line 270, in main
    args = get_args()
  File "/rmats-turbo/rmats.py", line 142, in get_args
    args.b1, args.b2 = doSTARMapping(args)
  File "/rmats-turbo/rmats.py", line 33, in doSTARMapping
    map_folder = os.path.join(args.tmp, 'bam%d_%d' % (i+1, rr+1));
AttributeError: 'Namespace' object has no attribute 'tmp'



s1.txt contains the following line:
/data/011_1.fastq,/data/011_2.fastq,/data/012_1.fastq,/data/012_2.fastq,/data/027_1.fastq,/data/027_2.fastq
and s2.txt contains:
/data/021_1.fastq,/data/021_2.fastq,/data/022_1.fastq,/data/022_2.fastq,/data/027_1.fastq,/data/028_2.fastq

mm10 star index was used as genes.gtf.

Because I am new to Docker CE, I may have made a mistake in the setup.

I am looking forward to any comments or advice.
Thanks,

shi Ya


On Wednesday, May 10, 2017 at 2:22:45 AM UTC+9, Yi Xing wrote:

sandeep n

May 30, 2017, 1:52:25 PM
to rMATS User Group, Anil.Ke...@jax.org
Hi Yi,

This is great work! Would it be possible to release standalone rMATS-turbo code, instead of docker version? We would like to run rMATS on our cluster (which doesn't support Docker).

Thanks,
Sandeep


On Tuesday, May 9, 2017 at 1:22:45 PM UTC-4, Yi Xing wrote:

Dario Strbenac

Jun 19, 2017, 9:00:13 PM
to rMATS User Group
The user documentation needs more explanations about installing and running the image. For example, the commands suggested didn't work for me unless I put sudo in front of them. Some users might be using Docker for the first time, so better explanations would be valuable.

Imam Toufique

Jun 28, 2017, 7:40:47 PM
to rMATS User Group, Anil.Ke...@jax.org
Hi, same issue here. We need to run this on a cluster, so if you could please release the source code with some basic build instructions, that would be very helpful.

thanks.

Elmira Forouzmand

Jun 29, 2017, 5:26:31 PM
to rMATS User Group
Hi. I'm having the same problem. I think there's a bug in rmats.py. I have not tried yet because I cannot change the version we have installed on our cluster,  but if we move this line:
args.tmp = tempfile.mkdtemp()
to get_args it might work.
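
The traceback posted earlier in the thread is consistent with this diagnosis: `doSTARMapping` reads `args.tmp` while `get_args` is still running, before anything has assigned it. A minimal sketch of the ordering problem and the proposed fix follows. It is not the actual rmats.py source; `do_mapping`, `get_args_broken`, and `get_args_fixed` are hypothetical stand-ins.

```python
import argparse
import os
import tempfile


def do_mapping(args):
    # Hypothetical stand-in for doSTARMapping: it needs args.tmp to build a path.
    return os.path.join(args.tmp, 'bam%d_%d' % (1, 1))


def get_args_broken(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('--s1')
    args = parser.parse_args(argv)
    # args.tmp has not been assigned yet, so this raises AttributeError.
    return do_mapping(args)


def get_args_fixed(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('--s1')
    args = parser.parse_args(argv)
    # Create the temporary directory before anything reads args.tmp.
    args.tmp = tempfile.mkdtemp()
    return do_mapping(args)


if __name__ == '__main__':
    try:
        get_args_broken(['--s1', 's1.txt'])
    except AttributeError as err:
        print('broken order:', err)
    print('fixed order:', get_args_fixed(['--s1', 's1.txt']))
```

The only change between the two functions is where `args.tmp` is assigned, which is the move suggested above.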

Daniele Ottaviani

Jul 8, 2017, 7:24:25 AM
to rMATS User Group
Dear Yi Xing and rMATS users,

I'm trying to test rMATS-turbo because rMATS is still running after a week, I suppose because it can't use multiple cores.
I'm comparing 3 control replicates vs. 3 cases (.bam files).
rMATS-turbo is working well and fast :-) However, it doesn't report any novel A3SS/A5SS events, which are exactly what I'm interested in.
Is there an option like -novelSS, as in rMATS?
Temporary files from the rMATS run that is still in progress confirm that there are novel A3SS/A5SS events, so the input files shouldn't be the problem.
If possible, it would also be great to be able to run rMATS on multiple cores.

Thanks in advance for any guess and help!

Best wishes,

Daniele

YI XING

Jul 20, 2017, 8:54:17 PM
to Daniele Ottaviani, rMATS User Group

Hi Daniele,

Have you finished your rMATS 3.2.5 run on your data? 3 vs 3 is not a big dataset, and rMATS 3.2.5 should not take more than a couple of days to finish at most. rMATS-turbo currently doesn't look for novel splice sites/exons, but one workaround is to run a transcript assembly tool (e.g. StringTie) on your RNA-seq data and existing GTF file to update your GTF file, and then feed the new GTF file to rMATS-turbo.

 

Yi
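
The workaround above can be sketched as a small helper that builds the StringTie command lines: one guided assembly per BAM file, then a merge with the reference annotation. This is only an illustration under stated assumptions; the helper name `stringtie_commands`, the file names, and the `.stringtie.gtf` suffix are hypothetical, and nothing is executed.

```python
def stringtie_commands(bam_files, reference_gtf, merged_gtf):
    """Build StringTie command lines as argument lists; nothing is run here."""
    commands = []
    sample_gtfs = []
    for bam in bam_files:
        # Hypothetical naming scheme for the per-sample assembly output.
        sample_gtf = bam.rsplit('.', 1)[0] + '.stringtie.gtf'
        sample_gtfs.append(sample_gtf)
        # Assemble transcripts for one sample, guided by the existing GTF.
        commands.append(['stringtie', bam, '-G', reference_gtf, '-o', sample_gtf])
    # Merge the per-sample assemblies with the reference into one updated GTF.
    commands.append(['stringtie', '--merge', '-G', reference_gtf,
                     '-o', merged_gtf] + sample_gtfs)
    return commands


if __name__ == '__main__':
    for cmd in stringtie_commands(['ctrl1.bam', 'case1.bam'],
                                  'genes.gtf', 'merged.gtf'):
        print(' '.join(cmd))
```

The merged file would then be passed to rMATS-turbo through `--gtf`. Note that StringTie expects coordinate-sorted BAM input.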


Roman Spurdo

Jul 31, 2017, 1:27:07 AM
to rMATS User Group
Hi, yes, we need to move
args.tmp = tempfile.mkdtemp() into the get_args method, before doSTARMapping(args) is called.
But how do I rebuild the installation package for Docker? After changing the file I got an error about checksums.

Roman Spurdo

Aug 7, 2017, 10:34:18 AM
to rMATS User Group
All right, I solved this problem by replacing rmats.py in
/var/lib/docker/overlay2/d86b95c543f6e9a6c7fabf99394fde1d424785303bbc11b4a571d2a4ca148c96/diff/rmats-turbo using su privileges.
I am attaching the edited file.
So now I no longer get the error about the attribute 'tmp', but I get another one: the system can't find STAR. I installed
https://hub.docker.com/r/stevetsa/star/ using the docker pull stevetsa/star command, but I still get this error and have no idea why. As written in the manual, I need to install a list of software into the image, but it seems that I'm installing it in Docker as an independent image.
rmats.py

Roman Spurdo

Aug 12, 2017, 5:37:06 AM
to rMATS User Group
Small update: I understood that I don't need to install this separate image with STAR; I need to install STAR into the existing image. I tried to do it via docker run -it ubuntu bash and then installing all the required packages, but after exiting the container my changes were lost. I tried to commit it, with no result.
I had absolutely no problems installing the regular version of rMATS, because there is a very detailed installation manual. But I'm having a lot of problems with turbo. It would be great if someone wrote a user-friendly installation manual.
Thank you.

Royden Clark

Aug 23, 2017, 6:04:57 PM
to rMATS User Group
Also would like to install this on a cluster that does not yet support Docker. Is this available?

diy...@berkeley.edu

Oct 31, 2017, 2:26:12 PM
to rMATS User Group
Hi, I'm also interested in installing on a cluster that doesn't support Docker. Do you have the Dockerfile that you used to create the .tar.gz that you have available for download?

Thank you!


On Tuesday, May 9, 2017 at 10:22:45 AM UTC-7, Yi Xing wrote:

Xi Wang

Nov 30, 2017, 8:41:05 AM
to rMATS User Group
Prof. Xing, great work!

I just encountered a problem when trying to run rMATS. I checked the Unicode type of my Python and found that it is UCS4. However, when I run the UCS4 version, I get this error message:

Traceback (most recent call last):
  File "rMATS-turbo-Linux-UCS4/rmats.py", line 17, in <module>
    from rmatspipeline import run_pipe
ImportError: /home/xiwang/tools/rMATS/rMATS.4.0.1/rMATS-turbo-Linux-UCS4/rmatspipeline.so: undefined symbol: PyUnicodeUCS4_FromStringAndSize


I hope you can give me a clue about how to fix it.


Thanks a lot! 

gwle...@gmail.com

Nov 30, 2017, 10:07:50 AM
to rMATS User Group
Thanks for your nice work!
The new version of rMATS is very fast, but I ran into trouble when analyzing Mus musculus RNA-seq data. Here is my command:

python rMATS-turbo-Linux-UCS4/rmats.py --b1 b1.txt --b2 b2.txt --gtf Mus_musculus.GRCm38.90.gtf --od ./result -t single --nthread 10 --readLength 100 --tstat 10

There are 52636 distinct gene ID in the gtf file
There are 131195 distinct transcript ID in the gtf file
There are 33274 one-transcript genes in the gtf file
There are 782519 exons in the gtf file
There are 24967 one-exon transcripts in the gtf file
There are 20503 one-transcript genes with only one exon in the transcript
Average number of transcripts per gene is 2.492496
Average number of exons per transcript is 5.964549
Average number of exons per transcript excluding one-exon tx is 7.131378
Average number of gene per geneGroup is 7.225076

==========
Done processing each gene from dictionary to compile AS events
Found 17008 exon skipping events
Found 726 exon MX events
Found 7287 alt SS events
There are 4631 alt 3 SS events and 2656 alt 5 SS events.
Found 3554 RI events
==========

Running the statistical part.
The statistical part is done.
Done.


The data are unstranded and single-end. I have tried both HISAT2 and STAR alignments, and the GTF file was downloaded from Ensembl.
rMATS seems to run without reporting any errors; however, the statistical part took an unusually short time.
When I checked the output files (XXX.MATS.JC/JCEC.txt), they all contained only the header line, with no splicing events recorded (see below):
bash-4.1$ wc *
      1      23     230 A3SS.MATS.JCEC.txt
      1      23     230 A3SS.MATS.JC.txt
      1      23     230 A5SS.MATS.JCEC.txt
      1      23     230 A5SS.MATS.JC.txt
   4632   50952  451249 fromGTF.A3SS.txt
   2657   29227  258370 fromGTF.A5SS.txt
    727    9451   83346 fromGTF.MXE.txt
      1      11     102 fromGTF.novelEvents.A3SS.txt
      1      11     102 fromGTF.novelEvents.A5SS.txt
      1      13     140 fromGTF.novelEvents.MXE.txt
      1      11     108 fromGTF.novelEvents.RI.txt
      1      11     104 fromGTF.novelEvents.SE.txt
   3555   39105  346102 fromGTF.RI.txt
  17009  187099 1668842 fromGTF.SE.txt
      1       7      78 JCEC.raw.input.A3SS.txt
      1       7      78 JCEC.raw.input.A5SS.txt
      1       7      78 JCEC.raw.input.MXE.txt
      1       7      78 JCEC.raw.input.RI.txt
      1       7      78 JCEC.raw.input.SE.txt
      1       7      78 JC.raw.input.A3SS.txt
      1       7      78 JC.raw.input.A5SS.txt
      1       7      78 JC.raw.input.MXE.txt
      1       7      78 JC.raw.input.RI.txt
      1       7      78 JC.raw.input.SE.txt
      1      25     268 MXE.MATS.JCEC.txt
      1      25     268 MXE.MATS.JC.txt
      1      23     236 RI.MATS.JCEC.txt
      1      23     236 RI.MATS.JC.txt
      1      23     232 SE.MATS.JCEC.txt
      1      23     232 SE.MATS.JC.txt

-bash-4.1$ cat A3SS.MATS.JC.txt
ID GeneID geneSymbol chr strand longExonStart_0base longExonEnd shortES shortEE flankingES flankingEE ID IJC_SAMPLE_1 SJC_SAMPLE_1 IJC_SAMPLE_2 SJC_SAMPLE_2 IncFormLen SkipFormLen PValue FDR IncLevel1 IncLevel2 IncLevelDifference

I am sure there is nothing wrong with the BAM and GTF files, because I got a normal result with the previous version, rMATS 3.5.0. What's more, my colleague runs the new rMATS-turbo successfully on human RNA-seq data. So I am wondering whether there is a bug in the new rMATS-turbo when analyzing mouse RNA-seq data.
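
Because the run finishes without an error, the symptom is easy to miss. A minimal sketch of a check that lists result files containing nothing but a header line is shown below. The helper name `header_only_files` is hypothetical, and the default glob pattern is an assumption based on the file names in the listing above.

```python
import glob
import os


def header_only_files(output_dir, pattern='*.MATS.*.txt'):
    """Return names of matching files that contain at most one line."""
    flagged = []
    for path in sorted(glob.glob(os.path.join(output_dir, pattern))):
        with open(path) as handle:
            line_count = sum(1 for _ in handle)
        # One line means only the column header was written.
        if line_count <= 1:
            flagged.append(os.path.basename(path))
    return flagged


if __name__ == '__main__':
    # './result' mirrors the --od value used in the command above.
    for name in header_only_files('./result'):
        print('header only:', name)
```

Running this against the directory given to `--od` would report every `*.MATS.JC.txt` and `*.MATS.JCEC.txt` file that has no events.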

On Wednesday, May 10, 2017 at 1:22:45 AM UTC+8, Yi Xing wrote:

habo...@uci.edu

Dec 5, 2017, 3:51:02 PM
to rMATS User Group
Hello,

I am currently having the same issue as Yi Xing while trying to run rMATS 4.0.1. Do we know what the issue is?

No errors reported but all splice files are empty except the headers. FromGTF files contain data. I am running human data aligned with STAR and using pre-made and downloaded GTF. 

Output from script: 

There are 26485 distinct gene ID in the gtf file
There are 58259 distinct transcript ID in the gtf file
There are 15188 one-transcript genes in the gtf file
There are 535393 exons in the gtf file
There are 5729 one-exon transcripts in the gtf file
There are 3772 one-transcript genes with only one exon in the transcript
Average number of transcripts per gene is 2.199698
Average number of exons per transcript is 9.189876
Average number of exons per transcript excluding one-exon tx is 10.083076
Average number of gene per geneGroup is 261.649536

==========
Done processing each gene from dictionary to compile AS events
Found 11466 exon skipping events
Found 1106 exon MX events
Found 7152 alt SS events
There are 4065 alt 3 SS events and 3087 alt 5 SS events.
Found 1139 RI events
==========

Running the statistical part.
The statistical part is done.
Done.

Resulting Files (all files but FromGTF just contain the headers):

230 Dec  5 11:52 A3SS.MATS.JCEC.txt
230 Dec  5 11:52 A3SS.MATS.JC.txt
230 Dec  5 11:52 A5SS.MATS.JCEC.txt
230 Dec  5 11:52 A5SS.MATS.JC.txt
331K Dec  5 11:35 fromGTF.A3SS.txt
251K Dec  5 11:35 fromGTF.A5SS.txt
110K Dec  5 11:35 fromGTF.MXE.txt
102 Dec  5 11:35 fromGTF.novelEvents.A3SS.txt
102 Dec  5 11:35 fromGTF.novelEvents.A5SS.txt
140 Dec  5 11:35 fromGTF.novelEvents.MXE.txt
108 Dec  5 11:35 fromGTF.novelEvents.RI.txt
104 Dec  5 11:35 fromGTF.novelEvents.SE.txt
92K Dec  5 11:35 fromGTF.RI.txt
938K Dec  5 11:35 fromGTF.SE.txt
78 Dec  5 11:52 JCEC.raw.input.A3SS.txt
78 Dec  5 11:52 JCEC.raw.input.A5SS.txt
78 Dec  5 11:52 JCEC.raw.input.MXE.txt
78 Dec  5 11:52 JCEC.raw.input.RI.txt
78 Dec  5 11:52 JCEC.raw.input.SE.txt
78 Dec  5 11:52 JC.raw.input.A3SS.txt
78 Dec  5 11:52 JC.raw.input.A5SS.txt
78 Dec  5 11:52 JC.raw.input.MXE.txt
78 Dec  5 11:52 JC.raw.input.RI.txt
78 Dec  5 11:52 JC.raw.input.SE.txt
268 Dec  5 11:52 MXE.MATS.JCEC.txt
268 Dec  5 11:52 MXE.MATS.JC.txt
236 Dec  5 11:52 RI.MATS.JCEC.txt
236 Dec  5 11:52 RI.MATS.JC.txt
232 Dec  5 11:52 SE.MATS.JCEC.txt
232 Dec  5 11:52 SE.MATS.JC.txt

I had submitted my job as the following script: 

#!/bin/bash
#$ -N dnLEF1
#$ -q som
# asom,free*,pub*
#$ -pe openmp 12
#$ -m beas

module load STAR/2.5.2a
module load samtools/1.3
#module load enthought_python/7.3.2
module load rMATS/4.0.1
SH=/som/habowski
cd ${SH}/rMATS/rMATS.4.0.1

#python ${SH}/rMATS/rMATS.4.0.1/rMATS-turbo-Linux-UCS2/rmats.py \

/usr/bin/time -v rmats.py \
--b1 \
mock.txt \
--b2 \
dnLEF1.txt \
--nthread 8 \
--tstat 8 \
--gtf ${SH}/genome/Hg38/hg38.gtf \
--od ${SH}/rMATS/dnLEF1 \
-t single --readLength 100

Yongqiang Xing

Dec 10, 2017, 5:27:59 PM
to rMATS User Group
I have the same problem when analyzing human RNA-seq data. I also checked the chromosome names and confirmed that they are consistent between the GTF and BAM files. When I analyze these data with rMATS 3.2.5, the results are good. So far I have not found the reason, so I am very confused. I also analyzed Arabidopsis alternative splicing with the turbo version, and those results are normal. I would appreciate help from the authors or from anyone who has had the same experience.
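
For anyone repeating the chromosome-name check described above, here is a minimal sketch that compares the names in a BAM header with the first column of a GTF file. It works on plain text (for example the output of `samtools view -H`), and all function names are hypothetical.

```python
def bam_header_chromosomes(sam_header_text):
    """Collect the SN: names from the @SQ lines of a SAM/BAM header."""
    names = set()
    for line in sam_header_text.splitlines():
        if line.startswith('@SQ'):
            for field in line.split('\t')[1:]:
                if field.startswith('SN:'):
                    names.add(field[3:])
    return names


def gtf_chromosomes(gtf_text):
    """Collect the first column of every non-comment GTF line."""
    names = set()
    for line in gtf_text.splitlines():
        if line and not line.startswith('#'):
            names.add(line.split('\t', 1)[0])
    return names


def missing_from_bam(sam_header_text, gtf_text):
    """Return GTF chromosome names that do not appear in the BAM header."""
    return sorted(gtf_chromosomes(gtf_text) - bam_header_chromosomes(sam_header_text))


if __name__ == '__main__':
    header = '@HD\tVN:1.4\n@SQ\tSN:chr1\tLN:248956422\n'
    gtf = 'chr1\tsrc\texon\t1\t10\t.\t+\t.\tgene_id "g1";\n'
    print(missing_from_bam(header, gtf))
```

An empty list means every chromosome named in the GTF also exists in the BAM header; a mismatch such as `1` versus `chr1` would show up here.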

On Tuesday, December 5, 2017 at 2:51:02 PM UTC-6, habo...@uci.edu wrote:

mamy.andriant...@gmail.com

Dec 11, 2017, 1:16:14 PM
to rMATS User Group
Hello,

The same issue for me too

Mamy

Zhijie Xie (Jay)

Dec 12, 2017, 2:58:22 AM
to rMATS User Group
Hi all,

Is there any overlap between your --b1 and --b2 files?

Thanks
Zhijie Xie
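
One way to answer this question mechanically is sketched below. It assumes each sample file holds a single comma-separated line of paths, as the s1.txt and s2.txt lines quoted earlier in the thread do; the function names are hypothetical.

```python
def parse_sample_list(text):
    """Split the comma-separated paths of a sample list file into a list."""
    return [path.strip() for path in text.split(',') if path.strip()]


def shared_paths(group1_text, group2_text):
    """Return the paths that appear in both sample lists, sorted."""
    return sorted(set(parse_sample_list(group1_text)) &
                  set(parse_sample_list(group2_text)))


if __name__ == '__main__':
    b1 = 'mock_r1.bam,mock_r2.bam,mock_r3.bam\n'
    b2 = 'kd_r1.bam,kd_r2.bam,mock_r3.bam\n'
    print(shared_paths(b1, b2))
```

Applied to the s1.txt and s2.txt lines quoted earlier in the thread, this check would flag /data/027_1.fastq, which appears in both lists.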

On Monday, December 11, 2017 at 10:16:14 AM UTC-8, mamy.andriant...@gmail.com wrote:

habo...@uci.edu

Dec 12, 2017, 5:19:33 AM
to rMATS User Group
Hello, 

Nope, there is no overlap in the b1 and b2 files. They each have three different bam files listed (mock files versus kd). 

Zhijie Xie (Jay)

Dec 12, 2017, 10:17:24 AM
to rMATS User Group
Hi Xi,

Could you run rMATS-turbo-Linux-UCS2? There may be an issue with your environment variables. It looks like you're actually running Python with UCS2.

Thanks
Zhijie Xie
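
A quick way to see which Unicode build an interpreter uses is to look at `sys.maxunicode`. The sketch below should be run with the same `python` that launches rmats.py; the helper name `unicode_build` is hypothetical. The UCS2/UCS4 distinction only exists on Python 2 and on Python 3 before 3.3; later versions always report the wide value.

```python
import sys


def unicode_build(maxunicode=sys.maxunicode):
    """Return 'UCS4' for a wide Unicode build and 'UCS2' for a narrow one."""
    return 'UCS4' if maxunicode > 0xFFFF else 'UCS2'


if __name__ == '__main__':
    # Print the interpreter path too, since PATH or module settings may
    # select a different python than the one that was tested by hand.
    print(sys.executable, unicode_build())
```

If this prints UCS2 for the interpreter that actually runs the job, the UCS2 package is the one to use.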

On Thursday, November 30, 2017 at 5:41:05 AM UTC-8, Xi Wang wrote:

Zhijie Xie (Jay)

Dec 12, 2017, 10:46:20 AM
to rMATS User Group
Hello,

Could you please show me your BAM file? About 100 lines of reads is more than enough.
Also, could you please re-run your analysis in paired mode? It would be helpful to know whether rMATS-turbo works well on your dataset in paired mode.

Thanks.
Zhijie Xie


On Thursday, November 30, 2017 at 7:07:50 AM UTC-8, gwle...@gmail.com wrote:

Zhijie Xie (Jay)

Dec 12, 2017, 7:13:17 PM
to rMATS User Group
Could you please show me your BAM file? About 100 lines of reads should be enough.
Also, could you please re-run your analysis in paired mode? It would be helpful to know whether rMATS-turbo works well on your dataset in paired mode.

Thanks

On Tuesday, December 5, 2017 at 12:51:02 PM UTC-8, habo...@uci.edu wrote:

habo...@uci.edu

Dec 12, 2017, 7:31:45 PM
to rMATS User Group
Hello again, 

I ran rMATS again only changing to paired and got the same result. FromGTF files have content, all other output files just have a header. 

Is this what you want for the bam file? 

$ head mock_r1_star_sorted.bam
[binary output omitted: BAM is a compressed binary format, so head prints unreadable bytes]

mamy.andriant...@gmail.com

Dec 13, 2017, 5:42:40 AM
to rMATS User Group
Hi Zhijie Xie,

I ran rMATS-turbo-Linux-UCS2. It works with paired-end data and the output files seem correct.
It also seems to run with single-end data, but the output files are all empty (with either FASTQ or BAM input).

Thanks, and do not hesitate to ask if you need more information.

Best,
Mamy

Zhijie Xie (Jay)

Dec 13, 2017, 4:41:41 PM
to rMATS User Group
Hi,

Thanks for the update. I would like to read the human-readable content of this bam file. Could you run this command: samtools view -h filename.bam | head -n 100
and send the output to me?

On Tuesday, December 12, 2017 at 4:31:45 PM UTC-8, habo...@uci.edu wrote:

Zhijie Xie (Jay)

Dec 13, 2017, 4:47:00 PM
to rMATS User Group
Hi,

Could you please show me your BAM file? 100 lines should be enough.
You can run this command to get what we want: samtools view -h filename.bam | head -n 100

Is your paired-end data the same as your single-end data? Are they from the same dataset? If not, please send both of them to me.

Thanks,
Zhijie Xie

On Wednesday, December 13, 2017 at 2:42:40 AM UTC-8, mamy.andriant...@gmail.com wrote:

Yongqiang Xing

Dec 13, 2017, 7:00:32 PM
to rMATS User Group
Hi,
I encounter the same problem when I process human single-end RNA-seq data. I have tried everything I can think of, without success. I can run the test data correctly, and I can also run Arabidopsis paired-end RNA-seq data correctly with the same pipeline.
I generated SAM files using HISAT2 and checked the chromosome names in the SAM and GTF files (all are chr1, chr2, ...). The only difference is that I changed "-t paired" to "-t single". But the MATS output is empty; only the fromGTF results have content. The same samples run correctly with rMATS 3.2.5.
Does the turbo version not work for single-end data?
I am sending the single-end BAM files as an attachment. They include two samples, each with 3 replicates; I selected the first 500 lines of each. These are the original files I produced, so the chromosome names do not have the "chr" prefix.

Best

University of Texas at Dallas

Yongqiang 

On Tuesday, May 9, 2017 at 12:22:45 PM UTC-5, Yi Xing wrote:
bam file.zip

habo...@uci.edu

Dec 14, 2017, 1:53:57 AM
to rMATS User Group
Hi Jay, here is the samtools view of one of the bam files. 

$samtools view -h mock_r1_star_sorted.bam | head -n 100
@HD     VN:1.4  SO:queryname
@SQ     SN:chr1 LN:248956422
@SQ     SN:chr2 LN:242193529
@SQ     SN:chr3 LN:198295559
@SQ     SN:chr4 LN:190214555
@SQ     SN:chr5 LN:181538259
@SQ     SN:chr6 LN:170805979
@SQ     SN:chr7 LN:159345973
@SQ     SN:chr8 LN:145138636
@SQ     SN:chr9 LN:138394717
@SQ     SN:chr10        LN:133797422
@SQ     SN:chr11        LN:135086622
@SQ     SN:chr12        LN:133275309
@SQ     SN:chr13        LN:114364328
@SQ     SN:chr14        LN:107043718
@SQ     SN:chr15        LN:101991189
@SQ     SN:chr16        LN:90338345
@SQ     SN:chr17        LN:83257441
@SQ     SN:chr18        LN:80373285
@SQ     SN:chr19        LN:58617616
@SQ     SN:chr20        LN:64444167
@SQ     SN:chr21        LN:46709983
@SQ     SN:chr22        LN:50818468
@SQ     SN:chrX LN:156040895
@SQ     SN:chrY LN:57227415
@SQ     SN:chrM LN:16569
@SQ     SN:GL000008.2   LN:209709
@SQ     SN:GL000009.2   LN:201709
@SQ     SN:GL000194.1   LN:191469
@SQ     SN:GL000195.1   LN:182896
@SQ     SN:GL000205.2   LN:185591
@SQ     SN:GL000208.1   LN:92689
@SQ     SN:GL000213.1   LN:164239
@SQ     SN:GL000214.1   LN:137718
@SQ     SN:GL000216.2   LN:176608
@SQ     SN:GL000218.1   LN:161147
@SQ     SN:GL000219.1   LN:179198
@SQ     SN:GL000220.1   LN:161802
@SQ     SN:GL000221.1   LN:155397
@SQ     SN:GL000224.1   LN:179693
@SQ     SN:GL000225.1   LN:211173
@SQ     SN:GL000226.1   LN:15008
@SQ     SN:KI270302.1   LN:2274
@SQ     SN:KI270303.1   LN:1942
@SQ     SN:KI270304.1   LN:2165
@SQ     SN:KI270305.1   LN:1472
@SQ     SN:KI270310.1   LN:1201
@SQ     SN:KI270311.1   LN:12399
@SQ     SN:KI270312.1   LN:998
@SQ     SN:KI270315.1   LN:2276
@SQ     SN:KI270316.1   LN:1444
@SQ     SN:KI270317.1   LN:37690
@SQ     SN:KI270320.1   LN:4416
@SQ     SN:KI270322.1   LN:21476
@SQ     SN:KI270329.1   LN:1040
@SQ     SN:KI270330.1   LN:1652
@SQ     SN:KI270333.1   LN:2699
@SQ     SN:KI270334.1   LN:1368
@SQ     SN:KI270335.1   LN:1048
@SQ     SN:KI270336.1   LN:1026
@SQ     SN:KI270337.1   LN:1121
@SQ     SN:KI270338.1   LN:1428
@SQ     SN:KI270340.1   LN:1428
@SQ     SN:KI270362.1   LN:3530
@SQ     SN:KI270363.1   LN:1803
@SQ     SN:KI270364.1   LN:2855
@SQ     SN:KI270366.1   LN:8320
@SQ     SN:KI270371.1   LN:2805
@SQ     SN:KI270372.1   LN:1650
@SQ     SN:KI270373.1   LN:1451
@SQ     SN:KI270374.1   LN:2656
@SQ     SN:KI270375.1   LN:2378
@SQ     SN:KI270376.1   LN:1136
@SQ     SN:KI270378.1   LN:1048
@SQ     SN:KI270379.1   LN:1045
@SQ     SN:KI270381.1   LN:1930
@SQ     SN:KI270382.1   LN:4215
@SQ     SN:KI270383.1   LN:1750
@SQ     SN:KI270384.1   LN:1658
@SQ     SN:KI270385.1   LN:990
@SQ     SN:KI270386.1   LN:1788
@SQ     SN:KI270387.1   LN:1537
@SQ     SN:KI270388.1   LN:1216
@SQ     SN:KI270389.1   LN:1298
@SQ     SN:KI270390.1   LN:2387
@SQ     SN:KI270391.1   LN:1484
@SQ     SN:KI270392.1   LN:971
@SQ     SN:KI270393.1   LN:1308
@SQ     SN:KI270394.1   LN:970
@SQ     SN:KI270395.1   LN:1143
@SQ     SN:KI270396.1   LN:1880
@SQ     SN:KI270411.1   LN:2646
@SQ     SN:KI270412.1   LN:1179
@SQ     SN:KI270414.1   LN:2489
@SQ     SN:KI270417.1   LN:2043
@SQ     SN:KI270418.1   LN:2145
@SQ     SN:KI270419.1   LN:1029
@SQ     SN:KI270420.1   LN:2321
@SQ     SN:KI270422.1   LN:1445
@SQ     SN:KI270423.1   LN:981

Zhijie Xie (Jay)

Dec 14, 2017, 9:15:20 AM
to rMATS User Group
Hi,

Could you increase the number of lines? 100 seems to be insufficient; we need 1000.
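
The 100 lines posted above are all header lines (`@HD` and `@SQ`), so they contain no alignment records, which may be why more lines were needed. A minimal sketch that skips the header and keeps only the first few records from SAM text is shown below; the function name `first_alignment_records` is hypothetical.

```python
def first_alignment_records(sam_lines, n=100):
    """Return the first n non-header lines from an iterable of SAM lines."""
    records = []
    for line in sam_lines:
        if line.startswith('@'):
            continue  # header lines such as @HD, @SQ, @PG
        records.append(line.rstrip('\n'))
        if len(records) == n:
            break
    return records


if __name__ == '__main__':
    demo = [
        '@HD\tVN:1.4\tSO:queryname\n',
        '@SQ\tSN:chr1\tLN:248956422\n',
        'read1\t0\tchr1\t100\t255\t50M\t*\t0\t0\t*\t*\n',
    ]
    for record in first_alignment_records(demo, n=1):
        print(record)
```

Alternatively, running `samtools view` without `-h` omits the header, so `head` then returns alignment records directly.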

On Wednesday, December 13, 2017 at 10:53:57 PM UTC-8, habo...@uci.edu wrote:

habo...@uci.edu

Dec 14, 2017, 3:07:14 PM
to rMATS User Group
Please see attached text file with 1000 lines.


On Thursday, December 14, 2017 at 6:15:20 AM UTC-8, Zhijie Xie (Jay) wrote:
Hi,

Mock_r1_bam_1000.txt

Alberto Riva

Dec 15, 2017, 2:54:21 PM
to rMATS User Group
I am also having the same issue. Running on paired-end BAM files works perfectly, while when I use single-end BAM files I get empty output files.

Alberto

Jessica Elman

Apr 18, 2018, 12:12:26 PM
to rMATS User Group
Hi,

I am currently trying to run rMATS and having the same problem you had a few months ago-- it seems to run normally but my output files are all empty except for the fromGTF.A3SS.txt etc files. Did you ever resolve your issue and if so, can you offer any advice?

Thanks in advance,

Jess 

habo...@uci.edu

Apr 18, 2018, 4:37:57 PM
to rMATS User Group
Hi Jess,

No I never resolved the issue and never heard back from/got help from rMATS folks. I still think it is an issue with single read data - something in their code. I have since done the same analysis (aka the same code/files) on different paired end data and it works just fine. Sorry not the best news. Hopefully they are still working on the issue and resolve it with the next update. 

Amber 

YI XING

Apr 18, 2018, 4:39:59 PM
to habo...@uci.edu, rMATS User Group, SHIHAO SHEN, 谢志杰(Zhijie Xie)

Dear all,

We are working to release a new version that fixes this issue with single-end data in the latest version of rmats. The package is already done and we are just doing some final checks before uploading the new version.

 

Yi


Jiajia Li

Aug 1, 2018, 5:10:01 PM
to rMATS User Group
Dear Dr. Yi Xing,

I tried the latest rMATS v4.0.2 today, and it still gave me empty outputs except for the fromGTF.*.txt files, even when I tried it with your sample data. What do you think might be going wrong? Thanks a lot!

chisom Ezekannagha

Dec 6, 2018, 11:21:12 AM
to rMATS User Group
Dear Dr. Yi,

Could you please reply with a solution to this problem? Even when I run with the paired read type, I get empty output except for the fromGTF files.

Below is my command:

python rMATS-turbo-Linux-UCS4/rmats.py --b1 /mnt/chisom/Alternative_Splicing/bam_data/b1.txt --b2 /mnt/chisom/Alternative_Splicing/bam_data/b2.txt --gtf /data/biodata/Cassava/V6/annotations/Mesculenta_305_v6.1.gene_exons.gtf --od CMD_bamTest -t paired --nthread 6 --readLength 101

The output from the script is:

There are 33033 distinct gene ID in the gtf file
There are 41381 distinct transcript ID in the gtf file
There are 27539 one-transcript genes in the gtf file
There are 234473 exons in the gtf file
There are 7402 one-exon transcripts in the gtf file
There are 7298 one-transcript genes with only one exon in the transcript
Average number of transcripts per gene is 1.252717
Average number of exons per transcript is 5.666199
Average number of exons per transcript excluding one-exon tx is 6.682686
Average number of gene per geneGroup is 4.533807
Fail to open CMD_A--2Aligned.out.dedup.bam
Fail to open CMD_BB-5Aligned.out.dedup.bam
Fail to open CMD_A--1Aligned.out.dedup.bam
Fail to open CMD_B--3Aligned.out.dedup.bam             (I also don't understand this part)
Fail to open CMD_BBB1Aligned.out.dedup.bam
Fail to open CMD_AA-3Aligned.out.dedup.bam

==========
Done processing each gene from dictionary to compile AS events
Found 1039 exon skipping events
Found 12 exon MX events
Found 2637 alt SS events
There are 1851 alt 3 SS events and 786 alt 5 SS events.
Found 1230 RI events
==========

Running the statistical part.
The statistical part is done.
Done.

I would really appreciate knowing why I am still getting empty outputs even though my samples are paired-end. I hope you or anyone with a solution can reply soon. Thanks a lot!
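
The "Fail to open" lines above suggest that rMATS could not open the BAM files named in b1.txt and b2.txt. One possible cause, which is an assumption and not confirmed anywhere in this thread, is that the list files contain relative paths that do not resolve from the directory where rmats.py is run. A minimal Python sketch for checking this before launching rMATS follows; the helper name missing_bams is hypothetical, and it assumes each list file holds comma-separated BAM paths.

```python
import os


def missing_bams(list_path):
    """Return the entries of an rMATS-style BAM list file that are not files.

    Assumes the list file holds BAM paths separated by commas
    (hypothetical helper, not part of rMATS).
    """
    with open(list_path) as handle:
        # Split on commas and drop surrounding whitespace and empty entries.
        entries = [p.strip() for p in handle.read().split(",") if p.strip()]
    # Relative paths are resolved against the current working directory,
    # just as they would be for a program started from that directory.
    return [p for p in entries if not os.path.isfile(p)]
```

If missing_bams("b1.txt") returns a non-empty list, rewriting those entries as absolute paths is one thing to try.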

Yongqiang Xing

Dec 10, 2018, 3:33:38 AM
to rmats-us...@googlegroups.com
Hi,
In rMATS 3.2.5, --readLength must be provided, and the read length must be the same throughout the FASTQ or BAM files. However, in rMATS 4.0.2, I find that when the read length varies within a BAM file, the --readLength parameter can be omitted or set to different values. My question is: how do I choose the --readLength value for rMATS 4.0.2 when the read length varies (for example, 31-151 bp)?

Best,

Yongqiang Xing
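
Before choosing a --readLength value, it can help to see how the read lengths are actually distributed. The sketch below is a hypothetical helper, not part of rMATS; it tallies sequence lengths from SAM-formatted text lines (for example, the output of samtools view). It does not by itself say which value rMATS expects.

```python
from collections import Counter


def read_length_counts(sam_lines):
    """Count read lengths from an iterable of SAM text lines.

    Header lines (starting with '@') are skipped. The read sequence is
    taken from the 10th tab-separated column (SEQ); records whose SEQ
    is '*' (sequence not stored) are ignored.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11 or fields[9] == "*":
            continue
        counts[len(fields[9])] += 1
    return counts
```

One way to use it is to pipe the output of samtools view into a small script that calls read_length_counts(sys.stdin) and prints the sorted counts.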

On Thu, Dec 14, 2017 at 8:00 AM, Yongqiang Xing <imus...@gmail.com> wrote:

mei luom

Dec 20, 2018, 6:36:45 AM
to rMATS User Group
Hello, I am using your Docker version of rMATS and have the following questions:

(1) My samples include both single-end and paired-end sequencing. How should I set the "-t" parameter?
(2) Can rMATS analyze alternative splicing in a single sample, rather than differential alternative splicing across multiple samples?


mei luom

Jan 10, 2019, 9:14:06 AM
to rMATS User Group
Hello,
The release notes for rMATS 4.0.2 list:
  • Fixed a bug related to single-end reads.
  • Fixed a bug in the test data set.
However, these bugs do not seem to be fixed in the Docker version.

Ming Leung

Aug 28, 2019, 10:48:34 AM
to rMATS User Group
Hi!

I've been using rMATS to analyze paired-end RNA-seq data. I first ran the analysis using rMATS 3.2.5, and then rMATS 4.0.2.

The changelog seems to say that the differences between rMATS 3.2.5 and rMATS 4.0.2 are optimization, parallelization, and a bug fix for single-end reads. As such, I expected the results from 3.2.5 and 4.0.2 to be the same.

However, the programs returned different numbers of rows. For example,
    rMATS3.2.5 
        A3SS.MATS.ReadsOnTargetAndJunctionCounts.txt returned 1163 entries, whereas
    rMATS4.0.2 
        A3SS.MATS.JCEC.txt returned 1220 entries.

Were there additional changes to the paired end read analysis as well?

Thanks!
Ming

Zhijie Xie (Jay)

Sep 21, 2019, 1:09:25 PM
to rMATS User Group
Hi Ming,

During the development of v4.0.2, we fixed a minor issue in finding the exon boundaries of A3SS/A5SS/RI events. v3.2.5 might miss some A3SS/A5SS/RI events due to this issue.
This issue affects both single-end and paired-end read analysis.

Thanks.
Zhijie Xie

On Wednesday, August 28, 2019 at 10:48:34 PM UTC+8, Ming Leung wrote:

刘思雨

Dec 11, 2019, 3:37:11 AM
to rMATS User Group
Hi,
I have Illumina paired-end FASTQ data that I have mapped using STAR. After running FastQC on the data, I got sequence lengths of 20-101. However, I find that rMATS currently requires all read lengths to be the same. If I use --readLength 101 with rMATS, will there be any problem? I don't want to run Trim Galore and STAR again because it takes a long time.
By the way, if I map only once, will the result differ from using 2-pass mapping?

M S

Apr 13, 2020, 11:36:16 PM
to rMATS User Group
Hi Jay, 
I have tried pretty much everything suggested in this group, but I constantly run into the issue shown in the attached screenshot. The output just keeps going on and on after this.
Thanks
MS 

M S

Apr 13, 2020, 11:46:03 PM
to rMATS User Group
Here is the screenshot, attached.


On Saturday, 21 September 2019 13:09:25 UTC-4, Zhijie Xie (Jay) wrote:
Screen Shot 2020-04-13 at 11.44.42 PM.png

Thomas Danhorn

Apr 14, 2020, 12:24:11 AM
to M S, rMATS User Group
If you read the manual carefully, you will see that the options --b1 and
--b2 require a *text* file with the names of the BAM files you want to use
as input, *not* the BAM files themselves. So what is happening is that
rMATS interprets the *content* of the BAM file as file name and tells you
it cannot find that file (not very surprising).
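
As a minimal sketch of what that looks like in practice, the Python below writes such a text file and assembles the command line. The helper names and the example paths are hypothetical; the flags are the ones used in the command quoted earlier in this thread, and the one-line, comma-separated layout of the list file is an assumption that should be checked against the manual for your rMATS version.

```python
def write_bam_list(bam_paths, list_path):
    """Write BAM paths as a single comma-separated line to a text file.

    The resulting text file, not a BAM file, is what gets passed to
    --b1 or --b2 (hypothetical helper, not part of rMATS).
    """
    with open(list_path, "w") as handle:
        handle.write(",".join(bam_paths) + "\n")
    return list_path


def build_rmats_command(b1_list, b2_list, gtf, out_dir, read_length,
                        read_type="paired", threads=1):
    """Assemble an rmats.py argument list using flags seen in this thread."""
    return [
        "python", "rmats.py",
        "--b1", b1_list,          # text file listing group 1 BAM files
        "--b2", b2_list,          # text file listing group 2 BAM files
        "--gtf", gtf,
        "--od", out_dir,
        "-t", read_type,
        "--nthread", str(threads),
        "--readLength", str(read_length),
    ]
```

For example, write_bam_list(["/data/ctrl_1.bam", "/data/ctrl_2.bam"], "b1.txt") produces a b1.txt that can be given to --b1, and the returned list from build_rmats_command can be passed to subprocess.run.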

Hope this helps,

Thomas
>>>> in the Release of *rMATS 4.0.2
>>>> <http://rnaseq-mats.sourceforge.net/rmats4.0.2/>,*
>>>>
>>>> - Fixed a bug related to single-end reads.
>>>>
>>>>
>>>> - Fixed a bug in the test data set.
>>>>
>>>> in the docker ,you seem not to fix these bug
>>>>
>>>> On Wednesday, May 10, 2017 at 1:22:45 AM UTC+8, Yi Xing wrote:
>>>>>
>>>>> Dear rMATS Users:
>>>>>
>>>>> We are happy to announce the beta release of rMATS-turbo: a much faster
>>>>> and slimmer version of rMATS. rMATS-turbo achieves a significant gain in
>>>>> computational speed and data storage efficiency. Compared to rMATS 3.2.5,
>>>>> the counting procedure is 20-100 times faster, the statistical test is
>>>>> 300-500 times faster, and the size of the intermediate files is ~1000 times
>>>>> smaller. We are currently releasing rMATS-turbo for beta testing as a
>>>>> stand-alone Docker container. Feedback and bug reports are welcome.
>>>>>
>>>>> http://rnaseq-mats.sourceforge.net/rmatsdockerbeta/
>>>>>
>>>>> Yours,
>>>>> Yi Xing
>>>>>
>>>>