pysam dependencies for rMATS 3.2.4


ken...@hudsonalpha.org

Aug 3, 2016, 12:33:45 PM
to rMATS User Group
I'm trying to run the newest version of rMATS; however, I have run into a few issues with the pysam dependencies. I have the correct versions of python, samtools, and the dependencies (through anaconda). Below is the series of things I tried, with the end result being empty output folders and an error with samtools.  

-First tried:
python RNASeq-MATS.py -b1 *.bam -b2 *.bam -gtf gtf/Homo_sapiens.Ensembl.GRCh37.72.gtf -o bam_test -t paired -len 50 -c 0.0001 -analysis U -libType fr-firststrand
Traceback (most recent call last):
  File "RNASeq-MATS.py", line 5, in <module>
    import re,os,sys,logging,time,datetime,pysam;
ImportError: No module named pysam

-then:
pip install pysam --upgrade
Successfully installed pysam-0.9.1.4

-retried first script then got:
Traceback (most recent call last):
  File "RNASeq-MATS.py", line 5, in <module>
    import re,os,sys,logging,time,datetime,pysam;
  File "/Users/kengel/anaconda/lib/python2.7/site-packages/pysam/__init__.py", line 5, in <module>
    from pysam.libchtslib import *
ImportError: dlopen(/Users/kengel/anaconda/lib/python2.7/site-packages/pysam/libchtslib.so, 2): Library not loaded: libcurl.4.dylib
  Referenced from: /Users/kengel/anaconda/lib/python2.7/site-packages/pysam/libchtslib.so
  Reason: Incompatible library version: libchtslib.so requires version 9.0.0 or later, but libcurl.4.dylib provides version 7.0.0
  
-then tried:
conda install -c bioconda pysam=0.9.1

-then got:
It seemed to run, but went very very fast. There were no files in the output folder. Looked in the log.RNASeq-MATS.2016-07-31\ 15\:33\:58.490932.txt file and it said this...

2016-07-31 15:33:58,491 rMATS version: 3.2.4
2016-07-31 15:33:58,491 Start the program with [RNASeq-MATS.py -b1 /Users/kengel/Documents/Myers_Lab/CSER/Pip$

2016-07-31 15:33:58,503 ################### folder names and associated input files #############
2016-07-31 15:33:58,503 SAMPLE_1\REP_1  /Users/kengel/Documents/Myers_Lab/CSER/Pipeline/NPC_RNAseq_Summer16/S$
2016-07-31 15:33:58,504 SAMPLE_2\REP_1  /Users/kengel/Documents/Myers_Lab/CSER/Pipeline/NPC_RNAseq_Summer16/S$
2016-07-31 15:33:58,504 #########################################################################

2016-07-31 15:33:58,504 start mapping..
2016-07-31 15:33:58,504 bam files are provided. skip mapping..
2016-07-31 15:33:58,504 done mapping..
2016-07-31 15:33:58,504 indexing bam files to use pysam
2016-07-31 15:33:58,504 getting unique SAM function..
2016-07-31 15:33:58,508 There is an exception in indexing bam files
2016-07-31 15:33:58,508 Exception: <class 'pysam.utils.SamtoolsError'>
2016-07-31 15:33:58,508 Detail: 'samtools returned with error 1: stdout=, stderr=[bam_sort] Use -T PREFIX / -o FILE to specify temporary and final output files\nUsage: samtools sort [options...] [in.bam]\nOptions:\n  -l INT     Set compression level, from 0 (uncompressed) to 9 (best)\n  -m INT     Set maximum memory per thread; suffix K/M/G recognized [768M]\n  -n         Sort by read name\n  -o FILE    Write final output to FILE rather than standard output\n  -T PREFIX  Write temporary files to PREFIX.nnnn.bam\n  -@, --threads INT\n             Set number of sorting and compression threads [1]\n      --input-fmt-option OPT[=VAL]\n               Specify a single input file format option in the form\n               of OPTION or OPTION=VALUE\n  -O, --output-fmt FORMAT[,OPT[=VAL]]...\n               Specify output format (SAM, BAM, CRAM)\n      --output-fmt-option OPT[=VAL]\n               Specify a single output file format option in the form\n               of OPTION or OPTION=VALUE\n      --reference FILE\n               Reference sequence FASTA FILE [null]\n'
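
The stderr captured in the last log line is the `samtools sort` usage text, which suggests the sort call was issued in the older `in.bam out.prefix` form against a samtools build that expects `-T` and `-o`. The sketch below contrasts the two argument styles as plain lists. `sort_args` and the file names are hypothetical, and the code only builds the command; it does not run samtools.

```python
def sort_args(in_bam, out_prefix, legacy=False):
    """Build a `samtools sort` argument list (hypothetical helper).

    legacy=True  -> older form: samtools sort in.bam out_prefix
    legacy=False -> form shown in the usage text: -T PREFIX -o FILE in.bam
    """
    if legacy:
        # Older samtools appended ".bam" to the output prefix itself.
        return ["samtools", "sort", in_bam, out_prefix]
    return [
        "samtools", "sort",
        "-T", out_prefix + ".tmp",   # prefix for temporary files
        "-o", out_prefix + ".bam",   # final output file, named explicitly
        in_bam,
    ]


print(" ".join(sort_args("sample.bam", "aligned.sorted", legacy=True)))
print(" ".join(sort_args("sample.bam", "aligned.sorted")))
```

Keeping the command as a list makes it easy to compare what a wrapper is passing with what the installed samtools accepts.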

Erin Wissink

Aug 3, 2016, 6:13:39 PM
to rMATS User Group
I also have this error. I installed pysam but get the error message:

ImportError:  Library not loaded: libcurl.4.dylib

  Reason: Incompatible library version: libchtslib.so requires version 9.0.0 or later, but libcurl.4.dylib provides version 7.0.0

I've tried googling to figure out how to solve this problem, but I've been unsuccessful so far. Any advice?

Thanks!

Jinyeong Lim

Aug 4, 2016, 12:25:16 AM
to rMATS User Group
I solved this problem by installing an older version of pysam, pysam==0.8.0:

$ pip install pysam==0.8.0
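
To make the pin explicit in a wrapper script, a version guard can run before rMATS starts. This is only a sketch: `parse_version` and `is_pre_0_9` are hypothetical helpers, and the 0.9 cutoff is an assumption drawn from this thread (0.8.0 imports cleanly here, 0.9.1.4 does not), not a documented rMATS requirement.

```python
def parse_version(text):
    """Turn '0.9.1.4' into (0, 9, 1, 4); a non-numeric part stops the parse."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_pre_0_9(version_text):
    """True when the version sorts before 0.9 (assumed cutoff, see above)."""
    return parse_version(version_text) < (0, 9)


for candidate in ("0.8.0", "0.8.4", "0.9.1.4"):
    print(candidate, "pre-0.9" if is_pre_0_9(candidate) else "0.9 or later")
```

In a real script the version string would typically come from `pysam.__version__` after a successful import.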

But then I ran into another error...

Here is my log.


-----------------------------------------------------------------------------------------------------------------------------------------
2016-08-04 11:38:19,544 rMATS version: 3.2.4
2016-08-04 11:38:19,544 Start the program with [RNASeq-MATS.py -b1 testData/231ESRP.25K.rep-1.bam,testData/231ESRP.25K.rep-2.bam -b2 testData/231EV.25K.rep-1.bam,testData/231EV.25K.rep-2.bam -gtf gtf/Homo_sapiens.Ensembl.GRCh37.72.gtf -o bam_test_2 -t paired -len 50 -c 0.0001 -analysis U -libType fr-firststrand ]

2016-08-04 11:38:19,552 ################### folder names and associated input files #############
2016-08-04 11:38:19,552 SAMPLE_1\REP_1    testData/231ESRP.25K.rep-1.bam
2016-08-04 11:38:19,552 SAMPLE_1\REP_2    testData/231ESRP.25K.rep-2.bam
2016-08-04 11:38:19,552 SAMPLE_2\REP_1    testData/231EV.25K.rep-1.bam
2016-08-04 11:38:19,552 SAMPLE_2\REP_2    testData/231EV.25K.rep-2.bam
2016-08-04 11:38:19,552 #########################################################################

2016-08-04 11:38:19,552 start mapping..
2016-08-04 11:38:19,552 bam files are provided. skip mapping..
2016-08-04 11:38:19,552 done mapping..
2016-08-04 11:38:19,552 indexing bam files to use pysam
2016-08-04 11:38:19,552 getting unique SAM function..
2016-08-04 11:38:21,116 done indexing bam files..
2016-08-04 11:38:21,118 start getting AS events from GTF and BAM files
2016-08-04 11:38:21,118 getting AS events function..
2016-08-04 11:39:22,192 getting AS events is done with status 0
2016-08-04 11:39:22,192
2016-08-04 11:39:22,192 done getting AS events..
2016-08-04 11:39:22,196 Setting proper string
2016-08-04 11:39:22,213 start making MATS input files from AS events and SAM files
2016-08-04 11:39:22,213 making MATS input function..
2016-08-04 11:39:25,172 making MATS input is done with status 256
2016-08-04 11:39:25,172 error in making MATS input 256
2016-08-04 11:39:25,173 error detail: Traceback (most recent call last):
  File "~/Program/rMATS.3.2.4/bin/MATS.processsUnique.bam.py", line 1515, in <module>
    processSample_stranded(sample_1,S1,dataType,'first')
  File "~/Program/rMATS.3.2.4/bin/MATS.processsUnique.bam.py", line 1424, in processSample_stranded
    if (cStart<=(a5ss[chr][group][c][3]-(rL-junctionLength/2)+1) and cEnd<=a5ss[chr][group][c][1] and cEnd>=(a5ss[chr][group][c][3]+(rL-junctionLength/2))): ## multi-exon read supporting target
KeyError: 1248
2016-08-04 11:39:25,173 There is an exception in making MATS input
2016-08-04 11:39:25,173 Exception: <type 'exceptions.Exception'>
2016-08-04 11:39:25,173 Detail:


-----------------------------------------------------------------------------------------------------------------------------------------
What's wrong? :(

Yaoi T

Aug 4, 2016, 4:03:37 AM
to rMATS User Group
Hi,

I built a virtual environment for rMATS-3.2.4 using pyenv and anaconda on Ubuntu 14.04 LTS, and rMATS works in that environment.
Since you have already installed STAR and samtools, I will describe only how to install pyenv and anaconda, how to create the virtual environment, and how to use it.
Anaconda does not conflict with pip, and it lets you use pysam (0.8.4), so I recommend that Lim first remove the older pysam with pip uninstall.
Even if your system Python differs from the version rMATS requires, you can create a virtual environment with the appropriate Python; for example, Python 3.x can run in a virtual environment on a system whose default is Python 2.x.
Anaconda is useful when you need to build a new Python environment while keeping your current one intact.
Below I install pyenv with "git clone". On Mac OS X, however, Homebrew is used instead of "git clone".

Yaoi T


[1] Installation of pyenv

$ git clone https://github.com/yyuu/pyenv.git ~/.pyenv
$ echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
$ echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
$ echo 'eval "$(pyenv init -)"' >> ~/.bashrc
$ source ~/.bashrc

[2] Installation of anaconda

[2-1] checking your python version
$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
>>> quit()

[2-2] installation of the latest version of anaconda for your python version (checked above)
# list all versions of anaconda that can currently be installed
$ pyenv install -l | grep ana
  anaconda-1.4.0
  anaconda-1.5.0
  anaconda-1.5.1
  anaconda-1.6.0
  anaconda-1.6.1
  anaconda-1.7.0
  anaconda-1.8.0
  anaconda-1.9.0
  anaconda-1.9.1
  anaconda-1.9.2
  anaconda-2.0.0
  anaconda-2.0.1
  anaconda-2.1.0
  anaconda-2.2.0
  anaconda-2.3.0
  anaconda-2.4.0
  anaconda-4.0.0
  anaconda2-2.4.0
  anaconda2-2.4.1
  anaconda2-2.5.0
  anaconda2-4.0.0
  anaconda2-4.1.0  # the latest for python 2.x
  anaconda3-2.0.0
  anaconda3-2.0.1
  anaconda3-2.1.0
  anaconda3-2.2.0
  anaconda3-2.3.0
  anaconda3-2.4.0
  anaconda3-2.4.1
  anaconda3-2.5.0
  anaconda3-4.0.0
  anaconda3-4.1.0  # the latest for python 3.x
$ pyenv install anaconda2-4.1.0  # herein installing the latest for python 2.x
$ pyenv rehash
$ pyenv global anaconda2-4.1.0
$ which anaconda
/(PATH to .pyenv)/shims/anaconda
$ cd /(PATH to .pyenv)/shims/
~/.pyenv/shims$ conda info -e   # this command shows the list of virtual environments constructed by using anaconda so far.
# conda environments:
#
root                  *  /home/tyaoi4/.pyenv/versions/anaconda2-4.1.0


[2-3] construction of virtual environment for rMATS-3.2.4

To create an environment named rMATS_anaconda, enter the following:

~/.pyenv/shims$ conda create -n rMATS_anaconda python=2.7 numpy scipy

To install the full set of libraries, including numpy and scipy, use this instead:

~/.pyenv/shims$ conda create -n rMATS_anaconda python=2.7 anaconda2


~/.pyenv/shims$ conda info -e
Using Anaconda Cloud api site https://api.anaconda.org
# conda environments:
#
rMATS_anaconda           /(PATH to .pyenv)/versions/anaconda2-4.1.0/envs/rMATS_anaconda
root                  *  /(PATH to .pyenv)/versions/anaconda2-4.1.0

[2-4] Installing pysam into the virtual environment "rMATS_anaconda"

~/.pyenv/shims$ conda install -n rMATS_anaconda pysam
~/.pyenv/shims$ exit

In this case, pysam (0.8.4) is installed into "rMATS_anaconda" only, not into the other environments.
If you omit -n <environment>, conda installs the package into the currently active environment instead.
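
A quick way to confirm which environment a Python process is actually using is to look at its install prefix. The helper below is a sketch (`env_name_from_prefix` is a hypothetical name) and assumes conda's usual layout, where named environments live under an `envs/` directory as in the `conda info -e` listing above.

```python
import os
import sys


def env_name_from_prefix(prefix):
    """Return the conda environment name implied by an install prefix.

    Assumes the layout <install>/envs/<name>; returns None for the root
    install or for any prefix that does not follow that layout.
    """
    parent, name = os.path.split(prefix.rstrip("/"))
    if os.path.basename(parent) == "envs" and name:
        return name
    return None


# sys.prefix is the install prefix of the interpreter running this script.
print(env_name_from_prefix(sys.prefix))
```

Run with the environment's own `python`, this should print the environment name; under the root install it prints `None`.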


[3] Use of rMATS-3.2.4/STAR-2.5/samtools-1.2

After starting a terminal:

$ which anaconda
/(PATH to .pyenv)/shims/anaconda
$ cd /(PATH to .pyenv)/shims/
~/.pyenv/shims$ conda info -e
Using Anaconda Cloud api site https://api.anaconda.org
# conda environments:
#
rMATS_anaconda           /(PATH to .pyenv)/versions/anaconda2-4.1.0/envs/rMATS_anaconda
root                  *  /(PATH to .pyenv)/versions/anaconda2-4.1.0

$ source /(PATH to .pyenv)/versions/anaconda2-4.1.0/envs/rMATS_anaconda/bin/activate rMATS_anaconda
prepending /(PATH to .pyenv)/versions/anaconda2-4.1.0/envs/rMATS_anaconda/bin to PATH
(rMATS_anaconda) ~/.pyenv/shims$

Your virtual environment is now active, and you can carry on with your work.

(rMATS_anaconda) ~/.pyenv/shims$ export PATH="/PATH/to/samtools-1.2:$PATH"
(rMATS_anaconda) ~/.pyenv/shims$ export PATH="/PATH/to/STAR-2.5:$PATH"
(rMATS_anaconda) ~/.pyenv/shims$ cd /PATH/to/rMATS-3.2.4
(rMATS_anaconda) ~/.pyenv/shims$ ./testRun.sh /PATH/to/STARindex/hg19
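
Before launching the test run it can help to confirm that the tools added to PATH above actually resolve. A minimal sketch follows, using only the standard library so that it also runs on Python 2.7. `find_on_path` and `missing_tools` are hypothetical helpers, and the executable names `samtools` and `STAR` are assumed to match the binaries in those directories.

```python
import os


def find_on_path(name, path=None):
    """Return the first executable called `name` on PATH, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(names, path=None):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if find_on_path(name, path) is None]


print("missing:", missing_tools(["samtools", "STAR"]))
```

An empty list means both tools were found; anything listed still needs its directory exported on PATH.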

Yaoi T

Aug 4, 2016, 4:26:59 AM
to rMATS User Group
Sorry, I forgot an important command.

After finishing your work, exit the virtual environment with the following command, and then close the terminal.

(rMATS_anaconda) ~/.pyenv/shims$ source deactivate
~/.pyenv/shims$ exit

Yaoi T

On Thursday, August 4, 2016 at 17:03:37 UTC+9, Yaoi T wrote:

Yaoi T

Aug 4, 2016, 9:53:06 PM
to rMATS User Group
Correction !!

In [3] Use of rMATS-3.2.4/STAR-2.5/samtools-1.2


~/.pyenv/shims$ conda info -e
Using Anaconda Cloud api site https://api.anaconda.org
# conda environments:
#
rMATS_anaconda           /(PATH to .pyenv)/versions/anaconda2-4.1.0/envs/rMATS_anaconda
root                  *  /(PATH to .pyenv)/versions/anaconda2-4.1.0

$ source /(PATH to .pyenv)/versions/anaconda2-4.1.0/envs/rMATS_anaconda/bin/activate rMATS_anaconda

=>  $ source /(PATH to .pyenv)/versions/anaconda2-4.1.0/envs/rMATS_anaconda/bin/activate rMATS_anaconda/bin/activate rMATS_anaconda

This command is for the activation of a selected virtual environment.
So, when your work ends, you must exit this environment by using " source deactivate ".


Yaoi T


On Thursday, August 4, 2016 at 17:26:59 UTC+9, Yaoi T wrote:

Erin Wissink

Aug 5, 2016, 4:41:45 AM
to rMATS User Group
Hi Jinyeong,

I ran into the same error, and I found that the error is in the -libType fr-firststrand option. I was able to run rMATS in unstranded mode (although I would prefer to use strand-specific data if possible).

Erin

Yibo Liu

Aug 5, 2016, 5:22:50 AM
to rMATS User Group
hi,

I ran into the same problem (exactly the same log file). Did you solve it?

Thanks,

Yibo

Tiago Bruno Castro

Aug 7, 2016, 11:16:18 PM
to rMATS User Group
I got the same error. This is how I solved it.


First, it seems that pysam changed its syntax for indexing and sorting BAM files, but I could not fix that in the rMATS code. I also found that rMATS tries to sort and index the BAM files even if I have already done that manually.

I commented out the lines of the Python code that sort and index, and simply copied each BAM and its index into the REP_1 or REP_2 folder under the names alignedSorted.bam and .bai.

It worked normally after that.
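
The workaround described above might look roughly like the sketch below. It is not the actual rMATS patch: `stage_sorted_bam` is a hypothetical helper, the target names `aligned.sorted.bam` and `aligned.sorted.bam.bai` are taken from the rMATS lines quoted later in this thread, and it assumes each input BAM is already coordinate-sorted with its index stored next to it as `<bam>.bai`.

```python
import os
import shutil


def stage_sorted_bam(bam_path, rep_dir):
    """Copy a pre-sorted BAM and its index into one replicate folder."""
    index_path = bam_path + ".bai"
    for required in (bam_path, index_path):
        if not os.path.isfile(required):
            raise IOError("missing input file: " + required)
    target_bam = os.path.join(rep_dir, "aligned.sorted.bam")
    shutil.copyfile(bam_path, target_bam)             # the alignments
    shutil.copyfile(index_path, target_bam + ".bai")  # the matching index
    return target_bam
```

Each replicate folder needs its own source BAM; copying one hard-coded file into every folder would make all samples identical.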

ken...@hudsonalpha.org

Aug 10, 2016, 1:15:26 PM
to rMATS User Group
I am new to python, would you mind sharing the new code you introduced into the .py? I commented out rMATS code and here is what I tried instead:

    #pysam.sort(bam_fn,rTempFolder+'/aligned.sorted'); ## it will make aligned.sorted.bam file

    #pysam.index(rTempFolder+'/aligned.sorted.bam'); ## it will make aligned.sorted.bam.bai file

    shutil.copyfile(/Volumes/kengelHD/AlternativeSplicingTests/ENCODEbams/SL13023.sorted.bam, rTempFolder+'/aligned.sorted.bam')

    shutil.copyfile(/Volumes/kengelHD/AlternativeSplicingTests/ENCODEbams/SL13023.sorted.bam.bai, rTempFolder+'/aligned.sorted.bam.bai')

YI XING

Aug 10, 2016, 2:16:59 PM
to ken...@hudsonalpha.org, rMATS User Group

This issue is caused by changes to Samtools’s API. We just updated our program to support Samtools v0.1.19/1.2/1.3 so this error should go away. Please wait for a few days before we finish testing and upload the new version to sourceforge.


ken...@hudsonalpha.org

Aug 10, 2016, 5:47:32 PM
to rMATS User Group, ken...@hudsonalpha.org
Just to update from earlier. Adding quotes to the path fixed the syntax error. It ran for a while but then gave me this error:

#    pysam.sort(bam_fn,rTempFolder+'/aligned.sorted'); ## it will make aligned.sorted.bam file
#    pysam.index(rTempFolder+'/aligned.sorted.bam'); ## it will make aligned.sorted.bam.bai file
    shutil.copyfile('/Volumes/kengelHD/AlternativeSplicingTests/ENCODEbams/SL13023.sorted.bam', rTempFolder+'/aligned.sorted.bam')
    shutil.copyfile('/Volumes/kengelHD/AlternativeSplicingTests/ENCODEbams/SL13023.sorted.bam.bai', rTempFolder+'/aligned.sorted.bam.bai')

python RNASeq-MATS.py -b1 /Volumes/kengelHD/AlternativeSplicingTests/ENCODEbams/ENCFF666DTY.sort.bam -b2 /Volumes/kengelHD/AlternativeSplicingTests/ENCODEbams/SL13023.sorted.bam -gtf /Users/kengel/rMATS.3.2.4/gtf/Homo_sapiens.Ensembl.GRCh37.72.gtf -o /Volumes/kengelHD/AlternativeSplicingTests/bam_test -t paired -len 101 -c 0.0001 -analysis U -libType fr-firststrand

Output Error: /Volumes/kengelHD/AlternativeSplicingTests/bam_test/log.RNASeq-MATS.2016-08-10\ 13\:30\:04.898617.txt
2016-08-10 14:19:47,588 done getting AS events..
2016-08-10 14:19:47,611 Setting proper string
2016-08-10 14:19:47,663 start making MATS input files from AS events and SAM files
2016-08-10 14:19:47,663 making MATS input function..
2016-08-10 14:40:02,242 making MATS input is done with status 256
2016-08-10 14:40:02,260 error in making MATS input 256
2016-08-10 14:40:02,260 error detail: Traceback (most recent call last):
  File "/Users/kengel/rMATS.3.2.4/bin/MATS.processsUnique.bam.py", line 1518, in <module>
    processSample_stranded(sample_2,S2,dataType,'first')
  File "/Users/kengel/rMATS.3.2.4/bin/MATS.processsUnique.bam.py", line 1424, in processSample_stranded
    if (cStart<=(a5ss[chr][group][c][3]-(rL-junctionLength/2)+1) and cEnd<=a5ss[chr][group][c][1] and cEnd>=(a5ss[chr][gr$
KeyError: 4946
2016-08-10 14:40:02,261 There is an exception in making MATS input
2016-08-10 14:40:02,261 Exception: <type 'exceptions.Exception'>
2016-08-10 14:40:02,261 Detail:
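
A note on `status 256` in the log: if rMATS logs the raw status returned by a POSIX shell call (an assumption, since the calling code is not shown here), the value packs the child's exit code into the high byte. 256 would then correspond to exit code 1, which is what Python uses when a script dies with an uncaught exception such as the KeyError above. `decode_wait_status` is a hypothetical helper illustrating that arithmetic.

```python
def decode_wait_status(status):
    """Split a 16-bit POSIX wait status into (exit_code, signal_number)."""
    exit_code = (status >> 8) & 0xFF   # high byte: exit code of the child
    signal_number = status & 0x7F      # low 7 bits: terminating signal, if any
    return exit_code, signal_number


print(decode_wait_status(256))  # -> (1, 0)
print(decode_wait_status(0))    # -> (0, 0)
```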

zong

Aug 15, 2016, 2:45:23 AM
to rMATS User Group
This is the command I used:

python /clusterdata/hiseq_apps/bin/freeze001/rMATS/rMATS.3.2.4/RNASeq-MATS.py -b1 /illumina/Data/others/AlternativeSplicingTools/U2OS_SS/results_hisat_cut75/hisat/SRR1362999/accepted_hits.corsort.bam,/illumina/Data/others/AlternativeSplicingTools/U2OS_SS/results_hisat_cut75/hisat/SRR1363002/accepted_hits.corsort.bam,/illumina/Data/others/AlternativeSplicingTools/U2OS_SS/results_hisat_cut75/hisat/SRR1363005/accepted_hits.corsort.bam -b2 /illumina/Data/others/AlternativeSplicingTools/U2OS_SS/results_hisat_cut75/hisat/SRR1362999/accepted_hits.corsort.bam,/illumina/Data/others/AlternativeSplicingTools/U2OS_SS/results_hisat_cut75/hisat/SRR1363002/accepted_hits.corsort.bam,/illumina/Data/others/AlternativeSplicingTools/U2OS_SS/results_hisat_cut75/hisat/SRR1363005/accepted_hits.corsort.bam -gtf /clusterdata/hiseq_apps/resources/freeze001/hg19/hg19.gtf -o /illumina/Data/others/AlternativeSplicingTools/U2OS_U2OS_SS/results_hisat/rMATS -t paired -len 75 -libType fr-firststrand -novelSS 1 -o /illumina/Data/others/AlternativeSplicingTools/U2OS_U2OS_SS/results_hisat/rMATS


and I got the following ERROR message:

2016-08-11 20:04:15,759 error detail: Traceback (most recent call last):
  File "/clusterdata/hiseq_apps/bin/freeze001/rMATS/rMATS.3.2.4/bin/MATS.processsUnique.bam.py", line 1515, in <module>
    processSample_stranded(sample_1,S1,dataType,'first')
  File "/clusterdata/hiseq_apps/bin/freeze001/rMATS/rMATS.3.2.4/bin/MATS.processsUnique.bam.py", line 1447, in processSample_stranded
    if (cStart>a5ss[chr][group][c][0] and cStart<=(a5ss[chr][group][c][2]-(rL-junctionLength/2)+1) and cEnd>=(a5ss[chr][group][c][2]+(rL-junctionLength/2))): ## multi-exon read supporting target
KeyError: 52
2016-08-11 20:04:15,759 There is an exception in making MATS input
2016-08-11 20:04:15,759 Exception: <type 'exceptions.Exception'>
2016-08-11 20:04:15,759 Detail:


It looks like the option "-libType fr-firststrand" is the problem, since rMATS finished successfully when I ran it without this option.

Any help will be appreciated.

Zong