rna-seq error


Hong Hur

May 26, 2017, 3:17:35 PM
to QualiMap
Hello all,

I am using Qualimap as a QC tool in an RNA-seq workflow that runs in Docker.
It had been working fine, but I have run into a strange problem: when the pipeline runs on paired-end samples, Qualimap keeps failing with "Failed to run rnaseq" while writing to an output file.

More detail on the setup:
I am testing two paired-end samples. Sample A sometimes gets through Qualimap and sometimes fails, while Sample B fails every time.
We also have a workflow set up in an HPC environment. There, Qualimap has had no issues.


This is what I have tried:
- The augmented volumes have plenty of quota; they are not running out of space or inodes.
- I tried pointing -Djava.io.tmpdir at another location, but had no luck. Since I am running Qualimap in a Docker container, I am guessing that setting the temp directory needs a different approach.
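
A quick check that may help narrow this down, offered as a sketch rather than a diagnosis: the "No space left on device" in the trace refers to whichever filesystem holds the JVM's temporary directory, and inside a container that may be the container's own writable layer rather than one of the mounted volumes. The function names below are made up for illustration, and the script assumes a Linux `df` that accepts `-P`, `-k` and `-i`.

```shell
#!/bin/sh
# Report free space and free inodes on the filesystem that holds a directory.
# Meant to be run *inside* the container, against the directory the JVM uses
# for temporary files (/tmp on Linux unless -Djava.io.tmpdir says otherwise).

# Print the available kilobytes on the filesystem containing directory $1.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Print the free inodes on the filesystem containing directory $1.
avail_inodes() {
    df -Pi "$1" | awk 'NR == 2 { print $4 }'
}

# Directory to inspect: first argument, or /tmp when none is given.
tmp_dir="${1:-/tmp}"
echo "$tmp_dir: $(avail_kb "$tmp_dir") KB free, $(avail_inodes "$tmp_dir") inodes free"
```

If the numbers reported for the temp directory are much smaller than those for /analysis or /bank, the temp files are probably not landing on the volumes that have the quota.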

I am wondering if anyone with more experience using Docker could help me resolve this issue.


This is the command I use to start the workflow:

docker run --rm -v /data1/hpc:/bank --rm -v /data1/work_new:/analysis hhur/snipe_docker:latest snakemake -s /usr/local/snipe/rna_seq_pipe.snake --configfile /analysis/conf.yaml -d /analysis -j 12 --resources memory=34359738368 -p


This is the qualimap command generated by the workflow:

qualimap rnaseq -outdir final/qc/CLR10-2_S14_001_qualimap/ -a proportional --sequencing-protocol strand-specific-reverse --paired -bam STAR/CLR10-2_S14_001.bam -gtf /bank/Hsapiens/GRCh37/rnaseq/ref-transcripts.gtf --java-mem-size=4294967296 > log/CLR10-2_S14_001.qualimap.log 2>&1


This is the error message I get:

Initialized 2000000 regions...
Initialized 2100000 regions...
Initialized 2200000 regions...
Initialized 2300000 regions...
Initialized 2400000 regions...
Initialized 2500000 regions...

Initialized 2530013 regions it total

Starting constructing transcripts for RNA-seq stats...
Finished constructing transcripts

Starting BAM file analysis

Sorting BAM file by name...

Read 10000000 records.
Failed to run rnaseq
net.sf.samtools.util.RuntimeIOException: Write error; BinaryCodec in writemode; streamed file (filename not available)
        at net.sf.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:199)
        at net.sf.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:189)
        at net.sf.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:139)
        at net.sf.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:37)
        at net.sf.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:210)
        at net.sf.samtools.util.SortingCollection.add(SortingCollection.java:150)
        at net.sf.samtools.SAMFileWriterImpl.addAlignment(SAMFileWriterImpl.java:170)
        at org.bioinfo.ngs.qc.qualimap.process.ComputeCountsTask.sortSamByName(ComputeCountsTask.java:143)
        at org.bioinfo.ngs.qc.qualimap.process.ComputeCountsTask.run(ComputeCountsTask.java:431)
        at org.bioinfo.ngs.qc.qualimap.process.RNASeqQCAnalysis.run(RNASeqQCAnalysis.java:68)
        at org.bioinfo.ngs.qc.qualimap.main.RnaSeqQcTool.execute(RnaSeqQcTool.java:188)
        at org.bioinfo.ngs.qc.qualimap.main.NgsSmartTool.run(NgsSmartTool.java:187)
        at org.bioinfo.ngs.qc.qualimap.main.NgsSmartMain.main(NgsSmartMain.java:111)
Caused by: java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:345)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
        at net.sf.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:197)
        ... 12 more



I would greatly appreciate any suggestions or tips.
Thank you !

Konstantin Okonechnikov

May 29, 2017, 6:53:53 AM
to qual...@googlegroups.com
Hi!

The problem is definitely in the sorting procedure, which is called from the Picard library's samtools implementation: it needs a tmp folder to hold its data. Setting the temp folder for that procedure should work. How did you set it? For me, for example, it worked after updating the Qualimap launch script:

TMPDIR=/location/to/tmp/folder

java $java_options -Djava.io.tmpdir=$TMPDIR  -classpath "$QUALIMAP_HOME"/qualimap.jar:"$QUALIMAP_HOME"/lib/* org.bioinfo.ngs.qc.qualimap.main.NgsSmartMain "${ARGS[@]}"
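
For a Docker setup, one possible way to get the same effect without editing the launch script is the JAVA_TOOL_OPTIONS environment variable, which HotSpot/OpenJDK JVMs read at startup. This is only a sketch: the /analysis/tmp directory is a hypothetical name, the image is assumed to ship such a JVM, and the mounts and snakemake arguments are copied from the original post. The script prints the command as a dry run instead of executing it.

```shell
#!/bin/sh
# Sketch: point the JVM's temporary directory at a bind-mounted volume.
# Assumptions (not from the thread): the tmp subdirectory name, and that the
# java in the image is a HotSpot/OpenJDK JVM, which reads JAVA_TOOL_OPTIONS.

host_work=/data1/work_new      # host directory, mounted at /analysis
container_tmp=/analysis/tmp    # hypothetical temp dir inside that mount

# The directory has to exist before the JVM starts, for example:
#   mkdir -p "$host_work/tmp"

# Print the docker command (dry run) so it can be inspected before use.
docker_cmd() {
    printf '%s ' docker run --rm \
        -v /data1/hpc:/bank \
        -v "$host_work":/analysis \
        -e "JAVA_TOOL_OPTIONS=-Djava.io.tmpdir=$container_tmp" \
        hhur/snipe_docker:latest \
        snakemake -s /usr/local/snipe/rna_seq_pipe.snake \
        --configfile /analysis/conf.yaml -d /analysis -j 12 \
        --resources memory=34359738368 -p
    printf '\n'
}

docker_cmd
```

When the variable is picked up, the JVM prints a "Picked up JAVA_TOOL_OPTIONS" line on stderr, which should show up in the Qualimap log and confirms that the setting reached the process.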

 Another solution: sort the BAM file by name in advance using the samtools binary, then pass the --sorted option to Qualimap.

--
   Konstantin
 

--
You received this message because you are subscribed to the Google Groups "QualiMap" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qualimap+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hong Hur

Jun 2, 2017, 9:54:28 AM
to QualiMap, k.okone...@gmail.com
Hello

I managed to fix the problem with a combination of the two suggestions you made.
I used samtools to sort the reads by name before passing the .bam files to Qualimap. I also used the -T option of samtools sort to set the prefix for its temporary files, pointing it at the system-level tmp directory: /tmp
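
The exact commands were not posted, so the following is only a sketch of what this fix might look like. It assumes samtools 1.3 or later (for the `-o` and `-T` syntax) and Qualimap's `--sorted` flag for name-sorted input; the `run` and `name_sort_and_qc` helpers and the `.namesorted.bam` file name are hypothetical, and the remaining options are taken from the qualimap command in the first post.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# name_sort_and_qc SAMPLE
# Name-sort STAR/SAMPLE.bam with samtools, then run qualimap on the result.
name_sort_and_qc() {
    sample=$1
    sorted="STAR/$sample.namesorted.bam"   # hypothetical output name

    # -n sorts by read name; -T sets the prefix for samtools' temp files.
    run samtools sort -n -T "/tmp/$sample" -o "$sorted" "STAR/$sample.bam" || return 1

    # --sorted tells qualimap the input is already name-sorted, so it does
    # not run its own sort.
    run qualimap rnaseq --sorted --paired \
        -outdir "final/qc/${sample}_qualimap/" -a proportional \
        --sequencing-protocol strand-specific-reverse \
        -bam "$sorted" -gtf /bank/Hsapiens/GRCh37/rnaseq/ref-transcripts.gtf \
        --java-mem-size=4294967296
}
```

Setting DRY_RUN=1 and calling `name_sort_and_qc CLR10-2_S14_001` prints both commands without running them, which makes it easy to check the paths before wiring them into the Snakemake rule.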

Thank you for your help!
Greatly appreciate it.

Hong