Qualimap v.0.8 bamqc java.lang.ArrayIndexOutOfBoundsException


鲁娜

Mar 25, 2014, 12:12:31 PM
to qual...@googlegroups.com
Dear QualiMap authors and users,

I tried to run bamqc with Qualimap v.0.8, but I get the following java.lang.IndexOutOfBoundsException error:

Details:

$ ~/tools/qualimap/qualimap bamqc --java-mem-size=15G -c -gd HUMAN -nt 8 -outformat PDF  -bam test.bam

Java memory size is set to 15G
Launching application...

QualiMap v.0.8
Built on 2014-03-05 17:17

Selected tool: bamqc
Available memory (Mb): 32
Max memory (Mb): 14316
Tue Mar 25 17:56:02 CST 2014            WARNING output folder already exists
Starting bam qc....
Loading sam header...
Loading locator...
Loading reference...
Number of windows: 400, effective number of windows: 492
Chunk of reads size: 1000
Number of threads: 8
Processed 50 out of 492 windows...
Processed 100 out of 492 windows...
Processed 150 out of 492 windows...
Processed 200 out of 492 windows...
Processed 250 out of 492 windows...
Processed 300 out of 492 windows...
Processed 350 out of 492 windows...
Processed 400 out of 492 windows...
Processed 450 out of 492 windows...
Total processed windows:492
Number of reads: 9857870
Number of valid reads: 7363854
Number of duplicated reads: 0
Number of correct strand reads:0
Tue Mar 25 18:06:03 CST 2014            WARNING SAMRecordParser failed to process 2 reads.

Inside of regions...
Num mapped reads: 7363854
Num mapped first of pair: 3722096
Num mapped second of pair: 3641758
Num singletons: 118556
Time taken to analyze reads: 600
Computing descriptors...
numberOfMappedBases: 467019594
referenceSize: 3137161264
numberOfSequencedBases: 466796496
numberOfAs: 129856974
Computing per chromosome statistics...
Computing histograms...


The error is:
java.lang.IndexOutOfBoundsException: Index: 65, Size: 65
        at java.util.ArrayList.RangeCheck(ArrayList.java:547)
        at java.util.ArrayList.get(ArrayList.java:322)
        at org.bioinfo.ngs.qc.qualimap.beans.BamStats.computeReadsContentHistogrmas(BamStats.java:880)
        at org.bioinfo.ngs.qc.qualimap.beans.BamStats.computeHistograms(BamStats.java:801)
        at org.bioinfo.ngs.qc.qualimap.process.BamStatsAnalysis.run(BamStatsAnalysis.java:481)
        at org.bioinfo.ngs.qc.qualimap.main.BamQcTool.execute(BamQcTool.java:198)
        at org.bioinfo.ngs.qc.qualimap.main.NgsSmartTool.run(NgsSmartTool.java:177)
        at org.bioinfo.ngs.qc.qualimap.main.NgsSmartMain.main(NgsSmartMain.java:98)

I don't get any output file. What can I do?
Thanks,
lunar

鲁娜

Mar 25, 2014, 12:35:34 PM
to qual...@googlegroups.com
$samtools view -H test.bam 
@HD     VN:1.0  SO:coordinate
@RG     ID:FCC3RCHACXX_L7       PL:ILLUMINA     LB:RSZAXPI001722-37     SM:Mal-1
@PG     ID:bwa  PN:bwa  VN:0.6.2-r126-tpx

I then created a 1% subsample:
$samtools view -s 0.01 -b test.bam > random.bam 

The error still occurs:
$/tools/qualimap/qualimap bamqc -bam test.bam -gd HUMAN -c -outformat PDF
Java memory size is set to 1200M
Launching application...

QualiMap v.0.8
Built on 2014-03-05 17:17

Selected tool: bamqc
Available memory (Mb): 32
Max memory (Mb): 1118
Starting bam qc....
Loading sam header...
Loading locator...
Loading reference...
Number of windows: 400, effective number of windows: 492
Chunk of reads size: 1000
Number of threads: 8
Processed 50 out of 492 windows...
Processed 100 out of 492 windows...
Processed 150 out of 492 windows...
Processed 200 out of 492 windows...
Processed 250 out of 492 windows...
Processed 300 out of 492 windows...
Processed 350 out of 492 windows...
Processed 400 out of 492 windows...
Processed 450 out of 492 windows...
Total processed windows:492
Number of reads: 98875
Number of valid reads: 73796
Number of duplicated reads: 0
Number of correct strand reads:0

Inside of regions...
Num mapped reads: 73796
Num mapped first of pair: 37295
Num mapped second of pair: 36501
Num singletons: 1204
Time taken to analyze reads: 240
Computing descriptors...
numberOfMappedBases: 4680535
referenceSize: 3137161264
numberOfSequencedBases: 4678062
numberOfAs: 1302189
Computing per chromosome statistics...
Computing histograms...

Konstantin Okonechnikov

Apr 22, 2014, 4:35:13 AM
to qual...@googlegroups.com
Hi! Sorry for the late reply.

It would be great if you could attach your small random sample of test.bam, or send it to me directly, so we can investigate the problem in detail.

Thanks in advance,
   Konstantin

Travis Collier

Sep 25, 2014, 4:35:21 AM
to qual...@googlegroups.com
Did you ever figure this out? I'm having the same issue (I think).
"Computing histograms" fails while trying to read past the end of a list. In my case, it requests index 150 of a list of size 150.
Our data are PE150 Illumina reads, but they have been trimmed with Trimmomatic, so some reads are a bit shorter. They were mapped with bwa mem 0.7.5a-r405, then run through Picard MarkDuplicates and GATK local realignment.

The error isn't consistent. For our latest data it failed on 2 out of 8 samples but ran fine on the other 6.
I've tried both v0.6 and v2.0 (both from your builds).

Sadly, I don't see how to make you a small test BAM, since the error is inconsistent. There isn't anything obviously different between the data it fails on and the data it works for (all processed through an identical workflow, with similar qualities, similar numbers of reads, etc.). I could give you access to download one of the failing BAM files, but they are really big (a bit over 3 GB). If you'd like that, just say so.

Here is the command and output...
$install/qualimap_v2.0/qualimap bamqc -c -nt 30 --java-mem-size=4G -bam  realigned.bam -outdir realigned.bam.qualimap
Java memory size is set to 4G
Launching application...

QualiMap v.2.0
Built on 2014-08-28 17:03

Selected tool: bamqc
Available memory (Mb): 33
Max memory (Mb): 3817
Starting bam qc....
Loading sam header...
Loading locator...
Loading reference...
Number of windows: 400, effective number of windows: 400
Chunk of reads size: 1000
Number of threads: 30
Processed 50 out of 400 windows...
Processed 100 out of 400 windows...
Processed 150 out of 400 windows...
Processed 200 out of 400 windows...
Processed 250 out of 400 windows...
Processed 300 out of 400 windows...
Processed 350 out of 400 windows...
Processed 400 out of 400 windows...
Total processed windows:400
Number of reads: 35905466
Number of valid reads: 299741
Number of duplicated reads: 47786
Number of correct strand reads:0

Inside of regions...
Num mapped reads: 299741
Num mapped first of pair: 128845
Num mapped second of pair: 128691
Num singletons: 1310
Time taken to analyze reads: 85
Computing descriptors...
numberOfMappedBases: 12375351
referenceSize: 5113802
numberOfSequencedBases: 12371766
numberOfAs: 2512352
Computing per chromosome statistics...
Computing histograms...
Failed to run bamqc

java.lang.IndexOutOfBoundsException: Index: 150, Size: 150
        at java.util.ArrayList.rangeCheck(ArrayList.java:635)
        at java.util.ArrayList.get(ArrayList.java:411)
        at org.bioinfo.ngs.qc.qualimap.beans.BamStats.computeReadsContentHistogrmas(BamStats.java:876)
        at org.bioinfo.ngs.qc.qualimap.beans.BamStats.computeHistograms(BamStats.java:797)
        at org.bioinfo.ngs.qc.qualimap.process.BamStatsAnalysis.run(BamStatsAnalysis.java:493)
        at org.bioinfo.ngs.qc.qualimap.main.BamQcTool.execute(BamQcTool.java:197)
        at org.bioinfo.ngs.qc.qualimap.main.NgsSmartTool.run(NgsSmartTool.java:187)
        at org.bioinfo.ngs.qc.qualimap.main.NgsSmartMain.main(NgsSmartMain.java:103)
Thu Sep 25 01:08:21 PDT 2014            WARNING Cleanup output dir

PS: I've just downloaded the source. Building it is a pain (I'm not a Java guy, so my machine isn't set up for it and I don't know the build system).
However, I think I might have found the bug:
ensureListSize(...) in BamStats.java calls list.size(), but XYVector defines a getSize() method, not a size() method. That shouldn't work at all IMO, but maybe XYVector inherits a size() method from Serializable that can actually be called (a bug in Serializable IMO).
Anyway, if you define a size() method for XYVector that just returns getSize(), that *might* fix this problem. I'd suggest doing that anyway, just so that XYVector has the canonical interface fully defined.
Sorry that I can't spend the time at the moment to get all the prerequisites and figure out the build system to actually test it.

Konstantin Okonechnikov

Sep 25, 2014, 1:18:16 PM
to qual...@googlegroups.com
Dear Travis,

thanks a lot for your report and for the investigation!

I think the best way would be to send me the download link to the problematic BAM file. This way I will be able to fix the issue for sure.

You can send it privately to my work e-mail:
okonechnikov(at)mpiib-berlin(dot)mpg(dot)de


--
Konstantin




Travis Collier

Sep 25, 2014, 10:55:45 PM
to qual...@googlegroups.com, k.okone...@gmail.com
I'm pretty sure I've found (and fixed) the problem.  I'm working off the v2.0 source (finally figured out how to build it).
I misread the code earlier, so please ignore my original suggested fix: the offending object is an ArrayList<Long>, not an XYVector.

However, I do see that your for loop at BamStats.java:870 (in 2.0-src) is:
        for (int i = 0; i <= readMaxSize; ++i ) {
But just above that you're only ensuring the size goes up to readMaxSize (not readMaxSize+1):
        ensureListSize(readsNsData, readMaxSize);

My guess is that the "<=" in the for loop condition should just be "<".  
        for (int i = 0; i < readMaxSize; ++i ) {
Alternatively, you may have intended readsNsData (and the other ArrayLists) to be readMaxSize+1 in size.

I've got the build working, and either of those fixes seems to resolve the issue. I don't know about correctness, though.
The results look nearly identical:
insert_size_across_reference.txt has slightly different values (less than 1% difference as far as I can see).
mapped_reads_nucleotide_content.txt for the alternative (readMaxSize+1) fix has an extra line:
  150.0 NaN     NaN     NaN     NaN     NaN
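
The off-by-one pattern described in this post can be reproduced in isolation. The following is a minimal sketch, not Qualimap's code: the names ensureListSize, readsNsData, and readMaxSize come from the snippets quoted above, while the class OffByOneDemo, the method bodies, and the zero-padding behavior of ensureListSize are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal reproduction of the off-by-one pattern discussed in the thread.
// The bodies below are assumptions for illustration, not Qualimap's source.
class OffByOneDemo {

    // Assumed behavior: pad the list with zeros until it holds `size` elements.
    static void ensureListSize(List<Long> list, int size) {
        while (list.size() < size) {
            list.add(0L);
        }
    }

    // Loop bound as quoted from BamStats.java:870 (v2.0 source): "<=".
    static long sumWithInclusiveBound(int readMaxSize) {
        List<Long> readsNsData = new ArrayList<>();
        ensureListSize(readsNsData, readMaxSize); // valid indices: 0 .. readMaxSize-1
        long total = 0;
        for (int i = 0; i <= readMaxSize; ++i) {
            total += readsNsData.get(i);          // throws when i == readMaxSize
        }
        return total;
    }

    // Same loop with the proposed "<" bound.
    static long sumWithExclusiveBound(int readMaxSize) {
        List<Long> readsNsData = new ArrayList<>();
        ensureListSize(readsNsData, readMaxSize);
        long total = 0;
        for (int i = 0; i < readMaxSize; ++i) {
            total += readsNsData.get(i);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("exclusive bound: " + sumWithExclusiveBound(150));
        try {
            sumWithInclusiveBound(150);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("inclusive bound failed: " + e.getMessage());
        }
    }
}
```

With the "<=" bound the loop requests index readMaxSize from a list whose last valid index is readMaxSize-1, which matches the shape of the reported error (index 150, size 150). The exact wording of the exception message depends on the Java version.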

Konstantin Okonechnikov

Sep 26, 2014, 6:44:00 AM
to qual...@googlegroups.com
Dear Travis,

you are right, this is the typical bug of comparing with <= against the list size instead of simply <. I applied the fix and uploaded the snapshot:

However, the bug has nothing to do with insert size; it should only affect the "Reads Nucleotide Content Histogram". The issue appeared random because it only applies to alignment data with varying read sizes, and only in some special cases (when the initial read size estimate was too small).

Thanks a lot for your help!

Best regards,
  Konstantin
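
Since the bug only applies to alignment data with varying read sizes, it can help to check how read lengths are distributed in a BAM file that fails. The following is a rough sketch under stated assumptions: it reads SAM text from standard input (for example, piped from samtools view) and counts the lengths found in the SEQ column (column 10). The class ReadLengthHistogram and its methods are hypothetical helpers, not part of Qualimap or SAMtools.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.Map;
import java.util.TreeMap;

// Counts read lengths from SAM text on standard input.
// Usage after compiling (file name is a placeholder):
//   samtools view sample.bam | java ReadLengthHistogram
// Illustrative helper only; not part of Qualimap.
class ReadLengthHistogram {

    // Returns the length of the SEQ field (column 10), or -1 if the line is a
    // header line, has fewer than 10 columns, or stores no sequence ("*").
    static int readLength(String samLine) {
        if (samLine.isEmpty() || samLine.charAt(0) == '@') {
            return -1;
        }
        String[] fields = samLine.split("\t");
        if (fields.length < 10 || fields[9].equals("*")) {
            return -1;
        }
        return fields[9].length();
    }

    // Maps each observed read length to the number of reads with that length.
    static Map<Integer, Long> histogram(Iterable<String> samLines) {
        Map<Integer, Long> counts = new TreeMap<>();
        for (String line : samLines) {
            int len = readLength(line);
            if (len >= 0) {
                counts.merge(len, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        Map<Integer, Long> counts = histogram(in.lines()::iterator);
        // One line per length: "<length><TAB><number of reads>"
        counts.forEach((len, n) -> System.out.println(len + "\t" + n));
    }
}
```

If the output contains more than one length, the file falls into the class of inputs that the fix targets. That alone does not prove the file would trigger the error.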





Alexander Peltzer

Dec 29, 2014, 9:37:57 AM
to qual...@googlegroups.com, k.okone...@gmail.com
Could you make this available, maybe as an official version 2.0.1 or something similar? I had the same issue, and using the latest dev snapshot worked well for me, too.

Konstantin Okonechnikov

Dec 29, 2014, 11:27:55 AM
to qual...@googlegroups.com
Dear Alex,

thanks for the advice!

Yep, we hope to release 2.0.1 pretty soon.

--
  Konstantin


Chongjing Xia

Apr 10, 2015, 12:49:10 AM
to qual...@googlegroups.com, k.okone...@gmail.com

Hello Konstantin:

I am using QualiMap v.2.1 to run BAM QC and ran into a similar problem. The following is the error I got:


Launching application...

QualiMap v.2.1
Built on 2015-03-19 12:05

Selected tool: bamqc
Available memory (Mb): 32
Max memory (Mb): 3817
Starting bam qc....
Loading sam header...
Loading locator...
Loading reference...
Number of windows: 400, effective number of windows: 402
Chunk of reads size: 1000
Number of threads: 12
Processed 50 out of 402 windows...
Processed 100 out of 402 windows...
Processed 150 out of 402 windows...
Processed 200 out of 402 windows...
Processed 250 out of 402 windows...
Processed 300 out of 402 windows...
Processed 350 out of 402 windows...
Processed 400 out of 402 windows...
Total processed windows:401
Number of reads: 4158062
Number of valid reads: 4147723
Number of duplicated reads: 0
Number of correct strand reads:0

Inside of regions...
Num mapped reads: 4147723
Num mapped first of pair: 2073896
Num mapped second of pair: 2073827
Num singletons: 3789
Time taken to analyze reads: 30
Computing descriptors...
numberOfMappedBases: 417609800
referenceSize: 60000003
numberOfSequencedBases: 413430557
numberOfAs: 135922366
Std coverage squared:21.645729063229332
Std coverage squared v2:-24.868628234744133
Computing per chromosome statistics...
Failed to run bamqc

java.lang.IndexOutOfBoundsException: Index: 401, Size: 401
        at java.util.ArrayList.RangeCheck(ArrayList.java:547)
        at java.util.ArrayList.get(ArrayList.java:322)
        at org.bioinfo.ngs.qc.qualimap.beans.BamStats.computeChromosomeStats(BamStats.java:1725)
        at org.bioinfo.ngs.qc.qualimap.process.BamStatsAnalysis.run(BamStatsAnalysis.java:500)
        at org.bioinfo.ngs.qc.qualimap.main.BamQcTool.execute(BamQcTool.java:203)
        at org.bioinfo.ngs.qc.qualimap.main.NgsSmartTool.run(NgsSmartTool.java:187)
        at org.bioinfo.ngs.qc.qualimap.main.NgsSmartMain.main(NgsSmartMain.java:103)
Thu Apr 09 21:32:47 PDT 2015            WARNING Cleanup output dir

There should be no problem with my BAM file, since I am working with four BAM files and this is the only one that fails in Qualimap BAM QC. What should I do about this?

Thanks

Konstantin Okonechnikov

Apr 10, 2015, 3:59:44 AM
to qual...@googlegroups.com
Hi Chongjing,

thanks for the report!

Could you please share the BAM file with me (using drive.google.com, for example), so I can check it carefully?

Additionally, you can create a small subsample from the BAM file and try once again. If BAM QC fails on the subsample, then share the subsample.

A subsample can be created using SAMtools:
samtools view -s 0.01 -b file.bam > file.sample.bam

--
  Konstantin

Bruno Zeitouni

Apr 23, 2015, 6:09:00 AM
to qual...@googlegroups.com, dul...@gmail.com
Dear Konstantin,

I have the same error as Chongjing with some of my BAM files.

Have you solved the problem, or do you have any idea where the error is coming from?

Thanks,

Bruno


Java memory size is set to 20G

Launching application...

QualiMap v.2.1
Built on 2015-03-19 12:05

Selected tool: bamqc
Available memory (Mb): 33
Max memory (Mb): 19088
Thu Apr 23 12:32:26 BST 2015        WARNING    output folder already exists


Starting bam qc....
Loading sam header...
Loading locator...
Loading reference...
Number of windows: 400, effective number of windows: 424

Chunk of reads size: 1000
Number of threads: 4
Initializing regions from /media/bioinf_data/annotations/Homo_sapiens/AGILENT_NT_KITS/50mb_V5/50mb_V5_Covered.bed.....
Found 230418 regions
Filling region references...
Processed 50 out of 424 windows...
Processed 100 out of 424 windows...
Processed 150 out of 424 windows...
Processed 200 out of 424 windows...
Processed 250 out of 424 windows...
Processed 300 out of 424 windows...
Processed 350 out of 424 windows...
Processed 400 out of 424 windows...
Total processed windows:423
Number of reads: 125694423
Number of valid reads: 125694423

Number of duplicated reads: 0
Number of correct strand reads:0

Inside of regions...
Num mapped reads: 89410331
Num mapped first of pair: 44986758
Num mapped second of pair: 44423573
Num singletons: 160623
Time taken to analyze reads: 722
Computing descriptors...
numberOfMappedBases: 8726165961
referenceSize: 3095693983
numberOfSequencedBases: 8725634061
numberOfAs: 2270209700
Std coverage squared:14754.560221051099
Std coverage squared v2:-15226.1525283338

Computing per chromosome statistics...
Failed to run bamqc

java.lang.IndexOutOfBoundsException: Index: 423, Size: 423

    at java.util.ArrayList.rangeCheck(ArrayList.java:635)
    at java.util.ArrayList.get(ArrayList.java:411)
    at org.bioinfo.ngs.qc.qualimap.beans.BamStats.computeChromosomeStats(BamStats.java:1725)
    at org.bioinfo.ngs.qc.qualimap.process.BamStatsAnalysis.run(BamStatsAnalysis.java:500)
    at org.bioinfo.ngs.qc.qualimap.main.BamQcTool.execute(BamQcTool.java:203)
    at org.bioinfo.ngs.qc.qualimap.main.NgsSmartTool.run(NgsSmartTool.java:187)
    at org.bioinfo.ngs.qc.qualimap.main.NgsSmartMain.main(NgsSmartMain.java:103)
Thu Apr 23 12:44:29 BST 2015        WARNING    Cleanup output dir

Konstantin Okonechnikov

Apr 24, 2015, 9:11:47 AM
to qual...@googlegroups.com
Hi Bruno,

thanks a lot for the report!

Unfortunately, Chongjing did not share any files with me and did not answer my questions.

Could you please share the BAM and annotation files with me? I will check the issue in detail.

--
  Konstantin


rspreafico

Oct 28, 2015, 5:29:31 PM
to QualiMap, k.okone...@gmail.com
Hi Konstantin

I also got an index-out-of-bounds exception with Qualimap v2.1.2, run on a paired-end sample aligned against mm10:

Java memory size is set to 4G
Launching application...

detected environment java options -Djava.awt.headless=true
QualiMap v.2.1.2
Built on 2015-09-23 14:22

Selected tool: bamqc
Available memory (Mb): 1897
Max memory (Mb): 28155
Starting bam qc....
Loading sam header...
Loading locator...
Loading reference...
Number of windows: 400, effective number of windows: 465
Chunk of reads size: 1000
Number of threads: 32
Processed 50 out of 465 windows...
Processed 100 out of 465 windows...
Processed 150 out of 465 windows...
Processed 200 out of 465 windows...
Processed 250 out of 465 windows...
Processed 300 out of 465 windows...
Processed 350 out of 465 windows...
Processed 400 out of 465 windows...
Processed 450 out of 465 windows...
Total processed windows:464
Number of reads: 11711330
Number of valid reads: 11711330
Number of duplicated reads: 0
Number of correct strand reads:0
Number of reads with

Inside of regions...
Num mapped reads: 11711330
Num mapped first of pair: 5855665
Num mapped second of pair: 5855665
Num singletons: 0
Time taken to analyze reads: 116
Computing descriptors...
numberOfMappedBases: 581328331
referenceSize: 2730871774
numberOfSequencedBases: 581301587
numberOfAs: 152195210
Computing per chromosome statistics...
Failed to run bamqc
java.lang.IndexOutOfBoundsException: Index: 464, Size: 464
        at java.util.ArrayList.rangeCheck(ArrayList.java:635)
  
Funnily enough, this is the only BAM file within the same experiment that gave me issues: all other BAM files went through the exact same pipeline and gave no errors.

I can share both the GTF and the BAM files privately.

Thank you for your help,

Roberto

Konstantin Okonechnikov

Oct 29, 2015, 5:10:21 AM
to qual...@googlegroups.com
Hi Roberto,

thanks a lot for the report!

It would be great if you could share the data privately and provide the command line options that you used. I'll then check the issue in detail.

My e-mail: k.okonechnikov [at] gmail [dot] com

--
   Konstantin


rspreafico

Oct 29, 2015, 11:51:30 AM
to QualiMap, k.okone...@gmail.com
Hi Konstantin,

thank you! This is the command:

JAVA_OPTS="-Djava.awt.headless=true" qualimap bamqc -c -bam sample.bam -outdir ./test --java-mem-size=4G

It worked for every other BAM file produced along with the faulty one.

I'm sharing the link to the BAM file privately.

Thanks,

Roberto

Konstantin Okonechnikov

Oct 30, 2015, 8:22:53 AM
to qual...@googlegroups.com
Hi Roberto,

once again, thanks a lot for sharing the data! I figured out the issue: the bug was in window processing.

The error was rather rare, since it occurs only if the last window contains exactly as many reads as the maximum size of a processing chunk. Now it is fixed.

Here's the link to the new version with the fix:

--
  Konstantin
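
The thread does not show the code behind this fix, so the following is only a hypothetical illustration of how such a boundary condition can arise, not a reconstruction of Qualimap's logic. It assumes a loop that buffers reads in fixed-size chunks and, at the end of the input, finalizes the last window only if leftover reads remain in the buffer. The class LastWindowDemo, the method processWindows, and all parameters are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: a chunked window loop that drops the final
// window when its reads exactly fill the last chunk.
// None of these names or this logic are taken from Qualimap's source.
class LastWindowDemo {

    // readsPerWindow[i] is the number of reads falling into window i.
    // Returns one entry (the processed read count) per finalized window.
    static List<Integer> processWindows(int[] readsPerWindow, int chunkSize, boolean fixed) {
        List<Integer> finalized = new ArrayList<>();
        int buffered = 0; // reads waiting in the current chunk
        for (int w = 0; w < readsPerWindow.length; w++) {
            int processed = 0;
            for (int r = 0; r < readsPerWindow[w]; r++) {
                buffered++;
                if (buffered == chunkSize) { // chunk full: process and empty it
                    processed += buffered;
                    buffered = 0;
                }
            }
            boolean last = (w == readsPerWindow.length - 1);
            // Buggy variant: at the end of input, an empty buffer is taken to
            // mean "nothing left to do", so the last window is never recorded.
            if (last && !fixed && buffered == 0) {
                break;
            }
            processed += buffered; // flush leftovers at the window boundary
            buffered = 0;
            finalized.add(processed);
        }
        return finalized;
    }

    public static void main(String[] args) {
        int[] readsPerWindow = {3, 5, 4}; // last window holds exactly one full chunk
        List<Integer> buggy = processWindows(readsPerWindow, 4, false);
        List<Integer> repaired = processWindows(readsPerWindow, 4, true);
        System.out.println("windows: " + readsPerWindow.length
                + ", finalized (buggy): " + buggy.size()
                + ", finalized (fixed): " + repaired.size());
    }
}
```

In the buggy variant the result list ends up one entry shorter than the number of windows, so any later code that indexes it by window number would fail on the last index. That is consistent with the logs above, where "Total processed windows" is one less than the effective number of windows and the failing index equals the list size.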





 

rspreafico

Nov 3, 2015, 5:46:53 PM
to QualiMap, k.okone...@gmail.com
Thank you Konstantin!

Daniel Sobral

Nov 24, 2015, 10:47:12 AM
to QualiMap, dul...@gmail.com
Hello,

I have also had this problem.
I changed the number of windows and it seemed to work... at least for my particular case.

Daniel