Hi,
I’ve been trying to run qualimap rnaseq on paired-end files with mixed results. I tried it on several versions (2.2.2a, 2.2.2b, 2.2.2c and 2.2.2d installed via conda ; I also tried the latest build (31-08-20) available on bitbucket) and with several input files.
Here’s an example of the command I used :
qualimap rnaseq -bam X_sorted.bam -gtf X.gtf -a uniquely-mapped-reads --sequencing-protocol non-strand-specific -pe --java-mem-size=32000M
The annotation files are directly downloaded from the NCBI.
The bam files are obtained using a classic worklow : fastq are downloaded from the SRA, trimmed for adapters and bad bases, mapped using Hisat2 ; the sam file produced by Hisat2 is then converted into bam, the multi-mapping reads are removed and the bam file is finally sorted (by location, not by name) and indexed (all the last steps are performed using samtools). I tried viewing these files and I can clearly see pairs of reads.
And here’s an example of qualimap’s log :
Display:
Java memory size is set to 32000M
Launching application...
QualiMap v.2.2.2-dev
Built on 2019-11-11 14:05
Selected tool: rnaseq
Initializing regions from X.gtf...
Initialized 100000 regions...
Initialized 364586 regions it total
Starting constructing transcripts for RNA-seq stats...
Finished constructing transcripts
Starting BAM file analysis
Sorting BAM file by name...
Read 10000000 records.
Finished reading inputs, merging and writing to output now.
Sorting by name finished.
WARNING! The following chromosomes from reads are not found in annotations:
[I removed the list in order to avoid a long post]
Counted 0 read pairs, 0 single reads
The number of skipped alignments having only one read in fragment: 28825354
Processed 28825354 reads in total
Cleanup of temporary files
BAM file analysis finished
Creating plots
Writing HTML report...
HTML report created successfully
I’m not bothered by the missing chromosomes because some contigs/scaffolds are not annotated, so it’s not a problem. However, as you can see in bold, qualimap finds reads but doesn’t count them as pairs or single reads, it just skips them.
I’ve also tried the same command without the -pe option and it worked fine.
Do you have any idea about what could be causing that ?
Thanks in advance,
Best,
Antonin THIÉBAUT