rnaseq error

109 views
Skip to first unread message

Richard

unread,
Mar 26, 2016, 12:12:12 AM3/26/16
to QualiMap

   Hi Konstantin,
 
   I really think your software is excellent. Recently I am trying to use tool rnaseq, but I was running into the following failures:
  --------------------------------------------------------------------------------------------------------------------------------------------------------------------
   Finished constructing transcripts
   Starting BAM file analysis
   Sorting BAM file by name...
   Failed to run rnaseq
   net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Record 7, Read name ST-E00161:151:HKTTVCCXX:4:1101:10013:66830, Mapped mate should have mate reference name
        at net.sf.samtools.SAMUtils.processValidationErrors(SAMUtils.java:448)
        at net.sf.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:506)
        at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:487)
        at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:446)
        at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:641)
        at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:619)
        at org.bioinfo.ngs.qc.qualimap.process.ComputeCountsTask.sortSamByName(ComputeCountsTask.java:142)
        at org.bioinfo.ngs.qc.qualimap.process.ComputeCountsTask.run(ComputeCountsTask.java:431)
        at org.bioinfo.ngs.qc.qualimap.process.RNASeqQCAnalysis.run(RNASeqQCAnalysis.java:68)
        at org.bioinfo.ngs.qc.qualimap.main.RnaSeqQcTool.execute(RnaSeqQcTool.java:188)
        at org.bioinfo.ngs.qc.qualimap.main.NgsSmartTool.run(NgsSmartTool.java:187)
        at org.bioinfo.ngs.qc.qualimap.main.NgsSmartMain.main(NgsSmartMain.java:111)
-------------------------------------------------------------------------------------------------------------------------------------------------------------
        Firstly, I used Mapsplice aligner to align my reads . Then I ran qualimap bamqc and everything was fine. However when I ran rnaseq, I got the problem above. So Konstantin, can you give me any hint on this problem?  Many thanks!

       Kind regards,
        Richard

Konstantin Okonechnikov

unread,
Mar 26, 2016, 8:05:16 AM3/26/16
to qual...@googlegroups.com
Hi Richard,

thanks a lot for your opinion about Qualimap!

Could you perhaps share the  file or a subsample from it along with link to applied annotation, so I can check the problem in detail?

Also, I will investigate the file from previous message (https://groups.google.com/forum/#!topic/qualimap/B6DRfSHARh4), since the the error report there is similar to the one that you posted.

--
  Konstantin

--
You received this message because you are subscribed to the Google Groups "QualiMap" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qualimap+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Richard

unread,
Mar 26, 2016, 11:02:45 PM3/26/16
to QualiMap, k.okone...@gmail.com
   
    Hi  Konstantin,
  
    Thanks so much for your quick reply. My subsample bam file (It got the similar error as I used my bam file) is in https://drive.google.com/file/d/0ByU6HuMxG-0Ib3k4a2RobnRVTEk/view?usp=sharing and the annotation file I used is ftp://ftp.ensembl.org/pub/release-75/gtf/homo_sapiens/.
    The command line I used is  "./qualimap rnaseq --java-mem-size=5G -bam /mydata/alignments_subsample.bam -gtf /mydata/Homo_sapiens.GRCh37.75.final.gtf -outdir /mydata/ -pe".

   --
     Kind regards,
     Richard
 
在 2016年3月26日星期六 UTC+8下午8:05:16,Konstantin Okonechnikov写道:

Konstantin Okonechnikov

unread,
Mar 28, 2016, 11:28:33 AM3/28/16
to qual...@googlegroups.com
Hi Richard,

I investigated the BAM file in detail and figured out the problem: some alignments were not correct according to SAM format specification. Because of this problem Picard library was providing exception. This was carefully solved in BAM QC mode, but not taken into account in  RNA-seq QC.

Here is an example:

"Read name ST-E00161:151:HKTTVCCXX:4:2210:23358:8833, Mapped mate should have mate reference name"

SAM flag of this read (first of pair) is 65, which means that it is in pair and Picard expects the coordinates of the pair according to format standards. Same issue is for the second read in the pair.  What tool was applied for alignment procedure?

This problem was fixed now and novel version with fix is available for download from here:
https://bitbucket.org/kokonech/qualimap/downloads/qualimap-build-28-03-16.tar.gz

However there are some other small issues also. For example, the annotation data applied for analysis is from Ensembl, however the references in alignment data are from other database ( UCSC or Gencode? ).  There are typical differences in databases, well-known problem are chromosome names:
https://www.biostars.org/p/138011/

I converted the chromosome names from Ensembl annotation and run the analysis on the sample BAM applying novel annotation. Results are attached. Also, just in case I attached Python script that was applied to convert chromosome names.

Maybe you can try running analysis on the full dataset and check if it works.

--
  Konstantin





 



 




gtfConvertChrNames.py
alignments_subsample_rnaseq_qc.zip

Richard

unread,
Mar 30, 2016, 11:18:23 PM3/30/16
to QualiMap, k.okone...@gmail.com

    Hi Konstantin,

    Sorry for the late reply. How considerate of you ! Thanks so much for what you have done. The novel version with fix works on my sub sample but there are still some problems on my full dataset.
    I used Mapsplice to align my reads but now I don't think it's a good aligner because its incompatibilty with other downstream analysis softwares, although TCGA uses it for alignment. The reference file is from GATK. BTW, do you think the reference and annotation file from different database will make a difference to the results?

    --
     Kind regards,
     Richard

在 2016年3月28日星期一 UTC+8下午11:28:33,Konstantin Okonechnikov写道:
Hi Richard,

I investigated the BAM file in detail and figured out the problem: some alignments were not correct according to SAM format specification. Because of this problem Picard library was providing exception. This was carefully solved in BAM QC mode, but not taken into account in  RNA-seq QC.

Here is an example:

"Read name ST-E00161:151:HKTTVCCXX:4:2210:23358:8833, Mapped mate should have mate reference name"

SAM flag of this read (first of pair) is 65, which means that it is in pair and Picard expects the coordinates of the pair according to format standards. Same issue is for the second read in the pair.  What tool was applied for alignment procedure?

This problem was fixed now and novel version with fix is available for download from here:
https://bitbucket.org/kokonech/qualimap/downloads/qualimap-build-28-03-16.tar.gz

However there are some other small issues also. For example, the annotation data applied for analysis is from Ensembl, however the references in alignment data are from other database ( UCSC or Gencode? ).  There are typical differences in databases, well-known problem are chromosome names:
https://www.biostars.org/p/138011/

I converted the chromosome names from Ensembl annotation and run the analysis on the sample BAM applying novel annotation. Results are attached. Also, just in case I attached Python script that was applied to convert chromosome names.

Maybe you can try running analysis on the full dataset and check if it works.

--
 c





 



 




Konstantin Okonechnikov

unread,
Apr 1, 2016, 9:28:59 AM4/1/16
to qual...@googlegroups.com
Hi Richard,

in principle there could be differences between references, here's a well-known example:

Usually it's better to apply the same source or check for sure that there are no differences (just comparing the size should be enough).

--
  Konstantin


Richard

unread,
Apr 2, 2016, 2:38:26 AM4/2/16
to QualiMap, k.okone...@gmail.com
   Hi Konstantin,

   I got it. Thanks again!

   Best regards,
   Richard

在 2016年4月1日星期五 UTC+8下午9:28:59,Konstantin Okonechnikov写道:
Reply all
Reply to author
Forward
0 new messages