Not seeing Insert Size Histogram

Mike Greenwood

May 6, 2021, 4:13:24 PM
to QualiMap
The other functions seem to be working, but I don't see the insert size histogram. I'm mainly interested in the mean/median insert size per reference.
Capture.JPG

Konstantin Okonechnikov

May 10, 2021, 10:09:10 AM
to qual...@googlegroups.com
Hi!

Does the BAM QC report contain insert size values in the Summary, or is only the figure unavailable? Typically, insert size statistics are computed only if the reads in the input BAM alignment file are paired-end.

Best regards,
   Konstantin

--
You received this message because you are subscribed to the Google Groups "QualiMap" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qualimap+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/qualimap/42e02486-2d09-4c40-b8ab-10a4c3ecefa6n%40googlegroups.com.

Mike

May 10, 2021, 12:15:59 PM
to qual...@googlegroups.com
Hello! Thank you for your reply. The Summary doesn't appear to contain insert size values, but the reads are paired.

Summary


Globals
Reference size 147,124
Number of reads 704,888
Mapped reads 486,004 / 68.95%
Unmapped reads 218,884 / 31.05%
Mapped paired reads 486,004 / 68.95%
Mapped reads, first in pair 243,009 / 34.47%
Mapped reads, second in pair 242,995 / 34.47%
Mapped reads, both in pair 485,990 / 68.95%
Mapped reads, singletons 14 / 0%
Read min/max/mean length 131 / 151 / 135.1
Duplicated reads (flagged) 105,247 / 14.93%
Clipped reads 54,872 / 7.78%
ACGT Content
Number/percentage of A's 15,920,580 / 25.17%
Number/percentage of C's 14,334,483 / 22.66%
Number/percentage of T's 17,456,214 / 27.59%
Number/percentage of G's 15,252,958 / 24.11%
Number/percentage of N's 6,235,280 / 9.86%
GC Percentage 46.77%
Coverage
Mean 472.5622
Standard Deviation 489.1574
Mapping Quality
Mean Mapping Quality 0
Mismatches and indels
Insertions 3,104
Mapped reads with at least one insertion 0.5%
Deletions 11,941
Mapped reads with at least one deletion 1.93%
Homopolymer indels 36.44%
Chromosome stats
Name               Length  Mapped bases  Mean coverage  Standard deviation
A*02:01:01:01      3875    2449874       632.2255       410.0818
A*02:614           3875    2264630       584.4206       399.4563
A*68:01:02:01      3875    2441390       630.0361       423.2038
A*68:164           3875    2457665       634.2361       418.559
B*35:01:01:05      4257    2005682       471.1492       209.1831
B*52:01:02:01      4257    2069348       486.1048       211.7716
C*03:03:01:01      4541    2186977       481.6069       228.3237
C*04:01:01:11      4541    2421835       533.3264       256.1043
DPA1*01:03:01:05   9834    3517406       357.6781       403.073
DPA1*01:03:01:05X  9834    3483382       354.2182       399.2769
DPB1*04:02:01:02   11615   7757044       667.8471       741.2827
DQA1*03:01:01:01   6767    4281604       632.7182       357.1965
DQA1*04:01:01:07   6767    3960401       585.2521       323.2718
DQB1*03:02:01:01   7781    5680592       730.0594       556.4607
DQB1*04:02:01:04   7781    5511543       708.3335       542.3835
DRB1*04:11:01      18509   3296625       178.1093       264.6255
DRB1*08:02:01:01   18509   5854153       316.2868       450.6595
DRB4*01:03:01:01   16631   7885090       474.12         639.4515


Konstantin Okonechnikov

May 11, 2021, 6:18:10 AM
to qual...@googlegroups.com
Hi!

Since the insert size is not in the statistics, either it is not present in the read alignment records (which seems unlikely, since both first and second reads in pairs were found; I'm not sure why that would happen), or all of the insert size values appear to be less than zero, e.g. because the reads within each pair overlap.

I would be curious to check the BAM file or a subsample of it, e.g. either reads from one chromosome only or a small random subsample ( http://okko73313.blogspot.com/2013/03/random-subsample-from-bam-file.html ).
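
For readers who want to prepare such a subsample, here is a minimal Python sketch. It assumes the alignments have already been converted to SAM text (for example with samtools view -h), and the function name subsample_pairs is only illustrative, not part of Qualimap or samtools. It makes one keep/drop decision per read name so that both mates of a pair stay together. samtools view -s offers a comparable fractional subsample directly on BAM files.

```python
import random


def subsample_pairs(sam_lines, fraction, seed=0):
    """Keep all header lines and roughly `fraction` of the read names.

    Hypothetical helper: the decision is made once per QNAME (column 1),
    so both mates of a pair are either kept or dropped together.
    """
    rng = random.Random(seed)
    decisions = {}  # QNAME -> True (keep) or False (drop)
    kept = []
    for line in sam_lines:
        if line.startswith("@"):  # header lines are always kept
            kept.append(line)
            continue
        qname = line.split("\t", 1)[0]
        if qname not in decisions:
            decisions[qname] = rng.random() < fraction
        if decisions[qname]:
            kept.append(line)
    return kept
```

Because the decision is per read name rather than per record, the number of records kept is only approximately `fraction` of the input.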

Best regards,
   Konstantin 

Mike

May 11, 2021, 12:11:25 PM
to qual...@googlegroups.com
Grabbed 10 pairs.  The bottom pair has an overlap of 4 and the second to last has a gap of 56.

42-25subsample.txt

Konstantin Okonechnikov

May 12, 2021, 4:44:18 AM
to qual...@googlegroups.com
For some reason the insert size is zero in all of the read alignments. The insert size is simply column 9 (TLEN) in the SAM format; just in case, here is an example overview: https://accio.github.io/bioinformatics/2020/03/10/filter-bam-by-insert-size.html
Did you compute your values (-4, 56) manually? Which tool was used for the alignment? The insert sizes are expected to be already provided in the output BAM/SAM file.
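
To check column 9 quickly across a whole file, a small Python sketch like the one below may help. It assumes SAM text input (tab-separated, header lines starting with "@"), and tlen_by_reference is a hypothetical name, not a Qualimap function. It counts records whose TLEN is zero and, for records with a positive TLEN (one mate per pair), reports the count, mean, and median per reference. This is not necessarily how Qualimap computes its own insert size statistics.

```python
from collections import defaultdict
from statistics import mean, median


def tlen_by_reference(sam_lines):
    """Summarise SAM column 9 (TLEN) per reference name (column 3).

    Returns (stats, zero_count), where stats maps each reference to
    (number of positive TLEN values, mean, median) and zero_count is
    the number of records with TLEN == 0.
    """
    sizes = defaultdict(list)
    zero_count = 0
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        rname, tlen = fields[2], int(fields[8])
        if tlen == 0:
            zero_count += 1
        elif tlen > 0:  # count each pair once, via the mate with positive TLEN
            sizes[rname].append(tlen)
    stats = {ref: (len(v), mean(v), median(v)) for ref, v in sizes.items()}
    return stats, zero_count
```

If zero_count equals the number of records, as in the subsample above, the file carries no usable insert size information in column 9.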

Best regards,
   Konstantin

Mike

May 12, 2021, 3:51:46 PM
to qual...@googlegroups.com
I'm working on HLA typing, and the kit we use comes with software for analyzing FASTQ data. The software outputs the BAM file and also includes a tool for viewing the alignments; that tool is where I can see the gap/overlap. Opening each sample to review the read lengths is quite slow, though, so maybe I can ask the developers to write an informative TLEN value. Thanks for your help.
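
Until TLEN is filled in, the gap/overlap and an approximate template length can be derived from the positions and CIGAR strings of the two mates. The Python sketch below is one way to do that, under these assumptions: both mates map to the same reference, POS is 1-based as in SAM, and the template length is taken as the distance from the leftmost to the rightmost mapped base. The helper names ref_span and pair_geometry are made up for illustration.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def ref_span(pos, cigar):
    """Return (start, end) of an alignment on the reference, 1-based inclusive."""
    length = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in REF_CONSUMING)
    return pos, pos + length - 1


def pair_geometry(pos1, cigar1, pos2, cigar2):
    """Return (template_length, gap) for two mates on the same reference.

    gap > 0 is the number of unsequenced bases between the mates;
    gap < 0 means the mates overlap by that many bases.
    """
    (s1, e1), (s2, e2) = sorted([ref_span(pos1, cigar1), ref_span(pos2, cigar2)])
    template_length = max(e1, e2) - s1 + 1
    gap = s2 - e1 - 1
    return template_length, gap


# Two 100 bp mates starting at 100 and 256 leave a gap of 56 bases.
print(pair_geometry(100, "100M", 256, "100M"))
```

Grouping the resulting template lengths by reference name would give the per-reference mean/median without opening each sample in a viewer.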

Mike
