bamCoverage extendRead option


mmmmc...@gmail.com

Dec 6, 2017, 10:39:28 AM
to deepTools
Hi all-
I have paired-end sequencing data, and I would like to generate bigwig files showing coverage (i.e. exact number of reads) at a given genomic location. I first generated these files using default settings:
bamCoverage -b file.bam -o out_file -of bigwig -bs 1
However, when I loaded the resulting bigwig files in IGV and compared them to the BAM files they were generated from, I noticed that the coverage considered ONLY the reads themselves and NOT the insert between mate pairs. I would like the insert between mate pairs to count toward coverage as well.
So I made new bigwig coverage files using the --extendReads (-e) option:
bamCoverage -b file.bam -o out_file -of bigwig -bs 1 -e
Now, when I look at the BAM file and the new bigwig file in IGV, the coverage values differ from those in the bigwigs produced with default settings. However, the new values appear to be inflated. In some cases they are double what the actual value should be.
My assumption was that the coverage file would now simply count an insert as an additional read. For example, if a single base pair were covered by reads from two different pairs and by the insert of a third pair, I would have expected a coverage value of 2 in the bigwig generated with default settings and a value of 3 in the new --extendReads bigwig. Am I misinterpreting the option? I can easily send screenshots, as well as the reports that were generated while making the files.

Fidel Ramirez

Dec 6, 2017, 4:27:10 PM
to mmmmc...@gmail.com, deepTools
It is very common to have singletons together with paired-end reads. Thus, to avoid any bias, deepTools extends each of the mates to match the insert size, and for singletons it uses the average fragment length computed from the data. Because both mates are extended, every properly paired fragment is counted twice.

If your BAM files contain only properly paired reads, then you can extend the reads and scale the coverage by 0.5 (for example with --scaleFactor 0.5).

-fidel
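
The doubling described above can be reproduced with a small toy model. This is only a sketch of the arithmetic, not deepTools' implementation: the coordinates, the FRAGMENTS list, and the helper names depth and span are all hypothetical. It uses four fragments at one position, three of which have a mate covering the position and one of which covers it only with its insert.

```python
# Toy model (hypothetical data and helpers, not deepTools code).
# Each fragment is (mate1, mate2); each mate is a half-open (start, end) interval.
FRAGMENTS = [
    ((90, 101), (150, 160)),  # mate 1 covers position 100
    ((95, 105), (160, 170)),  # mate 1 covers position 100
    ((40, 50), (98, 108)),    # mate 2 covers position 100
    ((60, 70), (130, 140)),   # only the insert spans position 100
]
POS = 100


def depth(intervals, pos, scale=1.0):
    """Count the intervals covering pos, multiplied by a scale factor."""
    return scale * sum(start <= pos < end for start, end in intervals)


def span(fragment):
    """Return the whole fragment: start of mate 1 to end of mate 2."""
    (start1, _), (_, end2) = fragment
    return (start1, end2)


# Default behaviour: only the sequenced mates count.
mates_only = [mate for frag in FRAGMENTS for mate in frag]

# Extension: each of the two mates is stretched to the full fragment,
# so every fragment contributes two identical intervals.
both_extended = [span(frag) for frag in FRAGMENTS for _ in frag]

default_depth = depth(mates_only, POS)                 # 3.0
extended_depth = depth(both_extended, POS)             # 8.0
scaled_depth = depth(both_extended, POS, scale=0.5)    # 4.0

print(default_depth, extended_depth, scaled_depth)
```

Under these assumptions the model gives 3 with reads only, 8 with both mates extended, and 4 after scaling by 0.5. The scaling is only exact when every read belongs to a proper pair; singletons are extended once, so halving would undercount them.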


mmmmc...@gmail.com

Dec 6, 2017, 9:30:07 PM
to deepTools
To be a bit more clear, let's say that I have 3 overlapping mate pairs, as well as a single insert between mate pairs, all of which overlap at a single base pair. I would have expected that with default settings, the coverage value would be 3. I would then expect that the coverage value with extendRead would be 4. However, the extendRead option seems to be at least doubling the number of reads at a given genomic location (which seems to give me roughly a value of 8 in the above scenario). Is there an option that will generate coverage in a 1:1 manner for ONLY paired-end reads and inserts? So that in the above scenario, I would receive a value of 4?

Steffen Heyne

Dec 7, 2017, 12:28:28 AM
to mmmmc...@gmail.com, deepTools
bamCoverage --extendReads --samFlagInclude 66 ... would use only the first read of each properly paired read pair, so each fragment is extended and counted once. This should give you a coverage of 4 at your location.

Steffen
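
For readers wondering where 66 comes from: it is the sum of two SAM flag bits, 2 (read mapped in a proper pair) and 64 (first read in the pair). The sketch below decodes it and applies an "all bits must be set" check to a few common flag values. The function name passes_include and the example flag list are hypothetical, and the check is an assumption about how an include filter of this kind behaves, not a copy of the deepTools code.

```python
# SAM flag bits as defined in the SAM specification.
PROPER_PAIR = 0x2     # read mapped in a proper pair
FIRST_IN_PAIR = 0x40  # first read of the pair

INCLUDE = PROPER_PAIR | FIRST_IN_PAIR  # 2 + 64 = 66


def passes_include(flag, include=INCLUDE):
    """Return True if every bit in `include` is set in `flag` (assumed filter semantics)."""
    return (flag & include) == include


# Hypothetical example flags: 99/147 and 83/163 are the usual values for the
# two mates of a properly paired fragment; 65 is a first mate that is paired
# but not flagged as a proper pair.
example_flags = [99, 147, 83, 163, 65]
kept = [flag for flag in example_flags if passes_include(flag)]

print(INCLUDE, kept)
```

Only one mate per proper pair survives the filter (99 and 83 here), which is why each fragment is extended once instead of twice.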