Re: Read extension and normalisation in bamCoverage for ChIP-seq


Devon Ryan

Feb 17, 2018, 3:07:21 AM
to andrew.p...@gmail.com, deepTools
Hi Andrew,

If your fragments were originally 200-300 bases, then extending to 250
bases would seem reasonable. You can also decrease the bin size (-bs) to
10 (or lower) to further reduce the blockiness a bit. There's a trade-off
with the bin size, where smaller values take (A) longer to compute
and (B) more space on disk, which is why the default is 50.
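
As a rough sketch of the settings above (not a tested pipeline), the loop below prints one bamCoverage command per BAM file so it can be checked before running. The blacklist path and the output naming are placeholders I made up, and 250 is just the midpoint of the 200-300 base fragment range; the remaining options are copied from the command in the original question.

```shell
# Sketch only: prints the commands instead of running them.
EXTEND=250                  # assumed mean fragment length (200-300 bp fragments)
BIN=10                      # smaller bins: smoother track, slower, larger files
BLACKLIST="blacklist.bed"   # hypothetical path, replace with your own

coverage_cmd() {
    bam="$1"
    out="${bam%.sorted.bam}.bw"   # sample.sorted.bam -> sample.bw
    printf 'bamCoverage --bam %s -o %s --binSize %s --extendReads %s --normalizeUsing RPGC --effectiveGenomeSize 2652783500 --blackListFileName %s\n' \
        "$bam" "$out" "$BIN" "$EXTEND" "$BLACKLIST"
}

for bam in *.sorted.bam; do
    [ -e "$bam" ] || continue   # skip if the glob matched nothing
    coverage_cmd "$bam"
done
```

Once the printed commands look right, they can be piped to `sh` (this assumes file names without spaces).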

The scale should not be the same for every sample, since they were all
sequenced to different depths. One thing to check is that the blacklist
file is doing its job properly. A convenient way to do this is just to
load the bigwig files in IGV and look at a chromosome at a time. If you
see a huge peak anywhere then you know there's a site that's missing (or
needs to be expanded) in your blacklist file. In my experience, this is
the most common cause of samples looking a bit too different in IGV.
Presuming you have input samples, you may find SES normalization in
bamCompare useful.
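
A minimal sketch of what an SES-normalised comparison could look like is below. The file names are hypothetical, and the bin size and read extension are simply carried over from the bamCoverage discussion rather than being recommendations specific to bamCompare.

```shell
# Sketch only: builds a bamCompare command for one ChIP/input pair.
# All file names passed in are hypothetical placeholders.
ses_cmd() {
    chip="$1"    # ChIP BAM (sorted and indexed)
    input="$2"   # matching input BAM
    out="$3"     # output bigWig
    printf 'bamCompare -b1 %s -b2 %s --scaleFactorsMethod SES --binSize 10 --extendReads 250 -o %s\n' \
        "$chip" "$input" "$out"
}

ses_cmd chip.sorted.bam input.sorted.bam chip_vs_input.bw
```

The resulting track is a ChIP-versus-input comparison rather than plain coverage, so it is read differently from a bamCoverage bigwig in IGV.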

BTW, you'd actually be creating a bias if you enforced that each sample
have the same scaling factor, since then samples with higher sequencing
depth would always have higher signal.
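
To make the arithmetic concrete, here is a small sketch with made-up read counts. It assumes, as I understand the deepTools documentation, that the 1x (RPGC) scale factor is roughly the effective genome size divided by (mapped reads x fragment length); the real value will differ somewhat because bamCoverage also filters reads (blacklist and so on).

```shell
# Sketch only: read counts below are hypothetical, not from any real sample.
GENOME=2652783500   # effective genome size used in the original command
FRAG=250            # assumed fragment length after --extendReads

# Approximate 1x scale factor: genome / (reads * fragment length)
rpgc_scale() {
    awk -v reads="$1" -v frag="$2" -v genome="$GENOME" \
        'BEGIN { printf "%.4f\n", genome / (reads * frag) }'
}

# Mean coverage a sample ends up with if a given scale factor is applied
mean_cov() {
    awk -v reads="$1" -v frag="$2" -v genome="$GENOME" -v scale="$3" \
        'BEGIN { printf "%.1f\n", reads * frag / genome * scale }'
}

A=20000000   # hypothetical sample A: 20M mapped reads
B=40000000   # hypothetical sample B: 40M mapped reads

echo "scale A: $(rpgc_scale "$A" "$FRAG")"
echo "scale B: $(rpgc_scale "$B" "$FRAG")"
echo "B forced to A's scale, mean coverage: $(mean_cov "$B" "$FRAG" "$(rpgc_scale "$A" "$FRAG")")"
```

With each sample's own factor the mean coverage comes out at about 1x for both; forcing sample B to use sample A's factor leaves it at about twice the coverage, purely because it was sequenced deeper.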

Devon


On 02/16/2018 02:04 AM, andrew.p...@gmail.com wrote:
> Dear Devon,
>
> First off, thank you for creating this great suite of tools, and sorry if this is a silly question. I am new to ChIP-seq tools and analysis.
>
> I have a ChIP-seq timecourse and genotype based experiment with 3x histone modification marks in mouse (n=5 per genotype per time-point). I have aligned the reads to mm10, and would like to use bamCoverage to create bigwigs for visualisation prior to peak-calling.
>
> Details of the experiment are as follows:
> 50bp single end reads, initial fragment size was between 200-300bp long
>
> Samples mapped to mm10 with bowtie2 with default parameters.
>
> I am not sure on what I should use for a few of your options:
>
> My current plan is to use the following:
>
> ```
> #deepTools create bigwig - needs to be indexed first!!!!
> for file in *.sorted.bam ; do
> bamCoverage --bam ${file} -o <path/to/output/${file}.bw> --binSize 10 --normalizeUsing RPGC --effectiveGenomeSize 2652783500 --numberOfProcessors 60 --extendReads <?> --blackListFileName <path/to/blacklist>
> done
> ```
>
> I noticed when using IGV after creating the bigwigs that the output is quite blocky, and I would like to smooth it a bit if I can - I think I can do this with extendReads, but I am not sure what an appropriate value would be to place in this?
>
> Also, when comparing between samples of the same histone modification I noticed the scale is different for each, and I assume this is because I am normalising to RPGC for 1x coverage. If I want to compare between samples, should the scale be the same for all of the samples - is it okay to rescale in IGV to keep all samples the same? Am I creating a bias by having each sample scaled differently?
>
> Sorry again and thanks for any help/suggestions.
>
> Kind regards,
> Andrew
>

--
Devon Ryan, PhD
Bioinformatician / Data manager
Bioinformatics Core Facility
Max Planck Institute for Immunobiology and Epigenetics
Email: dpry...@gmail.com
