deeptools


王欣欣

May 28, 2015, 2:49:57 AM
to deep...@googlegroups.com

Dear editor,

Sorry to interrupt you! I am using deepTools to analyze ChIP-seq data. I have one mark, H3K4me3, at two different stages, and the purpose is to compare this mark between the two stages. After mapping with Bowtie, I don't know how to normalize my data. In your paper, you said to use bamCoverage with RPKM normalization, but the output file is in bedgraph format. Can it be used directly to call peaks with MACS?

Xin

2015.05.28

Fidel Ramirez

May 28, 2015, 3:12:59 AM
to 王欣欣, deep...@googlegroups.com
Dear Xin,

Peak calling, for example using MACS, has to be done directly on the BAM files. With deepTools you can produce normalized counts in either bedgraph or bigwig format, which can subsequently be visualised using a genome browser (e.g. IGV) or the heatmapper tool of deepTools. This will allow you to compare the signal between your two conditions.
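
To make "normalized counts" concrete, here is a minimal sketch of the RPKM calculation per bin: reads in the bin divided by (mapped reads in millions times bin length in kb). The function name and the toy numbers are made up for illustration; this is not deepTools code.

```python
def rpkm_per_bin(bin_counts, bin_size_bp, total_mapped_reads):
    """Return RPKM values for a list of per-bin read counts.

    RPKM = reads in bin / (mapped reads in millions * bin length in kb)
    """
    if bin_size_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("bin size and total mapped reads must be positive")
    bin_kb = bin_size_bp / 1_000
    reads_millions = total_mapped_reads / 1_000_000
    return [count / (bin_kb * reads_millions) for count in bin_counts]


# Hypothetical toy data: the same three bins in two stages, where stage 2
# was sequenced twice as deeply and therefore has twice the raw counts.
stage1 = rpkm_per_bin([20, 40, 10], bin_size_bp=50, total_mapped_reads=10_000_000)
stage2 = rpkm_per_bin([40, 80, 20], bin_size_bp=50, total_mapped_reads=20_000_000)

print(stage1)
print(stage2)
```

After normalization the two toy tracks agree, even though the raw counts differ by a factor of two, which is the point of normalizing before comparing stages.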

Best,

Fidel 

--
You received this message because you are subscribed to the Google Groups "deepTools" group.
To unsubscribe from this group and stop receiving emails from it, send an email to deeptools+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.




王欣欣

May 28, 2015, 3:25:40 AM
to Fidel Ramirez, deep...@googlegroups.com
Sorry, I am a little confused. I have H3K4me3 data from two stages, and my teacher asked me to normalize my data in order to compare these two stages, but I have no idea how to normalize it. Thanks for taking the time to help with my problem!

On 2015-05-28 15:12:59, 王欣欣 wan...@big.ac.cn wrote:

Fidel Ramirez

May 28, 2015, 5:01:07 AM
to 王欣欣, deep...@googlegroups.com
Hi,

As far as I understand, your goal is to compare the H3K4me3 mark between different conditions. There are several approaches for that. One that I can suggest involves using deepTools as follows:

If the ChIP-seq experiment was done properly, you should have a so-called input file for each H3K4me3 sample. Ideally you should also have replicates.

*Quality control*

The first step is to compare your samples to check that everything is in order. For this, use the deepTools tool bamCorrelate with all the BAM files that you have. In the resulting heatmap you should see that replicates are most similar to each other and that the input files cluster together. If you fail to see this, you may have a problem with your samples (sample swaps, for example). You can find more information at https://github.com/fidelram/deepTools/wiki/QC#bamCorrelate
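
As a rough illustration of what this kind of check measures, the sketch below computes pairwise Pearson correlations between per-bin read counts of several samples. The sample names and counts are invented, and the real tool works on BAM files and has its own options; this only shows the idea.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long count vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical read counts in six genomic bins for four samples.
samples = {
    "chip_rep1": [5, 50, 8, 60, 4, 55],
    "chip_rep2": [6, 48, 9, 63, 5, 52],
    "input_rep1": [10, 12, 9, 11, 10, 13],
    "input_rep2": [11, 12, 9, 11, 10, 14],
}

# Print the pairwise correlation matrix, one row per sample.
names = list(samples)
for a in names:
    row = " ".join(f"{pearson(samples[a], samples[b]):6.2f}" for b in names)
    print(f"{a:>10} {row}")
```

In this toy matrix the two ChIP replicates correlate more strongly with each other than with the inputs, which is the pattern to look for. A sample swap would show up as a replicate that correlates best with the wrong group.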

*Computation of normalized bigwig files*

Next you need to compute the log2 ratios of your H3K4me3 mark over input for the different conditions. Alternatively, you can simply use bamCoverage-normalized files, but the ideal case is to use the log2 ratios of ChIP vs. input, as this helps to remove biases present in the data (GC bias, open chromatin bias, CNV, etc.). More information here: https://github.com/fidelram/deepTools/wiki/Normalizations#bamCompare
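
A minimal sketch of a log2 ratio track, to show the arithmetic behind this step. It assumes a simple read-count scaling (the deeper sample is scaled down to the shallower one) and a pseudocount of 1 to avoid division by zero. Both choices, as well as the function name and numbers, are assumptions for illustration; bamCompare has its own scaling options, so check its documentation for what it actually does.

```python
import math


def log2_ratio_track(chip_counts, input_counts, chip_total, input_total,
                     pseudocount=1.0):
    """Per-bin log2(ChIP / input) after scaling both samples to equal depth."""
    if len(chip_counts) != len(input_counts):
        raise ValueError("ChIP and input must cover the same bins")
    # Assumed scaling: bring the deeper sample down to the shallower one.
    smaller = min(chip_total, input_total)
    chip_scale = smaller / chip_total
    input_scale = smaller / input_total
    return [
        math.log2((c * chip_scale + pseudocount) / (i * input_scale + pseudocount))
        for c, i in zip(chip_counts, input_counts)
    ]


# Hypothetical example: the ChIP sample has twice as many reads as the input.
track = log2_ratio_track(
    chip_counts=[62, 6],
    input_counts=[7, 3],
    chip_total=20_000_000,
    input_total=10_000_000,
)
print(track)
```

In the first bin the scaled ChIP signal is four times the input (log2 ratio of 2), while the second bin shows no enrichment (log2 ratio of 0).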

*Visualisation of differences*

Using the bigwig files that you created in the previous step you can now visualize the bigwig signal over regions of interest. In your case, those regions are most likely to be promoter regions where H3K4me3 is usually found. 

For this you will need to use the heatmapper tool. Apart from the bigwig files, you need a list of genes from the species that you are using. With the UCSC Table Browser you can download such a list in BED format.
Next, use the computeMatrix command, which extracts the relevant regions from the bigwig files; the result can then be visualized using the heatmapper command (see https://github.com/fidelram/deepTools/wiki/Visualizations#heatmapper). For this step I recommend deepTools release 1.6, as it allows you to plot multiple bigwig files at once. Using the k-means clustering function it may be possible to identify differentially bound regions. You can output the resulting clusters of regions for manual inspection or further analysis.
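
To give an idea of what such a matrix contains, here is a toy sketch that averages a per-base signal in fixed-size bins around a set of reference points (for example transcription start sites). Each row is one region and each column one bin. The function and its parameters are hypothetical and ignore strand and multiple chromosomes; this is not how computeMatrix is implemented.

```python
def signal_matrix(signal, reference_points, upstream, downstream, bin_size):
    """Average `signal` in bins of `bin_size` around each reference point."""
    window = upstream + downstream
    if bin_size <= 0 or window % bin_size != 0:
        raise ValueError("window length must be a positive multiple of bin_size")
    rows = []
    for point in reference_points:
        start = point - upstream
        end = point + downstream
        if start < 0 or end > len(signal):
            # Toy behaviour: skip regions that run off the end of the signal.
            continue
        row = []
        for bin_start in range(start, end, bin_size):
            values = signal[bin_start:bin_start + bin_size]
            row.append(sum(values) / bin_size)
        rows.append(row)
    return rows


# Hypothetical per-base signal with a small peak at positions 10..13.
signal = [0] * 10 + [4] * 4 + [0] * 10
matrix = signal_matrix(signal, reference_points=[12, 2],
                       upstream=4, downstream=4, bin_size=2)
print(matrix)
```

The second reference point is too close to the start of the toy signal and is skipped, so the matrix has a single row. A heatmap is then just a colour rendering of such rows, and clustering groups rows with similar profiles.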


You should also try to identify peaks that are present in one treatment and not in the other. For this, I recommend searching Biostars for an answer to your question. For example, look at this answer: https://www.biostars.org/p/84249/
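
One naive way to think about "peaks in one treatment and not in the other" is a plain interval comparison of the two peak lists, sketched below. This is only an illustration with made-up coordinates and is not necessarily what the linked answer suggests; it ignores signal strength, so a weak peak just below the calling threshold counts as absent.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def unique_peaks(peaks, other_peaks):
    """Peaks from `peaks` that overlap no peak in `other_peaks`."""
    return [p for p in peaks if not any(overlaps(p, q) for q in other_peaks)]


# Hypothetical peak calls (BED-style coordinates) for the two stages.
stage1_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
stage2_peaks = [("chr1", 150, 250), ("chr2", 300, 400)]

only_stage1 = unique_peaks(stage1_peaks, stage2_peaks)
only_stage2 = unique_peaks(stage2_peaks, stage1_peaks)
print(only_stage1)
print(only_stage2)
```

The quadratic loop is fine for a toy example; for real peak lists a dedicated interval tool would be the better choice.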

Best,

Fidel

tess.k...@gmail.com

May 18, 2017, 7:48:54 AM
to deepTools, wan...@big.ac.cn
Hi Fidel,

Thanks for your clear and extensive answer to Xin's question. I have one follow-up question though.

I have both IP and input samples in replicates. I already checked that the replicates are similar so now I would like to create BigWig files of the IP/input values of both replicates combined.

Is this possible using bamCompare? Or would you recommend first merging the BAM files somehow and feeding the merged files to bamCompare?

Thanks a lot for any help!

Tess

Fidel Ramirez

May 18, 2017, 8:07:48 AM
to tess.k...@gmail.com, deepTools, 王欣欣
Hi Tess,

The cleaner way is to merge the BAM files and then call bamCompare. 

Alternatively, you can add up the normalized replicates (using bigwigCompare), once for the IP files and once for the control (input) files, and then run bigwigCompare again on the two resulting bigWigs to compute the log2 ratios. The number of IP and input replicates has to be the same for this to work. This second method depends on accurate normalization and may introduce some biases, so if you can, start from the merged BAM files.
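
A small numeric sketch of one way the two routes can differ, assuming a simple reads-per-million normalization (an assumption for illustration, not necessarily the normalization you would use). Merging first weights each replicate by its sequencing depth, whereas adding already normalized tracks gives every replicate the same weight. The function names and counts are made up.

```python
def per_million(counts, total_reads):
    """Scale per-bin counts to reads per million mapped reads."""
    millions = total_reads / 1_000_000
    return [c / millions for c in counts]


def merged_then_normalized(rep_counts, rep_totals):
    """Mimic merging the BAM files first: sum raw counts, then normalize once."""
    merged = [sum(column) for column in zip(*rep_counts)]
    return per_million(merged, sum(rep_totals))


def normalized_then_added(rep_counts, rep_totals):
    """Mimic adding bigWigs: normalize each replicate, then sum the tracks."""
    normalized = [per_million(c, t) for c, t in zip(rep_counts, rep_totals)]
    return [sum(column) for column in zip(*normalized)]


# Hypothetical replicates with very different depths (1M vs 3M reads).
rep_counts = [[10, 40], [60, 60]]
rep_totals = [1_000_000, 3_000_000]

merged = merged_then_normalized(rep_counts, rep_totals)
added = normalized_then_added(rep_counts, rep_totals)
print(merged)
print(added)
```

In this toy case the ratio between the two bins is about 1.4 for the merged route and 2.0 for the added route, because the shallow replicate has more influence when the normalized tracks are summed.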

-fidel


