ChIP-Seq pipeline in "tf" mode


Asma Riyaz

Oct 14, 2020, 9:36:14 AM

to idr-discuss

Hello,

My goal in running the ChIP-Seq pipeline is to identify the binding regions of BZW1, HA-BZW1 and HA-BZW2 and assign them to the nearest genes. I have 3 biological replicates per sample and 3 input samples (1 each). I would like a ranked list of gene regions per sample that can later be validated, and since I have replicates I hoped to use IDR to obtain reproducible peaks. I ran the ENCODE ChIP-Seq pipeline in "tf" mode with no changes to the defaults.

I have a few questions regarding the output and its interpretation:

1) The peak files obtained are suffixed with "regionPeak" rather than "narrowPeak". Could you help me understand why this is the case? Also, the output of call-reproducibility_idr has 10 columns, not the 9 mentioned in the format description for broad/region peaks. Here is an excerpt from the output.

2) The IDR plots of all 3 samples show very few black dots compared with the tutorials I have found online. Why would this be the case? Is it because the regions found with this protocol are broad rather than narrow? I have attached a plot here for reference.

3) I understand that the conservative peaks come only from the replicates, while the optimal peaks come from both the replicates and the pooled samples. Given that there are very few black dots in my IDR plots, which set should I use for downstream analysis?

I have attached the QC report for one of the samples (BZW1) to this thread in case it is useful.

Thank you,

Asma

idr_plot_example.png

BZW1_qc.html
