Using the new IDR pipeline in the ChIP-seq processing pipeline


Collin White

Apr 27, 2015, 2:53:49 PM
to idr-d...@googlegroups.com
Hello all,


I am currently implementing the ENCODE3 ChIP-seq pipeline, and have some questions.

I attached the documentation for implementing said pipeline, and my questions are centered around implementing the new IDR pipeline into this ChIP-seq pipeline.

In the attached documentation, section 4 is for running IDR on all pairs of reps, pseudoreps, etc. I am curious as to how much of the BASH code in this section is replaced by the new IDR pipeline i.e. creating a pooled set of peaks, recalibrating peaks, etc. From what I can tell, the output formats are different between the two versions of IDR, and while the new IDR has the option for outputting the old format, I haven't had any luck getting this to run.

Should I just replace the Rscript call to batch-consistency-analysis?

Thank you for your help.

Collin
ENCODE 3 - ChIPSeq Pipeline - Google Docs.pdf

Nathan Boley

Apr 28, 2015, 1:24:22 AM
to idr-d...@googlegroups.com
Hi Collin,

> In the attached documentation, section 4 is for running IDR on all pairs of
> reps, pseudoreps, etc. I am curious as to how much of the BASH code in this
> section is replaced by the new IDR pipeline i.e. creating a pooled set of
> peaks, recalibrating peaks, etc.

The new idr code should replace most of section 4a. You still need to
run the peak caller on the replicates, merged set, pseudoreplicates,
etc., but now rather than using bedtools to recalibrate peaks you can
just run:

idr --samples ${REP1_PEAK_FILE} ${REP2_PEAK_FILE} \
    --peak-list ${POOLED_COMMON_PEAKS_IDR} --plot

which will output a narrowPeak file, plus some additional fields. I've
tried to clearly document the new output format in the README
(https://github.com/nboley/idr) - please let me know if anything is
unclear.

> From what I can tell, the output formats
> are different between the two versions of IDR, and while the new IDR has the
> option for outputting the old format, I haven't had any luck getting this to
> run.

Thanks for pointing out the problem - I'll push a fix tomorrow.

Best, Nathan

Collin White

Apr 28, 2015, 11:35:21 AM
to idr-d...@googlegroups.com
Nathan,

Thanks for the reply. I'll look into all of that. It also looks like the documentation was just updated a little bit yesterday, so I'll read the new stuff.

With regards to the IDR command you gave me, if the new version of IDR does in fact replace most of 4a, then how would I have the --peak-list argument ${POOLED_COMMON_PEAKS_IDR}, since that file is output in the second to last step of section 4a after being created from several bedtools commands prior? 

Thanks again for the help!

Anshul Kundaje

Apr 28, 2015, 11:48:11 AM
to idr-d...@googlegroups.com

We'll update the pipeline doc by this weekend. That should make it clear.

Anshul.


Nathan Boley

Apr 28, 2015, 12:06:48 PM
to idr-d...@googlegroups.com
> With regards to the IDR command you gave me, if the new version of IDR does
> in fact replace most of 4a, then how would I have the --peak-list argument
> ${POOLED_COMMON_PEAKS_IDR}, since that file is output in the second to last
> step of section 4a after being created from several bedtools commands prior?

Sorry - that was a typo. I should have written:

idr --samples ${REP1_PEAK_FILE} ${REP2_PEAK_FILE} \
    --peak-list ${POOLED_PEAK_FILE} --plot

Best, Nathan
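
Put together, the corrected call for one pair of true replicates can be sketched as below. This is only a sketch under assumptions: the file names are hypothetical placeholders, the helper name `idr_cmd` is made up for illustration, and it prints the command rather than running it so the arguments can be checked before idr is installed.

```shell
# Print the idr call from the thread for one pair of peak files.
# $1, $2 = peak files for the two replicates, $3 = peaks called on the pooled set.
# Assumes the paths contain no whitespace (they are not re-quoted here).
idr_cmd() {
    printf 'idr --samples %s %s --peak-list %s --plot\n' "$1" "$2" "$3"
}

# Hypothetical file names; substitute the peak caller's actual output.
idr_cmd rep1.narrowPeak rep2.narrowPeak pooled.narrowPeak
```

Once the printed line looks right, it can be pasted into the shell or passed to `sh -c` to run idr itself.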

Collin White

Apr 28, 2015, 12:09:17 PM
to idr-d...@googlegroups.com
Thanks. I realized that shortly after I sent the question.

Let me know when you guys get the updated pipeline doc up; I would like to take a look at it.

Thanks for all the help.

Collin White

May 5, 2015, 12:25:37 PM
to idr-d...@googlegroups.com
Any update on the ChIP-seq pipeline documentation?

rspreafico

Jun 13, 2015, 8:57:55 PM
to idr-d...@googlegroups.com
Hi Nathan,

Thanks for updating the IDR code, I look forward to using it. For now, I am trying to understand how to run the provisional version on GitHub. I'd like to use it for current ChIP-seq projects given that the previous IDR code will become obsolete soon, so making the switch now makes sense. For that, I would need to ask you for a little guidance on how the pipeline changes.

I should call peaks from each individual replicate, the single pooled set, and individual pseudoreplicates. Then run IDR in peak-list mode on individual replicates plus the single pooled set as oracle list, using a higher threshold such as 0.05 to retain peaks. But what about pseudoreplicates?

Also, how different would thresholding on global vs. local IDR be?

Thank you for your help,

Roberto
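
For the thresholding question, the mechanics of cutting on a single IDR column can be sketched as follows. This is a rough sketch, not the pipeline's official filter: it assumes the chosen column stores -log10 of the IDR value (which is how the README describes the local and global IDR columns), the helper name `filter_by_idr` is hypothetical, and the column number is left as a parameter because which column holds the local versus the global value should be checked against the README for the idr version in use.

```shell
# Keep rows whose chosen column is >= -log10(threshold).
# $1 = IDR threshold (e.g. 0.05)
# $2 = 1-based column assumed to hold -log10(IDR); check the README for its position
# Reads tab-separated idr output on stdin, writes passing rows to stdout.
filter_by_idr() {
    awk -v thr="$1" -v col="$2" '
        BEGIN { FS = OFS = "\t"; cut = -log(thr) / log(10) }
        $col >= cut
    '
}

# Hypothetical usage (the column number 12 is an assumption):
#   filter_by_idr 0.05 12 < idr_output.narrowPeak > passing_peaks.narrowPeak
```

Running the same filter once on the local column and once on the global column, then comparing the number of rows kept, is one simple way to see how much the two thresholds differ on a given data set.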