Hi,
I'd like to run the peak files from my recent experiment through the ENCODE IDR pipeline to generate a set of high-confidence peaks. To be transparent, I haven't used IDR before.
The pipeline requires peaks in narrowPeak (BED 6+4) or broadPeak (BED 6+3) format.
For example, the three extra broadPeak columns are defined as:
7. signalValue - Measurement of overall (usually, average) enrichment for the
region.
8. pValue - Measurement of statistical significance (-log10). Use -1 if no pValue is
assigned.
9. qValue - Measurement of statistical significance using false discovery rate
(-log10). Use -1 if no qValue is assigned.
One line of a CTK peak.sig.bed contains the following:
chr1 865532 865548 Peak_1[gene=148398][PH=10][PH0=0.07][P=1.00e-100] 10 +
What is the most appropriate way to convert to broadpeak?
Perhaps signal value ought to be either:
1. CTK PH / PH0
2. Some normalization with the size-matched input could also be performed.
3. CTK PH raw
My inclination would be to try PH/PH0, but I have limited experience with this and am not confident in that choice.
Also, would the CTK peak p-value be appropriate here to use in col. 8?
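To make the question concrete, here is a minimal Python sketch of the conversion I have in mind. It is only an illustration under several assumptions of mine, not an established method: the function name ctk_to_broadpeak is hypothetical, signalValue is taken as PH/PH0 (option 1), the CTK P value is converted to -log10 for column 8, qValue is set to -1 because CTK does not report one in this line, and the bracketed tags are stripped from the peak name.

```python
import math
import re

# Matches the [key=value] tags embedded in the CTK peak name (column 4).
TAG_RE = re.compile(r"\[(\w+)=([^\]]+)\]")


def ctk_to_broadpeak(line):
    """Convert one CTK peak.sig.bed line to a broadPeak (BED 6+3) line.

    Hypothetical helper: the choice of PH/PH0 as signalValue is an
    assumption, not something prescribed by CTK or IDR.
    """
    chrom, start, end, name, score, strand = line.split()[:6]
    tags = dict(TAG_RE.findall(name))
    ph = float(tags["PH"])
    ph0 = float(tags["PH0"])
    p = float(tags["P"])

    # Assumption: fall back to raw PH if the expected height PH0 is zero.
    signal = ph / ph0 if ph0 > 0 else ph

    # broadPeak stores -log10(p); floor p to avoid log10(0) (assumed cap).
    neg_log10_p = -math.log10(max(p, 1e-300))

    # Assumption: keep only the peak ID and cap the BED score at 1000.
    peak_id = name.split("[", 1)[0]
    bed_score = min(int(float(score)), 1000)

    fields = [
        chrom, start, end, peak_id, str(bed_score), strand,
        f"{signal:.4f}",       # column 7: signalValue
        f"{neg_log10_p:.4f}",  # column 8: pValue (-log10)
        "-1",                  # column 9: qValue not assigned
    ]
    return "\t".join(fields)


example = ("chr1 865532 865548 "
           "Peak_1[gene=148398][PH=10][PH0=0.07][P=1.00e-100] 10 +")
print(ctk_to_broadpeak(example))
```

Swapping the signal line for raw PH (option 3) would be a one-line change, which is partly why I am asking which choice is most defensible.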
From there, I assume that which column (signal value or p-value) to use for IDR ranking is a matter of informed preference.
I welcome any insight, and thank you for your time.
Kind regards,
Steve