Re: [macs-announcement] q-values in MACS2


Tao Liu

Jul 6, 2012, 5:59:11 PM
to macs-ann...@googlegroups.com
Hi Friederike,

On Jul 3, 2012, at 10:16 AM, Friederike wrote:

> I have a quick question regarding the p- and q-values in MACSv2.
> In MACSv1.4 the p-value was output by the programme as -10log(p-value) and the FDR in %. Do you still multiply the negative log(p-value) by ten in MACS2?

No. MACS2 reports -log10(pvalue) and -log10(qvalue) directly, without multiplying by ten.
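
To compare scores across the two versions, the only difference is the factor of ten. A minimal sketch of the conversion follows; it is not taken from the MACS source, and the function names are made up for illustration.

```python
import math

def macs14_score(pvalue):
    """MACS1.4-style score: -10 * log10(p)."""
    return -10.0 * math.log10(pvalue)

def macs2_score(pvalue):
    """MACS2-style score: -log10(p), with no factor of ten."""
    return -math.log10(pvalue)

def macs14_to_macs2(score14):
    """Rescale a MACS1.4-style score to the MACS2 convention."""
    return score14 / 10.0

# A p-value of 1e-5 scores about 50 in the old convention
# and about 5 in the new one.
old = macs14_score(1e-5)
new = macs2_score(1e-5)
print(old, new, macs14_to_macs2(old))
```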

> I have major difficulties in reproducing a similar cut-off in MACSv1.4 and MACSv2 based on the FDR. I used to take FDR 5% as a cut-off for the peaks I considered for downstream analysis, but with the newer MACS version I get many more peaks (in the exact same data set) even though the default cut-off is q-value = 0.05. I should add that the vast majority of the peaks that were identified by MACSv2 but not by MACSv1.4 have very small fold enrichment values compared to those peaks that were identified by both MACSv1.4 and MACSv2.
> Did you change the way of calculating the q-value? Why were those "tiny" peaks not picked up by MACSv1.4 but suddenly appear quite statistically significant in MACSv2?

Yes. They are different.

In MACSv1.4, the FDR is calculated by swapping treatment and control. MACSv1.4 assumes that all peaks called above a given cutoff in the swapped comparison are false positives, so it can calculate an empirical FDR for each pvalue cutoff. However, this method is strongly influenced by unbalanced sequencing depth. For example, if the control sample is much larger than the treatment sample, the FDR would be overestimated, so you would have fewer 'good' peaks passing the cutoff.
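
The swap idea can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not the MACSv1.4 code: the function name and the example scores are hypothetical, and the real program reports the value as a percentage.

```python
def empirical_fdr(treatment_scores, swapped_scores, cutoff):
    """Estimate the FDR at `cutoff` from a sample swap.

    treatment_scores: scores of peaks called as treatment vs. control
    swapped_scores:   scores of peaks called as control vs. treatment,
                      all assumed to be false positives
    """
    n_pos = sum(1 for s in treatment_scores if s >= cutoff)
    n_neg = sum(1 for s in swapped_scores if s >= cutoff)
    if n_pos == 0:
        return 0.0
    # Cap at 1.0: the swapped call can yield more peaks than the real one.
    return min(1.0, n_neg / n_pos)

treatment = [80, 70, 60, 55, 50]

# Balanced depth: few swapped peaks pass the cutoff.
print(empirical_fdr(treatment, [52, 30], cutoff=50))      # 0.2

# Much deeper control: more swapped peaks pass, so the FDR goes up.
print(empirical_fdr(treatment, [75, 60, 52], cutoff=50))  # 0.6
```

The second call shows the depth problem: the treatment peaks are unchanged, yet the estimated FDR triples because the swapped comparison produces more peaks.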

In MACSv2, I use another approach. First, a pvalue is calculated at every basepair in the genome. Then I apply the Benjamini-Hochberg procedure to correct for multiple comparisons, converting each pvalue into a qvalue, that is, the minimum FDR at which the peak is called significant. This method is more robust in my experience.
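
For reference, the textbook Benjamini-Hochberg adjustment looks like the sketch below. It only illustrates the procedure on a small list of p-values; the MACS2 implementation works on per-basepair values genome-wide and may differ in its details, and the function name is my own.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q-values in the same order as `pvalues`."""
    n = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, so each q-value is the minimum
    # of p * n / rank over this rank and all higher ranks.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        qvalues[i] = running_min
    return qvalues

pvals = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(pvals, benjamini_hochberg(pvals)):
    print(f"p = {p:.3f}  q = {q:.3f}")
```

Each q-value is at least as large as its p-value, and the ordering of the p-values is preserved in the q-values.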

Best,

Tao Liu

Research Fellow
Dept of Biostats and Comp Bio, DFCI / HSPH
450 Brookline Ave., Boston, MA 02215
(O) 617-582-7769




Tao Liu

Jul 13, 2012, 4:20:39 AM
to macs-ann...@googlegroups.com
Hi Friederike,

On Jul 12, 2012, at 6:59 AM, Friederike wrote:

> thanks for the helpful insights!

You are welcome!

> I had always assumed that MACSv1 corrected for imbalanced sequencing depths before determining the (negative) peak regions. So that's been a false notion of mine?

You are right: MACS always corrects for sequencing depth. Even so, unbalanced treatment and control samples will strongly affect the MACSv1 FDR.

> Have you changed anything in the way you correct for different read numbers in MACS2?

No. It's the same: linear scaling. The difference is in how the FDR is calculated in MACS2. The method is now based on correcting pvalues for multiple comparisons, as I explained in my previous email.

Lana Schaffer

Jul 16, 2012, 2:57:44 PM
to macs-ann...@googlegroups.com
Tao,

Using macs14 I found 440 peaks, all with an FDR of zero (setting: --pvalue=1e-5).
But using macs2 I found only 22 peaks, with FDR values from 3 to 128 (setting: -q 0.01).

So I am using very different cutoff criteria.
Are you recommending -q 0.01 for macs2?
And not -p 1e-5 (a typo in the README file?)?

Lana
--
You received this message because you are subscribed to the Google Groups "MACS announcement" group.
To post to this group, send email to macs-ann...@googlegroups.com.
To unsubscribe from this group, send email to macs-announcem...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/macs-announcement?hl=en.

Tao Liu

Jul 18, 2012, 3:24:10 PM
to macs-ann...@googlegroups.com
Lana,

The FDR in macs14 may be influenced by sequencing depth. In the MACS2 output, the FDR column holds -log10(qvalue), so a value of 3 means an FDR of 0.001, which is already good.

The recommended cutoff in MACS2 is q = 0.01. I am updating the README file; -p 1e-5 comes from the old documentation.
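
If you want to check the -log10 column against a q-value cutoff by hand, the arithmetic is a one-liner. The helper names below are hypothetical and only illustrate the conversion; they are not part of MACS2.

```python
def qvalue_from_column(neg_log10_q):
    """Convert a -log10(qvalue) column entry back to a q-value."""
    return 10.0 ** (-neg_log10_q)

def passes_cutoff(neg_log10_q, q_cutoff=0.01):
    """True if the peak's q-value is at or below the cutoff."""
    return qvalue_from_column(neg_log10_q) <= q_cutoff

# Column values from 3 to 128 all correspond to q-values of
# about 0.001 or smaller, so they all pass a cutoff of 0.01.
for value in (3, 128):
    print(value, qvalue_from_column(value), passes_cutoff(value))
```

A larger column value therefore means a more significant peak, the opposite direction from an FDR given in percent.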

Best,
Tao

Lana Schaffer

Jul 19, 2012, 2:11:03 PM
to macs-ann...@googlegroups.com
Tao,
In MACS2 q=.05 works, too, and gives a few more peaks.

guisong wang

Jul 20, 2012, 4:27:39 PM
to macs-ann...@googlegroups.com
Hello Tao,

I would like to make sure my understanding of the FDR (%) in the result is
correct. I have 100% in the FDR column. Does that mean the q-value = 0.01?

Thanks!

Guisong


Tao Liu

Jul 24, 2012, 11:24:41 AM
to macs-ann...@googlegroups.com
Hi Guisong,

The q-value is analogous to the FDR: http://en.wikipedia.org/wiki/False_discovery_rate

Please also pay attention to the column names in the MACS1 or MACS2 output. The column is either a -log10(qvalue) column or an FDR (percentage) column.

Also, MACS1 and MACS2 use different methods to calculate the FDR, as I have explained in previous emails.

Best,
Tao