Macs2 behavior for enriched input regions

460 views
Skip to first unread message

Fidel Ramirez

unread,
Apr 11, 2013, 7:30:47 AM4/11/13
to macs-ann...@googlegroups.com
Hi everyone

we are doing some tests comparing the behavior of macs in the case of sub-sampling. We are dealing with broad peaks from epigenetic marcs using Roadmap data.

We will like to hear your opinion in the following case in which the input appears slightly enriched in a satellite region and no peaks are called even though the ChIP sample appears largely enriched. See blue tracks of  attached image.

Even when considering a local lambda for those regions we think that they should be called as peaks. We wonder if MACS2 is just disregarding such regions of high input coverage and excluding them for further analysis? Any explanation for this result?

Interestingly, in the case of subsampling to 10% the original number of mapped reads macs now reports some peaks in the input enriched regions. See gray tracks in attached image. All tracks are normalized to RPKMs.

We are mapping multi-reads randomly, that is why there is an enrichment of the input in the satellites.

Thanks,
--

Fidel Ramirez

chr10_majsat.png

Tao Liu

unread,
Apr 11, 2013, 10:59:53 AM4/11/13
to macs-ann...@googlegroups.com
Hi Fidel,

MACS2 peaks look very strange to me.  How did you generate the H3K9me3 and control tracks for display? Are they from MACS2 -B option? I have a feeling that MACS2 read your alignment in an unexpected way.

MACS2 calls peaks based on q-value cutoff by default. So, if you want to understand its behavior in your case, you can use 'macs2 bdgcmp -m qpois'  on the bed graph files generated from 'macs2 call peak -B' to generate a q-value track (you can refer to https://github.com/taoliu/MACS/wiki/Build-Signal-Track ). Then load them all together.  Looking forward to your feedback!

Best,
Tao Liu

--
Assistant Professor
Department of Biochemistry
University at Buffalo
NY State Center of Excellence in Bioinformatics & Life Sciences

B2-163 COEBLS
(O) 716-829-2749
tl...@buffalo.edu

Mailing address:
University at Buffalo-COEBLS
701 Ellicott St, B2-163
Buffalo, NY 14203-1221

--
You received this message because you are subscribed to the Google Groups "MACS announcement" group.
To unsubscribe from this group and stop receiving emails from it, send an email to macs-announcem...@googlegroups.com.
To post to this group, send email to macs-ann...@googlegroups.com.
Visit this group at http://groups.google.com/group/macs-announcement?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 
<chr10_majsat.png>

Tao Liu

unread,
Apr 11, 2013, 11:03:28 AM4/11/13
to macs-ann...@googlegroups.com
BTW, I don't think there would be a RPKM for H3K9me3 unless you only want to consider K9me3 marks on gene bodies. Normally, I would normalize pileup to the sequencing depth in million for genome browser view -- with the --SPMR option in MACS2.

-Tao

Fidel Ramirez

unread,
Apr 11, 2013, 11:33:08 AM4/11/13
to macs-ann...@googlegroups.com
Thanks Tao for your answer,

The track files in the attached image were generated by us. The RPKM scaling is used for bins of size 50. But we think this is irrelevant because un-scaled wiggle tracks look just the same (off course with values that are proportionally different).

I will check your recommendation but meanwhile can you confirm that MACS2 does not detect enriched regions in the input which are afterwards excluded?


--

Fidel Ramirez

Tao Liu

unread,
Apr 11, 2013, 12:23:04 PM4/11/13
to macs-ann...@googlegroups.com
Hi Fidel,

There is no step to exclude certain regions during peak calling. MACS2 generates ChIP signal and control bias track first ( which you can get through macs2 callpeak -B), then compare them to get score track ( which you can get through macs2 bdgcmp -m qpois ). After that, only regions with score higher than cutoff will be called. So it would be helpful to see what MACS 'saw' from your data by generating those bedGraph files I mentioned. 

Best,
Tao Liu

--
Assistant Professor
Department of Biochemistry
University at Buffalo
NY State Center of Excellence in Bioinformatics & Life Sciences

B2-163 COEBLS
(O) 716-829-2749
tl...@buffalo.edu

Mailing address:
University at Buffalo-COEBLS
701 Ellicott St, B2-163
Buffalo, NY 14203-1221

Fidel Ramirez

unread,
Apr 22, 2013, 3:23:54 AM4/22/13
to macs-ann...@googlegroups.com
Hi Tao,

Finally we figured out the problem: removal of duplicated repeats.

Recalling the image I sent previously, the largely enriched region to the right contains lots of duplicated reads while the region to the right contains just few duplications. When MACS2 is run, because of the removal of such duplications mostly affecting the region to the right, there is no longer an enrichment and no peaks are called.

If MACS2 is run without removing duplicates, then we get the expected behavior: peaks are called in both the left and right regions.

I will like to acknowledge that thanks to your suggestions we could identify the problem.

Thanks a lot,

fidel
Reply all
Reply to author
Forward
0 new messages