5hmc question


hashima...@gmail.com

Oct 3, 2017, 10:03:36 AM
to MethPipe and MethBase Users
Hi,

First of all, thanks for the great tools. I have some questions.

I have BS and oxBS data, and I am trying to estimate 5hmC and 5mC levels.

I understood from previous posts that for BS-Seq, methcounts gives P = (Pm + Phm), and for oxBS-Seq, methcounts gives Pm. So the estimate of 5hmC is (P - Pm).

Here are the first 10 lines of my files.

Sample_1-C57BL6-oxbs_Replicate1_dremove_CPG_symmetric.meth = oxBS sample
Sample_3-C57BL6-bs_Replicate1_dremove_CPG_symmetric.meth   = BS sample
C57BL6_Rep1_mlml_with_nan.meth = Output of mlml 


head Sample_1-C57BL6-oxbs_Replicate1_dremove_CPG_symmetric.meth Sample_3-C57BL6-bs_Replicate1_dremove_CPG_symmetric.meth
==> Sample_1-C57BL6-oxbs_Replicate1_dremove_CPG_symmetric.meth <==
chr1 3000573 + CpG 0 0
chr1 3000725 + CpG 1 1
chr1 3000900 + CpG 0 0
chr1 3001345 + CpG 0.75 4
chr1 3001393 + CpG 1 3
chr1 3001630 + CpG 0 0
chr1 3002176 + CpG 1 1
chr1 3002337 + CpG 1 2
chr1 3002385 + CpG 1 1
chr1 3002598 + CpG 1 1

==> Sample_3-C57BL6-bs_Replicate1_dremove_CPG_symmetric.meth <==
chr1 3000573 + CpG 0 0
chr1 3000725 + CpG 0.666667 3
chr1 3000900 + CpG 1 2
chr1 3001345 + CpG 1 2
chr1 3001393 + CpG 1 2
chr1 3001630 + CpG 1 2
chr1 3002176 + CpG 0 0
chr1 3002337 + CpG 0 0
chr1 3002385 + CpG 0 0
chr1 3002598 + CpG 0.666667 3

$ head C57BL6_Rep1_mlml_with_nan.meth 
chr1 3000573 3000574 nan nan nan nan
chr1 3000725 3000726 0.75 0 0.25 0
chr1 3000900 3000901 nan nan nan nan
chr1 3001345 3001346 0.75 0.25 0 0
chr1 3001393 3001394 1 0 0 0
chr1 3001630 3001631 nan nan nan nan
chr1 3002176 3002177 nan nan nan nan
chr1 3002337 3002338 nan nan nan nan
chr1 3002385 3002386 nan nan nan nan
chr1 3002598 3002599 0.75 0 0.25 0


Could you please tell me:

1) What does nan mean?
2) If 5hmC is (P - Pm), how does mlml get 0.75 (Pm), 0 (Ph) and 0.25 (Pu) in the second line of C57BL6_Rep1_mlml_with_nan.meth?

Thanks in advance. I look forward to your response.

Regards
Adnan. 

Jianghan Qu

Oct 3, 2017, 2:45:21 PM
to meth...@googlegroups.com
Hi Adnan, 

1) nan means the proportions of C, 5hmC and 5mC cannot be estimated from the input data. This happens when a CpG site has read coverage in only one experiment (or in none).
2) These are the maximum-likelihood estimates of the levels given the observed data. The ML method guarantees consistent estimates (pm, ph, pu >= 0 and pm + ph + pu = 1), whereas estimating the 5hmC level by subtracting the oxBS level from the BS level gives a negative value in this case: 0.67 - 1 = -0.33.

For more details about how mlml works, please check out our paper: https://www.ncbi.nlm.nih.gov/pubmed/23969133
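To make the numbers in the question concrete, here is a minimal sketch of a constrained maximum-likelihood estimate for the two-experiment case only. It assumes the methylated read counts are binomial with success probability pm + ph in BS and pm in oxBS; the function names are made up for illustration, and this is not mlml's code, whose actual algorithm is described in the paper and may differ. When the BS level is below the oxBS level (overshoot), the constraint ph >= 0 is active, so ph is set to 0 and pm becomes the pooled methylation level.

```python
def meth_reads(level, coverage):
    """Recover the methylated read count from a methcounts level and coverage."""
    return round(level * coverage)


def ml_levels(k_ox, n_ox, k_bs, n_bs):
    """Return (pm, ph, pu) for one CpG, or None if a sample has no coverage.

    k_* are methylated read counts, n_* are total read counts.
    Assumed model: k_bs ~ Binomial(n_bs, pm + ph), k_ox ~ Binomial(n_ox, pm).
    """
    if n_ox == 0 or n_bs == 0:
        return None  # corresponds to the nan rows
    ox_level = k_ox / n_ox
    bs_level = k_bs / n_bs
    if bs_level >= ox_level:
        # The unconstrained estimate is already valid: simple subtraction.
        return ox_level, bs_level - ox_level, 1 - bs_level
    # Overshoot: the best valid estimate has ph = 0, so both experiments
    # measure the same quantity (pm) and their reads can be pooled.
    pooled = (k_ox + k_bs) / (n_ox + n_bs)
    return pooled, 0.0, 1 - pooled


# chr1 3000725: oxBS level 1 with 1 read, BS level 0.666667 with 3 reads
site = ml_levels(meth_reads(1, 1), 1, meth_reads(0.666667, 3), 3)
print(site)
```

Under these assumptions the pooled level at chr1 3000725 is (1 + 2) / (1 + 3) = 0.75, which matches the pm = 0.75, ph = 0, pu = 0.25 reported in the second line of the mlml output.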

Best,

--
Jenny

Adnan Hashim

Oct 3, 2017, 3:47:22 PM
to meth...@googlegroups.com
Hi,

Thanks for the quick reply. Could you please answer a few more questions as well? :)
1) Should I remove all nan sites from further analysis?
2) Are sites with overshoot excluded from the output of mlml?
3) Should sites that conflict in at least two inputs be excluded from further analysis?
4) Does mlml apply any statistical test to confirm the existence of 5hmC? If not, I am planning to use Fisher's exact test. (Do you have any suggestions about this?)

Thanks. 

Regards,

Adnan.



Jianghan Qu

Oct 4, 2017, 1:13:25 PM
to meth...@googlegroups.com
Hi Adnan, 

Please see comments below. 

Best,

On Tue, Oct 3, 2017 at 12:47 PM, Adnan Hashim <hashima...@gmail.com> wrote:
1) Should I remove all nan sites from further analysis?

Yes.
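
A minimal sketch of that filtering step, assuming the seven whitespace-separated columns shown in the question are chromosome, start, end, pm, ph, pu, and a final conflict count (the meaning of the last column is an assumption here, not stated in this thread). The function names and the `max_conflicts` threshold are hypothetical.

```python
import math


def parse_mlml_line(line):
    """Split one mlml output line into typed fields."""
    chrom, start, end, pm, ph, pu, conflicts = line.split()
    return chrom, int(start), int(end), float(pm), float(ph), float(pu), float(conflicts)


def keep_site(fields, max_conflicts=1):
    """Drop sites whose levels are nan or whose conflict count is too high."""
    pm, ph, pu, conflicts = fields[3:]
    if any(math.isnan(value) for value in (pm, ph, pu, conflicts)):
        return False
    return conflicts <= max_conflicts


def filter_mlml(lines, max_conflicts=1):
    """Yield parsed records for the sites that pass both filters."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = parse_mlml_line(line)
        if keep_site(fields, max_conflicts):
            yield fields


example = [
    "chr1 3000573 3000574 nan nan nan nan",
    "chr1 3000725 3000726 0.75 0 0.25 0",
    "chr1 3001345 3001346 0.75 0.25 0 0",
]
for record in filter_mlml(example):
    print(record)
```

On the three lines above, only the two sites with numeric estimates are kept.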

2) Are sites with overshoot excluded from the output of mlml?

No. But we flag sites with significant inconsistency (binomial test) between experiments. With oxBS and BS, all overshoot sites (including flagged sites) will have 5hmC level estimated as 0 by the maximum-likelihood method. 

3) Should sites that conflict in at least two inputs be excluded from further analysis?

Sounds reasonable.

4) Does mlml apply any statistical test to confirm the existence of 5hmC? If not, I am planning to use Fisher's exact test. (Do you have any suggestions about this?)

No. You can certainly apply Fisher's exact test, but I'm guessing you plan to increase the sequencing depth.
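
To illustrate why depth matters here, below is a sketch of a one-sided Fisher's exact test on a single CpG, written with only the standard library. The 2x2 table (BS vs oxBS rows, methylated vs unmethylated read columns) and the one-sided alternative "BS level > oxBS level" are assumptions about how one might set the test up, not something mlml does. The function name is hypothetical, and the second call uses made-up counts that simply scale the observed ones by ten.

```python
from math import comb


def fisher_5hmc_pvalue(k_bs, n_bs, k_ox, n_ox):
    """One-sided Fisher's exact p-value for BS methylation > oxBS methylation.

    k_* are methylated read counts, n_* are total read counts.
    Sums the hypergeometric probabilities of tables in which the BS sample
    has at least k_bs methylated reads, with all margins held fixed.
    """
    total_reads = n_bs + n_ox
    total_meth = k_bs + k_ox
    denominator = comb(total_reads, n_bs)
    upper = min(total_meth, n_bs)
    numerator = sum(
        comb(total_meth, x) * comb(total_reads - total_meth, n_bs - x)
        for x in range(k_bs, upper + 1)
    )
    return numerator / denominator


# chr1 3001345 as observed: BS 2 of 2 reads methylated, oxBS 3 of 4
print(fisher_5hmc_pvalue(2, 2, 3, 4))

# Hypothetical: same proportions at ten times the depth
print(fisher_5hmc_pvalue(20, 20, 30, 40))
```

At the observed depth the only table at least as extreme is the observed one, so the p-value is 10/15, which cannot support a 5hmC call even though the estimated ph is 0.25. The same proportions at higher depth give a much smaller p-value.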



--
Jenny

hashima...@gmail.com

Oct 5, 2017, 2:58:18 AM
to MethPipe and MethBase Users
Thank you very much for the quick replies and clarifications.

Regards,

Adnan. 