Question about unite function with relaxed parameter


Núñez, Diana

Sep 1, 2021, 8:38:05 PM
to methylkit_...@googlegroups.com
Hi,

I hope you are doing well.

I would like to ask for your help with methylKit.

I'm currently analyzing 38 samples: 25 cases and 13 controls.

In the methRead step, I added the treatment information, coding cases as 1 and controls as 0.

The next steps I ran were:

## Adjust the BS-seq data with the oxBS data to estimate hydroxymethylation

HydroxyRawList <- adjustMethylC(BS.RawList, MethRawList)

## Filter out CpG sites with fewer than 10 reads and sites with potential PCR bias (coverage above the 99.9th percentile)

filtered.HydroxyRawList <- filterByCoverage(HydroxyRawList, lo.count=10, lo.perc=NULL, hi.count=NULL, hi.perc=99.9)

## Normalize coverage across samples

filt.and.normed <- normalizeCoverage(filtered.HydroxyRawList, method="median")


## Unite individual samples into one object ready for differential methylation

united.meth <- unite(filtered.HydroxyRawList, destrand=FALSE)


When I do that and call differential methylation with a q-value < 0.01 and a methylation difference greater than 25%, I get no DMLs at all.


But when I use a relaxed filter:

united.meth.flex <- unite(filtered.HydroxyRawList, destrand=FALSE, min.per.group=1L)

I get more than 3000 DMLs.
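
A quick way to see what the relaxed setting does to the united object is to compare site counts from the two calls. The sketch below is only illustrative: it assumes methylKit is installed and uses the small example files bundled with the package (four samples, not the data from this thread); the object names are made up for the example.

```r
library(methylKit)

# Example files shipped with methylKit (not the data discussed in this thread).
file.list <- list(
  system.file("extdata", "test1.myCpG.txt", package = "methylKit"),
  system.file("extdata", "test2.myCpG.txt", package = "methylKit"),
  system.file("extdata", "control1.myCpG.txt", package = "methylKit"),
  system.file("extdata", "control2.myCpG.txt", package = "methylKit")
)

myobj <- methRead(file.list,
                  sample.id = list("test1", "test2", "ctrl1", "ctrl2"),
                  assembly = "hg18",
                  treatment = c(1, 1, 0, 0),
                  context = "CpG")

# Default: keep only sites covered in every sample.
strict <- unite(myobj, destrand = FALSE)
# Relaxed: keep sites covered in at least one sample per treatment group.
relaxed <- unite(myobj, destrand = FALSE, min.per.group = 1L)

n.strict <- nrow(getData(strict))
n.relaxed <- nrow(getData(relaxed))
cat("strict:", n.strict, " relaxed:", n.relaxed, "\n")
```

Every site in the strict object also satisfies the relaxed criterion, so the relaxed count can only be equal or larger.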

---------

So, my question is:

Is there a recommendation for setting the relaxed filter according to the sample size or the number of treatment groups? That is, with 38 samples in two treatment groups (13 versus 25), would it be better to set min.per.group in unite() to 2L, 3L, 4L, and so on?

Or should I change an earlier step so that unite() without the relaxed parameter still yields DMLs?



Thank you for your kind help.






Altuna Akalin

Sep 3, 2021, 8:14:21 AM
to methylkit_...@googlegroups.com
How many CpGs do you get as output from each of the two unite() approaches?


Diana Leandra Nuñez Rios

Sep 4, 2021, 10:54:29 AM
to methylkit_discussion
Hi,
Thanks.
I ran it with the filtered object (as in the workflow), and I also tried it with the normalized object:

Using the filtered object

unite() setting                 Mark    CpGs       DMPs
not relaxed                     5mC     1659917       0
not relaxed                     5hmC    1849092       0
relaxed (min.per.group=1L)      5mC     4821767     158
relaxed (min.per.group=1L)      5hmC    5052557      38


Using the normalized object

unite() setting                 Mark    CpGs       DMPs
not relaxed                     5mC     1659917       0
not relaxed                     5hmC    1849092       0
relaxed (min.per.group=1L)      5mC     4821767    2586
relaxed (min.per.group=1L)      5hmC    5052557     738
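
To put the CpG counts above in proportion, here is a small base-R calculation on the reported numbers only. The data frame and its column names are made up for the example.

```r
# CpG counts reported above (identical for the filtered and normalized runs).
cpgs <- data.frame(
  mark    = c("5mC", "5hmC"),
  strict  = c(1659917, 1849092),   # unite() without min.per.group
  relaxed = c(4821767, 5052557)    # unite() with min.per.group=1L
)

# Share of relaxed-set CpGs that are also covered in all 38 samples.
cpgs$frac.all.samples <- cpgs$strict / cpgs$relaxed

# CpGs present only in the relaxed set, i.e. missing in at least one sample.
cpgs$relaxed.only <- cpgs$relaxed - cpgs$strict

print(cpgs)
```

On these numbers, roughly a third of the CpGs in the relaxed set are covered in all 38 samples; the remaining two thirds are missing in at least one sample.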




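The code for the differential methylation step was:

Unite, not relaxed

## Test each CpG, with overdispersion correction (MN), Chi-squared test and SLIM-adjusted q-values
diff.5mC <- calculateDiffMeth(united.meth,
                              overdispersion="MN",
                              test="Chisq",
                              covariates=NULL,
                              adjust="SLIM")

Unite, relaxed (1L)

## Same settings, applied to the object united with min.per.group=1L
diff.5mC.flex <- calculateDiffMeth(united.meth.flex,
                                   overdispersion="MN",
                                   test="Chisq",
                                   covariates=NULL,
                                   adjust="SLIM")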

Thanks again

Altuna Akalin

Sep 5, 2021, 10:52:37 AM
to methylkit_...@googlegroups.com
Hi,
My gut feeling is to use a min.per.group of at least 3. One could run simulation experiments to find an optimal number, but you want to capture some of the variation between samples within each group. With min.per.group=1L, there may be CpGs that are covered in only one control sample and one test sample.
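
To make the trade-off concrete, here is a toy base-R sketch of how the number of retained sites changes with the per-group minimum. It does not use methylKit or real data: the 90% per-sample coverage probability is an arbitrary assumption, and `kept()` is a hypothetical helper that mimics the idea of requiring at least k covered samples in every group.

```r
set.seed(42)

# Group sizes as in this thread: 25 cases (1) and 13 controls (0).
treatment <- c(rep(1, 25), rep(0, 13))
n.sites <- 1000

# Assumption for illustration only: each sample covers a site with probability 0.9.
covered <- matrix(runif(n.sites * length(treatment)) < 0.9, nrow = n.sites)

# Number of samples covering each site, per group.
n.case <- rowSums(covered[, treatment == 1, drop = FALSE])
n.ctrl <- rowSums(covered[, treatment == 0, drop = FALSE])

# Sites kept when every group needs at least k covered samples.
kept <- function(k) sum(n.case >= k & n.ctrl >= k)

# Sites covered in all samples (the strict setting).
strict <- sum(n.case == sum(treatment == 1) & n.ctrl == sum(treatment == 0))

thresholds <- c(1, 3, 5, 10)
print(data.frame(min.per.group = thresholds, sites = sapply(thresholds, kept)))
cat("covered in all samples:", strict, "\n")
```

Raising the threshold can only remove sites, so the choice is between keeping more CpGs and having enough samples per group at each CpG to estimate between-sample variation.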

As for normalization, I don't have a firm opinion either. I think it would help if some samples have systematically low read counts. Again, one can use the simulation functions in methylKit to examine concretely the effects of systematic differences in read depth between samples.
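
A minimal starting point for such a simulation, assuming methylKit is installed, might look like the sketch below. It uses dataSim() with the group sizes from this thread and the same test settings and thresholds as above. The number of sites and the object names are arbitrary, and the simulated sites are covered in every sample, so this only exercises the testing and thresholding steps, not missing coverage or read-depth differences.

```r
library(methylKit)

set.seed(123)

# Same group sizes as in the thread: 25 cases (1) and 13 controls (0).
treatment <- c(rep(1, 25), rep(0, 13))

# Simulate 1000 sites; the share of truly differential sites and the effect
# size are left at dataSim()'s defaults ('percentage' and 'effect' arguments).
sim <- dataSim(replicates = length(treatment), sites = 1000, treatment = treatment)

# Same test settings as used in the thread.
sim.diff <- calculateDiffMeth(sim, overdispersion = "MN", test = "Chisq", adjust = "SLIM")

# Same thresholds as in the thread: q-value < 0.01, difference > 25%.
sim.dml <- getMethylDiff(sim.diff, difference = 25, qvalue = 0.01, type = "all")

cat("sites called:", nrow(getData(sim.dml)), "of", nrow(getData(sim)), "\n")
```

From here one could vary the effect size or thin the read counts of selected samples and see how the number of called sites responds.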

Best,
Altuna

Diana Leandra Nuñez Rios

Sep 6, 2021, 2:02:24 PM
to methylkit_discussion
Thank you
I'm going to follow your gut feeling :)
Thanks
