How to use methylKit directly for region-based differential methylation analysis

481 views
Skip to first unread message

Eric Letouzé

unread,
Apr 4, 2015, 10:10:27 AM4/4/15
to methylkit_...@googlegroups.com
Dear Dr Akalin,

I am currently analyzing RRBS data and I would like to ask you two questions regarding your methylKit package, that looks great!
I already summarized my methylation profiles per regions of interest (promoters, CGIs etc.) using scripts from the lab, but I would like to use your methylKit package for differential methylation analysis.
Thus my questions are:
1) Can I use methylKit to analyze methylation differences at the level of regions directly (my coverage is a bit too low to analyze CpGs individually)?
2) How can I coerce my data frame to a methylBase object to go straight to differential methylation testing without performing the initial steps of your pipeline?
I attached an example of data frame if needed.

Thank you very much in advance for your help.
Best regards,

Eric Letouzé
INSERM U1162
Functional genomics of solid tumors
27 rue Juliette Dodu, 75010 Paris
+33 (0)1 53 72 51 98

Sample_data_EL.RData

Altuna Akalin

unread,
Apr 6, 2015, 4:36:24 AM4/6/15
to methylkit_...@googlegroups.com, eric.l...@gmail.com
Dear Eric,

1) Can I use methylKit to analyze methylation differences at the level of regions directly (my coverage is a bit too low to analyze CpGs individually)?
Yes. 
2) How can I coerce my data frame to a methylBase object to go straight to differential methylation testing without performing the initial steps of your pipeline?

Here is how you can create a methylBase object from your data.frame (df). You need to fill out sample.ids vector, assembly string, treatment vector, these are self explanatory. They are the same/similar arguments you would use when reading a file.

coverage.index,numCs.index and numTs.index are the the vectors for columns numbers (indeces) for numTs, coverage and numCs values in your data frame. destranded is a TRUE|FALSE value showing if you have stranded information or not. The rest I filled out for you, check if they are correct.

obj=new("methylBase",(df),sample.ids=c(),
         assembly="",context="CpG",
         treatment=c(),coverage.index=c(),
         numCs.index=c(),numTs.index=c(),
         destranded=TRUE,resolution="region" )

You have to make sure everything is correct and data frame has the same structure as in a regular methylBase object

Best,
Altuna

tagu...@gmail.com

unread,
Jul 18, 2015, 7:27:18 PM7/18/15
to methylkit_...@googlegroups.com, eric.l...@gmail.com
Hi Altuna and Eric,
I'm also trying to coerce a dataframe to a methylBase object. I've tried to adapt the code in Altuna's response (my adaptation copied below), but get the error message: Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘filterByCoverage’ for signature ‘"methylBase"’. Any ideas on what I've done wrong?
Many thanks,
Trudi 

My dataframe:
> head(all.sat.sort)
       chromosome position strand count_meth1 count_unmeth1 C-context1 tri-nuc1 total_reads1 count_meth2 count_unmeth2 C-context2 tri-nuc2 total_reads2
6  Lsat_1_v6_lg_1     1001      +           0             0        CHH      CCT            0           0             0        CHH      CCT            0
11 Lsat_1_v6_lg_1     1002      +           0             0        CHH      CTT            0           1             2        CHH      CTT            3
22 Lsat_1_v6_lg_1     1005      +           0             0        CHH      CCC            0           0             1        CHH      CCC            1
28 Lsat_1_v6_lg_1     1006      +           0             0        CHH      CCC            0           1             5        CHH      CCC            6
33 Lsat_1_v6_lg_1     1007      +           0             0        CHH      CCT            0           0            13        CHH      CCT           13
37 Lsat_1_v6_lg_1     1008      +           0             0        CHH      CTC            0           0            31        CHH      CTC           31

> obj=new("methylBase",(all.sat.sort),sample.ids=c("wgbs.sat1", "wgbs.sat2"), assembly="Lsat.1.v6.lg.fasta",context="CpG",treatment=c(1,0),coverage.index=c(8, 13), numCs.index=c(4,9),numTs.index=c(5,10), destranded=TRUE,resolution="base" )

> str(obj)
'data.frame': 10000 obs. of  13 variables:
Formal class 'methylBase' [package "methylKit"] with 13 slots
  ..@ .Data         :List of 13
  .. ..$ : Factor w/ 1 level "Lsat_1_v6_lg_1": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ : int  1001 1002 1005 1006 1007 1008 1010 1011 1012 1015 ...
  .. ..$ : Factor w/ 2 levels "-","+": 2 2 2 2 2 2 2 1 2 2 ...
  .. ..$ : int  0 0 0 0 0 0 0 0 0 0 ...
  .. ..$ : int  0 0 0 0 0 0 0 0 0 0 ...
  .. ..$ : Factor w/ 3 levels "CG","CHG","CHH": 3 3 3 3 3 3 1 1 3 2 ...
  .. ..$ : Factor w/ 20 levels "CAA","CAC","CAG",..: 10 20 7 7 10 18 12 11 17 19 ...
  .. ..$ : int  0 0 0 0 0 0 0 0 0 0 ...
  .. ..$ : int  0 1 0 1 0 0 0 0 2 2 ...
  .. ..$ : int  0 2 1 5 13 31 2 1 0 1 ...
  .. ..$ : Factor w/ 3 levels "CG","CHG","CHH": 3 3 3 3 3 3 1 1 3 2 ...
  .. ..$ : Factor w/ 20 levels "CAA","CAC","CAG",..: 10 20 7 7 10 18 12 11 17 19 ...
  .. ..$ : int  0 3 1 6 13 31 2 1 2 3 ...
  ..@ sample.ids    : chr  "wgbs.sat1" "wgbs.sat2"
  ..@ assembly      : chr "Lsat.1.v6.lg.fasta"
  ..@ context       : chr "CpG"
  ..@ treatment     : num  1 0
  ..@ coverage.index: num  8 13
  ..@ numCs.index   : num  4 9
  ..@ numTs.index   : num  5 10
  ..@ destranded    : logi TRUE
  ..@ resolution    : chr "base"
  ..@ names         : chr  "chromosome" "position" "strand" "count_meth1" ...
  ..@ row.names     : int  6 11 22 28 33 37 46 50 57 70 ...
  ..@ .S3Class      : chr "data.frame"

> #filter samples by read depth
> filterByCoverage(obj,lo.count=10,lo.perc=NULL,hi.count=NULL,hi.perc=99.9)
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘filterByCoverage’ for signature ‘"methylBase"’

Altuna Akalin

unread,
Jul 20, 2015, 10:28:12 AM7/20/15
to methylkit_...@googlegroups.com
Hi,
This data frame doesn't look like a data frame from methylKit. please see data(methylKit);methylBase.obj to see the format

I do not recommend setting up objects this way, it causes the problems you are encountering. 

Best,
Altuna

--
You received this message because you are subscribed to the Google Groups "methylkit_discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to methylkit_discus...@googlegroups.com.
To post to this group, send email to methylkit_...@googlegroups.com.
Visit this group at http://groups.google.com/group/methylkit_discussion.
For more options, visit https://groups.google.com/d/optout.

Reply all
Reply to author
Forward
0 new messages