Is it possible to obtain the delta beta with minfi?

973 views
Skip to first unread message

Pablo marin-garcia

unread,
Apr 22, 2013, 9:13:58 AM4/22/13
to epigenom...@googlegroups.com
Hello
 in my methylation experiment when obtaining the differentially methylated positions with minfi dmFinder I obtain a dataframe like

> head(dmp_neo)
           intercept        f         pval        qval
cg25587481  3.049019 763.7692 3.170129e-09 0.001255365
cg15776446 -5.953288 156.9387 1.543591e-06 0.305629537
cg26501671  1.895482 127.1332 3.441293e-06 0.343209713
cg00728655  5.516670 119.0103 4.416806e-06 0.343209713
cg02262727  4.213064 117.1868 4.681643e-06 0.343209713
cg13000285  4.221990 109.8388 5.973284e-06 0.343209713

Is it possible to obtain a delta beta for each position? I can see the difference with plotCpg, but I would like to filter/sort the sites also by deltabeta.

I have used with RnBeads the column  mean.diff and with Genomeestudio the deltabeta and diff score, and in order to compare with minfi results I would like to sort by delta beta.


Thanks

Gustavo Fernandez Bayon

unread,
Apr 23, 2013, 4:27:43 AM4/23/13
to epigenom...@googlegroups.com
Hi Pablo.

I have run into these issues before. If you are using moderated p-values (i.e., shrinkVar=TRUE) in dmpFinder, and due to the contrasts used, the 'intercept' column seems to reflect the mean difference (although the order is dependant on the factor levels, I think). If shrinkVar is set to FALSE, then this is no longer valid.

I have written Kasper Daniel Hansen about this in the past, and he answered me that they were still thinking about ways of improving the dmpFinder function when comparing two samples, which is probably the most common case we find everyday.

If you need full control, then you should probably do it by hand, using limma for example, or t-tests, and then correct for multiple testing.

I once adapted dmpFinder just by modifying following code:

tab <- data.frame(intercept = fit$coefficients[,
                1], f = F, pval = F.p.value)

into this one:

 tab <- data.frame(intercept=fit$coefficients[, 1],
                    slope=fit$coefficients[, 2], f=F, pval=F.p.value)

and it served me well for my purposes.

Hope this can help you.

Regards,
Gustavo

Rob

unread,
Apr 23, 2013, 4:37:31 AM4/23/13
to epigenom...@googlegroups.com
Hi,

You can get the Raw Beta values for each probes via the following:

MSet.raw <- preprocessRaw(RGset)

beta=getBeta(MSet.raw, type = "Illumina")

Then to get the average beta difference values you can do something like this. (you will need to define group1 and group2 to your initial call in dmpFinder.)

betadif=rowMeans(beta[row.names(dmp_neo),group1])-rowMeans(beta[row.names(dmp_neo),group2])


Finally you can get a list of p-value, q-value and differences by:

out=cbind(dmp_neo$pval, dmp_neo$qval, betadif)

head(out)


Hopefully you can easily see how to change it for normalised data and your particular dataset.

Thanks,
Rob





Pablo marin-garcia

unread,
Jun 2, 2013, 7:51:25 PM6/2/13
to epigenom...@googlegroups.com
Thanks Rob, reading your code I realize how simple was to obtain the data. Sometimes things are simpler than one thinks.

Pablo marin-garcia

unread,
Jun 2, 2013, 11:30:14 PM6/2/13
to epigenom...@googlegroups.com


Thanks Gustavo. Rest of the email inline...

On Tuesday, 23 April 2013 10:27:43 UTC+2, Gustavo Fernandez Bayon wrote:
Hi Pablo.

I have run into these issues before. If you are using moderated p-values (i.e., shrinkVar=TRUE) in dmpFinder, and due to the contrasts used, the 'intercept' column seems to reflect the mean difference (although the order is dependant on the factor levels, I think). If shrinkVar is set to FALSE, then this is no longer valid.

One question about the shrinkVar. Hoe do you run it? When I tried shrinkVar it failed because 

        In contrasts.fit(fit, contrasts(pheno)) :
          row names of contrasts don't match col names of coefficients

this hapens when doing the contrast.fit in minfi:

[minfi dmpFinder code]
   if (type == "categorical") {
        pheno <- factor(as.character(pheno))
        design <- model.matrix(~pheno)
        fit <- lmFit(M, design)
        if (shrinkVar) {
            fit <- contrasts.fit(fit, contrasts(pheno))
            fit <- eBayes(fit)
            tab <- data.frame(intercept=fit$coefficients[, 1], f=fit$F, pval=fit$F.p.value)

My doubt is that in the limma manual (http://bioconductor.org/packages/2.12/bioc/manuals/limma/man/limma.pdf) for single chanel experiment with two groups (pg 40), when the design contains all 1s in the first column and second column with the AvsB,  then from lmFit the next step is eBayes. On the contrary when the design is a matrix with separated coefficients for each condition, after "lmFit" they do "makeContrasts" first. 

The minfi code internally do:

> design <- model.matrix(~pheno)

> design
   (Intercept) phenoneo_SGA
1            1            0
2            1            1
3            1            0
4            1            1
5            1            0
6            1            1
7            1            1
8            1            0
9            1            0
10           1            1
attr(,"assign")
[1] 0 1
attr(,"contrasts")
attr(,"contrasts")$pheno
[1] "contr.treatment" 


> pheno
 [1] neo_AGA neo_SGA neo_AGA neo_SGA neo_AGA neo_SGA neo_SGA neo_AGA neo_AGA
[10] neo_SGA
Levels: neo_AGA neo_SGA

> fit <- contrasts.fit(fit, contrasts(pheno))
[error]
        In contrasts.fit(fit, contrasts(pheno)) :
          row names of contrasts don't match col names of coefficients

> contrasts(pheno)
        neo_SGA
neo_AGA       0
neo_SGA       1

How can I pass to dmpFinder the right contrasts names?

pablo

Gustavo Fernández Bayón

unread,
Jun 3, 2013, 5:10:33 AM6/3/13
to epigenom...@googlegroups.com

Hi Pablo.

Sorry for the late reply. With respect to the shrinkVar parameter, we are currently setting it to FALSE, because we are first applying a non-specific filtering to our data in order to increase statistical power (we have only a few samples in this project), and according to this paper, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2906865/?tool=pmcentrez&rendertype=abstract, the empirical bayes adjustment should not be done in this case.

Problem is, in my case, the message about the contrast names is just a warning. Thus, I am still getting the result. As far as your are aware of this, I think you could safely use the results for it's just a warning about some names not matching.

Once I e-mailed Kasper Daniel Hansen with some questions about the dmpFinder function, and he told me that there were still things that they were not sure how to implement. For example, when doing a two-group test with shrinkVar=FALSE, the 'intercept' column does not represent the effect size, so you have to calculate again what's already been done inside the function.

IMHO, if you are aware of what's happening with the contrast names, I think you can use the results (if it really is a warning in your case, if it's an error I guess you should have to look for another way). I am using dmpFinder because I really like minfi and it serves my purposes greatly, but you can use limma or rowttests or whatever you prefer once you have the m-values.

Another suggestion could be to post the minfi-related questions on the Bioconductor mailing list http://www.bioconductor.org/help/mailing-list/  Kasper usually answers there very fast.

Regards,
Gus




El 03/06/13 05:30, Pablo marin-garcia escribió:
--
You received this message because you are subscribed to a topic in the Google Groups "Epigenomics forum" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/epigenomicsforum/uTYCSOgzsHs/unsubscribe?hl=en.
To unsubscribe from this group and all its topics, send an email to epigenomicsfor...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Smeeta Shrestha

unread,
Mar 6, 2014, 7:14:20 AM3/6/14
to epigenom...@googlegroups.com
Hi All !!

I am a beginner using dmpFinder in the minfi package. I am trying to use this function independently ie not starting with idat files. Instead I am just using the beta matrix for input and phenotype file.

However i do not get any dmps. when i know from other analysis that i get (IMA).

Below I have shown the error i get . Any assistance would be very much appreciated.

beta <- "C:/Documents and Settings/Administrator/Desktop/F_B12_Test/Beta.txt"
dat <- read.table(beta,sep="\t", head=TRUE)
dat1 <- as.matrix(dat[,-1])
Phenotype <-"C:/Documents and Settings/Administrator/Desktop/F_B12_Test/Pheno.txt"
pheno <- read.table(Phenotype,sep="\t", head=TRUE)
pheno1 <- as.character(pheno$Sample_group)
dmp <- dmpFinder(dat1,pheno1, type = "categorical",qCutoff = NULL, shrinkVar = TRUE)

Warning message:

In contrasts.fit(fit, contrasts(pheno)) :
  row names of contrasts don't match col names of coefficients



Thank you so much in advance !!!

I really am stuck badly for some time

smeeta

Pablo marin-garcia

unread,
Mar 17, 2014, 8:05:29 PM3/17/14
to epigenom...@googlegroups.com
Hello Smeeta,

The message that you have is a warning not an error so the code should work. I never managed to use categorical with shrinkVar. I use it always with F.

Only one more thing, dmpFinder expects a matrix of Ms not Betas.

Just in case make str(your_matrix) and str(your_pheno) to see that they are as you expect.

P.

Christian Trolle

unread,
Jun 26, 2014, 6:08:37 AM6/26/14
to epigenom...@googlegroups.com
Hi Pablo Marin-Garcia

I know this post is a bit old, but how did you acquire the group information needed for the code. You probably obtain it from the Mset.raw file which you used to generate the beta-values from, but how exactly did the code look. Hope the question makes sense,

betadif=rowMeans(beta[row.names(dmp_neo),group1])-rowMeans(beta[row.names(dmp_neo),group2])

Christian Trolle

unread,
Jun 26, 2014, 7:27:35 AM6/26/14
to epigenom...@googlegroups.com
I figured it out.

M <- getM(genMeth.noXY, type = "beta", betaThreshold = 0.001)
Mdif=rowMeans(M[row.names(dmp),group1])-rowMeans(M[row.names(dmp),group2])
volcanoDMP=cbind(dmp$pval, dmp$qval, Mdif)
colnames(volcanoDMP)[1] <- "pval"
colnames(volcanoDMP)[2] <- "qval"
head(volcanoDMP)
Reply all
Reply to author
Forward
0 new messages