Preprocessing, normalization and integration of EPIC arrays with minfi.

Jean-Philippe Fortin

Jul 24, 2016, 4:15:16 PM
to Epigenomics forum
Hi everyone,

Kasper, Tim and I have just finished updating the minfi package for the full processing, normalization and integration of the EPIC array with the 450k array. The latest version of minfi (1.19.10) including the changes can be found on GitHub at https://github.com/kasperdanielhansen/minfi. The devel version should be updated on Bioconductor within the next couple of days. Here are a few things that we have changed:

0) EPIC and 450k arrays can now be loaded into R using the functions read.metharray() and read.metharray.exp(), which replace the deprecated functions read.meth() and read.meth.exp().

1) We have updated all normalization methods for the EPIC array: preprocessRaw(), preprocessIllumina(), preprocessSWAN(), preprocessQuantile(), preprocessNoob() and preprocessFunnorm(). We have improved the noob correction (preprocessNoob()) to be a single-sample normalization; that is, new samples can be normalized separately from old batches, and the normalized samples will still be optimal. A comparison of the different normalization methods can be found in our preprint: http://biorxiv.org/content/early/2016/07/23/065490

2) Estimation of the cell-type proportions with the function estimateCellCounts() can now be performed on EPIC arrays as well, using the three reference datasets FlowSorted.Blood.450k, FlowSorted.CordBlood.450k and FlowSorted.DLPFC.450k. We show in the preprint that restricting to the probes in common between the 450k and EPIC arrays does not change the estimated counts.

3) We have designed the function combineArrays() to combine 450k data with EPIC array data, for all minfi classes (RGChannelSet, (Genomic)MethylSet, (Genomic)RatioSet). See the documentation in minfi.

Note that for the analysis of the EPIC arrays with minfi, you will need to install the latest manifest and annotation packages:
IlluminaHumanMethylationEPICmanifest
IlluminaHumanMethylationEPICanno.ilm10b2.hg19
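
As a rough illustration of points 0), 1) and 3), the workflow might look like the sketch below. The directory paths are placeholders, and the choice to combine at the RGChannelSet level before normalizing is only one possible order; check ?combineArrays and ?preprocessNoob in your installed version before relying on it.

```r
# Sketch only: both directory paths are hypothetical placeholders.
library(minfi)

# Read all IDAT files found in each directory into an RGChannelSet.
rg_epic <- read.metharray.exp(base = "path/to/epic_idats")
rg_450k <- read.metharray.exp(base = "path/to/450k_idats")

# Combine the two platforms, keeping the probes they share.
# outType selects which array type the combined object is labelled as.
rg_combined <- combineArrays(rg_450k, rg_epic,
                             outType = "IlluminaHumanMethylation450k")

# Noob background correction, then extract beta values.
mset <- preprocessNoob(rg_combined)
beta <- getBeta(mset)
```

Labelling the combined object as a 450k array is an assumption made here for convenience; "IlluminaHumanMethylationEPIC" is the other documented value of outType.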

Cheers,

Jean-Philippe

ahend...@tgen.org

Jul 26, 2016, 7:05:39 PM
to Epigenomics forum
Hi,

I'm curious if the newest version of the User's Guide is still under construction. The version available on Bioconductor seems to be unfinished (http://bioconductor.org/packages/release/bioc/vignettes/minfi/inst/doc/minfi.pdf) ... Is it available elsewhere? The 2014 Tutorial has been very helpful, so I'd like to take a look at the new user's guide as well. I'll be sure to read through the preprint in the meantime.

Thanks!

Jean-Philippe Fortin

Jul 27, 2016, 8:31:07 AM
to Epigenomics forum
We are currently rewriting the User's Guide. Note that for the analysis of EPIC arrays, most of the functions behave much as they do for 450k arrays, so the 2014 Tutorial should still be helpful. For now, information on combining 450k data with EPIC data can be found in the preprint and in the reference manual entry for the function combineArrays().

JP

Adrienne Smith

Aug 1, 2016, 8:35:24 PM
to Epigenomics forum
Thanks for your reply.

I could use some clarification on a different point. In the 2014 tutorial it says,

"In our experience, running SVA after normalizing the 450K data with preprocessFunnorm or preprocessQuantile increases the statistical power of the downstream analysis."


In the paper, Functional normalization of 450k methylation array data improves replication in large cancer studies, it says,

"Although a normalization method, functional normalization is robust in the presence of a batch effect, and performs better than the batch removal tool SVA on our assessment datasets."


I realize it may just be a matter of case-by-case assessment, but I'm considering using the ChAMP package to apply the Combat method to my (Funnorm) normalized data from minfi and don't want to proceed if it's overkill. I wanted to get some advice before trying this because I'm new to analysis and I'll need to figure out how to handle taking the GenomicRatioSet class into the ChAMP Combat function that expects a list class instead. As a newbie it's not so simple :). Any help is appreciated.


Thanks,

Adrienne




Tim Triche, Jr.

Aug 2, 2016, 7:01:29 AM
to Epigenomics forum
You could always just use ComBat from the sva package and run that on the m-values (getM(grSet)) if that's your goal.

No particular need to use ChAMP (which as far as I can tell isn't maintained as vigorously as it once was anyhow).
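
A minimal sketch of that suggestion, assuming grSet is the GenomicRatioSet returned by preprocessFunnorm() and that the batch variable is stored in a phenotype column called Slide (a hypothetical name; substitute whatever records batch in your sample sheet):

```r
library(minfi)
library(sva)

# grSet is assumed to already exist (e.g. the output of preprocessFunnorm()).
m_values <- getM(grSet)

# Phenotype table; "Slide" is a hypothetical batch column.
pheno <- as.data.frame(pData(grSet))
batch <- pheno$Slide

# Intercept-only design. Add the covariates of interest to this formula
# if their effect should be preserved during the adjustment.
mod <- model.matrix(~ 1, data = pheno)

# Batch-adjusted M-values, same dimensions as m_values.
m_combat <- ComBat(dat = m_values, batch = batch, mod = mod)
```

If any M-values are non-finite, those probes may need to be filtered out before calling ComBat().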

Anthony Griswold

Sep 15, 2016, 3:58:59 PM
to Epigenomics forum
I'm running into an error when trying to create a MethylSet in minfi 1.19.13.

Below is a bit of code from when I try to create the set, along with the error.
Any suggestions as to what might be happening here?

> RGset
RGChannelSet (storageMode: lockedEnvironment)
assayData: 1052641 features, 4 samples 
  element names: Green, Red 
An object of class 'AnnotatedDataFrame'
  sampleNames: 200652310053_R06C01 200652310053_R07C01 200652310053_R08C01 200650230032_R01C01
  varLabels: Sample_ID Sample_Well ... filenames (8 total)
  varMetadata: labelDescription
Annotation
  array: IlluminaHumanMethylationEPIC
  annotation: ilm10b2.hg19

> Mset <- preprocessRaw(RGset)
Error in `assayDataElement<-`(`*tmp*`, "Meth", validate = FALSE, value = c(233,  : 
  unused argument (validate = FALSE)

Jean-Philippe Fortin

Sep 15, 2016, 4:02:39 PM
to Epigenomics forum
You need to install the latest devel version of Biobase (2.33.3)
Next time could you show us your sessionInfo() as well?

Kasper Daniel Hansen

Sep 15, 2016, 4:03:02 PM
to epigenom...@googlegroups.com
You're not using Biobase 2.33.2 or greater. I am guessing that you're installing minfi devel (1.19.13) into a stable Bioconductor. That won't work (it sometimes works, but not currently, and it is not supported).
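
One way to check for this kind of mismatch before running anything is sketched below. meets_minimum() is a hypothetical helper written for this example, not part of minfi or Biobase; the 2.33.2 threshold is the one mentioned above.

```r
# Hypothetical helper: TRUE when an installed version meets a minimum.
meets_minimum <- function(installed, required) {
  package_version(installed) >= package_version(required)
}

# Compare the installed Biobase (if any) against the required minimum.
if (requireNamespace("Biobase", quietly = TRUE)) {
  installed <- as.character(packageVersion("Biobase"))
  if (!meets_minimum(installed, "2.33.2")) {
    message("Biobase ", installed, " is older than 2.33.2; ",
            "minfi devel needs a matching devel Bioconductor.")
  }
}

# Include this output when reporting a problem.
print(sessionInfo())
```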

Jean-Philippe Fortin

Sep 15, 2016, 4:04:10 PM
to Epigenomics forum
Also note that we updated minfi yesterday (version 1.19.4) so that the estimateCellCounts() function works properly with EPIC data.