Can't guess EPIC array type in minfi

XY Zhang

Apr 6, 2017, 2:17:42 PM
to Epigenomics forum
I have some Illumina EPIC array data. When I read it with the read.metharray.exp function in the minfi package, the array type comes back as Unknown, so downstream steps such as detectionP cannot be applied.

> RGSet <- read.metharray.exp(base = baseDir, targets = targets, recursive = T)
> RGSet
RGChannelSet (storageMode: lockedEnvironment)
assayData: 1051943 features, 8 samples 
  element names: Green, Red 
An object of class 'AnnotatedDataFrame'
  sampleNames: 201194000210_R01C01 201194000210_R02C01 ...
    201194000210_R08C01 (8 total)
  varLabels: Array kexuID ... filenames (21 total)
  varMetadata: labelDescription
Annotation
  array: Unknown
  annotation: Unknown

> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] limma_3.30.13                                     
 [2] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.0
 [3] minfi_1.20.2                                      
 [4] bumphunter_1.14.0                                 
 [5] locfit_1.5-9.1                                    
 [6] iterators_1.0.8                                   
 [7] foreach_1.4.3                                     
 [8] Biostrings_2.42.1                                 
 [9] XVector_0.14.1                                    
[10] SummarizedExperiment_1.4.0                        
[11] GenomicRanges_1.26.4                              
[12] GenomeInfoDb_1.10.3                               
[13] IRanges_2.8.2                                     
[14] S4Vectors_0.12.2                                  
[15] Biobase_2.34.0                                    
[16] BiocGenerics_0.20.0                               

loaded via a namespace (and not attached):
 [1] genefilter_1.56.0        splines_3.3.3            lattice_0.20-34         
 [4] beanplot_1.2             rtracklayer_1.34.2       GenomicFeatures_1.26.3  
 [7] XML_3.98-1.5             survival_2.41-2          DBI_0.6                 
[10] BiocParallel_1.8.1       RColorBrewer_1.1-2       registry_0.3            
[13] rngtools_1.2.4           doRNG_1.6                matrixStats_0.51.0      
[16] plyr_1.8.4               stringr_1.2.0            pkgmaker_0.22           
[19] zlibbioc_1.20.0          codetools_0.2-15         memoise_1.0.0           
[22] biomaRt_2.30.0           AnnotationDbi_1.36.2     illuminaio_0.16.0       
[25] preprocessCore_1.36.0    Rcpp_0.12.10             xtable_1.8-2            
[28] openssl_0.9.6            base64_2.0               annotate_1.52.1         
[31] Rsamtools_1.26.1         digest_0.6.12            stringi_1.1.3           
[34] nor1mix_1.2-2            grid_3.3.3               GEOquery_2.40.0         
[37] quadprog_1.5-5           tools_3.3.3              bitops_1.0-6            
[40] magrittr_1.5             siggenes_1.48.0          RCurl_1.95-4.8          
[43] RSQLite_1.1-2            MASS_7.3-45              Matrix_1.2-8            
[46] data.table_1.10.4        httr_1.2.1               reshape_0.8.6           
[49] R6_2.2.0                 mclust_5.2.3             nlme_3.1-131            
[52] GenomicAlignments_1.10.1 multtest_2.30.0  

XY Zhang

Apr 6, 2017, 2:29:50 PM
to Epigenomics forum
I have installed the EPIC manifest and annotation packages and restarted RStudio, but I get the same result:

> RGSet
RGChannelSet (storageMode: lockedEnvironment)
assayData: 1051943 features, 8 samples 
  element names: Green, Red 
An object of class 'AnnotatedDataFrame'
  sampleNames: 201194000210_R01C01 201194000210_R02C01 ...
    201194000210_R08C01 (8 total)
  varLabels: Array kexuID ... filenames (21 total)
  varMetadata: labelDescription
Annotation
  array: Unknown
  annotation: Unknown
> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] IlluminaHumanMethylationEPICanno.ilm10b2.hg19_0.6.0
 [2] IlluminaHumanMethylationEPICmanifest_0.3.0         
 [3] limma_3.30.13                                      
 [4] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.0 
 [5] minfi_1.20.2                                       
 [6] bumphunter_1.14.0                                  
 [7] locfit_1.5-9.1                                     
 [8] iterators_1.0.8                                    
 [9] foreach_1.4.3                                      
[10] Biostrings_2.42.1                                  
[11] XVector_0.14.1                                     
[12] SummarizedExperiment_1.4.0                         
[13] GenomicRanges_1.26.4                               
[14] GenomeInfoDb_1.10.3                                
[15] IRanges_2.8.2                                      
[16] S4Vectors_0.12.2                                   
[17] Biobase_2.34.0                                     
[18] BiocGenerics_0.20.0                                

loaded via a namespace (and not attached):
 [1] mclust_5.2.3             base64_2.0               Rcpp_0.12.10            
 [4] lattice_0.20-34          Rsamtools_1.26.1         digest_0.6.12           
 [7] R6_2.2.0                 plyr_1.8.4               RSQLite_1.1-2           
[10] httr_1.2.1               zlibbioc_1.20.0          GenomicFeatures_1.26.3  
[13] data.table_1.10.4        annotate_1.52.1          Matrix_1.2-8            
[16] preprocessCore_1.36.0    splines_3.3.3            BiocParallel_1.8.1      
[19] stringr_1.2.0            RCurl_1.95-4.8           biomaRt_2.30.0          
[22] rtracklayer_1.34.2       multtest_2.30.0          pkgmaker_0.22           
[25] openssl_0.9.6            GEOquery_2.40.0          quadprog_1.5-5          
[28] codetools_0.2-15         matrixStats_0.51.0       XML_3.98-1.5            
[31] reshape_0.8.6            GenomicAlignments_1.10.1 MASS_7.3-45             
[34] bitops_1.0-6             grid_3.3.3               nlme_3.1-131            
[37] xtable_1.8-2             registry_0.3             DBI_0.6                 
[40] magrittr_1.5             stringi_1.1.3            genefilter_1.56.0       
[43] doRNG_1.6                nor1mix_1.2-2            RColorBrewer_1.1-2      
[46] siggenes_1.48.0          tools_3.3.3              illuminaio_0.16.0       
[49] rngtools_1.2.4           survival_2.41-2          AnnotationDbi_1.36.2    
[52] beanplot_1.2             memoise_1.0.0   

Tim Triche, Jr.

Apr 7, 2017, 1:37:56 PM
to Epigenomics forum
Can you make one of the IDATs available?

Kasper Daniel Hansen

Apr 7, 2017, 2:07:35 PM
to epigenom...@googlegroups.com
This is fixed in minfi devel on GitHub, which will be pushed to Bioconductor in a couple of days.

In the meantime, you can work around the problem by manually setting annotation(RGSet); look at the data in minfiDataEPIC for an example of the value.

Best,
Kasper

--
You received this message because you are subscribed to the Google Groups "Epigenomics forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to epigenomicsforum+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

XY Zhang

Apr 10, 2017, 10:53:34 AM
to Epigenomics forum

XY Zhang

Apr 10, 2017, 11:17:38 AM
to Epigenomics forum
Hi Kasper,

I tried annotation(RGSet) = c("IlluminaHumanMethylationEPICanno", "ilm10b2.hg19") and annotation(RGSet) = c("IlluminaHumanMethylationEPIC", "ilm10b2.hg19"), but both failed:

> getAnnotation(RGSet)
Error in .getAnnotationString(object@annotation) : 
  unable to get the annotation string for this object

It seems the annotation values I set are wrong for EPIC. What are the correct ones?


Thanks

Best,
Xinyu Zhang




Kasper Daniel Hansen

Apr 10, 2017, 11:21:04 AM
to epigenom...@googlegroups.com
It needs to be a named vector.  You need to add names
  c(array = "IlluminaHumanMethylationEPICanno", annotation = "ilm10b2.hg19")

Best,
Kasper
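
The failing and working assignments differ only in whether the vector elements are named. The following is a minimal base R sketch. It assumes, based on the package names in the sessionInfo() output earlier in the thread, that the manifest and annotation package names follow the patterns <array>manifest and <array>anno.<annotation>. package_names() is a hypothetical helper written for illustration, not a minfi function.

```r
# Unnamed vector, as in the failing attempts: nothing can be looked up
# by the names "array" or "annotation".
unnamed <- c("IlluminaHumanMethylationEPIC", "ilm10b2.hg19")
is.null(names(unnamed))

# Named vector, as suggested in the reply.
ann <- c(array = "IlluminaHumanMethylationEPIC", annotation = "ilm10b2.hg19")

# Hypothetical helper (NOT part of minfi): compose the package names,
# assuming the patterns <array>manifest and <array>anno.<annotation>.
package_names <- function(ann) {
  stopifnot(all(c("array", "annotation") %in% names(ann)))
  c(manifest   = paste0(ann[["array"]], "manifest"),
    annotation = paste0(ann[["array"]], "anno.", ann[["annotation"]]))
}

print(package_names(ann))

# With minfi loaded and an RGChannelSet named RGSet in the workspace:
# annotation(RGSet) <- ann
```

Under the same assumption, array = "IlluminaHumanMethylationEPICanno" would compose IlluminaHumanMethylationEPICannomanifest, which is not among the loaded packages. The value that matches the installed IlluminaHumanMethylationEPICmanifest is array = "IlluminaHumanMethylationEPIC".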


XY Zhang

Apr 10, 2017, 11:30:25 AM
to Epigenomics forum
Hi Kasper,

The assignment works now, but later steps still fail.

I used annotation(RGSet) = c(array = "IlluminaHumanMethylationEPIC", annotation = "ilm10b2.hg19") to make sure it finds the proper manifest package.

> RGSet
RGChannelSet (storageMode: lockedEnvironment)
assayData: 1051943 features, 8 samples 
  element names: Green, Red 
An object of class 'AnnotatedDataFrame'
  sampleNames: 201194000210_R01C01 201194000210_R02C01 ...
    201194000210_R08C01 (8 total)
  varLabels: Array kexuID ... filenames (21 total)
  varMetadata: labelDescription
Annotation
  array: IlluminaHumanMethylationEPIC
  annotation: ilm10b2.hg19

> detP <- detectionP(RGSet)
Error in r[TypeI.Red$AddressA, i] : subscript out of bounds


I think it may be because my array version (number of probes) differs slightly from the current manifest and annotation versions in Bioconductor, so the function cannot deduce the array type and detectionP also reports an error. I am not sure about this.


Best,
Xinyu

Kasper Daniel Hansen

Apr 10, 2017, 11:41:30 AM
to epigenom...@googlegroups.com
I cannot reproduce this error with the file(s) you sent.

Best,
Kasper


XY Zhang

Apr 10, 2017, 11:47:14 AM
to Epigenomics forum
So it is probably a problem with my R and minfi versions. I will update them and try again. Thank you!

Best,

Xinyu

XY Zhang

Apr 10, 2017, 10:13:24 PM
to Epigenomics forum
In the end, it was not a problem with the R or minfi version.

I got the following error when processing a larger set of samples:

...
201172520052_R01C01 "Unknown"                      "1051943"
201172520053_R01C01 "Unknown"                      "1051943"
201172520054_R01C01 "Unknown"                      "1051943"
201172540058_R01C01 "Unknown"                      "1051943"
201134840069_R01C01 "IlluminaHumanMethylationEPIC" "1052641"
201134840076_R01C01 "IlluminaHumanMethylationEPIC" "1052641"
201134840080_R01C01 "IlluminaHumanMethylationEPIC" "1052641"
201134840084_R01C01 "IlluminaHumanMethylationEPIC" "1052641"
201172220032_R01C01 "Unknown"                      "1051943"
201172220046_R01C01 "Unknown"                      "1051943"
201172220053_R01C01 "Unknown"                      "1051943"
201172550049_R02C01 "Unknown"                      "1051943"
201172520052_R02C01 "Unknown"                      "1051943"
201172520053_R02C01 "Unknown"                      "1051943"
201172520054_R02C01 "Unknown"                      "1051943"
201172540058_R02C01 "Unknown"                      "1051943"
201134840069_R02C01 "IlluminaHumanMethylationEPIC" "1052641"
201134840076_R02C01 "IlluminaHumanMethylationEPIC" "1052641"
201134840080_R02C01 "IlluminaHumanMethylationEPIC" "1052641"
201134840084_R02C01 "IlluminaHumanMethylationEPIC" "1052641"
...

[read.metharray] Creating data matrices ...
Error in read.metharray(files, extended = extended, verbose = verbose,  :
  [read.metharray] Trying to parse different IDAT files, of different size and type.
Calls: read.metharray.exp -> read.metharray
Execution halted


Some samples were annotated correctly and some were not; the ones that were not have a different probe count (1051943 rather than 1052641).

A similar problem is described at https://support.bioconductor.org/p/94226/

It does not seem to be related to array data quality, but the cause is still unknown.

Best,

Xinyu 
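
To see at a glance which samples fall into which probe-count group, the listing above can be grouped by count. This is a base R sketch using four rows copied from that listing. In practice the counts would come from the IDAT files themselves, and how they are extracted is left open here.

```r
# Probe counts for four samples, copied from the listing above.
n_probes <- c(
  "201172520052_R01C01" = 1051943,
  "201172220032_R01C01" = 1051943,
  "201134840069_R01C01" = 1052641,
  "201134840076_R01C01" = 1052641
)

# Group sample names by probe count. More than one group means the
# files differ in size, which is what the error message complains about.
batches <- split(names(n_probes), n_probes)
print(batches)
print(lengths(batches))
```

This is only a diagnostic. The resolution reported later in the thread is a newer minfi version.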

Kasper Daniel Hansen

Apr 10, 2017, 10:55:15 PM
to epigenom...@googlegroups.com
This problem is solved in minfi devel, certainly in minfi >= 1.21.6.
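
A quick way to check an installation against that version is to compare version objects in base R. This is a minimal sketch; the threshold 1.21.6 is the one named above, and 1.20.2 is the version from the sessionInfo() output earlier in the thread.

```r
# Minimum version named in the thread as containing the fix.
required <- package_version("1.21.6")

# Version reported in the sessionInfo() output earlier in the thread.
reported <- package_version("1.20.2")
print(reported >= required)   # FALSE: that install predates the fix

# Check the locally installed minfi, if there is one.
if (requireNamespace("minfi", quietly = TRUE)) {
  installed <- utils::packageVersion("minfi")
  if (installed >= required) {
    message("minfi ", as.character(installed), " is at or above ", as.character(required))
  } else {
    message("minfi ", as.character(installed), " is older than ", as.character(required))
  }
} else {
  message("minfi is not installed")
}
```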


XY Zhang

Apr 13, 2017, 6:24:49 PM
to Epigenomics forum
Thank you, Kasper. 

I found the fix in the minfi development version (https://bioconductor.org/packages/devel/bioc/html/minfi.html), and it works well with the force = T argument.
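
For reference, a sketch of what such a call might look like. It mirrors the read.metharray.exp call from the first post with force = TRUE added. The directory path is a placeholder, the block does nothing unless minfi is installed and the directory exists, and it assumes a minfi version that includes the force argument.

```r
# Hypothetical directory holding the IDAT files and the sample sheet.
baseDir <- "path/to/idat/files"

did_read <- FALSE
if (requireNamespace("minfi", quietly = TRUE) && dir.exists(baseDir)) {
  suppressPackageStartupMessages(library(minfi))

  # Sample sheet describing the arrays found under baseDir.
  targets <- read.metharray.sheet(baseDir)

  # Same call as in the first post, plus force = TRUE so that IDAT
  # files with different probe counts can be read together.
  RGSet <- read.metharray.exp(base = baseDir, targets = targets,
                              recursive = TRUE, force = TRUE)
  did_read <- TRUE

  # Check whether the array type is still reported as Unknown.
  print(annotation(RGSet))

  # Detection p-values, the step that failed earlier in the thread.
  detP <- detectionP(RGSet)
}
```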

Marie Loh

Oct 19, 2017, 4:38:41 AM
to Epigenomics forum
Hi all

I get the same error too:

> detP <- detectionP(RGSet)
Error in r[TypeI.Red$AddressA, i] : subscript out of bounds

My code was working previously, and I have just reinstalled the minfi package via biocLite, so I am not sure what the problem is.

Cheers,
Marie

Kasper Daniel Hansen

Oct 19, 2017, 9:15:08 AM
to Epigenomics forum
If you have the newest minfi library (and I might add it is not clear if this is true, since you don't tell us what version you have), I would also recreate the RGset by reading it again.  

Best,
Kasper


Marie Loh

Oct 22, 2017, 11:17:47 AM
to Epigenomics forum
Works now, thanks! 