Hi Mark,
The CDF broke the script - ideas? :)
> cdf <- AffymetrixCdfFile$fromChipType(chipType, tags="U-Ensembl49,G-Affy") #was core,U-...
> print(cdf)
AffymetrixCdfFile:
Path: annotationData/chipTypes/HuEx-1_0-st-v2
Filename: HuEx-1_0-st-v2,U-Ensembl49,G-Affy.cdf
Filesize: 44.04MB
File format: v4 (binary; XDA)
Chip type: HuEx-1_0-st-v2,U-Ensembl49,G-Affy
Dimension: 2560x2560
Number of cells: 6553600
Number of units: 22035
Cells per unit: 297.42
Number of QC units: 1
RAM: 0.00MB
>
> cs_osc_control <- AffymetrixCelSet$fromName("OSCControlClassification",cdf=cdf)
> print(cs_osc_control)
AffymetrixCelSet:
Name: OSCControlClassification
Tags:
Path: rawData/OSCControlClassification/HuEx-1_0-st-v2
Chip type: HuEx-1_0-st-v2,U-Ensembl49,G-Affy
Number of arrays: 67
Names: 10002002.1, 10003001.1, ..., huex_wta_thyroid_C
Time period: 2005-03-03 12:59:11 -- 2007-12-13 16:49:42
Total file size: 4231.77MB
RAM: 0.08MB
>
> bc_osc_control <- RmaBackgroundCorrection(cs_osc_control)
> csBC_osc_control <- process(bc_osc_control,verbose=verbose)
20080402 17:57:34|Background correcting data set...
20080402 17:57:34| Already background corrected
20080402 17:57:34|Background correcting data set...done
> qn_osc_control <- QuantileNormalization(csBC_osc_control, typesToUpdate="pm")
> print(qn_osc_control)
QuantileNormalization:
Data set: OSCControlClassification
Input tags: RBC
Output tags: RBC,QN
Number of arrays: 67 (4231.77MB)
Chip type: HuEx-1_0-st-v2,U-Ensembl49,G-Affy
Algorithm parameters: (subsetToUpdate: NULL, typesToUpdate: chr "pm", subsetToAvg: NULL, typesToAvg: chr "pm", .targetDistribution: NULL)
Output path: probeData/OSCControlClassification,RBC,QN/HuEx-1_0-st-v2
Is done: TRUE
RAM: 0.00MB
> # OSC control
> csN_osc_control <- process(qn_osc_control, verbose=verbose)
20080402 17:57:46|Quantile normalizing data set...
20080402 17:57:46| Already normalized
20080402 17:57:46|Quantile normalizing data set...done
> fit(plm_osc_control, verbose=verbose)
20080402 17:58:02|Fitting model of class ExonRmaPlm:...
ExonRmaPlm:
Data set: OSCControlClassification
Chip type: HuEx-1_0-st-v2,U-Ensembl49,G-Affy
Input tags: RBC,QN
Output tags: RBC,QN,RMA,merged
Parameters: (probeModel: chr "pm"; shift: num 0; flavor: chr "affyPLM"; treatNAsAs: chr "weights").
Path: plmData/OSCControlClassification,RBC,QN,RMA,merged/HuEx-1_0-st-v2
RAM: 0.00MB
20080402 17:58:02| Identifying non-estimated units...
20080402 17:58:02| Getting chip-effect set from data set...
20080402 17:58:02| Retrieving monocell CDF...
20080402 17:58:02| Monocell chip type: HuEx-1_0-st-v2,U-Ensembl49,G-Affy,monocell
20080402 17:58:02| Locating monocell CDF...
20080402 17:58:02| Pathname: annotationData/chipTypes/HuEx-1_0-st-v2/HuEx-1_0-st-v2,U-Ensembl49,G-Affy,monocell.CDF
20080402 17:58:02| Locating monocell CDF...done
20080402 17:58:02| Retrieving monocell CDF...done
20080402 17:58:02| Retrieving chip-effects from data set...
20080402 17:58:02| Data set: OSCControlClassification
20080402 17:58:02| Retrieving chip-effect #1 of 67 (10002002.1)...
20080402 17:58:02| Defining chip-effect file...
20080402 17:58:02| Pathname: plmData/OSCControlClassification,RBC,QN,RMA,merged/HuEx-1_0-st-v2/10002002.1,chipEffects.CEL
Error in list("fit(plm_osc_control, verbose = verbose)" = <environment>, :
[2008-04-02 17:58:02] Exception: The specified CDF structure is not compatible with the CEL file. The number of cells do not match: 327184 != 332929
at throw(Exception(...))
at throw.default("The specified CDF structure is not compatible with the CEL f
at throw("The specified CDF structure is not compatible with the CEL file. The
at setCdf.AffymetrixCelFile(res, cdf)
at setCdf(res, cdf)
at method(static, ...)
at clazz$fromDataFile(df, path = path, name = name, cdf = cdf, ..., verbose =
at method(static, ...)
at clazz$fromDataSet(dataSet = ds, path = getPath(this), cdf = cdfMono, verbos
at getChipEffectSet.ProbeLevelModel(this, verbose = verbose)
at NextMethod("getChipEffectSet", this, ...)
at getChipEffectSet.ExonRmaPlm(this, verbose = verbose)
at getChipEffectSet(this, verbose = verbose)
at findUnitsTodo.ProbeLevelModel(this, verbose = less(verbose))
at findUnitsTodo(this, verbose = less(verbose))
at fit.ProbeLevelModel(plm_osc_control, verb
[s2611722@ettin R_scripts]$ ls -al ../annotationData/chipTypes/HuEx-1_0-st-v2/HuEx-1_0-st-v2,U-Ensembl49,G-Affy*
-rw-r--r-- 1 s2611722 s2611722 46182259 Apr 2 14:13 ../annotationData/chipTypes/HuEx-1_0-st-v2/HuEx-1_0-st-v2,U-Ensembl49,G-Affy.cdf
-rw-rw-r-- 1 s2611722 s2611722 33329489 Apr 2 17:44 ../annotationData/chipTypes/HuEx-1_0-st-v2/HuEx-1_0-st-v2,U-Ensembl49,G-Affy,monocell.CDF
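
A quick sanity check on the two cell counts in the exception, as a sketch in base R only. Both numbers turn out to be perfect squares (572 x 572 and 577 x 577), which would be consistent with two square monocell layouts of different sizes. The log does not say which count belongs to the CDF and which to the CEL file, so that reading is an assumption. One plausible cause: the "Already background corrected" and "Already normalized" lines show that earlier results are being reused, and the plmData output path carries no CDF tag, so a chip-effect file written under the previous CDF could be what is being picked up here.

```r
# Cell counts copied from the exception message above.
n_cells <- c(327184, 332929)

# Side length of a square layout holding that many cells, if one exists.
side <- sqrt(n_cells)
is_square <- side == round(side)

print(data.frame(cells = n_cells, side = side, is_square = is_square))

# Difference between the two layouts, in cells.
cell_diff <- diff(n_cells)
print(cell_diff)
```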
-------------------------------------
Alistair Chalk, Ph.D.
Research Fellow
Systems Biology Program
The Eskitis Institute for Cell and Molecular Therapies
Griffith University
Brisbane Innovation Park
Don Young Road
Brisbane, QLD 4111, Australia
http://informatics-eskitis.griffith.edu.au
http://www.eskitis.org.au/research/systems/systems.html
Office: +61 (7) 373 54411
Fax: +61 (7) 373 54255
One clarification ...
> Two things to do ....
>
> 1. use force=TRUE in your 'fit' and 'process' calls ... that will
> start from scratch and overwrite the CEL files for the normalized
> data, chip effects, etc.
>
> 2. use tags ... with tags, you can process the same dataset with a
> bunch of different CDFs ... for example, the code below does BG
> adjustment, normalization on the 'main' set of probesets, but then
> does fits on the Ensembl 47 or 49 and stores the results in
> separate locations ...
I should have said: do 1 OR 2; you don't need to do both. There is
no need to do 1 unless you really want to overwrite your previous
analysis. Option 2 is probably preferable, since it gives you more
flexibility (i.e. you can run separate analyses without
overwriting anything).
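
Why option 2 avoids the clash can be sketched in base R. The helper below only mimics the directory pattern visible in the printouts above (root, then the data set name and tags joined by commas, then the chip type); it is not aroma.affymetrix code. The tag names "Ensembl47" and "Ensembl49" are hypothetical. In aroma.affymetrix itself the extra tag would presumably be passed through a `tags` argument when the model is set up. That is an assumption, since the code referred to in the quoted text is not included in this message.

```r
# Build an output directory following the pattern seen in the log:
#   <root>/<data set name>,<tags>/<chip type>
# Illustrative helper only; the name and signature are made up here.
output_path <- function(root, name, tags, chip_type) {
  fullname <- paste(c(name, tags), collapse = ",")
  file.path(root, fullname, chip_type)
}

base_tags <- c("RBC", "QN", "RMA", "merged")

# Path shown in the ExonRmaPlm printout above (no CDF-specific tag).
current <- output_path("plmData", "OSCControlClassification",
                       base_tags, "HuEx-1_0-st-v2")

# One extra, hypothetical tag per CDF gives each fit its own directory.
ens47 <- output_path("plmData", "OSCControlClassification",
                     c(base_tags, "Ensembl47"), "HuEx-1_0-st-v2")
ens49 <- output_path("plmData", "OSCControlClassification",
                     c(base_tags, "Ensembl49"), "HuEx-1_0-st-v2")

print(c(current, ens47, ens49))
```

With distinct directories, a fit against one CDF never reads chip-effect files written for another. Option 1 instead keeps the single path and overwrites what is there, via `force=TRUE` in the `process` and `fit` calls.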
Mark