Just to follow up on these comments here.
'fitPLM' with default parameters in the affyPLM package should give
practically identical results to the 'standard' pipeline (RMA bg corr
+ quantile + fit) within aroma.affymetrix, assuming the underlying
annotation is the same. This was an easy comparison back in the day
of 3' IVT arrays. Now, its a little more difficult.
If anyone is willing, I'd be keen to know if these two actually do
give the same results on the newer chips i.e. is the underlying
annotation the same? I seem to recall that because these newer chips
occasionally have probes that are shared amongst different probesets,
that the older style affy package would not be able to handle it. For
example, Jim MacDonald's post:
http://article.gmane.org/gmane.science.biology.informatics.conductor/19184
Jim says there that you "don't want to use affy" for these chips (not
100% sure why). He suggests the whole pdInfoBuilder/oligo thing which
at one time had some bugs, but is probably more stable now. I haven't
dug deeper as to whether the annotation that 'fitPLM' uses by default
('hugene10st.db' presumably?) matches the annotation that would be
used by aroma.affymetrix (the converted-to-binary 'unsupported' CDF
file). I know Mark Cowley did find some inconsistencies:
http://article.gmane.org/gmane.science.biology.informatics.conductor/19738
This makes me think that we may want to alternatively construct the
XXGene 1.0 CDFs (XX=Hu,Mo,Ra) directly from the PGF/CLF files instead
of from the unsupported CDF. I suspect that there will only be minor
changes, so I haven't looked too deeply into it.
Anyone want to check?
In addition to what Henrik says about flavour="affyPLM" ... for a lot
of my work, there is definitely additional value in using the
auxiliary information from the PLMs (i.e. weights, residuals) ... you
don't get this directly with oligo/median polish.
Few more specific notes ...
>> library(affy)
>> TestData <- ReadAffy()
>> TestEset <- rma(TestData)
>>
>> If you plot(AromaEset[,1],TestEset[,1]) you can visualize how
>> different the data is.
I assume you ensured the probesets and samples are in the same order?
(Or, is this somehow covered by the plot method ...) I can't tell
from this sequence of commands. I don't know what this plot looks
like, so I don't whether to be alarmed or not.
>> However, I noticed that my GeneST normalized data is quite different
>> from the data that I produce using the Affy package. When looking at
>> the controls on the array I see that the Aroma normalized data is
>> between 5-10% lower than that produced by the Bioconductor Affy
>> packages. However, for some probes this difference is can be quite
>> large (values are averaged across 16 samples):
>>
>> Probe Bionconductor Aroma
>> 10341096 (neg_control) 6079 852
>> 10341735 (neg_control) 25 87
>> 10340969 (pos_control) 3953 1758
>> 10338477 (pos_control) 293 611
>>
>> Ultimately, when my downstream analysis looks for differential
>> expression the differences between the two analysis approaches become
>> minimal, but I did notice that the Aroma package seems to call more
>> control probes and probes with no known gene as being differentially
>> expressed (i.e. it looks noisier). These probes were all pretty close
>> to my 2 fold cutoff, and at 3 fold or greater the data looked the
>> same.
Tough to really tell. You'd probably want to average on the log2
scale, not the linear scale. This could be a difference in median-
polish versus robust PLM, or could be a differnce in annotation.
Requires some digging.
Also, you may want to remove control probesets before doing DE
analysis --- for one thing, you pay a slightly lesser penalty for
multiple testing.
Cheers,
Mark
------------------------------
Mark Robinson
Epigenetics Laboratory, Garvan
Bioinformatics Division, WEHI
e: m.rob...@garvan.org.au
e: mrob...@wehi.edu.au
p: +61 (0)3 9345 2628
f: +61 (0)3 9347 0852
------------------------------
>
> >> SUMMARY:
>
> >> [aroma.affymetrix flavor='oligo'] = rma in affy, rma in oligo
>
> >> [aroma.affymetrix flavor='affyPLM'] = fitPLM in affyPLM
>
> >> ... where '=' means 'for all intents and purposes equivalent', not
> >> strictly equal.
>
> >> Cheers,
> >> Mark
>
> >> On 04/12/2008, at 7:43 AM, Henrik Bengtsson wrote:
>
> >>> Hi,
>
> >>> thanks for sharing all this.
>
> >>> On Tue, Dec 2, 2008 at 11:54 AM, pwhiteusa <pwhite...@gmail.com>
>
> >>>> setwd("P:\\ANNOTATION\\AffyAnnotation\\Mouse\\MoGene-1_0-st-v1")
> >>>> library(pdInfoBuilder)
> >>>> pgfFile <- "MoGene-1_0-st-v1.r3.pgf"
> >>>> clfFile <- "MoGene-1_0-st-v1.r3.clf"
> >>>> transFile <- "MoGene-1_0-st-v1.na26.mm9.transcript.csv"
> >>>> probeFile <- "MoGene-1_0-st-v1.probe.tab"
> >>>> pkg <- new("AffyGenePDInfoPkgSeed", author="Peter White",
> >>>> email="peter.wh...@nationwidechildrens.org", version="0.1.3",
> ...
>
> read more »