Re: Problems with the new Calvin CEL file format (Was: Re: [aroma.affymetrix] Re: zero-length argument error)


Henrik Bengtsson

May 10, 2007, 8:37:20 PM
to aroma-af...@googlegroups.com
Hi.

Ok. First, good that the patch for the timestamp works. I will soon
release a new version of aroma.affymetrix etc. that fixes the recent
problems as well as the problems with R v2.5.0.

Second, the CEL files seem to be in the new/upcoming binary file
format called "Calvin". This is probably why the timestamp is lost -
I have to check whether this new file format stores a timestamp at
all.

Third, I have not done any analysis with aroma.affymetrix on this new
file format. However, it should support *reading* data from Calvin
files, but it does *not* support writing/updating Calvin files, and
that is where the problem is. Your error message occurs when
aroma.affymetrix first copies a (Calvin) CEL file and then tries to
update it (which is only supported for standard binary (XDA) CEL
files).

I do not see that the package will support writing Calvin CEL files
anytime soon. A workaround for you is to convert all Calvin CEL files
into regular binary CEL files using convertCel() available in
affxparser. You only have to do this once for your raw data. I
haven't tried converting Calvin CEL files, only ASCII CEL files.
However, I don't think you should have any problems.
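
A minimal sketch of this workaround, assuming the raw files sit in one
directory and the converted copies should go to a separate one.
convertCel() is the affxparser function named above; the helper names
xdaPathnames() and convertAll() and the output directory are made up
for illustration. Note that, as reported further down in this thread,
convertCel() in affxparser 1.7.5 failed on these particular Calvin
files, so treat this as untested for Calvin input.

```r
# Sketch only: convert every CEL file in one directory into XDA
# (version 4) binary CEL files written to a separate directory.
# 'xdaPathnames' and 'convertAll' are hypothetical helper names.

# Map input pathnames to output pathnames located in 'outPath'
xdaPathnames <- function(pathnames, outPath) {
  file.path(outPath, basename(pathnames))
}

convertAll <- function(inPath, outPath, verbose = FALSE) {
  # Anchor the pattern so that only files ending in .CEL/.cel match
  pathnames <- list.files(inPath, pattern = "[.](CEL|cel)$",
                          full.names = TRUE)
  outPathnames <- xdaPathnames(pathnames, outPath)
  dir.create(outPath, recursive = TRUE, showWarnings = FALSE)
  for (kk in seq_along(pathnames)) {
    # Skip files that were already converted in a previous run
    if (file.exists(outPathnames[kk])) next
    affxparser::convertCel(pathnames[kk], outPathnames[kk],
                           verbose = verbose)
  }
  invisible(outPathnames)
}
```

For example, convertAll("rawData/tissues/HuGene-1_0-st-v1",
"rawData/tissues,XDA/HuGene-1_0-st-v1"), where the data set name
"tissues,XDA" is only a suggestion. Writing to a separate directory
keeps the originals untouched and avoids matching the converted files
again on a second run.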

Could you please point me to an Affymetrix URL where I can download
these Calvin CEL files? The first solution for aroma.affymetrix is
probably to detect Calvin files and, instead of copying them
(as tried below), use convertCel(). That should be fairly
transparent to the user, except that it takes extra time.
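
A rough sketch of such a detection step, in base R only. It relies on
the leading bytes of each format: the Calvin dump quoted below starts
with byte 59 (";") followed by version byte 1, XDA binary CEL files
start with the 32-bit little-endian integer 64, and ASCII CEL files
start with the text "[CEL]". The function name guessCelFormat() is
hypothetical and not part of aroma.affymetrix or affxparser.

```r
# Hypothetical helper: guess the CEL file format from the first
# five bytes of the file.
guessCelFormat <- function(pathname) {
  con <- file(pathname, open = "rb")
  on.exit(close(con))
  bytes <- readBin(con, what = "raw", n = 5L)
  if (length(bytes) < 5L) return("unknown")
  # Calvin (AGCC): magic byte 59 followed by version byte 1
  if (bytes[1] == as.raw(59) && bytes[2] == as.raw(1)) return("calvin")
  # XDA (GCOS) binary: little-endian 32-bit integer 64
  if (identical(bytes[1:4], as.raw(c(64, 0, 0, 0)))) return("xda")
  # ASCII: the file starts with the section header "[CEL]"
  if (identical(bytes, charToRaw("[CEL]"))) return("ascii")
  "unknown"
}
```

A caller could then copy files reported as "xda" and route files
reported as "calvin" through a conversion step instead.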

Cheers

Henrik

On 5/10/07, Mark Robinson <mark.rob...@gmail.com> wrote:
> yep, here you go. see below. i dunno whether ascii/binary ... these are
> downloaded from affy website.
>
> another problem with the 'process' call for this ... see below. I haven't
> upgraded to R 2.5 yet ...
>
> > getTimestamp(cf)
> [1] NA
> > print(cf)
> AffymetrixCelFile:
> Name: TisMap_Brain_01_v1_WTGene1
> Tags:
> Pathname:
> rawData/tissues/HuGene-1_0-st-v1/TisMap_Brain_01_v1_WTGene1.CEL
> File size: 10.56MB
> RAM: 0.00MB
> Chip type: HuGene-1_0-st-v1
> Timestamp: NA
> >
> > print(readChar(getPathname(cf), n=1500))
> [1]
> ";\001\0\0\0\001\0\0\031\030\0\0\0\033affymetrix-calvin-intensity\0\0\060000023828-1176327457-0000006334-0000018467-0000000041\0\0\0\0\0\0\0\005\0e\0n\0-\0U\0S\0\0\0#\0\0\0\017\0p\0r\0o\0g\0r\0a\0m\0-\0c\0o\0m\0p\0a\0n\0y\0\0\0
> \0A\0f\0f\0y\0m\0e\0t\0r\0i\0x\0,\0
> \0I\0n\0c\0.\0\0\0\n\0t\0e\0x\0t\0/\0p\0l\0a\0i\0n\0\0\0\f\0p\0r\0o\0g\0r\0a\0m\0-\0n\0a\0m\0e\0\0\0*\0D\0a\0t\0a\0
> \0E\0x\0c\0h\0a\0n\0g\0e\0
> \0C\0o\0n\0s\0o\0l\0e\0\0\0\n\0t\0e\0x\0t\0/\0p\0l\0a\0i\0n\0\0\0\n\0p\0r\0o\0g\0r\0a\0m\0-\0i\0d\0\0\0\022\00\0.\00\0.\00\0.\05\04\01\0\0\0\n\0t\0e\0x\0t\0/\0p\0l\0a\0i\0n\0\0\0\026\0a\0f\0f\0y\0m\0e\0t\0r\0i\0x\0-\0s\0y\0s\0t\0e\0m\0-\0t\0y\0p\0e\0\0\0\022\0R\0U\0O\0
> \0B\0E\0T\0A\03\0\0\0\n\0t\0e\0x\0t\0/\0p\0l\0a\0i\0n\0\0\0\027\0a\0f\0f\0y\0m\0e\0t\0r\0i\0x\0-\0f\0i\0l\0e\0-\0c\0r\0e\0a\0t\0o\0r\0\0\0\0\0\0\0\n\0t\0e\0x\0t\0/\0p\0l\0a\0i\0n\0\0\0%\0a\0f\0f\0y\0m\0e\0t\0r\0i\0x\0-\0a\0l\0g\0o\0r\0i\0t\0h\0m\0-\0p\0a\0r\0a\0m\0-\0P\0e\0r\0c\0e\0n\0t\0i\0l\0e\0\0\0\00275\0\0\0\n\0t\0e\0x\0t\0/\0a\0s\0c\0i\0i\0\0\0%\0a\0f\0f\0y\0m\0e\0t\0r\0i\0x\0-\0a\0l\0g\0o\0r\0i\0t\0h\0m\0-\0p\0a\0r\0a\0m\0-\0C\0e\0l\0l\0M\0a\0r\0g\0i\0n\0\0\0\0014\0\0\0\n\0t\0e\0x\0t\0/\0a\0s\0c\0i\0i\0\0\0&\0a\0f\0f\0y\0m\0e\0t\0r\0i\0x\0-\0a\0l\0g\0o\0r\0i\0t\0h\0m\0-\0p\0a\0r\0a\0m\0-\0O\0u\0t\0l\0i\0e\0r\0H\0i\0g\0h\0\0\0\0051.500\0\0\0\n\0t\0e\0x\0t\0/\0a\0s\0c\0i\0i\0\0\0%\0a\0f\0f\0y\0m\0e\0t\0r\0i\0x\0-\0a\0l\0g\0o\0r\0i\0t\0h\0m\0-\0p\0a\0r\0a\0m\0-\0O\0u\0t\0l\0i\0e\0r\0L\0o\0w\0\0\0\0051.004\0\0\0\n\0t\0e\0x\0t\0/\0a\0s\0c\0i\0i\0\0\0%\0a\0f\0f\0y\0m\0e\0t\0r\0i\0x\0-\0a\0l\0g\0o\0r\0i\0t\0h\0m\0-\0p\0a\0r\0a\0m\0-\0A\0l\0g\0V\0e\0r\0s\0i\0o\0n\0\0\0\0036.0\0\0\0\n\0t\0e\0x\0t\0/\0a\0s\0c\0i\0i\0\0\0(\0a\0f\0f\0y\0m\0e\0t\0r\0i\0x\0-\0a\0l\0g\0o\0r\0i\0t\0h\0m\0-\0p\0a\0r\0a\0m\0-\0F\0i\0x\0e\0d\0C\0e\0l\0l\0S\0i\0z\0e\0\0\0\004TRUE\0\0\0\n\0t\0e\0x\0t\0/\0a\0s\0c\0i\0i\0\0\0+\0a\0f\0f\0y\0m\0e\0t\0r\0i\0x\0-\0a\0l\0g\0o\0r\0i\0t\0h\0m\0-\0p\0a\0r\0a\0m\0-\0F\0u\0l\0l\0F\0e\0a\0t\0u\0r\0e\0W\0i\0d\0t\0h\0\0\0\0017\0\0\0\n\0t\0e\0x\0t\0/\0a\0s\0c\0i\0i\0\0\0,\0a\0f\0f\0y\0m\0e\0t\0r\0i\0x\0-\0a\0l\0g\0o\0r\0i\0t\0h\0m\0-\0p\0a\0r\0a\0m\0-\0F\0u\0l\0l\0F\0e\0a\0t\0u\0r\0e\0H\0e\0i\0g\0h\0t\0\0\0\0017\0\0\0\n\0t\0e\0x\0t\0/\0a\0s\0c\0i\0i\0\0\04\0a\0f\0f\0y\0m\0e\0t\0r\0i\0x\0-\0a\0l\0g\0o\0r\0i\0t\0"
> >
>
>
> > csN <- process(qn, verbose=verbose) #time required
> 20070511 10:16:17|Quantile normalizing data set...
> 20070511 10:16:17| Retrieving target distribution...
> 20070511 10:16:17| Getting target distribution...
> 20070511 10:16:17| Was specified or cached in-memory.
> num [1:788012] 24.7 25.4 25.6 25.8 26.0 ...
> 20070511 10:16:17| Getting target distribution...done
> 20070511 10:16:17| Retrieving target distribution...done
> 20070511 10:16:18| Normalizing data towards target distribution...
> 20070511 10:16:18| Identifying the probes to be updated...
> 20070511 10:16:18| Identifying the probes to be updated...done
> 20070511 10:16:18| Normalizing 788012 probes
> 20070511 10:16:18| Normalizing 33 arrays...
> 20070511 10:16:18| Array #1...
> AffymetrixCelFile:
> Name: TisMap_Brain_01_v1_WTGene1
> Tags:
> Pathname:
> rawData/tissues/HuGene-1_0-st-v1/TisMap_Brain_01_v1_WTGene1.CEL
> File size: 10.56MB
> RAM: 0.01MB
> Chip type: HuGene-1_0-st-v1
> Timestamp: NA
> 20070511 10:16:18| Reading probe intensities...
> 20070511 10:16:20| Reading probe intensities...done
> 20070511 10:16:20| Normalizing to empirical target distribution...
> 20070511 10:16:21| Normalizing to empirical target distribution...done
> 20070511 10:16:21| Writing normalized probe signals...
> 20070511 10:16:21| Creating CEL file for results, if missing...
> 20070511 10:16:21| Creating CEL file...
> 20070511 10:16:21| Chip type: HuGene-1_0-st-v1
> 20070511 10:16:21| Pathname:
> probeData/tissues,QN/HuGene-1_0-st-v1/TisMap_Brain_01_v1_WTGene1.CEL
> 20070511 10:16:21| Method: copy
> Error in list("process(qn, verbose = verbose)" = <environment>,
> "process.QuantileNormalization(qn, verbose = verbose)" = <environment>, :
>
> [2007-05-11 10:16:33] Exception: Cannot create CEL file of version
> 4(probeData/tissues,QN/HuGene-1_0-st-v1/TisMap_Brain_01_v1_WTGene1.CEL).
> Template CEL file has version 1:
> rawData/tissues/HuGene-1_0-st-v1/TisMap_Brain_01_v1_WTGene1.CEL
> at throw(Exception(...))
> at throw.default("Cannot create CEL file of version ", version, "(",
> pathname,
> at throw("Cannot create CEL file of version ", version, "(", pathname,
> "). Tem
> at createFrom.AffymetrixCelFile(this, filename = pathname, path = NULL,
> verbos
> at createFrom(this, filename = pathname, path = NULL, verbose =
> less(verbose))
> at normalizeQuantile.AffymetrixCelFile(df, path = path,
> subsetToUpdate = subse
> at normalizeQuantile(df, path = path, subsetToUpdate = subsetToUpdate,
> typesTo
> at normalizeQuantile.AffymetrixCelSet(NA, path =
> "probeData/tissues,QN/HuGene-
> at normalizeQuantile(NA,
> 20070511 10:16:45| Array #1...done
> 20070511 10:16:45| Normalizing 33 arrays...done
> 20070511 10:16:45| Normalizing data towards target distribution...done
> 20070511 10:16:45|Quantile normalizing data set...done
>
>
> > sessionInfo()
> R version 2.4.0 (2006-10-03)
> x86_64-unknown-linux-gnu
>
> locale:
> LC_CTYPE=en_AU.UTF-8;LC_NUMERIC=C;LC_TIME=en_AU.UTF-8;LC_COLLATE=en_AU.UTF-8;LC_MONETARY=en_AU.UTF-8;LC_MESSAGES=en_AU.UTF-8;LC_PAPER=en_AU.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_AU.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] "splines" "tools" "methods" "stats" "graphics" "grDevices"
> [7] "utils" "datasets" "base"
>
> other attached packages:
> aroma.affymetrix digest aroma.apd R.huge
> "0.4.9" "0.2.3" "0.1.3" "0.1.2"
> aroma.light affyPLM gcrma matchprobes
> "1.1.0" "1.11.13" "2.6.0" "1.6.0"
> affydata affy affyio Biobase
> "1.10.0" "1.12.0" "1.2.0" "1.12.2"
> affxparser R.native R.rsp R.cache
> "1.7.5" " 0.1.2" "0.3.1" "0.1.4"
> R.utils R.oo
> "0.8.8" "1.2.6"
>

Mark Robinson

May 10, 2007, 8:44:21 PM
to aroma-af...@googlegroups.com

http://www.affymetrix.com/support/technical/sample_data/hugene_1_0_array_data.affx

OK, good find.  I'll get the convertCels running.

Mark

Mark Robinson

May 10, 2007, 8:53:19 PM
to aroma-af...@googlegroups.com
> f<-dir("rawData/tissues/HuGene-1_0-st-v1/","CEL",full=T)
> for(i in seq(along=f)) {
+ nfn<-sub("WTGene1","WTGene1_XDA",f[i])
+ convertCel(f[i],nfn)
+ }
Error in sprintf(fmt, ...) : zero-length argument
In addition: Warning message:
Coercing LHS to a list

Does the same thing need to  be applied to the convertCel function?

Thanks,
Mark




Henrik Bengtsson

May 10, 2007, 9:17:14 PM
to aroma-af...@googlegroups.com
Hi,

yes, it could be related to the fact that the 'header' field of the
Calvin CEL header is empty. Try to use convertCel(..., verbose=TRUE).
However, the solution/patch for that will be different, since this is
unrelated to aroma.affymetrix. I'll look into the problem when I have
time. See if you can find any CEL converters on the Affymetrix
website; e.g. the 'apt-cel-transformer' tool, part of the Affymetrix
Power Tools [http://www.affymetrix.com/support/developer/tools/devnettools.affx],
might be able to do a "one-to-one transformation" of a CEL file...
maybe.
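
To check the suspicion about the empty 'header' field, something like
the following could be used. It assumes that the list returned by
affxparser's readCelHeader() has a 'header' element;
isHeaderFieldEmpty() is a hypothetical helper introduced here, not an
affxparser function.

```r
# Hypothetical helper: TRUE if the 'header' element of a CEL header
# list is missing, zero-length, NA, or contains only empty strings.
isHeaderFieldEmpty <- function(hdr) {
  h <- hdr[["header"]]
  if (is.null(h) || length(h) == 0L) return(TRUE)
  h <- as.character(h)
  all(is.na(h) | !nzchar(h))
}

# Intended use (requires affxparser and a CEL file; not run here):
# hdr <- affxparser::readCelHeader(pathname)
# isHeaderFieldEmpty(hdr)
```

If this returns TRUE for the Calvin files and FALSE for XDA files of
the same arrays, that would support the explanation above, though it
would not by itself prove the cause of the sprintf() error.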

/Henrik

Mark Robinson

May 10, 2007, 10:22:46 PM
to aroma-af...@googlegroups.com
Didn't have much luck with the Windoze converter on those files either.

They do have the 'GCOS' CEL files for this dataset, which is the XDA format. I've now downloaded these and used them so far with no problems.

Sorry for the hassle.
Mark




Kasper Daniel Hansen

May 11, 2007, 1:49:39 PM
to aroma.affymetrix
According to the description on that page, the zip files should
contain xda type cel files... Where exactly are you finding the calvin
files? (Please write the exact file name)

Kasper

On May 10, 5:44 pm, "Mark Robinson" <mark.robinson...@gmail.com>
wrote:
> http://www.affymetrix.com/support/technical/sample_data/hugene_1_0_ar...

Mark Robinson

May 12, 2007, 6:31:47 AM
to aroma.affymetrix
On May 12, 3:49 am, Kasper Daniel Hansen <kasperdanielhan...@gmail.com> wrote:
> According to the description on that page, the zip files should
> contain xda type cel files... Where exactly are you finding the calvin
> files? (Please write the exact file name)


Hi Kasper.

There are actually links to both types of CEL files from that link
(http://www.affymetrix.com/support/technical/sample_data/hugene_1_0_array_data.affx)
... the AGCC (Calvin) and the GCOS (XDA).

Had I read Elizabeth's post
(http://groups.google.com.au/group/aroma-affymetrix/browse_thread/thread/678c7628abd92f45/?hl=en#)
more thoroughly, I would've known not to use the AGCC type at this stage.

Cheers,
Mark
