Processing problem with RmaCnPlm() for 338 samples

cstratowa

Jul 7, 2008, 7:39:47 AM
to aroma.affymetrix
Dear Henrik

Since last Friday my program has been executing the following code for
338 samples (Nsp+Sty):
> plm <- RmaCnPlm(cs, mergeStrands=TRUE, combineAlleles=TRUE, ...);
> fit(plm, verbose=verbose);

Here is an excerpt of the output:
Summarizing probe-level data...
Fitting model of class RmaCnPlm:...
RmaCnPlm:
Data set: CellsGSK
Chip type: Mapping250K_Nsp
Input tags: ACC,-XY,QN
Output tags: ACC,-XY,QN,RMA,A+B
Parameters: (probeModel: chr "pm"; shift: num 0; flavor: chr
"affyPLM"; treatNAsAs: chr "weights"; mergeStrands: logi TRUE;
combineAlleles: logi TRUE).
Path: plmData/CellsGSK,ACC,-XY,QN,RMA,A+B/Mapping250K_Nsp
RAM: 0.00MB
Identifying non-estimated units...
Identifying non-estimated units...done
Getting model fit for 262338 units.
Loading required package: preprocessCore
Number units per chunk: 591
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1617638 43.2 2403845 64.2 2251281 60.2
Vcells 10876232 83.0 53929391 411.5 87628739 668.6
Fitting chunk #1 of 444...
Units:
int [1:591] 1 2 3 4 5 6 7 8 9 10 ...
Fitting probe-level model...
List of 1
$ AFFX-5Q-123:List of 1
..$ AFFX-5Q-123:List of 6
.. ..$ theta : num [1:338] 851 364 455 665 458 ...
.. ..$ sdTheta : num [1:338] 1.03 1.03 1.03 1.03 1.03 ...
.. ..$ thetaOutliers: logi [1:338] FALSE FALSE FALSE FALSE FALSE
FALSE ...
.. ..$ phi : num [1:30] 0.865 0.730 0.718 1.209
1.048 ...
.. ..$ sdPhi : num [1:30] 1.01 1.01 1.01 1.01 1.01 ...
.. ..$ phiOutliers : logi [1:30] FALSE FALSE FALSE FALSE FALSE
FALSE ...
Fitting probe-level model...done
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1624244 43.4 2564037 68.5 2564037 68.5
Vcells 11394413 87.0 34514809 263.4 87628739 668.6
Storing probe-affinity estimates...
Storing probe-affinity estimates...done
Storing chip-effect estimates...
Array #1 of 338 ('1A2')...done
Array #2 of 338 ('22Rv1')...
Extracting estimates...
List of 1
$ AFFX-5Q-123:List of 1
..$ AFFX-5Q-123:List of 3
.. ..$ theta : num 364
.. ..$ sdTheta : num 1.03
.. ..$ thetaOutliers: logi FALSE
Extracting estimates...done
Updating file...
Updating units...
Encoding units...
Encoding units...done
Updating units...done
Updating file...done
Array #2 of 338 ('22Rv1')...done

etc
etc

Started: 20080704 16:52:59
Estimated time left: 10835.8min
ETA: 20080712 05:53:15
Fitting chunk #1 of 444...done
Fitting chunk #2 of 444...

etc

Started: 20080704 16:52:59
Estimated time left: 12229.2min
ETA: 20080714 09:31:41
Fitting chunk #55 of 444...done
Fitting chunk #56 of 444...

etc

Started: 20080704 16:52:59
Estimated time left: 10467.5min
ETA: 20080714 17:27:48
Fitting chunk #122 of 444...done
Fitting chunk #123 of 444...

As you can see, it is currently fitting chunk #123 of 444, and each
chunk takes about 50 min!
Thus it may be finished in about 1-2 weeks.
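
A back-of-the-envelope check of that estimate, as a minimal sketch. It uses only the figures quoted in this post (122 of 444 chunks done, roughly 50 min per chunk, and the logged "Estimated time left"); the variable names are illustrative.

```r
# Figures taken from the post above
chunksTotal <- 444
chunksDone  <- 122
minPerChunk <- 50        # rough per-chunk time as observed by the poster
loggedMinLeft <- 10467.5 # "Estimated time left" printed by fit()

# Remaining time if every remaining chunk takes about 50 min
chunksLeft <- chunksTotal - chunksDone
daysLeft   <- chunksLeft * minPerChunk / 60 / 24

# The same quantity according to the logged estimate
loggedDaysLeft <- loggedMinLeft / 60 / 24

cat(sprintf("%d chunks left: about %.1f days at 50 min/chunk, %.1f days per the log\n",
            chunksLeft, daysLeft, loggedDaysLeft))
```

Both figures fall inside the "1-2 weeks" range mentioned above.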

Looking with "top" at the R process, I realized that it uses only
about 0.3-5.0 %CPU although it uses only about 24 %MEM.
Do you know why it does not use 99 %CPU?

Looking at the source code, the time-consuming step seems to be:
      # Fit model using affyPLM code
      if (!is.null(w)) {
        fit <- .Call("R_wrlm_rma_default_model", y, psiCode, psiK, w,
                     PACKAGE=rlmPkg);
      } else {
        fit <- .Call("R_rlm_rma_default_model", y, psiCode, psiK,
                     PACKAGE=rlmPkg);
      }

Furthermore, I found the following mail:
https://stat.ethz.ch/pipermail/bioconductor/2007-September/019286.html
Could this be the reason?

What can I do to increase the processing time?

The sessionInfo looks as follows:
> sessionInfo()
R version 2.6.1 (2007-11-26)
i686-pc-linux-gnu

locale:
C

attached base packages:
[1] stats graphics grDevices utils datasets methods
base

other attached packages:
[1] biasnp_0.2.14 aroma.affymetrix_0.9.3
aroma.apd_0.1.3
[4] R.huge_0.1.5 affxparser_1.10.1
aroma.core_0.9.3
[7] sfit_0.1.5 aroma.light_1.8.1
digest_0.3.1
[10] matrixStats_0.1.2 R.rsp_0.3.4
R.cache_0.1.7
[13] R.utils_1.0.2 R.oo_1.4.3
R.methodsS3_1.0.1

loaded via a namespace (and not attached):
[1] rcompgen_0.1-17

Best regards
Christian

Henrik Bengtsson

Jul 7, 2008, 5:10:14 PM
to aroma-af...@googlegroups.com
Hi.

This doesn't look normal to me.

>
> Looking with "top" at the R process, I realized that it uses only
> about 0.3-5.0 %CPU although it uses only about 24 %MEM.
> Do you know why it does not use 99 %CPU?

Are you working toward a shared file system or a local disk? Working
with files over a shared file system can be a killer.

If you look at the verbose output, you can see after each chunk how
much time fit() spends in reading, fitting, and writing, respectively.
What values do you see?

>
> Looking at the source code, the time-consuming step seems to be:
> # Fit model using affyPLM code
> if (!is.null(w)) {
> fit <- .Call("R_wrlm_rma_default_model", y, psiCode, psiK, w,
> PACKAGE=rlmPkg);
> } else {
> fit <- .Call("R_rlm_rma_default_model", y, psiCode, psiK,
> PACKAGE=rlmPkg);
> }

I'm not sure what you mean by this: do you mean that you'd expect
this to be where you would spend most of the time, but you don't?

>
> Furthermore, I found the following mail:
> https://stat.ethz.ch/pipermail/bioconductor/2007-September/019286.html
> Could this be the reason?

It could be, but I doubt it. I don't think that kicks in that hard
already at 338 arrays. You could also try to compare with using
AvgCnPlm instead of RmaCnPlm and see if there is a big difference.

>
> What can I do to increase the processing time?

Run more things in the background ;) Seriously, as I mentioned in
earlier threads, handling the single-probe CN units separately by
immediately transferring their intensities to chip effects would speed up
things a lot. If done correctly, that will come down to copying
values from the input CEL files to the output chip-effect CEL files;
right now those are processed through the complete PLM fitting
mechanism.

Cheers

Henrik

cstratowa

Jul 8, 2008, 7:57:58 AM
to aroma.affymetrix

Dear Henrik

Thank you for your reply; see my comments below.

> This doesn't look normal to me.

Not to me either!

>
>
>
> > Looking with "top" at the R process, I realized that it uses only
> > about 0.3-5.0 %CPU although it uses only about 24 %MEM.
> > Do you know why it does not use 99 %CPU?
>
> Are you working toward a shared file system or a local disk?  Working
> with files over a shared file system can be a killer.

Yes, I am using a shared file system.
I have finally managed to run my program on a cluster, with
normalization running on one node first, and then using one node per
array for GLAD analysis; see my question:
http://groups.google.com/group/aroma-affymetrix/browse_thread/thread/1ce7b4621731e457#

However, the node used for normalization has only 1GB RAM, so this
could be one problem when using 338 samples.

>
> If you look at the verbose output, you can see after each chunk how
> much time fit() spends in reading, fitting, and writing, respectively.
> What values do you see?
>

I do not see any values, since this output appears only when all chunks
are finished, and this would take 2 weeks, so I canceled the job.
However, I did run another job with only 22 samples and for these the
output is:

Fitting chunk #27 of 27...done
Total time for all units across all 22 arrays: 5656.53s == 94.28min
Total time per unit across all 22 arrays: 0.02s/unit
Total time per unit and array: 1.08ms/unit & array
Total time for one array (238378 units): 4.29min = 0.07h
Total time for complete data set: 94.28min = 1.57h
Fraction of time spent on different tasks: Fitting: 7.3%, Reading:
62.7%, Writing: 29.6% (of which 71.24% is for encoding/writing chip-
effects), Explicit garbage collection: 0.4%
Fitting model of class RmaCnPlm:...done

As you see, most time is spent with reading/writing, nevertheless the
total time for one array was only 4.29min and not 50min as in the case
of 338 samples.

>
>
> > Looking at the source code, the time-consuming step seems to be:
> >       # Fit model using affyPLM code
> >       if (!is.null(w)) {
> >         fit <- .Call("R_wrlm_rma_default_model", y, psiCode, psiK, w,
> > PACKAGE=rlmPkg);
> >       } else {
> >         fit <- .Call("R_rlm_rma_default_model", y, psiCode, psiK,
> > PACKAGE=rlmPkg);
> >       }
>
> I'm not sure what you mean by this: do you mean that you'd expect
> this to be where you would spend most of the time, but you don't?
>

I was checking your source code if I could find where the time-
consuming step may be, but this seems not to be the place.

Meanwhile, I have found the following code:

> ProbeLevelModel.fit(..., ram=moreUnits, moreUnits=1)

where bytesPerChunk is hardcoded to be 100MB:

> bytesPerChunk <- 100e6;
> unitsPerChunk <- ram * bytesPerChunk / bytesPerUnit;

Would setting e.g. "moreUnits=10" increase the processing speed?
Does "moreUnits=10" need only 1GB RAM or more?

>
>
> > Furthermore, I found the following mail:
> > https://stat.ethz.ch/pipermail/bioconductor/2007-September/019286.html
> > Could this be the reason?
>
> It could be, but I doubt it.  I don't think that kicks in that hard
> already at 338 arrays.  You could also try to compare with using
> AvgCnPlm instead of RmaCnPlm and see if there is a big difference.
>
>
>
> > What can I do to increase the processing time?
>
> Run more things in the background ;)  Seriously, as I mentioned in
> earlier threads, handling the single-probe CN units separately by
> immediately transferring their intensities to chip effects would speed up
> things a lot.  If done correctly, that will come down to copying
> values from the input CEL files to the output chip-effect CEL files;
> right now those are processed through the complete PLM fitting
> mechanism.

This sounds interesting.
Could you be more specific, since I am afraid that I do not quite
understand what you mean?
Is there also a possibility for me to do things in parallel?

>
> Cheers
>
> Henrik
>

Best regards
Christian


Henrik Bengtsson

Jul 8, 2008, 3:22:34 PM
to aroma-af...@googlegroups.com
Hi.

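> I do not see any values, since this output appears only when all chunks
> are finished, and this would take 2 weeks, so I canceled the job.

Oh, I thought that was provided after each chunk, but I guess that is
only the ETA information.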

> However, I did run another job with only 22 samples and for these the
> output is:
>
> Fitting chunk #27 of 27...done
> Total time for all units across all 22 arrays: 5656.53s == 94.28min
> Total time per unit across all 22 arrays: 0.02s/unit
> Total time per unit and array: 1.08ms/unit & array
> Total time for one array (238378 units): 4.29min = 0.07h
> Total time for complete data set: 94.28min = 1.57h
> Fraction of time spent on different tasks: Fitting: 7.3%, Reading:
> 62.7%, Writing: 29.6% (of which 71.24% is for encoding/writing chip-
> effects), Explicit garbage collection: 0.4%
> Fitting model of class RmaCnPlm:...done
>
> As you see, most time is spent with reading/writing, nevertheless the
> total time for one array was only 4.29min and not 50min as in the case
> of 338 samples.

It is hard to compare these figures with the 338-array data set,
because of the different chunk sizes etc. However, even from the
above 22-array data set, it looks to me that you spend a fair bit of
time (62.7%) reading the data. If you compare this to what you get
when working toward a local drive, you'll probably see that the latter
is much faster. Note that the "writing" fraction is mostly spent in the
unwrapping/wrapping (encoding/writing) part, and not per se in the
actual writing to file. The writing to file is very roughly
(100-71.24)% of 29.6%.
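
The split of the "Writing" fraction can be worked out directly from the two percentages in the 22-array log. This is a minimal sketch; the variable names are illustrative.

```r
# Percentages from the 22-array timing summary quoted above
writingPct  <- 29.6    # share of total time reported as "Writing"
encodingPct <- 71.24   # share of "Writing" spent on encoding/writing chip effects

# Express both parts as a share of the total run time
encodingOfTotal  <- writingPct * encodingPct / 100
fileWriteOfTotal <- writingPct * (100 - encodingPct) / 100

cat(sprintf("Encoding: %.1f%% of total, remaining write time: %.1f%% of total\n",
            encodingOfTotal, fileWriteOfTotal))
```

So of the 29.6% reported as writing, only about 8.5% of the total run time is left for the actual write to file.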

Another reason why processing 338 arrays seems "exponentially" slower
than processing 22 arrays could be that there are more file-cache
misses when more files (and more disk space) have to be accessed in
each chunk. With 22 arrays, the file cache might get more read/write
cache hits.

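> This sounds interesting.
> Could you be more specific, since I am afraid that I do not quite
> understand what you mean?

See the thread 'Parallelizing the Total Copy Number Analysis (GWS6)
Options' and my reply on April 8, 2008.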

In order to do this, it is important to understand how the PLM
model(s) work. When you do, you will also understand that there
are no probe-affinity parameters to fit for the single-probe units, so
the whole modeling vanishes. This leads to a simple "model"
where the chip-effect estimate of a single-probe CN unit is identical
to the intensity of the CN probe. Thus, you can just copy the value
over regardless of PLM.
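
A small illustration of why the model collapses for single-probe units, as a sketch under stated assumptions: stats::medpolish() is used here as a stand-in for the robust fit (it is not the affyPLM routine called by RmaCnPlm), and the intensities are made-up example values.

```r
# Hypothetical intensities of one single-probe unit across five arrays
y <- c(851, 364, 455, 665, 458)

# Log-additive model log2(y_ij) = theta_i + phi_j with arrays as rows and
# probes as columns; a single-probe unit gives a one-column matrix
Y <- matrix(log2(y), ncol = 1)

# Median polish as a stand-in robust fit (ASSUMPTION: not the affyPLM code)
fit <- medpolish(Y, trace.iter = FALSE)

theta <- 2^(fit$overall + fit$row)  # chip effects on the intensity scale
phi   <- 2^fit$col                  # probe affinity on the intensity scale

print(cbind(intensity = y, chipEffect = theta))
print(phi)
```

With only one probe there is nothing left to estimate for the probe affinity, so the chip effects simply reproduce the observed intensities.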

> Is there also a possibility for me to do things in parallel?

Yes, but that should not be the first goal. The main gain will be the
skipping of the fitting mechanism for all CN probes (which are 50% of
the units on the new arrays).

/Henrik

Henrik Bengtsson

Jul 8, 2008, 5:15:41 PM
to aroma-af...@googlegroups.com
Forgot to reply to one of your questions:

On Tue, Jul 8, 2008 at 4:57 AM, cstratowa
<Christian...@vie.boehringer-ingelheim.com> wrote:
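> Would setting e.g. "moreUnits=10" increase the processing speed?
> Does "moreUnits=10" need only 1GB RAM or more?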

Use the 'ram' argument instead. The 'moreUnits' argument is obsolete
and kept only for backward compatibility.

The 'bytesPerChunk' is just a scaling constant. Don't consider it an
estimate of the number of bytes per chunk; at most it is a number
proportional to the number of bytes per chunk, but that depends a lot on
the fitting algorithm called, which varies with the PLM (and its
'flavor'). It also depends on the number of probes in a unit, which
depends on the specific CDF etc. So you cannot really predict the
number of bytes.

Anyway, if you use ram > 1, more units per chunk will be fitted, and
with ram < 1, fewer units per chunk will be fitted, than with the
default (ram = 1). So, if you see that you are not using all of your
RAM, you can increase the 'ram' argument. This will use more memory.
It will also decrease the I/O overhead of opening and closing files for
reading, because this is done fewer times when there are fewer chunks.
In each chunk, a larger portion of the CEL files is also
read/written, which will improve the I/O speed *per unit* as well (the
reason/details are in affxparser::updateCel()). However, I think
there are properties that also vary with hardware system and file
system. My gut tells me though that a larger value of 'ram' will
speed things up. There is probably an upper limit where the marginal
gain vanishes. Basically it is trial and error to find out. You're
welcome to post some summary data here, if you do a comparison.
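
How 'ram' changes the number of chunks can be sketched as follows. The formula is the one quoted earlier in this thread (unitsPerChunk <- ram * bytesPerChunk / bytesPerUnit). The value of 500 bytes per unit and array is an assumption inferred from the chunk sizes printed in the logs of this thread, not taken from the aroma.affymetrix source, and the helper names are hypothetical.

```r
# ASSUMPTION: bytesPerUnit = 500 bytes x number of arrays. This constant is
# inferred from the logged "Number units per chunk" values in this thread.
unitsPerChunk <- function(nbrOfArrays, ram = 1, bytesPerChunk = 100e6,
                          bytesPerUnitPerArray = 500) {
  bytesPerUnit <- bytesPerUnitPerArray * nbrOfArrays
  floor(ram * bytesPerChunk / bytesPerUnit)
}

# Number of chunks needed to cover all units (hypothetical helper)
nbrOfChunks <- function(nbrOfUnits, nbrOfArrays, ram = 1) {
  ceiling(nbrOfUnits / unitsPerChunk(nbrOfArrays, ram = ram))
}

# Compare with the logs: 338 arrays and 85 arrays at the default ram = 1
print(unitsPerChunk(338))        # units per chunk for 338 arrays
print(nbrOfChunks(262338, 338))  # chunks for 262338 units, 338 arrays
print(unitsPerChunk(85))         # units per chunk for 85 arrays
print(nbrOfChunks(262338, 85))   # chunks for 262338 units, 85 arrays

# Effect of a tenfold 'ram' on the 85-array run with 238378 units
print(nbrOfChunks(238378, 85, ram = 1))
print(nbrOfChunks(238378, 85, ram = 10))
```

Under this assumption the sketch reproduces the units-per-chunk and chunk counts reported in the thread, which suggests that the per-chunk memory budget is shared across all arrays, so more arrays mean fewer units per chunk and more file open/close cycles.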

Cheers

Henrik

cstratowa

Jul 9, 2008, 9:58:56 AM
to aroma.affymetrix
Dear Henrik

Thank you.

Meanwhile our sysadmin has included a server with 32GB RAM as one node
in the cluster, which I will use for the normalization step.

I will test different settings of the "ram" parameter, however, this
will take some time. I will let you know.

Currently, I am running 85 samples on the server node, but the output
is sometimes strange. When fitting the chunks, some arrays are
sometimes missing, e.g. Array #1 and Array #2. The output looks as
follows:

Quantile normalizing data set...done
Summarizing probe-level data...
Fitting model of class RmaCnPlm:...
RmaCnPlm:
Data set: CellsGSK1
Chip type: Mapping250K_Nsp
Input tags: ACC,-XY,QN
Output tags: ACC,-XY,QN,RMA,A+B
Parameters: (probeModel: chr "pm"; shift: num 0; flavor: chr
"affyPLM"; treatNAsAs: chr "weights"; mergeStrands: logi TRUE;
combineAlleles: logi TRUE).
Path: plmData/CellsGSK1,ACC,-XY,QN,RMA,A+B/Mapping250K_Nsp
RAM: 0.00MB
Identifying non-estimated units...
Identifying non-estimated units...done
Getting model fit for 262338 units.
Loading required package: preprocessCore
Number units per chunk: 2352
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1069207 57.2 1710298 91.4 1590760 85.0
Vcells 7358933 56.2 50290153 383.7 80445863 613.8
Fitting chunk #1 of 112...
Units:
int [1:2352] 1 2 3 4 5 6 7 8 9 10 ...
here are lots of nonprinting characters <del><del><del>
Updating file...done
Array #3 of 85 ('769P')...done
Array #4 of 85 ('A101D')...
Extracting estimates...
List of 1
$ AFFX-5Q-123:List of 1
..$ AFFX-5Q-123:List of 3
.. ..$ theta : num 807
.. ..$ sdTheta : num 1.03
.. ..$ thetaOutliers: logi FALSE
Extracting estimates...done

etc

Updating file...done
Array #64 of 85 ('REC1')...done
Array #65 of 85 ('RKOE6')...
Extracting estimates...
List of 1
$ AFFX-5Q-123:List of 1
..$ AFFX-5Q-123:List of 3
.. ..$ theta : num 1477
.. ..$ sdTheta : num 1.03
.. ..$ thetaOutliers: logi FALSE
Extracting estimates...done
Updating file...
Updating units...
Encoding units...
Encoding units...done
here are lots of nonprinting characters <del><del><del>
.. ..$ theta : num 643
.. ..$ sdTheta : num 1.03
.. ..$ thetaOutliers: logi FALSE
Extracting estimates...done
Updating file...
Updating units...
Encoding units...
Encoding units...done
Updating units...done
Updating file...done
Array #71 of 85 ('SJSA1')...done
Array #72 of 85 ('SKMEL1')...
Extracting estimates...

etc

Updating file...done
Array #81 of 85 ('SW684')...done
here are lots of nonprinting characters <del><del><del>
.. ..$ thetaOutliers: logi [1:85] FALSE FALSE FALSE FALSE FALSE
FALSE ...
.. ..$ phi : num [1:6] 1.445 1.838 1.305 0.834 0.429 ...
.. ..$ sdPhi : num [1:6] 1.02 1.02 1.02 1.02 1.02 ...
.. ..$ phiOutliers : logi [1:6] FALSE FALSE FALSE FALSE FALSE
FALSE
Fitting probe-level model...done
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1220265 65.2 1967602 105.1 1967602 105.1
Vcells 8084782 61.7 20598845 157.2 80445863 613.8
Storing probe-affinity estimates...
Storing probe-affinity estimates...done
Storing chip-effect estimates...
Array #1 of 85 ('5637')...done
Array #2 of 85 ('639V')...
Extracting estimates...

etc

Fitting chunk #2 of 112...done
Fitting chunk #3 of 112...
Units:
int [1:2352] 4705 4706 4707 4708 4709 4710 4711 4712 4713 4714 ...
Fitting probe-level model...
here are lots of nonprinting characters <del><del><del>
.. ..$ thetaOutliers: logi FALSE
Extracting estimates...done
Updating file...
Updating units...
Encoding units...
Encoding units...done
Updating units...done
Updating file...done
Array #7 of 85 ('ARH77')...done
Array #8 of 85 ('BC3')...
Extracting estimates...

etc

Started: 20080709 15:14:01
Estimated time left: 229.5min
ETA: 20080709 19:12:07
Fitting chunk #4 of 112...done
Fitting chunk #5 of 112...
Units:
int [1:2352] 9409 9410 9411 9412 9413 9414 9415 9416 9417 9418 ...
here are lots of nonprinting characters <del><del><del>
Updating units...done
Updating file...done
Array #3 of 85 ('769P')...done
Array #4 of 85 ('A101D')...
Extracting estimates...

etc

Started: 20080709 15:14:01
Estimated time left: 87.9min
ETA: 20080709 17:10:01
Fitting chunk #27 of 112...done
Fitting chunk #28 of 112...
Units:
int [1:2352] 63505 63506 63507 63508 63509 63510 63511 63512 63513
63514 ...
Fitting probe-level model...
List of 1
$ SNP_A-2254596:List of 1
..$ :List of 6
.. ..$ theta : num [1:85] 10578 16352 9235 10129
9693 ...
.. ..$ sdTheta : num [1:85] 1.07 1.07 1.07 1.07 1.08 ...
.. ..$ thetaOutliers: logi [1:85] FALSE FALSE FALSE FALSE FALSE
FALSE ...
.. ..$ phi : num [1:6] 0.629 0.655 0.785 1.412 1.344 ...
.. ..$ sdPhi : num [1:6] 1.02 1.02 1.02 1.02 1.02 ...
.. ..$ phiOutliers : logi [1:6] FALSE FALSE FALSE FALSE FALSE
FALSE
Fitting probe-level model...done
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1220921 65.3 1967602 105.1 1967602 105.1
Vcells 8050882 61.5 20598845 157.2 80445863 613.8
Storing probe-affinity estimates...
Storing probe-affinity estimates...done
Storing chip-effect estimates...
here are lots of nonprinting characters <del><del><del>
Array #3 of 85 ('769P')...
Extracting estimates...
List of 1
$ SNP_A-2254596:List of 1
..$ :List of 3
.. ..$ theta : num 9235
.. ..$ sdTheta : num 1.07
.. ..$ thetaOutliers: logi FALSE
Extracting estimates...done

etc

Do you know what might be the reason for this?
(Before fitting plm, I am doing quantile normalization, and this seems
to be ok.)

Best regards
Christian

Henrik Bengtsson

Jul 9, 2008, 10:45:32 AM
to aroma-af...@googlegroups.com
Hi.

On Wed, Jul 9, 2008 at 6:58 AM, cstratowa
<Christian...@vie.boehringer-ingelheim.com> wrote:
>
> Dear Henrik
>
> Thank you.
>
> Meanwhile our sysadmin has included a server with 32GB RAM as one node
> in the cluster, which I will use for the normalization step.
>
> I will test different settings of the "ram" parameter, however, this
> will take some time. I will let you know.
>
> Currently, I am running 85 samples on the server node, but the output
> is sometimes strange. When fitting the chunks, some arrays are
> sometimes missing, e.g. Array #1 and Array #2. The output looks as
> follows:

I have never seen this. Is the problem only in the verbose output? Are
the estimates still alright?

/Henrik

cstratowa

Jul 10, 2008, 4:16:28 AM
to aroma.affymetrix
Dear Henrik

Luckily this turned out to be an artifact. Since I am writing the
output to a text-file, it seems that I opened the incomplete text-file
just when the program tried to write, causing these artifacts.

Now that the computation has finished, I can give you the results of
fitting the PLM for 85 samples, when using a server with 32GB RAM as
cluster node:

Fitting model of class RmaCnPlm:...
...
Fitting chunk #102 of 102...done
Total time for all units across all 85 arrays: 5617.77s == 93.63min
Total time per unit across all 85 arrays: 0.02s/unit
Total time per unit and array: 0.277ms/unit & array
Total time for one array (238378 units): 1.10min = 0.02h
Total time for complete data set: 93.63min = 1.56h
Fraction of time spent on different tasks: Fitting: 12.6%, Reading:
25.5%, Writing: 60.6% (of which 83.67% is for encoding/writing chip-
effects), Explicit garbage collection: 1.2%
Fitting model of class RmaCnPlm:...done

As you see, fitting 85 samples needed only 1.5hrs, which is fine.
Nevertheless, I will try to increase the parameter "ram".
In comparison, fitting 22 samples on a typical cluster node needed
already 1.6hrs, while fitting 48 samples on a typical cluster node
needed 5.7hrs, although writing to the same network drive!

Thus the take-home message seems to be:
Do not use a typical cluster node (in our case 2 CPUs with 1GB RAM
shared memory) for PLM-fitting.
(First experience shows that even nodes with 4 CPUs and 8GB shared
memory may not be sufficient)
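
To compare runs with different numbers of arrays, the total times can be normalized per unit and array. This is a minimal sketch that uses only the totals and the unit count (238378) from the two timing summaries in this thread; the data frame layout and labels are illustrative.

```r
# Totals taken from the two timing summaries posted in this thread
runs <- data.frame(
  label    = c("22 arrays, 1GB node", "85 arrays, 32GB server"),
  arrays   = c(22, 85),
  totalSec = c(5656.53, 5617.77),
  units    = 238378
)

# Time per unit and array in milliseconds
runs$msPerUnitArray <- 1000 * runs$totalSec / (runs$units * runs$arrays)

# How many times slower the small node is per unit and array
slowdown <- runs$msPerUnitArray[1] / runs$msPerUnitArray[2]

print(runs)
cat(sprintf("Small node is about %.1fx slower per unit and array\n", slowdown))
```

The per-unit-and-array values agree with the 1.08ms and 0.277ms printed in the logs.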

One more piece of information:
The main reason for using a cluster was to run GLAD in parallel. In
this case the processing time was:
For 85 samples using Nsp-Sty, GLAD needed about 140min/node.
For 48 samples using only Sty, GLAD needed about 30min/node.

Best regards
Christian



cstratowa

Jul 15, 2008, 3:48:27 AM
to aroma.affymetrix
Dear Henrik

Meanwhile I did some testing with the "ram" parameter when using 85
samples:

For "ram=1.0" fitting is done in 102 chunks and I get the following
output:

Fitting model of class RmaCnPlm:...
etc
Fitting chunk #102 of 102...done
Total time for all units across all 85 arrays: 6202.19s == 103.37min
Total time per unit across all 85 arrays: 0.03s/unit
Total time per unit and array: 0.306ms/unit & array
Total time for one array (238378 units): 1.22min = 0.02h
Total time for complete data set: 103.37min = 1.72h
Fraction of time spent on different tasks: Fitting: 11.2%, Reading:
28.1%, Writing: 59.4% (of which 83.09% is for encoding/writing chip-
effects), Explicit garbage collection: 1.3%
Fitting model of class RmaCnPlm:...done

In contrast, for "ram=10.0" fitting is done in only 11 chunks and I
get the following output:

Fitting model of class RmaCnPlm:...
Fitting chunk #11 of 11...done
Total time for all units across all 85 arrays: 4328.53s == 72.14min
Total time per unit across all 85 arrays: 0.02s/unit
Total time per unit and array: 0.214ms/unit & array
Total time for one array (238378 units): 0.85min = 0.01h
Total time for complete data set: 72.14min = 1.20h
Fraction of time spent on different tasks: Fitting: 16.4%, Reading:
16.3%, Writing: 67.0% (of which 91.74% is for encoding/writing chip-
effects), Explicit garbage collection: 0.3%
Fitting model of class RmaCnPlm:...done

As you see, the total time drops by about 30% (103.37min to 72.14min)
and %Reading decreases as well.
According to the gse statistics output, memory usage increased only
from maxvmem=1.1GB to maxvmem=1.4GB.
Thus my feeling is that on a typical PC it would be better to use
"ram=10".
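
The gain can be quantified from the two timing summaries above. This is a minimal sketch using only the totals and percentages posted here; the variable names are illustrative.

```r
# Total times (minutes) from the two 85-sample runs above
minRam1  <- 103.37   # ram = 1.0, 102 chunks
minRam10 <- 72.14    # ram = 10.0, 11 chunks

# Fraction of wall time saved, and the equivalent speed factor
timeSaved   <- 1 - minRam10 / minRam1
speedFactor <- minRam1 / minRam10

# Absolute time spent reading, from the reported "Reading" percentages
readMinRam1  <- 0.281 * minRam1
readMinRam10 <- 0.163 * minRam10

cat(sprintf("Time saved: %.1f%%, speed factor: %.2fx\n",
            100 * timeSaved, speedFactor))
cat(sprintf("Reading: %.1f min (ram=1) vs %.1f min (ram=10)\n",
            readMinRam1, readMinRam10))
```

Most of the saving comes from reading, which is consistent with fewer chunks meaning fewer passes over the CEL files.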

Best regards
Christian
