Processing/Normalization Speed Optimization

Justin Lin

Jun 13, 2022, 10:57:26 PM
to Cardinal MSI Help
Hello everyone,

I have been performing RMS normalization using the following functions, but it usually takes about 5 hours for the processing to complete for a dataset with about 7000 pixels.
  • readMSIData(file, attach.only = TRUE, mass.range = c(80, 200), resolution = 0.01, units = "mz")
  • normalize(file, method = "rms")
  • process(file, method = "rms")
I would love to increase the resolution as well. Any insights into optimization, or alternative ways of performing normalization, would be greatly appreciated!

Thank you so much!

Justin 

Giulia Ricciardi

Jun 14, 2022, 4:08:32 AM
to Cardinal MSI Help
Hi Justin,
Try the following code:
  • data <- readMSIData(file, attach.only = TRUE, mass.range = c(80, 200), resolution = 0.01, units = "mz")
  • data_norm <- normalize(data, method = "rms")
  • data_proc <- process(data_norm)

You can also try with this:

file_proc <- file %>%
  normalize(method = "rms") %>%
  process()

Usually, normalization is performed after some preprocessing steps such as m/z alignment, peak picking, and peak alignment. If you do some preprocessing before normalizing, the file should be easier for your computer to process.

I am used to working with slow-performing computers, but 5 hours is really A LOT of time and it shouldn't be like that! 
To reduce the 'heaviness' of my files, I selected only the area of interest and created a subset of the original file. This way I removed all the image background, which was uninformative and took up a lot of space and processing time. Then I did all the preprocessing and normalization on that subset.

Hope this helps!

Giulia


Andrew Goodenough

Jun 14, 2022, 11:27:18 AM
to Cardinal MSI Help
Hi Justin,

If you haven't yet, before doing any processing try
  • setCardinalBPPARAM(BPPARAM = SnowParam())
or if you're on Mac/Linux
  • setCardinalBPPARAM(BPPARAM = MulticoreParam())
This will allow processing on all cores of your processor (R generally uses just one). If you want to do other work at the same time (e.g. browse the web), you may want to change that to SnowParam(workers = n), where n is your number of threads minus 1; otherwise your computer may freeze up (depending on the processor).
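
As a rough sketch of the setup described above, the snippet below picks a worker count that leaves one core free and registers a platform-appropriate backend. The helper name pick_workers is hypothetical (it is not part of Cardinal or BiocParallel), and the block is guarded so it only calls into those packages if both are installed.

```r
# Hypothetical helper: leave one core free, falling back to a single
# worker if the core count cannot be detected or is below 2.
pick_workers <- function(cores = parallel::detectCores()) {
  if (is.na(cores) || cores < 2L) 1L else as.integer(cores) - 1L
}

# Register the backend only when both packages are available.
if (requireNamespace("Cardinal", quietly = TRUE) &&
    requireNamespace("BiocParallel", quietly = TRUE)) {
  n <- pick_workers()
  bp <- if (.Platform$OS.type == "windows") {
    BiocParallel::SnowParam(workers = n)       # socket-based, works on Windows
  } else {
    BiocParallel::MulticoreParam(workers = n)  # fork-based, Mac/Linux only
  }
  Cardinal::setCardinalBPPARAM(BPPARAM = bp)
}
```

Note that more workers only help when the computation itself is the bottleneck; if CPU usage stays low during processing, the limit is probably elsewhere.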

Andrew

Justin Lin

Jun 14, 2022, 6:14:04 PM
to Cardinal MSI Help
Hello Giulia and Andrew,

Thanks for your insights!

I have attempted to process the data again after incorporating your suggestions, but the slow processing speed persists. Subsetting the original dataset shows the same issue. I also tried narrowing the mass range and decreasing the resolution.

However, I have noticed that there is a small spike in CPU usage while reading the data, but the CPU used while normalizing or subsetting is below 1%. 

Thank you so much!

Yu Tin

kbemis

Jul 7, 2022, 2:26:00 PM
to Cardinal MSI Help
The low CPU usage suggests it's an I/O issue. Are you using a computer with a traditional (spinning) hard drive? Data access will be quite slow in that case.

If you have enough memory, I'd suggest trying attach.only=FALSE to load the data fully into memory and seeing if that helps.
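
One way to test the I/O hypothesis, sketched here under assumptions: copy the dataset to a fast local disk first, then load it fully. The helper stage_locally and the example path are hypothetical, and the sketch assumes the data is stored as an imzML pair (a .imzML file plus a .ibd file with the same base name).

```r
# Hypothetical helper: copy an imzML dataset (.imzML + matching .ibd)
# to a local directory and return the path of the local .imzML copy.
stage_locally <- function(imzml_path, dest_dir = tempdir()) {
  ibd_path <- paste0(tools::file_path_sans_ext(imzml_path), ".ibd")
  src <- c(imzml_path, ibd_path)
  if (!all(file.exists(src))) {
    stop("Expected both files to exist: ", paste(src, collapse = ", "))
  }
  if (!dir.exists(dest_dir)) dir.create(dest_dir, recursive = TRUE)
  copied <- file.copy(src, dest_dir, overwrite = TRUE)
  if (!all(copied)) stop("Copy to ", dest_dir, " failed")
  file.path(dest_dir, basename(imzml_path))
}

# Usage (path is a placeholder; uncomment with Cardinal loaded):
# local_file <- stage_locally("/path/to/slow/drive/sample.imzML")
# data <- readMSIData(local_file, attach.only = FALSE,
#                     mass.range = c(80, 200), resolution = 0.01, units = "mz")
```

If processing becomes fast after staging the files locally, that points to storage or network access rather than the normalization itself.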

Justin Lin

Jul 11, 2022, 9:35:20 PM
to Cardinal MSI Help
Hello Dr. Bemis,

It seems an I/O issue was indeed the cause (I was accessing the data over a NAS).

Thanks for your help!

Yu Tin
