I independently used both Trapper (4.3.1) and msconvert (pwiz 1.6.0)
to convert a 1.4 GB Agilent MassHunter ".d" file containing ~16000
spectra to mzXML.
Parameters:
$ msconvert --mzXML --verbose large.d
$ trapper --mzXML -v large.d large.mzXML
msconvert took 5 hr 38 min to complete, generating a 57 GB file.
Trapper took 1 hr 26 min to complete, generating a 28 GB file.
I can understand the differences in size, given the differences in
structure and precision. However, the time difference still appears
quite high.
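A rough check on the size difference: mzXML stores each peak list as
base64-encoded (m/z, intensity) pairs, so doubling the float precision
doubles the encoded payload. The sketch below assumes msconvert wrote
64-bit values and Trapper 32-bit values, which the post does not
confirm; the peak count is a made-up illustration value.

```python
import base64
import struct


def encoded_peak_bytes(n_peaks: int, precision_bits: int) -> int:
    """Base64 length of n_peaks (m/z, intensity) pairs at a given precision."""
    fmt = {32: "f", 64: "d"}[precision_bits]
    # Network byte order (big-endian), two floats per peak.
    raw = struct.pack(f">{2 * n_peaks}{fmt}", *([0.0] * (2 * n_peaks)))
    return len(base64.b64encode(raw))


peaks = 100_000  # hypothetical peak count for one profile spectrum
size32 = encoded_peak_bytes(peaks, 32)
size64 = encoded_peak_bytes(peaks, 64)
print(size32, size64, size64 / size32)
```

The encoded payload doubles, which is in line with 57 GB vs 28 GB (a
ratio of about 2.0), though structural differences between the two
writers would also contribute.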
One interesting observation is that msconvert does not start writing
to the output file until 1hr 20min has elapsed. At that point, the
file begins filling and the progress messages start appearing in the
output. Trapper, on the other hand, starts filling the output file and
reporting progress immediately.
I have seen this occur now for two runs on two different days, so I
don't think it's related to other activity on the machine.
Perhaps msconvert is engaging in some preprocessing that isn't
strictly necessary, for Agilent ".d" files at least?
Thanks,
bio.x2y
This seems like a *really* long time for both tools even if the file
contains a lot of scans. Are you converting files over a slow network
share or similar?
We regularly convert Agilent 6520 QTOF files to mzXML using trapper and
msconvert. For trapper it takes about 27s to convert a file with 8800
scans to mzXML using centroid mode, or 8m 20s for profile mode.
These timings are on a server with Xeon 5550s and 16GB of RAM, but I
wouldn't expect conversions to take hours, even on my 2 yr old laptop.
msconvert is usually about twice as slow as trapper when converting
using 32-bit precision.
DT
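To put the two sets of timings on the same scale, here is a small
throughput calculation using only the numbers quoted in this thread.
It is a rough comparison at best: the files differ, and the original
post does not say whether the 1.4 GB file was profile or centroid data.

```python
def scans_per_second(scans: int, hours: int = 0, minutes: int = 0,
                     seconds: int = 0) -> float:
    """Average conversion throughput for a run of the given duration."""
    total_seconds = hours * 3600 + minutes * 60 + seconds
    return scans / total_seconds


runs = {
    "msconvert, original post (~16000 spectra, 5h38m)":
        scans_per_second(16000, hours=5, minutes=38),
    "trapper, original post (~16000 spectra, 1h26m)":
        scans_per_second(16000, hours=1, minutes=26),
    "trapper, server, centroid (8800 scans, 27s)":
        scans_per_second(8800, seconds=27),
    "trapper, server, profile (8800 scans, 8m20s)":
        scans_per_second(8800, minutes=8, seconds=20),
}

for label, rate in runs.items():
    print(f"{label}: {rate:.2f} scans/s")
```

This works out to roughly 0.8 and 3.1 spectra/s for the original runs,
against roughly 326 (centroid) and 17.6 (profile) scans/s on the server.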
I have a feature request pending with Agilent to get a function which
provides either a list of scanIds or a spectrum without metadata.
Thanks for doing the comparison. I agree with Dave that the conversion
time sounds pretty long in both cases and I suspect a network share.
-Matt
My Windows installation happens to be living in VirtualBox on a Mac
Pro, so that explains the overall sluggish pace! I guess that's part of
the fun of needing the Agilent library.
Out of curiosity, I might try running one of the jobs on my own
year-old laptop, just in case this is data related.
Cheers,
bio.x2y
On Feb 16, 9:26 pm, Matthew Chambers <matthew.chamb...@vanderbilt.edu>
wrote:
--
You received this message because you are subscribed to the Google Groups "spctools-discuss" group.
msconvert and Trapper use the same API, but Trapper is a dedicated
converter, so it doesn't worry about random access. It therefore skips
the initial enumeration, and that enumeration accounts for the large
difference in run time. Pwiz is not just about conversion though. Pwiz's SeeMS tool
can directly view raw spectra like MassHunter does (except for that
blasted initialization time on profile data!).
-Matt
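The difference between the two approaches can be sketched as follows.
This is a toy model, not the Agilent API or the actual pwiz or Trapper
code: `CountingReader`, `stream_convert`, and `indexed_convert` are
hypothetical names, and the assumption that enumeration costs one full
fetch per scan is only a guess at why the index pass is slow.

```python
from typing import List


class CountingReader:
    """Hypothetical stand-in for a vendor reader; not the Agilent API."""

    def __init__(self, n_scans: int) -> None:
        self.n_scans = n_scans
        self.reads = 0  # how many times a scan was fetched

    def read_scan(self, i: int) -> str:
        self.reads += 1
        return f"scan={i}"


def stream_convert(reader: CountingReader) -> List[str]:
    """Dedicated converter: emit each scan as soon as it is read."""
    out = []
    for i in range(reader.n_scans):
        out.append(reader.read_scan(i))  # output grows from the first read
    return out


def indexed_convert(reader: CountingReader) -> List[str]:
    """Random-access style: enumerate all scans first, then emit them."""
    # Assumption: building the index costs one fetch per scan.
    index = [reader.read_scan(i) for i in range(reader.n_scans)]
    out = []  # nothing has been written until enumeration finishes
    for i in range(len(index)):
        out.append(reader.read_scan(i))
    return out


a, b = CountingReader(5), CountingReader(5)
assert stream_convert(a) == indexed_convert(b)
print(a.reads, b.reads)
```

Both functions produce the same output, but the indexed version does a
full pass before writing anything, which would match the long silent
period reported for msconvert.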
Joe Slagel wrote:
> Matt,
>
> Thanks for the explanation. Does this mean that trapper isn't using
> the Agilent API? (Asking the naive trapper user question)
>
> -Joe
We are using an Agilent MSD/TOF for metabolomics. For us, it is
important to convert to CENTROID mzXML for further data processing,
otherwise the files become extremely large.
Greetings, Robert