AB-Sciex TripleTOF wiff files to MGFs


ben

Feb 21, 2011, 2:03:42 PM
to spctools-discuss
I wanted to get an idea of what processing times other people see
when converting wiff files to other formats. At present I am using
the mascot.dll script within AnalystTF as well as within Mascot Daemon
(the latter takes roughly 4 hours for a single wiff). I am wondering
if msconvert is the route I should take. Matt seemed to recommend
this in some threads in this group from January, and it would
probably be best to go wiff->mzML->mgf. We could then load either
format onto our Mascot server. Our goal is to get our raw data out
and be able to analyze it however we want, beyond ProteinPilot.

Bottom line:
- It is super slow using a 32-bit machine running XP to convert wiffs
to mgfs with the mascot.dll script. What's my best choice/tool?
- Is the only way to download msconvert with the TPP package? Any
direct download links?
- I can get a version of ProteinPilot 3 and install it without a
license, or is Skyline a better alternative?


Sorry for the remedial questions, but reading through other posts I
figured it might be best to ask before I start. Thanks for any help I
can get.

- Ben

Eric Deutsch

Feb 21, 2011, 9:16:06 PM
to spctools...@googlegroups.com, Eric Deutsch
Hi Ben, we have had some trouble with msconvert and TripleTOF. It seems
that the ProteinPilot dll calls that msconvert uses do not get the
precursor m/zs right. At least this happened at one time; I'm not certain
this is still a problem. However, AB is beta testing a converter tool that
will convert a TripleTOF file to an MGF. I don't know about its speed, but
you might ask them if you can be a beta tester.

Regards,
Eric

> --
> You received this message because you are subscribed to the Google
> Groups "spctools-discuss" group.
> To post to this group, send email to spctools...@googlegroups.com.
> To unsubscribe from this group, send email to spctools-
> discuss+u...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/spctools-discuss?hl=en.

Matt Chambers

Feb 22, 2011, 10:17:25 AM
to spctools...@googlegroups.com, sup...@proteowizard.org

ProteinPilot 3 doesn't support the 5200 or 5600 afaik, so neither can msconvert. We're helping to test a new API that will add that support, but it won't be available until the testing is over (probably another month or so). In the meantime, try to get a copy of AB's new beta standalone converter that Eric mentioned.

-Matt

lgillet

Feb 22, 2011, 1:55:07 PM
to spctools-discuss
Hi Ben,
I do not know how you look at your runs, but the PeakView software (I
use version 1.1), which opens the raw wiff files, can export to mgf
format directly.
You need to open the IDA runs with the IDA explorer, select the
filtering criteria you want (rt, spectra "quality"), and then
right-click on the map and export to mgf. You can even get "consensus"
spectra during the mgf export (the software pre-compiles MS/MS spectra
of close m/z and rt according to parameters you can set yourself).
Hope that helps.
Ludovic
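
To illustrate what "consensus" grouping by close m/z and rt could look
like, here is a minimal Python sketch. It is not PeakView's actual
algorithm: the Spectrum class, the group_spectra function, and the
mz_tol / rt_tol defaults are all hypothetical names and values chosen
for illustration.

```python
from dataclasses import dataclass


@dataclass
class Spectrum:
    precursor_mz: float  # precursor m/z of the MS/MS scan
    rt_min: float        # retention time in minutes


def group_spectra(spectra, mz_tol=0.05, rt_tol=0.5):
    """Greedily group spectra whose precursor m/z and rt are both close.

    mz_tol and rt_tol are hypothetical tolerances, not PeakView settings.
    """
    groups = []
    for s in sorted(spectra, key=lambda s: (s.precursor_mz, s.rt_min)):
        for g in groups:
            ref = g[0]  # compare against the first member of the group
            if (abs(s.precursor_mz - ref.precursor_mz) <= mz_tol
                    and abs(s.rt_min - ref.rt_min) <= rt_tol):
                g.append(s)
                break
        else:
            # no existing group was close enough, so start a new one
            groups.append([s])
    return groups
```

Each resulting group would then be merged into one spectrum before
export; how the peaks inside a group are combined is left out here.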


ben

Feb 22, 2011, 5:59:11 PM
to spctools-discuss
I didn't realize that about PeakView. So I am generating mgfs from a
single wiff 4 ways right now: dll script within Analyst, dll script
within Daemon, through PeakView (which took about two seconds), and
using Distiller (going on 3 hours right now). Then I will compare
them.

Thanks for all the ideas, and for the updates about msconvert. Am
bugging our AB-Sciex guy about the purported standalone converter.

- Ben

ben

Mar 3, 2011, 1:37:12 AM
to spctools-discuss
I wanted to provide an update. First, it appears that the mgf export
function within PeakView isn't very good, and from talking to AB folks
it seems to be more of a leftover beta function. Second, the AB folks
have unanimously confirmed that the mascot.dll script shouldn't be
used in this case (Analyst TF 1.5 with TripleTOF data), and based on
the resulting mgfs and Mascot results, I agree. Lastly, I was able to
join the beta, and the converter can easily generate an mgf that is
exactly what ProteinPilot then sends to the Mascot server, albeit
without any profile data. It also has an option to make an mzML, which
works great, takes only about 30 minutes, and results in a ~2 GB file.

I am still struggling with how to get this mzML file to mzXML, since
certain programs (Progenesis LC-MS, VIPER, MAVEN, etc.) take mzXML, not
mzML. When I convert the mzML file with msconvert, so far it simply
stalls out, though the resulting file appears to be about 6 GB. If
anyone has any clue about this or would like an example mzML to play
with, please let me know.

Thanks again for all the helpful feedback, especially about the beta.

Brian Pratt

Mar 3, 2011, 10:30:03 AM
to spctools...@googlegroups.com
A well constructed mzXML file would normally be somewhat smaller than
its mzML equivalent - make sure you've got peaklist compression turned
on (IIRC the msconvert default is no compression, which seems like a
Bad Thing to me but there you have it).


ben

Mar 3, 2011, 11:19:04 AM
to spctools-discuss
I haven't played around with compression, but that would simply mean
adding -z [ --zlib ], correct? I am using this page more than the
help on sourceforge for command arguments:
http://groups.google.com/group/xcms/msg/53e2a2b462a5448c

Thanks for the help, and I will let you know if something better comes
from this. As an aside, my goal for this specific issue (as opposed to
my initial issue of converting wiffs to load MS/MS data elsewhere) is
that I would like some LC-MS data to visualize for peptide extractions.
Everything else works fine for protein ID of tryptic samples with the
mgfs.

- Ben

Matthew Chambers

Mar 3, 2011, 11:33:54 AM
to spctools...@googlegroups.com
If you want to post some samples to the list, use msconvert with an index filter. Try:
msconvert original.mzML -o subset --filter "index 0-100" -z
msconvert original.mzML -o subset --filter "index 0-100" -z --mzXML
And attach both the subset mzML and mzXML for comparison.

Thanks,
-Matt
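
If you need to script the two commands above over several files, a
small Python helper might look like the sketch below. It only uses the
msconvert options already quoted in this thread (-o, --filter with an
index range, -z, --mzXML); the function name msconvert_args and its
parameters are hypothetical, and the actual call is left commented out
because it assumes msconvert is on your PATH.

```python
import shlex


def msconvert_args(input_path, out_dir, first=0, last=100,
                   zlib=True, mzxml=False):
    """Build an msconvert argument list from options quoted in this thread."""
    args = ["msconvert", input_path, "-o", out_dir,
            "--filter", f"index {first}-{last}"]
    if zlib:
        args.append("-z")       # zlib-compress the binary peak data
    if mzxml:
        args.append("--mzXML")  # write mzXML instead of mzML
    return args


# Print both subset commands; shlex.join quotes the filter string.
for as_mzxml in (False, True):
    cmd = msconvert_args("original.mzML", "subset", mzxml=as_mzxml)
    print(shlex.join(cmd))
    # import subprocess; subprocess.run(cmd, check=True)  # assumes msconvert is installed
```

Passing the arguments as a list avoids shell quoting problems with the
filter string, which contains a space.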

ben

Mar 3, 2011, 11:57:03 AM
to spctools-discuss
Thanks for the reply. I can't seem to upload to the group's list, so
here is a link:
http://dl.dropbox.com/u/11659021/subset.7z
It has the two files generated with Matt's commands. If you want, I
can include any files upstream of this conversion too if that would
help.

- Ben


Matthew Chambers

Mar 3, 2011, 12:25:26 PM
to spctools...@googlegroups.com
Yes, I can see why mzXML with interlaced m/z and intensity would get much worse compression here.
There is a pretty steady baseline on the data and since it's profile data there are tens of thousands of
data points per spectrum. It's much better to compress the intensity and m/z arrays separately in
this case (actually, it's always better, but in most cases it doesn't make such a big difference).
Of course, it's really better if you could apply some kind of baselining first. Unfortunately we
don't have a baselining filter in pwiz yet. Also, ABI profile data suffers from missing flanking
zero data samples.
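
One way to measure the layout effect described above on your own
spectra is the Python sketch below. It packs the same peaks two ways,
interleaved (m/z, intensity, m/z, intensity, ...) and as two separate
arrays, and zlib-compresses each. The function names are hypothetical,
32-bit big-endian floats are used for both layouts purely so that only
the layout differs, and the synthetic spectrum is an assumption
standing in for real profile data, so the printed sizes say nothing
about any real file.

```python
import struct
import zlib


def encode_interleaved(mz, intensity):
    """Pack m/z and intensity alternately, then compress the single stream."""
    flat = [v for pair in zip(mz, intensity) for v in pair]
    return zlib.compress(struct.pack(f">{len(flat)}f", *flat))


def encode_separate(mz, intensity):
    """Compress the m/z array and the intensity array independently."""
    def pack(values):
        return struct.pack(f">{len(values)}f", *values)
    return zlib.compress(pack(mz)), zlib.compress(pack(intensity))


# Synthetic profile-like spectrum: flat baseline with occasional peaks.
n = 20000
mz = [100.0 + 0.01 * i for i in range(n)]
intensity = [8.0] * n
for i in range(0, n, 500):
    intensity[i] = 1000.0

interleaved_size = len(encode_interleaved(mz, intensity))
separate_size = sum(len(blob) for blob in encode_separate(mz, intensity))
print("interleaved bytes:", interleaved_size)
print("separate bytes:   ", separate_size)
```

Replace the synthetic lists with the m/z and intensity values of a
spectrum from the middle of the run to get numbers that mean something.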

You can try an intensity threshold filter, which would normally mess up profile data by eliminating
the flanking zero samples (because it doesn't know anything about peaks), but in this case since the
flanking zeroes are already missing it'll just clean up your data a lot. :P For example, in those
first 100 spectra, there are about 4000 samples with an ion count of 8, and only 800 samples above
that. I'm sure the spectra from the middle of the file are better though so you ought to calibrate
the thresholding on those.
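
As a rough illustration of calibrating and applying such a threshold
outside of msconvert, here is a small Python sketch. It counts how
often each intensity value occurs, takes the most common value as a
crude baseline estimate (that heuristic is an assumption, not
something pwiz does), and keeps only samples strictly above it. The
function names are hypothetical.

```python
from collections import Counter


def estimate_baseline(intensity):
    """Return the most frequent intensity value and how often it occurs."""
    value, count = Counter(intensity).most_common(1)[0]
    return value, count


def threshold_filter(mz, intensity, min_intensity):
    """Keep only samples whose intensity is strictly above min_intensity."""
    kept = [(m, i) for m, i in zip(mz, intensity) if i > min_intensity]
    return [m for m, _ in kept], [i for _, i in kept]


# Tiny made-up spectrum: baseline of 8 counts with one real signal.
mz = [100.0, 100.1, 100.2, 100.3]
intensity = [8, 8, 120, 8]

baseline, occurrences = estimate_baseline(intensity)
filtered_mz, filtered_intensity = threshold_filter(mz, intensity, baseline)
print(baseline, occurrences, filtered_mz, filtered_intensity)
```

On real data the baseline should be estimated from spectra in the
middle of the run, as noted above, rather than from the first scans.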

-Matt
