Pwiz msconvert and RAW files


tgreco

Jun 5, 2008, 1:00:13 AM
to spctools-discuss
I know the implementation of mzML is still a work in progress, but I
did make one observation when using the Pwiz msconvert.exe (timestamp
5-22-08) to generate an mzML file in profile with zlib compression. I
am able to produce the following error when running XPress:
"bad lexical cast: source type value could not be interpreted as
target"
This results in no peak integration being performed for any of the
peptide IDs.

I do not get this error when the mzML is centroided only, profile only, or
centroided/compressed; it occurs only when the data are left as
profile/compressed.
-Todd

Matthew Chambers

Jun 5, 2008, 1:20:10 AM
to spctools...@googlegroups.com
Hi Todd,

This issue sounds like it's probably on the reader's end, in this case
the RAMPAdapter to Pwiz. There are extensive unit tests for RAMPAdapter,
though, so I'm not sure what's going on. Do other RAMP-dependent
utilities in TPP handle your profile/compressed file?

-Matt

Brian Pratt

Jun 5, 2008, 10:55:28 AM
to spctools...@googlegroups.com
It also seems likely that the writer and the reader don't quite agree on
what mzML is: there were updates to the ProteoWizard code after the last
TPP release, so XPress's reader may be slightly stale. Perhaps mzML is
stable now; I'll see about an update to TPP's mzML reader today.

Brian

Matthew Chambers

Jun 5, 2008, 11:10:14 AM
to spctools...@googlegroups.com
If it's a staleness issue, it's probably the opposite: the 5-22-08 build
of msconvert does not include the changes I made that week and won't
actually produce mzML 1.0 (it should be 0.99.12). Your TPP RAMP code
should include those changes, though, since you've been checking out
straight from the repository. I will try to get together with Darren
today to update the website binaries if he hasn't already done so.

-Matt

Jimmy Eng

Jun 5, 2008, 1:32:14 PM
to spctools...@googlegroups.com
calc_neutral_pep_mass is re-calculated by the TPP tools. It is the mass
of the neutral or uncharged peptide and is calculated as the sum of the
residue masses + N-terminus (~1.0 for H) + C-terminus (~17.0 for OH).

The calc_neutral_pep_mass values are correct. So if you're not getting
the right numbers, compare your set of residue masses with those listed
in resources like:
http://www.i-mass.com/guide/aamass.html
http://www.matrixscience.com/help/aa_help.html
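
To make the arithmetic concrete, here is a small, hypothetical C++ sketch
of that sum (it is not TPP source code); the peptide, the monoisotopic
residue masses, and the terminal masses are hard-coded just for
illustration:

#include <iostream>
#include <map>
#include <string>

int main()
{
    // Monoisotopic *residue* masses (Da), not free amino acid masses;
    // only the residues needed for the example peptide are listed.
    std::map<char, double> residue = {
        {'P', 97.05276}, {'E', 129.04259}, {'T', 101.04768},
        {'I', 113.08406}, {'D', 115.02694}
    };

    const std::string peptide = "PEPTIDE";

    // N-terminal H (~1.0) plus C-terminal OH (~17.0), i.e. one water.
    double mass = 1.007825 + 17.002740;
    for (char aa : peptide)
        mass += residue.at(aa);

    // Prints calc_neutral_pep_mass(PEPTIDE) = 799.36 Da (approximately).
    std::cout << "calc_neutral_pep_mass(" << peptide << ") = "
              << mass << " Da" << std::endl;
    return 0;
}

Note that this is the neutral (uncharged) mass; adding a proton (~1.007)
gives the singly protonated MH+ value, which is one likely source of a
~1 Da discrepancy.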


On May 30, 9:17 pm, "Christine Vogel" <vogel....@gmail.com> wrote:
> Hello,
>
> Protein Prophet has in the -prot.xml file an output called:
> calc_neutral_pep_mass. How exactly is this calculated? Is this
> coming from the Sequest output (which is what I am using for the
> search)?
>
> Is it:
> 1. aa_weights minus water (peptide bonds), or
> 2. aa_weights minus water (peptide bonds) + proton?
>
> Reason for me asking is that I noticed that when I calculate peptide
> masses, my peptides are always around 1 Da heavier than the mass
> quoted in the -prot.xml. I calculate them as the sum of the
> [aa_weights] minus water [18.015] for each peptide bond. So it's
> weird that my peptide masses are *smaller* than those from the
> -prot.xml file.
>
> Thanks a lot,
>
> Christine

Darren Kessner

Jun 5, 2008, 2:35:09 PM
to spctools...@googlegroups.com
Hi all -- I have a slow connection (in the ASMS poster room), so I
can't upload new packages right this minute.

But I was able to check in some new binaries, including msconvert.

We keep the latest binaries in our svn at trunk/pwiz/bin.

In particular, the latest msconvert can be obtained via http here:
http://proteowizard.svn.sourceforge.net/viewvc/*checkout*/proteowizard/trunk/pwiz/bin/windows/i386/msconvert.exe


Darren

tgreco

Jun 5, 2008, 6:28:05 PM
to spctools-discuss
Just wanted to post an update on my first post. First, the observations
from my original message are still present even with the latest msconvert
posted above by Darren. I tested the conversion of another RAW file
acquired on the same day on the same instrument with the same
profile/compressed setting, and this time I did not receive the "bad
lexical cast" error but instead received a list of zlib errors during
XPress analysis. Strangely, these zlib errors did not cause xinteract to
fail and the XPress ratio calculations were performed, though I will have
to examine further whether there are any issues with the actual
calculations. For the mzML that generates the "bad lexical cast", I
noticed the message was also displayed at the end of the pepXML/Out2XML
generation.

I'm not sure which of the utilities are specifically RAMP-dependent, but
assuming they involve reading the mzML, I tested Pep3D, which again
resulted in a "bad lexical cast" error and did not display the Pep3D
image. Using the mzML that gave the zlib errors during xinteract actually
resulted in a hard crash/Windows error: "Pep3d_xml.cgi has encountered a
problem and needs to close." Let me know if you would like any additional
information. At this point I can't quite explain why there are two
different errors, given that these RAW files were generated as part of
the same data set on the same instrument.
-Todd

On Jun 5, 12:35 pm, Darren Kessner <Darren.Kess...@cshs.org> wrote:
> Hi all -- I have a slow connection (in the ASMS poster room), so I  
> can't upload new packages right this minute.
>
> But I was able to check in some new binaries, including msconvert.
>
> We keep the latest binaries in our svn at trunk/pwiz/bin
>
> In particular, the latest msconvert can be obtained via http here:http://proteowizard.svn.sourceforge.net/viewvc/*checkout*/proteowizar...

Darren Kessner

Jun 5, 2008, 6:38:56 PM
to spctools...@googlegroups.com
Hi Todd,

Thank you very much for running your tests. Do you have a place where
you could post one or two of the offending RAW files so that I can
investigate what's going on during the mzML conversion (and subsequent
reading)?


Darren

Kessner, Darren E.

Jun 20, 2008, 2:25:32 PM
to spctools...@googlegroups.com
Hi all,

Here's an update on the problems that Todd was having with msconvert.  Todd uploaded a couple of RAW files for me to look at.

There were actually two issues:

1)  The Boost.Iostreams zlib compressor wasn't flushing its buffer properly in certain cases when used in the manner described in the online docs. This caused improper zlib encoding, which resulted in the "zlib error" messages that Todd saw during subsequent parsing. I isolated a particular intensity array that illustrates the problem and sent it to the Boost mailing list. In the meantime, I implemented a workaround that allows us to force the buffers to flush.

2)  We were encoding garbage for a couple of cvParams when the binary data arrays were empty.  This resulted in the "bad_lexical_cast" messages during parsing that Todd noticed.

I have checked in a new Windows msconvert binary, with packages on the way in the next day or so.

Thank you very much to Todd for reporting the errors and making the data files available!

I'm cross-posting to proteowizard-developer and including the message to boost-users below for future reference.


Darren


<message to boost-users, 20080619>

Hi all,

I have encountered some strange behavior while using the Boost iostreams zlib compression filter.

The problem is tricky to reproduce -- I have posted a tarball that includes an example program (pasted below) together with a 72k binary file that will be read in and compressed:
http://proteowizard.sourceforge.net/temp/zlib_error.tgz

If you're curious, the 72k data array comes from an array of doubles, which were intensities from a single scan on a mass spectrometer.  The problem occurs on platforms darwin, gcc, msvc.

In general, the functions run_filter_1 and run_filter_2 produce identical output.  However, when the input is the binary array read in from the 72k file, run_filter_1 appears not to flush some buffer properly. This may be due to my ignorance of how to force the buffers to flush, though it's not clear to me that this should be necessary with the use of boost::iostreams::copy().
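
For future readers, the following is a minimal sketch of the workaround
idea only; it is not the example program from the tarball and not
ProteoWizard source. It writes through a boost::iostreams::filtering_ostream
with a zlib_compressor pushed onto it, and makes sure the stream is closed
(here, by letting it go out of scope) before the compressed buffer is used,
so the compressor flushes whatever it is still holding:

#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/filter/zlib.hpp>
#include <boost/iostreams/device/back_inserter.hpp>
#include <iostream>
#include <string>
#include <vector>

std::vector<char> zlibCompress(const std::string& input)
{
    std::vector<char> compressed;
    {
        boost::iostreams::filtering_ostream out;
        out.push(boost::iostreams::zlib_compressor());
        out.push(boost::iostreams::back_inserter(compressed));
        out.write(input.data(), static_cast<std::streamsize>(input.size()));
        // Leaving this scope destroys 'out', which closes the filter chain
        // and flushes any bytes still buffered in the compressor; reading
        // 'compressed' before that point can yield a truncated zlib stream.
    }
    return compressed;
}

int main()
{
    std::string data(72 * 1024, 'x');   // stand-in for a 72k intensity array
    std::vector<char> z = zlibCompress(data);
    std::cout << data.size() << " -> " << z.size() << " bytes" << std::endl;
    return 0;
}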

Thank you in advance for any help!


Darren