Tide pepxml formatting issues

Michael Riffle

Jan 10, 2020, 8:21:42 PM

to crux-users

Greetings Crux devs,

I'm developing tools to support Tide output generated by Crux, and I'm trying to parse and use the pepXML that Tide writes. In general, I much prefer this to parsing tab-delimited text files.

However, I'm running into problems parsing the pepXML generated by Tide: it produces XML that won't validate against the pepXML schema. Specifically, I'm running into these two issues right now. I'll follow up if I find others:

1) num_tol_term is occasionally set to "". This attribute must have an integer value.

2) More than one "modification_info" element appears per search hit. The XSD specifies either 0 or 1 instances of this element.

For example, I'm finding the following:

<modification_info modified_peptide="LLAGLLHPGQAVSFWGCFAQM[15.99]YFFVALGITESYLLAAMSYDR">
  <mod_aminoacid_mass position="21" mass="147.03"/>
</modification_info>
<modification_info modified_peptide="LLAGLLHPGQAVSFWGCFAQMYFFVALGITESYLLAAMSYDR">
  <mod_aminoacid_mass position="17" mass="160.03"/>
</modification_info>

This is causing my XML parsing library a lot of consternation. This should be:

<modification_info modified_peptide="LLAGLLHPGQAVSFWGCFAQM[15.99]YFFVALGITESYLLAAMSYDR">
  <mod_aminoacid_mass position="21" mass="147.03"/>
  <mod_aminoacid_mass position="17" mass="160.03"/>
</modification_info>

Note that I think it only does this when a peptide has two different mod masses. Mods with the same mass are correctly grouped in a single element.
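As a stopgap for files that have already been written, both issues can be patched in a pre-processing pass before validation. The following is only a sketch, not anything Crux provides: the helper names `q` and `repair_search_hits` are made up for illustration. It assumes the first "modification_info" element carries the annotated modified_peptide string (as in the example above), and that num_tol_term may simply be omitted when empty. If your copy of the schema requires that attribute, substitute a real value instead of deleting it.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns on the root element that Tide writes.
NS = "http://regis-web.systemsbiology.net/pepXML"
ET.register_namespace("", NS)  # keep the default namespace when writing back out


def q(tag):
    """Return the namespace-qualified tag name that ElementTree expects."""
    return f"{{{NS}}}{tag}"


def repair_search_hits(root):
    """Patch both issues in place and return the number of fixes applied."""
    changed = 0
    for hit in root.iter(q("search_hit")):
        infos = hit.findall(q("modification_info"))
        if len(infos) > 1:
            first = infos[0]  # assumption: this one has the annotated peptide
            for extra in infos[1:]:
                # Move every mod_aminoacid_mass child into the first element.
                for child in list(extra):
                    first.append(child)
                hit.remove(extra)
            changed += 1
    for elem in root.iter():
        # Assumption: an empty num_tol_term can be dropped rather than filled in.
        if elem.get("num_tol_term") == "":
            del elem.attrib["num_tol_term"]
            changed += 1
    return changed
```

Typical use would be `tree = ET.parse(path)`, then `repair_search_hits(tree.getroot())`, then `tree.write(out_path, xml_declaration=True, encoding="UTF-8")`, where `path` and `out_path` are whatever files you are working with.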

Mike

Michael Riffle

Jan 13, 2020, 6:43:02 PM

to crux-users

I discovered one other problem that produces pepXML that will not validate.

On the root element of the pepXML document (<msms_pipeline_analysis>), the schema declares the "date" attribute as xs:dateTime. This requires the date to be written in a specific syntax: YYYY-MM-DDThh:mm:ss

In the XML generated by Tide I get:

<msms_pipeline_analysis date="Tue Jan 7 17:05:18 2020" xmlns="http://regis-web.systemsbiology.net/pepXML" ...

I get validation errors when I try to load this XML document using the schema.

Changing this to "2020-01-07T17:05:18" fixed the problem right up.
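For existing files, the conversion can be scripted. This is a minimal sketch, with `to_xs_datetime` being a made-up helper name. It assumes the date string always has the ctime-style layout shown above and that the weekday and month names are in English (the `%a` and `%b` directives depend on the locale).

```python
from datetime import datetime


def to_xs_datetime(ctime_style):
    """Convert e.g. 'Tue Jan 7 17:05:18 2020' to '2020-01-07T17:05:18'."""
    # Collapse repeated spaces, since ctime pads single-digit days.
    normalized = " ".join(ctime_style.split())
    parsed = datetime.strptime(normalized, "%a %b %d %H:%M:%S %Y")
    return parsed.isoformat()


print(to_xs_datetime("Tue Jan 7 17:05:18 2020"))  # 2020-01-07T17:05:18
```

The result carries no time zone, which matches the value that validated in my test. Nothing in the original string says which zone Tide used, so none is added here.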

Mike

William S Noble

Jan 15, 2020, 12:13:58 PM

to Michael Riffle, crux-users

Thanks for pointing these issues out, Mike. I have created a ticket for this request here:

https://github.com/crux-toolkit/crux-toolkit/issues/461

We will get back to you as soon as we can with a fix.

Bill
