Issue with pepXML generation
From: Simon Michnowicz <simon.michnow...@gmail.com>
Date: Wed, 11 Nov 2009 16:43:22 +1100
Local: Wed, Nov 11 2009 12:43 am
Subject: Issue with pepXML generation

Dear Group,

I would like to flag a possible bug in a TPP tool. (Sorry in advance if this
is the wrong forum to report bugs.)

One of our users has reported issues with a TPP pepXML tool (he was using
Mascot, so I assume he was using Mascot2XML.exe).

Our FASTA database has protein entries with special characters in them,
e.g.

*IFN-<alpha>2*

*&*

*V<beta>14 *

This generated a pepXML file that was not valid XML, as the < and >
characters were not escaped properly.

*<alternative_protein protein="tr|Q9UMA4|IFN-<alpha>2" num_tol_term="2"
peptide_prev_aa="-" peptide_next_aa="S"/>*
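
For illustration, the missing escaping can be sketched in Python (the actual converter is not written in Python, so this only shows the idea). `xml.sax.saxutils.quoteattr` is a standard-library call; the helper name `alternative_protein_tag` is made up for this example and is not TPP code.

```python
from xml.sax.saxutils import quoteattr

def alternative_protein_tag(protein, num_tol_term, prev_aa, next_aa):
    """Build an <alternative_protein/> element with escaped attribute values.

    Hypothetical helper for illustration only; not part of TPP.
    """
    # quoteattr() escapes &, < and > and wraps the value in quotes,
    # picking a quote style that is safe for the content.
    return (
        "<alternative_protein"
        f" protein={quoteattr(protein)}"
        f" num_tol_term={quoteattr(str(num_tol_term))}"
        f" peptide_prev_aa={quoteattr(prev_aa)}"
        f" peptide_next_aa={quoteattr(next_aa)}/>"
    )

tag = alternative_protein_tag("tr|Q9UMA4|IFN-<alpha>2", 2, "-", "S")
print(tag)
```

With this, the protein attribute comes out as `tr|Q9UMA4|IFN-&lt;alpha&gt;2`, which an XML parser reads back as the original name.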

regards

Simon Michnowicz
Duty Programmer
Australian Proteomics Computation Facility
Ludwig Institute For Cancer Research
Royal Melbourne Hospital,
Victoria
Tel: (+61 3) 9341 3155
Fax: (+61 3) 9341 3104


From: Brian Pratt <brian.pr...@insilicos.com>
Date: Wed, 11 Nov 2009 11:20:17 -0800
Local: Wed, Nov 11 2009 2:20 pm
Subject: Re: [spctools-discuss] Issue with pepXML generation

Granted, this is a defect - but that's still an unfortunate choice of
characters.  Even with the correction I can imagine this tripping up other
software downstream since the properly escaped XML would no longer match the
FASTA on a literal basis.  I don't suppose your users could be induced to
use { and } or [ and ] or ( and ) instead of < and > ?

Brian

From: Matthew Chambers <matthew.chamb...@vanderbilt.edu>
Date: Wed, 11 Nov 2009 13:25:08 -0600
Local: Wed, Nov 11 2009 2:25 pm
Subject: Re: [spctools-discuss] Re: Issue with pepXML generation
What about the other reserved characters in XML that are valid in FASTA?
"
'
&

Not escaping could also break downstream software - especially with &
which should always begin an escape sequence. :(
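
A small sketch of how a parser reacts to these characters, assuming Python's standard `xml.etree.ElementTree` as a stand-in for "downstream software". The element name `protein`, the helper names, and the sample names `A&B` and `say "hi"` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Extra entities so quotes are also safe inside a double-quoted attribute.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}

def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

def protein_element(name, escaped):
    """Hypothetical one-attribute element, with or without escaping."""
    value = escape(name, QUOTE_ENTITIES) if escaped else name
    return f'<protein name="{value}"/>'

# Print, for each name, whether the raw and the escaped forms parse.
for name in ["IFN-<alpha>2", "A&B", 'say "hi"']:
    print(repr(name),
          is_well_formed(protein_element(name, escaped=False)),
          is_well_formed(protein_element(name, escaped=True)))
```

For all three names the raw form is rejected by the parser and the escaped form is accepted.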

-Matt


From: Brian Pratt <brian.pr...@insilicos.com>
Date: Wed, 11 Nov 2009 11:52:41 -0800
Local: Wed, Nov 11 2009 2:52 pm
Subject: Re: [spctools-discuss] Re: Issue with pepXML generation

Yes, one would want to escape everything properly - happily there's a
library call for that.  And certainly it's only right to emit valid XML.

But I do think that it might be wisest to sidestep the whole mess - it's
valid FASTA but also unconventional (based on many years of TPP not bumping
into this), and even converted to valid XML I suspect it may cause other
problems downstream since it no longer exactly matches the FASTA.  I suspect
you're damned if you do and damned if you don't.
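
To make the matching concern concrete, here is a minimal sketch using Python's standard library. It assumes two kinds of downstream tool: one that searches the raw file text, and one that reads attributes through an XML parser. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Protein name exactly as it appears in the FASTA file.
fasta_name = "tr|Q9UMA4|IFN-<alpha>2"

# The same entry as a properly escaped pepXML element.
pepxml_line = (
    '<alternative_protein protein="tr|Q9UMA4|IFN-&lt;alpha&gt;2" '
    'num_tol_term="2" peptide_prev_aa="-" peptide_next_aa="S"/>'
)

# A tool that searches the raw text no longer finds the FASTA name.
literal_match = fasta_name in pepxml_line

# A tool that uses an XML parser gets the unescaped value back.
parsed_name = ET.fromstring(pepxml_line).get("protein")
parsed_match = parsed_name == fasta_name

print(literal_match, parsed_match)
```

So the literal mismatch only bites software that reads pepXML with string matching rather than a real XML parser.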

Brian
From: dctrud <dct...@ccmp.ox.ac.uk>
Date: Wed, 11 Nov 2009 12:17:12 -0800 (PST)
Local: Wed, Nov 11 2009 3:17 pm
Subject: Re: Issue with pepXML generation
Unfortunately the offending entries are present in commonly used
public DBs. We recently bumped into exactly this problem, as there are
4 entries containing <xxxx> in the IPI human v3.66 fasta file:

IPI00465120 Gene_Symbol=- 3<beta>-HSD <psi>1 protein
IPI00816409 Gene_Symbol=- V<gamma>1 protein (Fragment)
IPI00816761 Gene_Symbol=CREB1 <alpha>CREB-1 protein (Fragment)
IPI00930475 Gene_Symbol=GUSB F<lambda>8 protein (Fragment)

After hundreds of searches, a particular experiment happened to ID one
of these proteins, causing problems with the tools. In the event I
manually removed the problematic IDs as they were irrelevant for the
experiment. We already re-write IPI headers after download of the
FASTA, so will implement a substitution there if it crops up again.
Should a substitution be added to the IPI retrieval utility scripts in
the TPP distribution so that the problem doesn't show its face if
they are being used?
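
Such a substitution could look roughly like the following Python sketch. It is not the TPP retrieval script; the function names are hypothetical, the header layout is simplified from the entries listed above, and the sequence line is a placeholder. Only header lines are touched, and the leading `>` that marks a FASTA header is preserved.

```python
def sanitize_fasta_header(line):
    """Replace < and > with [ and ] in a FASTA header line.

    Sequence lines are returned unchanged. Hypothetical helper,
    not taken from the TPP scripts.
    """
    if not line.startswith(">"):
        return line
    # Keep the leading '>' that marks the header, rewrite the rest.
    marker, rest = line[0], line[1:]
    return marker + rest.replace("<", "[").replace(">", "]")

def sanitize_fasta(lines):
    """Apply sanitize_fasta_header to every line of a FASTA file."""
    return [sanitize_fasta_header(line) for line in lines]

example = [
    ">IPI00465120 Gene_Symbol=- 3<beta>-HSD <psi>1 protein",
    "MTEYK",  # placeholder sequence, not the real one
]
for line in sanitize_fasta(example):
    print(line)
```

This leaves `&`, `"` and `'` alone, so it complements proper XML escaping in the converter rather than replacing it.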

Interestingly, if you search on the EBI IPI site for these proteins
the < > are substituted with [ ] , but the problematic characters are
in the FASTA.

http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-id+657mP1a3t7q+-e+[IPI:%27IPI00465120.3%27]+-qnum+1+-enum+1

Cheers,

DT

From: Brian Pratt <brian.pr...@insilicos.com>
Date: Wed, 11 Nov 2009 12:27:49 -0800
Local: Wed, Nov 11 2009 3:27 pm
Subject: Re: [spctools-discuss] Re: Issue with pepXML generation

Well, I'll go ahead and modify the mascot converter to emit proper XML for
proteins with reserved XML characters, but it does sound like folks would do
well to make that <> / [] substitution upstream from the search engines.
The fact that the EBI IPI site does the substitution confirms my suspicion
that a number of tools might get munged up by this.

Brian


From: Jimmy Eng <jke...@gmail.com>
Date: Wed, 11 Nov 2009 12:31:04 -0800
Local: Wed, Nov 11 2009 3:31 pm
Subject: Re: [spctools-discuss] Re: Issue with pepXML generation
I'll add the substitutions to the getdb.* scripts in the TPP src/util directory.


From: Simon Michnowicz <simon.michnow...@gmail.com>
Date: Wed, 11 Nov 2009 15:21:31 -0800 (PST)
Local: Wed, Nov 11 2009 6:21 pm
Subject: Re: Issue with pepXML generation

Unfortunately we have no control over what goes in the FASTA
databases! Matrix Science's pepXML generation code escapes the XML:

if ($thisScript->param($urlParams{'prot_desc'})) {
    $prot_desc = &noXmlTag(&mustGetProteinDescription($protein_list[0], \%fastaTitles));
}

where noXmlTag() detags strings of XML tags.

Thanks
Simon Michnowicz.


From: Eileen Yue <y...@ohsu.edu>
Date: Wed, 11 Nov 2009 15:00:19 -0800
Local: Wed, Nov 11 2009 6:00 pm
Subject: what is "adjusted ratio mean"?

Dear all:
May I ask a question? After running a peptide analysis with TPP, I export the results to Excel. The spreadsheet has columns for all the information shown on the web page, but I also noticed some extra columns, such as “adjusted ratio mean”. What does “adjusted ratio mean” represent?
By the way, for an identified protein, does the percentage of “shared of spectrum ids” mean the spectra that are also assigned to other proteins?
Thanks


From: Brian Pratt <brian.pr...@insilicos.com>
Date: Wed, 11 Nov 2009 16:19:40 -0800
Local: Wed, Nov 11 2009 7:19 pm
Subject: Re: [spctools-discuss] Re: Issue with pepXML generation

No worries, a corrected Mascot2XML will be in the next TPP release.

Brian
