Variable mods in decoy matches

Peppe

May 25, 2020, 8:54:25 AM
to Comet-ms support
Hi,

Lately, I have been trying to couple Comet + Percolator with other protein inference tools such as PIA and Epifany. However, I have hit a roadblock that seems to stem from the way Comet writes out variable modifications, in particular "Gln->pyro-Glu (Q)" at the peptide N-terminus. The Q(-17.xxxx) appears not at the N-terminus of the peptide but in the middle of the sequence.

For example, the pin output file shows something like this:

/home/ubuntu/1a7d100b-b01e-441e-8acf-2a78fd92c151.comet_6_3_5       -1      6       897.565009      897.564753      1.609438        0.000000        0.013894        4.543652        0.652256        32.644783       0.1190  898.572285      8       0       0       1       0       0       0       0       1       1       2       5.902633        0.000000        0.000000        K.KLSKIAQ[-17.0265]K.Y  decoy_sp|Q9NP80|PLPL8_HUMAN     decoy_sp|Q9NP80-2|PLPL8_HUMAN

where the sequence K.KLSKIAQ[-17.0265]K.Y has the pyro-Glu mod in the middle of the peptide, and the pep.xml output reports it like this:

<search_hit hit_rank="5" peptide="KLSKIAQK" peptide_prev_aa="K" peptide_next_aa="Y" protein="decoy_sp|Q9NP80|PLPL8_HUMAN" num_tot_proteins="5" num_matched_ions="5" tot_num_ions="42" calc_neutral_pep_mass="897.564753" massdiff="0.000256" num_tol_term="2" num_missed_cleavages="2" num_matched_peptides="366">
...
<modification_info modified_peptide="KLSKIAQ[111]K">
<mod_aminoacid_mass position="7" mass="111.032029" variable="-17.026549" source="param"/>
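To make the position explicit, here is a minimal Python sketch that reads a trimmed, closed-off copy of the hit above and reports where the modification sits. The fragment keeps only a few of the original attributes and has no XML namespace, which is an assumption; a real pepXML file declares a namespace that the tag lookups would have to include. The helper name `describe_mods` is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed, closed-off copy of the search_hit above (most attributes omitted).
# Assumption: no XML namespace, unlike a full pepXML file.
FRAGMENT = """
<search_hit hit_rank="5" peptide="KLSKIAQK" peptide_prev_aa="K" peptide_next_aa="Y" protein="decoy_sp|Q9NP80|PLPL8_HUMAN">
  <modification_info modified_peptide="KLSKIAQ[111]K">
    <mod_aminoacid_mass position="7" mass="111.032029" variable="-17.026549" source="param"/>
  </modification_info>
</search_hit>
"""

def describe_mods(search_hit_xml):
    """Return (position, residue, is_first_residue, unmodified_mass) per mod."""
    hit = ET.fromstring(search_hit_xml)
    peptide = hit.get("peptide")
    results = []
    for mod in hit.iter("mod_aminoacid_mass"):
        pos = int(mod.get("position"))      # 1-based index into the peptide
        residue = peptide[pos - 1]
        # "mass" is the modified residue mass; subtracting the variable
        # delta recovers the unmodified residue mass.
        unmodified = float(mod.get("mass")) - float(mod.get("variable"))
        results.append((pos, residue, pos == 1, round(unmodified, 6)))
    return results

print(describe_mods(FRAGMENT))
```

The modification is reported on residue 7 of 8 (a Q), and 111.032029 + 17.026549 gives back the monoisotopic residue mass of glutamine, so the mass bookkeeping is consistent; only the position is unexpected for an N-terminal modification.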


Percolator has no issue with the modification, but from this point on I need to convert the pep.xml into idXML, and the OpenMS conversion has trouble with it. I assume the output is correct, but just so I understand, why does this happen? Is Comet shuffling a peptide sequence that has an N-terminal modification, leaving the mod "attached" to the original amino acid?


Just for reference, I'm using Comet version "2019.01 rev. 5"; the parameters file is attached.

Thanks,

Peppe


comet.params

Jimmy Eng

May 25, 2020, 12:10:16 PM
to Comet-ms support
Peppe,

What you are observing is due to the fact that this is a Comet internal decoy peptide.  When Comet reverses the residue positions for the decoy peptide, any modified residue will remain modified in the decoy form.  Distance constraints on modifications of target peptides do not apply to the decoys.  The target peptide that generated this decoy match was Q[111]AIKSLKK.  To generate the corresponding decoy, the first seven amino acids are reversed (the C-term lysine is left in place), leading to the KLSKIAQ[111]K decoy peptide in question.  As decoy peptides are known wrong matches, simply use their scores for downstream analysis and ignore what their known wrong sequences are.  Hope you buy this explanation and rationale.
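The reversal described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this post (reverse everything except the C-terminal residue, with each modification staying on its residue); it is not Comet's actual code, and the function name and the bracket-notation parsing are assumptions made for the example.

```python
import re

# One token = one residue letter plus an optional bracketed modification,
# e.g. "Q[111]" or "K". The bracket notation follows modified_peptide above.
TOKEN = re.compile(r"[A-Z](?:\[[^\]]*\])?")

def reverse_keep_cterm(modified_peptide):
    """Reverse all residues except the last; mods travel with their residue."""
    tokens = TOKEN.findall(modified_peptide)
    return "".join(tokens[:-1][::-1] + tokens[-1:])

print(reverse_keep_cterm("Q[111]AIKSLKK"))  # KLSKIAQ[111]K
```

Because the modification is carried along with the Q, an N-terminal pyro-Glu on the target ends up at position 7 of the decoy.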


Jimmy 

Peppe

May 25, 2020, 6:52:33 PM
to Comet-ms support
Hi Jimmy,

Thank you for your prompt reply. It makes perfect sense, thanks, and it was what I thought. 

But at this point I'm not sure where the problem lies. I have raised an issue on the OpenMS GitHub here, and the opinion there is that the modification may be defined incorrectly in the pepXML file:

The resulting pepXML does not correctly describe it as a terminal modification:
<aminoacid_modification aminoacid="Q" massdiff="-17.026549" mass="111.032029" variable="Y" symbol="@"/>

To be honest, I'm not sure whether the issue is in how Comet writes the pepXML or in how OpenMS reads it. You can get the pepXML output from Comet here if needed. It would be great if you could weigh in on the conversation.

Thanks again,

Peppe

Jimmy Eng

May 25, 2020, 8:57:24 PM
to Comet-ms support
Peppe,

I just posted to the OpenMS thread.  The two possible solutions are: 1) OpenMS stops enforcing modification rules on internal decoy peptides, or 2) Comet's internal decoy searches are treated as unsupported and you must search a target-decoy database generated externally.  I tried giving an explanation/justification for why 1) would make sense to support, but we'll see where the thread leads.
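To illustrate what option 1) could look like on the reading side, here is a rough Python sketch. It is not OpenMS code; the function names, the "every protein is a decoy" criterion, and the `sp|EXAMPLE|TARGET` accession in the usage lines are all hypothetical. The `decoy_` prefix is simply the one visible in the output earlier in this thread.

```python
DECOY_PREFIX = "decoy_"  # prefix seen in this thread's output; assumed configurable

def is_decoy_hit(proteins, prefix=DECOY_PREFIX):
    """Treat a hit as a decoy only if every protein it maps to is a decoy."""
    return bool(proteins) and all(p.startswith(prefix) for p in proteins)

def position_allowed(mod_position, proteins, n_term_only=True):
    """Enforce the N-terminal rule on targets, waive it for decoy hits."""
    if is_decoy_hit(proteins):
        return True
    return mod_position == 1 if n_term_only else True

# The decoy hit from this thread passes despite the mod sitting at position 7.
print(position_allowed(7, ["decoy_sp|Q9NP80|PLPL8_HUMAN",
                           "decoy_sp|Q9NP80-2|PLPL8_HUMAN"]))
# A target hit (hypothetical accession) would still be rejected.
print(position_allowed(7, ["sp|EXAMPLE|TARGET"]))
```

Under this scheme decoy scores remain usable downstream while target peptides are still held to the terminal-modification rule.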

Jimmy