All Kojak-identified unlinked PSMs assigned a probability of 0 (TPP/Kojak)

Alex Zelter

Feb 3, 2020, 1:24:23 PM
to spctools-discuss
Hi. 
I'm running into an issue with the TPP/Kojak pipeline. I am running a crosslinking analysis using Kojak. As you know, Kojak also reports unlinked PSMs. I'm finding that PeptideProphet is filtering out ALL the unlinked PSMs in my analysis.
If I process the dataset with Kojak/Percolator I get 8509 unlinked PSMs and 585 crosslinked PSMs with a q value <= 0.05 
Using Kojak/TPP v5.2.1-dev Flammagenitus, Build 202001280920-8004, I get 0 unlinked PSMs and 564 crosslinked PSMs with a PeptideProphet probability >= 0.95.
Using Kojak/TPP v5.2.0 Flammagenitus, Build 202001311412-exported, I also get 0 unlinked PSMs and 564 crosslinked PSMs with a PeptideProphet probability >= 0.95.

If I look at the raw data for a specific unlinked PSM, I find things like this in the Kojak pep.xml file

<spectrum_query spectrum="UWPRLumos_2020_0124_AZ_012_AZ866_xlink05.24437.24437.3" start_scan="24437" end_scan="24437" precursor_neutral_mass="1782.949616" assumed_charge="3" index="25666" retention_time_sec="91.8">
<search_result>
<search_hit hit_rank="1" peptide="KPIDYTILDDIGHGVK" peptide_prev_aa="R" peptide_next_aa="V" protein="hAbi2_1-158" protein_link_pos_a="139" num_tot_proteins="1" calc_neutral_pep_mass="1782.951515" massdiff="0.001899" xlink_type="na" num_tol_term="2" num_missed_cleavages="0">
<search_score name="kojak_score" value="5.9750"/>
<search_score name="delta_score" value="5.0900"/>
<search_score name="ppm_error" value="1.0650"/>
<search_score name="e-value" value="3.950e-13"/>
<search_score name="ion_match" value="42"/>
<search_score name="consecutive_ion_match" value="13"/>
<analysis_result analysis="peptideprophet">
<peptideprophet_result probability="0" all_ntt_prob="(0,0,0)" analysis="incomplete">
<search_score_summary>
<parameter name="fval" value="7.7950"/>
<parameter name="ntt" value="2"/>
<parameter name="nmc" value="0"/>
<parameter name="massd" value="1.065"/>
<parameter name="isomassd" value="0"/>
</search_score_summary>
</peptideprophet_result>

So scan 24437, identifying KPIDYTILDDIGHGVK, gets an e-value of 3.950e-13 from Kojak, which is good, but gets a probability of 0 from PeptideProphet (see below).

A Comet/Percolator analysis shows a Comet e-value of 1.67E-14 and a Percolator q-value of 0.00009243 for the same PSM (both very good).

A Kojak/Percolator analysis shows a Kojak e-value of 3.950e-13 and a Percolator q-value of 0.000001555 for the same PSM. Again, both very good.

Looking at interact.pep.xml I see:

<spectrum_query spectrum="UWPRLumos_2020_0124_AZ_012_AZ866_xlink05.24437.24437.3" start_scan="24437" end_scan="24437" precursor_neutral_mass="1782.949616" assumed_charge="3" index="25666" retention_time_sec="91.8">
<search_result>
<search_hit hit_rank="1" peptide="KPIDYTILDDIGHGVK" peptide_prev_aa="R" peptide_next_aa="V" protein="hAbi2_1-158" protein_link_pos_a="139" num_tot_proteins="1" calc_neutral_pep_mass="1782.951515" massdiff="0.001899" xlink_type="na" num_tol_term="2" num_missed_cleavages="0">
<search_score name="kojak_score" value="5.9750"/>
<search_score name="delta_score" value="5.0900"/>
<search_score name="ppm_error" value="1.0650"/>
<search_score name="e-value" value="3.950e-13"/>
<search_score name="ion_match" value="42"/>
<search_score name="consecutive_ion_match" value="13"/>
<analysis_result analysis="peptideprophet">
<peptideprophet_result probability="0" all_ntt_prob="(0,0,0)" analysis="incomplete">
<search_score_summary>
<parameter name="fval" value="7.7950"/>
<parameter name="ntt" value="2"/>
<parameter name="nmc" value="0"/>
<parameter name="massd" value="1.065"/>
<parameter name="isomassd" value="0"/>
</search_score_summary>
</peptideprophet_result>
</analysis_result>
</search_hit>
</search_result>
</spectrum_query>
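
To check how widespread this pattern is across a whole file, a small script along the following lines could tally top-ranked hits by their xlink_type attribute and count how many of them carry a PeptideProphet probability of 0. This is only a sketch: the function name count_zeroed_hits is made up for illustration, and it assumes every hit of interest has the hit_rank, xlink_type, and peptideprophet_result probability attributes shown in the excerpt above.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local(tag):
    """Strip any XML namespace prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def count_zeroed_hits(path):
    """Tally rank-1 search hits per xlink_type, and those with probability 0.

    Hypothetical helper, not part of TPP or Kojak.
    """
    total, zeroed = Counter(), Counter()
    # Stream the file so large interact.pep.xml files do not need to fit in memory.
    for _, elem in ET.iterparse(path, events=("end",)):
        if local(elem.tag) != "search_hit":
            continue
        if elem.get("hit_rank") == "1":
            xtype = elem.get("xlink_type", "unknown")
            total[xtype] += 1
            # Look for the PeptideProphet result nested inside this hit.
            for child in elem.iter():
                if local(child.tag) == "peptideprophet_result":
                    if float(child.get("probability", "nan")) == 0.0:
                        zeroed[xtype] += 1
                    break
        elem.clear()  # free memory held by the finished hit
    return total, zeroed
```

Calling count_zeroed_hits("interact.pep.xml") returns two counters. If the count for "na" in the second counter equals the count for "na" in the first, every unlinked hit has been zeroed, as described here.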

So I am confused about why PeptideProphet is assigning a probability of 0 to all unlinked peptides in these data. I tried the latest stable version of TPP as well as the latest daily build, and I see the same results in each. Any ideas about this would be very much appreciated. Thanks very much in advance!
Alex

David Shteynberg

Feb 3, 2020, 4:49:23 PM
to spctools-discuss
Hi Alex,

The automated model invalidator seems to be getting triggered by this dataset. I will adjust the settings for Kojak analysis, as this should not be happening on this data. In the meantime, you can override this problem by using the PeptideProphetParser option FORCEDISTR (possibly combined with IGNORECHG= for the charge states you want to exclude). In xinteract, the corresponding options are named differently: -OF and -I.
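
As a rough illustration of the xinteract route, the argument list could be assembled as in the sketch below. The helper name build_xinteract_args and the file names are hypothetical. The -N output-name flag and the form of -I with the charge state appended directly (for example -I1) are assumptions based on the usual xinteract usage text, so please check them against the usage message of your installed build.

```python
def build_xinteract_args(pepxml_files, output="interact.pep.xml", ignore_charges=()):
    """Assemble an xinteract command line that forces the mixture model fit.

    Hypothetical helper: only -OF and -I come from this thread; -N and the
    -I<charge> spelling are assumptions to verify locally.
    """
    args = ["xinteract", f"-N{output}", "-OF"]  # -OF: xinteract form of FORCEDISTR
    args += [f"-I{z}" for z in ignore_charges]  # -I: xinteract form of IGNORECHG=
    args += list(pepxml_files)                  # input pepXML files go last
    return args
```

The returned list can be passed to subprocess.run(args, check=True) on a machine where xinteract is on the PATH, for example with build_xinteract_args(["sample.pep.xml"], ignore_charges=[1]).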

Let me know if you have any questions.

Cheers,
-David

Alex Zelter

Feb 5, 2020, 11:32:02 AM
to spctools-discuss

Hi David,
Using -OF and -I options with xinteract solved the problem for me. Thanks very much for your help!
Alex 