I'm running into an issue with the TPP/Kojak pipeline. I am running a crosslinking analysis using Kojak. As you know, Kojak also reports unlinked PSMs. I'm finding that PeptideProphet is filtering out ALL the unlinked PSMs in my analysis.
If I process the dataset with Kojak/Percolator, I get 8509 unlinked PSMs and 585 crosslinked PSMs with a q-value <= 0.05.
Using Kojak/TPP v5.2.1-dev Flammagenitus, Build 202001280920-8004, I get 0 unlinked PSMs and 564 crosslinked PSMs with a PeptideProphet probability >= 0.95.
Using Kojak/TPP v5.2.0 Flammagenitus, Build 202001311412-exported, I also get 0 unlinked PSMs and 564 crosslinked PSMs with a PeptideProphet probability >= 0.95.
If I look at the raw data for a specific unlinked PSM, I find things like this in the Kojak pep.xml file:
<spectrum_query spectrum="UWPRLumos_2020_0124_AZ_012_AZ866_xlink05.24437.24437.3" start_scan="24437" end_scan="24437" precursor_neutral_mass="1782.949616" assumed_charge="3" index="25666" retention_time_sec="91.8">
<search_result>
<search_hit hit_rank="1" peptide="KPIDYTILDDIGHGVK" peptide_prev_aa="R" peptide_next_aa="V" protein="hAbi2_1-158" protein_link_pos_a="139" num_tot_proteins="1" calc_neutral_pep_mass="1782.951515" massdiff="0.001899" xlink_type="na" num_tol_term="2" num_missed_cleavages="0">
<search_score name="kojak_score" value="5.9750"/>
<search_score name="delta_score" value="5.0900"/>
<search_score name="ppm_error" value="1.0650"/>
<search_score name="e-value" value="3.950e-13"/>
<search_score name="ion_match" value="42"/>
<search_score name="consecutive_ion_match" value="13"/>
<analysis_result analysis="peptideprophet">
<peptideprophet_result probability="0" all_ntt_prob="(0,0,0)" analysis="incomplete">
<search_score_summary>
<parameter name="fval" value="7.7950"/>
<parameter name="ntt" value="2"/>
<parameter name="nmc" value="0"/>
<parameter name="massd" value="1.065"/>
<parameter name="isomassd" value="0"/>
</search_score_summary>
</peptideprophet_result>
So scan 24437, identifying KPIDYTILDDIGHGVK, gets an e-value of 3.950e-13 from Kojak, which is good, but gets a probability of 0 from PeptideProphet (see below).
A Comet/Percolator analysis shows a Comet e-value of 1.67E-14 and a Percolator q-value of 0.00009243 for the same PSM (both very good).
A Kojak/Percolator analysis shows a Kojak e-value of 3.950e-13 and a Percolator q-value of 0.000001555 for the same PSM. Again, both very good.
Looking at interact.pep.xml I see:
<spectrum_query spectrum="UWPRLumos_2020_0124_AZ_012_AZ866_xlink05.24437.24437.3" start_scan="24437" end_scan="24437" precursor_neutral_mass="1782.949616" assumed_charge="3" index="25666" retention_time_sec="91.8">
<search_result>
<search_hit hit_rank="1" peptide="KPIDYTILDDIGHGVK" peptide_prev_aa="R" peptide_next_aa="V" protein="hAbi2_1-158" protein_link_pos_a="139" num_tot_proteins="1" calc_neutral_pep_mass="1782.951515" massdiff="0.001899" xlink_type="na" num_tol_term="2" num_missed_cleavages="0">
<search_score name="kojak_score" value="5.9750"/>
<search_score name="delta_score" value="5.0900"/>
<search_score name="ppm_error" value="1.0650"/>
<search_score name="e-value" value="3.950e-13"/>
<search_score name="ion_match" value="42"/>
<search_score name="consecutive_ion_match" value="13"/>
<analysis_result analysis="peptideprophet">
<peptideprophet_result probability="0" all_ntt_prob="(0,0,0)" analysis="incomplete">
<search_score_summary>
<parameter name="fval" value="7.7950"/>
<parameter name="ntt" value="2"/>
<parameter name="nmc" value="0"/>
<parameter name="massd" value="1.065"/>
<parameter name="isomassd" value="0"/>
</search_score_summary>
</peptideprophet_result>
</analysis_result>
</search_hit>
</search_result>
</spectrum_query>
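For anyone who wants to reproduce the counts above without a viewer, here is a minimal Python sketch that tallies rank-1 hits per `xlink_type`, both in total and at a PeptideProphet probability cutoff. It assumes only the element and attribute names visible in the excerpts above (`search_hit`, `hit_rank`, `xlink_type`, `peptideprophet_result`, `probability`). The function name `tally_by_xlink_type` and the 0.95 default are my own choices for illustration, not part of TPP or Kojak.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Drop any '{namespace}' prefix so tags match with or without xmlns."""
    return tag.rsplit("}", 1)[-1]


def tally_by_xlink_type(source, min_prob=0.95):
    """Count rank-1 search hits per xlink_type.

    `source` is a file path or binary file object with pepXML-like content.
    Returns (total, passing): Counters keyed by xlink_type, where `passing`
    only counts hits whose PeptideProphet probability is >= min_prob.
    """
    total, passing = Counter(), Counter()
    # Stream the file so large interact.pep.xml files are not fully loaded.
    for _, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "search_hit":
            continue
        if elem.get("hit_rank") == "1":
            xlink = elem.get("xlink_type", "unknown")
            total[xlink] += 1
            # Use the first peptideprophet_result found under this hit.
            for child in elem.iter():
                if local_name(child.tag) == "peptideprophet_result":
                    if float(child.get("probability", "0")) >= min_prob:
                        passing[xlink] += 1
                    break
        elem.clear()  # free the memory held by the processed hit
    return total, passing
```

Calling `tally_by_xlink_type("interact.pep.xml")` should then show, per `xlink_type`, how many hits exist and how many survive the cutoff. In my case I would expect the `na` (unlinked) entry to have a large total and zero passing.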
So I am confused about why PeptideProphet is assigning a probability of 0 to all unlinked peptides in these data. I tried the latest stable version of TPP as well as the latest daily build, and I see the same results in each. Any ideas about this would be very much appreciated. Thanks very much in advance!
Alex