Seeking help regarding peptide prophet and decoy mode


Debojyoti Pal

May 12, 2024, 5:38:01 AM
to spctools-discuss
Hello everyone

Proteomics newbie here. I am trying to use TPP and PeptideProphet, but I am having trouble understanding the options and outputs, so I am seeking help from the experienced members.

peptideprophet.PNG

Which options do I need to activate to estimate FDR via the target-decoy mode? I am currently generating decoys through Comet and activating the "use decoy hits to pin down" option, the "known protein names begin with" option, and the "use non parametric model" option. What does the option "report decoy hits with computed probability" do? And what about the other options?

results.PNG

How is this decoy-based FDR calculated? I am not asking about the principle behind it, but I can't see a table for it. And what is the FDR after discard?

table.PNG

The data in this table is not for the decoy-based FDR, right? This is the FDR based on the PeptideProphet statistical model, correct?

I would be highly obliged if anyone could help me out. 

Thanks
Debojyoti 
PhD Student

David Shteynberg

May 12, 2024, 1:24:18 PM
to spctools...@googlegroups.com
Dear Debojyoti,

Welcome to the world of TPP!

If you have a database of targets you can use the following tool in TPP to generate random (repeat preserving deBruijn) decoys for your targets:

image.png

With the default options, this tool will create two independent sets of decoys for each of your targets, prefixed by DECOY0 and DECOY1.

After you search the data you can analyze it with PeptideProphet in many different ways.  I would suggest you try with the following options to start:

image.png

This will enable PeptideProphet to use DECOY0 hits as model-decoys and DECOY1 hits as validation-decoys.

With these settings, the table on the models page will contain model-based error estimates based on the model trained with DECOY0 ("known" decoys).
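To make "model-based error estimation" concrete, here is a minimal Python sketch of the general idea: each PSM has a probability p of being correct, so the expected number of incorrect PSMs above a threshold is the sum of (1 - p), and the estimated error rate is that sum divided by the number of accepted PSMs. The function name and the sample probabilities are hypothetical, and this is only an illustration of the principle, not the TPP code that fills the table.

```python
def model_based_fdr(probabilities, min_prob):
    """Estimate the error rate among PSMs with probability >= min_prob.

    Illustrative only: the expected number of wrong PSMs is the sum of
    (1 - p) over the accepted PSMs.
    """
    accepted = [p for p in probabilities if p >= min_prob]
    if not accepted:
        return 0.0, 0
    expected_wrong = sum(1.0 - p for p in accepted)
    return expected_wrong / len(accepted), len(accepted)


# Hypothetical PSM probabilities, not taken from any real run.
probs = [0.99, 0.95, 0.90, 0.50, 0.10]
fdr, n_accepted = model_based_fdr(probs, min_prob=0.90)
print(f"accepted={n_accepted} estimated_error={fdr:.4f}")
```

Raising the threshold lowers the estimated error but also the number of accepted PSMs, which is the trade-off the table summarizes.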

As part of the run with these settings DECOY1 will be used to validate the PeptideProphet model using the "unknown" decoys.  This will be displayed on the models page "Models Charts" tab near the bottom, for example:
image.png

The chart on the right shows both the "DECOY" (DECOY1 "unknown") ROC curve and the "PREDICTED" (DECOY0 "known" model-based) ROC curve.  The error estimates comparing the model-based error to the unknown/validation decoy-based error are on the chart on the left.  If you want to evaluate a model using different decoy settings, you can run the ProphetModels.pl decoy validation tool on the following page:
image.png

On this page, set the decoy proteins to the PeptideProphet "model unknown" ones and the excluded decoys to the PeptideProphet "model known" ones (if any), as follows:
image.png
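As a rough illustration of what a decoy-based validation count looks like, the Python sketch below drops hits to the excluded ("model known", DECOY0) decoys, counts hits to the validation ("model unknown", DECOY1) decoys, and divides by the number of target hits above a probability threshold. The function name, the `decoy_ratio` parameter, and the sample protein names are hypothetical. The sketch assumes the DECOY1 set is the same size as the target set, so one DECOY1 hit stands for roughly one incorrect target hit. How ProphetModels.pl computes its estimate internally is not described in this thread.

```python
def decoy_based_fdr(hits, min_prob, decoy_prefix="DECOY1",
                    excluded_prefix="DECOY0", decoy_ratio=1.0):
    """Estimate FDR among target hits from validation-decoy counts.

    hits: iterable of (protein_name, probability) pairs.
    decoy_ratio: assumed ratio of target-database size to validation-decoy
    size (1.0 when each decoy set mirrors the targets one-to-one).
    """
    n_target = 0
    n_decoy = 0
    for protein, prob in hits:
        if prob < min_prob:
            continue
        if protein.startswith(excluded_prefix):
            continue  # "known" decoys were used to train the model; skip them
        if protein.startswith(decoy_prefix):
            n_decoy += 1
        else:
            n_target += 1
    if n_target == 0:
        return 0.0
    return min(1.0, decoy_ratio * n_decoy / n_target)


# Hypothetical hits; the protein names are made up for illustration.
hits = [
    ("sp|P1", 0.99), ("sp|P2", 0.95), ("DECOY1_sp|P3", 0.92),
    ("DECOY0_sp|P4", 0.91), ("sp|P5", 0.90), ("sp|P6", 0.30),
    ("sp|P7", 0.97), ("sp|P8", 0.96),
]
print(decoy_based_fdr(hits, min_prob=0.90))
```

If this decoy-based number tracks the model-based estimate at the same threshold, the model is behaving as expected; a large gap suggests the model should be re-examined.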

Hopefully this helps you process your dataset.  Let us know if you have additional questions.

Cheers!
-David




Debojyoti Pal

May 14, 2024, 5:08:00 AM
to spctools-discuss
Thank you Dr David! That is really great help.

I have a couple of follow-up queries:

1) If I use iProphet on these results, does it now use decoy-based FDR estimates or only the PeptideProphet statistical model estimates?

2) If I combine multiple PeptideProphet outputs (from fractions of the same digest) in iProphet, how does that affect individual PSMs? For example, if a PSM is found in multiple fractions, does it get converted to a single PSM in the iProphet output? (I just want to make the output compatible with MSStatsTMT; see https://groups.google.com/g/msstats/c/aINhWMKt2Co.) While MSStatsTMT has converters for other software such as MaxQuant and PD, I rather like TPP, so I am trying to establish a proper workflow for my fractionated TMT data.

Thanks again,
Debojyoti

Debojyoti Pal

May 14, 2024, 5:09:50 AM
to spctools-discuss
One more question. I am using Comet prior to the PeptideProphet steps. Comet is already generating a decoy database (concatenated search). Do I need to turn the decoy search off in Comet?

David Shteynberg

May 14, 2024, 8:51:03 PM
to spctools...@googlegroups.com

Here is the best reference describing iProphet in detail: https://pubmed.ncbi.nlm.nih.gov/21876204/

iProphet starts with PeptideProphet probabilities at the "PSM-level" (always model-based) and computes the probabilities at the "peptide-level."  It always produces one result per spectrum, whether or not the spectrum was searched by different engines.  The probability of each peptide sequence identified by iProphet should be taken as the maximum over all PSMs matching that peptide sequence.  I do not use MSStatsTMT, so I cannot help you there.  BTW, the TPP tool for TMT and other isobaric quantification approaches is called Libra.
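The "maximum over all PSMs" rule can be sketched in a few lines of Python. This assumes the PSMs have already been read from the iProphet output into (peptide sequence, probability) pairs; the function name and the sample values are hypothetical, and no TPP file parsing is shown.

```python
def peptide_probabilities(psms):
    """Collapse PSM-level results to one probability per peptide sequence.

    psms: iterable of (peptide_sequence, probability) pairs.
    Returns a dict mapping each peptide to its best PSM probability.
    """
    best = {}
    for peptide, prob in psms:
        best[peptide] = max(prob, best.get(peptide, 0.0))
    return best


# Hypothetical PSMs: the same peptide seen in two fractions.
psms = [("PEPTIDEK", 0.80), ("PEPTIDEK", 0.95), ("ANOTHERR", 0.60)]
print(peptide_probabilities(psms))
```

Note that this collapses only the probabilities. The individual PSMs (one per spectrum) remain separate entries in the output, which matters if a downstream tool expects PSM-level rows.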

To answer your Comet decoy question: if you are using TPP's decoy generator, then you likely don't need the additional decoys provided by Comet, so I would recommend turning those OFF.

Best of luck!

-David
