Seeking help regarding peptide prophet and decoy mode

Debojyoti Pal

May 12, 2024, 5:38:01 AM

to spctools-discuss

Hello everyone

Proteomics newbie here. I am trying to use TPP and PeptideProphet, but I am really unable to understand the options and outputs. I am seeking help from the experienced members.

Which options do I need to activate to estimate FDR via the target-decoy mode? I am currently generating decoys through Comet and activating the "use decoy hits to pin down" option, the "known protein names begin with" option, and the "use non parametric model" option. What does the option "report decoy hits with computed probability" do? And the other options too?

How is this decoy-based FDR calculated? I am not asking about the principle behind it, but I can't see a table for it. And what is the FDR after discard?

The data in this table is not for decoy-based FDR, right? This is the FDR based on the PeptideProphet statistical model, correct?

I would be highly obliged if anyone could help me out.

Thanks

Debojyoti

PhD Student

David Shteynberg

May 12, 2024, 1:24:18 PM

to spctools...@googlegroups.com

Dear Debojyoti,

Welcome to the world of TPP!

If you have a database of targets you can use the following tool in TPP to generate random (repeat preserving deBruijn) decoys for your targets:

With the default options, this tool will create two independent sets of decoys for each of your targets, prefixed by DECOY0 and DECOY1.

After you search the data you can analyze it with PeptideProphet in many different ways. I would suggest you try with the following options to start:

This will enable PeptideProphet to use DECOY0 hits as model-decoys and DECOY1 hits as validation-decoys.

With these settings, the table on the models page will contain model-based error estimates from the model trained with DECOY0 ("known" decoys).
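To illustrate what a model-based error estimate means here, the following is a minimal sketch, assuming the usual reading of a PSM probability p as the chance that the match is correct, so that each accepted PSM contributes an expected 1 - p incorrect identifications. The function name `model_based_fdr` and the example probabilities are made up for illustration; this is not TPP code and may differ in detail from how the table on the models page is produced.

```python
def model_based_fdr(probabilities, min_prob):
    """Expected fraction of incorrect PSMs among those with p >= min_prob.

    Assumption: each probability is the model's estimate that the PSM is
    correct, so (1 - p) is its expected contribution to the error count.
    """
    accepted = [p for p in probabilities if p >= min_prob]
    if not accepted:
        return 0.0
    expected_errors = sum(1.0 - p for p in accepted)
    return expected_errors / len(accepted)


# Hypothetical PSM probabilities, not taken from any real run.
probs = [0.99, 0.95, 0.90, 0.60, 0.20]
print(round(model_based_fdr(probs, 0.9), 4))  # 0.0533
```

No decoy labels enter this calculation, which is why it is called model-based: the decoys only influence it indirectly, through the probabilities the trained model assigns.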

As part of the run with these settings DECOY1 will be used to validate the PeptideProphet model using the "unknown" decoys. This will be displayed on the models page "Models Charts" tab near the bottom, for example:

The chart on the right shows both the "DECOY" (DECOY1 "unknown") ROC curve and the "PREDICTED" (DECOY0 "known" model-based) ROC curve. The chart on the left compares the model-based error estimates to the unknown/validation decoy-based error. If you want to evaluate a model using different decoy settings, you can run the ProphetModels.pl decoy validation tool on the following page:

On this page, set the decoy proteins to the PeptideProphet "model unknown" ones and the excluded decoys to the PeptideProphet "model known" ones (if any), as follows:

Hopefully this helps you process your dataset. Let us know if you have additional questions.
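The idea of validating with "unknown" decoys can be sketched as a generic target-decoy count. This is a minimal illustration, assuming PSMs are available as (protein name, probability) pairs, that the DECOY1 set is the same size as the target set (the hypothetical `decoy_ratio` parameter covers other ratios), and that DECOY0 hits are excluded because the model was trained on them. It is not the ProphetModels.pl implementation, whose exact calculation may differ.

```python
def decoy_based_fdr(psms, min_prob, validation_prefix="DECOY1",
                    excluded_prefix="DECOY0", decoy_ratio=1.0):
    """Estimate the FDR among target PSMs from validation-decoy counts.

    psms: iterable of (protein_name, probability) pairs.
    decoy_ratio: size of the validation decoy set relative to the targets
    (assumed 1.0, i.e. one DECOY1 entry per target).
    """
    targets = 0
    validation_decoys = 0
    for protein, prob in psms:
        # Skip low-probability PSMs and the decoys the model was trained on.
        if prob < min_prob or protein.startswith(excluded_prefix):
            continue
        if protein.startswith(validation_prefix):
            validation_decoys += 1
        else:
            targets += 1
    if targets == 0:
        return 0.0
    # Each validation decoy hit stands in for one expected false target hit.
    return min(1.0, (validation_decoys / decoy_ratio) / targets)


# Hypothetical PSMs; the protein names are invented for illustration.
example = [
    ("sp|P1", 0.99), ("sp|P2", 0.95), ("sp|P6", 0.97), ("sp|P7", 0.93),
    ("DECOY1_sp|P3", 0.92), ("DECOY0_sp|P4", 0.91), ("sp|P5", 0.50),
]
print(decoy_based_fdr(example, 0.9))  # 0.25
```

If the model is well calibrated, this decoy-based value and the model-based estimate at the same probability threshold should be close, which is what the two curves on the charts compare.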

Cheers!

-David

To view this discussion on the web visit https://groups.google.com/d/msgid/spctools-discuss/f3a8af0d-d40d-4ce9-9a99-5a250c5345c9n%40googlegroups.com.

Debojyoti Pal

May 14, 2024, 5:08:00 AM

to spctools-discuss

Thank you, Dr. David! That is a really great help.

I have a couple of follow-up queries:

1) If I use iProphet on these results, does it use the decoy-based FDR estimates or only the PeptideProphet statistical model estimates?

2) If I combine multiple PeptideProphet outputs (from fractions of the same digest) in iProphet, how does that affect individual PSMs? For example, if a PSM is found in multiple fractions, does it get converted to a single PSM in the iProphet output? (I just want to make the output compatible with MSStatsTMT; see https://groups.google.com/g/msstats/c/aINhWMKt2Co) While MSStatsTMT has converters for other software such as MaxQuant and PD, I rather like TPP, so I am trying to establish a proper workflow for my fractionated TMT data.

Thanks again,

Debojyoti

Debojyoti Pal

May 14, 2024, 5:09:50 AM

to spctools-discuss

One more question. I am using Comet prior to the PeptideProphet steps. Comet is already generating a decoy database (concatenated search). Do I need to turn the decoy search off in Comet?

David Shteynberg

May 14, 2024, 8:51:03 PM

to spctools...@googlegroups.com

Here is the best reference describing iProphet in detail: https://pubmed.ncbi.nlm.nih.gov/21876204/

iProphet starts with PeptideProphet probabilities at the "PSM-level" (always model-based) and computes the probabilities at the "peptide-level." It always produces one result per spectrum (whether or not the spectrum was searched by different engines). The probability of each peptide sequence identified by iProphet should be taken as the maximum over all PSMs matching that peptide sequence. I do not use MSStatsTMT, so I cannot help you there. BTW, the TPP tool for TMT and other isobaric quantification approaches is called Libra.

To answer your Comet decoy question: if you are using TPP's decoy generator, then you likely don't need the additional decoys provided by Comet, so I would recommend turning those OFF.
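The peptide-level rule described above (take the maximum probability over all PSMs matching a peptide sequence) can be sketched as follows. This is a minimal illustration that assumes the iProphet results have already been read into (peptide sequence, probability) pairs; the function name and the example sequences are hypothetical, and parsing the actual iProphet output is not shown.

```python
def peptide_level_probabilities(psms):
    """Map each peptide sequence to the maximum probability among its PSMs.

    psms: iterable of (peptide_sequence, probability) pairs, for example
    collected across all fractions of the same digest.
    """
    best = {}
    for peptide, prob in psms:
        # Keep only the highest-probability PSM seen so far for this peptide.
        best[peptide] = max(best.get(peptide, 0.0), prob)
    return best


# Hypothetical PSMs: the first peptide was identified in two fractions.
psms = [("PEPTIDEK", 0.80), ("LSEQK", 0.60), ("PEPTIDEK", 0.97)]
print(peptide_level_probabilities(psms))
# {'PEPTIDEK': 0.97, 'LSEQK': 0.6}
```

Note that this only assigns one probability per peptide sequence. The individual PSMs remain separate entries, one per spectrum, so any collapsing of repeated identifications for downstream quantification would be a separate step.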

Best of luck!

-David

