Predict peptide detection on MS


Alex_PCB

Feb 19, 2010, 8:15:32 AM
to bioinformatica-proteomica
Hi all,
Someone sent me this question and I am posting it here so that we can
boost our bioinfo community and open up the discussion. My answer is
below.

I need to try to identify two isoforms in complex protein mixtures.
.... Can you give me the probability of finding the peptides by mass
spectrometry?


As far as I know, there are four programs that can give you the
"probability" of detecting a peptide (strictly speaking, these
programs compute a score rather than a true probability).
1) PeptideSieve is quite simple to use. You input the protein
sequence(s) you want predictions for, choose the minimum and maximum
peptide length (and mass) and the number of missed cleavages, and
finally pick the training data set (PAGE-ESI, MUDPIT-ESI, PAGE-MALDI)
that most resembles your experiment. In other words, the latter is the
data set the machine learning algorithm uses to predict the
"observability" of a peptide under a given experimental condition
(e.g., PAGE-ESI).
PeptideSieve has a very easy-to-use graphical interface for
Windows.
http://tools.proteomecenter.org/wiki/index.php?title=Software:PeptideSieve

2) STEPP is Java-based, has a graphical interface, and is really easy
to use. It applies a Support Vector Machine to compute the
"probability" of detecting a peptide. The algorithm was trained on a
PNNL data set (from LC-FTICR instruments) and uses 35 different
peptide properties (amino acid content, hydrophilicity, polarity,
etc.) to make its prediction.
http://omics.pnl.gov/software/STEPP.php

3) ESPPredictor is available on the GenePattern server from the Broad
Institute of MIT and Harvard (you can also get it as a stand-alone
program). Again, this algorithm was trained on their own data set
(from Orbitraps, QTRAPs, and FT instruments), which means it may not
predict peptides as well for your instrumentation.
http://genepattern.broadinstitute.org/gp/pages/index.jsf

4) APEX is a program for label-free quantification (spectral
counting). The interesting part is this: to calculate the 'amount' of
a given protein from the number of spectra found for that protein, the
program corrects for (takes into consideration) the "observability" of
all peptides of that protein. To do so, it calculates over 45 (as I
recall) different properties for each peptide and predicts, based on
all these properties, the "probability" of observing that peptide. The
nice thing is that you can train the algorithm 'on the fly' for your
instrument. All you need to do is input a collection of previous
experiments in which you identified several peptides for several
proteins.
Here is the reference for APEX (you can ask Edward Marcotte for the
Perl scripts):
http://www.nature.com/nbt/journal/v25/n1/abs/nbt1270.html
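
To make the inputs above more concrete, here is a minimal Python
sketch. It is not code from any of the four programs. The first part
lists candidate tryptic peptides using the kind of settings
PeptideSieve asks for (length limits and missed cleavages; the mass
filter is left out for brevity), assuming the usual trypsin rule of
cutting after K or R unless the next residue is P. The second part is
a simplified stand-in for the idea behind 4): divide the spectral
count by the summed detection scores of the protein's peptides. It is
not APEX's actual formula. All function names, the example sequence,
and the scores are hypothetical.

```python
def cleavage_fragments(protein):
    """Split a protein at tryptic sites: after K or R, but not before P."""
    fragments, start = [], 0
    for i, aa in enumerate(protein):
        is_last = i == len(protein) - 1
        if aa in "KR" and not is_last and protein[i + 1] != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    fragments.append(protein[start:])  # C-terminal fragment
    return fragments


def tryptic_peptides(protein, missed_cleavages=1, min_len=6, max_len=30):
    """List peptides with up to `missed_cleavages` skipped cut sites,
    keeping only those within the length limits."""
    frags = cleavage_fragments(protein)
    peptides = []
    for i in range(len(frags)):
        last = min(i + missed_cleavages, len(frags) - 1)
        for j in range(i, last + 1):
            peptide = "".join(frags[i:j + 1])
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
    return peptides


def corrected_spectral_count(spectral_count, peptide_scores):
    """Divide the observed spectral count by the summed detection scores
    (a simplified illustration, not the published APEX formula)."""
    expected = sum(peptide_scores)
    if expected <= 0:
        raise ValueError("summed detection scores must be positive")
    return spectral_count / expected


# Made-up sequence; in practice, use the sequences of your two isoforms.
protein = "AAAAAAKCCCCCCRPDDDDDDKEE"
candidates = tryptic_peptides(protein, missed_cleavages=1)

# Made-up scores standing in for the output of a detectability predictor.
scores = [0.9, 0.5, 0.1]
corrected = corrected_spectral_count(12, scores)
```

For two isoforms, you would run the digest on both sequences and keep
the peptides that occur in only one of them; those are the ones worth
scoring with the programs above.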

hope this can help

cheers
alex
