Thanks, Jimmy, for your insights. I will definitely check with the TPP folks. We have a homegrown discriminant score formula for Sequest result PTM modeling, so calculating a discriminant score for Comet-ms would make more sense. It's just that when I read your paper, it seemed to me that Xcorr scores are calculated for backward compatibility (maybe I read it wrong). Shouldn't the expect value be a more straightforward measure?
Some other questions about Comet:
1. What's the best way to set up phospho searches?
2. Can we specify in the parameter file not to include terminal residue modifications of a given amino acid?
3. How can I set up a custom amino acid to represent, say, an acetylated aspartate?
David


Thanks, Jimmy!
David

If you look at a Comet-generated pep.xml file, there is a "massdiff" attribute on the "search_hit" element. The value of the "massdiff" attribute is (precursor_neutral_mass - calc_neutral_pep_mass), that is, the experimental mass minus the calculated peptide mass.

On Wed, Aug 27, 2014 at 9:54 AM, David Zhao <weizh...@gmail.com> wrote:
Is it the difference between the theoretical and experimental mass?
Thanks
David

On Tue, Aug 26, 2014 at 2:29 PM, David Zhao <weizh...@gmail.com> wrote:
Hi Jimmy,
I'm looking at CometDiscrimFunction.cxx to port the function to Java and Perl, and I have one question: what is the massdiff field in CometSearchResult? Where is it in the Comet result? BTW, is there documentation on Comet results somewhere?
Thanks,
David
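
For anyone porting this, the massdiff definition given in this thread (experimental minus calculated mass) can be checked against a pep.xml file with a few lines of Python. This is only a sketch: the XML fragment is hand-written with made-up numbers, the helper names `local` and `mass_diffs` are hypothetical, and it assumes `precursor_neutral_mass` sits on the enclosing `spectrum_query` element while `calc_neutral_pep_mass` and `massdiff` sit on `search_hit`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written pep.xml fragment; all numbers are made up.
SAMPLE = """<msms_run_summary>
  <spectrum_query spectrum="demo.00001.00001.2"
                  precursor_neutral_mass="1500.7500" assumed_charge="2">
    <search_result>
      <search_hit hit_rank="1" peptide="PEPTIDEK"
                  calc_neutral_pep_mass="1500.7400" massdiff="0.0100"/>
    </search_result>
  </spectrum_query>
</msms_run_summary>"""


def local(tag):
    """Strip an XML namespace prefix such as '{uri}search_hit'."""
    return tag.rsplit("}", 1)[-1]


def mass_diffs(xml_text):
    """Yield (peptide, recomputed massdiff, reported massdiff) per search hit."""
    root = ET.fromstring(xml_text)
    for query in root.iter():
        if local(query.tag) != "spectrum_query":
            continue
        experimental = float(query.get("precursor_neutral_mass"))
        for hit in query.iter():
            if local(hit.tag) != "search_hit":
                continue
            calculated = float(hit.get("calc_neutral_pep_mass"))
            reported = float(hit.get("massdiff"))
            # massdiff = experimental mass minus calculated peptide mass
            yield hit.get("peptide"), experimental - calculated, reported


for peptide, recomputed, reported in mass_diffs(SAMPLE):
    print(f"{peptide}: recomputed={recomputed:.4f} reported={reported:.4f}")
```

Matching on the local tag name rather than a fixed namespace keeps the sketch usable whether or not the file declares the pepXML namespace.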
Hi Jimmy,
I've done some comparisons of Sequest and Comet results lately, and have some observations I'd like to share with you to see what your take is on them:
1. I ported the discriminant score formula to Perl and Java and used it to calculate discriminant scores for PTM. Below is the plot of our XcorrNorm score (normalized Xcorr) vs. discriminant score by charge. It seems that the discriminant score penalizes higher charge state hits? I can see that from the code as well. But the expect value correlates better with our XcorrNorm score:
2. If I use the recommended parameter settings from the Comet web site for our Velos samples, mainly 20 ppm with mono/mono parent and fragment ion masses, I get far fewer hits. As you know, peptides (proteins) in our samples are pulled down by our activity-based probes, and given the way we run the samples on our instrument, I found that using average parent mass and monoisotopic fragment mass with a 2 Da tolerance gives me better results. Do you think this makes sense? And should I set "isotope_error" to 1 in this case? I currently set it to 0.
Does using C ions in the search make any difference? We use them in Sequest, but it seems they are not recommended in Comet.
3. And the million-dollar question: if I need to get as many hits as possible, what would be the best settings? Or which settings will make the biggest impact?
Thanks a lot!
See my replies inline below.
On Thursday, September 11, 2014 10:51:47 AM UTC-7, David Zhao wrote:
Hi Jimmy,
I've done some comparisons of Sequest and Comet results lately, and have some observations I'd like to share with you to see what your take is on them:
1. I ported the discriminant score formula to Perl and Java and used it to calculate discriminant scores for PTM. Below is the plot of our XcorrNorm score (normalized Xcorr) vs. discriminant score by charge. It seems that the discriminant score penalizes higher charge state hits? I can see that from the code as well. But the expect value correlates better with our XcorrNorm score:

I don't know if I can add any useful comment here; I suggest you use what works for you. The discriminant scores in PeptideProphet don't make use of XcorrNorm, and that tool analyzes each charge state separately, so it's not obvious to me whether your discriminant analysis vs. XcorrNorm plot above demonstrates a good or bad outcome. I also don't know the behavior of XcorrNorm myself, so I don't know how to interpret something correlating well with it.
2. If I use the recommended parameter settings from the Comet web site for our Velos samples, mainly 20 ppm with mono/mono parent and fragment ion masses, I get far fewer hits. As you know, peptides (proteins) in our samples are pulled down by our activity-based probes, and given the way we run the samples on our instrument, I found that using average parent mass and monoisotopic fragment mass with a 2 Da tolerance gives me better results. Do you think this makes sense? And should I set "isotope_error" to 1 in this case? I currently set it to 0.

What recommended parameter setting are you using with your Velos sample? A pure Velos instrument should use comet.params.low-low because there is no high-res data in either the MS or MS/MS scans. If it's really an Orbi-Velos instrument, then I would be surprised that a 2 Da, average mass setting works better than the 20 ppm mono mass setting with "isotope_error = 1".
With a 2 Da tolerance, set "isotope_error = 0".
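
To illustrate why the isotope setting matters at 20 ppm but not at 2 Da, here is a rough Python sketch. It is not Comet's code: the function names are hypothetical, the C13 spacing is the usual approximate value, and the set of isotope offsets (0 to 3) is an assumption chosen for illustration, so check the Comet parameter documentation for what "isotope_error = 1" actually covers in your version.

```python
C13_SHIFT = 1.00335  # approximate 13C - 12C mass difference in Da


def ppm_to_da(mass, ppm):
    """Convert a ppm tolerance into an absolute tolerance in Da at the given mass."""
    return mass * ppm / 1e6


def matches(exp_mass, calc_mass, tol_da, isotope_offsets=(0,)):
    """True if exp_mass is within tol_da of calc_mass for any allowed isotope offset."""
    return any(
        abs(exp_mass - n * C13_SHIFT - calc_mass) <= tol_da
        for n in isotope_offsets
    )


calc = 2000.0                       # made-up calculated peptide mass
exp = calc + C13_SHIFT + 0.01       # precursor picked on the first C13 peak, 5 ppm off

tol_20ppm = ppm_to_da(calc, 20)     # 0.04 Da at 2000 Da

narrow_no_iso = matches(exp, calc, tol_20ppm)
narrow_with_iso = matches(exp, calc, tol_20ppm, isotope_offsets=(0, 1, 2, 3))
wide_no_iso = matches(exp, calc, 2.0)

print(narrow_no_iso, narrow_with_iso, wide_no_iso)
```

With a narrow window, a precursor assigned to the first C13 peak is missed unless isotope offsets are considered. A 2 Da window already spans that roughly 1 Da shift, which is consistent with the advice to leave the isotope setting at 0 there.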
Does using C ions in the search make any difference? We use them in Sequest, but it seems they are not recommended in Comet.

I don't use it, but that doesn't mean it couldn't make a small positive difference. Search a couple of datasets using a target-decoy search with and without specifying C-ions and see what gives you more IDs at a given FDR. It would be helpful if you reported your findings back here.
3. And the million-dollar question: if I need to get as many hits as possible, what would be the best settings? Or which settings will make the biggest impact?
Thanks a lot!

That is a million-dollar question that I can't answer, and I'm sure there's no one right answer for all use cases. I already have suggestions for best parameter settings for various combinations of low-res and high-res spectra. But how you process your data post-search has an impact as well.

If I had the time, I'd vary:
- precursor tolerances
- fragment tolerances
- full vs. semi-digest
- likely always use mono masses
- compare performance of all combinations above using plain target-decoy FDR analysis based on the E-value
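
The plain target-decoy comparison suggested above could be sketched roughly as follows in Python. This is an assumption-laden illustration, not a TPP or Comet tool: the function name `ids_at_fdr`, the (E-value, is_decoy) input format, the simple decoys/targets FDR estimate, and the two tiny result lists are all hypothetical, and ties in E-value are not treated specially.

```python
def ids_at_fdr(psms, max_fdr=0.01):
    """Count target PSMs at the loosest E-value cutoff with estimated FDR <= max_fdr.

    psms: iterable of (expect_value, is_decoy) pairs.
    FDR is estimated as (number of decoys) / (number of targets) above the cutoff.
    """
    targets = decoys = best = 0
    # Lower E-values are better, so walk from the most to the least confident hit.
    for _evalue, is_decoy in sorted(psms, key=lambda p: p[0]):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            best = targets
    return best


# Made-up results for two hypothetical parameter settings.
runs = {
    "setting_a": [(1e-5, False), (1e-4, False), (1e-3, False),
                  (5e-3, True), (1e-2, False), (0.1, True)],
    "setting_b": [(1e-5, False), (1e-4, True), (1e-3, False),
                  (1e-2, False), (0.1, True)],
}

# A loose 25% threshold is used only because the toy lists are so short.
counts = {name: ids_at_fdr(psms, max_fdr=0.25) for name, psms in runs.items()}
for name, n in counts.items():
    print(name, n)
```

The same function can be run once per combination of precursor tolerance, fragment tolerance, and digest setting (or with and without C-ions), keeping the FDR threshold fixed so that the ID counts are comparable.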