--
You received this message because you are subscribed to the Google Groups "spctools-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to spctools-discu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/spctools-discuss/e74da627-d489-453c-b480-96453fbdc6a5n%40googlegroups.com.
On Jul 25, 2024, at 7:22 AM, David Shteynberg <david.sh...@isbscience.org> wrote:

Please create a new comet params file for the Comet search in the new release, which has an updated Comet. You can do this from the Files tab in Petunia by creating a new file.
On Jul 25, 2024, at 11:23 PM, sudarshan kumar <kumarsu...@gmail.com> wrote:
Hi David,
Thank you for doing the analysis at your end.

1. I see that in the first image, "interact neggamma. pepxml", all the hits are assigned to the correct proteins. I know this because I was expecting these proteins from the sample.
2. I wonder why the model for the +2 charge state is fitting the observed ions to the negative model.
3. Your observation that "As you can see, even with the “optimized” analysis, at most we can identify about 100 PSMs and about 80 peptides, in this mzML file containing 4117 spectra, which are all 2+ charge" may be correct given the stringency of the wide set of statistical tools applied. But it would be interesting to see whether these 80 qualifying peptides belong to the same proteins as indicated in the raw Comet or Tandem search. My point is that the Comet search without Prophet control shows very correct hits.
4. I am not able to find a 2024 version of comet.param, even when I try to create a new one in TPP 7.1.0. If I copy an old version of comet.param (from 2023), the search aborts.
5. I am not able to do it through the command prompt either. Kindly send a copy (Notepad) of the 2024 version of the comet.param file to my email, which I can edit as needed.

Best regards,
Sud
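The question in point 3 (do the peptides that pass validation map to the same proteins as the raw search hits?) could be checked with a short script. The following is only a minimal sketch: it assumes the usual pepXML layout, where each `search_hit` carries `hit_rank` and `protein` attributes and PeptideProphet adds a nested `peptideprophet_result` element with a `probability` attribute. The function names and the tiny inline documents are hypothetical and stand in for real interact files.

```python
import xml.etree.ElementTree as ET

def local(tag):
    """Drop the XML namespace prefix, so files with or without one both work."""
    return tag.rsplit("}", 1)[-1]

def proteins_in_pepxml(xml_text, min_prob=None):
    """Collect the protein names of all top-ranked search hits.

    If min_prob is given, keep only hits whose PeptideProphet probability
    is at least min_prob; hits with no probability are skipped.
    """
    proteins = set()
    root = ET.fromstring(xml_text)
    for hit in root.iter():
        if local(hit.tag) != "search_hit" or hit.get("hit_rank") != "1":
            continue
        if min_prob is not None:
            probs = [float(e.get("probability")) for e in hit.iter()
                     if local(e.tag) == "peptideprophet_result"]
            if not probs or probs[0] < min_prob:
                continue
        proteins.add(hit.get("protein"))
    return proteins

# Hypothetical miniature inputs, for illustration only.
raw_search = """<msms_pipeline_analysis><msms_run_summary>
<spectrum_query spectrum="s.1.1.2"><search_result>
<search_hit hit_rank="1" peptide="PEPTIDEK" protein="PROT_A"/>
</search_result></spectrum_query>
<spectrum_query spectrum="s.2.2.2"><search_result>
<search_hit hit_rank="1" peptide="SAMPLER" protein="PROT_B"/>
</search_result></spectrum_query>
</msms_run_summary></msms_pipeline_analysis>"""

validated = """<msms_pipeline_analysis><msms_run_summary>
<spectrum_query spectrum="s.1.1.2"><search_result>
<search_hit hit_rank="1" peptide="PEPTIDEK" protein="PROT_A">
<analysis_result analysis="peptideprophet">
<peptideprophet_result probability="0.95"/>
</analysis_result></search_hit>
</search_result></spectrum_query>
<spectrum_query spectrum="s.2.2.2"><search_result>
<search_hit hit_rank="1" peptide="SAMPLER" protein="PROT_B">
<analysis_result analysis="peptideprophet">
<peptideprophet_result probability="0.10"/>
</analysis_result></search_hit>
</search_result></spectrum_query>
</msms_run_summary></msms_pipeline_analysis>"""

raw_proteins = proteins_in_pepxml(raw_search)
kept_proteins = proteins_in_pepxml(validated, min_prob=0.9)
lost_proteins = raw_proteins - kept_proteins  # seen in raw search only
```

With real data, the two strings would be replaced by the contents of the raw search pepXML and the PeptideProphet output. The 0.9 cutoff is an arbitrary illustration, not a recommended threshold.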
On Thu, Jul 25, 2024 at 11:08 PM David Shteynberg <dshte...@systemsbiology.org> wrote:
Hello Sud,

Thank you for sharing the problematic dataset! I was able to download the data, search it with both Comet and Tandem, and generate some non-zero probability results. Since there are so few correct results in this data, about 2% of the PSMs, it is difficult for PeptideProphet to pick out the correct results, especially without the aid of some decoy true negatives. Still, I was able to get a few PSMs using your basic analysis database and selecting the NEGGAMMA distribution for the negatives, and a few more by lowering the c-level to 0.5 (or 0).

You should always be careful when selecting minimum probability thresholds, or the false positives can pile up. Here, you can see I added a few decoys to your database, which is highly recommended to get another estimate of the remaining errors and biases in the data. When adding decoys, you might also consider adding common contaminant proteins to the database (I have not done that in this case). Adding decoys also allows you to take advantage of the semi-parametric modeling in PeptideProphet, which can be more sensitive and extract more correct results from the dataset. Here are the parametric NEGGAMMA model results:
<image.png>

As you can see, the NEGGAMMA models struggle to pull together over 30 correct PSMs:

<image.png>

And when I run iProphet on this result, these map to only about 26 peptide sequences:

<PastedGraphic-3.png>

In the following optimized analysis, I added two sets of deBruijn randomized decoys to your database. I then did an X!Tandem search (refinement enabled to boost speed and sensitivity). Then I used the TPP PeptideProphet with DECOY0 as the “known” decoy, DECOY1 as the “unknown” decoy, the semi-parametric model, a bandwidth of 3, and a clevel of 2. PeptideProphet models and PSM-level results:

<PastedGraphic-2.png>

I also enabled iProphet to obtain these peptide-level models and results on the data you sent:
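As a rough illustration of the advice in this message to add decoys to the database, here is a minimal sketch that appends one decoy entry per target protein. Note the assumptions: it uses simple sequence reversal, not the deBruijn randomization described above, and it is not the TPP decoy generator. The function names are hypothetical; only the `DECOY0_` prefix follows the naming used in the analysis.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    entries = []
    header, seq = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                entries.append((header, "".join(seq)))
            header, seq = line[1:], []
        else:
            seq.append(line)
    if header is not None:
        entries.append((header, "".join(seq)))
    return entries

def with_reversed_decoys(text, prefix="DECOY0_"):
    """Return FASTA text holding the targets followed by reversed decoys.

    Reversal is a stand-in here; it is NOT the deBruijn method.
    """
    targets = read_fasta(text)
    decoys = [(prefix + header, seq[::-1]) for header, seq in targets]
    return "".join(f">{h}\n{s}\n" for h, s in targets + decoys)

# Hypothetical two-line protein, for illustration only.
demo = ">sp|P1|A\nMKT\nLLV\n"
combined = with_reversed_decoys(demo)
```

A second, independent decoy set (for example with a `DECOY1_` prefix) would need a different randomization, since reversing the same sequences twice would just produce identical decoys.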