Please share a plain-text (Notepad-readable) copy of the comet.params file for Comet version 2024.
--
You received this message because you are subscribed to the Google Groups "spctools-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to spctools-discu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/spctools-discuss/c944987f-aa59-45fb-82a9-95760ea49213n%40googlegroups.com.
Hello again Sud,
First, you can find my comet.params file attached. I modified it to a set of parameters that I selected after working further with your dataset, to try to find some other reason why you might be getting a low number of correct IDs. One thing I am noticing (after performing a semi-tryptic search with Comet) is that the majority of correct peptide IDs are semi-tryptic. This is expected among incorrect results, but among correct results it indicates a potential issue with the tryptic digestion of the sample. The model for NTT (number of tolerable termini) is learned automatically by PeptideProphet and is pasted here:
I recommend that this data be searched without strict tryptic-end requirements on the peptides.

Cheers!
-David
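To make the NTT idea above concrete, here is a minimal Python sketch that counts how many ends of a peptide are consistent with tryptic cleavage. It is an illustration only, not code from Comet or PeptideProphet. The function name `count_tryptic_termini`, the use of "-" for a protein terminus, and the simple cleavage rule (cut after K or R, but not before P) are assumptions made for this example.

```python
def count_tryptic_termini(prev_aa, peptide, next_aa):
    """Return NTT (0, 1 or 2) for a peptide in its protein context.

    prev_aa and next_aa are the residues flanking the peptide in the
    protein sequence; "-" marks a protein N- or C-terminus (assumption).
    """
    def cleaves(left, right):
        # A protein terminus is counted as a valid enzymatic end.
        if left == "-" or right == "-":
            return True
        # Assumed trypsin rule: cut after K or R, but not before P.
        return left in ("K", "R") and right != "P"

    n_term_ok = cleaves(prev_aa, peptide[0])
    c_term_ok = cleaves(peptide[-1], next_aa)
    return int(n_term_ok) + int(c_term_ok)


# Hypothetical peptides, chosen only to show the three classes.
examples = [
    ("K", "LVNELTEFAK", "T"),  # both ends tryptic
    ("A", "LVNELTEFAK", "T"),  # only the C-terminal end is tryptic
    ("A", "LVNELTEF", "A"),    # neither end is tryptic
]
for prev_aa, pep, next_aa in examples:
    print(pep, "NTT =", count_tryptic_termini(prev_aa, pep, next_aa))
```

A fully tryptic search keeps only NTT = 2 candidates, while a semi-tryptic search also allows NTT = 1. If most confident IDs in a dataset have NTT = 1, a strict tryptic search will miss them.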
On Jul 26, 2024, at 10:18 AM, sudarshan kumar <kumarsu...@gmail.com> wrote:
On Jul 30, 2024, at 5:52 AM, sudarshan kumar <kumarsu...@gmail.com> wrote:
Hi David,

Thank you for clearing my doubts. I have a few more queries.

You said: "The reason you are seeing many more protein numbers in the PepXML Viewer (Summary Tab) as opposed to after running ProteinProphet is likely because you haven’t applied any threshold filtering to the probability (or other scores). You are seeing all the hits here as opposed to the “likely correct” hits."

I tried to analyze other run files. It is a blood sample run on an Orbitrap Fusion. The total number of scans is around 88,000 (I consider this a high number). Up to the Comet search there are many peptide hits (up to 50,000). But as soon as I apply the validation statistics/models (PeptideProphet or iProphet), the number of unique peptides falls to 300. This drastic reduction in the number of accurate peptides, and hence proteins as well, forces me to think that I am not using the correct statistical models. I find it unbelievable that such a large number of PSMs yields only 300 proteins, and in blood at that.
<image.png> Original, without applying an error filter
<image.png>
On Aug 1, 2024, at 3:30 AM, sudarshan kumar <kumarsu...@gmail.com> wrote:
Hi David,

Thank you so much. I learnt a lot from your discussion. Can you please go through the ppt? I have a question about how to include an isoform of a protein that was assigned 0 probability in our list of identified proteins. This is not possible if I use the error rate cutoff criterion to filter the list.

Case: I am studying a tissue proteome. It has more than 30 isoforms of a protein called pregnancy-associated glycoprotein. I miss many of them in my list of proteins if I filter the protein list at an error rate of 0.05.

The sensitivity/error table gives me a last error rate cutoff of 0.2086. How advisable is it to move still lower, for example to 0.1 or 0.01? When I lower the min_prob cutoff below 0.2086, the number of identifications increases, because more proteins are included that belong to an already reported "group of proteins", some of which have a high number of PSMs. I can see those isoforms as well.

Please give your expert opinion.
On Wed, Jul 31, 2024 at 9:27 AM 'David Shteynberg' via spctools-discuss <spctools...@googlegroups.com> wrote:
Absolutely! Another thing to understand here is that the statistical analysis happens on several data reduction layers: PeptideProphet works on the level of PSMs, iProphet works on the level of peptide sequences, and ProteinProphet works on the level of proteins. Since these layers stack on top of each other, small errors from incorrect statistics at an earlier layer propagate into much larger errors at the later layers.

Keeping entrapment decoys in the database allows one to have another evaluation of the error in addition to the TPP models' estimation, and provides an FDR estimate that is independent from the TPP. Unlike "True Positives", independent entrapment decoys are "True Negative" random matches that are not biased by the researcher's prior expectations.

Also, there may be a misunderstanding: as you lower the minimum probability threshold, the error increases and the sensitivity also increases (hopefully much faster than the error!).

Cheers!
-David
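The relationship between the minimum probability threshold, the error, and the sensitivity can be sketched in a few lines of Python. This is a simplified illustration and not the TPP's own code. It assumes that each probability can be read as the chance that the identification is correct, so that the sum of probabilities above the threshold estimates the number of correct hits. The function name `error_sensitivity_table` and the example probabilities are made up for this sketch.

```python
def error_sensitivity_table(probabilities, thresholds):
    """Estimate error and sensitivity at each minimum probability threshold.

    Assumption: a probability p means the hit is correct with chance p,
    so sum(p) estimates the number of correct hits in a set.
    """
    total_correct = sum(probabilities)  # estimated correct hits overall
    rows = []
    for t in thresholds:
        accepted = [p for p in probabilities if p >= t]
        n = len(accepted)
        est_correct = sum(accepted)
        est_incorrect = n - est_correct
        # Error: estimated fraction of accepted hits that are incorrect.
        error = est_incorrect / n if n else 0.0
        # Sensitivity: estimated fraction of all correct hits that are kept.
        sensitivity = est_correct / total_correct if total_correct else 0.0
        rows.append((t, n, error, sensitivity))
    return rows


# Made-up probabilities, for illustration only.
probs = [0.99, 0.95, 0.90, 0.60, 0.20, 0.05]
for t, n, err, sens in error_sensitivity_table(probs, [0.9, 0.5, 0.0]):
    print(f"min_prob={t:.2f} accepted={n} error={err:.3f} sensitivity={sens:.3f}")
```

Running this on the toy list shows both the error and the sensitivity rising as the threshold is lowered, which is the trade-off described above.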
On Wed, Jul 31, 2024, 1:02 AM sudarshan kumar <kumarsu...@gmail.com> wrote:
Hi David,

I am reading your paragraph again and again to understand it fully, word by word.

You said: "I recommend you focus on the sensitivity of your analysis rather than the absolute total of proteins identified, without consideration for error."

I usually take the minimum probability cutoff that corresponds to an error of 0.05 to filter my data. You mean to suggest that I can go for higher sensitivity. If I look at this table, at higher sensitivity (0.8965) I also get a similar number of correct hits (5268), which corresponds to an error rate of 0.02. Even if I increase the sensitivity threshold on my data, the number of correct hits keeps going down (as per the statistics), which will further reduce the absolute number of correct proteins identified.

As a researcher, I want the prophet to discard as few spectra as possible. What intrigues me is that out of 88,000 scans/spectra, only 5,000 are assigned to peptides at an error rate of 0.05. Do you think this is normal?
<image.png>
<query.pptx>