Re: Regarding peptide mass type and isotope error


Jimmy Eng

May 7, 2024, 7:02:29 PM
to Comet ms/ms db search support
As another option to consider since you're new to the field and are curious about these issues, you might want to think about using wide precursor tolerances for your search.  Read Phil Wilmart's blog here and this old MacCoss lab publication on the topic.  Those might convince you to use a wide precursor tolerance which would make the isotope_offset parameter unnecessary (because the wide precursor window would cover the one or two or three isotope offsets).
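
To make that concrete, here is a minimal Python sketch of why a wide window absorbs isotope offsets. It assumes the spacing between isotope peaks is the 13C-12C mass difference (about 1.00335 Da); the function name and the example tolerances are illustrative only and are not Comet parameters.

```python
C13_C12_DELTA = 1.0033548  # Da, assumed spacing between adjacent isotope peaks

def isotope_offsets_covered(tolerance_da, max_offset=3):
    """List the isotope offsets (0..max_offset) whose mass shift fits
    inside a symmetric +/- tolerance_da precursor window."""
    return [k for k in range(max_offset + 1)
            if k * C13_C12_DELTA <= tolerance_da]

# A narrow window only admits the monoisotopic peak; a wide one admits
# the +1, +2, and +3 peaks as well.
print(isotope_offsets_covered(0.02))
print(isotope_offsets_covered(3.5))
```

With a window of a few daltons, the offsets fall inside the tolerance on their own, so no separate isotope correction is needed in this simplified picture.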

On Tuesday, May 7, 2024 at 1:27:40 AM UTC-7 debojy...@gmail.com wrote:
Thank you, Dr Tabb, for the reply. Indeed, it's very comforting to see that one can get great advice no matter the distance!

My data is from a Q Exactive Plus, so I will stick to monoisotopic masses.

I came across this paper (https://pubs.acs.org/doi/10.1021/acsomega.8b01649), which suggests that a small but significant fraction of precursor masses are actually reported at the +1 or +2 isotope peak. From the distribution, though, it appears that only longer peptides would have the +1 or +2 peak as the most abundant one (Figure 1). Just thought I would point it out. I was wondering whether we are missing PSMs for longer peptides because of this.
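
A rough way to see where the +1 peak overtakes the monoisotopic peak is a carbon-only Poisson estimate. The sketch below is an approximation under stated assumptions, not the method of the linked paper: it uses the averagine carbon content (about 4.9384 carbons per 111.1254 Da) and a 13C abundance of about 1.07%, and it ignores the heavy isotopes of H, N, O, and S, so it places the crossover at a somewhat higher mass than a full calculation would. The function names are hypothetical.

```python
import math

P_13C = 0.0107                          # approximate natural abundance of 13C
CARBONS_PER_DA = 4.9384 / 111.1254      # averagine model, carbons per dalton

def isotope_envelope(mass_da, n_peaks=4):
    """Carbon-only Poisson estimate of relative M+0..M+(n_peaks-1) abundances."""
    lam = mass_da * CARBONS_PER_DA * P_13C  # expected number of 13C atoms
    return [math.exp(-lam) * lam ** k / math.factorial(k)
            for k in range(n_peaks)]

def most_abundant_offset(mass_da):
    """Index of the tallest peak in the estimated envelope."""
    envelope = isotope_envelope(mass_da)
    return envelope.index(max(envelope))

for mass in (1000.0, 3000.0, 5000.0):
    print(mass, most_abundant_offset(mass))
```

In this approximation the monoisotopic peak stays the tallest until the expected number of 13C atoms exceeds one, which is why only the longer peptides are affected.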

Thanks again for the advice!

Regards
Debojyoti

On Tuesday, May 7, 2024 at 12:50:27 PM UTC+5:30 David Tabb wrote:
Hi, Debojyoti.

Unless you are using an ion trap for your MS scans, you will want to use monoisotopic mass for parent ions.  Generally speaking, an ion trap can distinguish between a monoisotopic peak and the 13C peak next to it in MS/MS, so I would almost always use monoisotopic for fragment ions.
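
For a sense of scale, here is a small Python sketch comparing the monoisotopic and average mass of an example peptide. The sequence PEPTIDE, the function name, and the partial residue table are illustrative choices; the residue masses are standard values rounded to four or five decimals.

```python
# Residue masses in Da as (monoisotopic, average). Only the residues needed
# for the example are listed, so this is not a complete amino acid table.
RESIDUE_MASS = {
    "P": (97.05276, 97.1167),
    "E": (129.04259, 129.1155),
    "T": (101.04768, 101.1051),
    "I": (113.08406, 113.1594),
    "D": (115.02694, 115.0886),
}
WATER = (18.01056, 18.0153)  # added once for the free N- and C-termini

def peptide_mass(sequence, average=False):
    """Neutral peptide mass: sum of residue masses plus one water."""
    idx = 1 if average else 0
    return sum(RESIDUE_MASS[aa][idx] for aa in sequence) + WATER[idx]

mono = peptide_mass("PEPTIDE")
avg = peptide_mass("PEPTIDE", average=True)
print(f"monoisotopic: {mono:.4f} Da, average: {avg:.4f} Da, gap: {avg - mono:.4f} Da")
```

Even for this short peptide the two mass types differ by roughly half a dalton, which is far larger than a ppm-level precursor tolerance, so the mass type in the search has to match what the instrument actually resolves.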

I think "best practices" for isotope error are less than clear at the moment.  I often just accept the search engine default, but if I am doing the search my way, I would probably allow for just 0 (no isotope difference between expected and observed) and 1 (one neutron difference between expected and observed).  Yes, it's possible that a bigger error has been made in the instrument's monoisotope-picker, but they're a small fraction of precursors, and I don't want to blow up my search space just to cover these rare occurrences.
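
As a sketch of what allowing isotope errors of 0 and 1 means in practice, the Python snippet below tests an observed precursor mass against a theoretical peptide mass after subtracting each allowed number of isotope spacings. This is an illustration, not how any particular search engine implements it; the function name, the 1.0033548 Da spacing (13C minus 12C), and the example masses are assumptions.

```python
C13_C12_DELTA = 1.0033548  # Da, assumed spacing between adjacent isotope peaks

def matching_isotope_error(observed_mass, theoretical_mass, tol_ppm,
                           isotope_errors=(0, 1)):
    """Return the first allowed isotope error that brings the observed mass
    within tol_ppm of the theoretical mass, or None if none does."""
    for k in isotope_errors:
        corrected = observed_mass - k * C13_C12_DELTA
        ppm = (corrected - theoretical_mass) / theoretical_mass * 1e6
        if abs(ppm) <= tol_ppm:
            return k
    return None

print(matching_isotope_error(1500.0010, 1500.0, tol_ppm=20))  # picked correctly
print(matching_isotope_error(1501.0035, 1500.0, tol_ppm=20))  # picked the +1 peak
print(matching_isotope_error(1502.0067, 1500.0, tol_ppm=20))  # +2 is not allowed here
```

Each extra allowed offset adds another mass window per spectrum, which is the growth in search space mentioned above.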

Welcome to this research field!  I hope it's comforting that your message reached MS bioinformaticists on multiple continents!

Dave in Groningen



On Monday, May 6, 2024 at 2:40:10 PM UTC+2 Debojyoti Pal wrote:
Hello everyone

Proteomics newbie here (relatively). I do not have access to paid software or proteomics experts because I am trying to initiate this work in my lab. I just wanted to seek a little advice from the experienced heads doing standard bottom-up proteomics:

Should I use monoisotopic or average mass for parent and fragment mass?

What kind of isotope error is ideal for standard bottom-up proteomics (no SILAC): +1, +2, or +3?

I am using standard trypsinization and other steps.

Please advise!

Thanks in advance.

-Debojyoti

Debojyoti Pal

Jun 8, 2024, 1:39:43 AM
to Comet ms/ms db search support
What happened to this thread? I received a Google Groups report saying content has been flagged and deleted. I don't understand! It was only scientific discussion.

Jimmy Eng

Jun 8, 2024, 11:42:32 AM
to Debojyoti Pal, Comet ms/ms db search support
There were a bunch of spam posts. I thought I was deleting just those posts, but the whole thread got deleted.


Debojyoti Pal

Jun 8, 2024, 1:57:39 PM
to Comet ms/ms db search support
Okay. Got it! I have the replies saved in my Gmail thankfully.

Jimmy Eng

Jun 8, 2024, 2:35:21 PM
to Comet ms/ms db search support
Fortunately, the bulk of the thread's posts are quoted in our replies today, so most of the content is available here. I think Phil's reply was the only one missing, so I'll add it below for posterity's sake.
 
On Wed, May 8, 2024 at 9:01 AM pwil...@gmail.com  wrote:
Hi Debojyoti,
As you are discovering, setting data acquisition parameters and configuring search engines can be complicated. The QE instruments have the simplest acquisition software of any of Thermo's current instruments. An often overlooked but critical part of analyzing proteomics data is choosing the FASTA file of protein sequences. That is also a surprisingly complicated subject. The README for this repository: https://github.com/pwilmart/fasta_utilities has some useful background information. 

Many of my Python scripts in that repository are out of date because NCBI and UniProt have improved their download options. Extracting organism-specific sequences from large full sequence sets (NCBI nr, UniProt TrEMBL, etc.) is no longer necessary (it was the norm 10-20 years ago). Sequence collections for specific taxonomy numbers (organisms) are easier to find and download directly now. The two GUI download tools are also out of date. UniProt changed the organization of their FTP site for reference proteomes and that broke my script. I have not used the Ensembl download tool in a couple of years, so I would be surprised if the script runs without issues. The basic FASTA utility scripts (counting, reversing, etc.) should all work. I can't seem to find the time to maintain software these days. I have too many other tasks that take priority.

This blog entry: https://pwilmart.github.io/blog/2020/09/19/shotgun-quantification-part2 looks at how many different FASTA files you can download from UniProt for mouse (as an example organism), how they differ, and what effects they might have on data analysis steps. Even though the blog is a few years old, I think the concepts are still valid. UniProt has good documentation, so you might want to research things like: what are reference proteomes, how to find proteomes, and how to download proteomes. Most things at UniProt can be done with web browsers. They also have a well-documented FTP site that is an alternative way to get FASTA files.

Jimmy was kind to point you to my rant about narrow precursor tolerance searching. Any search results require some post-processing steps for PSM filtering and protein inference. Those downstream steps need to be aware of how to handle wide tolerance search results. Narrow tolerance searches have been the default setting for so many years now that most downstream tools have probably never been tested with wide tolerance search results. Using PPM for precursor tolerance with isotopic errors is perhaps the safest choice. However, 10 PPM precursor tolerance is too narrow. 20 or 30 PPM are usually better choices for Orbitraps, depending on how frequently you calibrate your instrument (most labs would rather run samples than calibration mixes, so calibrations might not be as frequent as they should be).
Cheers,
Phil
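
To put the PPM figures in Phil's reply into absolute terms, here is a minimal Python sketch that converts a PPM tolerance into a mass window in daltons. The function name and the 2000 Da example mass are illustrative assumptions, not values from the thread.

```python
def ppm_window(mass_da, tol_ppm):
    """Return the (low, high) mass bounds in Da for a +/- tol_ppm tolerance."""
    delta = mass_da * tol_ppm / 1e6
    return mass_da - delta, mass_da + delta

# Compare the tolerances discussed above for a hypothetical 2000 Da peptide.
for tol in (10, 20, 30):
    low, high = ppm_window(2000.0, tol)
    print(f"{tol} ppm -> {low:.4f} to {high:.4f} Da")
```

Because the window scales with mass, a PPM tolerance stays proportionally the same across small and large peptides, while a fixed Da tolerance does not.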

Debojyoti Pal

Jun 8, 2024, 11:28:17 PM
to Comet ms/ms db search support
Yes, this thread had very valuable information (at least for a novice). Thank you for posting it back.