Search of fractionated mzML, merge or separately?

Yiyang Liu

Apr 25, 2022, 5:33:07 PM
to Comet ms/ms db search support
Hi there,

This may be a slightly silly question, but we want to search fractionated mzML files with Comet as part of the OpenMS framework. We see two options:
1. Search each fraction's mzML in Comet separately (parallelizing over the fractions) and merge the results afterwards (the results are idXML files, which can be merged with OpenMS's IDMerger).
2. Merge the fraction mzMLs first with OpenMS's FileMerger, then search the merged file with Comet (once per sample, non-parallel).

When we tested the two options, we found that the scores output by Comet differ:
[Attached image: 142774482-420bf0e4-2ea1-404d-9cb6-49952178ee38.png, a plot comparing the scores from the two options]

We were wondering whether this is expected and which approach you would advise: parallel or non-parallel?

Thank you very much!


Jimmy Eng

Apr 25, 2022, 6:11:04 PM
to Comet ms/ms db search support
I'm unfortunately not familiar with OpenMS's IDMerger or FileMerger. Can you explain what score you are plotting? Is it a Comet score or some score generated by an OpenMS tool? I suspect that the answer is the latter. If that's the case, you are better off asking OpenMS support about the post-search tool that generated these scores.

I can (semi!) confidently state that a single spectrum searched through Comet will generate the same Comet scores for the best peptide match whether that spectrum is searched alone, as part of one mzML file, or as part of some larger merged file. In other words, every spectrum search is independent of the others. So from a Comet-specific standpoint, it doesn't matter how the searches are performed.

Suresh Poudel

Apr 25, 2022, 6:37:24 PM
to Comet ms/ms db search support
I second Jimmy here. I have tested Comet on multiple fractions in parallel and on single fractions one after another, and every time I get the same score for the same spectrum. Make sure to use the same database for every search if you are comparing such scores. It would also be helpful to pick out a few outliers and post the pep.xml output for those particular scans.
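
A minimal sketch of how such outliers could be located once the scores from both workflows have been exported. Everything here is an assumption for illustration: the function name, the "fraction:scan" key format, and the example scores are hypothetical and are not Comet or OpenMS output. The key includes the fraction name because scan numbers typically restart in each fraction file, so a scan number alone would not identify a spectrum after merging.

```python
import math


def find_score_outliers(scores_a, scores_b, rel_tol=1e-6):
    """Return {key: (score_a, score_b)} for spectra whose scores disagree.

    scores_a / scores_b map a spectrum key (hypothetical format
    "fraction:scan=N") to the score reported by each workflow.
    Only keys present in both mappings are compared.
    """
    outliers = {}
    for key in scores_a.keys() & scores_b.keys():
        a, b = scores_a[key], scores_b[key]
        if not math.isclose(a, b, rel_tol=rel_tol):
            outliers[key] = (a, b)
    return outliers


# Made-up example values, not real search results.
separate = {"frac1:scan=100": 2.31, "frac1:scan=101": 1.75, "frac2:scan=100": 3.02}
merged = {"frac1:scan=100": 2.31, "frac1:scan=101": 1.12, "frac2:scan=100": 3.02}

print(find_score_outliers(separate, merged))
```

Any key this returns is a candidate scan whose pep.xml entry is worth inspecting by hand.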

Suresh (frequent comet user)

Chenghao Zhu

Apr 25, 2022, 6:59:14 PM
to Comet ms/ms db search support
I'm a colleague of Lydia. That figure actually uses the raw Comet search score directly. We now suspect that FileMerger updates the spectrum reference ID but not the ID of the precursors, which might have caused some trouble. Glad to hear that the same spectrum should get the same score regardless of the other spectra!

Trevor

Jimmy Eng

May 2, 2022, 1:11:41 PM
to Comet ms/ms db search support
Thanks Trevor. Sorry for the late reply; for some reason I'm not getting email notifications for every post and just saw your reply today. I did want to follow up, since you mention that the figure plots a raw Comet search score. The two primary Comet search scores are xcorr (cross-correlation score) and E-value, and neither of these has values that span the 0 to 1000 range that's plotted. There is a secondary score, termed Sp (for preliminary score), that could span that range. If you don't mind, can you let me know what specific score is being plotted, just to satisfy my curiosity?

Chenghao Zhu

May 12, 2022, 2:34:51 PM
to Comet ms/ms db search support
Hi Jimmy, sorry for the late response as well! We are using the CometAdapter to call Comet, and the score plotted above is the one set by OpenMS. Looking at the idXML file, the score set by the OpenMS CometAdapter is the "expect" score, which seems to be the value of "MS:1002257". There are also MS:1002252, MS:1002253, MS:1002254, MS:1002255, and MS:1002256. We tried merge-then-search and search-then-merge again, and they do seem to produce the same score. We think we might have done something weird before, maybe using different search parameters when generating the plot above.

Jimmy Eng

May 29, 2022, 12:49:16 AM
to Comet ms/ms db search support
Good to know that you are seeing the same score produced. 

Just as an FYI:  the "expect" score is the expectation score or E-value and is similar to a p-value in that smaller numbers are better.  So if you did want to make a score comparison plot using E-values in the future, it makes more sense to use -log(expect) instead of plotting the raw "expect" scores against each other.
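
A minimal sketch of that transform, under stated assumptions: the base of the logarithm is not specified above, so base 10 is used here as one common choice, and the helper name and the `floor` guard against an E-value of zero are hypothetical additions, not part of Comet or OpenMS.

```python
import math


def neg_log10_expect(expect, floor=1e-300):
    """Convert an E-value to -log10(E) so that larger means better.

    `floor` is an assumed guard that keeps log10 defined if an
    E-value is reported as 0.
    """
    return -math.log10(max(expect, floor))


# Made-up E-values for illustration: smaller E-value, larger transformed score.
for e in (0.5, 1e-2, 1e-5):
    print(e, round(neg_log10_expect(e), 3))
```

Plotting the transformed values from the two workflows against each other spreads the good (small) E-values apart instead of crowding them near zero.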
