Search speed for variable modifications -- internal optimization potential?


Chris Bielow

Jul 1, 2022, 3:44:10 AM
to Comet ms/ms db search support
Hi Jimmy,


we recently had an issue reported on the OpenMS issue tracker suggesting that our CometAdapter (which wraps Comet by simply writing a param file and calling Comet) is slower than calling Comet directly. The corresponding issue is here: https://github.com/OpenMS/OpenMS/issues/6200

The two searches have essentially identical parameters but differ slightly in how varMods are specified. The 'fast' config combines varMods (which makes it about 3-fold faster for this particular configuration); the 'slow' config lists varMods separately for each amino acid, even if it's the same underlying modification. Our CometAdapter happens to write the slow config... and hence Comet takes more time to run.

The 'slow' config:

variable_mod01 = 43.0058 K 0 5 -1 0 0 0.0
variable_mod02 = 43.0058 R 0 5 -1 0 0 0.0
variable_mod03 = 27.9949 K 0 5 -1 0 0 0.0
variable_mod04 = 27.9949 S 0 5 -1 0 0 0.0

vs. fast config:

variable_mod01 = 43.0058 KR 0 5 -1 0 0 0.0
variable_mod02 = 27.9949 KS 0 5 -1 0 0 0.0

Technically, this is not the same, since we combined up to 5 mods on R plus up to 5 mods on K into up to 5 mods on (K or R), but since max_variable_mods_in_peptide = 5, it does not make a difference.
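To illustrate the equivalence argument, here is a small brute-force sketch. It uses a toy model that is an assumption, not Comet's actual search code: each varMod line caps how many residues in one peptide may carry that mod, and max_variable_mods_in_peptide caps the total. The function names and the example peptide are hypothetical. Because both lines carry the same mass, a modified form is fully described by the set of modified positions.

```python
from itertools import combinations


def site_subsets(sites, limit):
    """All subsets of `sites` with at most `limit` elements, as frozensets."""
    return {
        frozenset(combo)
        for size in range(min(limit, len(sites)) + 1)
        for combo in combinations(sites, size)
    }


def modified_forms(peptide, residue_groups, per_mod_limit, total_limit):
    """Sets of modified positions reachable under a toy model of the limits.

    residue_groups: one string of residues per varMod line (same mass assumed),
    e.g. ["K", "R"] for the separate config or ["KR"] for the combined one.
    """
    forms = {frozenset()}
    for residues in residue_groups:
        sites = [i for i, aa in enumerate(peptide) if aa in residues]
        # Combine every form built so far with every allowed choice for this mod.
        forms = {a | b for a in forms for b in site_subsets(sites, per_mod_limit)}
    # Apply the global cap on the number of modified residues per peptide.
    return {form for form in forms if len(form) <= total_limit}


peptide = "KRKRKRKR"  # hypothetical peptide with four K and four R sites
separate = modified_forms(peptide, ["K", "R"], per_mod_limit=5, total_limit=5)
combined = modified_forms(peptide, ["KR"], per_mod_limit=5, total_limit=5)
print(separate == combined)
```

Under this model the two configs produce the same set of modified forms whenever the per-mod limit equals the total limit. With a per-mod limit of 2 and a total limit of 4, the separate config allows forms (two K plus two R) that the combined config does not, which is why the merge is only safe in the equal-limit case.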

So my naive guess is that Comet's internal search scheme could be optimized such that, whether the user provides varMods combined or separately (i.e., fast vs. slow config), the search space is always as small as possible, at least for these trivial cases where the maximum number of allowed mods equals 'max_variable_mods_in_peptide'.

In OpenMS we will implement this 'fix' for the CometAdapter, unless I'm missing something totally obvious. We would simply combine varMods where the amino acids differ but all other values are identical AND where the maximum number of sites equals max_variable_mods_in_peptide.
In the long run, it would however make more sense if Comet could fix this directly, since all downstream tools would potentially benefit from it.
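The merge rule described above could be sketched roughly as follows. This is a minimal illustration, not the actual CometAdapter or Comet implementation; the function name `merge_variable_mods` is hypothetical. It assumes that the first field of each value is the mass, the second the residues, and the fourth a plain integer giving the maximum number of sites (as in the configs shown earlier); all other fields are compared as opaque strings.

```python
def merge_variable_mods(mod_values, max_mods_in_peptide):
    """Merge variable_mod values that differ only in their residues.

    mod_values: value strings such as "43.0058 K 0 5 -1 0 0 0.0"
    (the part after "variable_modNN = ").
    Entries are merged only if their max-sites field (assumed to be the
    fourth field) equals max_mods_in_peptide.
    """
    groups = {}  # dicts keep insertion order, so output order is stable
    for index, value in enumerate(mod_values):
        fields = value.split()
        mass, residues, rest = fields[0], fields[1], tuple(fields[2:])
        mergeable = int(fields[3]) == max_mods_in_peptide
        # Non-mergeable entries get a unique key so they pass through unchanged.
        key = (mass, rest) if mergeable else (mass, rest, index)
        groups.setdefault(key, []).append(residues)

    merged = []
    for key, residue_list in groups.items():
        mass, rest = key[0], key[1]
        # Concatenate residues, dropping duplicates while keeping their order.
        residues = "".join(dict.fromkeys("".join(residue_list)))
        merged.append(" ".join([mass, residues, *rest]))
    return merged


slow_config = [
    "43.0058 K 0 5 -1 0 0 0.0",
    "43.0058 R 0 5 -1 0 0 0.0",
    "27.9949 K 0 5 -1 0 0 0.0",
    "27.9949 S 0 5 -1 0 0 0.0",
]
for number, value in enumerate(merge_variable_mods(slow_config, 5), start=1):
    print(f"variable_mod{number:02d} = {value}")
```

Keying on the mass string plus the remaining fields means two entries are only merged when everything except the residues matches exactly. If max_variable_mods_in_peptide differs from the per-mod limit, the entries are left as they are.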

So, in summary:
1) is this fix feasible? (or maybe there is even something better?)
2) can this be implemented in Comet? (I would also volunteer, but I have not looked at that part of the Comet implementation...)

best,
Chris

---

Bioinformatics Solution Center (BSC)

www.bsc.fu-berlin.de

Freie Universität Berlin, Institut für Informatik, Takustr. 9, room K21, 14195 Berlin




Jimmy Eng

Jul 1, 2022, 3:07:54 PM
to Comet ms/ms db search support
I thought I replied about 30 minutes ago, before I went to implement this in Comet, but I don't see my reply at all, so maybe I sent it only to Chris or I cancelled instead of posting it. Anyway, yes, Comet sadly appears to get progressively slower as more variable mods are added. The difference between the "slow" and "fast" configs is presumably due to the extra loops that are iterated over for the additional variable mods in the slow config.

I just implemented the fix in the Comet code to merge the mods from the slow config into the fast config. Chris, would you want a new Comet release with just this added functionality for OpenMS when it's ready?

I did state in my mysteriously missing reply that we are working on a better long term solution to search speeds using database indexing.  And minimally I could go add threading to regular (aka offline) Comet in searching the indexed databases used for the real-time search code.  That should also be faster than the current Comet but I'm hopeful that the other solution we're working on is the better option.

Jimmy

Chris Bielow

Jul 1, 2022, 3:22:54 PM
to Comet ms/ms db search support
Hi Jimmy,

awesome! Thanks for picking this up so quickly and solving it!
If you are planning a new release anyway, we'd of course be happy to include it in our next release. We will, however, also implement the fix on our end, since some of our users employ their own Comet versions, and older versions would benefit as well.

Thanks again!

cheers
Chris

Roger Olivella

Jul 25, 2022, 5:47:23 AM
to Comet ms/ms db search support
Hi! Is this fix already published? If not, when do you plan to release it? Thanks!

Roger

On Friday, July 1, 2022 at 21:22:54 UTC+2, Chris Bielow wrote:

Jimmy Eng

Jul 25, 2022, 11:19:33 AM
to Comet ms/ms db search support
Hi Roger. This little optimization will be included in the next minor release (version 2022.01 rev. 1), likely today. The release has already been tagged; I just need to run my tests and update the documentation, both of which will happen today, before I make the release official.

Jimmy
