Hi Jimmy,
we recently received a report on the OpenMS issue tracker that our CometAdapter (which simply writes a param file and calls Comet) is slower than calling Comet directly. The corresponding issue is here:
https://github.com/OpenMS/OpenMS/issues/6200
The two searches have essentially identical parameters, but differ slightly in how varMods are specified. The 'fast' config combines varMods (which makes the search about 3-fold faster for this particular configuration); the 'slow' config lists varMods separately for each amino acid, even if it's the same underlying modification. Our CometAdapter happens to write the slow config, and hence Comet takes more time to run.
The 'slow' config:
variable_mod01 = 43.0058 K 0 5 -1 0 0 0.0
variable_mod02 = 43.0058 R 0 5 -1 0 0 0.0
variable_mod03 = 27.9949 K 0 5 -1 0 0 0.0
variable_mod04 = 27.9949 S 0 5 -1 0 0 0.0
vs. fast config:
variable_mod01 = 43.0058 KR 0 5 -1 0 0 0.0
variable_mod02 = 27.9949 KS 0 5 -1 0 0 0.0
Technically, this is not the same, since we combined up to 5 R mods plus up to 5 K mods into up to 5 (K or R) mods. But since max_variable_mods_in_peptide = 5 caps the total number of mods per peptide anyway, it does not make a difference.
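To double-check that equivalence argument, here is a small toy model in Python. It is only a sketch of the counting logic, not Comet's actual enumeration: the function name modified_site_sets and the test peptide are made up for illustration, and it looks at a single mass shift only.

```python
from itertools import combinations, product

def modified_site_sets(peptide, rules, max_total):
    """Toy model: all sets of modified positions allowed by `rules`.

    rules: list of (residues, max_per_rule) tuples for ONE mass shift.
    max_total plays the role of max_variable_mods_in_peptide.
    """
    per_rule = []
    for residues, max_per_rule in rules:
        sites = [i for i, aa in enumerate(peptide) if aa in residues]
        upper = min(max_per_rule, len(sites))
        per_rule.append([frozenset(c)
                         for k in range(upper + 1)
                         for c in combinations(sites, k)])
    results = set()
    for combo in product(*per_rule):
        merged = frozenset().union(*combo)
        if len(merged) <= max_total:  # global cap on mods per peptide
            results.add(merged)
    return results

peptide = "KRKKRRKR"  # hypothetical peptide: 4 K sites, 4 R sites

# Per-mod maximum equals the global cap: separate and combined agree.
slow = modified_site_sets(peptide, [("K", 5), ("R", 5)], max_total=5)
fast = modified_site_sets(peptide, [("KR", 5)], max_total=5)
same_when_capped = (slow == fast)

# Per-mod maximum below the global cap: merging changes the result.
slow2 = modified_site_sets(peptide, [("K", 2), ("R", 2)], max_total=5)
fast2 = modified_site_sets(peptide, [("KR", 2)], max_total=5)
same_when_not_capped = (slow2 == fast2)
```

In this model the two configs give the same set of modified forms only when the per-mod maximum equals the global cap, which is why the merge should be restricted to that case.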
So my naive guess is that Comet's internal search scheme could be optimized such that, no matter whether the user provides varMods combined or separately (i.e., fast vs. slow config), the search space is always as small as possible, at least for these trivial cases where the maximum number of allowed mods equals 'max_variable_mods_in_peptide'.
In OpenMS we will implement this 'fix' for the CometAdapter, unless I'm missing something totally obvious. We would simply combine varMods whose AAs differ but whose other values are all identical AND whose maximum number of sites equals max_variable_mods_in_peptide.
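A minimal Python sketch of that merge rule, just to make the idea concrete. This is not the actual CometAdapter code (which is C++); merge_variable_mods is a hypothetical helper, and I am assuming that the second field of each entry is the residue list and the fourth is the per-mod maximum, as in the configs above. All remaining fields are treated as opaque and must match exactly.

```python
def merge_variable_mods(lines, max_mods_in_peptide):
    """Merge variable_mod lines that differ only in their residues.

    Entries are merged only if their per-mod maximum equals
    max_mods_in_peptide; everything else is passed through unchanged.
    """
    merged = []        # [mass, residues, rest] in first-seen order
    index_by_key = {}  # (mass, rest) -> position in `merged`
    for line in lines:
        _, value = line.split("=", 1)
        mass, residues, *rest = value.split()
        max_sites = int(rest[1])  # assumed: 4th field is the per-mod max
        # Masses are compared as strings, so 43.0058 and 43.00580 differ.
        key = (mass, tuple(rest))
        mergeable = (max_sites == max_mods_in_peptide)
        if mergeable and key in index_by_key:
            entry = merged[index_by_key[key]]
            entry[1] += "".join(r for r in residues if r not in entry[1])
        else:
            if mergeable:
                index_by_key[key] = len(merged)
            merged.append([mass, residues, rest])
    # Renumber the surviving entries starting from 01.
    return [f"variable_mod{i:02d} = {mass} {residues} {' '.join(rest)}"
            for i, (mass, residues, rest) in enumerate(merged, start=1)]

slow_config = [
    "variable_mod01 = 43.0058 K 0 5 -1 0 0 0.0",
    "variable_mod02 = 43.0058 R 0 5 -1 0 0 0.0",
    "variable_mod03 = 27.9949 K 0 5 -1 0 0 0.0",
    "variable_mod04 = 27.9949 S 0 5 -1 0 0 0.0",
]
fast_config = merge_variable_mods(slow_config, max_mods_in_peptide=5)
```

Applied to the 'slow' config with max_variable_mods_in_peptide = 5, this produces the two lines of the 'fast' config; with any other cap the four entries are left as they are.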
In the long run, however, it would make more sense if Comet could fix this directly, since all downstream tools would potentially benefit from it.
So, in summary:
1) Is this fix feasible? (Or is there maybe something even better?)
2) Can this be implemented in Comet? (I would also volunteer, but I have not looked at that part of the Comet implementation...)
best,
Chris
---
Bioinformatics Solution Center (BSC)
www.bsc.fu-berlin.de
Freie Universität Berlin, Institut für Informatik, Takustr. 9, room K21, 14195 Berlin