Running Comet on one spectrum

Erik Johnson

Oct 31, 2024, 5:56:08 PM
to Comet ms/ms db search support
Hello Comet maintainers!

I have an mzML file with many spectra. For testing and exploratory data analysis purposes, I want to run Comet on just one of the spectra (or preferably as many or as few as I'd like). I've tried to create an mzML file with just one spectrum, but Comet doesn't seem to like the mzML I created. I'm attaching the one-spectrum mzML file, and Comet's output on that file is below. I created the file by copying the contents of the larger mzML and removing all but the first spectrum.

Can someone help me understand what I'm doing wrong? Thank you!

$ ./comet.macos.exe ../data/spectra/test_spectra.mzML  

 Comet version "2024.02 rev. 0 (c40a2ed)"

 Search start:  10/31/2024, 03:53:12 PM
 - Input file: ../data/spectra/test_spectra.mzML
   - Load spectra: Warning - no spectra searched.
 Search end:    10/31/2024, 03:53:12 PM, 0m:0s

Best,
Erik 
test_spectra.mzML

Jimmy Eng

Oct 31, 2024, 7:37:31 PM
to Comet ms/ms db search support
Erik,

Sadly, manually editing mzML (and mzXML) files almost never works out well. Some tools will be able to parse your mzML fine (those that don't use the wrapper index) but Comet expects an mzML with a valid scan index, and your manually edited file has a couple of issues with that scan index. If you want to pursue the single scan mzML route, read below for what I observe and the fix. If I were in your shoes, I would either (1) use the "scan_range" parameter in Comet to search a single scan in your unedited, valid mzML file, e.g. "scan_range = 1 1" to search only spectrum 1, or (2) convert your data to mgf or ms2 and edit either of those formats to avoid the complexities of trying to make a valid mzML.
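
For option (1), here is a minimal Python sketch of setting "scan_range" without hand-editing the params file. It assumes comet.params uses the usual "name = value   # comment" layout; the helper name set_comet_param is hypothetical and is not part of Comet.

```python
import re

def set_comet_param(params_text: str, name: str, value: str) -> str:
    """Return params_text with the first `name = ...` line set to `value`.

    Any trailing '# comment' on that line is kept as-is.
    """
    pattern = re.compile(
        rf"^([ \t]*{re.escape(name)}[ \t]*=[ \t]*)([^#\n]*?)([ \t]*(#.*)?)$",
        re.MULTILINE,
    )
    # A function replacement avoids backslash surprises in `value`.
    new_text, n = pattern.subn(
        lambda m: f"{m.group(1)}{value}{m.group(3)}", params_text, count=1
    )
    if n == 0:
        raise KeyError(f"parameter {name!r} not found")
    return new_text

if __name__ == "__main__":
    # Assumes a comet.params file in the working directory.
    with open("comet.params") as fh:
        text = fh.read()
    with open("comet.params", "w") as fh:
        fh.write(set_comet_param(text, "scan_range", "1 1"))
```

Changing "1 1" to another first/last pair such as "10 20" would search that range of scans instead, which covers the "as many or as few as I'd like" case.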

Here's why Comet doesn't like your mzML file:

Towards the end of your mzML, it has this line entry:
      <offset idRef="scan=1">4375</offset>
That index offset value 4375 points to line 73 of test_spectra.mzML which is incorrect:
      <spectrumList count="2" defaultDataProcessingRef="pwiz_Reader_conversion">

I'm no mzML expert, but I believe the index offset value should point to line 74, which is the spectrum element that contains the "scan=1" id. The file offset for the beginning of this line is 4437. Here's line 74:
        <spectrum index="0" id="scan=1" defaultArrayLength="82">

If I manually edit your test_spectra.mzML with the changes below (the spectrum offset now reads 4437), Comet searches it fine.

  <indexList count="1">
    <index name="spectrum">
      <offset idRef="scan=1">4437</offset>
    </index>
    <index name="chromatogram">
    </index>
  </indexList>
  <indexListOffset>9047</indexListOffset>

</indexedmzML>
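
A rough way to catch this kind of mismatch before running Comet is to compare each offset in the spectrum index with the byte position where that spectrum element actually starts. The Python sketch below does that with regular expressions on the raw bytes. It is only a sanity check under stated assumptions, not a real mzML parser: the function name check_spectrum_offsets is hypothetical, it reports the position of the "<spectrum" tag itself (which may differ from the start of the line by the leading whitespace), and it does not check indexListOffset.

```python
import re

# One <offset idRef="...">N</offset> entry; whitespace around N is tolerated.
OFFSET_RE = re.compile(rb'<offset idRef="([^"]+)"[^>]*>\s*(\d+)\s*</offset>')
# Opening <spectrum ...> tag with its id attribute (does not match <spectrumList>).
SPECTRUM_RE = re.compile(rb'<spectrum\s[^>]*\bid="([^"]+)"')

def check_spectrum_offsets(data: bytes):
    """Return a list of (id, indexed_offset, actual_offset) tuples.

    actual_offset is -1 when no <spectrum> element with that id is found.
    """
    actual = {m.group(1): m.start() for m in SPECTRUM_RE.finditer(data)}
    index = re.search(rb'<index name="spectrum">(.*?)</index>', data, re.DOTALL)
    if index is None:
        return []
    return [
        (id_ref.decode(), int(offset), actual.get(id_ref, -1))
        for id_ref, offset in OFFSET_RE.findall(index.group(1))
    ]

if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as fh:  # read bytes so offsets are file offsets
        for sid, indexed, found in check_spectrum_offsets(fh.read()):
            status = "ok" if indexed == found else "MISMATCH"
            print(f"{sid}: index says {indexed}, <spectrum> found at {found} [{status}]")
```

Reading the file in binary mode matters here, since the index stores byte offsets and text-mode newline translation could shift them.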

Hope this helps.  Thanks for using Comet and let me know if you have any follow-up questions!

Jimmy

Suresh Poudel

Nov 1, 2024, 1:43:53 PM
to Erik Johnson, Comet ms/ms db search support
You can always make an .ms2 file instead; Comet works great with that format.

Best,


Suresh Poudel
Lead Bioinformatics Research Scientist
Laboratory of Douglas R. Green
Department of Immunology
St. Jude Children's Research Hospital

