Hi Erik.
Regarding #1, the processed "spectrum" that Comet uses internally is a binned array of floating point numbers where the array index is the binned mass and the array value is the processed intensity for the fast cross correlation scoring. The binned mass is based on the fragment_bin_tol and fragment_bin_offset parameters. This pseudo "spectrum", especially after the processing for the fast cross correlation scoring, is semi-useless unless you use it directly to generate the cross correlation scores. If this is what you want, intervene at line 929 of CometPreprocess.cpp and export pScoring->pfFastXcorrData[] in whatever format you'd want to see that array. Or, if you're interested in the array before the fast cross correlation processing, dump pdTmpRawData[] at line 979 of the same file.
If this isn't what you're looking for, definitely follow up again; I'll try to assist you in whatever you're trying to accomplish. I just don't think Comet's internal array representation of spectra is really going to be useful unless you want to learn how Comet does things. If you want either of these arrays and aren't comfortable adding the code to export them, let me know and I'll get you a binary that does this for you. You'll have to define what format you want these exported in.
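For anyone who wants to try the export themselves, here is a minimal C++ sketch of the two pieces involved: mapping a mass to an array index, and writing a binned array out as text. Everything in it is an assumption for illustration. The function names mass_to_bin and format_binned_array are hypothetical, the binning formula is only a guess at how fragment_bin_tol and fragment_bin_offset combine (check it against the actual source before relying on it), and the vector simply stands in for whichever array you decide to dump.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: map a mass to a bin index.
// ASSUMPTION: index = (int)(mass / bin_tol + (1 - bin_offset)).
// Verify this against the binning code in the Comet source.
int mass_to_bin(double mass, double bin_tol, double bin_offset) {
    assert(bin_tol > 0.0);
    return static_cast<int>(mass / bin_tol + (1.0 - bin_offset));
}

// Hypothetical helper: format a binned array as "index<TAB>value" lines.
// Zero-valued bins are skipped to keep the output small.
std::string format_binned_array(const std::vector<float>& bins) {
    std::ostringstream out;
    for (std::size_t i = 0; i < bins.size(); ++i) {
        if (bins[i] != 0.0f) {
            out << i << '\t' << bins[i] << '\n';
        }
    }
    return out.str();
}
```

A tab-delimited index/value layout is only one option; whatever format is chosen, keeping the bin index in the output makes it possible to convert back to an approximate mass later using the same two parameters.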
The "ions_total" value is the number of theoretical fragment ions in a peptide. For a peptide of length N, if you're only considering b- and y-ions, this number would be 2*(N-1). What you might be missing is that this number is also scaled by the number of fragment ion charge states that are analyzed. So if 1+ and 2+ fragment ions are analyzed, this number would be 2*2*(N-1). If 1+, 2+, and 3+ fragment ions are analyzed, this number would be 3*2*(N-1). The maximum fragment ion charge state considered is controlled by the max_fragment_charge parameter. Comet considers all fragment charge states up to 1 less than the precursor charge state or the charge defined in max_fragment_charge, whichever is less.
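The arithmetic above can be sketched as a small C++ function. This is an illustration of the formula as described here, not code taken from Comet; the function name ions_total and its parameters are hypothetical. One assumption goes beyond the description: for a 1+ precursor, "1 less than the precursor charge" would be zero, so the sketch clamps the number of fragment charge states to at least 1.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: expected "ions_total" for a peptide.
//   peptide_len          number of residues N
//   precursor_charge     precursor charge state
//   max_fragment_charge  value of the max_fragment_charge parameter
//   num_ion_series       2 when only b- and y-ions are considered
int ions_total(int peptide_len, int precursor_charge,
               int max_fragment_charge, int num_ion_series = 2) {
    assert(peptide_len >= 1 && precursor_charge >= 1);
    // Fragment charges run from 1+ up to the lesser of
    // (precursor charge - 1) and max_fragment_charge.
    int charge_states = std::min(precursor_charge - 1, max_fragment_charge);
    // ASSUMPTION: at least 1+ fragments are always considered.
    charge_states = std::max(charge_states, 1);
    return charge_states * num_ion_series * (peptide_len - 1);
}
```

For a 10-residue peptide with max_fragment_charge set to 3, this gives 18 for a 2+ precursor, 36 for a 3+ precursor, and 54 for a 4+ or higher precursor.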