Hi David,
Thanks for pointing out the wrong mass - there were a couple of others as well. So I've got quite a lot further with the analysis, but I'm not convinced that it's working as it should. Maybe it's easiest if I detail the steps I've gone through and ask some questions as I go. It would be great if you could answer these / annotate with any other thoughts you have:
Step 1: Search using standard 14N labelled masses. The 15N search seems to be fairly useless (see later). No variable modifications in search, because otherwise ASAPRatio complains and fails to run in 'static' mode.
Step 2: InteractParser $intout $in
Step 3: PeptideProphetParser $intout DECOY=XXX_ NONPARAM
Note I get a lot of model failures for this data. I get fewer model failures when I use fully parametric modelling, but then the decoy search hits are not properly taken into account, meaning that in the final prot.xml file I have single-protein protein groups consisting only of a decoy hit. So I continued with the semi-parametric options.
Step 4: ASAPRatioPeptideParser $in -lACDEFGHIKLMNPQRSTVWY -r0.5 -S -mA72.04581R160.1359N116.0603D116.0356C161.0393E130.0513Q130.076G58.03016H140.085I114.0928L114.0928K130.1124M132.0492F148.0771P98.06146S88.04073T102.0564W188.0967Y164.072V100.0771
So the -m string specifies the modified masses (ie 15N masses) for comparison with the 14N search results.
This doesn't work the other way round (taking the 15N search results and using the 14N masses in the -m string). This is because of cysteine carbamidomethylation: ASAPRatio sees the C mass as a 'heavy' static modification and the other residues as 'light' modifications, which results in an error message that there is a mixture of heavy and light modifications.
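As a sanity check on the -m string above, here is a minimal Python sketch that splits it into residue/mass pairs and confirms that every residue passed to -l has a mass entry. This is not part of the TPP; the function name parse_mass_string is made up for illustration, and the assumed format is simply one uppercase residue letter followed by a decimal mass.

```python
import re

# The -m string exactly as passed to ASAPRatioPeptideParser above.
M_STRING = ("A72.04581R160.1359N116.0603D116.0356C161.0393E130.0513Q130.076"
            "G58.03016H140.085I114.0928L114.0928K130.1124M132.0492F148.0771"
            "P98.06146S88.04073T102.0564W188.0967Y164.072V100.0771")

# Residues passed to -l.
LABELLED = "ACDEFGHIKLMNPQRSTVWY"


def parse_mass_string(spec):
    """Split 'A72.04581R160.1359...' into {'A': 72.04581, 'R': 160.1359, ...}.

    Hypothetical helper: assumes one uppercase letter followed by a number.
    """
    pairs = re.findall(r"([A-Z])(\d+(?:\.\d+)?)", spec)
    return {residue: float(mass) for residue, mass in pairs}


masses = parse_mass_string(M_STRING)
# Residues that are labelled but have no mass in the -m string.
missing = sorted(set(LABELLED) - set(masses))

for residue in sorted(masses):
    print(residue, masses[residue])
print("labelled residues without a mass:", missing)
```

An empty "missing" list only shows that the two options are consistent with each other; it says nothing about whether the masses themselves are right.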
Step 5: ProteinProphet $light $out ASAP_PROPHET
I get ratios for reasonable numbers of proteins - but unfortunately I'm not convinced these are reliable. To take one example:
> L/H ratio given as 2.98
> Based on peptide: CTTSAAATSTSSGR
> Looking at the details in the viewer I see "light +2 m/z 679.3" and "heavy +2 m/z 679.8"
> There are 17 N atoms in the peptide, or 16 after one neutral loss. In the latter case the light/heavy mass difference is about 16 Da, which is about 8 in m/z at charge +2. So ASAPRatio should be looking for a heavy peak at an m/z of about 687.
> When I look in the 3D data viewer there are candidate peaks in this region.
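The arithmetic behind that expectation can be written out as a short Python sketch. The assumptions here are mine: each labelled nitrogen adds the 15N minus 14N mass difference of about 0.997 Da, only the nitrogens of the amino acid itself are labelled (so the nitrogen added to cysteine by the alkylating reagent is not counted), and the helper names are hypothetical.

```python
# Mass difference between 15N and 14N in Da (assumed value, ~0.997).
N15_SHIFT = 0.997035

# Nitrogen atoms per amino acid residue (backbone plus side chain).
N_PER_RESIDUE = {residue: 1 for residue in "ACDEFGILMPSTVY"}
N_PER_RESIDUE.update({"R": 4, "H": 3, "K": 2, "N": 2, "Q": 2, "W": 2})


def count_nitrogens(peptide):
    """Count the nitrogens that would carry a 15N label in the peptide."""
    return sum(N_PER_RESIDUE[residue] for residue in peptide)


def expected_heavy_mz(light_mz, n_nitrogens, charge):
    """Shift the light m/z by the total 15N mass difference divided by charge."""
    return light_mz + n_nitrogens * N15_SHIFT / charge


peptide = "CTTSAAATSTSSGR"
n_total = count_nitrogens(peptide)
print("nitrogens in peptide:", n_total)

# Light +2 m/z reported by the viewer is 679.3.
for n in (n_total, n_total - 1):
    print(n, "N ->", round(expected_heavy_mz(679.3, n, 2), 2))
```

Either way the expected heavy peak sits roughly 8 m/z units above the light one, nowhere near the 679.8 that the viewer reports.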
I also tried xpress and got quantifications that were quite different (L/H is 0.66 in this example). The trouble with the xpress output is that there doesn't seem to be a way to click through to the underlying data for inspection. In fact, using Petunia I see no way of finding out which peptide or heavy peak identification the ratio is based on. Presumably this information is buried somewhere in the prot.xml / pep.xml files?
So my questions are:
1. Should -r be bigger to allow ASAPRatio to find the correct heavy peak?
2. Why does ASAPRatio accept a heavy peak that is so obviously the wrong mass given the static modifications? It seems to be completely ignoring them!
3. Ideally I would use the 15N search results to validate the identity of the heavy peak where possible. Is there a way to do this other than looking it up manually?
Many thanks for the help!
Alastair