MS1 peak area vs peak height for protein quantification

adriaan...@gmail.com

Feb 22, 2018, 8:47:38 AM

to ProteomicsQA

Hi all,

I was wondering what the preferred feature is to measure for protein quantification: the MS1 peak height or the peak area?
And what would be the main differences between the two approaches?

When looking at data from proteins that are differentially expressed in two conditions, I have found that log fold changes from peak intensity (peak height) are lower than log fold changes from peak areas (e.g. as calculated in MaxQuant).

Is there a 'rule' on how peak intensities and areas are related?

I was googling for literature that discusses the two approaches but didn't find anything. Is this already discussed somewhere in the literature?

Best regards

David Bouyssié

Feb 23, 2018, 3:41:43 AM

to proteo...@googlegroups.com

Dear Adrian,

I don't think there is a golden rule and would say it depends on your experimental setup.
For SRM data I think that the peak area is mostly used.
For shotgun proteomics it depends on the software used.
I think that MaxQuant uses peak areas, but other tools report the two values together.

From a mathematical point of view, if chromatographic peaks were perfect Gaussians of the same width in every run, then using the area or the apex intensity would be strictly identical in terms of relative quantification, because the area of a Gaussian is proportional to its height.
A first problem is that XICs are often asymmetrical, so they are often fitted using two kinds of distributions (Gaussian and Lorentzian, for instance) for the left and right sides of the XIC.
While this should not change the relative equivalence of area and apex intensity, it makes the exact determination of the XIC boundaries (start/end in the RT dimension), and thus the integration of the area, harder.
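
To make the Gaussian argument concrete, here is a minimal sketch with made-up numbers (the apex values, peak widths and RT grid are all hypothetical, not taken from any dataset). It builds ideal Gaussian XICs and compares the apex ratio with the area ratio, first with identical peak widths and then with a broader peak in the second run.

```python
import math

def gaussian_xic(apex, rt_center, sigma, rts):
    """Intensities of an ideal Gaussian elution profile at the given RTs."""
    return [apex * math.exp(-0.5 * ((rt - rt_center) / sigma) ** 2) for rt in rts]

def trapezoid_area(rts, intensities):
    """Integrate an XIC over RT with the trapezoidal rule."""
    return sum(
        0.5 * (intensities[i] + intensities[i + 1]) * (rts[i + 1] - rts[i])
        for i in range(len(rts) - 1)
    )

# Hypothetical RT grid: 0 to 120 s, one point every 0.5 s.
rts = [i * 0.5 for i in range(241)]

# Hypothetical peptide, 4x more abundant in run B than in run A.
run_a = gaussian_xic(apex=1.0e6, rt_center=60.0, sigma=5.0, rts=rts)
run_b = gaussian_xic(apex=4.0e6, rt_center=60.0, sigma=5.0, rts=rts)
# Same apex as run B, but the peak is 1.5x broader (assumed for illustration).
run_b_wide = gaussian_xic(apex=4.0e6, rt_center=60.0, sigma=7.5, rts=rts)

apex_ratio = max(run_b) / max(run_a)
area_ratio = trapezoid_area(rts, run_b) / trapezoid_area(rts, run_a)
area_ratio_wide = trapezoid_area(rts, run_b_wide) / trapezoid_area(rts, run_a)

print(f"apex ratio:                  {apex_ratio:.3f}")
print(f"area ratio (same width):     {area_ratio:.3f}")
print(f"area ratio (broader peak B): {area_ratio_wide:.3f}")
```

With equal widths the two ratios coincide. As soon as the width differs between runs, the area ratio scales with the width ratio while the apex ratio does not, so the equivalence only holds when peak shape is reproducible.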

Another drawback of using the area is the impact of interferences on the computed value. If two peptides partially overlap in the RT dimension and the signal detection algorithm fails to separate the two signals, you may overestimate the peptide ion quantity when using the area. The apex may be less affected by the interference, depending on the degree of overlap.
In my experience, using the apex intensity helps a lot to avoid this kind of issue.
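
The interference effect can be sketched with a toy simulation. Everything here is an assumption chosen for illustration: two ideal Gaussian peaks, an interfering peptide at half the target's height eluting 12 s later, and a detection step that integrates the summed signal as a single feature.

```python
import math

def gaussian(apex, center, sigma, rt):
    """Intensity of an ideal Gaussian elution profile at one RT."""
    return apex * math.exp(-0.5 * ((rt - center) / sigma) ** 2)

def trapezoid_area(rts, intensities):
    """Integrate an XIC over RT with the trapezoidal rule."""
    return sum(
        0.5 * (intensities[i] + intensities[i + 1]) * (rts[i + 1] - rts[i])
        for i in range(len(rts) - 1)
    )

# Hypothetical RT grid: 0 to 120 s, one point every 0.5 s.
rts = [i * 0.5 for i in range(241)]

# Target peptide and a partially co-eluting interference (made-up values).
target = [gaussian(1.0e6, 60.0, 5.0, rt) for rt in rts]
interference = [gaussian(5.0e5, 72.0, 5.0, rt) for rt in rts]

# Assume the signal detection fails to split the two peaks,
# so the quantified feature is the sum of both signals.
merged = [t + i for t, i in zip(target, interference)]

apex_bias = max(merged) / max(target)
area_bias = trapezoid_area(rts, merged) / trapezoid_area(rts, target)

print(f"apex overestimation factor: {apex_bias:.3f}")  # about 1.03
print(f"area overestimation factor: {area_bias:.3f}")  # about 1.5
```

In this toy case the area absorbs the whole interfering peak, while the apex only picks up its tail. The gap narrows as the two peptides elute closer together, which is the dependence on the degree of overlap mentioned above.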

However, a drawback of the apex intensity is that it relies on a single observation, so if your signal is noisy (not smooth across the RT dimension) you may also overestimate your quantity. But keep in mind that integrating the area of noisy peaks degrades your signal-to-noise ratio, because the SNR is not constant across the RT dimension and is much higher at the apex.

The best approach for you, I would say, is to benchmark the accuracy of the ratios for peptides spiked at known concentrations and to compare the linearity of observed versus theoretical values.
Three years ago we published a benchmarking study of different tools for label-free quantification (https://www.sciencedirect.com/science/article/pii/S187439191530186X?via%3Dihub).
In this study, UPS1 proteins were spiked at different concentrations into a yeast background kept at constant concentration.
You will find the corresponding raw files in ProteomeXchange using the identifier PXD00181.
You can use these files if you want to make a comparison using Skyline and MaxQuant, for instance.
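
One possible way to set up such a linearity check is sketched below. The spiked amounts and both intensity series are synthetic placeholders generated from a formula purely to exercise the code; they are not values from the study or from any tool. The idea is to regress log2 observed intensity on log2 spiked amount: a slope near 1 means fold changes are reproduced, and a slope below 1 means they are compressed.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical spiked amounts (fmol), NOT the concentrations of any real study.
spiked_fmol = [0.5, 5.0, 50.0, 500.0]

# Synthetic read-outs for two hypothetical quantification methods:
# method A responds proportionally, method B compresses the dynamic range.
method_a = [2.0e4 * c for c in spiked_fmol]
method_b = [2.0e4 * c ** 0.8 for c in spiked_fmol]

log_expected = [math.log2(c) for c in spiked_fmol]
slope_a, _ = fit_line(log_expected, [math.log2(v) for v in method_a])
slope_b, _ = fit_line(log_expected, [math.log2(v) for v in method_b])

print(f"method A slope (log2 observed vs log2 spiked): {slope_a:.2f}")
print(f"method B slope (log2 observed vs log2 spiked): {slope_b:.2f}")
```

With real data you would replace the two synthetic lists with the area-based and apex-based intensities exported by your software for the spiked proteins.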

Best,

David

adriaan...@gmail.com

Feb 25, 2018, 1:34:53 AM

to ProteomicsQA

Hi David,

Thanks for the answer.
The observation I mentioned in my original post was based on data from the CPTAC UPS study.
When I looked at the log fold changes from area-based intensities, they came close to the fold changes expected from the spiked-in concentrations, while the apex-based fold changes were consistently underestimated.
It also looks like the relationship is linear, so you could correct for it by multiplying by a constant.
I guess that if you are only interested in detecting a difference, they both work fine, but if you want to know by how much, you're better off with area-based intensities.
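
A rough sketch of what such a constant correction could look like, assuming the apex-based log fold changes really are proportional to the expected ones. The numbers are invented placeholders, not the CPTAC values; the constant is estimated by least squares through the origin on the spiked standards.

```python
# Hypothetical log2 fold changes for spiked standards (placeholder values).
expected_log2fc = [1.0, 2.0, 3.0]
apex_log2fc = [0.8, 1.6, 2.4]

# Least-squares scale factor k for the model: expected = k * apex.
k = sum(a * e for a, e in zip(apex_log2fc, expected_log2fc)) / sum(
    a * a for a in apex_log2fc
)

corrected_log2fc = [k * a for a in apex_log2fc]

print(f"scale factor: {k:.3f}")
print("corrected log2 fold changes:", [round(v, 3) for v in corrected_log2fc])
```

Such a factor would only be valid for the instrument, gradient and software settings it was estimated on.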

Best
Adriaan
