Error in NSAF calculation?


kschweig

Mar 1, 2012, 4:12:14 PM
to crux-users
Hi All,
I was looking closely at the NSAF values for some data and noticed
that they are all very tiny: the most abundant protein has
NSAF P69905 = 0.00021868, while its unnormalized score (the spectral
count, I assume) is P69905 = 1137.0. P69905 is 142 aa long, so
the numerator of the NSAF (as it's described on
http://noble.gs.washington.edu/proj/crux/spectral-counts.html) should
be about 8.0. The denominator should be the sum of the
length-normalized counts over all the proteins. What I think is
happening is that the denominator is not being length-normalized,
which would be a bug in the code.
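
For reference, the definition being described can be written out as below. This is a sketch of the standard NSAF formula as the post describes it; the symbols SpC_i (spectral count of protein i), L_i (its length in amino acids), and N (number of proteins) are notation introduced here, not taken from the Crux documentation. The second line is what the quoted code appears to compute instead.

```latex
% Intended NSAF: length-normalized count over the sum of length-normalized counts
\mathrm{NSAF}_i = \frac{\mathrm{SpC}_i / L_i}{\sum_{j=1}^{N} \mathrm{SpC}_j / L_j}

% What the code as posted computes: the denominator sums raw counts
\mathrm{score}_i = \frac{\mathrm{SpC}_i / L_i}{\sum_{j=1}^{N} \mathrm{SpC}_j}

% Numerator for P69905, using the values in the post
\frac{\mathrm{SpC}}{L} = \frac{1137}{142} \approx 8.0
```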

Here is the code snippet for protein normalization from
SpectralCounts.cpp, with my annotations:

/**
 * Changes the scores in protein_scores_ to either be divided by the
 * sum of all scores times the peptide length (SIN, NSAF) or to be the
 * final emPAI score.
 */
void SpectralCounts::normalizeProteinScores(){
  if( measure_ == MEASURE_EMPAI ){
    computeEmpai();
  } else {

    carp(CARP_INFO, "Normalizing protein scores");
    FLOAT_T total = 0.0;

    // calculate sum of all scores
    for (ProteinToScore::iterator it = protein_scores_.begin();
         it != protein_scores_.end(); ++it){
      FLOAT_T score = it->second;
      total += score; //----> This is not length normalized!!
    }

    // normalize by sum of all scores and length
    for (ProteinToScore::iterator it = protein_scores_.begin();
         it != protein_scores_.end(); ++it){
      FLOAT_T score = it->second;
      Protein* protein = it->first;
      it->second = score / total / protein->getLength(); //----> Here is the NSAF for each protein.
    }
  }
}


Am I interpreting this wrongly? Honestly, I am a C programmer, not a
C++ programmer, so maybe I am missing something here.

Best,
Karl

kschweig

Mar 1, 2012, 4:53:55 PM
to crux-users
I made a change to the first loop so that the total is length-normalized:

    // calculate sum of all length-normalized scores
    for (ProteinToScore::iterator it = protein_scores_.begin();
         it != protein_scores_.end(); ++it){
      FLOAT_T score = it->second;
      Protein* protein = it->first;
      total += score/protein->getLength();
    }

And now the value for P0CG48 is 0.02853354.

Also, the sum of all the NSAF values is now one.
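
The corrected calculation can be sketched outside of Crux as a small standalone function. This is an illustration under stated assumptions, not the Crux implementation: the `ProteinCount` struct and `computeNsaf` function are hypothetical names standing in for the `ProteinToScore` map and `normalizeProteinScores()` shown in the thread.

```cpp
#include <vector>

// Hypothetical stand-in for one entry of Crux's protein-to-score map.
struct ProteinCount {
  double spectral_count;  // unnormalized score (spectral count)
  double length;          // protein length in amino acids
};

// Returns one NSAF value per input protein:
// (count / length) divided by the sum of (count / length) over all proteins.
std::vector<double> computeNsaf(const std::vector<ProteinCount>& proteins) {
  // Denominator: sum of length-normalized counts (the step missing in the bug).
  double total = 0.0;
  for (const ProteinCount& p : proteins) {
    total += p.spectral_count / p.length;
  }

  std::vector<double> nsaf;
  nsaf.reserve(proteins.size());
  for (const ProteinCount& p : proteins) {
    // Guard against an all-zero input so we never divide by zero.
    nsaf.push_back(total > 0.0 ? (p.spectral_count / p.length) / total : 0.0);
  }
  return nsaf;
}
```

Because numerator and denominator are normalized the same way, the returned values sum to one for any input with at least one nonzero count, which matches the check reported above.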


Sean McIlwain

Mar 1, 2012, 5:30:12 PM
to kschweig, crux-users
Karl,

Thanks for catching that! I will post a patch to correct this bug
within the latest release of crux.

Thanks,
Sean


--
Sean J. McIlwain, Ph. D.
Post-doctoral Fellow, Noble Lab,
Dept. of Genome Sciences,
University of Washington
http://noble.gs.washington.edu/~mcilwain

kschweig

Mar 1, 2012, 7:26:17 PM
to crux-users
OK, sure.
Thanks much,
Karl

Sean McIlwain

Mar 2, 2012, 3:43:34 PM
to kschweig, crux-users
Here is the patch that will fix the protein length issue for NSAF in
the latest release of crux. These changes will also be within the
next release of crux.

To apply the patch:
1) cd to base directory of source code from crux_1.37.tar.gz.
2) patch -p0 < nsaf_length.diff
3) make distclean
4) ./configure
5) make

Let me know if there are any issues.

Thanks,
Sean


Attachment: nsaf_length.diff

Charles Grant

Mar 2, 2012, 4:06:29 PM
to kschweig, crux-users
Hi Karl,

On Mar 2, 2012, at 12:43 PM, Sean McIlwain wrote:

> Here is the patch that will fix the protein length issue for NSAF in
> the latest release of crux. These changes will also be within the
> next release of crux.
>
> To apply the patch:
> 1) cd to base directory of source code from crux_1.37.tar.gz.
> 2) patch -p0 < nsaf_length.diff
> 3) make distclean

A slight correction: step 3 should be 'make clean', not 'make distclean'.

Charles

kschweig

Mar 5, 2012, 3:49:27 PM
to crux-users
Thanks for posting the fix!

One other question. Would it be possible, in the next release, to have
raw spectral counts as an option in spectral-counts? It's nice to be
able to compare the unnormalized data to the various types of
normalized output, or to apply a different normalization scheme to the
data. I hacked my own version to allow for this by adding an
additional boolean flag to spectral-counts (--rawcounts T). This
co-opts the NSAF calculation and omits the normalization but does not
affect anything else. If that flag is false, it does the usual NSAF.
That's kind of wonky, but it works.
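
A minimal sketch of that idea, assuming a boolean that mirrors the `--rawcounts` flag described above. Everything here is hypothetical illustration: `ProteinCount`, `scoreProteins`, and the `raw_counts` parameter are invented names, and this is neither the poster's actual change nor an existing Crux option.

```cpp
#include <vector>

// Hypothetical stand-in for one entry of Crux's protein-to-score map.
struct ProteinCount {
  double spectral_count;  // unnormalized score (spectral count)
  double length;          // protein length in amino acids
};

// When raw_counts is true, return the spectral counts untouched;
// otherwise return NSAF values (length-normalized, summing to one).
std::vector<double> scoreProteins(const std::vector<ProteinCount>& proteins,
                                  bool raw_counts) {
  std::vector<double> scores;
  scores.reserve(proteins.size());

  if (raw_counts) {
    // Skip normalization entirely and report the raw counts.
    for (const ProteinCount& p : proteins) {
      scores.push_back(p.spectral_count);
    }
    return scores;
  }

  // Usual NSAF path: length-normalize both numerator and denominator.
  double total = 0.0;
  for (const ProteinCount& p : proteins) {
    total += p.spectral_count / p.length;
  }
  for (const ProteinCount& p : proteins) {
    scores.push_back(total > 0.0 ? (p.spectral_count / p.length) / total : 0.0);
  }
  return scores;
}
```

Keeping the raw path as an early return leaves the normalization code unchanged, so the two outputs can be compared side by side.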

Incidentally, it's how I discovered the bug in spectral-counts.

Best,
-karl

Charles Grant

Mar 5, 2012, 8:10:49 PM
to kschweig, crux-users
Hi Karl,

On Mar 5, 2012, at 12:49 PM, kschweig wrote:

> One other question. Would it be possible, in the next release, to have
> raw spectral counts as an option in spectral-counts? It's nice to be
> able to compare the unnormalized data to the various types of
> normalized output, or to apply a different normalization scheme to the
> data. I hacked my own version to allow for this by adding an
> additional boolean flag to spectral-counts (--rawcounts T). This
> co-opts the NSAF calculation and omits the normalization but does not
> affect anything else. If that flag is false, it does the usual NSAF.
> That's kind of wonky, but it works.
>
> Incidentally, it's how I discovered the bug in spectral-counts.


This is a great suggestion. Thanks. We've added it to the work list for the next release of Crux.

Charles
