FDR Uncertainty Measures

dctrud

Jun 9, 2009, 9:43:21 AM

to spctools-discuss

Hi,

In the latest TPP versions ProphetModels.pl now calculates uncertainty
values for the FDRs (both prophet and decoy calculated), with the
decoy FDR uncertainties displayed as error bars in the plots. I note
that the uncertainty values are calculated with the formula:

$PP_decoy_uncert = sqrt($fdr_pp_decoy*(1-$fdr_pp_decoy)/$num_pos_pp);

Where does this formula come from? I came across a paper
recently in which FDR uncertainty is modelled, leading to asymmetric
confidence intervals for the FDR dependent on dataset size and number
of decoy hits:

Edward L. Huttlin, Adrian D. Hegeman, Amy C. Harms, and Michael R.
Sussman
Prediction of Error Associated with False-Positive Rate Determination
for Peptide Identification in Large-Scale Proteomics Experiments Using
a Combined Reverse and Forward Peptide Sequence Database Strategy
J. Proteome Res., 2007, 6 (1), 392-398

Given the example in the paper, where at 1% FDR there were 35 decoy
hits out of 3418

We have a pipeline incorporating some of the TPP software where I need
to give users an assessment of the uncertainty in FDRs, and am trying
to decide whether to pass through the ProphetModels values or use the
Huttlin et al. model.

Many Thanks,

Dave Trudgian

David Shteynberg

Jun 9, 2009, 4:15:17 PM

to spctools...@googlegroups.com

Hi Dave,

This is the basic standard error calculation that can be applied to a
simple random sample.

-David
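
To make the standard-error formula above concrete, here is a minimal sketch in Python (not the TPP Perl code) applied to the figures quoted earlier in the thread: 35 decoy hits out of 3418. The function name `fdr_standard_error` is hypothetical, and treating the FDR as the plain decoy fraction 35/3418 is an assumption; TPP or the Huttlin et al. paper may define the FDR differently (for example, by scaling the decoy count).

```python
import math


def fdr_standard_error(fdr, n):
    """Standard error of a proportion from a simple random sample:
    sqrt(p * (1 - p) / n), with p the FDR and n the number of hits."""
    if n <= 0:
        raise ValueError("n must be a positive count")
    if not 0.0 <= fdr <= 1.0:
        raise ValueError("fdr must lie between 0 and 1")
    return math.sqrt(fdr * (1.0 - fdr) / n)


# Figures quoted in the thread; using decoy_hits / total_hits as the FDR
# is an assumption made for illustration only.
decoy_hits = 35
total_hits = 3418

fdr = decoy_hits / total_hits
se = fdr_standard_error(fdr, total_hits)

print(f"FDR = {fdr:.4f} +/- {se:.4f}")  # prints "FDR = 0.0102 +/- 0.0017"
```

Note that this estimate is symmetric about the FDR, whereas the paper cited above is described as giving asymmetric confidence intervals, so the two approaches would be expected to diverge most when the number of decoy hits is small.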

Dave Trudgian

Jun 10, 2009, 4:58:02 AM

to spctools...@googlegroups.com

David,

Thanks. This makes sense now; I was misreading $num_pos_pp as
something other than n.

DT

--
Dr. David Trudgian
Bioinformatician in Proteomics
University of Oxford

Tel: (+44) (01865 2)87807 (CCMP - Mon-Thu)
Tel: (+44) (01865 2)75557 (Dunn Sch - Fri)
