The source of my calculations is
http://cnx.rice.edu/content/m10805/latest/
which seems pretty straightforward.
Does anyone know what algorithm Excel uses for Percentile?
Or does anyone have any suggestions on how I can track down this
discrepancy?
Thanks!
---Selden McCabe
What source figures did you use? What results did you expect? What did you
get instead? What results did Excel give?
--
Kind Regards,
Niek Otten
Microsoft MVP - Excel
"Selden McCabe" <sel...@msn.com> wrote in message
news:eC4thJh1...@TK2MSFTNGP10.phx.gbl...
> Does anyone know what algorithm Excel uses for Percentile? <
Online Help terminology: PERCENTILE(array,k)
Zero percentile (k = 0) is the minimum value (rank = 1) of the data set
array.
100th percentile (k = 1) is the maximum value (rank = n) of the array.
If k is a multiple of 1/(n-1), PERCENTILE returns the appropriate ranked
value (rank = 2 through n-1) of the array.
If k is not a multiple of 1/(n-1), PERCENTILE interpolates (linear between
two adjacent ranked values) to determine the value at the k-th percentile.
- Mike Middleton, www.usfca.edu/~middleton
For the 25th percentile, Excel computed 2.81
using the interpolation algorithm
(which you may find at
http://cnx.rice.edu/content/m10805/latest/)
I compute it to be 2.765
Any more thoughts?
Thanks,
---Selden McCabe
I sort the data, (there are 21 items) then compute the rank of the 25th
percentile,
wich is .25 * (21 +1) = 5.5
so the result should be .5 between the 5th and 6th numbers, whcih are 2.72
and 2.81, which would be 2.765.
However the PERCENTILE(a1:a21,.25) function in Excel yields 2.81.
Do my calculations look right? Am I missing something obvious?
Thanks!
Instead the 6th ranked value (1:0%, 2:5%, 3:10%...6:25%) is returned
(2.81).
Note that XL's interpolation method is not the same as that found at
the site you reference, so you shouldn't expect the same results.
In article <OdE626F...@TK2MSFTNGP11.phx.gbl>,
XL's algorithm is based on the lowest value being the zeroth
percentile, so the 25th percentile should be the 6th ranked item,
not interpolated.
In article <e04P$BG2DH...@TK2MSFTNGP11.phx.gbl>,
Not starting the numbering from zero was my main error, along with
multiplying the desired percentile * (N + 1) instead of (N - 1). Now I'm
getting the same results as Excel!
Thanks again!
---Selden
In your link, Lane writes that "there is no universally accepted
definition of a percentile."
Hyndman and Fan, 1996, "Sample Quantiles in Statistical Packages", The
American Statistician 50(4):361-365 discuss 9 different definitions and
reference some others. Excel uses Hyndman and Fan's 7th definition,
which considers the min and max to be the 0th and 100th percentiles.
Lane recommends Hyndman and Fan's 6th definition, which considers the
min and max to be the 1/(n+1) and n/(n+1) percentiles.
Both definitions evenly space the percentiles of the remaining
observations between the min and max, and use linear interpolation to
estimate unobserved percentiles. Most of Hyndman and Fan's other
definitions work the same way, but take the percentiles of the min and
max to be somewhere in between the values chosen by Excel and by Lane.
Jerry