Get the middle part of a result set

11 views
Skip to first unread message

Jens Riedel

unread,
Dec 18, 2007, 3:17:49 AM12/18/07
to
Hi,

I have to following problem:

I get X rows from a statement, these are sorted by a certain column,
let's say an numerical value.
Now I want to calculate the average of this numerical value, but the 10%
with the lowest and the 10% with the highest value shall not be included
in this calculation.
So for example, if I get 20 rows, I need the average of the value in
rows 3 to 18.

Currently I solved this with a very complicated statement, but I don't
know the built-in Oracle mathematical functions so I hope that there
could be a way to do this with a better performance.

Thanks for any hints,

Jens

Charles Hooper

unread,
Dec 18, 2007, 7:08:44 AM12/18/07
to

Let's set up a short experiment:
CREATE TABLE T1 (C1 NUMBER(4));

INSERT INTO T1
SELECT
ROWNUM
FROM
DUAL
CONNECT BY
LEVEL<=20;

We now have a table with 20 rows with numbers between 1 and 20.

Assuming that you are running a version of Oracle that supports
analytical functions, the following returns the twenty rows with the
relative ranking of each row, if the rows are sorted by C1 in
descending order:
SELECT
C1,
DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
COUNT(C1) OVER (PARTITION BY 1) R
FROM
T1;

C1 DR R
---------- ---------- ----------
20 1 20
19 2 20
18 3 20
17 4 20
16 5 20
15 6 20
14 7 20
13 8 20
12 9 20
11 10 20
10 11 20
9 12 20
8 13 20
7 14 20
6 15 20
5 16 20
4 17 20
3 18 20
2 19 20
1 20 20

A slight modification of the above, dividing the value of DENSE_RANK
by the value of COUNT, and also including a PERCENT_RANK for
comparison:
SELECT
C1,
DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
(DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION BY
1)) DRP,
PERCENT_RANK() OVER (ORDER BY C1 DESC) PR
FROM
T1;

C1 DR DRP PR
---------- ---------- ---------- ----------
20 1 .05 0
19 2 .1 .052631579
18 3 .15 .105263158
17 4 .2 .157894737
16 5 .25 .210526316
15 6 .3 .263157895
14 7 .35 .315789474
13 8 .4 .368421053
12 9 .45 .421052632
11 10 .5 .473684211
10 11 .55 .526315789
9 12 .6 .578947368
8 13 .65 .631578947
7 14 .7 .684210526
6 15 .75 .736842105
5 16 .8 .789473684
4 17 .85 .842105263
3 18 .9 .894736842
2 19 .95 .947368421
1 20 1 1

The final cleanup is performed when the above is slid into an inline
view, by using a WHERE clause:
SELECT
SUM(T.C1) S
FROM
(SELECT
C1,
DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
(DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION
BY 1)) DRP,
PERCENT_RANK() OVER (ORDER BY C1 DESC) PR
FROM
T1) T
WHERE
T.DRP>0.1
AND T.DRP<=0.9;

S
----------
168

A version that uses the PERCENT_RANK value:
SELECT
SUM(T.C1) S
FROM
(SELECT
C1,
DENSE_RANK() OVER (ORDER BY C1 DESC) DR,
(DENSE_RANK() OVER (ORDER BY C1 DESC))/(COUNT(C1) OVER (PARTITION
BY 1)) DRP,
PERCENT_RANK() OVER (ORDER BY C1 DESC) PR
FROM
T1) T
WHERE
T.PR BETWEEN 0.1 AND 0.9;

S
----------
168

You will obviously need to test the above approach with your data.

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.

Jens Riedel

unread,
Dec 18, 2007, 10:16:45 AM12/18/07
to
Hi Charles,

thank you very much for this example, it goes in the same direction like
my approach but looks a bit more generic - I'll test it and try if it
increases the performance of my current solution.

KR,
Jens

Reply all
Reply to author
Forward
0 new messages