I'm trying to rank each product (most, second most, etc.) by weight.
Data worksheet
A B C
1 Product Type Pounds
2 Orange Fruit 600
3 Tomato Vegetable 500
4 Apple Fruit 700
5 Potato Vegetable 1,000
Using the LARGE function, the goal is to get the following result on
another worksheet:
Order worksheet
A B C
1 Product 1 2
2 Fruit Apple Orange
3 Vegetable Potato Tomato
On the Order worksheet, the following formula is the one I'm using
with an array:
{=IFERROR(INDEX(Data!$A$1:$C$5,MATCH(LARGE(IF(Data!$B$1:$B$5=
$A2,Data!$C$1:$C$5,""),B$1),Data!$C$1:$C$5,0),1),"")}
Does anyone know how to get the same result without using an array -
even if it uses another function ?
=IFERROR(INDEX(Data!$A$1:$A$5,MATCH(LARGE(INDEX(
(Data!$B$2:$B$5=$A2)*Data!$C$2:$C$5,),B$1),Data!$C$1:$C$5,0)),"")
BTW, this (and your original) would fail if you had two products of the same
weight and type.
HTH
Steve D.
"exceluser" <ifmcqy7...@yahoo.com> wrote in message
news:4ed0be3e-3a9b-44eb...@u7g2000vbq.googlegroups.com...
=IFERROR(INDEX(Data!$A$1:$A$10,MATCH(LARGE(INDEX(
(Data!$B$2:$B$10=$A2)*Data!$C$2:$C$10+
ROW(Data!$C$2:$C$10),),B$1),INDEX(Data!$C$2:$C$10+
ROW(Data!$C$2:$C$10),),0)+1),"")
"Steve Dunn" <st...@sky.com> wrote in message
news:9F7E0B2D-7169-4572...@microsoft.com...
Hello,
I suggest to use array functions to get your result right:
In sheet Data, enter into D1 "Type Rank" and into D2
=SUMPRODUCT(--($B2=$B$2:$B$999),--($C2<$C$2:$C$999))+SUMPRODUCT(--($B2=
$B$2:$B2),--($C2=$C$2:$C2))
and copy down.
Then you can array-enter in sheet Order into B2:
=INDEX(Data!$A$2:$A$999,MATCH($A2&"|"&B$1,Data!$B$2:$B$999&"|"&Data!$D
$2:$D$999,0))
and copy down and across.
If you like to compare with other approaches, test with Pounds data
like 1,1,1,1 or 0,0,0,0, for example.
Further examples you can find at
http://sulprobil.com/html/sorting.html
Regards,
Bernd
Bernd,
Thank you very much for that fast and very detailed response.
The reason that I'm trying to avoid using an array is that the
sheer number of cells that would contain this formula (15,000+) would
cause my computer to lockup for over an hour and a half while it
recalculates.
I recently replaced an array with a formula using the SUMIF
function and that rewrite alone reduced the calculation time from 2.5
hours to 1.5 hours.
Now I'm just trying to eliminate that last 1.5 hours which is the
reason for my original post.
It looks like I'll be able to avoid the array with the formula that
Steve suggested above.
I also checked out your website and it was very helpful.
Again, I appreciate the time you spent creating the solution above.
Exceluser
Steve,
You are so THE MAN !
It makes me almost forget that whole BP thing ... for a few hours
anyway.
The reason why I wanted to avoid using an array was because the
original formula I was using with an array (in over 15,000 cells) was
using %100 of the CPU for approximately
90 minutes on an Intel 3.4 GHz processor with 2 GB of RAM.
To make matters worse, new data is imported daily and making any
change effecting those cells would cause a 90 minute recalculation.
This new formula has reduced the calculation from 90 minutes to
under 30 seconds.
Could you explain why ... ?
1) The second INDEX function skips the first row for 'Data!$B$2:$B$5
and 'Data!$C$2:$C$5 rather than using the whole column
2) The second INDEX function multiplies those two ranges
Thanks again for that super fast formula.
Exceluser
Steve,
Is there a way to use the second formula when sorting with a mix of
positive and negative numbers ?
Exceluser
To Max, Minyeh, Steve, Steve (UK) and Bernd P,
TA DAH !
Here it is in all its working, fattened glory.
=IFERROR(INDEX(Data!$A$2:$A$5,MATCH(INDEX(LARGE(((Data!$B$2:$B$5=
$A2)*(Data!$C$2:$C$5)+(ROW(Data!$C$2:$C$5)/10^10))+((Data!$B$2:$B$5<>
$A2)*(SMALL(Data!$C$2:$C$5,1)-1)),B$1),),INDEX(Data!$C$2:$C$5+
(ROW(Data!$C$2:$C$5)/10^10),),0)),"")
The new formula:
1) Supports sorting positive and negative numbers
2) Creates a tie breaker for duplicate values
3) Is orders faster (and fatter) than a similarly constructed array
entered formula
4) Dispenses with 0's created from FALSE values in an INDEX array
5) Does not require any additional helper columns, rows or worksheets
to store workaround data
With your very kind permission, I'd like to thank the innocent and
unsuspecting parties above who inadvertently subjected themselves to a
ridiculously, tortuous exercise.
This formula has replaced an array entered formula and crushed the
calculation time from 90 minutes to 95 seconds.
It's like replacing a $1,000,000 Rolls Royce with a dump truck that
can do 200 mph - it's not pretty, but it gets the job done.
Thanks to all of you, I've learned more about writing moderately
complex Excel formulas in the last week than I have in the last few
years.
And I shouldn't leave out the fact that I've spent an inordinate
amount of time this past week with a dumber than usual look on my face
as I tried to figure out how each of your solutions worked.
You can now resume your regularly scheduled lives ... until next
time.
MOO HOO HOO HA HA (evil laugh).
Exceluser
Hello,
Glad that you had some fun.
I think it's you who returns to a "regularly scheduled life" :-)
Just in case you lose some faith in your formula in future:
Function FoodTop2(r As Range) As Variant
'Select range of 3 x 3 cells and array-enter this function.
'First column of input range is Type, second Product, third Pounds.
'Reverse("moc.liborplus.www") PB V0.1 05-Jun-2010
Dim i As Long, j As Long
Dim s1 As String, s2 As String
Dim st(1 To 2, 1 To 2) As String
Dim d(1 To 2, 1 To 2) As Double
Dim vR(1 To 3, 1 To 3) As Variant
s1 = "Fruits"
s2 = "Vegetables"
For i = 1 To r.Rows.Count
If IsEmpty(r.Cells(i, 2)) Then Exit For
j = 1: If s2 = r.Cells(i, 2).Text Then j = 2
If st(j, 1) = "" Then
'Initialize first of this product
st(j, 1) = r.Cells(i, 1).Text
d(j, 1) = r.Cells(i, 3).Value
Else
If st(j, 2) = "" Then
'Initialize second of this product
st(j, 2) = r.Cells(i, 1).Text
d(j, 2) = r.Cells(i, 3).Value
Else
If d(j, 2) < r.Cells(i, 3).Value Then
If d(j, 1) < r.Cells(i, 3).Value Then
'Replace first and second
st(j, 2) = st(j, 1)
d(j, 2) = d(j, 1)
st(j, 1) = r.Cells(i, 1).Text
d(j, 1) = r.Cells(i, 3).Value
Else
'Replace second only
st(j, 2) = r.Cells(i, 1).Text
d(j, 2) = r.Cells(i, 3).Value
End If
End If
End If
End If
Next i
vR(1, 1) = "": vR(1, 2) = 1: vR(1, 3) = 2
vR(2, 1) = s1: vR(2, 2) = st(1, 1): vR(2, 3) = st(1, 2)
vR(3, 1) = s2: vR(3, 2) = st(2, 1): vR(3, 3) = st(2, 2)
FoodTop2 = vR
End Function
A sample file is here:
http://dl.dropbox.com/u/6077606/20100610_PB_01_Stats_on_Subtypes_15000_rows.xlsm
This macro is about 10x slower than your formula - which means that
the algorithm could be about 10x more efficient than your formula
(assuming that VBA is in general 100x slower than worksheet
functions). Of course this will only be important for you if you
detect an error in your formula or if you need to apply changes later
which you might not be able to solve with your formula.
Regards,
Bernd
Bernd,
You must wear ear plugs to keep all those brains from spilling out
of your head.
Thank you for putting the time into that last post.
At some point, I'll try to make my own user defined functions - at
which time I'll understand what you just coded.
Can you recommend some reading material for beginners that wish to
learn to create UDFs ?
Exceluser
Hello,
I suggest to follow up on examples presented here or in other forums
on Excel.
Chip Pearson has a good website. I have collected some examples as
well:
www.sulprobil.com
Maybe you would like to start with Multicat. If you understand the
difference between J.E.McGimpsey's version and mine, you will
certainly have mastered a first good step.
Regards,
Bernd
The only thing this formula doesn't do is indicate which values are
involved in a tie breaker.
For example, you're a judge that's responsible for selecting 10
athletes out of 30 entrants to go to the Olympics by scoring them on
various criteria.
The top 9 athletes score unique values.
However, athletes #10 and #11 score equally.
Depending on how the data is sorted, #11 could be eliminated from
consideration just because of where their name falls in the list to
sort on.
In this situation, the formula would have to be modified to account
for this.
Perhaps by eliminating the tie breaker and concatenating the names
with equal values.
Any ideas ?
Exceluser
Hello,
Roll a die for each tie and freeze the value.
You will be in good company - even in the current world cup they would
do this in special cases (ties), I think.
Regards,
Bernd
You could "see" easily which names are associated with tied scores
(and there could be multiple ties, it doesn't matter) if you use the
suggested simple set-up to return the scores next to the auto-sorted
names. And beyond the obvious visuals, in an adjacent col, you could
simply use COUNTIF on the scores col, any count > 0 returned will
indicate ties.
should read:
> .. any count > 1 returned will indicate ties.