Rank Correlation


Luke Setzer

Jan 9, 2009, 7:50:07 AM
to tinspire
I am studying rank correlation in a statistics book.

This involves converting two lists of data to ranks and then
determining if they correlate.

For instance:

x:={1000, 5, 20, 19, 6}
y:={20, 7, 15, 200, 1}

I just made those from thin air but bear with me.

I would convert them to ranks, i.e. their order from lowest to
highest:

xr:={5, 1, 4, 3, 2}
yr:={4, 2, 3, 5, 1}

After this conversion, additional manipulations (such as Spearman's
rank correlation coefficient) tell me whether these lists have a
correlation within a stated confidence level.
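
As a minimal sketch of those manual steps: when no values are tied, Spearman's coefficient is 1-6*sum(d^2)/(n*(n^2-1)), where d is the list of rank differences. The entries below assume the rank lists xr and yr shown above and use only ordinary list arithmetic; the variable names d and n are chosen here for illustration.

```
xr:={5,1,4,3,2}
yr:={4,2,3,5,1}
© rank differences: {1,-1,1,-2,1}
d:=xr-yr
© number of pairs: 5
n:=dim(xr)
© 1-48/120, which is 3/5 (0.6)
1-6*sum(d^2)/(n*(n^2-1))
```

This shortcut is exact only when neither list contains ties.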

I need to know what existing functions in the TI-Nspire CAS calculator
can speed this process and reduce human error.

Nelson Sousa

Jan 9, 2009, 8:19:21 AM
to tins...@googlegroups.com

But, of course there is! ;)

Try this:

Data needed: a list and an index.
Goal: determine how many elements are bigger than the one at the specified index; return that number plus 1.

Here are the functions:

Define LibPub rank(list,index)=
Func
:Local i,n
:If getType(list)≠"LIST"
:  Return "ERROR: First argument must be a list"
:dim(list)→n
:If getType(index)≠"NUM"
:  Return "ERROR: Second argument must be a number"
:If fPart(index)≠0 or index<1 or index>n
:  Return "ERROR: Second argument must be a valid integer"
:Return sum(seq(when(list[i]>list[index],1,0),i,1,n))+1
:EndFunc


Define LibPub ranklist(list)=
Func
:Local i,n
:If getType(list)≠"LIST"
:  Return "ERROR: First argument must be a list"
:dim(list)→n
:Return seq(rank(list,i),i,1,n)
:EndFunc



See attached file. Functions created and defined as public and an example.

Note: the function checks that the first argument is a list, but not that all of its elements are numbers that can be compared.
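
A few sample calls, as a sketch of how the two functions behave as written above (with the > comparison, so the largest element gets rank 1). The input list is the x list from the first message.

```
© rank of the 2nd element: four elements are bigger than 5
rank({1000,5,20,19,6},2)
© returns 5

© ranks of every element
ranklist({1000,5,20,19,6})
© returns {1,5,2,3,4}

© a non-list first argument returns the error string instead of a rank
rank(7,1)
© returns "ERROR: First argument must be a list"
```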


Regards,
Nelson
rank_of_elements.tns

Luke Setzer

Jan 9, 2009, 10:11:14 AM
to tinspire
This does not list the rank from smallest to highest.

I need it to assign 1 to the smallest element and n (the dimension of
the list) to the largest element with the intervening numbers ranked
accordingly.

Nelson Sousa

Jan 9, 2009, 10:31:40 AM
to tins...@googlegroups.com

Oh, my bad, I misread! This function ranks from highest to lowest.

To do the opposite, open the program editor, edit the rank function, and change the > sign to < on the line
Return sum(seq(when(list[i]>list[index],1,0),i,1,n))+1

Save and close, and that's it.
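
With that one change applied, the function would read as follows. This is a sketch of the edited version, with comments added (the lines starting with ©); only the comparison on the Return line differs from the original.

```
Define LibPub rank(list,index)=
Func
:Local i,n
:© Validate the arguments before using them
:If getType(list)≠"LIST"
:  Return "ERROR: First argument must be a list"
:dim(list)→n
:If getType(index)≠"NUM"
:  Return "ERROR: Second argument must be a number"
:If fPart(index)≠0 or index<1 or index>n
:  Return "ERROR: Second argument must be a valid integer"
:© Count the elements smaller than list[index]; the smallest gets rank 1
:Return sum(seq(when(list[i]<list[index],1,0),i,1,n))+1
:EndFunc
```

With this version, ranklist({1000,5,20,19,6}) gives {5,1,4,3,2}, matching the xr list in the first message.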

Nelson

Luke Setzer

Jan 9, 2009, 1:27:19 PM
to tinspire
Thanks!

Luke Setzer

Jan 9, 2009, 1:51:16 PM
to tinspire
Is there a way to change the program to break ties?

In other words, if two elements have the same value and yield a
sequential rank of, say, 8 and 9, can we change the code to average
them to 8.5?

The statistics book says to do this before using a given formula.

Nelson Sousa

Jan 11, 2009, 12:30:18 PM
to tins...@googlegroups.com

Errrr... are you sure you want that?
Currently elements that are the same are credited with the same rank, and the next element jumps one rank.

As an example, ranking from highest to lowest (as in the original function), the list {1,5,5,3,3,9,2} has ranks {7,2,2,4,4,1,6}.
Note that equal elements get the same rank, and the ranks then "skip one" for the next element. That is why no element is ranked 3rd (two elements share 2nd place) and none is ranked 5th (two elements share 4th place).
If you average them out you get something weirder: {7,2.5,2.5,4.5,4.5,1,6}.

This definition is quite neat and gives coherent results.

What exactly are you looking for?


Nelson

Luke Setzer

Jan 12, 2009, 7:54:19 AM
to tinspire
Please see uploaded file RankCorrelation.pdf for an example.

Nelson Sousa

Jan 13, 2009, 8:49:12 AM
to tins...@googlegroups.com

So, along with counting 1 for each element that is greater, we should also add something depending on how many elements are equal, is that it?

OK, let's think for a bit.

The rank function returns 
sum(seq(when(list[i]<list[index],1,0),i,1,n))+1

The when statement returns 1 if the element at position i is smaller than the element at position index. Summing over all i from 1 to n counts the smaller elements, and we add 1 because an element with no smaller elements should have rank 1.

If you have n elements that are equal and currently tied at position m, they all have rank m, and the next element has rank m+n. (Here n is the number of tied elements, not the length of the list.)

That is to say that those elements occupy positions m, m+1, ... , m+n-1.

What you want is for them to have, instead of rank m, the average of those positions:
(m + m + n - 1) / 2 = m + (n-1)/2.

So, the rank you're looking for is

Elements_smaller_than + 1 + Elements_equal_to / 2 - 1/2 =
Elements_smaller_than + Elements_equal_to / 2 + 1/2

Just replace the expression after Return on the last line of the rank() function with
sum(seq(when(list[i]<list[index],1,0),i,1,n)) + sum(seq(when(list[i]=list[index],1,0),i,1,n))/2 + 0.5

and you'll have the results you want.
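
Put together, the tie-averaging version of rank() might look like the sketch below. It assumes the ascending (<) comparison and the same argument checks as before; the local names below and tied are introduced here for readability and are not in the original function.

```
Define LibPub rank(list,index)=
Func
:Local i,n,below,tied
:If getType(list)≠"LIST"
:  Return "ERROR: First argument must be a list"
:dim(list)→n
:If getType(index)≠"NUM"
:  Return "ERROR: Second argument must be a number"
:If fPart(index)≠0 or index<1 or index>n
:  Return "ERROR: Second argument must be a valid integer"
:© Number of elements strictly smaller than list[index]
:sum(seq(when(list[i]<list[index],1,0),i,1,n))→below
:© Number of elements equal to list[index], itself included
:sum(seq(when(list[i]=list[index],1,0),i,1,n))→tied
:© Average of the positions below+1 ... below+tied
:Return below+tied/2+0.5
:EndFunc
```

ranklist() needs no change, since it only calls rank() for each index.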

For list {1,2,2,3} you'll get {1,2.5,2.5,4} as the ranks;
For list {1,2,2,2,3} you'll get {1,3,3,3,5} as the ranks;
For list {1,2,2,2,2,3} you'll get {1,3.5,3.5,3.5,3.5,6} as the ranks.

Is that what you're looking for?
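
Once ties are averaged, the usual no-ties shortcut for Spearman's coefficient is no longer exact. One common alternative, sketched here as an assumption about what the book's formula does (the thread does not show it), is to compute the ordinary Pearson correlation of the two rank lists. The function name rankcorr and its locals are hypothetical.

```
Define LibPub rankcorr(xr,yr)=
Func
:Local a,b
:© Deviations of each rank list from its own mean
:xr-mean(xr)→a
:yr-mean(yr)→b
:© Pearson correlation of the ranks (undefined if either list is constant)
:Return sum(a*b)/√(sum(a^2)*sum(b^2))
:EndFunc
```

For example, rankcorr(ranklist(x),ranklist(y)) with the x and y lists from the first message gives 0.6 (3/5), since those lists contain no ties.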


Nelson


PS: operations on lists aren't THAT complicated. Of course it takes time to analyze the problem, set the goals and come up with the ideas. But there's no rocket science here, only judicious use of the list operations available on the TI-Nspire and some pen and paper to draft the procedures. You should, however, try to get it done by yourself and, looking at a working example, try to improve it to meet your needs. If you just write a message to the user group and wait for someone to write you the answer, you're not really learning anything.

Take, for example, your first question, about the reversed order. It would have taken just a few minutes to analyze the code and realise that the comparison symbol was the wrong way round, so the change you needed was replacing > with <. Instead, you came to me again to ask for a ready-to-go solution.

I don't really mind building code for others to use, as long as I have the time for it. But if you keep asking people to give you the solution instead of trying to get to it yourself, eventually you'll ask for something that either (a) nobody knows how to do; (b) whoever knows how to do it doesn't have the time to do it for you; or (c) someone who knows how to do it and has the time isn't in the mood to do your homework assignments. This user group is a way to get help, not to get the work done.

Luke Setzer

Jan 13, 2009, 11:32:54 AM
to tinspire
Thank you for your help in getting the work done.