nonparametric test for trend?


Alex Li

Oct 17, 2013, 5:40:32 PM
to pystat...@googlegroups.com
Hi, I'm new to the board but glad I found it!

I'd like to look at a single measured trait in three groups, and then test whether there is any sort of trend (i.e., increasing or decreasing mean values across the groups). One of the groups has very few observations, so I would like to use nonparametric methods, though I am open to other approaches. I was wondering whether statsmodels provides a way to do this.

scipy.stats.ranksums will perform the pairwise comparison, but it cannot handle three groups. The Kruskal-Wallis test can detect a difference among three groups, but it does not indicate direction. How can I get the best of both worlds?

Thanks

>>> import scipy.stats
>>> scipy.stats.mstats.kruskalwallis(list1, list2, list3)
(19.776377952755897, 5.0770809921334118e-05)
>>> scipy.stats.ranksums(list1, list2)
(-3.5707142142714248, 0.00035600916321554716)
>>> scipy.stats.ranksums(list1, list3)
(-3.0451153135353142, 0.0023259111307740075)


josef...@gmail.com

Oct 17, 2013, 7:24:47 PM
to pystatsmodels
On Thu, Oct 17, 2013 at 5:40 PM, Alex Li <ale...@gmail.com> wrote:
> Hi, I'm new to the board but glad I found it!
>
> I'd like to look at a single measured trait in three groups, and then test
> whether there is any sort of trend (i.e., increasing or decreasing mean
> values across the groups). One of the groups has very few observations, so I
> would like to use nonparametric methods, though I am open to other
> approaches. I was wondering whether statsmodels provides a way to do this.
>
> scipy.stats.ranksums will perform the pairwise comparison, but it cannot
> handle three groups. The Kruskal-Wallis test can detect a difference among
> three groups, but it does not indicate direction. How can I get the best of
> both worlds?

To see if I understand correctly:

The null hypothesis is that all means are the same.
The alternative hypothesis is that the means are increasing in the given
group order.

There is nothing available in Python, as far as I know.
I downloaded some articles on this a while ago but never read them.

http://www.mathworks.com/matlabcentral/fileexchange/22059-cuzicks-test
http://www.mathworks.com/matlabcentral/fileexchange/22159-jonckheere-terpstra-test-on-trend
There are many other nonparametric trend tests, but I have no overview
of them and don't know which ones are good and which ones are easy
to implement.
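
For reference, here is a minimal sketch of the Jonckheere-Terpstra statistic named in the second link, using the usual large-sample normal approximation. It is not a port of the linked MATLAB code. The function name `jonckheere_terpstra` is made up for illustration, the variance formula used here has no correction for ties, and the p-value is one-sided for an increasing trend in the given group order.

```python
import numpy as np
from scipy import stats


def jonckheere_terpstra(samples):
    """Jonckheere-Terpstra trend statistic (hypothetical helper).

    samples: sequence of 1-D samples, listed in the hypothesized order.
    Returns (J, z, one-sided p-value for an increasing trend).
    Assumption: the variance below is the no-ties formula.
    """
    samples = [np.asarray(s, dtype=float) for s in samples]
    sizes = np.array([len(s) for s in samples])
    n_total = sizes.sum()

    # J counts, over all ordered pairs of groups (i before k), the number of
    # observation pairs where the later group is larger; ties count 1/2.
    j_stat = 0.0
    for i in range(len(samples) - 1):
        for k in range(i + 1, len(samples)):
            diff = samples[k][:, None] - samples[i][None, :]
            j_stat += (diff > 0).sum() + 0.5 * (diff == 0).sum()

    # Mean and variance of J under the null hypothesis (no ties).
    expected = (n_total**2 - (sizes**2).sum()) / 4.0
    variance = (n_total**2 * (2 * n_total + 3)
                - (sizes**2 * (2 * sizes + 3)).sum()) / 72.0

    z = (j_stat - expected) / np.sqrt(variance)
    return j_stat, z, stats.norm.sf(z)
```

The pairwise difference matrix keeps the code short, but it needs memory proportional to the product of the two group sizes, so a rank-based computation would be preferable for very large groups.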

I also don't have an overview of standard (parametric) tests against
ordered alternatives.

It would be helpful if you had more specific information about these
kinds of tests.

Josef

Alex Li

Oct 17, 2013, 7:50:24 PM
to pystat...@googlegroups.com
Thanks for your response, and your null hypothesis is correct. 

More details: I am looking at a large cohort of ~6000 randomly selected individuals who are all measured for a continuous trait, for example blood pressure. Within this sample, each individual falls into one of three non-overlapping groups: 0, 1, and 2. Now I would like to see if blood pressure increases across these groups. Group 2 is always very small, n < 10. FYI, 0, 1, and 2 represent genotypes, as I am a geneticist.

I came across Cuzick's test and would like to try to implement it, but I could not access the original article, nor am I proficient in MATLAB. In addition to Python, I know some R, if that helps. It also looks like Stata has a function (http://www.ats.ucla.edu/stat/stata/faq/test_trend.htm), but I do not have a license for this software.

Your post seems to confirm my suspicion that I may have to write my own function, unless anyone else has some insight.

Alex

josef...@gmail.com

Oct 17, 2013, 10:51:10 PM
to pystatsmodels
On Thu, Oct 17, 2013 at 7:50 PM, Alex Li <ale...@gmail.com> wrote:
> Thanks for your response, and your null hypothesis is correct.
>
> More details: I am looking at a large cohort of ~6000 randomly selected
> individuals who are all measured for a continuous trait, for example blood
> pressure. Within this sample, each individual falls into one of three
> non-overlapping groups: 0, 1, and 2. Now I would like to see if blood
> pressure increases across these groups. Group 2 is always very small, n <
> 10. FYI, 0, 1, and 2 represent genotypes, as I am a geneticist.
>
> I came across Cuzick's test and would like to try to implement it, but I
> could not access the original article, nor am I proficient in MATLAB. In
> addition to Python, I know some R, if that helps. It also looks like Stata
> has a function (http://www.ats.ucla.edu/stat/stata/faq/test_trend.htm), but I
> do not have a license for this software.

Stata currently has its manual online without restriction:
http://www.stata.com/manuals13/r.pdf

The formulas don't look very detailed, but it should be possible to
reuse the scipy functions for ranks and ties.
I just saw that the alternative in Cuzick's test is two-sided: either
increasing or decreasing.
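
A minimal sketch of how the scipy rank and tie helpers could be combined for Cuzick's test, following the textbook form of the statistic (score-weighted rank sums with a normal approximation). This is an illustration under assumptions, not the Stata implementation: the function name `cuzick_trend` is hypothetical, the default group scores 1, 2, ..., k are an assumption, and ties are handled by multiplying the variance by `scipy.stats.tiecorrect`.

```python
import numpy as np
from scipy import stats


def cuzick_trend(samples, scores=None):
    """Cuzick-type trend test (hypothetical helper).

    samples: sequence of 1-D samples, one per group, in group order.
    scores:  one numeric score per group; defaults to 1, 2, ..., k.
    Returns (z, two-sided p-value) from the normal approximation.
    """
    if scores is None:
        scores = np.arange(1, len(samples) + 1)
    scores = np.asarray(scores, dtype=float)
    sizes = np.array([len(s) for s in samples])
    n_total = sizes.sum()

    # Midranks of the pooled data, then the rank sum within each group.
    ranks = stats.rankdata(np.concatenate(samples))
    rank_sums = np.array(
        [r.sum() for r in np.split(ranks, np.cumsum(sizes)[:-1])]
    )

    # T = sum of score * rank sum; L = sum of score * group size.
    t_stat = (scores * rank_sums).sum()
    weighted_n = (scores * sizes).sum()

    expected = 0.5 * (n_total + 1) * weighted_n
    variance = (n_total + 1) / 12.0 * (
        n_total * (scores**2 * sizes).sum() - weighted_n**2
    )
    # Assumption: tie adjustment via scipy's factor (equals 1.0 with no ties).
    variance *= stats.tiecorrect(ranks)

    z = (t_stat - expected) / np.sqrt(variance)
    return z, 2 * stats.norm.sf(abs(z))
```

For the genotype example, something like `cuzick_trend([trait_g0, trait_g1, trait_g2], scores=[0, 1, 2])` would apply, where the three arrays are placeholders for the trait values in each group. The sign of `z` indicates the direction of the trend.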

How many individuals do you have in group 1?
I guess group 2 will have only a very small effect on the result,
given that the sample is very unbalanced.

....
A bit later (as a quick distraction)
https://gist.github.com/josef-pkt/7035711
verified with 2 Stata manual examples.

Josef

josef...@gmail.com

Oct 18, 2013, 7:27:13 AM
to pystatsmodels
A recent overview for the 3 group case:

http://scholar.google.ca/scholar?cluster=5596027455909030537&hl=en&as_sdt=2005&sciodt=0,5

Alonzo, Todd A., Christos T. Nakas, Constantin T. Yiannoutsos, and
Sherri Bucher. "A comparison of tests for restricted orderings in the
three‐class case." Statistics in Medicine 28, no. 7 (2009): 1144-1158.

A thesis with what looks like a good overview, but I don't have access
to journals right now:
http://udini.proquest.com/view/tests-for-trend-in-the-analysis-of-goid:304466142/
http://books.google.ca/books?id=2lFzDMG3FgoC&lpg=PA41&ots=IgoRF6pXLI&dq=cuzick%20trend%20test%20increasing%20alternative&pg=PA28#v=onepage&q=cuzick%20trend%20test%20increasing%20alternative&f=false


In SAS I only saw the Cochran-Armitage test for trend for contingency
tables (discrete variables).

Josef