[R] Kruskal-Wallis test


imrib

Aug 3, 2010, 7:47:58 AM
to r-h...@r-project.org

Hi all,
My data table (g) contains a continuous data column (plant.height) and other
columns (columns 8 to 57), each a different factor with a number of levels.
An ANOVA test was run and the p-values were extracted as follows:

a <- function(x) anova(lm(plant.height ~ x))$"Pr(>F)"[1]

r<- apply(g[,8:57],2,a)

If I try to run a Kruskal-Wallis test the same way:

kw <- function(x) kruskal.test(plant.height ~ x)$"p.value"

r.kw <- apply(g[,8:57],2,kw)

I get the following error message:

Error in kruskal.test.default(c(0.16, 0, 0.007, 0.078, 0, 0.08, 0.19, :
  all group levels must be finite

Why do I get this error? (The values in c() are the plant.height values.)

Thanks


--
View this message in context: http://r.789695.n4.nabble.com/Kruskal-Walllis-test-tp2311712p2311712.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-h...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Tal Galili

Aug 3, 2010, 10:32:04 AM
to imrib, r-h...@r-project.org
I would suggest the following:
1) Run the same thing, but with a loop instead of apply.
2) Add a print statement to the loop that shows on which iteration the
function breaks.
3) Check whether that vector has any Inf or NA values (although in general I
think you are passing a numeric instead of a factor to the function, which
might be the source of the problem).
4) If you can't figure out what is wrong, use dput() to turn that vector into
text and e-mail it back, so we can try to reproduce the error and find a
reason for it.
5) Report back if you find a solution yourself, so others can benefit from
your experience.
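
Steps 1 to 3 could be sketched roughly as follows. This is only an illustration under stated assumptions: the data frame g below is a made-up stand-in (three factor columns named f1, f2 and f3 instead of columns 8 to 57), and tryCatch() is used so that the loop reports a failing column and carries on instead of stopping.

```r
# Hypothetical stand-in for the poster's table: a numeric response plus
# three factor columns (the real data has 50 such columns).
set.seed(1)
g <- data.frame(
  plant.height = runif(30),
  f1 = factor(rep(c("a", "b", "c"), times = 10)),
  f2 = factor(rep(c("x", "y"), times = 15)),
  f3 = factor(rep(c("lo", "mid", "hi"), each = 10))
)
cols <- c("f1", "f2", "f3")

# One p-value slot per column; NA marks a column where the test failed.
p <- setNames(rep(NA_real_, length(cols)), cols)

for (nm in cols) {
  x <- g[[nm]]  # [[ ]] extracts the column with its class intact
  cat("column", nm, "- class:", class(x), "- NAs:", sum(is.na(x)), "\n")
  p[nm] <- tryCatch(
    kruskal.test(g$plant.height ~ x)$p.value,
    error = function(e) {
      # Report which column broke and why, then keep going.
      cat("  failed:", conditionMessage(e), "\n")
      NA_real_
    }
  )
}
p
```

Any column that comes back as NA is the one to inspect, for example with dput(g[[nm]]).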

Cheers :)
Tal

----------------Contact Details:----------------
Contact me: Tal.G...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
------------------------------------------------


Wu Gong

Aug 3, 2010, 11:26:23 AM
to r-h...@r-project.org

The apply function first converts the data frame to a matrix, which coerces
the factor columns to a character array:
apply(g,2,class) # gives character

The kruskal.test function doesn't take a character vector as the group
argument:
kruskal.test(as.character(plant.height) ~ as.character(g[,8])) #doesn't work
kruskal.test(plant.height ~ as.character(g[,8])) #doesn't work
kruskal.test(as.character(plant.height) ~ g[,8]) #works

You'd better change the kw function so that it converts x back to a factor:
kw <- function(x) kruskal.test(plant.height ~ as.factor(x))$"p.value"
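
A small self-contained illustration of the coercion and of the as.factor() fix, in case it helps. The data frame is a toy one made up for this example (eight plant.height values and two factor columns, f1 and f2), not the poster's data.

```r
# Toy data, invented for illustration only.
g <- data.frame(
  plant.height = c(0.16, 0, 0.007, 0.078, 0, 0.08, 0.19, 0.11),
  f1 = factor(c("a", "a", "b", "b", "c", "c", "a", "b")),
  f2 = factor(c("x", "y", "x", "y", "x", "y", "x", "y"))
)
fcols <- c("f1", "f2")

# Column by column, the data frame still holds factors ...
cls.df <- sapply(g[, fcols], class)

# ... but apply() works on as.matrix(g[, fcols]), a character matrix,
# so each column reaches the function as a plain character vector.
cls.apply <- apply(g[, fcols], 2, class)
cls.df
cls.apply

# Converting back to a factor inside kw makes it safe to use with apply().
kw <- function(x) kruskal.test(g$plant.height ~ as.factor(x))$p.value
r.kw <- apply(g[, fcols], 2, kw)
r.kw
```

Note that as.factor() rebuilds the levels from the character values, so any custom level ordering in the original factors is lost; that does not matter for kruskal.test, which ignores the order of the groups.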

-----
A R learner.
--
View this message in context: http://r.789695.n4.nabble.com/Kruskal-Walllis-test-tp2311712p2312063.html

peter dalgaard

Aug 3, 2010, 11:35:36 AM
to imrib, r-h...@r-project.org

On Aug 3, 2010, at 1:47 PM, imrib wrote:

>
> Hi all,
> My data table (g) contains a continuous data column (plant.height) and other
> columns (columns 8 to 57), each a different factor with a number of levels.
> An ANOVA test was run and the p-values were extracted as follows:
>
> a <- function(x) anova(lm(plant.height ~ x))$"Pr(>F)"[1]
>
> r<- apply(g[,8:57],2,a)

This looks like an invitation to disaster. apply() will coerce g[,8:57] to a matrix, losing all factor definitions. As it happens, lm() will survive this, because

> lm(0:1~c("a","b"))

Call:
lm(formula = 0:1 ~ c("a", "b"))

Coefficients:
 (Intercept)  c("a", "b")b
           0             1

Warning message:
In model.matrix.default(mt, mf, contrasts) :
  variable 'c("a", "b")' converted to a factor

But kruskal.test dies on the similar construction, essentially because

> is.finite("a")
[1] FALSE
................

Try it with lapply(g[,8:57], kw) instead.
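
A minimal sketch of that suggestion on invented data (the data frame, the column names f1 and f2, and the use of g$plant.height inside kw are assumptions for the example; the original code evidently had plant.height visible some other way). sapply() is used here as the simplifying variant of lapply(), so the result is a named vector of p-values.

```r
# Invented example data: a numeric response and two factor columns.
set.seed(42)
g <- data.frame(
  plant.height = rexp(24),
  f1 = factor(rep(c("a", "b", "c"), times = 8)),
  f2 = factor(rep(c("x", "y"), each = 12))
)

kw <- function(x) kruskal.test(g$plant.height ~ x)$p.value

# lapply()/sapply() iterate over the columns of the data frame as they are,
# so each x handed to kw is still a factor; no matrix conversion happens.
r.kw <- sapply(g[, c("f1", "f2")], kw)
r.kw

# Cross-check one column against a direct call on the data frame.
direct <- kruskal.test(plant.height ~ f1, data = g)$p.value
all.equal(unname(r.kw["f1"]), direct)
```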

-pd


>
> If I try to do a Kruskal-Wallis test :
>
> kw <- function(x) kruskal.test(plant.height ~ x)$"p.value"
>
> r.kw <- apply(g[,8:57],2,kw)
>
> I get the following error message:
>
> Error in kruskal.test.default(c(0.16, 0, 0.007, 0.078, 0, 0.08, 0.19, :
>
> all group levels must be finite
>
> Why do I get this error ? (the values in c() are the plant.height values)
>
> Thanks
>

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk Priv: PDa...@gmail.com
