[R] glm.nb error

953 views
Skip to first unread message

Daofeng Li

unread,
Jun 7, 2013, 10:44:18 AM6/7/13
to r-h...@r-project.org
Dear R Community,

I have encountered a problem while using the R function glm.nb.
The code that produce the error was following two lines:

group=c(1,1,1,1,0,0,0,0)
fit=glm.nb(y~group)

While the y contains 8 sets of number like:
gene275 0 1 0 0 1 5 1 0

Error message:

Error in while ((it <- it + 1) < limit && abs(del) > eps) { :
missing value where TRUE/FALSE needed
Calls: glm.nb -> as.vector -> theta.ml
In addition: There were 50 or more warnings (use warnings() to see the
first 50)
Execution halted


Information of my system:
> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

Does anyone happen to have some hit on how to solve this?
Appreciate for any response.

Thanks in advance,

Daofeng

[[alternative HTML version deleted]]

______________________________________________
R-h...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Sarah Goslee

unread,
Jun 7, 2013, 11:13:40 AM6/7/13
to Daofeng Li, r-h...@r-project.org
Hi,

On Fri, Jun 7, 2013 at 10:44 AM, Daofeng Li <lid...@gmail.com> wrote:
> Dear R Community,
>
> I have encountered a problem while using the R function glm.nb.
> The code that produce the error was following two lines:
>
> group=c(1,1,1,1,0,0,0,0)
> fit=glm.nb(y~group)
>
> While the y contains 8 sets of number like:
> gene275 0 1 0 0 1 5 1 0
>
> Error message:
>
> Error in while ((it <- it + 1) < limit && abs(del) > eps) { :
> missing value where TRUE/FALSE needed
> Calls: glm.nb -> as.vector -> theta.ml
> In addition: There were 50 or more warnings (use warnings() to see the
> first 50)
> Execution halted

I'd assume there is a missing value somewhere.

But we really need a reproducible example to be certain. You could use
dput(head(yourdata, 20)) to give us something to work with (as long as
that much of your data throws the error).

At the very least, we probably also need
str(y)
str(group)

Are you certain there are no missing values in your data? The problem
may not be that obvious, but that's an easy place to start.

Sarah

>
> Information of my system:
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=C LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> Does anyone happen to have some hit on how to solve this?
> Appreciate for any response.
>
> Thanks in advance,
>
> Daofeng
>

--
Sarah Goslee
http://www.functionaldiversity.org

Marc Schwartz

unread,
Jun 7, 2013, 11:15:43 AM6/7/13
to Daofeng Li, r-h...@r-project.org
There is something wrong with your actual 'y' or 'group' that is not evident from the above info:


group <- c(1, 1, 1, 1, 0, 0, 0, 0)
y <- c(0, 1, 0, 0, 1, 5, 1, 0)

> require(MASS)
Loading required package: MASS

> glm.nb(y ~ group)

Call: glm.nb(formula = y ~ group, init.theta = 1.711564307, link = log)

Coefficients:
(Intercept) group
0.5596 -1.9459

Degrees of Freedom: 7 Total (i.e. Null); 6 Residual
Null Deviance: 10.23
Residual Deviance: 6.848 AIC: 25.25


Check str(y) and str(group)

You should also be sure to note in your posts when you are using a function from a non-base package, in this case MASS, which is not indicated in your sessionInfo() above, so something is amiss there as well.

Regards,

Marc Schwartz

Sarah Goslee

unread,
Jun 7, 2013, 11:47:08 AM6/7/13
to Daofeng Li, r-h...@r-project.org
Hi,

On Fri, Jun 7, 2013 at 11:36 AM, Daofeng Li <lid...@gmail.com> wrote:
> Thank you Sarah and Marc for your fast and nice response.
> Apology for didn't include all information.
>
> I have a input file like following:
>
> gene1 18 15 13 13 16 9 20 24
> gene2 15 8 8 7 0 12 18 4
> gene3 10 9 8 12 9 11 12 12
> gene4 4 0 4 3 0 5 0 0
> gene5 0 1 0 0 1 5 1 0
> gene6 3 3 3 3 4 4 4 4
> gene7 0 4 0 2 2 2 0 0
> gene8 4 4 7 3 0 6 6 12
> gene9 11 6 13 10 13 7 12 9
> gene10 6 3 6 3 4 7 6 3

> dat =
> read.table("test",col.names=c("gene","b1","b2","b3","b4","c1","c2","c3","c4"))

Is this what's in the "test" file that your code reads in? We don't
have that file, so can't run your code.

If it is, then the output of

dput(dat)

would be enormously more useful than copy and pasting your data file
into your email.

And yes, the full code is very little like the pair of lines you
originally pasted in.

Sarah

> I am using following R code to compare the difference between the 1st 4
> numbers against 2nd 4 numbers:
>
> library(MASS)
> library(coin)
> suppressPackageStartupMessages(suppressWarnings(library(tcltk)))
> library(qvalue)
> library(plyr)
> dat =
> read.table("test",col.names=c("gene","b1","b2","b3","b4","c1","c2","c3","c4"))
> index=(apply(dat[,-1],1,sum)>0)
> data = dat[index,]
> group=c(1,1,1,1,0,0,0,0)
> n=nrow(data)
> result=NULL
> for (i in 1:n){
> print(i)
> y=as.numeric(data[i,-1])
> if (all((y-mean(y))==0))
> result=rbind(result,c(0,0,0,1))
> else {
> fit=glm.nb(y~group)
> result=rbind(result,summary(fit)$coef[2,])
> }
> }
> qval = qvalue(result[,4])
> fdr=0.1
> index=(qval$qvalues<fdr)
> dat.result = data[index,]
> write.table(dat.result,file="test_result",row.names=F,quote=F)
>
> If you use this input file and code, would reproduce the same error:
>
> Loading required package: methods
> Loading required package: survival
> Loading required package: splines
> Loading required package: mvtnorm
> Loading required package: modeltools
> Loading required package: stats4
>
> Attaching package: ‘plyr’
>
> The following object is masked from ‘package:modeltools’:
>
> empty
>
> [1] 1
> [1] 2
> [1] 3
> [1] 4
> [1] 5
> [1] 6
> Error in while ((it <- it + 1) < limit && abs(del) > eps) { :
> missing value where TRUE/FALSE needed
> Calls: glm.nb -> as.vector -> theta.ml
> In addition: Warning messages:
> 1: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace =
> control$trace > :
> iteration limit reached
> 2: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace =
> control$trace > :
> iteration limit reached
> Execution halted
>
> So might be the error was in 6th line, not the line I pasted before (5th
> line)? Sorry about that.
>
> Thanks.
>
> Daofeng

Marc Schwartz

unread,
Jun 7, 2013, 11:48:54 AM6/7/13
to Daofeng Li, r-h...@r-project.org

On Jun 7, 2013, at 10:36 AM, Daofeng Li <lid...@gmail.com> wrote:

> Thank you Sarah and Marc for your fast and nice response.
> Apology for didn't include all information.
>
> I have a input file like following:
>
> gene1 18 15 13 13 16 9 20 24
> gene2 15 8 8 7 0 12 18 4
> gene3 10 9 8 12 9 11 12 12
> gene4 4 0 4 3 0 5 0 0
> gene5 0 1 0 0 1 5 1 0
> gene6 3 3 3 3 4 4 4 4
> gene7 0 4 0 2 2 2 0 0
> gene8 4 4 7 3 0 6 6 12
> gene9 11 6 13 10 13 7 12 9
> gene10 6 3 6 3 4 7 6 3
>
> Error in while ((it <- it + 1) < limit && abs(del) > eps) { :
> missing value where TRUE/FALSE needed
> Calls: glm.nb -> as.vector -> theta.ml
> In addition: Warning messages:
> 1: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > :
> iteration limit reached
> 2: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > :
> iteration limit reached
> Execution halted
>
> So might be the error was in 6th line, not the line I pasted before (5th line)? Sorry about that.
>
> Thanks.
>
> Daofeng


Your 'y' at that point in the loop is:

> y
[1] 3 3 3 3 4 4 4 4

and 'group' is:

> group
[1] 1 1 1 1 0 0 0 0


> glm.nb(y ~ group)
Error in while ((it <- it + 1) < limit && abs(del) > eps) { :
missing value where TRUE/FALSE needed


Think about it... :-)

Sarah Goslee

unread,
Jun 7, 2013, 12:25:51 PM6/7/13
to Daofeng Li, r-h...@r-project.org
As Marc already pointed out, take a close look at this part of your loop:

R> i <- 6
R>
R> y <- as.numeric(data[i,-1])
R> y
[1] 3 3 3 3 4 4 4 4
R> group
[1] 1 1 1 1 0 0 0 0
R> fit <- glm.nb(y~group)
Error in while ((it <- it + 1) < limit && abs(del) > eps) { :
missing value where TRUE/FALSE needed

What do you expect to happen there?

In general, it's a better practice to pre-specify the size of result
(eg matrix(NA, nrow=n, ncol=4) ) and fill it as you go, rather than
using rbind() within a loop. Much more memory-efficient.

Sarah

On Fri, Jun 7, 2013 at 11:58 AM, Daofeng Li <lid...@gmail.com> wrote:
> Sorry Sarah.
>
>> dput(dat)
> structure(list(gene = structure(c(1L, 3L, 4L, 5L, 6L, 7L, 8L,
> 9L, 10L, 2L), .Label = c("gene1", "gene10", "gene2", "gene3",
> "gene4", "gene5", "gene6", "gene7", "gene8", "gene9"), class = "factor"),
> b1 = c(18L, 15L, 10L, 4L, 0L, 3L, 0L, 4L, 11L, 6L), b2 = c(15L,
> 8L, 9L, 0L, 1L, 3L, 4L, 4L, 6L, 3L), b3 = c(13L, 8L, 8L,
> 4L, 0L, 3L, 0L, 7L, 13L, 6L), b4 = c(13L, 7L, 12L, 3L, 0L,
> 3L, 2L, 3L, 10L, 3L), c1 = c(16L, 0L, 9L, 0L, 1L, 4L, 2L,
> 0L, 13L, 4L), c2 = c(9L, 12L, 11L, 5L, 5L, 4L, 2L, 6L, 7L,
> 7L), c3 = c(20L, 18L, 12L, 0L, 1L, 4L, 0L, 6L, 12L, 6L),
> c4 = c(24L, 4L, 12L, 0L, 0L, 4L, 0L, 12L, 9L, 3L)), .Names = c("gene",
> "b1", "b2", "b3", "b4", "c1", "c2", "c3", "c4"), class = "data.frame",
> row.names = c(NA,
> -10L))
>
> above was the dput(dat). Thanks.

Daofeng Li

unread,
Jun 7, 2013, 11:58:16 AM6/7/13
to Sarah Goslee, r-h...@r-project.org
Sorry Sarah.

> dput(dat)
structure(list(gene = structure(c(1L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 2L), .Label = c("gene1", "gene10", "gene2", "gene3",
"gene4", "gene5", "gene6", "gene7", "gene8", "gene9"), class = "factor"),
b1 = c(18L, 15L, 10L, 4L, 0L, 3L, 0L, 4L, 11L, 6L), b2 = c(15L,
8L, 9L, 0L, 1L, 3L, 4L, 4L, 6L, 3L), b3 = c(13L, 8L, 8L,
4L, 0L, 3L, 0L, 7L, 13L, 6L), b4 = c(13L, 7L, 12L, 3L, 0L,
3L, 2L, 3L, 10L, 3L), c1 = c(16L, 0L, 9L, 0L, 1L, 4L, 2L,
0L, 13L, 4L), c2 = c(9L, 12L, 11L, 5L, 5L, 4L, 2L, 6L, 7L,
7L), c3 = c(20L, 18L, 12L, 0L, 1L, 4L, 0L, 6L, 12L, 6L),
c4 = c(24L, 4L, 12L, 0L, 0L, 4L, 0L, 12L, 9L, 3L)), .Names = c("gene",
"b1", "b2", "b3", "b4", "c1", "c2", "c3", "c4"), class = "data.frame",
row.names = c(NA,
-10L))

above was the dput(dat). Thanks.

Daofeng


On Fri, Jun 7, 2013 at 10:47 AM, Sarah Goslee <sarah....@gmail.com>wrote:

[[alternative HTML version deleted]]

Daofeng Li

unread,
Jun 7, 2013, 11:36:53 AM6/7/13
to Marc Schwartz, sarah....@gmail.com, r-h...@r-project.org
Thank you Sarah and Marc for your fast and nice response.
Apology for didn't include all information.

I have a input file like following:

gene1 18 15 13 13 16 9 20 24
gene2 15 8 8 7 0 12 18 4
gene3 10 9 8 12 9 11 12 12
gene4 4 0 4 3 0 5 0 0
gene5 0 1 0 0 1 5 1 0
gene6 3 3 3 3 4 4 4 4
gene7 0 4 0 2 2 2 0 0
gene8 4 4 7 3 0 6 6 12
gene9 11 6 13 10 13 7 12 9
gene10 6 3 6 3 4 7 6 3

Error in while ((it <- it + 1) < limit && abs(del) > eps) { :
missing value where TRUE/FALSE needed
Calls: glm.nb -> as.vector -> theta.ml
In addition: Warning messages:
1: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace =
control$trace > :
iteration limit reached
2: In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace =
control$trace > :
iteration limit reached
Execution halted

So might be the error was in 6th line, not the line I pasted before (5th
line)? Sorry about that.

Thanks.

Daofeng


On Fri, Jun 7, 2013 at 10:15 AM, Marc Schwartz <marc_s...@me.com> wrote:

>
[[alternative HTML version deleted]]

Daofeng Li

unread,
Jun 7, 2013, 12:02:16 PM6/7/13
to Marc Schwartz, r-h...@r-project.org
Thanks Marc...still didn't get it...
Tried this...

> str(group)
num [1:8] 1 1 1 1 0 0 0 0
> fit=glm.nb(y~group)
Error in while ((it <- it + 1) < limit && abs(del) > eps) { :
missing value where TRUE/FALSE needed
> group <- c('1','1','1','1','0','0','0','0')
> str(group)
chr [1:8] "1" "1" "1" "1" "0" "0" "0" "0"
> fit=glm.nb(y~group)
Error in while ((it <- it + 1) < limit && abs(del) > eps) { :
missing value where TRUE/FALSE needed


Daofeng
[[alternative HTML version deleted]]

Daofeng Li

unread,
Jun 7, 2013, 1:05:53 PM6/7/13
to Sarah Goslee, r-h...@r-project.org
Thanks for all your help. I will try to fix this.
Very helpful.

Best,

Daofeng
[[alternative HTML version deleted]]

Reply all
Reply to author
Forward
0 new messages