glmer error with gamma distribution

emyoung90

unread,

May 10, 2019, 7:17:58 PM5/10/19

to STRI R users

Hi list! Hope all is well.

I'm trying to wrap my brain around an error I'm getting in glmer, but my stats background doesn't seem quite up to the task of figuring it out. Any advice would be much appreciated.

Goal: I'm trying to determine if rainfall differs between 4 weather stations on the panama canal (Gamboa, BCI, Miraflores, and Empire Hill). At first it seemed like a job for an ANOVA, but given the influence of seasonality and year-to-year variation, I thought that a GLMM would be the way to go. I have 20 years of monthly measurements for each site.

Data frame is attached to this post (I hope that's ok - I wasn't sure the best way to make the data available, since its a bit much to embed). It's 5 columns, long form: site, year, month, rainfall, and rainfallplus. All self explanatory except rainfallplus - that's the rainfall measurement plus 1, to correct for issues with zero values when fitting certain distributions.

Issue: When I run my glmer (see below for code) I get the following error:

Error in pwrssUpdate(pp, resp, tol = tolPwrss, GQmat = GHrule(0L), compDev = compDev, :

(maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate

In addition: Warning message:

In commonArgs(par, fn, control, environment()) :

maxfun < 10 * length(par)^2 is not recommended.

I have no idea what this means. Although I see on stackoverflow etc that others have experienced it, I can't pick apart what is happening and why. I assume it has to do with the distribution of my data, so I'm worried that I'm doing something incorrectly... any help would be extremely appreciated!

Cheers,

Emma

###CODE###

library(lme4)

library(fitdistrplus)

panamarain<-read.csv

#to figure out the best error family to use for the GLMM, I looked at a lot of q-q plots. In the end, I used the following to determine best distribution:

shapiro.test(panamarain$rainfallplus)

#significant, therefore reject H0 of normal distribution

fit1<-fitdistr(panamarain$rainfallplus, "normal")

fit2<-fitdistr(panamarain$rainfallplus, "lognormal")

fit3 <-fitdistr(panamarain$rainfallplus, "gamma")

#Here, gamma gives the warning "1: In densfun(x, parm[1], parm[2], ...) : NaNs produced"

#I read up a bit on it, and adjusted for that by setting the lower boundaries manually. This clears up the issue.

fit3 <-fitdistr(panamarain$rainfallplus, "gamma", lower=c(0,0))

AIC(fit1,fit2,fit3)

#based on this, gamma is best. Not great, but best.

#you can look at it's q-q here, using the package fitdistrplus

descdist(panamarain$rainfallplus, discrete = FALSE)

fit_test <-fitdist(panamarain$rainfallplus, "gamma", method="mme")

plot(fit_test)

#so, with that in mind, the GLMM is fairly straightforward: site predicting rainfall, and month/year as nested random effects.

sitetest <- glmer(rainfallplus ~ site + (month|year), family = Gamma, data = panamarain)

#that gives me the following error, preventing the model from coalescing.

#Error in pwrssUpdate(pp, resp, tol = tolPwrss, GQmat = GHrule(0L), compDev = compDev, :

#(maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate

#In addition: Warning message:

#In commonArgs(par, fn, control, environment()) :

#maxfun < 10 * length(par)^2 is not recommended.

#I tested a null model by removing the effect of site, same error though.

null <- glmer(rainfallplus ~ (month|year), family = Gamma, data = panamarain)

###END CODE

Panamarain.csv

Justin Touchon

unread,

May 10, 2019, 11:12:17 PM5/10/19

to stri-R...@googlegroups.com

Howdy Emma,

First off, thanks for using the STRI R group! It’s always awesome when we get a message on this.

Secondly, as I see it, I think there are two problems. 1) Your random effects structure is wrong and 2) you don’t want to use a Gamma model.

Problem 1: You have your random effects as (month|year), which I assume is because you see this as a repeated measures design. The problem is now you are coding a separate slope for each Jan, Feb, etc. and a different intercept for each year, and you only have 1 value for each site in each month X year combo. There are just no data to estimate those combinations with. So the model just fails. I would instead see months as nested within years, which would be coded as (1|year/month). Interestingly, coding the model this way results in a random effect size for year of 0, but clearly year is still important because you get rather different results if you just code the random effect as (1|month). But, that’s besides the point…

Problem 2: Gamma models are notoriously hard to fit. I’m not enough of a real statistician to say why, but I know they just are. Poisson or Negative Binomial models are generally more robust. Thanks for including your data by the way, that made this so much easier! I ran your model as a Poisson (which you can do in R, even though a Poisson is supposed to have just whole integers), as a Poisson where I rounded the rainfall values and as a negative binomial where I rounded the values.

sitetest1 <- glmer(rainfallplus ~ site + (1|year/month), family="poisson", data = panamarain)

sitetest2 <- glmer(round(rainfallplus) ~ site + (1|year/month), family="poisson", data = panamarain)

sitetest3 <- glmer.nb(round(rainfallplus) ~ site + (1|year/month), data = panamarain)

Looking at the qqplots of the 3 models, it’s clear that the poisson models fit a bit better than the negative binomial.

PastedGraphic-1.pdf

PastedGraphic-2.pdf

PastedGraphic-3.pdf

emyoung90

unread,

May 11, 2019, 5:10:59 PM5/11/19

to STRI R users

Thanks so much! That all makes sense - I'd thought that the best distribution would be poisson, and I even thought about the rounding technique, but I was thrown off since I didn't think of my data as appropriate for a poisson for some reason (I guess since it isn't traditional count data). I ran the rounded poisson model and everything worked (thank you!!), but I have a couple followup questions about the selection process if you don't mind, just so I fully understand.

###CODE###

###First, I can't seem to get the models with the non-rounded poisson or the negative binomial to work for some reason, even though I used your code exactly.

sitetest1 <- glmer(rainfallplus ~ site + (1|year/month), family="poisson", data = panamarain)

sitetest3 <- glmer.nb(round(rainfallplus) ~ site + (1|year/month), data = panamarain)

#For poisson, I get this error for each rainfall value:

#"1: In (function (fr, X, reTrms, family, nAGQ = 1L, verbose = 0L, ... :

#non-integer x = 21.320000"

#I'm assuming that's because the values aren't rounded? I'm curious as to how you overrode those warnings. Or can I just ignore them?

#And for negative binomial, I get this error:

#"1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :

#Model failed to converge with max|grad| = 0.025627 (tol = 0.001, component 1)"

#I think this usually happens when there isn't a large enough sample size or similar, is that correct?

###Second, I put the rounded poisson into the AIC comparison, and the AIC value for poisson is huge - is this just not the right technique to use for this?

fit1<-fitdistr(round(panamarain$rainfallplus), "normal")

fit2<-fitdistr(round(panamarain$rainfallplus), "lognormal")

fit3 <-fitdistr(round(panamarain$rainfallplus), "gamma", lower=c(0,0))

fit4 <-fitdistr(round(panamarain$rainfallplus), "poisson")

AIC(fit1,fit2,fit3,fit4)

###Third, would you mind sharing the code you used to get the q-q plots? I think I'm using a different function than you, and I'd love to be able to reproduce those plots.

###END CODE

Thanks so much, Justin!

Justin Touchon

unread,

May 14, 2019, 4:10:46 PM5/14/19

to stri-R...@googlegroups.com

Hi Emma,

I’ve responded in line below.

On May 11, 2019, at 5:10 PM, emyoung90 <emyo...@gmail.com> wrote:

Thanks so much! That all makes sense - I'd thought that the best distribution would be poisson, and I even thought about the rounding technique, but I was thrown off since I didn't think of my data as appropriate for a poisson for some reason (I guess since it isn't traditional count data). I ran the rounded poisson model and everything worked (thank you!!), but I have a couple followup questions about the selection process if you don't mind, just so I fully understand.

###CODE###

###First, I can't seem to get the models with the non-rounded poisson or the negative binomial to work for some reason, even though I used your code exactly.

sitetest1 <- glmer(rainfallplus ~ site + (1|year/month), family="poisson", data = panamarain)
sitetest3 <- glmer.nb(round(rainfallplus) ~ site + (1|year/month), data = panamarain)

#For poisson, I get this error for each rainfall value:
#"1: In (function (fr, X, reTrms, family, nAGQ = 1L, verbose = 0L, ... :
#non-integer x = 21.320000"

#I'm assuming that's because the values aren't rounded? I'm curious as to how you overrode those warnings. Or can I just ignore them?

Yes, this is because the model is expecting integer values. Note that you get warnings here, not errors. Warnings just mean R wants you to know something, but errors mean something is wrong.

#And for negative binomial, I get this error:
#"1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
#Model failed to converge with max|grad| = 0.025627 (tol = 0.001, component 1)"

#I think this usually happens when there isn't a large enough sample size or similar, is that correct?

I guess I had cut and pasted code from different places. You can change the default optimizer by adding the following code to your model, which makes it run smoothly in this case.

sitetest3 <- glmer.nb(round(rainfallplus) ~ site + (1|year/month), data = panamarain, glmerControl(optimizer="bobyqa", optCtrl = list(maxfun = 100000)))

sitetest3.1 <- glmer.nb(round(rainfallplus) ~ 1 + (1|year/month), data = panamarain, glmerControl(optimizer="bobyqa", optCtrl = list(maxfun = 100000)))

anova(sitetest3,sitetest3.1)

You could also try running the model in glmmadmb, which often works when glmer has issues.

###Second, I put the rounded poisson into the AIC comparison, and the AIC value for poisson is huge - is this just not the right technique to use for this?

fit1<-fitdistr(round(panamarain$rainfallplus), "normal")
fit2<-fitdistr(round(panamarain$rainfallplus), "lognormal")
fit3 <-fitdistr(round(panamarain$rainfallplus), "gamma", lower=c(0,0))
fit4 <-fitdistr(round(panamarain$rainfallplus), "poisson")
AIC(fit1,fit2,fit3,fit4)

You are correct that the AIC for the fit of the Poisson comes out crazy big, but that is why you are using the negative binomial and not the poisson distribution.

###Third, would you mind sharing the code you used to get the q-q plots? I think I'm using a different function than you, and I'd love to be able to reproduce those plots.

There are two ways to do this. You can code it yourself with the following code:

qqnorm(resid(sitetest3));qqline(resid(sitetest3))

Or you can use the qqmath() function in the lattice package.

Cheers,

Justin

—
Justin Touchon, Ph.D.
Assistant Professor on the Mary Clark Rockefeller Chair of Environmental Studies
Biology Department
Vassar College
Office: Olmsted 163
Phone: 845-437-7419
pages.vassar.edu/jutouchon

Emma Young

unread,

May 14, 2019, 4:21:51 PM5/14/19

to stri-R...@googlegroups.com

Thanks so much, Justin! re: poisson vs negative binomial, I think I'm confused because the poisson AIC is bigger, but the q-q plots show a better fit for poisson than negative binomial. Should I go with the AIC over the plots?

--
You received this message because you are subscribed to the Google Groups "STRI R users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stri-R-group...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/stri-R-group/CF5A12C5-1356-4A28-8B70-52B20B392A2E%40gmail.com.
For more options, visit https://groups.google.com/d/optout.

--

Emma Young | PhD Candidate
Department of Biology
University of Missouri - St. Louis
-
Producer & Co-host, The Story Collider Podcast

Reply all

Reply to author

Forward