Model fitting in ggplot with confounding variables

99 views
Skip to first unread message

angelovarlotta

unread,
Apr 11, 2016, 8:41:51 AM4/11/16
to ggplot2
Hi,

In ggplot2 using the ToothGrowth data from R, I've been trying to fit the following models where the factor w has been introduced:

lm(y ~ w)     :   yhat = B0 + B2 * w
lm(y ~ x + w) :   yhat = B0 + B1 * x + B2 * w
lm(y ~ x * w) :   yhat = B0 + B1 * x + B2 * w + B3 * x * w

I've tried using the formula option in geom_smooth() but my ggplot2 code doesn't recognize the factor supp.

ggplot(data = ToothGrowth, aes(x = dose, y = len, color = supp)) + geom_point(pch = 19, size = 2, alpha = 0.4) + xlab('Dose') + ylab('Length') + theme_bw() + scale_color_manual(name = 'Supplements', breaks = c('OJ', 'VC'), values = c('red', 'blue')) + geom_smooth(method = 'lm', formula = (y ~ x * supp)) + ggtitle('y = B0 + B1 * x + B2 * w + B3 * x * w')

As a matter of fact, I get the error message:

Warning message: Computation failed in `stat_smooth()`: object 'supp' not found.

I was thinking of using this workaround using geom_abline with first lm, but it doesn't give me the error bands.

fit <- lm(data = ToothGrowth, len ~ dose * supp)

ggplot(data = ToothGrowth, aes(x = dose, y = len, color = supp)) + geom_point(pch = 19, size = 2, alpha = 0.4) + xlab('Dose') + ylab('Length') + theme_bw() + scale_color_manual(name = 'Supplements', breaks = c('OJ', 'VC'), labels = c('Orange Juice', 'Vitamin C'), values = c('red', 'blue')) + ggtitle('y = B0 + B1 * x + B2 * w + B3 * x * w') + geom_abline(intercept = coef(fit)[1], slope = coef(fit)[2], col = 'red', lwd = 1) + geom_abline(intercept = coef(fit)[1] + coef(fit)[3], slope = coef(fit)[2] + coef(fit)[4], col = 'blue', lwd = 1)

Any suggestions?

Angelo

Wouter van der Bijl

unread,
Apr 11, 2016, 9:02:23 AM4/11/16
to ggplot2
Something like this?

fit <- lm(data = ToothGrowth, len ~ dose * supp)

library(effects)
f <- as.data.frame(Effect(c('dose', 'supp'), fit, xlevels = 1000))

ggplot(mapping = aes(x = dose)) + 
  geom_point(data = ToothGrowth, aes(y = len, color = supp), pch = 19, size = 2, alpha = 0.4) +
  geom_line(data = f, aes(y = fit, color = supp)) +
  geom_ribbon(data = f, aes(ymin = lower, ymax = upper, fill = supp), alpha = 0.2) +
  xlab('Dose') + ylab('Length') + theme_bw() + 
  scale_color_manual(name = 'Supplements', breaks = c('OJ', 'VC'), labels = c('Orange Juice', 'Vitamin C'), values = c('red', 'blue')) + 
  scale_fill_manual(name = 'Supplements', breaks = c('OJ', 'VC'), labels = c('Orange Juice', 'Vitamin C'), values = c('red', 'blue')) + 
  ggtitle('y = B0 + B1 * x + B2 * w + B3 * x * w')

ggsave('temp.png', dpi = 500)

Angelo Varlotta

unread,
Apr 12, 2016, 7:07:50 AM4/12/16
to ggplot2
Hi Wouter,
Your code works really well, but in the case I use fit <- lm(data = ToothGrowth, len ~ supp), with just the parameter 'supp', I get an error in RStudio where it complains that the 'dose' predictor isn't in the model and I don't know how to modify the code. Compared to your code above, I've just removed the 'dose' predictor in the line that contains the 'fit' variable.

Angelo

Wouter van der Bijl

unread,
Apr 12, 2016, 7:37:53 AM4/12/16
to ggplot2
Well, you should also remove 'dose' when getting the Effect... And your plot has to become very different, too.


fit <- lm(data = ToothGrowth, len ~ supp)


library
(effects)
f
<- as.data.frame(Effect('supp', fit))


ggplot
(mapping = aes(x= supp)) +
  geom_bar
(data = f, aes(y = fit), stat = "identity", col = 1, fill = NA) +
  geom_errorbar
(data = f, aes(ymin = lower, ymax = upper, fill = supp), width = 0.3) +

  ylab
('Length') + theme_bw()



ggsave
('temp2.png', dpi = 500)

Angelo Varlotta

unread,
Apr 12, 2016, 8:54:36 AM4/12/16
to ggplot2

I was aiming at something more like this but with the error bands:


Wouter van der Bijl

unread,
Apr 12, 2016, 10:01:57 AM4/12/16
to ggplot2
Ok, last one:



fit <- lm(data = ToothGrowth, len ~ supp)

library
(effects)
f
<- as.data.frame(Effect('supp', fit))


ggplot
() +
  geom_point
(data = ToothGrowth, aes(x = dose, y = len, color = supp), pch = 19, size = 2, alpha = 0.4) +
  geom_hline
(data = f, aes(yintercept = fit, color = supp)) +
  geom_rect
(data = f, xmin = -Inf, xmax = Inf,
            aes
(ymin = lower, ymax = upper, fill = supp), alpha = 0.2) +

  xlab
('Dose') + ylab('Length') + theme_bw() +
  scale_color_manual
(name = 'Supplements', breaks = c('OJ', 'VC'), labels = c('Orange Juice', 'Vitamin C'), values = c('red', 'blue')) +

  scale_fill_manual
(name = 'Supplements', breaks = c('OJ', 'VC'), labels = c('Orange Juice', 'Vitamin C'), values = c('red', 'blue'))

ggsave
('temp3.png', dpi = 500)
Reply all
Reply to author
Forward
0 new messages