Use ggplot2 for drawing a non-linear regression curve based on a specific equation

michael....@gmx.ch

unread,

Aug 2, 2015, 5:26:08 PM8/2/15

to ggplot2

Hi there,
I used a specific equation to fit a curve to my data points
The equation is:
y~yo+a*(1-b^x)
y =Gossypol (from my data) x= Damage_cm (from my data)

My data

structure(list(Gossypol = c(1036.331811, 4171.427741, 6039.995102, 5909.068158, 4140.242559, 4854.985845, 6982.035521, 6132.876396, 948.2418407, 3618.448997, 3130.376482, 5113.942098, 1180.171957, 1500.863038, 4576.787021, 5629.979049, 3378.151945, 3589.187889, 2508.417927, 1989.576826, 5972.926124, 2867.610671, 450.7205451, 1120.955, 3470.09352, 3575.043632, 2952.931863, 349.0864019, 1013.807628, 910.8879471, 3743.331903, 3350.203452, 592.3403778, 1517.045807, 1504.491931, 3736.144027, 2818.419785, 723.885643, 1782.864308, 1414.161257, 3723.629772, 3747.076592, 2005.919344, 4198.569251, 2228.522959, 3322.115942, 4274.324792, 720.9785449, 2874.651764, 2287.228752, 5654.858696, 1247.806111, 1247.806111, 2547.326207, 2608.716056, 1079.846532), Treatment = structure(c(2L, 3L, 4L, 5L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 1L), .Label = c("C", "1c_2d", "3c_2d", "9c_2d", "1c_7d"), class = "factor"), Damage_cm = c(0.4955, 1.516, 4.409, 3.2665, 0.491, 2.3035, 3.51, 1.8115, 0, 0.4435, 1.573, 1.8595, 0, 0.142, 2.171, 4.023, 4.9835, 0, 0.6925, 1.989, 5.683, 3.547, 0, 0.756, 2.129, 9.437, 3.211, 0, 0.578, 2.966, 4.7245, 1.8185, 0, 1.0475, 1.62, 5.568, 9.7455, 0, 0.8295, 2.411, 7.272, 4.516, 0, 0.4035, 2.974, 8.043, 4.809, 0, 0.6965, 1.313, 5.681, 3.474, 0, 0.5895, 2.559, 0)), .Names = c("Gossypol", "Treatment", "Damage_cm"), row.names = c(NA, -56L), class = "data.frame")

I used the package nls2 to fit a curve based on my equation to my data points and to draw the curve

#Draw curve based on:y~yo+a*(1-b^x) 
# y =Gossypol () data set) x= Damage_cm (data set)
# yo=Intercept, a= assymptote ans b=slope
library(nls2)#model

plot(Gossypol~Damage_cm, dta)
dta.nls <- nls(Gossypol~y0+a*(1-b^Damage_cm), dta,
               start=list(y0=0, a=4000, b=.5))

xval <- seq(0, 10, 0.1)
lines(xval, predict(dta.nls, data.frame(Damage_cm=xval)))
profile(dta.nls, alpha= .05)

However, since i am a big fan of ggplot2 figures I was wondering if I could use ggplot2 to draw the curve
I already created an example code in ggplot2 which uses the same data set. However the curve is not based on the upper equation (instead i used y ~ poly(x, 2)). Can anyone show me how to draw the curve in ggplot2 based on the equation

y~yo+a*(1-b^x) ?

ggplot 2 example

#EXAMPLE GGPLOT 
library(ggplot2)
dta$Treatment <- factor(data$Treatment, levels = c("C","1c_2d","3c_2d","9c_2d","1c_7d"))

p<-ggplot(dta,
 aes(x=Damage_cm,y=Gossypol)) + geom_point(size=5,aes(shape=Treatment))+
 scale_shape_manual(values = c(1,16,15,17,0))+theme_bw()+
  labs(x="Damage (cm^2)",y="Gossypol (ng/mg dw)")
print(p)
p+stat_smooth(method = "lm", formula = y ~ poly(x, 2), size = 1,color="black", se=T)

Thanks a lot,
Mike

Ista Zahn

unread,

Aug 2, 2015, 9:16:50 PM8/2/15

to michael....@gmx.ch, ggplot2

Hi Mike,

You can do it in ggplot2 the same way you did it with base graphics, e.g.,

> dta$y <- predict(dta.nls)
> library(ggplot2)
> ggplot(dta, aes(x = Damage_cm, y = Gossypol)) + geom_point() + geom_line(aes(y = y))

Best,
Ista

> --
> --
> You received this message because you are subscribed to the ggplot2 mailing
> list.
> Please provide a reproducible example:
> https://github.com/hadley/devtools/wiki/Reproducibility
>
> To post: email ggp...@googlegroups.com
> To unsubscribe: email ggplot2+u...@googlegroups.com
> More options: http://groups.google.com/group/ggplot2
>
> ---
> You received this message because you are subscribed to the Google Groups
> "ggplot2" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to ggplot2+u...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

michael....@gmx.ch

unread,

Aug 3, 2015, 11:44:52 AM8/3/15

to ggplot2

Hi Ista Zahn,
thanks for your help. I didnt expect it to be so easy.
I slightly changed the code in ggplot2. it looks now like that:

p<-ggplot(data, aes(x=Damage_cm,y=Gossypol)) + geom_point(size=5,aes(shape=Treatment))+ scale_shape_manual(values = c(1,16,15,17,0))+theme_bw()+


  labs(x="Damage (cm^2)",y="Gossypol (ng/mg dw)")



p+stat_smooth(method = "nls", formula = y ~ y0 + a * (1 - b^x),  start=as.list(coef(dta.nls)), size = 1, se = F, colour = "black")

However, i was wondering how I could add the standard error shading. If I use the command se=T an error message appears saying:

Error in pred$fit : $ operator is invalid for atomic vectors

Can anyone help me in this regard?

...

Ista Zahn

unread,

Aug 3, 2015, 4:17:02 PM8/3/15

to michael....@gmx.ch, ggplot2

Confidence intervals are not available from predict.nls, so this is
tricky. I don't have time to look into it now, maybe someone else will
hsva a suggestion. Note that at this point it's not really a ggplot2
question, so you might have more luck posting on the R-help mailing
list, or on stackoverflow.

Best,
Ista

Brandon Hurr

unread,

Aug 3, 2015, 4:50:55 PM8/3/15

to Ista Zahn, michael....@gmx.ch, ggplot2

This looks helpful.

https://rmazing.wordpress.com/2013/08/14/predictnls-part-1-monte-carlo-simulation-confidence-intervals-for-nls-models/

Dennis Murphy

unread,

Aug 3, 2015, 6:47:36 PM8/3/15

to michael....@gmx.ch, ggplot2

Hi:

Brandon showed you one approach to computing simulation-based
confidence bands for nonlinear regression; another is to use a
bootstrap approach to the problem. A rather naive/Q&D way to do this
is to select a set of x-values at which predictions are desired, use
the bootstrap to get a set of predicted values at each x and then take
the .025 and .975 quantiles as its 95% bootstrap CI at x. This will
produce a set of confidence limits for each x that you can put into a
data frame and pass to ggplot(), using geom_ribbon() to plot the CIs.

I'd look at John Fox's paper:
http://socserv.socsci.mcmaster.ca/jfox/Books/Companion/appendix/Appendix-Bootstrapping.pdf

It discusses two ways to bootstrap a linear model, depending on
whether you want to consider the x-values to be fixed in the bootstrap
process or not. You can take the ideas from that paper and try to
adapt them to the process of bootstrapping your nonlinear model.

The problem with getting confidence limits from nonlinear models is
that not only is the expectation function nonlinear, but so is the
variance function, and it is rarely the case that the variance
function has a known form. This is one reason why predict.nls() does
not return confidence (or prediction) limits. The bootstrap often
works well in this type of problem, absent serious outliers. The
propagation of error approach that Brandon linked is another way to
attack the problem using a specific type of variance approximation.
Some people try to linearize a nonlinear model (when possible), find
the CIs/PIs from the linearized fit and transform back, but that has
not proven to be an effective solution. It is usually better to work
in the original units, with perhaps a more appropriate
parameterization of the model in the nonlinear case.

You may run into (convergence) problems fitting your model with
bootstrapped data, so you should become familiar with the try() and
tryCatch() functions and use them for each iteration of the bootstrap
fitting process. A large number of convergence failures is likely a
sign that your fitted model is not very stable and should not be
trusted.

One final point: as far as prediction intervals go, fuggedaboudit. In
linear models, there is an analytic form for the prediction variance
that is capable of extending confidence limits to prediction limits in
a straightforward, theoretically defensible way. This does not extend
to nonlinear models. Period. If the variance function for the _mean_
response is generally unknown in nonlinear models, it is even more
difficult to find one for an _individual_ response, and AFAIK using
simulation-based techniques for this purpose is a futile exercise. For
example, neither the bootstrap nor the propagation-of-error technique
applies to predicting individual responses from a nonlinear model.

Dennis

Reply all

Reply to author

Forward