[R] Finding starting values for the parameters using nls() or nls2()

Pinglei Gao

unread,

Oct 9, 2016, 11:17:10 AM10/9/16

to r-h...@r-project.org

Hi,

I have some data that i'm trying to fit a double exponential model: data.
Frame (Area=c (521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91, 989.05,
1646.32, 2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74, 2629.2),

Retention=c (95.3, 87.18, 44.94, 26.36, 18.12, 84.68, 37.24, 33.04, 23.46,
9.72, 97.92, 71.44, 44.52, 24.44, 15.26) ) and the formula of the double
exponential is: exp (b0*exp (b1*x^th)).

I failed to guess the initial parameter values and then I learned a measure
to find starting values from Nonlinear Regression with R (pp. 25-27):

> cl<-data.frame(Area =c(521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91,
989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74, 2629.2),

+ Retention =c(95.3, 87.18, 44.94, 26.36, 18.12, 84.68, 37.24, 33.04, 23.46,
9.72, 97.92, 71.44, 44.52, 24.44, 15.26) )

> expFct <- function(Area, b0, b1,th) {exp(b0*exp(b1*Area^th))}

> grid.Disperse <- expand.grid(list(b0 = seq(0.01,4, by = 0.01), th =
c(0.02),b1 = seq(0.01, 4, by = 0.01)))

> Disperse.m2a <- nls2(Retention ~expFct(Area, b0, b1,th), data = cl, start
= grid.Disperse, algorithm = "brute-force")

> Disperse.m2a

Nonlinear regression model

model: Retention ~ expFct(Area, b0, th, b1)

data: cl

b0 th b1

3.82 0.02 0.01

residual sum-of-squares: 13596

Number of iterations to convergence: 160000

Achieved convergence tolerance: NA

I got no error then I use the output as starting values to nls2 ():

> nls.m2<- nls2(Retention ~ expFct(Area, b0, b1, th), data = cl, start =
list(b0 = 3.82, b1 = 0.02, th = 0.01))

Error in (function (formula, data = parent.frame(), start, control =
nls.control(), :

Singular gradient

Why? Did I do something wrong or misunderstand something?

Later, I found another measure from Modern Applied Statistics with S (pp.
216-217):

> negexp <- selfStart(model = ~ exp(b0*exp(b1*x^th)),initial =
negexp.SSival, parameters = c("b0", "b1", "th"),

+ template = function(x, b0, b1, th) {})

> Disperse.ss <- nls(Retention ~ negexp(Area, b0, b1, th),data = cl, trace =
T)

b0 b1 th

4.208763 144.205455 1035.324595

Error in qr.default(.swts * attr(rhs, "gradient")) :

NA/NaN/Inf (arg1) can not be called when the external function is called.

Error happened again. How can I fix it? I am desperate.

Best regards,

Pinglei Gao

[[alternative HTML version deleted]]

______________________________________________
R-h...@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Andrew Robinson

unread,

Oct 9, 2016, 6:08:03 PM10/9/16

to Pinglei Gao, R help (r-help@r-project.org)

Here are some things to try. Maybe divide Area by 1000 and retention
by 100. Try plotting the data and superimposing the line that
corresponds to the 'fit' from nls2. See if you can correct it with
some careful guesses.

Getting suitable starting parameters for non-linear modeling is one of
the black arts of statistical fitting. Good luck! And don't forget to
check for sensitivity.

Andrew

--
Andrew Robinson
Deputy Director, CEBRA, School of Biosciences
Reader & Associate Professor in Applied Statistics Tel: (+61) 0403 138 955
School of Mathematics and Statistics Fax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.rob...@ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au/~andrewpr

MSME: http://www.crcpress.com/product/isbn/9781439858028
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/

Bert Gunter

unread,

Oct 9, 2016, 6:41:38 PM10/9/16

to Andrew Robinson, Pinglei Gao, R help (r-help@r-project.org)

Well... (inline -- and I hope this isn't homework!)

On Sun, Oct 9, 2016 at 3:05 PM, Andrew Robinson
<A.Rob...@ms.unimelb.edu.au> wrote:
> Here are some things to try. Maybe divide Area by 1000 and retention
> by 100. Try plotting the data and superimposing the line that
> corresponds to the 'fit' from nls2. See if you can correct it with
> some careful guesses.
>
> Getting suitable starting parameters for non-linear modeling is one of

> the black arts of statistical fitting. ...
>
> Andrew

True. But it's usually worthwhile thinking about the math a bit before guessing.

Note that the model can be linearized to:

log(log(Retention)) = b0 + b1*Area^th

So a plot of log(log(Retention)) vs Area may be informative and useful
for finding starting values. e.g., for a grid of th's, do linear
regression fits .

However, when I look at that plot, it seems pretty linear with a
negative slope. This suggests that you may have an overparametrization
problem . i.e. fix th =1 and use the b0 and b1 from the above
regression for starting values.

Do note that this strategy isn't foolproof, as it ignores that the
error term is additive in the above transformed metric, rather than
the original. This can sometimes mislead. But this is just a
heuristic.

Cheers,
Bert

peter dalgaard

unread,

Oct 9, 2016, 7:42:49 PM10/9/16

to Pinglei Gao, R help (r-help@r-project.org)

> On 10 Oct 2016, at 00:40 , Bert Gunter <bgunte...@gmail.com> wrote:
>
> Well... (inline -- and I hope this isn't homework!)
>

Pretty much same as I thought.

Fixing th=0.02 in the grid search looks wrong. Bert's plot is pretty linear, so th=1 is a good guesstimate. There's a slight curvature but to reduce it, you would increase th, not decrease it. Running the regression, as Bert suggests, indicates that b0=5.16 and b1= -0.00024 could work as reasonable starting values. Notice that the grid search had "b1 = seq(0.01, 4, by = 0.01)" which is wrong in both sign and scale.

Andrew's suggestion of dividing Retention by 100 is tempting, since it looks like a percentage, but that would make all Y values less than 1 and the double exponential function as written has values that are always bigger than 1. (It is conceivable that the model itself is wrong, though. E.g. it could be that Retention on a scale from 0 to 1 could be modeled as exp(-something), but we really have no idea of the context here.)

(If this was in fact homework, you should now go and write a proper SelfStart initializer routine for this model. Even if it isn't homework, you do need to study the text again, because you have clearly not understood how self-starting models work.)

-pd

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd....@cbs.dk Priv: PDa...@gmail.com

ProfJCNash

unread,

Oct 9, 2016, 8:16:03 PM10/9/16

to r-h...@r-project.org, gaopi...@163.com

I didn't try very hard, but got a solution from .1, 1, .1 with nlxb() from nlmrt. It took a lot
of iterations and looks to be pretty ill-conditioned. Note nlmrt uses analytic derivatives if it
can, and a Marquardt method. It is designed to be a pit bull -- tenacious, not fast.

I'm working on a replacement for this and nls(), but it will be a while. However, I welcome
short scripts like this as tests. My script below

Best, JN

cl<-data.frame(Area =c(521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91,
989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74, 2629.2),

Retention =c(95.3, 87.18, 44.94, 26.36, 18.12, 84.68, 37.24, 33.04, 23.46,
9.72, 97.92, 71.44, 44.52, 24.44, 15.26) )

expFct <- function(Area, b0, b1,th) {exp(b0*exp(b1*Area^th))}

expf2 <- "Retention ~ exp(b0*exp(b1*Area^th))"

# grid.Disperse <- expand.grid(list(b0 = seq(0.01,4, by = 0.01), th =
#c(0.02),b1 = seq(0.01, 4, by = 0.01)))

#> Disperse.m2a <- nls2(Retention ~expFct(Area, b0, b1,th), data = cl, start
#= grid.Disperse, algorithm = "brute-force")

# Disperse.m2a

library(nlmrt)
test <- nlxb(expf2, start= c(b0=.1, b1=1, th=.1), trace=TRUE, data=cl)

Gabor Grothendieck

unread,

Oct 9, 2016, 9:26:43 PM10/9/16

to Pinglei Gao, r-h...@r-project.org

If you are not tied to that model the SSasymp() model in R could be
considered and is easy to fit:

# to plot points in order
o <- order(cl$Area)
cl.o <- cl[o, ]

fm <- nls(Retention ~ SSasymp(Area, Asym, R0, lrc), cl.o)
summary(fm)

plot(Retention ~ Area, cl.o)
lines(fitted(fm) ~ Area, cl.o, col = "red")

ProfJCNash

unread,

Oct 9, 2016, 10:39:42 PM10/9/16

to Gabor Grothendieck, Pinglei Gao, r-h...@r-project.org

Despite nlmrt "solving" the OP's problem, I think Gabor's suggestion likely gives a more sensible approach
to the underlying modelling problem.

It is, of course, sometimes important to fit a particular model, in which case nls2 and nlmrt are set up to grind away.
And hopefully the follow-up to nlmrt I'm working on will have enough capability in getting analytic derivatives
to work for a wider class of models. Note that functional approaches in nlmrt and minpack.lm allow users to
provide derivatives. Too many users think numerical approximations are a panacea, but my experience is that
most problems benefit from very accurate derivatives, of which analytic expressions are generally the best.

JN

ProfJCNash

unread,

Oct 10, 2016, 11:28:30 AM10/10/16

to Pinglei Gao, R help

The key lines are

library(nlmrt)
test <- nlxb(expf2, start= c(b0=.1, b1=1, th=.1), trace=TRUE, data=cl)

Thus I started with .1 1 and .1. The "solution" from nlxb, which is using analytic derivatives
and a very aggressive Marquardt code to keep trying even in bad situations, was
as you included below. Note that the singular values of the Jacobian are given (they are
recorded on the same table as the parameters, but do NOT correspond to the parameters.
The placement was simply a tidy place to put these numbers.)

The ratio of these sv's is 1.735e+16/0.004635 or approx 4E+18, so the condition number
of the traditional Gauss Newton approach is about 1E+37. Not a nice problem!

You probably should reformulate.

JN

On 16-10-10 10:41 AM, Pinglei Gao wrote:
> Thanks very much for your kindness help. I run your script then came out
> lots of outputs and I also studied the solution you posted. Forgive my
> ignorance, I still can't find the suitable starting values. Did I
> misunderstand something?
>
> Best,
>
> Pinglei Gao
>
> -----邮件原件-----
> 发件人: ProfJCNash [mailto:profj...@gmail.com]
> 发送时间: 2016年10月10日 10:41
> 收件人: Gabor Grothendieck; Pinglei Gao
> 主题: Re: [R] Finding starting values for the parameters using nls() or
> nls2()
>
> I forgot to post the "solution" found by nlmrt:
>
> nlmrt class object: x
> residual sumsquares = 1086.8 on 15 observations
> after 5001 Jacobian and 6991 function evaluations
> name coeff SE tstat pval gradient
> JSingval
> b0 5.3274e-14 NA NA NA -6.614e+13
> 1.735e+16
> b1 33.5574 NA NA NA -3.466
> 11518
> th -0.00721203 NA NA NA -740.8
> 0.004635
>
>
> Note the singular values -- this is the worst SV(max)/SV(min) ratio I've
> observed!
>
> JN

Pinglei Gao

unread,

Oct 10, 2016, 12:03:02 PM10/10/16

to peter dalgaard, R help

Thanks very much for taking time on this. Your assistances are very much
appreciated. But, I am afraid that I still have a question to bother you.

I am working on a paper about weed seeds dispersal with harvest machine. I
found three general models for seed dispersal and retention after a review
of relevant literature. All models were optimized using nonlinear least
squares via the nls function in the statistical package R. The model that
best described the data will be determined by comparing Akaike Information
Criterion (AIC) values and the model with the lowest AIC score will be
selected.

The first general model incorporated simple exponential and power
exponential functions, its starting value was easily to be found. But, I am
stuck with model 2 (which was mentioned previously) and model 3 with the
form: Retention = (b0*Area^th+1)^b1. The model 3 is totally different to
others. I tried the measures that you were mentioned. But I still can’t
find suitable starting values because of my limited knowledge. I hope you
can do me the favor again. I can send the draft to you when I finished the
paper, if it is necessary. Maybe you can give me some constructive
suggestion about statistic and model construction and I can name you as a
coauthor for your contributions.

Best,

Pinglei Gao

-----邮件原件-----
发件人: peter dalgaard [mailto:pda...@gmail.com]
发送时间: 2016年10月10日 7:41
收件人: Pinglei Gao
抄送: Andrew Robinson; R help (r-h...@r-project.org); Bert Gunter

主题: Re: [R] Finding starting values for the parameters using nls() or
nls2()

-pd

>>> + 33.04, 23.46,

Pinglei Gao

unread,

Oct 10, 2016, 12:04:07 PM10/10/16

to ProfJCNash, R help

Thanks very much for your kindness help. I run your script then came out
lots of outputs and I also studied the solution you posted. Forgive my
ignorance, I still can't find the suitable starting values. Did I
misunderstand something?

Best,

Pinglei Gao

-----邮件原件-----

发件人: ProfJCNash [mailto:profj...@gmail.com]
发送时间: 2016年10月10日 10:41
收件人: Gabor Grothendieck; Pinglei Gao

主题: Re: [R] Finding starting values for the parameters using nls() or
nls2()

I forgot to post the "solution" found by nlmrt:

nlmrt class object: x
residual sumsquares = 1086.8 on 15 observations
after 5001 Jacobian and 6991 function evaluations
name coeff SE tstat pval gradient
JSingval
b0 5.3274e-14 NA NA NA -6.614e+13
1.735e+16
b1 33.5574 NA NA NA -3.466
11518
th -0.00721203 NA NA NA -740.8
0.004635

Note the singular values -- this is the worst SV(max)/SV(min) ratio I've
observed!

JN

______________________________________________

dave fournier

unread,

Oct 19, 2016, 6:55:12 PM10/19/16

to r-h...@r-project.org, gaopi...@163.com

Actually this converges very nicely if you use these starting values
that I obtained with
AD Model Builder

th 9.1180e-01
b0 5.2104e+00
b1 -4.6725e-04

The R result looks like

nls.m2
Nonlinear regression model
model: Retention ~ expFct(Area, b0, b1, th)
data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8,

2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52,

1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36,

18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44,

15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class
= "data.frame")
b0 b1 th
5.2104466 -0.0004672 0.9118029
residual sum-of-squares: 686.8

Number of iterations to convergence: 1

Achieved convergence tolerance: 1.75e-06

dave fournier

unread,

Oct 24, 2016, 3:15:20 PM10/24/16

to r-h...@r-project.org

I've spent quite a bit of time trying to convince people on various
lists that the solution to these kinds of
problems lies in the stable parameterization of the model. I write the
solutions in AD Model Builder because it
is easy. But R people are generally stuck in R (or mired) so the
message somehow always gets lost.

I thought I would bite the bullet and figure out how to do this in R.
It turns out that one can fit this model
with only 6 function evaluations using nls2. I apologize in advance
for my execrable R code. But it does do
the job.

The data were fit using 3 calls to nls2. For the first call only the
parameter th was estimated. This converged
in 4 function evaluations. For the second call all three of the
parameters were estimated
(but the other 2 parameters are different from b0, b1) .This converged
to the solution in four function evaluations.

The final call to nls2 uses the converged values to calculate b0,b1 and
theta and starts from there.
As you can see the model is already converged. This final call to nls2
is used to get the standard error
estimates for the parameters.

> nls.m1
Nonlinear regression model
model: Retention ~ expFct1(Area, y1, yn, th)
data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8,

2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52,

1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36,

18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44,

15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class
= "data.frame")

th
0.9862
residual sum-of-squares: 730

Number of iterations to convergence: 2
Achieved convergence tolerance: 1.616e-06

> nls.m2
Nonlinear regression model
model: Retention ~ expFct2(Area, y1, yn, c, d, th)
data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8,

2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52,

1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36,

18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44,

15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class
= "data.frame")

c d th
0.9716 1.1010 0.9118
residual sum-of-squares: 686.8

Number of iterations to convergence: 4
Achieved convergence tolerance: 2.398e-06

> nls.m
Nonlinear regression model

model: Retention ~ expFct(Area, b0, b1, th)

data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8,

2560.46, 524.91, 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52,

1432.5, 2144.74, 2629.2), Retention = c(95.3, 87.18, 44.94, 26.36,

18.12, 84.68, 37.24, 33.04, 23.46, 9.72, 97.92, 71.44, 44.52, 24.44,

15.26)), .Names = c("Area", "Retention"), row.names = c(NA, -15L), class
= "data.frame")
b0 b1 th

5.2104452 -0.0004672 0.9118040
residual sum-of-squares: 686.8

Number of iterations to convergence: 0
Achieved convergence tolerance: 1.384e-06

Parameters:
Estimate Std. Error t value Pr(>|t|)
b0 5.2104452 0.4999594 10.422 2.29e-07 ***
b1 -0.0004672 0.0013527 -0.345 0.7358
th 0.9118040 0.3583575 2.544 0.0257 *

So what is the stable parameterization for this model. To simplify let
x be the independent variable and y be the
dependent variable and write t instead of th So the model is

y = exp(b0*exp(b1*x^t) (1)

Now it would be nice to reorder the x's so that they are monotone
increasing or decreasing, but in this case the
first x is almost the largest and the last x is almost the smallest
(slight reach) so they will do. Call them x1 and xn.
The new parameters of the model are the predicted y's for x1 and xn.
Call them y1 and yn. Note that y1 and yn
are parameters, not observations.

The data were fit using 3 calls to nls2. For the first call only the
parameter th was estimated. this converged
in 4 function evaluestions. for the second call all three of the
parameters were estimated
(but the other 2 parameters are different from b0, b1) .This converged
to the solution in four function evaluations.

The final call to nls2 uses the converged values to calculate b0,b1 and
theta and starts from there.
as you can see the model is already converged. this final call to nls2
is used to get the standard error
estimates for the parameters.

> nls.m1
Nonlinear regression model
model: Retention ~ expFct1(Area, y1, yn, th)
data: structure(list(Area = c(521.5, 689.78, 1284.71, 2018.8,