gdistsamp Error in optim

555 views
Skip to first unread message

MP

unread,
Jun 21, 2014, 1:23:06 PM6/21/14
to unma...@googlegroups.com
Hi everyone,

My dataset is rather large in scope, consisting of point-count survey data for 69 species carried out in 321 point-transects divided into 3 habitats; each transect was repeat-visited 3 times.  I have split my data into 3 groups by habitat and I want to calculate density for each species within each habitat type.  Detections/visit varies greatly between species.  Because they are repeat visits I think gdistsamp is the correct procedure to run - correct me if I'm wrong.

My problem is that when I run gdistsamp for a null model:


yDat <- formatDistData(spp, distCol="BinCenter", transectNameCol="Station", dist.breaks=dbands, occasionCol="Visit")

umf <- unmarkedFrameGDS(y=as.matrix(yDat), numPrimary=1, dist.breaks=dbands, survey="point", unitsIn="m")

(nullmodel <- gdistsamp(~ 1, ~1, ~ 1, umf, keyfun="halfnorm", output = "density", unitsOut="ha", method = "BFGS", se=TRUE))


...I get the following error message:

 
 Error in optim(starts, nll, method = method, hessian = se, ...) :
  initial value in 'vmmin' is not finite

 
I believe this is due to me not specifying a vector for 'starts', containing 'starting values for the model parameters'.  I am a little uncertain how to proceed, as far as what needs to go into the vector.  I think I will need a separate vector for each species, correct?  The Unmarked documentation is unfortunately vague on the topic. 

If someone could provide information on how to avoid that error, and what I need to put into the starts vector, I'd greatly appreciate it.  Please let me know if I can clarify anything.  I'd be happy to provide my full code and my data if needed.

Thanks,
Michael

Jeffrey Royle

unread,
Jun 21, 2014, 3:42:30 PM6/21/14
to unma...@googlegroups.com
hi Michael,
 I agree this is probably a starting values problem -- the default values of 0 are probably extremely bad.
 the vector of starting values should be a vector of length 3 for the model you have illustrated (you can always check this stuff out by looking right at the R code where you will see how the starts are passed to the optim function)
 They are passed to the function call simply by specifying the starts = c(0,0,0) argument (see the help file for gdistsamp)

  So I suggest playing around with starting values until you find good ones. Sorry but I can't provide specific guidance on how to choose starting values
 (if anyone is interested in contributing code to unmarked, a utility function to give some ballpark guesses might be useful)

regards
andy



--
You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Kery Marc

unread,
Jun 22, 2014, 8:22:13 AM6/22/14
to unma...@googlegroups.com
hi Michael,

one question (though I don't think this has to do with your error): why did you specify numPrimary = 1, when you have three rounds of distance sampling ?

Kind regards  --  Marc

From: unma...@googlegroups.com [unma...@googlegroups.com] on behalf of Jeffrey Royle [jar...@gmail.com]
Sent: 21 June 2014 21:42
To: unma...@googlegroups.com
Subject: Re: [unmarked] gdistsamp Error in optim

MP

unread,
Jun 25, 2014, 3:50:11 PM6/25/14
to unma...@googlegroups.com
Marc,

In the documentation for unmarkedFrameGDS, it says "numPrimary - Number of primary time periods (seasons in the multiseason model)."  I interpreted that to mean repeated sampling over multiple seasons or years.  My data collection involved three visits in a single avian breeding season, so I thought I should put numPrimary = 1.  Should I have set it to 3 instead?

Thanks,
Michael

Kery Marc

unread,
Jun 26, 2014, 4:23:22 AM6/26/14
to unma...@googlegroups.com
Dear Michael,

I think you should have numPrimary = 3. In a robust design, the primary occasions are those between which you assume an open population and that's the 3 sampling occasions.I just checked in an analysis I ran using gdistsamp, where we had 4 rounds in a single breeding season, and I put numPrimary = 4.

Kind regards  ---  Marc
______________________________________________________________
 
Marc Kéry
Tel. ++41 41 462 97 93
marc...@vogelwarte.ch
www.vogelwarte.ch
 
Swiss Ornithological Institute | Seerose 1 | CH-6204 Sempach | Switzerland
______________________________________________________________
 
*** Introduction to Bayesian statistical modeling: Kéry (2010), Introduction to WinBUGS for Ecologists, Academic Press; see www.mbr-pwrc.usgs.gov/pubanalysis/kerybook 
*** Book on Bayesian statistical modeling: Kéry & Schaub (2012), Bayesian Population Analysis using WinBUGS, Academic Press; see
www.vogelwarte.ch/bpa
*** Upcoming workshops: www.phidot.org/forum/viewforum.php?f=8


From: unma...@googlegroups.com [unma...@googlegroups.com] on behalf of MP [pendra...@gmail.com]
Sent: 25 June 2014 21:50

To: unma...@googlegroups.com
Subject: Re: [unmarked] gdistsamp Error in optim
--

MP

unread,
Jun 29, 2014, 7:27:22 PM6/29/14
to unma...@googlegroups.com
Thanks, Marc; that makes sense to me. 

Ok, another question, if I may?  I set numPrimary=3 and it seems to run ok now, at least it's not producing the error in optim.  Now I'm unclear on what unmarked is outputting, since whether I set output="density" or output="abund", both models return with 'Abundance' estimates.  Is the 'Abundance' estimate for nullmodel the density in my 50 m radius circle?

Thanks for your help!

Code and output below:

CODE
    ydat <- formatDistData(birds.sub2, distCol="BinCenter", transectNameCol="Station", dist.breaks=dbands, occasionCol="Visit")
    umf <- unmarkedFrameGDS(y=as.matrix(ydat), numPrimary=3, dist.breaks=dbands, survey="point", unitsIn="m")

    nullmodel <- gdistsamp(~ 1, ~1, ~ 1, umf, keyfun="halfnorm", output = "density", unitsOut="ha", se=TRUE)
    nullmodel2 <- gdistsamp(~ 1, ~1, ~ 1, umf, keyfun="halfnorm", output = "abund", unitsOut="ha", se=TRUE)

OUTPUT
    Call:
gdistsamp(lambdaformula = ~1, phiformula = ~1, pformula = ~1,
    data = umf, keyfun = "halfnorm", output = "density", unitsOut = "ha",
    se = TRUE)

Abundance:
 Estimate    SE    z  P(>|z|)
     2.26 0.674 3.36 0.000781

Availability:
 Estimate    SE    z P(>|z|)
     -1.9 0.788 -2.4  0.0162

Detection:
 Estimate    SE  z  P(>|z|)
     3.77 0.189 20 1.18e-88

AIC: 1201.548


Call:
gdistsamp(lambdaformula = ~1, phiformula = ~1, pformula = ~1,
    data = umf, keyfun = "halfnorm", output = "abund", unitsOut = "ha",
    se = TRUE)

Abundance:
 Estimate    SE    z P(>|z|)
     2.03 0.681 2.98 0.00286

Availability:
 Estimate    SE     z P(>|z|)
     -1.9 0.795 -2.39  0.0166

Detection:
 Estimate    SE  z P(>|z|)
     3.77 0.189 20 1.2e-88

AIC: 1201.548

Kery Marc

unread,
Jun 30, 2014, 3:11:41 AM6/30/14
to unma...@googlegroups.com
Dear Michael,

it seems to me that the heading in the summary of the model is simply the same, but once the parameters that come under that heading refer to abundance and another time for density, as chosen by the output argument.

Kind regards  --  Marc



Sent: 30 June 2014 01:27

To: unma...@googlegroups.com
Subject: Re: [unmarked] gdistsamp Error in optim
--

Dan Linden

unread,
Jun 30, 2014, 11:05:03 AM6/30/14
to unma...@googlegroups.com
Yeah, the estimate for abundance is exp(2.03) = 7.6 individuals per plot, while that for density is exp(2.26) = 9.6 individuals per ha.  The 50m radius plot has an area of 0.785 ha, which corresponds to the difference between values (9.6 * 0.785 = 7.6).

MP

unread,
Jun 30, 2014, 5:10:17 PM6/30/14
to unma...@googlegroups.com
Thanks, Marc and Dan. 

It seems that unmarked is carrying out some rounding in their estimates.  Is it possible to view the full estimates that it's using?

> #density model
> nullmodel
 
Call: gdistsamp(lambdaformula = ~1, phiformula = ~1, pformula = ~1, data = umf, keyfun = "halfnorm", output = "density", unitsOut = "ha", se = TRUE) Abundance: Estimate SE z P(>|z|) 2.26 0.674 3.36 0.000781 Availability: Estimate SE z P(>|z|) -1.9 0.788 -2.4 0.0162 Detection: Estimate SE z P(>|z|) 3.77 0.189 20 1.18e-88 AIC: 1201.548 > backTransform (nullmodel, type="lambda") Backtransformed linear combination(s) of Abundance estimate(s) Estimate SE LinComb (Intercept) 9.63 6.49 2.26 1 Transformation: exp > exp(2.26) [1] 9.583089

>
#abundance model
> nullmodel2
Call:
gdistsamp(lambdaformula = ~1, phiformula = ~1, pformula = ~1, 
    data = umf, keyfun = "halfnorm", output = "abund", unitsOut = "ha", 
    se = TRUE)

Abundance:
 Estimate    SE    z P(>|z|)
     2.03 0.681 2.98 0.00286

Availability:
 Estimate    SE     z P(>|z|)
     -1.9 0.795 -2.39  0.0166

Detection:
 Estimate    SE  z P(>|z|)
     3.77 0.189 20 1.2e-88

AIC: 1201.548 
> backTransform(nullmodel2, type="lambda")
Backtransformed linear combination(s) of Abundance estimate(s)

 Estimate   SE LinComb (Intercept)
     7.62 5.19    2.03           1

Transformation: exp 
> exp(2.03)
[1] 7.614086

Jeffrey Royle

unread,
Jun 30, 2014, 6:27:17 PM6/30/14
to unma...@googlegroups.com
yea the object "nullmodel" should have the parameter estimates in there. Try coef(nullmodel) or print(nullmodel, digits=5) or something like that.



--

MP

unread,
Jun 30, 2014, 6:46:05 PM6/30/14
to unma...@googlegroups.com
Yup, it was coef(nullmodel) - Thanks!
Reply all
Reply to author
Forward
0 new messages