summarize_ds_models function


Carolin Tröger

Apr 28, 2017, 5:28:24 AM
to distance-sampling

Hi Listfolk,

I am analyzing some distance data in R (RStudio) and I would like to summarize my models with the function "summarize_ds_models()".

But somehow this does not work. I get an error message:

> So15_hr_bins <- ds(So15, formula = ~1, key = "hr", convert.units = 0.001, cutpoints = dist.bins)
> Wi16_hr_bins <- ds(Wi16, formula = ~1, key = "hr", adjustment = "cos", convert.units = 0.001, cutpoints = dist.bins)
> summarize_ds_models(So15_hr_bins, Wi16_hr_bins, sort = "AIC", output = "latex")
Error in `[.data.frame`(res, , 4:7) : undefined columns selected

Packages Distance, Rdistance, mrds and knitr are installed!

Can somebody help?

Thanks in advance

Carolin 

Eric Rexstad

Apr 28, 2017, 5:54:53 AM
to Carolin Tröger, distance-sampling

Carolin

I've duplicated (with the minke data provided in the Distance package) your sequence of commands

library(Distance)
data(minke)
minke1 <- ds(data=minke, key="hn")
minke2 <- ds(data=minke, key="hr")
summarize_ds_models(minke1, minke2)

             Model Key function Formula C-vM $p$-value $\\hat{P_a}$ se($\\hat{P_a}$) $\\Delta$AIC
1 \\texttt{minke1}  Half-normal      ~1      0.9292864    0.4519568       0.03470614     0.000000
2 \\texttt{minke2}  Hazard-rate      ~1      0.8477827    0.5010360       0.04753377     1.838793

without problem using the following versions of the packages of interest

other attached packages:
[1] Distance_0.9.6 mrds_2.1.17   

under R 3.4.0

Why don't you see if you can get sensible results from `summarize_ds_models()` using these data; that would isolate the problem to something unique about your models.
--
You received this message because you are subscribed to the Google Groups "distance-sampling" group.
To unsubscribe from this group and stop receiving emails from it, send an email to distance-sampl...@googlegroups.com.
To post to this group, send email to distance...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/distance-sampling/e56e4494-68f1-45d1-8cfd-ae15dd5654cd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

-- 
Eric Rexstad
Research Unit for Wildlife Population Assessment
Centre for Research into Ecological and Environmental Modelling
University of St. Andrews
St. Andrews Scotland KY16 9LZ
+44 (0)1334 461833
The University of St Andrews is a charity registered in Scotland : No SC013532

Carolin Tröger

Apr 28, 2017, 6:47:52 AM
to distance-sampling, caroli...@gmail.com, eric.r...@st-andrews.ac.uk, er...@st-andrews.ac.uk

Hi Eric, 

Thanks for the quick response! I tried it with the minke data, and summarize_ds_models() worked, so the problem must lie in my data.
Then I tried summarize_ds_models() with my own data:

test<-ds(Wi16, key="hn", formula=~1, convert.units=0.001)
test2<-ds(So16, key="hr",formula=~1, convert.units=0.001)
summarize_ds_models(test, test2)

This works!

But as soon as I add cutpoints to the ds() call, an error appears:

test<-ds(Wi16, key="hn", formula=~1, convert.units=0.001)
test2<-ds(So16, key="hr",formula=~1, convert.units=0.001, cutpoints=dist.bins)
summarize_ds_models(test,test2)

Error in data.frame(c("Hazard-rate", "~1", "0.439498409297398", "0.0158987457444908",  : 
  arguments imply differing number of rows: 5, 6



Eric Rexstad

Apr 28, 2017, 7:22:11 AM
to Carolin Tröger, distance-sampling, eric.r...@st-andrews.ac.uk

Thanks for the thorough testing Carolin.

There are some subtle issues associated with summarize_ds_models() because that function is usually used for model selection.  To compare models using AIC, the data used to fit them must be identical; otherwise the comparison is invalid.  When the data are not binned, every model is fitted to the same exact distances, so the AIC comparison is legitimate.

However, when binning occurs, summarize_ds_models() should also test whether the data used in the models being compared are identical.  At the moment the function does not do that.  In the second example you provided, the function should prevent you from comparing models fitted to unbinned and binned data, because the result is invalid.  It should also do so more gracefully than by failing with an obscure error message.

To summarise, we should add a warning to summarize_ds_models() that, for the moment, it should only be used with models fitted to exact distance data.  In a future version, the function should be cleverer: if binned data are used, it should ensure that the cutpoints are identical.  Likewise, the function ought to check that the truncation distances of the models being considered are identical.  Those checks are not yet in place, so users need to police themselves.
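
Until such checks exist in the package, the self-policing could be sketched roughly as below. This is only an illustration: same_data_setup() is a hypothetical helper, not part of Distance, and although the `width` field of `meta.data` is used elsewhere in this thread, the `binned` and `breaks` field names are assumptions that should be verified against your own fitted objects.

```r
# Hypothetical helper, not part of the Distance package.
# Returns TRUE only if all supplied models share truncation and binning.
same_data_setup <- function(...) {
  models <- list(...)
  if (length(models) < 2) stop("Supply at least two models")
  # Metadata stored with each fitted detection function.
  # `binned` and `breaks` are assumed field names; check them with str().
  meta <- lapply(models, function(m) m$ddf$meta.data)
  first <- meta[[1]]
  same_width <- all(vapply(meta, function(x) {
    isTRUE(all.equal(x$width, first$width))
  }, logical(1)))
  same_binning <- all(vapply(meta, function(x) {
    identical(isTRUE(x$binned), isTRUE(first$binned))
  }, logical(1)))
  same_breaks <- all(vapply(meta, function(x) {
    isTRUE(all.equal(x$breaks, first$breaks))
  }, logical(1)))
  same_width && same_binning && same_breaks
}

# Possible usage before a model comparison:
# if (same_data_setup(test, test2)) summarize_ds_models(test, test2)
```

The helper compares only truncation and cutpoints; it does not verify that the models were fitted to the same observations, which remains the analyst's responsibility.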

I'll raise an issue about this problem on our software development system.  Thanks for spotting this.


Carolin Tröger

Apr 28, 2017, 8:04:58 AM
to distance-sampling, caroli...@gmail.com, eric.r...@st-andrews.ac.uk, er...@st-andrews.ac.uk
Hi Eric, 

OK, that explains why it does not work with my data. My intention in using this function was actually just to list all the fitted models neatly, one beneath the other.
At the moment I am creating a table like this:

density_results_all<- rbind(So15_hn_bins$dht$individuals$D, Wi16_hn_bins$dht$individuals$D,......
 
density_results_all[,1]<- as.character(density_results_all[,1])
density_results_all[1,1]<-"So15 Hn 10m steps"
density_results_all[2,1]<-"Wi16 Hn 10m steps"  .....

kable(density_results_all[,1:7], digits=5, format= "pandoc", caption= "Summary of Density estimates Binned")

Table: Density Estimates over all seasons (binned data)

Label                Estimate        se          cv        lcl        ucl          df   AIC_results
------------------  ---------  --------  ----------  ---------  ---------  ----------  ------------
So15 Hn 10m steps    5.431665   0.97927   0.1802894   3.632992   8.120852    9.365881     -578.9588
Wi16 Hn 10m steps    5.649013   1.02055   0.1806603   3.774165   8.455206    9.318664     -735.8592
So16 Hn 10m steps    5.025529   0.79037   0.1572706   3.550227   7.113896   10.164670     -343.9217
Wi17 Hn 10m steps    6.057813   1.06061   0.1750820   4.098366   8.954082    9.363939     -753.3490

But is there a better and easier way?

One other thing I was wondering about is the ESW (effective strip width). I know that it is not automatically calculated in R by the ds function. I know how to calculate it by hand from my data, but is there a way of automating it within R?

For example:
minke1 <- ds(data=minke, key="hn")

ESW_minke1<- minke1$dht$ .... ??? 

And how can I draw the ESW on my detection function plot (what R code do I need for that)?
I am sorry to ask all these questions, but I cannot find any R code examples for these problems.

Thanks in advance!

Eric Rexstad

Apr 28, 2017, 8:28:12 AM
to Carolin Tröger, distance-sampling, eric.r...@st-andrews.ac.uk

Carolin

Every analyst will want their own summaries of the analyses they have performed, so there is no generalised function beyond 'summarize_ds_models()' for that purpose.  Your "rbind()" effort is as good as any method for your purposes.
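
If the rbind() step is repeated for many models, it could be wrapped in a small function along these lines. This is a sketch only: collect_density() is a hypothetical name, and it assumes each model stores its density table in `$dht$individuals$D`, as in the rbind() code quoted in this thread.

```r
# Hypothetical helper, not part of the Distance package.
# `models` is a named list of fitted ds() objects; the names become labels.
collect_density <- function(models) {
  if (is.null(names(models))) stop("Please supply a named list of models")
  rows <- lapply(names(models), function(nm) {
    # Density estimates table, as accessed in the rbind() code in this thread
    d <- models[[nm]]$dht$individuals$D
    # Prepend the model name so that rows stay identifiable after stacking
    cbind(Model = nm, d, stringsAsFactors = FALSE)
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

# Possible usage with models named as in this thread:
# tab <- collect_density(list("So15 Hn 10m steps" = So15_hn_bins,
#                             "Wi16 Hn 10m steps" = Wi16_hn_bins))
# knitr::kable(tab, digits = 5, format = "pandoc")
```

Using the list names as labels avoids overwriting the first column row by row.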

R is lovely for providing the tools to make functions out of repetitive calculations.  If you wish to have a summary of effective strip width, you could finish the function you began like this

ESW.calc <- function(my.ds.object) {
  # Calculates effective strip width (ESW)
  #   given an object of class "dsmodel" created by ds()
  #   Rexstad, 28Apr17
  # inherits() is safe even if class() returns more than one element
  if (!inherits(my.ds.object, "dsmodel")) stop("Argument is not of class dsmodel")
  my.summary <- summary(my.ds.object)
  # truncation distance used when fitting the model
  truncation <- my.ds.object$ddf$meta.data$width
  # average detection probability within the truncation distance
  average.p <- my.summary$ds$average.p
  esw <- truncation * average.p
  return(esw)
}

If you want a vertical line placed at ESW on your detection function plot, try

plot(minke1)
ESW.for.hn <- ESW.calc(minke1)
abline(v=ESW.for.hn, lwd=2)
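
The arithmetic inside ESW.calc() is simply the truncation distance multiplied by the average detection probability. As a rough worked example, the snippet below uses the estimated detection probability reported for minke1 earlier in this thread (0.4519568) together with a purely hypothetical truncation distance of 1.5; the real value should be read from the fitted object.

```r
# Average detection probability for minke1, taken from the summary table above
average_p <- 0.4519568
# Hypothetical truncation distance, for illustration only
truncation <- 1.5
# Effective strip width = truncation distance * average detection probability
esw <- truncation * average_p
print(esw)
```

The result is in the same units as the truncation distance.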



Carolin Tröger

Apr 28, 2017, 8:38:28 AM
to distance-sampling, caroli...@gmail.com, eric.r...@st-andrews.ac.uk, er...@st-andrews.ac.uk
Thanks a lot, Eric! That helps me a lot!


David Lawrence Miller

Apr 28, 2017, 11:38:20 AM
to Carolin Tröger, distance-sampling, eric.r...@st-andrews.ac.uk, er...@st-andrews.ac.uk
Hi Carolin, hi listfolk,

I've patched this issue in the version of Distance available on GitHub.
You can install it via devtools:

install.packages("devtools")
library(devtools)
install_github("DistanceDevelopment/Distance")

This update ensures that the truncation and binning are the same between
models and includes the Chi^2 p-value in the table of results for binned
models (Cramer-von Mises is still provided for unbinned distances).

Hope this is helpful!

cheers,
--dave