Dear group,
I am analyzing data on humpback whale abundance during 2022 in a given area. We have standardized transects, and we survey them repeatedly (replicates) during the whale season.
I have fitted many models with covariates, and they give me very different abundance estimates (ranging from 600 to 2000 animals). The model with the best AIC gives a very low Pa (0.085) and a high abundance. Another model that included the alpha transects in the analysis improved Pa (0.349) and reduced abundance, but it is not right to compare the AIC of models fitted to different datasets, right?
I don't feel comfortable choosing a model, because my decision here is very important. The covariates can be explained (they are stratifications such as region, survey, and time, with time divided into time A and time B). The best AIC and the low Pa occur when I include survey as a covariate (a survey is each completed round of fieldwork on the transects). That makes sense, because some surveys had very high numbers of whales and others very low numbers, depending on whale migration. But I wonder whether this model is overestimating abundance. What do you suggest to improve my model selection?
All the best,
Jéssica
--
You received this message because you are subscribed to the Google Groups "distance-sampling" group.
To unsubscribe from this group and stop receiving emails from it, send an email to distance-sampl...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/distance-sampling/82b7348f-120e-4c88-a620-a7ef395290d8n%40googlegroups.com.
dht2 function to learn about that.
Jéssica, a few comments.
Steve Buckland
From: distance...@googlegroups.com <distance...@googlegroups.com>
On Behalf Of Jéssica Melo
Sent: 30 November 2022 11:35
To: Tiago Marques <tiagoand...@gmail.com>
Cc: distance-sampling <distance...@googlegroups.com>
Subject: Re: [distance-sampling] selecting a model with very different results
Hi Tiago,
Thank you for your help. I will try another model for the detection function to see whether something is wrong with the hazard rate. I have attached the frequency distribution of the perpendicular distances.
I'm sorry I didn't explain what an alpha transect is. We have "main" transects that we sample in each survey. But sometimes we also carry out visual effort on the way to the starting point; this is what we call the alpha transect. Including these transects improved Pa but reduced the total abundance, and I don't know which estimate is better.
I think my biggest question here is about replicates. Is there a different way to analyze replicates to get a total abundance, that is, how many whales frequent the area? As I said, I included survey as a covariate, but maybe there is a better way to do this.
Jéssica
On Mon, 28 Nov 2022 at 18:03, Tiago Marques <tiagoand...@gmail.com> wrote:
Dear Jéssica,
You should not compare AICs across models fitted to different data. Those comparisons are meaningless.
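To illustrate why, here is a minimal sketch in plain Python (not Distance or R code). It assumes an untruncated half-normal model, which has a closed-form maximum likelihood estimate, and the distances are made-up values rather than data from this thread. The same model is fitted to a "main transects only" dataset and to that dataset plus a few extra observations. The log-likelihood is a sum over observations, so the AIC changes simply because the data changed, and the difference says nothing about which analysis is better.

```python
import math

def halfnormal_aic(distances):
    """AIC of an untruncated half-normal fitted by maximum likelihood."""
    n = len(distances)
    sum_sq = sum(x * x for x in distances)
    sigma2 = sum_sq / n  # closed-form MLE of sigma^2
    # log-likelihood of f(x) = sqrt(2 / (pi * sigma2)) * exp(-x^2 / (2 * sigma2))
    loglik = 0.5 * n * math.log(2.0 / (math.pi * sigma2)) - sum_sq / (2.0 * sigma2)
    return 2 * 1 - 2 * loglik  # one estimated parameter

# Hypothetical perpendicular distances (km), NOT data from this thread.
main_only = [0.1, 0.3, 0.5, 0.8, 1.2, 1.5]
with_alpha = main_only + [0.2, 0.4, 0.9]

aic_main = halfnormal_aic(main_only)
aic_all = halfnormal_aic(with_alpha)
print(aic_main, aic_all)  # same model form, different data, different AIC
```

AIC is only a ranking tool among candidate models fitted to exactly the same observations and the same truncation distance.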
You do not explain what an alpha transect is, and that seems key to understanding what is going on.
Using survey as a covariate can't be justified on the grounds that there are more or fewer whales due to migration: covariates in the detection function affect detectability, not density. Survey would therefore be a sensible covariate if, for example, different vessels, observers or sea states occurred in different surveys and you had not recorded those survey-level covariates. If you did record them, using vessel, observer or sea state directly might be better, since survey would only be a proxy for the true factors affecting detectability. In a way, survey is a covariate that soaks up any survey-level heterogeneity in the detection process for which you have not recorded the relevant covariates.
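A small sketch of that point, in plain Python rather than Distance or R code. It assumes a half-normal detection function whose scale depends on a covariate through a log link, sigma(z) = exp(beta0 + beta1 * z). The coefficients, the truncation distance and the sea-state indicator are all hypothetical. The covariate changes the detection probability attached to each sighting; the number of whales present is not a parameter of this model at all.

```python
import math

def halfnormal_pa(sigma, w):
    """Average detection probability out to truncation distance w."""
    # integral_0^w exp(-x^2 / (2 sigma^2)) dx, divided by w
    area = sigma * math.sqrt(math.pi / 2.0) * math.erf(w / (sigma * math.sqrt(2.0)))
    return area / w

def scale(beta0, beta1, z):
    """Log-link scale parameter: the covariate z acts on detectability only."""
    return math.exp(beta0 + beta1 * z)

# Hypothetical values, for illustration only.
w = 2.0                            # truncation distance (km)
beta0, beta1 = math.log(0.8), -0.5
pa_calm = halfnormal_pa(scale(beta0, beta1, 0), w)   # z = 0: calm sea
pa_rough = halfnormal_pa(scale(beta0, beta1, 1), w)  # z = 1: rough sea

# Horvitz-Thompson-style estimate of animals in the covered area:
# each detection counts as 1 / Pa for its own covariate value.
sea_state = [0, 0, 1, 1, 1]        # one entry per detection
n_covered = sum(1.0 / halfnormal_pa(scale(beta0, beta1, z), w) for z in sea_state)
print(pa_calm, pa_rough, n_covered)
```

If whale numbers differ between surveys because of migration, that belongs in the abundance estimation step (for example as separate estimates per survey), not in the detection function.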
I suspect there is something rather strange going on, with a spiked detection function leading to P=0.085 (maybe a hazard-rate detection function gone bad?), which then probably leads to the estimate of around 2000 animals. That model might not be sensible for the sighting process (is it?), and the spike might be the result of some rounding to 0 (common if you record distances and angles, and many angles are small or rounded to 0). Of course I am only guessing, since I have not seen your data, so this is based only on vague priors and no data (this is really a stats joke, bear with me).
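The link between a spiked detection function, a low P and a high abundance can be sketched as follows, in plain Python rather than Distance or R code. The hazard-rate form g(x) = 1 - exp(-(x / sigma)^(-b)) and the line-transect estimator N = n * A / (2 * w * L * P) are standard; the parameter values, the number of detections, the area and the effort are hypothetical. Only the two P values (0.085 and 0.349) come from this thread, and the ratio shown holds everything else fixed, which is not the case here because the two analyses used different data.

```python
import math

def hazard_rate(x, sigma, b):
    """Hazard-rate detection function, defined for x > 0."""
    return 1.0 - math.exp(-((x / sigma) ** (-b)))

def average_detection_prob(g, w, steps=20000):
    """Midpoint-rule integral of g over (0, w), divided by w."""
    h = w / steps
    return sum(g((i + 0.5) * h) for i in range(steps)) * h / w

def abundance(n, pa, area, w, line_length):
    """Conventional line-transect estimate N = n * A / (2 * w * L * Pa)."""
    return n * area / (2.0 * w * line_length * pa)

w = 2.0  # hypothetical truncation distance (km)
# Small sigma and b close to 1: a spike at zero with a long thin tail.
pa_spiked = average_detection_prob(lambda x: hazard_rate(x, 0.05, 1.2), w)
# Larger sigma and b: a wide shoulder before detection falls off.
pa_broad = average_detection_prob(lambda x: hazard_rate(x, 0.8, 3.0), w)

# Hypothetical n, area (km^2) and effort (km); only the Pa values are from the thread.
n_low_pa = abundance(100, 0.085, 1000.0, w, 500.0)
n_high_pa = abundance(100, 0.349, 1000.0, w, 500.0)
ratio = n_low_pa / n_high_pa
print(pa_spiked, pa_broad, ratio)
```

Because abundance scales with 1 / P, a spike fitted to distances piled up at zero can inflate the estimate several-fold without any change in the number of sightings.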
Hope this helps. I must be honest: this was the last thing I did before turning off the computer, and it was a long day, so sorry if something does not add up here :)
Tiago
On Mon, 28 Nov 2022 at 20:15, Jéssica Melo <jessic...@pctsb.org> wrote:
Jéssica, adding to Eric’s comments, if model fit is poor, that can be because the model is a poor choice, or it can mean that there is a problem with the data. If your distances are subject to a lot of rounding, then the hazard-rate model may appear to provide a better fit than the half-normal, but the half-normal may give lower bias. Have you investigated your data? Are there a number of zeros in the perpendicular distances? If so, you may want to increase the width of your first interval when assessing goodness-of-fit, so that the first interval extends out beyond the distances that might plausibly get rounded to zero. If you only have very few zeros in the data, then you might want to try truncating more. But spend some time exploring the data and understanding them – that will then inform your decisions on how best to model them.
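A minimal sketch of those checks, in plain Python rather than Distance or R code. The distances, the truncation distance, the width of the first interval and the half-normal scale are all hypothetical; in particular sigma is a placeholder, not a fitted value. The sketch counts exact zeros, builds goodness-of-fit cutpoints whose first interval is wide enough to absorb distances that may have been rounded to zero, and computes a chi-square statistic against the half-normal.

```python
import math

def fraction_zero(distances):
    """Share of perpendicular distances recorded as exactly zero."""
    return sum(1 for x in distances if x == 0.0) / len(distances)

def gof_cutpoints(w, n_bins, first_width):
    """Cutpoints on [0, w]: a wide first interval, then equal-width intervals."""
    rest = (w - first_width) / (n_bins - 1)
    inner = [first_width + rest * k for k in range(n_bins - 1)]
    return [0.0] + inner + [w]

def bin_counts(distances, cuts):
    """Observed counts per interval; the last interval includes its upper end."""
    counts = [0] * (len(cuts) - 1)
    for x in distances:
        for i in range(len(counts)):
            last = i == len(counts) - 1
            if cuts[i] <= x < cuts[i + 1] or (last and x == cuts[-1]):
                counts[i] += 1
                break
    return counts

def halfnormal_expected(n, cuts, sigma):
    """Expected counts under a half-normal truncated at the last cutpoint."""
    cdf = lambda x: math.erf(x / (sigma * math.sqrt(2.0)))
    return [n * (cdf(b) - cdf(a)) / cdf(cuts[-1]) for a, b in zip(cuts, cuts[1:])]

# Hypothetical distances (km), NOT data from this thread.
distances = [0.0, 0.0, 0.0, 0.05, 0.1, 0.2, 0.3, 0.35,
             0.5, 0.6, 0.85, 0.9, 1.1, 1.4, 1.9]
w = 2.0                                   # truncation distance
kept = [x for x in distances if x <= w]   # apply truncation

zero_share = fraction_zero(kept)
cuts = gof_cutpoints(w, n_bins=5, first_width=0.4)
observed = bin_counts(kept, cuts)
expected = halfnormal_expected(len(kept), cuts, sigma=0.8)  # placeholder sigma
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(zero_share, observed, chi_sq)
```

With a wide first interval, the test no longer penalizes a smooth model for failing to reproduce a spike that is only an artefact of rounding.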
Steve Buckland