ctmm.fit vs ctmm.select for mean UDs

Connie Kot

Sep 14, 2023, 7:55:31 PM

to ctmm R user group

Hello!

I have been using the code given during the spring webinar to create mean UDs (very helpful - thanks!) but noticed that it produces really large areas that seem to be overestimates. This is using the ctmm.select function after running ctmm.guess and averaging the UDs of individuals within a population. When I used ctmm.fit (also provided as an example method, but run for individual AKDEs) and averaged the resulting UDs, the results seemed more reasonable for home ranges. I tried to read more about how these differ, but I do not understand why or when one is preferred. I do think I have relatively small sample sizes, but ctmm.fit seems to be doing what we need.

The edited R code and same sample data files are in the same dropbox location.

Thank you!

Connie

Christen Fleming

Sep 17, 2023, 9:41:19 PM

to ctmm R user group

Hi Connie,

You are running ctmm.fit() with no guess object, which just returns an IID model fit. An IID fit has the maximum DOF, equal to the nominal sample size. This produces a tighter distribution, but is incorrect for non-independent data.

On the other hand, when you run ctmm.select(), some of your DOFs are very small because individuals are not very resident. The estimate is appropriately larger than the data, with correspondingly large confidence intervals. This is a data quality issue, as you are effectively estimating something akin to a dispersal kernel for the population.

Your individual DOFs are: 1.4, 1.8, 2.1, 2.6, 6.7, 13.4. So the first 4 will have negatively biased range estimates without ctmm.boot() and are potentially non-resident.
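
A minimal sketch of the two fits being compared above, assuming the ctmm package is installed and `telemetry` is a single telemetry object. The helper names fit_iid, fit_selected, and area_dof are hypothetical and only for illustration; DOF[area] is read from the DOF element that summary() reports for a fitted model.

```r
# Hypothetical helpers; the ctmm:: calls need the ctmm package installed.

# No guess object: ctmm.fit() falls back to its default ctmm() model, i.e. IID.
fit_iid <- function(telemetry) {
  ctmm::ctmm.fit(telemetry)
}

# Autocorrelated workflow: automated guess, then model selection.
fit_selected <- function(telemetry) {
  GUESS <- ctmm::ctmm.guess(telemetry, interactive = FALSE)
  ctmm::ctmm.select(telemetry, GUESS)
}

# Effective sample size for area estimation, as reported by summary().
area_dof <- function(FIT) {
  summary(FIT)$DOF[["area"]]
}
```

Comparing area_dof(fit_iid(telemetry)) with area_dof(fit_selected(telemetry)) on one track should show the gap between the nominal and the effective sample size described above.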

Best,

Chris

Connie Kot

Sep 21, 2023, 3:59:53 PM

to ctmm R user group

Hi Chris,

Thank you for this explanation - there is a lot here to unpack, and I appreciate your time and expertise!

1) I was mainly surprised that it would run without a guess object, so this clarifies what is happening - thanks! I was also able to run the ctmm.fit with a guess object (ran ctmm.guess first) and results were similar to ctmm.select (with guess object), but not exact (amended my R code in dropbox if you wanted to take a look). Can you please explain this? For estimating aKDEs for populations (n > 1), I understand that the best method would be to run ctmm.select() with the guess object (after running ctmm.guess), such as the example you have presented in the past webinar. But I was confused by the documentation, where it uses ctmm.fit with guess and says "in general, you want to run ctmm.select instead" (https://ctmm-initiative.github.io/ctmm/reference/ctmm.fit.html).

2) I do need to ponder my non-resident movement models a bit more, so thank you for looking at my sample dataset. In most of my datasets (with small sample sizes), I have been able to identify the "dispersal/migration" portion of the track and am interested in summarizing these movements as their "dispersal range." I have been exploring aKDEs for resident and dispersing behaviors to keep the method consistent within a dataset and balance timelines. I realize this might not be ideal, so it would be great to hear your insights on this, given new packages (dispfit? or others?) or if we can present all results using ctmm with some limits/major caveats.

3) Thanks for the tip about ctmm.boot - this was my next step, given time. However, this might not be appropriate for datasets with only n = 1 (or if we run out of time), so I'm wondering how to best handle aKDEs for one individual or small sample sizes without bootstrapping. Would an IID model fit with caveats be ok to consider or better to have overestimated UDs using the ctmm.select function? I am asking about the mathematical point of view as I am aware that it may also be dependent on the scientific question/objectives.

Best,

Connie

Christen Fleming

Sep 21, 2023, 5:04:15 PM

to ctmm R user group

Hi Connie,

ctmm.fit() will fit and keep all of the specified parameters. ctmm.select() will drop the unsupported parameters. In almost all use cases, you want to use the latter.
The documentation uses ctmm.fit() in specific cases where I know that it is safe to do so (for computational speed), but it also informs you not to do that on your own data (unless you know better).
If you are able to segment your data into resident segments and dispersal segments, then I would recommend estimating separate ranges for each, and reporting them separately. Also IIRC, I believe you were using mean() when you probably want to be using pkde(), but the latter is more sensitive to data quality issues. The former is the mean of the sample, while the latter is the extrapolated population range.
When you have a small effective sample size (unless it's from location error), the AKDE is underestimated, but you still get appropriately wide confidence intervals. Switching to IID KDE exacerbates that negative bias.
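
A hedged sketch of the workflow discussed in this reply, assuming ctmm is installed and `DATA` is a named list of telemetry objects. The function name population_uds is hypothetical. It fits each individual with ctmm.guess() and ctmm.select(), computes the AKDEs in one akde() call, and returns both the sample mean from mean() and the population estimate from pkde().

```r
# Hypothetical wrapper; requires the ctmm package.
population_uds <- function(DATA) {
  stopifnot(requireNamespace("ctmm", quietly = TRUE))

  # One selected (autocorrelated) model per individual.
  FITS <- lapply(DATA, function(d) {
    GUESS <- ctmm::ctmm.guess(d, interactive = FALSE)
    ctmm::ctmm.select(d, GUESS)
  })

  # Passing lists computes all individual UDs in a single call.
  UDS <- ctmm::akde(DATA, FITS)

  list(
    fits = FITS,
    UDS  = UDS,
    mean = mean(UDS),             # mean of the sampled individuals
    pkde = ctmm::pkde(DATA, UDS)  # extrapolated population range
  )
}
```

With few or weakly resident individuals, the pkde() step may warn or fail, so the two estimates are worth inspecting separately rather than assuming both succeed.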

Best,

Chris

Connie Kot

Sep 23, 2023, 12:20:38 PM

to ctmm R user group

Thanks, Chris, this is great info!

I reran my code for different datasets and pkde() seems to be working and the UDs look more reasonable, so that is good news! However, I ran into an error message on a couple of datasets using the same code after running pkde(): "Error in rep(0, gridsize - 1) : invalid 'times' argument." My input data are in the same format for the ones that appear to run successfully and the ones that throw the error.

The error also occurs for the example dataset I shared earlier on dropbox. I amended the code at the end to show the steps, though it is just using PKDE <- pkde(data_tel, UDS) instead of the mean(UDS) at the last line.

I'm interested in learning more about pkde(), so if there are any materials available for those who are unable to attend your upcoming talk at TWS, or other related references, please let me know.

Thanks again for your quick response!

Best,

Connie

Connie Kot

Sep 24, 2023, 8:36:57 PM

to ctmm R user group

Hi Chris,

I should note that this error message started showing up when running more than one dataset, but after I restarted R a few times, it seemed to have resolved so maybe it was just a glitch? However, I now get another error message and I am not sure if it is related, so I have included it in this same thread. After running pkde() for two individuals, I got a warning message first: "Warning message: In akde(data, CTMM = UD, kernel = kernel, weights = weights, ref = ref, : Population fit object returned. DOF[area] = 1.79492098138809e-07" and then I get error messages when trying to plot it or trying to export it as a shapefile to explore further. When I plotted the individual aKDEs, I was able to see them as overlapping. I placed the dataset that was giving me the error in dropbox and the new code is at the end. Any insights would be much appreciated!

Best,

Connie

Christen Fleming

Sep 25, 2023, 7:20:42 PM

to ctmm R user group

Hi Connie,

Population fit object returned. DOF[area] = 1.79492098138809e-07

This is a problematic warning message. This effective sample size is almost zero. The CIs will be so wide, you can't really export this or do anything with it.

I don't think it's safe to run pkde() on just two individuals. That would be like running akde() on two locations. I can work on coding that to be more robust, but I would never trust that output.

Best,

Chris

Connie Kot

Sep 25, 2023, 9:22:05 PM

to ctmm R user group

Hi Chris,
Thanks for looking into this! The main objective for summarizing our data, with a small sample size, is to have a better way to represent area-use, though it is obviously going to be limited statistically. In the past, we have resorted to calculating the mean KDE with the knowledge that this is not ideal. After looking at the results from the pkde() function, it seemed like the resulting UDs were in between UDs using the IID model fit and UDs calculated as the mean UDs (aKDEs). So, do you suggest that for certain sample sizes (some small threshold?), that it would be better not to use pkde() and use the mean UD (or other method)?
Thank you!
Connie

Christen Fleming

Sep 26, 2023, 3:21:42 PM

to ctmm R user group

Hi Connie,

Thanks for the example data. I noticed that some numerical gradient code wasn't doing a good job in that example and upgraded that code to analytic formulas, which I tested to work much better. This update is on GitHub now. However:

The PKDE estimate does not come out to be good with these two individuals because you get variance collapse in the population model estimates, in that the uncertainty in the individual parameter estimates is large enough to explain all of the variation in the data.

In situations like this, assuming that you have to report something, I would run pkde() and mean() on the UDs and then take the larger of the two estimates. mean() does not extrapolate from the sample to the population like pkde() does, but with really weak data (like only 2 points here) pkde() can suffer more from variance collapse.
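
The "take the larger of the two" suggestion could be sketched as follows, assuming ctmm is installed, `DATA` is a list of telemetry objects, and `UDS` is the matching akde() output. The names pick_larger and larger_population_ud are hypothetical. Areas are compared on the point estimate from summary() with units = FALSE, so that both are in the same (SI) units.

```r
# Return whichever candidate has the larger point estimate of area.
pick_larger <- function(candidates, areas) {
  candidates[[which.max(areas)]]
}

# Hypothetical wrapper; requires the ctmm package.
larger_population_ud <- function(DATA, UDS) {
  stopifnot(requireNamespace("ctmm", quietly = TRUE))

  candidates <- list(
    pkde = ctmm::pkde(DATA, UDS),  # extrapolates to the population
    mean = mean(UDS)               # describes the sample only
  )

  # Area point estimates in square meters (units = FALSE disables unit scaling).
  areas <- vapply(
    candidates,
    function(UD) summary(UD, units = FALSE)$CI[1, "est"],
    numeric(1)
  )

  pick_larger(candidates, areas)
}
```

This compares point estimates only; the confidence intervals of the chosen estimate still need to be reported alongside it.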

Best,

Chris

Connie Kot

Oct 6, 2023, 11:55:52 AM

to ctmm R user group

Thank you, Chris, for your advice and quick upgrade to the code!
I tried again with the new version of ctmm (1.2.1) on a different dataset (n = 67 individuals) and I got this warning message when calculating multiple UDs at once (UDS <- akde(data, FIT1)): "Warning: Fit object returned. DOF[area] = 3.33576466709752e-13." This seems similar to the issue I had with the other dataset with 2 individuals, where you explained that the results may not be useful because of large CIs. Subsequently, I was unable to run either the pkde() or mean(), and (just for fun) overlap(). The error after pkde() and mean() was: "Error in FUN(X[[i]], ...) : no slot of name "CTMM" for this object of class "ctmm"." For overlap(), I got this error: "Error in dr[1, ] : incorrect number of dimensions"

I'm wondering if there are any other options you would suggest, or if this dataset is just not suitable? I've posted the UDS.RDS file in the dropbox if you want to take a look; it is quite large.

Thanks again!

Cheers,

Connie

Christen Fleming

Oct 6, 2023, 7:41:00 PM

to ctmm R user group

Hi Connie,

For the time being, I would drop those individuals from the calculation. I will have this fixed in the future, but the output will likely be the same as dropping them.
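
One way to drop those individuals before calling pkde() or mean() is sketched below. It assumes, based on the "Fit object returned" warning and the error about class "ctmm" quoted above, that the degenerate entries in the akde() output are model-fit objects rather than objects of class "UD". The function name drop_degenerate is hypothetical.

```r
# Keep only the entries of UDS that are actual UD objects, and drop the
# matching telemetry objects so that DATA and UDS stay aligned.
drop_degenerate <- function(DATA, UDS) {
  ok <- vapply(UDS, function(u) inherits(u, "UD"), logical(1))
  list(
    data    = DATA[ok],
    UDS     = UDS[ok],
    dropped = which(!ok)  # positions (and names, if any) of removed individuals
  )
}
```

The returned `data` and `UDS` elements can then be passed to pkde() or mean() as before, and `dropped` records which individuals were excluded so that this can be reported.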

Best,

Chris

Connie Kot

Oct 16, 2023, 4:11:05 PM

to ctmm R user group

Thanks, Chris!

Please keep me posted on the updates on the tool and I'll be sure to keep following the threads/announcements.