Error: vector memory exhausted (limit reached?)

Devin Fitzpatrick

Nov 8, 2021, 6:16:19 AM
to ctmm R user group
Hi there,
I am estimating home ranges for 10 urban grey squirrels. For 2 of the squirrels I am getting the above error message when I try to calculate the AKDE (OUF); however, the KDE (IID) works.
Do you have any idea why this would be?
Thank you,
Devin 

Christen Fleming

Nov 8, 2021, 1:00:43 PM
to ctmm R user group
Hi Devin,

I'm not sure why that would happen when analyzing an individual squirrel. Can you send me a minimal working example (data + script) that can reproduce the error?

Best,
Chris

Connor O'Malley

Nov 9, 2021, 10:45:20 AM
to ctmm R user group
Hey Devin,
Not sure if this is useful at all, but I ran into memory errors when I had too many animals with low numbers of location points - like the akde was unable to converge or something? I filtered the data to require a minimum of 30 locations for each animal and the memory errors went away.
Good luck,
Connor

Christen Fleming

Nov 9, 2021, 11:16:40 AM
to ctmm R user group
Hi Connor,

If you are trying to calculate everything on the same grid, like akde() on a list of individuals, then having a very uncertain individual with extremely wide CIs can cause an out-of-memory (OOM) error, because the more certain individuals set the grid resolution too fine for the uncertain individual (I haven't yet implemented "progressive" resolutions). But in most other situations this shouldn't happen, and I'd like to fix it if it does.

Best,
Chris

Connor O'Malley

Nov 9, 2021, 5:02:50 PM
to ctmm R user group
I'm still a little unclear on how things like the grid and bandwidth work and am trying to educate myself. I'm going off code I picked up in one of the vignettes to process ~300 mountain lions from all across the USA. As I mentioned, I got memory errors when I had cats with low numbers of locations, but after filtering those out it worked again. Is this the recommended way to process lots of animals? Sorry, one more question: is bandwidth optimization taken care of in this code, or do I need to use the bandwidth() function here? Thanks so much!
Connor

library(ctmm)  # movement-model fitting and AKDE home-range estimation

df <- as.telemetry(cats.ready)

# fit movement models
FITS <- AKDES <- list()
for(i in seq_along(df))
{
  # automated starting guess for the movement-model parameters
  GUESS <- ctmm.guess(df[[i]],interactive=FALSE)
  
  # fit the movement model, starting from the guess, and save it into the list
  FITS[[i]] <- ctmm.fit(df[[i]],GUESS,trace=2)
}

# calculate AKDEs on a consistent grid using the list of fitted models
AKDES <- akde(df,FITS,trace=1)

Connor O'Malley

Nov 9, 2021, 5:46:31 PM
to ctmm R user group
I'll just add that I'm still working through all the previous discussions and questions folks have asked, and I'm sure you've covered this stuff a few times, so sorry in advance for the repeat questions - thanks!

Christen Fleming

Nov 10, 2021, 12:32:32 PM
to ctmm R user group
Hi Connor,

This is how you want to set up your code if you want to calculate all AKDEs on a consistent grid for comparisons like overlap. That can result in OOM errors if you have individuals with very uncertain home ranges (a giant upper CI) combined with individuals with very certain home ranges (a fine grid resolution).
  1. When I get around to coding up "progressive" resolutions, then that should fix this OOM issue, but I don't know when I will get to that unless somebody really needs it. Is it a priority that you need to include the remaining individuals?
  2. If you don't need to calculate overlaps (or other UD-UD comparisons), then you can just calculate each AKDE separately and that should not result in an OOM error. That would be done in the loop, like:
AKDES[[i]] <- akde(df[[i]],FITS[[i]],trace=1)
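
In full, that loop would look roughly like this (untested sketch, reusing the df/FITS/AKDES objects from your script above):

for(i in seq_along(df))
{
  GUESS <- ctmm.guess(df[[i]],interactive=FALSE)
  FITS[[i]] <- ctmm.fit(df[[i]],GUESS,trace=2)
  # each AKDE is calculated on its own grid: this avoids the joint-grid
  # memory blow-up, but the resulting UDs are not gridded consistently
  # for overlap() or other UD-UD comparisons
  AKDES[[i]] <- akde(df[[i]],FITS[[i]],trace=1)
}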

The bandwidth is optimized within akde(). The bandwidth() function is for people who want to calculate and export bandwidth values to other software. Most users will not need this function. It also does not include the bias correction of AKDEc/KDEc, so by itself it's fairly inefficient.
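
If you did want to export the bandwidth to other software, the call would look something like this (untested sketch; it just returns the optimized bandwidth matrix for one individual, which akde() already computes internally):

# optimal bandwidth matrix for one fitted individual (for export only)
BW <- bandwidth(df[[1]],FITS[[1]])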

Best,
Chris

Connor O'Malley

Nov 10, 2021, 1:52:22 PM
to ctmm R user group
Hi Chris,
1. I don't need to include small-sample-size individuals, but my challenge has been determining which animals are residents and which are not.

I want to compare home-range sizes across a range of habitat quality, to test whether home ranges get smaller in higher-quality habitat. The problem (and I would love any advice you can offer) is setting an objective standard for which cats should be included, since many of them have home ranges that seem way too large.

I've been playing around with cluster() and then filtering the animals to keep just the ones with $P == 1. I've also tried retaining the cats with $P close to 1, since a lot are in the 0.999 range. This seems to get at what I need, but it still keeps a lot of animals with really big home ranges and wide confidence intervals (maybe it could be argued that those cats should be included, I'm not sure). Would filtering on variogram convergence be more appropriate than cluster()? Thanks for any tips you might have!
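
In case it helps, here's roughly what I've been doing (untested sketch; this assumes cluster() accepts the list of AKDE UDs, and that the $P element I mentioned holds per-individual membership probabilities in the same order as the input list):

# two-cluster population model over the individual home-range estimates
CLU <- cluster(AKDES)
# keep only the cats whose membership probability is essentially 1
keep <- which(CLU$P >= 0.999)
AKDES.res <- AKDES[keep]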
Connor 

Devin Fitzpatrick

Nov 11, 2021, 4:54:10 AM
to ctmm R user group
Hi both,
Thank you for your replies. Strangely, the error was resolved simply by updating RStudio.
All the best,
Devin 

Christen Fleming

Nov 11, 2021, 2:50:36 PM
to ctmm R user group
Hi Connor,

I've been making more and more tools for this task, but there's no magic bullet yet. Low DOF[area] values are one indication, though they can be misleading if some tracks are genuinely short samples. Large area estimates are another indication (which cluster() separates out), as are large tau[position] estimates (which I might include in cluster() at some point). In Mike Noonan's 2019 and 2020 comparative analyses, he set a threshold on the overlap between the first half and the second half of each individual's data.
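
If you want to pull those diagnostics out of your fitted models, something like this should work (untested sketch; summary() of a ctmm fit reports effective sample sizes in $DOF, and OU/OUF fits carry a tau["position"] timescale):

# effective sample size for area estimation, per individual
DOF.area <- sapply(FITS,function(F) summary(F)$DOF["area"])
# home-range crossing timescale in seconds, per individual (NA for IID fits)
TAU.pos <- sapply(FITS,function(F) if("position" %in% names(F$tau)) F$tau["position"] else NA)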

Best,
Chris

Connor O'Malley

Nov 11, 2021, 6:13:38 PM
to ctmm R user group
Hi Chris,
That Noonan et al. 2019 paper is really eye-opening and makes me so grateful that you and your colleagues have provided these tools for us!

Correct me if I'm wrong, but would it make sense to filter my animals by the relative confidence intervals on their AKDEs? That way I could keep the animals with really tight CIs and ditch the ones with greater uncertainty around their home-range estimate. Does cluster() factor CIs into the grouping process, or is it grouping specifically on area? Hopefully I am interpreting that correctly.
Thanks again!
Connor 

Connor O'Malley

Nov 12, 2021, 10:49:16 AM
to ctmm R user group
Sorry, one more thing to add to this. If 3 x tau[position] is approximately how long the variogram needs to asymptote (hopefully I'm interpreting this correctly), is it logical to remove animals whose data length is shorter than that? Say I have an animal with 20 collar days and 3 x tau[position] gives me 35 days; toss out that animal? I'm considering visually inspecting all of the variograms, but for some of them it's difficult to say for sure whether they should be included or not. Thank you!
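
Here's roughly the filter I have in mind, to make sure I'm thinking about it right (untested sketch; this assumes OU/OUF fits so that FITS[[i]]$tau has a "position" entry, and that the telemetry time column $t is in seconds):

# tracking duration in days, per individual
DUR <- sapply(df,function(d) diff(range(d$t))/86400)
# three home-range crossing times in days, per individual
TAU3 <- sapply(FITS,function(F) 3*F$tau["position"]/86400)
# candidate rule: keep animals tracked for at least ~3 crossing times
keep <- which(DUR >= TAU3)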
Connor

Christen Fleming

Nov 12, 2021, 6:08:17 PM
to ctmm R user group
Thanks Connor,

Relative CI width is equivalent to DOF[area], which is both behavior- and sampling-dependent. You can get a tiny DOF[area] because the behavior was dispersive or because the tracking period was very short.

Both meta() and cluster() propagate the uncertainties in the individual estimates. cluster() uses a generalization of the hierarchical model in meta(): https://www.biorxiv.org/content/10.1101/2021.07.05.451204v1.abstract
where in cluster() there is a mixture of two population distributions (resident and dispersive).

Regarding the question about removing individuals whose sampling period is short relative to their tau[position] (which is pretty much equivalent to removing individuals with small DOF[area]), that depends.
If you have many individuals with similar behaviors and some have low DOF[area] (<4-5 for the default estimator in ctmm) because their tracking periods were very short, then removing those individuals would reduce negative bias in the population estimate.
On the other hand, if you have many resident individuals with similar tracking periods but variable home-range sizes and crossing times, then removing the low-DOF[area] individuals (even though their estimates are negatively biased) will further negatively bias the population estimates, because you would preferentially be dropping the largest home ranges.
For classifying between resident and non-resident individuals, I would tend to stick more to area and tau[position] size.
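
For reference, the population-level calls would look something like this (untested sketch; both take the list of AKDE UDs and propagate the individual CIs):

# population mean home-range area, with individual uncertainties propagated
meta(AKDES)
# two-component population mixture (resident versus dispersive)
cluster(AKDES)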

Best,
Chris