
cluster() Function and Paper


Tyler Hodges

Apr 24, 2024, 10:56:03 AM
to ctmm R user group
Hello Chris,

I am currently working on an analysis of sub-seasonal functional home ranges of elk in eastern North America using a combination of segclust2d and ctmm. Given that I have 5+ years of data, ~200 individual elk, and 160,000+ relocations, I am trying to automate the process as much as possible. I was initially planning to evaluate the stationarity of each segment returned by segclust2d (1,500+) by studying each individual track and variogram and manually annotating the spreadsheets to indicate residence or migration. However, while reading old threads here, I stumbled onto the cluster() function, which could help automate this process. I was wondering if the paper you have mentioned here a few times was ever published? I'd like to read more about the function before I decide to use it. It will of course be faster than manually diagnosing variograms, but does it prove to be accurate, especially when comparing tracks of variable duration (3 days to a few months per segment)? Would things get tricky when clustering the HR areas of non-resident tracks only 3 days in length alongside the HR areas of resident tracks that encompass larger areas and three months of time?

I'd also like to eventually derive timing and duration of the migratory and/or exploratory segments, so I wonder if manual inspection might end up being best anyway to better classify those movements. 

As always, thanks for your help!

Best,
Tyler

Christen Fleming

Apr 27, 2024, 12:44:58 AM
to ctmm R user group
Hi Tyler,

We haven't yet published that work, because we are trying a lot of different approaches and seeing what works best. Currently, I would recommend something like sorting by the range-crossing timescale and then inspecting the most extreme cases, from worst to best (or until you feel confident).
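
For example, something along these lines (just a rough sketch; FITS and DATA stand in for matching named lists of your fitted ctmm models and telemetry segments, one per segment):

    library(ctmm)

    # FITS: assumed named list of ctmm.fit()/ctmm.select() results, one per segment
    # DATA: assumed matching named list of telemetry segments

    # tau[position] (range-crossing timescale) point estimates; 0 for IID fits
    TAU <- sapply(FITS, function(F) if("position" %in% names(F$tau)) F$tau[["position"]] else 0)

    # inspect the most extreme cases first (longest range-crossing times)
    for(i in order(TAU, decreasing=TRUE))
    {
      plot(variogram(DATA[[i]]), CTMM=FITS[[i]])
      title(names(FITS)[i])
      readline("press <Enter> for the next segment")
    }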

best,
Chris

Tyler Hodges

Apr 27, 2024, 9:04:58 AM
to ctmm R user group
Hello Chris,

Noted, thank you! I will give it a go and see how it turns out!

Best,
Tyler

Tyler Hodges

Oct 31, 2024, 1:38:03 PM
to ctmm R user group

Hello Chris,


Thank you for hosting the ctmm workshop at TWS! I enjoyed the good discussions and presentations.


After finishing my M.S., I am now back to working on this elk project. Rather than using cluster() or sorting by tau or DOF, we decided to visually classify variograms as stationary/non-stationary (all ~6000!). We are using a double-observer, validation-type approach to minimize misclassification. We just finished the first batch of variograms (~900), and I have a series of questions I was hoping for clarification on. Note that I had all variograms zoomed out to 100% for the first round of classifications; I was hoping that by doing so, we’d catch some instances of delayed asymptotic behavior for the shorter (7-day) segments. Alas, I fear that I likely only created more confusion amongst the technicians (and myself) by using the 100% zoom, given the tendency of the latter portions of the variogram to go haywire. For the next round, I will have both a 50% and a 100% zoom available for viewing, but will use 50% as the default. I will also use fast=FALSE and Gaussian CIs for the next round, which should clear up many of the ambiguous cases we are encountering.
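
For reference, the round-2 plotting call I have in mind looks roughly like this (a sketch for a single segment; SEG and FIT stand in for one telemetry segment and its fitted model):

    library(ctmm)

    # exact (slower) empirical variogram with Gaussian CIs for one segment
    SVF <- variogram(SEG, fast=FALSE, CI="Gauss")

    # 50% of the maximum lag as the default view, with a 100% view for comparison
    par(mfrow=c(1,2))
    plot(SVF, CTMM=FIT, fraction=0.5)
    title("50% zoom")
    plot(SVF, CTMM=FIT, fraction=1)
    title("100% zoom")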


  1. In some cases with shorter movement tracks and smaller effective sample sizes, the SVF didn’t asymptote until the latter portions of the variogram (if at all) as it attempted to fit the oscillations and irregularities in the empirical variogram. In such cases, if the SVF’s CIs are in accordance with those of the empirical variogram, is it safe to classify the segment as stationary, or should we only do so if the asymptote is reached in both the SVF and the empirical variogram by ~50% zoom? Here are a few examples of varying ambiguity where the SVF doesn’t asymptote until after 50% zoom, if at all.

    1. This SVF doesn't quite reach an asymptote by the end of the variogram, and certainly not by 50%, so I initially classified it as non-stationary, despite a track and behavior that seem range resident. DOF = ~3 (quite small for these animals, and usually indicative of non-stationarity). (image: male20214_w2022_3.jpeg)

    2. This is a middling variogram. It seems that an asymptote appears in the empirical variogram by about 50% zoom, but the SVF doesn't level off until the latter half. I classified this as stationary. DOF = 5 (on the lower end, but not terrible). (image: male1253_su2018_2.jpeg)

    3. To me, this is the easiest to classify as stationary. Despite the oscillations, there seems to be an asymptote in both the empirical variogram and the SVF, even if the SVF doesn't plateau fully until just after 50%. DOF = 22 (probably about average). (image: female20107_su2020_1.jpeg)

  2. In these examples, there are prominent humps or oscillations in the variograms. I suspect these might be the result of relatively few home-range crossings, large taus, and short movement tracks. I know that, in general, some oscillations are fine if they are in agreement with the SVF and its CIs, but these seem a little more ambiguous. With some of the more finely sampled individuals there are definite periodicities (I’ve been playing around with periodic models), so that may be causing some weird patterns as well.

    1. This was one of the less ambiguous ones, and I classified it as stationary. Despite the initial hump, the SVF and variogram both display a prominent asymptote during the latter half of the variogram, and the track looks stationary, too. At 50% zoom, however, the plateau in the variogram wouldn't be as evident, and it might be easier to dismiss this as non-stationary. DOF = ~8. (image: female20093_w2021_1.jpeg)

    2. This variogram is more confusing. I ultimately classified it as stationary given the broad overlap in CIs, apparent asymptote, and stationary track, but I wanted to confirm that the dip in the middle isn't going to affect anything. DOF = 23 (pretty good). (image: female20075_su2022_1.jpeg)

    3. This is similar to the last one, but with a more prominent hump at the beginning. Otherwise, the track and DOF (34) seem very stationary. I called this one stationary. (image: female20049_su2020_2.jpeg)

    4. These are bimodal variograms, and we have been getting them quite often. I think they result from a couple of crossings of narrow, linear home ranges (some of the females set up HRs in narrow valley bottoms). Given the complete lack of an asymptote, I labeled these as non-stationary. (images: female20089_f2020_3.jpeg, female20037_su2020_3.jpeg)

  3. We also had a number of shorter movement tracks that were thrown off by exploratory bouts, bouts that in longer movement tracks would ultimately be averaged over and/or have relatively minor impacts. Some of these are quite egregious and force us to label a track as non-stationary, whereas others are borderline and much more confusing.

    1. The tail to the right on the movement track was a brief exploratory bout, but with the smaller track, it is throwing off the variogram. I called this one non-stationary. DOF = 6. (image: female20059_su2022_2.jpeg)

    2. Not a terrible variogram, but exploratory bouts are throwing things off, creating white space between the SVF and the empirical variogram. I called this one stationary despite the white space between the CIs indicating a less-than-ideal fit. DOF = 18. (image: female20059_su2020_1.jpeg)

    3. I called this one non-stationary. Although there is a home range on the left of the track map, the 1-2 exploratory bouts to the right are throwing things way off in the variogram. DOF = 4. (image: female20049_sp2020_2.jpeg)

    4. In this last one, a home range is definitely present, but the segmentation procedure threw in a couple of migration points, making the whole segment and variogram clearly non-stationary and massively biasing space use high. I included this mostly for comparative/demonstrative purposes since it is clearly non-stationary. (image: female19039_w2021_2.jpeg)

  4. Lastly: I am improving the variograms for round 2, and in doing so I want to properly account for varying sampling schedules. Most of these animals had a sampling interval of 13 hours, but a small subset in one season/year had 1-hour sampling intervals, and others had malfunctioning collars that were more erratic, taking fixes anywhere from every 15 minutes all the way up to >1 day. The 1-hour collars also switched over to 13 hours in the middle of some segments. For the first round, I used dt = c(1 %#% 'hours', 13 %#% 'hours'), and that seemed to work well for most, but I don't know if that was the best way to account for the variability, and that was also before I noticed the 15-minute intervals. Would something like dt = c(15 %#% 'minutes', 1 %#% 'hours', 13 %#% 'hours') be more appropriate (see the sketch below)? Since I am running everything in for() and foreach() loops, I want to be able to provide a single solution, and my attempts at coding something more adaptive have thus far failed given the variability. I know the variogram is just a visual tool and doesn't need to be perfect, but I want to provide the technicians (and myself) with the cleanest variograms I can to make the process easier and faster.
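
For concreteness, the round-2 call I am considering is below (illustration only, with SEG and FIT as placeholders for one segment and its fitted model; whether this dt vector is the right choice is exactly what I am asking):

    library(ctmm)

    # ordered lag-bin widths spanning the finest-to-coarsest sampling schedules
    DT <- c(15 %#% 'minutes', 1 %#% 'hours', 13 %#% 'hours')

    # an ordered dt vector progressively coarsens the lag bins
    SVF <- variogram(SEG, dt=DT, fast=FALSE, CI="Gauss")
    plot(SVF, CTMM=FIT, fraction=0.5)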


Again, sorry for the long post and beating a dead horse here when it comes to variogram interpretation, but considering how vital it is to the whole process, I want to make sure we are doing it correctly! If you don't mind, I may send along a few others that present recurring patterns we are seeing, as well. I think that I am going to summarize everything that I have learned about variogram interpretation over the last year in a blog post or something similar in the near future (with ample examples from elk) to hopefully minimize the amount of confusion for others.


As always, I am immensely grateful for your help!


p.s., sorry for the weird lettering, I wrote this in a word processor and the formatting got weird during import.



Best,
Tyler

Christen Fleming

Nov 3, 2024, 12:05:43 AM
to ctmm R user group
Hi Tyler,

I would also look at the distribution of tau[position] home-range crossing timescale estimates, as the longer of those will often indicate non-resident behavior.
In that regard, if the tau[position] estimates are within the normal range, most of these look okay for small DOF estimates, but not the last one: there is a clear mismatch between the asymptote of most of the variogram and that of the fitted model, likely from the included migratory points.

Best,
Chris

Tyler Hodges

Nov 3, 2024, 6:49:32 PM
to ctmm R user group
Hello Chris,

Many thanks! In that case, maybe I will sort by tau as you initially suggested and play around with cluster() and meta() some.
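
If it helps anyone searching this thread later, my understanding of that workflow is roughly the following (a sketch only; AKDES stands for an assumed named list of akde() home-range estimates, one per segment):

    library(ctmm)

    # AKDES: assumed named list of akde() UD objects, one per segment

    # population-level mean home-range area across segments
    meta(AKDES)

    # partition the segments into sub-populations by home-range area,
    # which may help separate resident-like from non-resident-like segments
    cluster(AKDES)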


Best,
Tyler

Tyler Hodges

Jan 15, 2025, 9:18:37 AM
to ctmm R user group
Good morning, Chris,

I hope you're doing well! I'm continuing to chip away at this elk analysis, and I have what I think is a fairly straightforward question for you. When calculating an averaged seasonal UD weighted by the proportion of time an animal spent within each range, where the ranges themselves are stationary segments returned by segclust2d, should the weights be relative to the entire season length, including the removed non-stationary segments, or only to the combined length of the stationary ranges? For example, if I have a three-month-long season during which an elk occupied a new sub-seasonal range each month, two of which were stationary and one of which was non-stationary, should the weights for averaging the two stationary ranges be 1/3 each or 1/2 each? My thought is that 1/2 makes the most sense, as the non-stationary segments are not being factored into the range estimation at all, but I wanted to double-check.

As always, thanks for your insight!

Best,
Tyler

Tyler Hodges

Jan 21, 2025, 10:39:32 AM
to ctmm R user group
Hello everyone,

It appears that I answered my own question. When non-stationary segments are present, weighting by the proportion of time spent in each range relative to the total season length shrinks the UD isopleths inward, which is definitely not what I am looking for. So, the proportions need to be calculated with respect to the total time spent within stationary ranges only.
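
For anyone following along, a toy example of the weighting (the commented mean() call is an assumption: check whether your ctmm version's mean() method for UD lists accepts a weights argument before relying on it):

    # durations (days) of three sub-seasonal ranges within a 3-month season
    dur        <- c(A=30, B=31, C=30)
    stationary <- c(A=TRUE, B=TRUE, C=FALSE)

    # weights relative to time spent in stationary ranges only
    w <- dur[stationary] / sum(dur[stationary])
    w   # A ~0.49, B ~0.51 (i.e., ~1/2 each rather than ~1/3)

    # then average the corresponding UDs with these weights, e.g.
    # AVG <- mean(UDS[stationary], weights=w)   # UDS: assumed list of akde() UDs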

Thanks!

Best,
Tyler

Tyler Hodges

Jan 30, 2025, 10:16:50 AM
to ctmm R user group
Good morning, everyone,

I have a question about establishing availability in 2nd-order resource selection analyses. Now that I have the aforementioned segments classified by stationarity and have sfHRs produced for the stationary segments, I am moving into the resource selection part of the analysis. Our goal here is to assess selection of forest attributes in sfHRs compared to larger seasonal and annual available areas. I do not want to exclude non-stationary segments from these available areas, as the animals have, at the very least, spent time crossing through them and are familiar with the environmental conditions therein. As such, in a 2nd-order analysis, it makes sense to me to include these non-stationary bouts.

I am trying to find a biologically meaningful way to establish availability but have run into issues with every method I've tried. Despite its glaring shortfalls, I first tried 100% MCPs with a 1-km buffer (sensu "Truly sedentary? The multi-range tactic as a response to resource heterogeneity and unpredictability in a large herbivore", Oecologia), but doing so excluded peripheral portions of especially large sfHRs from the availability estimate, which feels incorrect. I foresee this being an issue with conventional KDEs that draw tight perimeters around the data, too. I also considered using the averaged seasonal AKDEs (see above), but of course, if an animal has any non-stationary segments within a season, those would be excluded. Moreover, the averaged AKDEs still omitted some portions of especially large sfHRs from consideration. I've also looked into some of the availability methods used in works by Hooten and Johnson, but many of those sound like occurrence rather than range distributions.

I think I have seen it suggested here that AKDEs can be used with non-stationary segments, with the caveat that the estimates and CIs are going to be very large given the low effective sample sizes. Would doing so, and then averaging those estimates with the stationary estimates, be reasonable for establishing 2nd-order availability, or do you have any other suggestions? Considering the sensitivity of RSFs and related models to the availability sample, I want to make sure I am being as rigorous as possible in its estimation, but I am currently at a loss as to what the best method is.

As always, thanks for the insight!

Best,
Tyler

Christen Fleming

Feb 14, 2025, 7:40:16 PM
to ctmm R user group
Hi Tyler,

In the ctmm RSFs, there are two sets of parameters: 2nd-order phenomenological parameters (mean and spread) and 3rd-order mechanistic parameters (selection coefficients), and a stationary process is assumed. So for one stationary process from one individual, there isn't much that can be inferred about 2nd order selection. For that you need multiple individuals or multiple home ranges from an individual (or something analogous), where each home range is worth about 1 DOF.

Best,
Chris

Tyler Hodges

Feb 15, 2025, 8:31:44 PM
to ctmm R user group
Good evening, Chris,

Thanks for the response! In this case, the 2nd-order selection that I am referring to is that defined by Johnson (1980): how an animal places its home range within the landscape. However, given the sub-seasonal nature of this analysis, rather than comparing covariates within the sfHR to the entire study region (as is often done for 2nd-order analyses), I think it makes sense to instead use the animal's seasonal range as the available area. The issue is that I am unaware of any widely available and statistically sound way to establish the available area for individuals that switched between stationary and non-stationary behavior in the same season. Since ctmm only implements 3rd-order RSFs, my plan is to model selection with a PPM or a GLMM/GAMM once I have settled on a method to estimate the available area. For now, I am just using an MCP buffered by the maximum 13-hour step length for each individual-season combination (sketched below).
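
In case it helps anyone else, the interim availability polygon is built roughly like this (a sketch using sf; pts stands for an assumed projected sf POINT object for one individual-season, ordered in time):

    library(sf)

    # pts: assumed projected (metres) sf POINT object for one individual-season,
    # ordered in time at a ~13-hour fix interval
    xy    <- st_coordinates(pts)
    steps <- sqrt(diff(xy[,1])^2 + diff(xy[,2])^2)   # approximate 13-hour step lengths (m)

    # 100% MCP (convex hull of all fixes) buffered by the maximum observed step length
    mcp   <- st_convex_hull(st_union(pts))
    avail <- st_buffer(mcp, dist=max(steps))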

Thanks again!
Tyler
