Kriging occurrence to fill in missing locations


Newbie

Mar 31, 2019, 6:36:03 PM
to ctmm R user group
Hi,

As a newbie to ctmm and to movement modeling in general, I apologize in advance for my ignorance of this topic. I read your paper "where and how animals move" with great interest, especially how the kriged occurrence distribution appears to resolve missing points with greater integrity than other methods.

My collars were set to sample in alternating bouts of 5-minute fixes and 30-minute fixes over a 12-hour period daily. Within the good dataset there are a few missed fixes. Can ctmm's kriged occurrence be used to interpolate and fill in those missing points, or does occurrence only work on regularized sampling intervals?

However, because of missed fixes and lost collars, a large chunk of my data has irregular sampling intervals, ranging from a fix every 14 minutes to a fix every 18 hours. Can I use ctmm's occurrence to fill in the missing times in these gaps and produce a regularized trajectory to work with?

Thanks in advance!

nabar...@gmail.com

Apr 1, 2019, 1:52:31 PM
to ctmm R user group
Also, is it necessary to run the occurrence function on the 5-minute fix-rate section first to fill in its missing times, and then run it again on the 30-minute fix-rate section, or is it perfectly okay to run it on the entire dataset (both sections at once) and let the function fill in whatever is necessary?

Christen Fleming

Apr 1, 2019, 4:33:44 PM
to ctmm R user group
Hi,

The three functions you would be interested in are occurrence(), predict(), and simulate().

Kriging in ctmm is done in continuous time, so irregular sampling is not an issue for any of these functions to operate correctly. If you want to predict where the individual was at a specific time, then the method you want is predict(). However, the autocorrelation structure of predictions is smoother than the autocorrelation structure of real trajectories. If you want to regularize your data for an analysis that assumes even sampling, then you want to use simulate() many times, running the analysis on each simulation to capture the uncertainty of not knowing where the individual was at the times of interest. However, you still do not want to use simulations too far from your data, as those will reflect the movement model more than the data.
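
To make that concrete, here is a minimal R sketch of conditional prediction versus conditional simulation on a regular time grid. It assumes you already have a telemetry object DATA and a fitted movement model FIT (e.g. from ctmm.guess() and ctmm.select()); the object names and the hourly grid are placeholders, not part of the original answer.

```r
library(ctmm)

# assumed inputs (placeholders):
# DATA - a telemetry object with irregular sampling
# FIT  - a fitted ctmm movement model, e.g.
#        ctmm.select(DATA, ctmm.guess(DATA, interactive = FALSE))

# a regular hourly grid spanning the sampled period (telemetry times are in seconds)
t.grid <- seq(min(DATA$t), max(DATA$t), by = 3600)

# kriged (smoothed) locations: best estimates of where the animal was,
# but smoother than a real trajectory
PRED <- predict(FIT, data = DATA, t = t.grid)

# one conditional simulation: a plausible trajectory consistent with both the
# data and the fitted model, with realistic roughness
SIM <- simulate(FIT, data = DATA, t = t.grid)
```

predict() gives the single best smoothed path (with uncertainty), while each simulate() call draws a different plausible path; the irregular gaps in DATA are handled in continuous time either way.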

occurrence() returns a distribution of where the animal might have been during the sampling period (or some range of times). Occurrence distributions of all kinds are sensitive to the sampling schedule by design. There are some gap-skipping arguments in occurrence() that I just updated on GitHub. These are used to keep the occurrence distribution tight to the data, as large gaps will produce large blobs of uncertainty when you didn't know where the individual was... and people are usually not interested in visualizing that. If you want to compare occurrence distributions in an apples-to-apples way, you might consider tweaking the gap-skipping arguments to make them more comparable, if possible. occurrence() doesn't return a trajectory, however. It is a distribution that (if I can steal a thought from John Fieberg) is more like the confidence intervals around the full trajectory.
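
For completeness, the basic occurrence call looks like the sketch below (same placeholder DATA and FIT as above). The exact names of the gap-skipping arguments Chris mentions are not assumed here; see help("occurrence") in your installed version.

```r
library(ctmm)

# occurrence distribution over the sampling period, conditional on DATA and FIT
UD.occ <- occurrence(DATA, FIT)

# large sampling gaps show up as diffuse blobs of uncertainty in the plot;
# the gap-skipping arguments documented in help("occurrence") can trim those
plot(UD.occ)
```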

I hope that answers your questions.

Best,
Chris


nabar...@gmail.com

Apr 2, 2019, 1:06:47 PM
to ctmm R user group
Hi Chris

Wow, your answers literally shone a bright light into the muddled grey waters I was trying to peer through!  

I see now that simulate() is more appropriate for what I would like to do with the data. How many is "many times" before the simulated data can be considered a confident substitute for the missing points?

Thanks.

Christen Fleming

Apr 2, 2019, 6:16:44 PM
to ctmm R user group
Hi,

In a loop, you want to run simulate(), run the discrete-time analysis, and store the output in an array/list. You can stop the loop when the standard error of the mean output is much smaller than the variance of the outputs. The relationship should be like 1/sqrt(N). The threshold is up to you, but 1% error will take ~100^2 simulations, etc.
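
A minimal sketch of that loop in R, assuming the same placeholder DATA and FIT as above, and a hypothetical function my_analysis() standing in for whatever discrete-time analysis returns one numeric summary per trajectory:

```r
library(ctmm)

# my_analysis() is a hypothetical stand-in for your discrete-time analysis;
# it should take one regular-interval track and return a single number
t.grid <- seq(min(DATA$t), max(DATA$t), by = 3600)  # hourly grid (seconds)

out <- c()
repeat {
  SIM <- simulate(FIT, data = DATA, t = t.grid)  # one conditional simulation
  out <- c(out, my_analysis(SIM))                # store this run's output

  N  <- length(out)
  SE <- sd(out) / sqrt(N)  # Monte Carlo error of the mean, shrinking like 1/sqrt(N)
  # stop once the Monte Carlo error is small relative to the spread of outputs;
  # 5% here needs ~400 runs, 1% would need ~10,000 by the 1/sqrt(N) scaling
  if (N >= 20 && SE <= 0.05 * sd(out)) break
}

mean(out)  # final estimate, averaged over simulations
```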

Best,
Chris

joshuap...@gmail.com

Nov 4, 2021, 6:22:26 PM
to ctmm R user group
Hi Chris,

Tagging onto this, hope you don't mind. I have been working in ctmm for a while and love it, and I am using it as the backbone of a current analysis of black bear space use. I want to use hidden Markov models (HMMs) to examine how landscape heterogeneity affects the frequency of different latent behavioural modes in my seasonal movement data. For this I need consistent step intervals (the GPS was set to record hourly, but there are often gaps, e.g. when a bear enters its den or goes into deep forest). At first I thought my best approach was to use predict() at a series of consistent times (say every hour or two) from my ctmm model, but from my reading here it seems simulate() will be the way to go.

In your response above you say " run simulate(), run the discrete-time analysis, store the output in an array/list. You can stop the loop when the standard error of the mean output is much smaller than the variance of the outputs".

Does this apply to HMMs as well? In my case, is the discrete-time analysis the HMM itself? So I would run one simulation of a seasonal range, then run an HMM on that single simulation. Is my next simulation again from the original ctmm model, with the averaging done across simulations at the end, or is it applied to the previous simulation's output?

All the best

Josh

Christen Fleming

Nov 5, 2021, 10:37:21 AM
to ctmm R user group
Hi Josh,

Yes, you would be averaging over the final discrete-time analysis output—so whatever you are extracting from the HMM. This gives you an approximate account of the missing uncertainty from sampling irregularity or (if you modeled it) location error in a discrete-time analysis that can't handle those issues. In Appendix S1 of https://www.biorxiv.org/content/10.1101/2020.06.12.130195v2 I also describe how you can compare those results to results conditional on the smoothed data (from predict), to investigate bias as well. Just using the conditional prediction is better than nothing, but there is missing variance and you don't have a great idea of what the bias is.
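
As a rough illustration of that workflow (my sketch, not Chris's code), the loop below fits the HMM once per conditional simulation and averages the extracted quantity at the end; fit_hmm() and the extracted state proportions are hypothetical placeholders for whatever HMM package and summary you are using, and the predict()-based fit is the conditional comparison mentioned above.

```r
library(ctmm)

# hypothetical helper: fit the HMM to one regular-interval track and return
# the quantity of interest as a vector (e.g. the proportion of time in each
# latent state)
# fit_hmm <- function(track) { ... }

t.grid <- seq(min(DATA$t), max(DATA$t), by = 3600)  # hourly grid (seconds)

# each simulation is drawn from the same fitted ctmm model (FIT) conditioned on
# the original data (DATA) -- never from a previous simulation's output
nsim <- 400
hmm.out <- replicate(nsim, fit_hmm(simulate(FIT, data = DATA, t = t.grid)))
mc.estimate <- rowMeans(hmm.out)  # average over simulations at the end

# comparison conditional on the smoothed (kriged) data, to gauge bias
smooth.estimate <- fit_hmm(predict(FIT, data = DATA, t = t.grid))
```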

Best,
Chris