Hi Chris!
Thanks for your quick reply. It's really cool to hear that you applied for the quantitative ecologist position! I hope you find something this year; I can only imagine how hard it is to find a job during the pandemic.
Thanks for the information on the cores argument. I totally missed that it only works on non-Windows computers. I am playing around with a foreach loop now (great suggestion by Genevieve Finerty) on a small subset of the data, before I run it on all of the tracks and really hog my computer's time. If I want to print progress updates (like "processing file 1 of 1881"), I assume I need to put that inside the foreach loop. Or does that not work with parallel processing?
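To make the question concrete, here is a minimal sketch of the kind of loop I mean. It assumes the doParallel backend; process_trip() and the file names are placeholders, not my real code. My understanding is that outfile = "" in makeCluster() is what lets worker messages reach the console, but I am not sure that holds in every environment.

```r
library(foreach)
library(doParallel)

# Placeholder for the real per-trip work (e.g. reading a file and fitting a model).
process_trip <- function(path) {
  nchar(path)
}

# Hypothetical file names standing in for the real trip files.
files <- sprintf("trip_%04d.csv", 1:8)

# outfile = "" asks the workers to send their output to the master console;
# some GUIs (e.g. RGui on Windows) may still not display it.
cl <- makeCluster(2, outfile = "")
registerDoParallel(cl)

# Each iteration reports its own index; with parallel workers the
# messages can arrive out of order.
results <- foreach(i = seq_along(files)) %dopar% {
  message(sprintf("processing file %d of %d", i, length(files)))
  process_trip(files[i])
}

stopCluster(cl)
```

With a real workload, process_trip() would be replaced by the per-trip model fitting, and files by the 1881 track files.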
Briefly about the data: Data were sampled with a GPS logger every 3 or 5 minutes, depending on the logger and breeding season. I have data from approximately 700 birds, with multiple trips for each bird across five different breeding seasons. Because of how the data were collected, I was processing each trip separately. The birds are central-place foragers, so a trip starts and ends at the colony. I have removed GPS data from when the birds are on the nest, so that I only input data from their foraging trips. However, they do switch behaviors during a foraging trip; because some trips last for days, they often rest on the water at night and move during the day. Some trips are incomplete, but most are complete. I remember that in earlier emails we had talked about how current ctmm modeling would only work for these complete trips.
My main goals are to get a bird's distance travelled during a foraging trip, and to get an estimated instantaneous speed to be able to calculate a bird's airspeed. Is MSPE = "velocity" the best approach for those goals?
I had tried setting IC = NA and was getting more warning messages than when I set IC = AICc:
1: In ctmm.fit(data, GUESS, trace = trace2, ...) :
pREML failure: indefinite ML Hessian or divergent REML gradient.
2: In ctmm.fit(data, GUESS[[1]], trace = trace2, ...) :
pREML failure: indefinite ML Hessian or divergent REML gradient.
From looking at the ctmm pdf, it doesn't seem like this is necessarily a concern, but it probably isn't good if that happens for most of the trips? Would I expect the selected model to be the same whether I use IC = NA or IC = AICc? When you say fitting the same movement model to the subsetted data, do you mean that instead of ctmm.select() I would use ctmm.fit() and specify the model within the CTMM argument? The most common model for these trips seems to be OUF isotropic, but I occasionally see OUf anisotropic.
When you suggest fitting a model to all the data and using that as my guess for the subset fitting, do you mean ALL 1881 tracks? Or should I do it on a per-bird basis (a slightly larger subset than subsetting by trip)?
Re: prior, my understanding is that prior = TRUE is preferred for accuracy, and that combining it with fast = TRUE keeps it accurate while making it faster. Is that right? In what cases would it fail at the boundaries? Is that with a smaller sample size or coarser sampling? prior = TRUE, fast = FALSE is incredibly slow (i.e., 3-4x slower), but I guess that should become less of a problem with the foreach loop.
Thanks for your help,
Jenny