Issues importing data and error=0 in FITS


Francisco Castellanos

Aug 28, 2020, 2:29:33 AM8/28/20
to ctmm R user group
Hello,

I've been trying to use the ctmm package for a couple of weeks now and I've been encountering some issues that I'm not able to solve, so I believed it was time to get some help.

My main problem is that when I try to select a fitted model, the estimated error=0, and this causes trouble later on when I try to calculate speeds or distances traveled. Only once did the model estimate error=0.001755288, and I was able to get speeds, but when I then tried to use a loop to calculate distances, the model estimated error=0 again and I was stuck once more. I get lots of errors and warnings when I try to use the loop to estimate distances, such as:

Error in emulate.ctmm(CTMM, data = data, fast = fast, ...) :
  fast=TRUE (CLT) not possible when minor = 0
In addition: Warning messages:
1: In cov.loglike(DIFF$hessian, grad) :
  MLE is near a boundary or optimizer failed.
2: In speed.ctmm(CTMM, data = object, level = level, robust = robust,  :
  Sampling distribution does not always resolve velocity. Try robust=TRUE.
3: In cov.loglike(hess, grad) :
  MLE is near a boundary or optimizer failed.
4: In ctmm.fit(data.subset, CTMM = guess) :
  pREML failure: indefinite ML Hessian or divergent REML gradient.
5: In cov.loglike(DIFF$hessian, grad) :
  MLE is near a boundary or optimizer failed.
6: In speed.ctmm(CTMM, data = object, level = level, robust = robust,  :
  Movement model is fractal.
7: In cov.loglike(hess, grad) :
  MLE is near a boundary or optimizer failed.

On the other hand, the summary of the fits object which didn't have an error=0, was:

$name
[1] "OUF anisotropic error"

$DOF
     mean      area     speed
495.70492 916.45006  19.62854

$CI
                               low       est       high
area (square kilometers) 10.039029 10.722062  11.427256
τ[position] (hours)       3.967633  4.487266   5.074953
τ[velocity] (minutes)     7.310652 15.394301  32.416329
speed (kilometers/day)   18.814049 24.135617  29.446499
error (milimeters)        0.000000  1.755288 160.790500

I've been setting error=TRUE after creating the guess object in order to run ctmm.select(). I'm using ctmm.select() because I already checked in this group that if I get the warning message "In ctmm.fit(data, GUESS, trace = trace2, ...) : pREML failure: indefinite ML Hessian or divergent REML gradient.", then I have to use ctmm.select(). However, if I use ctmm.select(), I'm also getting the same warning, so I don't know what could possibly be going wrong. I've already run the buffalo and turtle data included in the package and I have no trouble following the vignette. Therefore, I believe something may be wrong with my .csv file.

I downloaded the .csv file from MoveBank, so I don't think it's a format issue.
My dataset contains DOP values only, so when I import my data using as.telemetry(), I get this message:

"HDOP values not found. Using ambiguous DOP. VDOP not found. HDOP used as an approximate VDOP. Minimum sampling interval of 1.96 hours"

Moreover, I have two columns containing timestamps: one named "timestamp", and another called "study.local.timestamp", which contains the date and time of my study area. I'm deleting the study.local.timestamp column before reading the file with as.telemetry(), and I'm just keeping the UTC timestamp, which I make sure to have as POSIXct. Maybe that could be the problem? Or the fact that I only have DOP values?
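In case it helps, this is roughly what my import step looks like (the filename "tracking.csv" is a placeholder for my actual file; as.telemetry() auto-detects Movebank-style column names):

```r
library(ctmm)

# Hypothetical filename. as.telemetry() assumes UTC timestamps by default,
# and with only a generic "DOP" column present it falls back on the
# ambiguous-DOP behavior that produces the warning quoted above.
DATA <- as.telemetry("tracking.csv")
summary(DATA)
```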

I'll be happy to send you an email and share my dataset and code if you think that could make things easier for you to help me out. I want to thank you in advance for your help.

Regards,

Francisco

Christen Fleming

Aug 28, 2020, 8:56:01 PM8/28/20
to ctmm R user group
Hi Francisco,

I'll start by linking our pre-print, which is still in review: https://www.biorxiv.org/content/10.1101/2020.06.12.130195v1 . It has a lot of relevant information.

I don't recommend simultaneously estimating the error parameters, except as a last resort. If you have any calibration data or can collect any calibration data, I would recommend that instead.
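As a sketch of what that looks like, assuming you can record some fixes from the same device model at a known stationary location (the filename below is hypothetical):

```r
library(ctmm)

# "calibration.csv": hypothetical file of fixes from a stationary device
CALIB <- as.telemetry("calibration.csv")
UERE  <- uere.fit(CALIB)   # estimate RMS error (per unit DOP) from the calibration data
summary(UERE)

# apply the calibration to the tracking data before model fitting
DATA <- as.telemetry("tracking.csv")
uere(DATA) <- UERE
```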

error=TRUE is more for calibrated data, like Argos Doppler-shift, or data with an error estimate in meters, like e-obs. Your model fit summary has an error parameter, so your data do not appear to be calibrated, and the estimate does not appear to be ~1 meter. In that case, I would suggest error= an initial guess, which would be something more like 10 meters for most devices. What error information does your data have?
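For example, something along these lines, where error=10 supplies a 10-meter initial guess for the error parameter rather than treating the data as calibrated (the object names are placeholders):

```r
library(ctmm)

# error=10 meters: an initial guess for the error parameter,
# to be estimated simultaneously with the movement model
GUESS <- ctmm.guess(DATA, CTMM = ctmm(error = 10), interactive = FALSE)
FIT   <- ctmm.select(DATA, GUESS, trace = 2)
summary(FIT)
```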

The error parameter estimate you have there (assuming HDOP values on a normal scale) is very small and uncertain. What was the problem that you were running into with error turned off? I'd be surprised that 0-16 centimeter error would give any improvement on kilometer scale data that I assume are not very finely sampled.

The R error there "fast=TRUE (CLT) not possible when minor = 0" is just telling you that the assumptions behind fast=TRUE (the central limit theorem approximation) were not met and you would need to switch to fast=FALSE to proceed.
"Sampling distribution does not always resolve velocity. Try robust=TRUE." is telling you that your sampling distribution has infinities and so you have to use robust=TRUE to proceed.
The other warnings there look like what you would expect in the regime you are in with parameters near boundaries, which causes statistical issues.
The pREML failure warning is fine/expected, as you have parameters near boundaries—both error and tau[velocity] are close to zero.
These warnings are indicating that some parameters are not well resolved, like tau[velocity], which is the cause of the two R errors you ran into. Switching to fast=FALSE and robust=TRUE might help you squeeze blood from this stone; calibration data with a good error model would probably help more.
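Concretely, that would look something like this (FIT and DATA standing in for your fitted model and telemetry object):

```r
library(ctmm)

# fast=FALSE avoids the CLT approximation that failed with minor = 0;
# robust=TRUE uses median-based summaries when the sampling
# distribution contains infinities
speed(FIT, DATA, fast = FALSE, robust = TRUE)
```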
Looking at the DOF estimates in your model fit, DOF[speed] is an order of magnitude smaller than that of location and area. I'm guessing that your data are fairly coarse compared to the estimated tau[velocity] of 7-32 minutes, which has a large relative uncertainty, because it is barely resolved.

DOP values are ambiguous; they could be HDOP, VDOP, PDOP, etc. More information on this is in the pre-print. So that warning is telling you what assumptions are being made on import.

UTC timestamps are the default assumption, so that sounds good.

Best,
Chris