I have a very large input dataset that covers the contiguous U.S. After running it through SAHM and producing the merged dataset (MDS), I have over 100k presence points and 550k background locations collapsed to 1 km pixels (actual traps plus some simulated background for areas not reporting traps). When I inspected the model output (MARS), I noticed that the confusion matrix reports far fewer absence (background) points than I supplied, even though the correct number appears in the MDS shapefiles. The model probability surface also shows excessively high likelihoods throughout the map, so it seems that the majority of my background points were dropped before fitting, leaving the model unconstrained. Is there an option to increase the number of data points used for model fitting, or do you have other recommendations (such as a spatial thinning algorithm)?
FYI, when I added up the number of points retained in the model, the total was 106,496, which is not an obvious round number.
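
To make the spatial thinning idea concrete, here is a minimal sketch of what I have in mind: keep one randomly chosen point per coarse grid cell. It is plain Python and independent of SAHM; the function name thin_by_grid, the 10 km cell size, and the sample coordinates are all made up for illustration, and it assumes the points are in a projected coordinate system with units of meters.

```python
import math
import random
from collections import defaultdict


def thin_by_grid(points, cell_size, seed=0):
    """Keep one randomly chosen point per square grid cell.

    points    -- iterable of (x, y) tuples in projected units (assumed meters)
    cell_size -- edge length of a thinning cell, in the same units
    seed      -- seed for the random choice, so the result is reproducible
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for x, y in points:
        # Integer cell index of the point along each axis.
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append((x, y))
    # Sort the cells so the output order does not depend on input order.
    return [rng.choice(members) for _, members in sorted(cells.items())]


if __name__ == "__main__":
    # Hypothetical background points; the first three share one 10 km cell.
    background = [
        (500.0, 500.0),
        (1500.0, 700.0),
        (9000.0, 9500.0),
        (12000.0, 3000.0),
    ]
    thinned = thin_by_grid(background, cell_size=10_000.0)
    print(len(background), "->", len(thinned))
```

Since my points are already collapsed to 1 km pixels, the thinning cell would need to be coarser than that to have any effect.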
Thanks.
Gericke