First, thanks for releasing COLMAP and continuing to update it! Sorry for asking multiple questions in the same thread, and for the verbosity.
I'm using a DJI Phantom 4 and have run many experiments using video at different sample rates (15fps, 5fps, 3fps, 2fps, 1fps), with software such as Altizure and DroneDeploy for path planning. For the video paths I used Litchi in waypoint mode.
Calibration and bundle adjustment:
I'm really struggling to get a good camera calibration. I've had the best success using the RADIAL camera model, but I usually hit a wall with bundle adjustment once I approach 900-1000 registered images. It runs with 10-14 global iterations and a single local iteration, but suddenly jumps to the maximum of 50 global and 200 local iterations after crossing that magic threshold. With the OPENCV camera model it never hits many local iterations, but it always results in a convex- or concave-warped sparse model; this is easy to spot because the area I'm testing is very flat. I can't get the model to initialize at all with FULL_OPENCV.
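In case it matters, this is roughly the command sketch I've been working from; the option names are from my COLMAP build and may differ slightly in yours (`colmap mapper --help` lists them), and the paths and iteration limits are placeholders, not recommended values:

```shell
# Force a single shared RADIAL camera across all frames (same drone, same lens).
colmap feature_extractor \
    --database_path database.db \
    --image_path frames/ \
    --ImageReader.camera_model RADIAL \
    --ImageReader.single_camera 1

# Cap bundle adjustment iterations so a runaway BA doesn't stall the run;
# the limits below are guesses for experimentation, not tuned values.
colmap mapper \
    --database_path database.db \
    --image_path frames/ \
    --output_path sparse/ \
    --Mapper.ba_global_max_num_iterations 30 \
    --Mapper.ba_local_max_num_iterations 10
```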
Radial parameters:
Matching:
For the video-based testing I used sequential matching, which produces excellent results, but I can't get it to perform loop closure. That is most likely due to my flight plan more than anything else, but it is still frustrating, as I can get ORB_SLAM2 to detect the loop closures. I've had to resort to pulling flight logs from the bird, writing the latitude, longitude, and altitude into each frame's EXIF, and then using spatial matching. This works, but there is more noise in the sparse model than there was with sequential matching. I have been using default settings for everything (except min_matches, which I've pushed up to 80 from the default of 15 based on spot checks with your feature matching GUI).
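For anyone else hitting this, the knob I've been experimenting with is the vocabulary-tree loop detection built into the sequential matcher. A sketch, with `vocab_tree.bin` standing in for one of the pretrained trees from the COLMAP downloads page and the period/overlap values as placeholders:

```shell
# Turn on vocabulary-tree loop detection during sequential matching.
colmap sequential_matcher \
    --database_path database.db \
    --SequentialMatching.overlap 10 \
    --SequentialMatching.loop_detection 1 \
    --SequentialMatching.loop_detection_period 10 \
    --SequentialMatching.vocab_tree_path vocab_tree.bin
```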
Video flight plan:
The flight plan is double-helix shaped, which means that with sequential matching the first loop closure won't happen until at least 50% of the way through the flight. The start and end points are at the top of the NADIR image, and the first loop closure would be at the bottom intersection of the NADIR image. I would expect at least 12 closures along that path. Without the closures the camera positions diverge and I get ghosting-like effects. By the way, the model below was 8,525 images with ~2.27M points. I'm currently running the stereo process but don't expect it to finish for several more days.
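The flight-log-to-EXIF workaround I mentioned under Matching looks roughly like this, in case it helps anyone; it's a sketch using exiftool, and the coordinates and filename are placeholders, not values from my actual logs:

```shell
# Write one frame's position from the flight log into its EXIF GPS tags.
exiftool -GPSLatitude=30.2625 -GPSLatitudeRef=N \
         -GPSLongitude=97.7500 -GPSLongitudeRef=W \
         -GPSAltitude=120 frame_000123.jpg

# Then match by position instead of sequence.
colmap spatial_matcher --database_path database.db
```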
NADIR:

Perspective:

Video: https://www.dropbox.com/s/mfendo581rq8p80/out.mp4?dl=0
Dense sample:
This was a small test using sequential matching at 15fps and only half of the first loop. This is the dense point cloud (~77M points); I never generated a mesh.


If you look closely at the second image, at the pool in the bottom-middle portion, you can see it even triangulated the underwater entry/exit steps!
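For completeness, the dense pipeline I'm running is roughly the standard COLMAP sequence; the paths here are placeholders and I've left the stereo options at their defaults:

```shell
# Undistort images into a dense-reconstruction workspace.
colmap image_undistorter \
    --image_path frames/ \
    --input_path sparse/0 \
    --output_path dense/

# PatchMatch stereo, then fuse the depth maps into a point cloud.
colmap patch_match_stereo --workspace_path dense/
colmap stereo_fusion --workspace_path dense/ --output_path dense/fused.ply
```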