I have also always found the shapes.txt system a little odd and potentially confusing to new producers. My understanding is that it was adapted to odometer-based vehicle tracking systems that essentially counted the number of wheel revolutions, before low-cost high-precision combined GPS and inertial navigation systems were common.
Note that the distance units are arbitrary: they may be kilometers, or miles, or bus-wheel-circumferences. They don't even need to be specified, and really only need to match between stop_times and shapes. These distances are then used to slice the full-length shape into segments between stops, without the consumer needing to use heuristics to project the stop point onto the shape.
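To make the slicing concrete, here is a rough sketch of how a consumer might cut out the piece of a shape between two stops. It is only an illustration: the function names and the (lat, lon, dist) tuple layout are my own assumptions, and it interpolates linearly in lat/lon, which is fine for short segments but is not a proper geodesic calculation.

```python
from bisect import bisect_right

def point_at(shape, d):
    """Interpolate the (lat, lon) at distance d along the shape.

    shape is a list of (lat, lon, shape_dist_traveled) tuples sorted by
    distance. This layout is an assumption made for the example.
    """
    dists = [p[2] for p in shape]
    i = bisect_right(dists, d)
    if i == 0:                      # before the first point: clamp
        return shape[0][:2]
    if i == len(shape):             # at or past the last point: clamp
        return shape[-1][:2]
    (lat0, lon0, d0), (lat1, lon1, d1) = shape[i - 1], shape[i]
    t = (d - d0) / (d1 - d0)        # d0 <= d < d1, so no division by zero
    return (lat0 + t * (lat1 - lat0), lon0 + t * (lon1 - lon0))

def slice_shape(shape, d_start, d_end):
    """Return the polyline between two shape_dist_traveled values."""
    inner = [(lat, lon) for lat, lon, d in shape if d_start < d < d_end]
    return [point_at(shape, d_start)] + inner + [point_at(shape, d_end)]

# Units are whatever the feed uses; they only have to match stop_times.
shape = [(0.0, 0.0, 0.0), (0.0, 1.0, 10.0), (1.0, 1.0, 20.0)]
segment = slice_shape(shape, 5.0, 15.0)
```

Here d_start and d_end would be the shape_dist_traveled values of two consecutive stop_times.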
I just checked the Google GTFS docs as well as gtfs.org, and I do not think either place specifies what exactly it means for shape_dist_traveled to be optional. I can tell you the interpretation within OpenTripPlanner: if the first stop_time in a trip has a shape_dist_traveled, then all other stop_times in that trip are just assumed to have them, i.e. shape_dist_traveled is optional on stop_times but only at the full-trip scale. On the shapes.txt side, if any point in a shape does not have a shape_dist_traveled, all other shape_dist_traveled are dropped from that shape, i.e. shape_dist_traveled is optional but only at the full-shape scale.
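Expressed as code, the all-or-nothing rule I described looks roughly like the following. To be clear, this is not OpenTripPlanner's actual code, just my paraphrase of the behavior; the function names are made up, and I am assuming rows have been parsed into dicts where a missing value is None.

```python
def trip_uses_distances(stop_times):
    """stop_times: one trip's rows as dicts, in stop_sequence order.

    Only the first row is inspected; the rest of the trip is assumed
    to follow suit.
    """
    return bool(stop_times) and stop_times[0].get("shape_dist_traveled") is not None

def shape_distances(shape_points):
    """shape_points: one shape's rows as dicts, in shape_pt_sequence order.

    Returns the list of distances, or None if any point lacks one, in
    which case all distances for the shape are discarded.
    """
    dists = [p.get("shape_dist_traveled") for p in shape_points]
    return None if any(d is None for d in dists) else dists

complete = [{"shape_dist_traveled": 0.0}, {"shape_dist_traveled": 4.2}]
partial = [{"shape_dist_traveled": 0.0}, {"shape_dist_traveled": None}]
```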
Again I see nothing official to back up this interpretation. If the only function of shape_dist_traveled is slicing the shape at stops, it seems adequate to have shape_dist_traveled only at the points immediately next to stops, and they wouldn't even need to reflect distances. Any monotonically increasing sequence of numbers (even the stop_sequence) would serve that purpose.
The catch is that there are not necessarily points on the shape near each stop. I think the whole system is designed so one person or system can create a shape (perhaps tracing or deriving from street data) and someone else can place stops later without regard for where the points are within the shape. Still, having shape_dist_traveled on only a few points should allow interpolating all the others pretty accurately.
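As a sketch of what that interpolation could look like: fill in each missing value in proportion to the geometric length of the shape between the two nearest points that do have one. The function name and the (x, y, dist) layout are hypothetical, and I am assuming the coordinates are already projected to a planar system so that plain Euclidean length is meaningful. Points before the first known value or after the last one are left as None.

```python
import math

def fill_distances(points):
    """points: list of (x, y, dist_or_None) in shape order.

    Returns a list of distances with the gaps between known values
    filled by interpolating along the polyline's geometric length.
    """
    # Cumulative geometric length at each point of the polyline.
    cum = [0.0]
    for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))

    known = [i for i, p in enumerate(points) if p[2] is not None]
    if len(known) < 2:
        raise ValueError("need at least two points with a distance")

    out = [p[2] for p in points]
    for a, b in zip(known, known[1:]):
        span = cum[b] - cum[a]
        for i in range(a + 1, b):
            # Fraction of the way from anchor a to anchor b, by length.
            t = (cum[i] - cum[a]) / span if span else 0.0
            out[i] = points[a][2] + t * (points[b][2] - points[a][2])
    return out

pts = [(0, 0, 0.0), (3, 4, None), (6, 8, None), (6, 18, 100.0)]
filled = fill_distances(pts)
```

Note that the known values do not have to be in the same units as the coordinates; only the proportions along the shape matter.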
In the case where there is one line segment in the shape unambiguously closest to each stop on a trip using that shape, the shape_dist_traveled seems redundant. The simplest solution in that (common) case is to not use shape_dist_traveled at all, and let the consumer project the stops onto the shape.
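For that unambiguous case, the projection a consumer has to do is fairly simple. Here is a minimal sketch, again with made-up names and assuming planar (x, y) coordinates. It just picks the globally nearest segment, so it does nothing clever for loops or out-and-back routes, where the nearest segment is not necessarily the right one.

```python
import math

def project_onto_segment(p, a, b):
    """Return (offset from p to segment ab, fraction along ab of the nearest point)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        t = 0.0                      # degenerate segment: a == b
    else:
        t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
        t = max(0.0, min(1.0, t))    # clamp to the segment's ends
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy)), t

def project_stop(shape, stop):
    """Return (segment index, fraction along it, distance along shape)
    for the point on the shape nearest to the stop."""
    best = None
    travelled = 0.0
    for i, (a, b) in enumerate(zip(shape, shape[1:])):
        seg_len = math.hypot(b[0] - a[0], b[1] - a[1])
        offset, t = project_onto_segment(stop, a, b)
        if best is None or offset < best[0]:
            best = (offset, i, t, travelled + t * seg_len)
        travelled += seg_len
    return best[1:]

shape_xy = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
seg_index, fraction, along = project_stop(shape_xy, (11.0, 5.0))
```

The distance along the shape that comes out of this is effectively a computed shape_dist_traveled, which is why the field is redundant when the projection is unambiguous.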