Great question! :-)
I agree it would be intuitive for "peer-app-limited" situations to be considered the same as "local-app-limited". In fact, when we were developing BBR we initially did that: we treated "peer-app-limited" situations as "application-limited". However, this did not work well in practice. What we found was that a significant proportion of TCP receivers on the public Internet spent nearly the entire connection lifetime in a "peer-app-limited" (aka receive-window-limited) state, probably because of low static receive buffer limits or disabled/non-existent receive buffer autotuning. This caused several problems for these flows:
(a) The flows never exited Startup, because there were never enough non-application-limited bandwidth samples to have confidence that the connection had discovered the available bandwidth. And as a consequence, the flow continued to pace with a high pacing_gain and keep a large cwnd with a BDP or more of data in the bottleneck queue, perpetuating the condition of being receive-window-limited rather than cwnd-limited.
(b) The flows never reduced their estimated available bandwidth, again because there were never non-application-limited bandwidth samples.
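To make problems (a) and (b) concrete, here is a simplified sketch (not BBR's actual code; names and thresholds are illustrative, and the windowed max filter is reduced to an all-time max) of why a flow whose delivery-rate samples are all flagged application-limited can neither finish Startup's plateau detection nor have its samples lower the estimate:

```python
class BwEstimator:
    def __init__(self):
        self.max_bw = 0.0          # bandwidth estimate (simplified to an all-time max)
        self.full_bw = 0.0         # baseline used to detect a bandwidth plateau
        self.full_bw_count = 0     # consecutive rounds without ~25% growth
        self.filled_pipe = False   # True => confident we found the bottleneck bw, exit Startup

    def on_sample(self, bw, app_limited):
        # App-limited samples may understate capacity, so they are only
        # allowed to *raise* the estimate, never to lower it or to serve
        # as evidence that bandwidth has stopped growing.
        if bw > self.max_bw:
            self.max_bw = bw
        if app_limited:
            return                       # no plateau evidence from this sample
        if self.max_bw >= self.full_bw * 1.25:
            self.full_bw = self.max_bw   # still growing: restart plateau detection
            self.full_bw_count = 0
        else:
            self.full_bw_count += 1
            if self.full_bw_count >= 3:
                self.filled_pipe = True  # 3 non-growing rounds => pipe is full

est = BwEstimator()
for _ in range(100):
    # Receive-window-limited flow: every sample is flagged app-limited,
    # so the plateau detector never runs and Startup never ends.
    est.on_sample(10.0, app_limited=True)
print(est.filled_pipe)   # False
```

With the early hypothetical policy of flagging peer-app-limited samples this way, the loop above never sets `filled_pipe`, no matter how long the connection runs.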
So we decided to have BBR treat cases where the delivery rate was constrained by "peer-app-limited" conditions the same as BBR treats cases where the delivery rate is constrained by in-network bottlenecks; basically, to treat receiver bottlenecks the same as network bottlenecks. This has a nice conceptual symmetry and so far has worked well enough in practice.
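The resulting policy can be sketched as follows: a delivery-rate sample is marked application-limited only when the *sender itself* has run out of data to send, while being blocked by the peer's receive window is treated like any other bottleneck. This is an illustrative sketch, not the kernel's actual check; the function and parameter names are hypothetical:

```python
def is_app_limited(bytes_in_write_queue, cwnd_has_room, rwnd_has_room):
    # Local app limit: nothing queued to send even though cwnd allows more.
    # Note: rwnd_has_room is deliberately ignored -- a flow throttled by
    # the receiver's window still produces valid bandwidth samples, the
    # same as a flow throttled by a network bottleneck.
    return bytes_in_write_queue == 0 and cwnd_has_room

# Receive-window-limited sender with data still queued: NOT app-limited,
# so its samples fully update the bandwidth model.
print(is_app_limited(bytes_in_write_queue=64_000,
                     cwnd_has_room=True,
                     rwnd_has_room=False))   # False
```

Under this rule, the receive-window-limited flows described above get ordinary (non-app-limited) samples, so they can exit Startup and track decreases in delivery rate.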
In the long run I think it would be good to get rid of this tricky "application-limited" tracking, but that will require a different framework. Something for future work. :-)