Summary I've been investigating an issue where rapid recovery probing doesn't seem to work as expected after network congestion clears. I suspect there might be a missing condition check, but I could be misunderstanding the intended design. I'd appreciate if someone could clarify whether my analysis is correct or point out what I'm missing.
------------------------
Environment
- WebRTC Version: M130
- Affected File: modules/congestion_controller/goog_cc/goog_cc_network_control.cc
------------------------
Observed Behavior
1. Start a call with normal bandwidth (~1.4 Mbps)
2. Apply network bandwidth constraint (limit to ~300 kbps)
3. Wait a few seconds, then remove the constraint
4. recovered_from_overuse is detected correctly
5. RequestProbe() is called
6. But probing doesn't happen - InitiateProbing() returns early
------------------------
My Analysis (Please Correct Me If Wrong)
What I Observed in OnTransportPacketsFeedback()
// Step 1: delay_based_bwe_ detects recovery
result = delay_based_bwe_->IncomingPacketFeedbackVector(...);
// result.recovered_from_overuse = true
// delay_based_bwe_->last_state() changes: kBwUnderusing → kBwNormal
// Step 2: MaybeTriggerOnNetworkChanged is called when result.updated is true
if (result.updated) {
MaybeTriggerOnNetworkChanged(&update, report.feedback_time);
}
// Step 3: RequestProbe is called when recovered
if (recovered_from_overuse) {
auto probes = probe_controller_->RequestProbe(report.feedback_time);
}
My Concern with MaybeTriggerOnNetworkChanged()
Looking at the condition that triggers SetEstimatedBitrate():
if ((loss_based_target_rate != last_loss_based_target_rate_) ||
(loss_based_state != last_loss_base_state_) ||
(fraction_loss != last_estimated_fraction_loss_) ||
(round_trip_time != last_estimated_round_trip_time_) ||
(pushback_target_rate != last_pushback_target_rate_) ||
(stable_target_rate != last_stable_target_rate_)) {
auto probes = probe_controller_->SetEstimatedBitrate(
loss_based_target_rate,
GetBandwidthLimitedCause(bandwidth_estimation_->loss_based_state(),
bandwidth_estimation_->IsRttAboveLimit(),
delay_based_bwe_->last_state()), // ← Used here
at_time);
}
I notice that:
- GetBandwidthLimitedCause() uses delay_based_bwe_->last_state() as an input parameter
- But the condition check doesn't include delay_based_bwe_->last_state()
What I Think Happens
When network recovers:
1. delay_based_bwe_->last_state() changes (kBwUnderusing → kBwNormal)
2. But bandwidth_estimation_ related values might not change at the same time
3. If none of the checked values changed, SetEstimatedBitrate() is not called
4. bandwidth_limited_cause_ in ProbeController remains stale (kDelayBasedLimitedDelayIncreased)
5. When RequestProbe() → InitiateProbing() is called, it checks bandwidth_limited_cause_ and rejects probing:
switch (bandwidth_limited_cause_) {
case BandwidthLimitedCause::kDelayBasedLimitedDelayIncreased:
RTC_LOG(LS_INFO) << "Not sending probe in bandwidth limited state.";
return {}; // Probing rejected
}
------------------------
What I Expected vs What I See
| | Expected | Actual |
|------------------------------|--------------------|------------------------------------------|
| recovered_from_overuse | Detected ✓ | Detected ✓ |
| RequestProbe() called | Yes ✓ | Yes ✓ |
| SetEstimatedBitrate() called | Yes | No (condition not met) |
| bandwidth_limited_cause_ | kDelayBasedLimited | kDelayBasedLimitedDelayIncreased (stale) |
| Probing initiated | Yes | No |
Simplified Flow Diagram
delay_based_bwe_->last_state(): kBwUnderusing → kBwNormal
│
▼
recovered_from_overuse = true ✓
│
▼
MaybeTriggerOnNetworkChanged()
│
▼
Check: Did any bandwidth_estimation_ values change?
│
▼
If NO → SetEstimatedBitrate() NOT called
→ bandwidth_limited_cause_ NOT updated
│
▼
RequestProbe() → InitiateProbing()
│
▼
bandwidth_limited_cause_ == kDelayBasedLimitedDelayIncreased
│
▼
Probing rejected ✗
---------------------------
My Questions
1. Is my understanding correct? Should delay_based_bwe_->last_state() be included in the condition check since it's used in GetBandwidthLimitedCause()?
2. Or is this intentional by design? Perhaps there's a reason why bandwidth_limited_cause_ should only be updated when bandwidth_estimation_ values change?
Any clarification or correction would be greatly appreciated. Thank you!