Hello,
we observed that QUICHE implementation of BBR2occasionally logs the following message at ERROR level:
quiche/quiche/quic/core/congestion_control/bbr2_misc.cc
Example:
…prior_bytes_in_flight:0 is smaller than the sum of bytes_acked:28 and bytes_lost:0…
I have troubles to estimate the impact of the error – can you help me to determine the severity, please ?
Is that log message at ERROR level appropriate here ?
One scenario that leads to this error:
This can happen during processing of ACKs, when sent packet is at first marked as LOST, but later an ACK for this packet is received. There are two entities that estimate "bytes in flight" and process the ACKs:
When a packet is marked as LOST, QuicUnackedPacketMap is updated and the packet is removed from "bytes in flight".
Later, when a ACK for this packet is received, QuicUnackedPacketMap does NOT modify "bytes in flight" again, as the packet is not in flight anymore. There is the following check:
void QuicUnackedPacketMap::RemoveFromInFlight(QuicTransmissionInfo* info) { if (info->in_flight) { ...
However, such a check is not applied when processing ACKs in Bbr2NetworkModel/BandwidthSampler, and the counter that computes the number of acked bytes is increased despite the fact that the packet is not "in flight" anymore.
This leads to a mismatch between "bytes in flight" from QuicUnackedPacketMap and "bytes in flight" from Bbr2NetworkModel/BandwidthSampler. In that case the ERROR message can be logged.
Thanks,
Stanislav Slusny
--
You received this message because you are subscribed to the Google Groups "BBR Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bbr-dev+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bbr-dev/CAACc7tpsnAm4rTAz0JhzBJ26FzHo0m2b5PpQdHH5cP8TzudLjw%40mail.gmail.com.
Hi Bin,
The problem is that the bytes_acked, that is printed in the message, is not the counter computed in QuicSentPacketManager (you posted a link pointing to QuicSentManager).
It is computed in BandwidthSampler, and there is no check to see if packet is in flight, see https://github.com/google/quiche/blob/4069625b5260ab0df30f733e17d6ab7d1703cbce/quiche/quic/core/congestion_control/bandwidth_sampler.cc#L398
Unfortunately I have not managed to build a unit test.
Stanislav
Well in fact we do not skip the packet there, because the packet is really ACKed.
The packet is spuriously marked as lost at first (and removed from flight), but an ACK is eventually received and processed.
So the packed is really ACKed, it is marked as ACKed in QuicUnackedPacketMap (https://github.com/google/quiche/blob/4069625b5260ab0df30f733e17d6ab7d1703cbce/quiche/quic/core/quic_sent_packet_manager.cc#L620)
and “bytes in flight” is correctly updated there.
Then, the BandwidthSampler loops through all acked packets again – but BandwidthSampler does not know that the packet has been marked as lost in the past, and it incorrectly increases bytes_acked, without checking if packet is in flight.