BBR sitting idle for a significant amount of time


Shehab Sarar Ahmed

Jan 3, 2025, 10:16:59 PM
to BBR Development
Hello,

I have been testing BBR's behavior in different network environments. I found suspicious behavior that is triggered when the network delay decreases significantly about 1 second after the flow starts.

What happens is that some packets are delivered out of order. Packets that are sent later but face lower latency reach the receiver earlier than packets that were sent just before the network delay decreased.
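A minimal sketch of how such reordering can arise, assuming each packet's one-way delay is fixed at the moment it is sent (as some network emulators do). The function name and all numbers here are hypothetical and chosen only for illustration; they are not taken from the actual test setup.

```python
def arrival_order(send_times_ms, delay_change_ms, old_delay_ms, new_delay_ms):
    """Return packet indices sorted by their arrival time at the receiver."""
    arrivals = []
    for idx, sent in enumerate(send_times_ms):
        # Assumption: the delay a packet sees depends only on when it was sent.
        delay = old_delay_ms if sent < delay_change_ms else new_delay_ms
        arrivals.append((sent + delay, idx))
    arrivals.sort()
    return [idx for _, idx in arrivals]


# Hypothetical scenario: one packet every 10 ms, and the one-way delay
# drops from 100 ms to 20 ms at t = 1000 ms.
send_times = [970, 980, 990, 1000, 1010, 1020]
order = arrival_order(send_times, 1000, 100, 20)
print(order)  # [3, 4, 5, 0, 1, 2]
```

In this toy scenario the three packets sent after the delay drop arrive before the three sent just before it, which is the kind of out-of-order delivery described above.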

I observed that after sending a few packets during the period of reduced network delay, BBR halts for around 800-1000 ms and sends no packets during that time. After that, it resumes normal operation.

I tried using bpftrace to investigate what is going on inside BBR, but it does not seem to give me enough flexibility to inspect every piece of BBR state that I need. The script I am currently using is as follows:

#include <net/sock.h>
#include <net/tcp.h>
#include <net/inet_connection_sock.h>

// Manually define the bbr structure (adapt this based on your kernel version)
struct bbr {
    int mode;
    // Add other fields from the bbr struct as needed
};


kretprobe:bbr_packets_in_net_at_edt
{
    // Scratch variable ($) rather than a global map (@): no state is shared
    // between probes, and no leftover map is printed when the script exits.
    $time_ms = nsecs / 1000000;
    printf("Timestamp: %llu ms, bbr_packets_in_net_at_edt return value: %u\n", $time_ms, retval);
}

kprobe:bbr_init
{
    $time_ms = nsecs / 1000000;
    printf("Timestamp: %llu ms, bbr_init called\n", $time_ms);
}

kretprobe:bbr_inflight
{
    $time_ms = nsecs / 1000000;
    printf("Timestamp: %llu ms, bbr_inflight return value: %u\n", $time_ms, retval);
}


What I found is that after bbr_init, bbr_inflight is first called almost 2.5 seconds later (which includes the ~800 ms idle period), whereas normally, after entering the ProbeBW phase, BBR starts calling bbr_inflight after about 1 second.
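One way to locate the idle period in the trace output is to scan the printed timestamps for large gaps. The following is a rough sketch that assumes the output lines keep the "Timestamp: <n> ms" format printed by the script above. The helper name, the 500 ms threshold, and the sample lines are hypothetical and are not real trace data.

```python
import re

TIMESTAMP_RE = re.compile(r"Timestamp: (\d+) ms")


def find_idle_gaps(lines, threshold_ms=500):
    """Return (start_ms, end_ms, gap_ms) for each gap longer than threshold_ms."""
    stamps = []
    for line in lines:
        match = TIMESTAMP_RE.search(line)
        if match:
            stamps.append(int(match.group(1)))
    stamps.sort()
    gaps = []
    # Compare each event with the next one in time order.
    for prev, cur in zip(stamps, stamps[1:]):
        if cur - prev > threshold_ms:
            gaps.append((prev, cur, cur - prev))
    return gaps


# Hypothetical sample in the same format as the bpftrace printf output.
sample = [
    "Timestamp: 5000 ms, bbr_init called",
    "Timestamp: 5010 ms, bbr_packets_in_net_at_edt return value: 12",
    "Timestamp: 5020 ms, bbr_packets_in_net_at_edt return value: 14",
    "Timestamp: 5900 ms, bbr_packets_in_net_at_edt return value: 3",
    "Timestamp: 5910 ms, bbr_inflight return value: 20",
]
gaps = find_idle_gaps(sample)
print(gaps)  # [(5020, 5900, 880)]
```

This only shows when the probed functions stopped being called. It does not explain why, so it is a complement to a packet capture rather than a replacement for one.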

From reading the code, my guess is that rs->losses is being set. But would that make BBR halt for that long?

Any suggestions on the reason for this weird behavior, or on how to better debug what is going on inside the Linux kernel implementation of BBR?

Neal Cardwell

Jan 3, 2025, 10:20:42 PM
to Shehab Sarar Ahmed, BBR Development
Hi,

I'm having difficulty coming up with a theory based on this high-level description of the scenario. :-)

Do you have time to capture and share a tcpdump binary pcap file with a trace showing this kind of scenario?

For some recipes on installing and using tcpdump, see:

best regards,
neal



Shehab Sarar Ahmed

Jan 6, 2025, 10:19:01 AM
to BBR Development