Enable BBR: Is there a way for Linux to log the reason for sending a TCP RST packet?

朱宇

Aug 12, 2017, 12:49:58 PM
to BBR Development

We are running kernel 4.11.8-1.el6.elrepo.x86_64 with BBR enabled. We want to know why the TCP stack sends some RST packets. Is there a Linux counterpart to BSD's net.inet.tcp.log_debug=1?

The following is one case where we would like to know the reason. A RST is sent immediately after the final ACK of the handshake arrives. The trace shows that the SYN was lost several times and that the final ACK took more than 1 s to arrive, but it is still not clear why the RST is sent. Disabling SYN cookies does not help.


15:27:41.166799 IP CLIENT.16537 > SERVER.80: Flags [S], seq 1397492268, win 29200, options [mss 1440,sackOK,TS val 1230199 ecr 0,nop,wscale 6], length 0
15:27:41.166820 IP SERVER.80 > CLIENT.16537: Flags [S.], seq 1773519351, ack 1397492269, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 9], length 0
15:27:42.069572 IP CLIENT.16537 > SERVER.80: Flags [S], seq 1397492268, win 29200, options [mss 1460,sackOK,TS val 1230299 ecr 0,nop,wscale 6], length 0
15:27:42.069590 IP SERVER.80 > CLIENT.16537: Flags [S.], seq 1773519351, ack 1397492269, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 9], length 0
15:27:43.123141 IP SERVER.80 > CLIENT.16537: Flags [S.], seq 1773519351, ack 1397492269, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 9], length 0
15:27:44.067228 IP CLIENT.16537 > SERVER.80: Flags [S], seq 1397492268, win 29200, options [mss 1460,sackOK,TS val 1230499 ecr 0,nop,wscale 6], length 0
15:27:44.067240 IP SERVER.80 > CLIENT.16537: Flags [S.], seq 1773519351, ack 1397492269, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 9], length 0
15:27:46.547072 IP CLIENT.16537 > SERVER.80: Flags [.], ack 1, win 457, length 0
15:27:46.547094 IP SERVER.80 > CLIENT.16537: Flags [R], seq 1773519352, win 0, length 0
15:27:46.548177 IP CLIENT.16537 > SERVER.80: Flags [.], ack 1, win 457, options [nop,nop,sack 1 {0:1}], length 0
15:27:46.548186 IP SERVER.80 > CLIENT.16537: Flags [R], seq 1773519352, win 0, length 0

The TCP configuration is as follows:
net.ipv4.tcp_abort_on_overflow = 0
net.ipv4.tcp_app_win = 31
net.ipv4.tcp_autocorking = 1
net.ipv4.tcp_challenge_ack_limit = 1000
net.ipv4.tcp_congestion_control = bbr
net.ipv4.tcp_dsack = 1
net.ipv4.tcp_early_retrans = 3
net.ipv4.tcp_ecn = 0
net.ipv4.tcp_ecn_fallback = 1
net.ipv4.tcp_fastopen = 1
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_frto = 2
net.ipv4.tcp_fwmark_accept = 0
net.ipv4.tcp_invalid_ratelimit = 500
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_low_latency = 0
net.ipv4.tcp_max_reordering = 300
net.ipv4.tcp_max_syn_backlog = 819200
net.ipv4.tcp_max_tw_buckets = 262144
net.ipv4.tcp_min_rtt_wlen = 300
net.ipv4.tcp_min_tso_segs = 2
net.ipv4.tcp_notsent_lowat = 4294967295
net.ipv4.tcp_orphan_retries = 0
net.ipv4.tcp_recovery = 1
net.ipv4.tcp_reordering = 3
net.ipv4.tcp_retrans_collapse = 1
net.ipv4.tcp_retries1 = 3
net.ipv4.tcp_retries2 = 5
net.ipv4.tcp_rfc1337 = 0
net.ipv4.tcp_sack = 1
net.ipv4.tcp_slow_start_after_idle = 1
net.ipv4.tcp_stdurg = 0
net.ipv4.tcp_syn_retries = 6
net.ipv4.tcp_synack_retries = 5
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_thin_linear_timeouts = 0
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_window_scaling = 1


Thanks for helping.

Shiyao MA

Aug 12, 2017, 11:25:48 PM
to BBR Development
If you have the privileges, you can use SystemTap to hook into any kernel function (provided debug info or ftrace points are available).

Neal Cardwell

Aug 13, 2017, 11:42:37 AM
to 朱宇, BBR Development
Hi,

First, I should say that this is not related to BBR. Second, I am unable to reproduce this problem on the recent Linux kernels I tried, so I suspect this problem is particular to the kernel or kernel config you are using.

That said, to try to track this down you could try looking at nstat counters, e.g.:

  nstat > /dev/null
  # run test
  nstat

That will show which counters have increased during the course of the test, which might help if there is a suspicious counter that is incrementing for the test.
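To make the nstat output easier to scan, one option is to filter it down to counters whose names suggest resets, aborts, or listen-queue trouble. This is only a sketch: the helper name filter_reset_counters is made up for illustration, and the name patterns are an assumption about which counters are worth a first look (for example TcpOutRsts or TcpExtListenOverflows), not an exhaustive list.

```shell
# Hypothetical helper: reads nstat-style output on stdin and keeps only
# counters whose names hint at resets, aborts, or accept-queue problems.
# The pattern list is an assumption; widen it if nothing shows up.
filter_reset_counters() {
    grep -E 'Rst|Reset|Abort|ListenOverflows|ListenDrops|AttemptFails'
}
```

With the helper defined in the current shell, the sequence above becomes:

  nstat > /dev/null
  # run test
  nstat | filter_reset_counters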

As for dynamically tweaking the Linux kernel to understand why something is happening, you could try ftrace. In https://lwn.net/Articles/370423/ see "What calls a specific function?" You could try tracing tcp_v4_send_reset().
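A rough sketch of that recipe, following the "What calls a specific function?" steps in the LWN article: restrict the function tracer to tcp_v4_send_reset and turn on stack traces, so each RST is logged with its call chain. The function names trace_reset_start and trace_reset_stop and the TRACEFS variable are made up for illustration. The sketch assumes root, a kernel with the function tracer enabled, and that tcp_v4_send_reset is listed in available_filter_functions (it may not be if the compiler inlined it).

```shell
# Location of the ftrace control files. This is the debugfs path used in
# the LWN article; some systems mount tracefs at /sys/kernel/tracing.
TRACEFS="${TRACEFS:-/sys/kernel/debug/tracing}"

# Hypothetical helper: start recording a stack trace for every call to
# tcp_v4_send_reset(). The filter is set first so that stack traces are
# not generated for every kernel function.
trace_reset_start() {
    echo tcp_v4_send_reset > "$TRACEFS/set_ftrace_filter"
    echo function > "$TRACEFS/current_tracer"
    echo 1 > "$TRACEFS/options/func_stack_trace"
    echo 1 > "$TRACEFS/tracing_on"
}

# Hypothetical helper: stop tracing, print what was captured, and put
# the tracer settings back to their idle state.
trace_reset_stop() {
    echo 0 > "$TRACEFS/tracing_on"
    cat "$TRACEFS/trace"
    echo 0 > "$TRACEFS/options/func_stack_trace"
    echo nop > "$TRACEFS/current_tracer"
    echo > "$TRACEFS/set_ftrace_filter"
}
```

Usage would be to call trace_reset_start as root, reproduce the handshake that ends in a RST, then call trace_reset_stop and read the stack traces it prints to see which code path led to the reset.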

If you just want to try out BBR, you might try the BBR quick-start guide:

Hope that helps,
neal


