Tcpdump to MetalLB IP


Sachith Muhandiram

Jul 24, 2020, 12:44:43 PM
to metallb-users
My k8s cluster runs MetalLB. Now I want to capture traffic with `tcpdump` to inspect the communication.

As described [here](https://www.objectif-libre.com/en/blog/2019/06/11/metallb/):

> In this mode a service is owned by one node in the cluster. It is implemented by announcing that the layer 2 address (MAC address) that matches to the external IP is the MAC address of the node. For external devices the node have multiple IP address.

So I took the MAC address of the node running my service and tried `sudo tcpdump -i eth0 ether host aa:bb:cc:11:22:33` ([reference](https://networkengineering.stackexchange.com/questions/19737/tcpdump-filter-by-mac-address)).

My service has the IP `192.168.10.101` assigned.

As the official [troubleshooting][1] guide suggests, I ran `tcpdump -n -i ens3 arp src host 192.168.10.101`, yet no packets were captured.

MetalLB config:

Name:         config
Namespace:    metallb-system
Labels:       <none>
Annotations:  <none>

Data
====
config:
----
address-pools:
- name: default
  protocol: layer2
  addresses:
  - 192.168.10.100-192.168.10.120

Events:  <none>
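
As a sanity check on the config above, the assigned IP can be tested against the pool range. This is only a minimal Python sketch, assuming the pool is written either as a `start-end` range (as here) or as a CIDR. The helper name `in_pool` is made up for illustration and is not part of MetalLB.

```python
import ipaddress


def in_pool(ip, pool):
    """Return True if ip lies inside a 'start-end' range or a CIDR block."""
    addr = ipaddress.ip_address(ip)
    if "-" in pool:
        # Range notation, e.g. "192.168.10.100-192.168.10.120"
        start_text, end_text = pool.split("-", 1)
        start = ipaddress.ip_address(start_text.strip())
        end = ipaddress.ip_address(end_text.strip())
        return start <= addr <= end
    # CIDR notation, e.g. "192.168.10.0/24"
    return addr in ipaddress.ip_network(pool, strict=False)


if __name__ == "__main__":
    # Service IP and pool taken from this post.
    print(in_pool("192.168.10.101", "192.168.10.100-192.168.10.120"))  # True
```

If this prints `False`, the address did not come from this pool and the capture filter is probably aimed at the wrong IP.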




But I cannot match any packets.

I am kind of stuck here. I ran `kubectl get svc`, took the port the service is exposed on, and pointed `tcpdump` at that port, but still no packets match.
Basically, how can I run `tcpdump` against:

  • LoadBalancer service
  • ClusterIP running services


  [1]: https://metallb.universe.tf/configuration/troubleshooting/

Rodrigo Campos

Jul 24, 2020, 2:12:26 PM
to metallb-users
Hi!

I'm not sure whether everything is working and you just can't see the traffic, or whether things are not working at all. Can I ask which one it is? :)

If it is not working, I'd suggest connecting via the nodePort (hostIP:NodePort) instead of the load balancer IP, just to verify that the kubernetes part up to NodePort is working fine.

If nodePort is not working, I'd suggest going over this very nice guide in detail: https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service/. Let us know if that helps.


Best,
Rodrigo

Todor Petkov

Jul 24, 2020, 2:29:22 PM
to metallb-users
On Fri, Jul 24, 2020 at 7:44 PM Sachith Muhandiram
<sachit...@gmail.com> wrote:
>
> My k8s cluster runs MetalLB. Now I want to capture traffic with `tcpdump` to inspect the communication.
<snip>
These are the steps that I usually follow to debug such issues:

1) From a node outside of the cluster, run "arping -I eth0
LoadBalancerIP" to discover which node is announcing this IP address.
Note - eth0 can be different in your case.
2) Once you have identified the node, log in to it and run "tcpdump
-ni eth0 host LoadBalancerIP". Note - eth0 can be different in your
case.

Example:
1) arping from external node:
arping -I ens192 10.82.9.70
ARPING 10.82.9.70 from 10.82.9.100 ens192
Unicast reply from 10.82.9.70 [00:50:56:BB:83:63] 1.936ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)
I stopped it after the first response, then I checked the MAC addresses
of the Kubernetes nodes to see which one has 00:50:56:BB:83:63.

2) I logged into this node and started a capture (tshark here, which takes
the same capture filter as tcpdump), then I executed "curl
10.82.9.70" from the external node:

tshark -ni ens192 host 10.82.9.70
Running as user "root" and group "root". This could be dangerous.
Capturing on 'ens192'
1 0.000000000 00:50:56:bb:f7:ef -> ff:ff:ff:ff:ff:ff ARP 60 Who has
10.82.9.70? Tell 10.82.9.100
2 0.000480186 00:50:56:bb:83:63 -> 00:50:56:bb:f7:ef ARP 60
10.82.9.70 is at 00:50:56:bb:83:63
3 0.000639981 10.82.9.100 -> 10.82.9.70 TCP 74 40676 > 80 [SYN]
Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=644916556 TSecr=0
WS=128
4 0.000974405 10.82.9.70 -> 10.82.9.100 TCP 74 80 > 40676 [SYN,
ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=1505492135
TSecr=644916556 WS=128
5 0.001218009 10.82.9.100 -> 10.82.9.70 TCP 66 40676 > 80 [ACK]
Seq=1 Ack=1 Win=29312 Len=0 TSval=644916558 TSecr=1505492135
6 0.001254076 10.82.9.100 -> 10.82.9.70 HTTP 140 GET / HTTP/1.1

If you run "tcpdump -i eth0 src LoadBalancerIP" and the service itself
is broken and can't reply (for example, nginx in the pod didn't
start), you will still at least see the ARP reply.
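
The MAC lookup from step 1 can also be scripted. Here is a rough Python sketch, assuming arping prints replies in the format shown above. The node names and the second MAC in `NODE_MACS` are placeholders for illustration, not values from a real cluster.

```python
import re

# Matches the bracketed MAC in lines like:
#   Unicast reply from 10.82.9.70 [00:50:56:BB:83:63] 1.936ms
MAC_RE = re.compile(r"\[([0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){5})\]")

# Output copied from the arping example above.
ARPING_OUTPUT = """\
ARPING 10.82.9.70 from 10.82.9.100 ens192
Unicast reply from 10.82.9.70 [00:50:56:BB:83:63] 1.936ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)
"""

# Hypothetical inventory: node name -> MAC of its cluster-facing interface.
NODE_MACS = {
    "node-a": "00:50:56:bb:83:63",
    "node-b": "00:50:56:bb:00:01",
}


def replying_macs(arping_output):
    """Return the set of distinct MACs (lowercased) that answered."""
    return {mac.lower() for mac in MAC_RE.findall(arping_output)}


def nodes_for_macs(macs, node_macs):
    """Return the sorted names of nodes whose MAC is in macs."""
    return sorted(name for name, mac in node_macs.items() if mac.lower() in macs)


if __name__ == "__main__":
    macs = replying_macs(ARPING_OUTPUT)
    print("MACs answering:", macs)
    print("Announcing node(s):", nodes_for_macs(macs, NODE_MACS))
```

More than one MAC in the result would mean more than one machine is answering ARP for the same IP, which is worth knowing before capturing.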

Hope this helps.

Sachith Muhandiram

Jul 25, 2020, 11:48:14 PM
to metallb-users
Hello

Yes, that's what I had done. In my case the service was running on node-3, so I logged in there and ran
tcpdump -ni eth0 host LoadBalancerIP

and then accessed my service. It gave me a proper response, but tcpdump did not match any packets. Then I ran tcpdump on all nodes, with the same result.

Todor Petkov

Jul 26, 2020, 5:27:32 AM
to metallb-users
On Sun, Jul 26, 2020 at 6:48 AM Sachith Muhandiram
<sachit...@gmail.com> wrote:
>
> Hello
>
> Yes, that's what I had done. In my case the service was running on node-3, so I logged in there and ran
> tcpdump -ni eth0 host LoadBalancerIP
>
> and then accessed my service. It gave me a proper response, but tcpdump did not match any packets. Then I ran tcpdump on all nodes, with the same result.
>

Can you run tcpdump with another filter, let's say by source IP
address and destination port? For example, if you run curl from a node
10.0.0.10 towards LoadBalancer IP address 10.0.0.20 on destination
port 80: tcpdump -ni eth0 'host 10.0.0.10 and port 80'

If you run arping from an external node, do you see ARP replies from a
single MAC or from more than one?
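
To keep the filter expression from being mangled by shell quoting, the command can be assembled programmatically. This is a small Python sketch, not a required step. `tcpdump_argv` is a hypothetical helper, and it only uses the `-n` and `-i` flags plus a standard `host ... and port ...` filter. It prints the command rather than running it, since a capture needs root.

```python
import ipaddress
import shlex


def tcpdump_argv(iface, client_ip, port):
    """Build an argv list that filters on the client IP and the service port."""
    ipaddress.ip_address(client_ip)  # raises ValueError on a malformed address
    if not 0 < int(port) < 65536:
        raise ValueError("port out of range: %r" % (port,))
    bpf_filter = "host %s and port %d" % (client_ip, int(port))
    # The filter is passed as a single argument, so no shell quoting is needed.
    return ["tcpdump", "-n", "-i", iface, bpf_filter]


if __name__ == "__main__":
    argv = tcpdump_argv("eth0", "10.0.0.10", 80)
    # shlex.join shows how the same command would be typed in a shell.
    print(shlex.join(argv))
```

Filtering on the client address instead of the LoadBalancer IP still matches if the destination has already been rewritten by the time the packet is seen.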

Sachith Muhandiram

Jul 26, 2020, 6:05:15 AM
to metallb-users
OK, I did the following: I ran tcpdump on my gateway server, and it worked and captured packets, but none of the k8s cluster nodes matched any packets.

Is that expected behavior?

Makrand

Jul 26, 2020, 12:06:31 PM
to Sachith Muhandiram, metallb-users
Just tested this on my K8s cluster. I can see regular traffic (communication between master and worker nodes) arriving at eth0 of the worker node, but I can't see anything when I run tcpdump against the LoadBalancer IP on the worker node, although the nginx pod is reachable via the load balancer IP.

15:51:33.181077 IP 10.70.241.145.sun-sr-https > 10.70.241.148.63763: Flags [P.], seq 7037:7649, ack 799, win 501, options [nop,nop,TS val 2010209827 ecr 2965654999], length 612
15:51:33.181224 IP 10.70.241.148.63763 > 10.70.241.145.sun-sr-https: Flags [.], ack 7649, win 1285, options [nop,nop,TS val 2965656222 ecr 2010209827], length 0
15:51:33.222334 IP 10.70.241.148.32945 > 10.70.241.145.sun-sr-https: Flags [.], ack 5580, win 787, options [nop,nop,TS val 3579291379 ecr 2010209827], length 0
15:51:33.227039 IP 10.70.241.145.7946 > 10.70.241.148.7946: UDP, length 54
15:51:33.227201 IP 10.70.241.148.7946 > 10.70.241.145.7946: UDP, length 49
138 packets captured
138 packets received by filter
0 packets dropped by kernel
 
[root@kworker1 ~]# tcpdump -ni eth0 host 10.70.241.151
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel 
 
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
makrand@mint-gl63:~$ telnet 10.70.241.51 80
Trying 10.70.241.51...
Connected to 10.70.241.51.
Escape character is '^]'.
^CConnection closed by foreign host.
makrand@mint-gl63:~$



--
Makrand



--
You received this message because you are subscribed to the Google Groups "metallb-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to metallb-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/metallb-users/c5b2098d-d023-40eb-8e80-a1bb43fed972o%40googlegroups.com.

Todor Petkov

Jul 26, 2020, 2:57:18 PM
to Makrand, Sachith Muhandiram, metallb-users
Are you running tcpdump on the node that "holds" the LoadBalancer IP
address? Also, you ran tcpdump for "10.70.241.151" but did telnet for
"10.70.241.51" (51 vs 151), is this a typo?
Here is the tshark output in my case; I ran it on the node that
"holds" the LoadBalancer IP address:

tshark -i ens192 host 10.82.9.70
Running as user "root" and group "root". This could be dangerous.
Capturing on 'ens192'
1 0.000000000 10.82.9.100 -> 10.82.9.70 TCP 74 41078 > http [SYN]
Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=819620106 TSecr=0
WS=128
2 0.000251420 10.82.9.70 -> 10.82.9.100 TCP 74 http > 41078 [SYN,
ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=1680195684
TSecr=819620106 WS=128
3 0.000480550 10.82.9.100 -> 10.82.9.70 TCP 66 41078 > http [ACK]
Seq=1 Ack=1 Win=29312 Len=0 TSval=819620106 TSecr=1680195684
4 0.000723907 10.82.9.100 -> 10.82.9.70 HTTP 140 GET / HTTP/1.1
5 0.000793349 10.82.9.70 -> 10.82.9.100 TCP 66 http > 41078 [ACK]
Seq=1 Ack=75 Win=29056 Len=0 TSval=1680195684 TSecr=819620107
6 0.001608480 10.82.9.70 -> 10.82.9.100 TCP 304 [TCP segment of a
reassembled PDU]
7 0.001727168 10.82.9.70 -> 10.82.9.100 HTTP 678 HTTP/1.1 200 OK
(text/html)

Rodrigo Campos

Jul 26, 2020, 8:38:53 PM
to Todor Petkov, Makrand, Sachith Muhandiram, metallb-users
If everything is working fine, I really think this is a
tcpdump/capturing issue rather than a MetalLB issue.

Therefore, are you sure you are using the right network interface? Can
you try to tcpdump -i any (to use all network interfaces) and grep
from the output?

Also, are you actively sending traffic to the service while you do
that? How? You need to send traffic to the application, you might not
get what you expect with ping, for example (kube-proxy doesn't handle
ICMP). I'd encourage you to send real traffic and check that the pod
in the node is receiving the traffic you send, then check tcpdump.
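
If the output is grepped, a plain substring match can mislead: searching for `10.70.241.14` would also hit `10.70.241.148`. Below is a hedged Python sketch of a stricter match, assuming tcpdump's default text output where the port is appended to the address as `.port`. `lines_for_ip` is an illustrative name, and the sample lines are copied from the capture posted earlier in this thread.

```python
import re

# Two lines copied from the tcpdump output posted earlier in the thread.
SAMPLE_LINES = [
    "15:51:33.227039 IP 10.70.241.145.7946 > 10.70.241.148.7946: UDP, length 54",
    "15:51:33.227201 IP 10.70.241.148.7946 > 10.70.241.145.7946: UDP, length 49",
]


def lines_for_ip(lines, ip):
    """Return the lines that mention exactly this IPv4 address.

    The lookbehind rejects a match that starts in the middle of a longer
    address, and the lookahead rejects one that would cut the last octet
    short (so 10.70.241.14 does not match 10.70.241.148).
    """
    pattern = re.compile(r"(?<![\d.])" + re.escape(ip) + r"(?!\d)")
    return [line for line in lines if pattern.search(line)]


if __name__ == "__main__":
    for line in lines_for_ip(SAMPLE_LINES, "10.70.241.148"):
        print(line)
```

An empty result for the LoadBalancer IP, while other traffic shows up, points at the filter or the node rather than at the capture itself.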



--
Rodrigo Campos

Rodrigo Campos

Jul 26, 2020, 9:08:45 PM
to Todor Petkov, Makrand, Sachith Muhandiram, metallb-users
Grr, sorry, I meant the node that is announcing the IP (kubectl
describe service <your-service> should tell you which node is
announcing it). Btw, this is layer 2, right?

Sachith Muhandiram

Jul 26, 2020, 10:36:38 PM
to metallb-users
Hm, it seems to be a problem with MetalLB or tcpdump, or some config mismatch between your cluster and mine.

Sachith Muhandiram

Jul 26, 2020, 10:38:43 PM
to metallb-users
I always used the -i any option.

To send traffic I use curl, as it's a simple GET request.

Makrand

Jul 26, 2020, 11:41:20 PM
to Sachith Muhandiram, metallb-users
Err... my apologies. 151 was indeed a typo.
Here is what tcpdump shows for a simple telnet to the LB IP on port 80:
[root@kworker1 ~]# tcpdump -ni eth0 host 10.70.241.51
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C03:20:18.308597 IP 10.70.241.1.44968 > 10.70.241.51.http: Flags [S], seq 337147234, win 64240, options [mss 1460,sackOK,TS val 2759737199 ecr 0,nop,wscale 7], length 0
03:20:18.308970 IP 10.70.241.51.http > 10.70.241.1.44968: Flags [S.], seq 1509632657, ack 337147235, win 64308, options [mss 1410,sackOK,TS val 3807027570 ecr 2759737199,nop,wscale 7], length 0
03:20:18.309010 IP 10.70.241.1.44968 > 10.70.241.51.http: Flags [.], ack 1, win 502, options [nop,nop,TS val 2759737199 ecr 3807027570], length 0
03:20:23.428324 ARP, Request who-has 10.70.241.51 tell 10.70.241.1, length 28
03:20:23.428696 ARP, Reply 10.70.241.51 is-at 00:16:3e:6a:72:10, length 46

5 packets captured
5 packets received by filter
0 packets dropped by kernel

Although I have to hit CTRL+C to view the packets; they don't appear in real time.

As Rodrigo mentioned, kube-proxy doesn't handle ICMP, so
ping shows 100% packet loss. (From ping alone one can figure out which node the LB IP is attached to.)
makrand@mint-gl63:~$ ping 10.70.241.51
PING 10.70.241.51 (10.70.241.51) 56(84) bytes of data.
From 10.70.241.148: icmp_seq=2 Redirect Host(New nexthop: 10.70.241.51)
From 10.70.241.148: icmp_seq=3 Redirect Host(New nexthop: 10.70.241.51)
^C
--- 10.70.241.51 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2012ms

Arping isn't working in my case. Should it work?
ARPING 10.70.241.51 from 10.100.100.104 wlo1
^CSent 7 probes (7 broadcast(s))
Received 0 response(s)


--
Makrand




Rodrigo Campos

Jul 27, 2020, 4:04:47 PM
to Sachith Muhandiram, metallb-users
On Sun, Jul 26, 2020 at 11:38 PM Sachith Muhandiram
<sachit...@gmail.com> wrote:
>
> I always used the -i any option.
>
> To send traffic I use curl, as it's a simple GET request.

So you get a response in curl, you see the request in your app logs
but can't see the traffic with tcpdump?

Can you please run sudo arp -a on the same host you run curl from and
check whether the MAC address you get for the LoadBalancer IP is the same
as the MAC address of the node you are running tcpdump on?

If everything is working, it should be something simple. Like maybe
the wrong node or something silly that I'm just missing or forgetting
about :)
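
For what it's worth, that comparison can be scripted too. This is a minimal Python sketch, assuming the Linux net-tools `arp -a` line format (`? (IP) at MAC [ether] on IFACE`). The sample text below is illustrative: the first entry reuses an IP/MAC pair from an ARP reply pasted in this thread, while the interface name and the second entry are made up.

```python
import re

# Matches "(<ip>) at <mac>" and skips "<incomplete>" entries, which have no MAC.
ARP_LINE_RE = re.compile(
    r"\((?P<ip>[0-9.]+)\)\s+at\s+(?P<mac>(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2})"
)

# Illustrative `arp -a` output; see the note above about which parts are assumed.
ARP_OUTPUT = """\
? (10.70.241.51) at 00:16:3e:6a:72:10 [ether] on eth0
? (10.70.241.99) at <incomplete> on eth0
"""


def mac_for_ip(arp_output, ip):
    """Return the lowercased MAC the ARP cache holds for ip, or None."""
    for match in ARP_LINE_RE.finditer(arp_output):
        if match.group("ip") == ip:
            return match.group("mac").lower()
    return None


def same_mac(mac_a, mac_b):
    """Compare two colon-separated MACs, ignoring letter case."""
    return mac_a.lower() == mac_b.lower()


if __name__ == "__main__":
    cached = mac_for_ip(ARP_OUTPUT, "10.70.241.51")
    # The second argument stands for the MAC of the node running tcpdump.
    print(cached, same_mac(cached, "00:16:3E:6A:72:10"))
```

If the two MACs differ, the traffic is landing on a different node than the one being captured.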

Rodrigo Campos

Jul 27, 2020, 4:10:57 PM
to Makrand, Sachith Muhandiram, metallb-users
On Mon, Jul 27, 2020 at 12:41 AM Makrand <makran...@gmail.com> wrote:
>
> Err... my apologies. 151 was indeed a typo.
> Here is what tcpdump shows for a simple telnet to the LB IP on port 80:
>
> [root@kworker1 ~]# tcpdump -ni eth0 host 10.70.241.51
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
> ^C03:20:18.308597 IP 10.70.241.1.44968 > 10.70.241.51.http: Flags [S], seq 337147234, win 64240, options [mss 1460,sackOK,TS val 2759737199 ecr 0,nop,wscale 7], length 0
> 03:20:18.308970 IP 10.70.241.51.http > 10.70.241.1.44968: Flags [S.], seq 1509632657, ack 337147235, win 64308, options [mss 1410,sackOK,TS val 3807027570 ecr 2759737199,nop,wscale 7], length 0
> 03:20:18.309010 IP 10.70.241.1.44968 > 10.70.241.51.http: Flags [.], ack 1, win 502, options [nop,nop,TS val 2759737199 ecr 3807027570], length 0
> 03:20:23.428324 ARP, Request who-has 10.70.241.51 tell 10.70.241.1, length 28
> 03:20:23.428696 ARP, Reply 10.70.241.51 is-at 00:16:3e:6a:72:10, length 46
>
> 5 packets captured
> 5 packets received by filter
> 0 packets dropped by kernel

Okay, so here you see the ARP request and reply


> Arping isn't working in my case. Should it work?
> ARPING 10.70.241.51 from 10.100.100.104 wlo1
> ^CSent 7 probes (7 broadcast(s))
> Received 0 response(s)

It should. In the logs you pasted you get an ARP response. So I guess
it is an issue with your network setup. Maybe something like not
using the right interface or the right source IP. I can't think right now
of what might cause arping to behave like this, so I can't give you a hint.
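
One possibility worth ruling out, purely as a guess from the addresses in the paste: ARP requests are link-local broadcasts and are not forwarded by routers, so arping only gets answers from hosts on the same layer 2 segment. The rough Python sketch below checks whether the two addresses even share a subnet. The /24 prefix length is an assumption, since the real netmask is not shown in the thread, and `same_subnet` is an illustrative helper name.

```python
import ipaddress


def same_subnet(src_ip, dst_ip, prefix_len):
    """Return True if dst_ip falls inside src_ip's network for this prefix."""
    src_network = ipaddress.ip_interface("%s/%d" % (src_ip, prefix_len)).network
    return ipaddress.ip_address(dst_ip) in src_network


if __name__ == "__main__":
    # Source and target from the arping paste; the /24 is an assumed netmask.
    print(same_subnet("10.100.100.104", "10.70.241.51", 24))  # False
```

If the real netmask also puts the two hosts in different networks, no ARP reply is expected from that machine, and arping would need to run from a host on the same segment as the cluster nodes.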

Let us know if you find out what was missing, or if you test more stuff and
still can't find how to capture it correctly, so we can try more
things :)