Dual-homed KVM guest not responding to requests


Arthur Enright

May 30, 2025, 3:30:08 PM
to metallb-users
Hello!

I'm setting up a k8s cluster to host several services for a project. The whole thing runs on a "one-box-wonder": a Rocky 9 server acting as a KVM hypervisor (diagram attached).

Each k8s VM has 2 network interfaces:
  • eth0 - private NAT'ed vnet from the hypervisor, configured to use 192.168.123.0/24
  • eth1 - plumbed to the hypervisor's bridge interface (no IP configured, as I only have a /28 for this project)
The server is in a colo facility so I don't have access to the network equipment.

I have configured 2 IPAddressPools (which seem to be working as expected) and am using L2 advertising.

I believe I can see the announcer responding as expected when I arping the LoadBalancer external IP:

[root@prod01 ~]# arping <ipaddr>                        
ARPING <ipaddr> from <ipaddr> bridge0                   
Unicast reply from <ipaddr> [52:54:00:CE:9B:A9]  0.723ms
Unicast reply from <ipaddr> [52:54:00:CE:9B:A9]  0.694ms
Unicast reply from <ipaddr> [52:54:00:CE:9B:A9]  0.685ms
Unicast reply from <ipaddr> [52:54:00:CE:9B:A9]  0.675ms
Unicast reply from <ipaddr> [52:54:00:CE:9B:A9]  0.688ms
Unicast reply from <ipaddr> [52:54:00:CE:9B:A9]  0.664ms
Sent 6 probes (1 broadcast(s))                          
Received 6 response(s)                                  

Where [52:54:00:CE:9B:A9] is the MAC of the worker that is advertising.

Similarly, I can see the requests come in when I run tcpdump on the announcing node:

13:14:51.001486 ARP, Request who-has 66.85.73.221 (Broadcast) tell 66.85.73.210, length 28
13:14:52.001485 ARP, Request who-has 66.85.73.221 (52:54:00:ce:9b:a9) tell 66.85.73.210, length 28
13:14:53.001486 ARP, Request who-has 66.85.73.221 (52:54:00:ce:9b:a9) tell 66.85.73.210, length 28
13:14:54.001495 ARP, Request who-has 66.85.73.221 (52:54:00:ce:9b:a9) tell 66.85.73.210, length 28
13:14:55.001492 ARP, Request who-has 66.85.73.221 (52:54:00:ce:9b:a9) tell 66.85.73.210, length 28

However, when I try to access my test deployment, the connection just times out. Looking at a tcpdump on the worker, it appears that no response is coming through:

[root@kube-wrkr-2 ~]# tcpdump -n -i eth1 src host <ipaddr>
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
13:55:23.140415 IP <hypervisor IP>.36456 > <lb IP>.irisa: Flags [S], seq 2141134184, win 32120, options [mss 1460,sackOK,TS val 2395428550 ecr 0,nop,wscale 7], length 0
13:55:24.202968 IP <hypervisor IP>.36456 > <lb IP>.irisa: Flags [S], seq 2141134184, win 32120, options [mss 1460,sackOK,TS val 2395429613 ecr 0,nop,wscale 7], length 0
13:55:26.251991 IP <hypervisor IP>.36456 > <lb IP>.irisa: Flags [S], seq 2141134184, win 32120, options [mss 1460,sackOK,TS val 2395431662 ecr 0,nop,wscale 7], length 0
13:55:30.282973 IP <hypervisor IP>.36456 > <lb IP>.irisa: Flags [S], seq 2141134184, win 32120, options [mss 1460,sackOK,TS val 2395435693 ecr 0,nop,wscale 7], length 0
13:55:38.666971 IP <hypervisor IP>.36456 > <lb IP>.irisa: Flags [S], seq 2141134184, win 32120, options [mss 1460,sackOK,TS val 2395444077 ecr 0,nop,wscale 7], length 0
13:55:55.050972 IP <hypervisor IP>.36456 > <lb IP>.irisa: Flags [S], seq 2141134184, win 32120, options [mss 1460,sackOK,TS val 2395460461 ecr 0,nop,wscale 7], length 0
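
One caveat on the capture above: the filter is "src host", so replies from the worker would not show up in it either way. What suggests that nothing is getting back is that the same SYN (seq 2141134184) keeps being retransmitted. A minimal Python sketch, purely illustrative and using only the timestamps from the output above, checks whether the gaps follow the usual doubling SYN retransmission backoff:

```python
from datetime import datetime

# Timestamps of the six SYN packets in the tcpdump output above.
# All six carry the same sequence number, i.e. they are retransmissions.
stamps = [
    "13:55:23.140415",
    "13:55:24.202968",
    "13:55:26.251991",
    "13:55:30.282973",
    "13:55:38.666971",
    "13:55:55.050972",
]

times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]

# Seconds elapsed between each SYN and the next one.
raw_gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]

# Exponential backoff means each gap is about twice the previous one.
doubling = all(1.5 < b / a < 2.5 for a, b in zip(raw_gaps, raw_gaps[1:]))

print([round(g, 1) for g in raw_gaps], doubling)
```

The gaps come out at roughly 1, 2, 4, 8 and 16 seconds, which is the pattern a client produces when it never receives a SYN-ACK.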

I'm unsure if this is a routing issue. I added the gateway for my /28 to the bridged network interface, but that had no impact (and I don't see why a gateway setting would prevent the announcing node from responding to requests).
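
To illustrate why I doubt the gateway is involved, here is a small Python sketch. It assumes that the two addresses in the ARP capture above (66.85.73.221 as the LoadBalancer IP being queried and 66.85.73.210 as the hypervisor doing the asking) both belong to my /28; that reading of the capture is my assumption.

```python
import ipaddress

# Addresses taken from the ARP capture; /28 is the size of my allocation.
lb_ip = ipaddress.ip_address("66.85.73.221")
hypervisor_ip = ipaddress.ip_address("66.85.73.210")

# strict=False derives the enclosing network from a host address.
subnet = ipaddress.ip_network(f"{lb_ip}/28", strict=False)

# If both hosts share the subnet, traffic between them is on-link.
same_subnet = hypervisor_ip in subnet

print(subnet, same_subnet)
```

Both addresses fall in 66.85.73.208/28, so traffic from the hypervisor to the LoadBalancer IP should stay on-link and never pass through the gateway.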

I don't see anything out of sorts with the MetalLB deployment, and when I deploy with the private (192.168.123.0/24) network, things work as expected from within the virtual network.

I'm open to any and all suggestions around what I may have done to bork this up. I'm wondering if I need to go the BGP announce route rather than L2.

Thanks in advance to the community for reading this long post and for any suggestions or troubleshooting recommendations.

Best,
Arthur



prod01-k8s-svc-cluster.svg