Listening on wrong interface in L2 mode


Michael White

Sep 10, 2020, 11:45:25 PM
to metallb-users
I have MetalLB set up in layer 2 mode to hand out addresses from 10.0.16.0/24 to my Kubernetes services.

The Kubernetes node is 10.0.10.1.

There are 2 main networks.
- 10.0.0.0/16 is LAN
- 10.10.10.0/24 is for Ceph
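
As a quick sanity check of the addressing described above, here is a minimal Python sketch using the standard `ipaddress` module. It only restates the ranges given in this post; the function and key names are made up for illustration.

```python
import ipaddress

# Values copied from the post; nothing here is read from a live system.
LAN = ipaddress.ip_network("10.0.0.0/16")
CEPH = ipaddress.ip_network("10.10.10.0/24")
POOL = ipaddress.ip_network("10.0.16.0/24")   # MetalLB address pool
NODE = ipaddress.ip_address("10.0.10.1")      # Kubernetes node


def check_layout():
    """Return how the pool and the node relate to the two networks."""
    return {
        "pool_inside_lan": POOL.subnet_of(LAN),
        "pool_overlaps_ceph": POOL.overlaps(CEPH),
        "node_in_lan": NODE in LAN,
        "node_in_ceph": NODE in CEPH,
        "node_in_pool": NODE in POOL,
    }


if __name__ == "__main__":
    for name, value in check_layout().items():
        print(f"{name}: {value}")
```

With these values the pool sits inside the LAN /16 and does not overlap the Ceph network. Assuming the clients are configured with that /16 mask, they would be expected to ARP for 10.0.16.1 directly rather than send the traffic to a gateway.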


When I try to `curl 10.0.16.1` from any machine other than the Kubernetes node, I get a timeout error.

Looking at the speaker pod logs, I see the following lines:

{"caller":"arp.go:102","interface":"ens19","ip":"10.0.16.1","msg":"got ARP request for service IP, sending response","responseMAC":"be:77:53:b7:6f:1b","senderIP":"10.0.5.6","senderMAC":"de:b3:19:3b:59:0a","ts":"2020-09-11T03:42:07.633878541Z"}
{"caller":"arp.go:102","interface":"ens18","ip":"10.0.16.1","msg":"got ARP request for service IP, sending response","responseMAC":"ca:a9:34:b4:0d:56","senderIP":"10.0.0.20","senderMAC":"14:10:9f:e3:82:f5","ts":"2020-09-11T03:42:15.74826983Z"}
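
To compare log lines like the two above side by side, a small Python sketch can pull out the interface, sender IP, and response MAC from each entry. The two lines are copied verbatim from this post; the helper name `summarize_arp_responses` is hypothetical, and the field names are assumed to stay as they appear here.

```python
import json

# The two speaker log lines quoted in the post, unchanged.
LOG_LINES = [
    '{"caller":"arp.go:102","interface":"ens19","ip":"10.0.16.1","msg":"got ARP request for service IP, sending response","responseMAC":"be:77:53:b7:6f:1b","senderIP":"10.0.5.6","senderMAC":"de:b3:19:3b:59:0a","ts":"2020-09-11T03:42:07.633878541Z"}',
    '{"caller":"arp.go:102","interface":"ens18","ip":"10.0.16.1","msg":"got ARP request for service IP, sending response","responseMAC":"ca:a9:34:b4:0d:56","senderIP":"10.0.0.20","senderMAC":"14:10:9f:e3:82:f5","ts":"2020-09-11T03:42:15.74826983Z"}',
]


def summarize_arp_responses(lines):
    """Return (interface, senderIP, responseMAC) for each ARP log entry."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between entries
        entry = json.loads(line)
        if "ARP request" not in entry.get("msg", ""):
            continue  # ignore unrelated speaker messages
        rows.append((entry["interface"], entry["senderIP"], entry["responseMAC"]))
    return rows


if __name__ == "__main__":
    for interface, sender, mac in summarize_arp_responses(LOG_LINES):
        print(f"{sender} asked via {interface}, answered with {mac}")
```

For these two entries the response MAC differs per interface, so each client is told a different MAC for the same service IP. Whether that matters here is something a packet capture would have to confirm.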

The first request is from a storage server that has two interfaces, the second of which is for Ceph. The request seems to be coming in over that Ceph interface (logged as ens19), even though it should be using ens18 (the LAN interface) instead.

The second request is from a laptop connected to a WAP.

Both clients get timeout errors when trying to connect. Why do they time out if the speaker appears to receive their ARP requests and respond to them?

Johannes Liebermann

Sep 11, 2020, 2:33:31 PM
to Michael White, metallb-users
There are many possible reasons for a timeout. I would try to run tcpdump/Wireshark on the client and/or nodes to figure out exactly what happens. Some things to check which may help you find the issue:

- Does the ARP reply reach the client?
- Does the ARP reply contain the correct MAC address?
- After receiving the ARP reply, does the client send traffic? For example, does the client initiate a TCP connection towards the cluster?
- Assuming the client correctly initiates a request, does the request arrive at the node?
- Assuming the request arrives at the node, does it reach the relevant pod?
- If the request reaches the relevant pod, is a response generated? If so, where does it stop on the way back to the client?
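
For the third check in the list above (does the client manage to open a TCP connection?), a minimal Python sketch can tell a timeout apart from an active refusal. It is not a replacement for tcpdump/Wireshark. The function name `probe_tcp` is hypothetical, and port 80 in the note below is an assumption based on the plain `curl 10.0.16.1` in the original post.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try one TCP connection and classify the outcome.

    "connected": the three-way handshake completed.
    "refused":   something answered with a reset (the host is reachable).
    "timeout":   nothing came back in time (packets likely dropped en route).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
    except OSError as exc:
        # For example "No route to host" or "Network is unreachable".
        return f"error: {exc}"
```

Running `probe_tcp("10.0.16.1", 80)` from a client and getting "timeout" would suggest that either the request or the reply is being dropped somewhere along the path, which is where a capture on the client and on the node helps narrow things down.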

In general, MetalLB's only responsibility is to get traffic into your cluster. From there, the "standard" k8s mechanisms (kube-proxy, your network plugin) take care of things.

HTH.


