ARP issue


구재완

Dec 10, 2020, 3:21:30 AM
to metallb-users

Hi all,

In L2 mode, I have observed a problem when the leader node goes down and comes back up.
The scenario is as follows:

There are 3 nodes A, B, and C, and A owns the LoadBalancer IP.
When A goes down, I can see both surviving nodes B and C respond to the gateway's ARP query for the LoadBalancer IP's owner. So the gateway first records B's MAC address for the LB IP in its ARP table, and a few milliseconds later it records C's MAC. In addition, both B and C
send gratuitous ARP (GARP) within a few seconds.
When A comes back up, it seems to take ownership back and sends GARP again.

Is this the correct behavior? The behavior above delays clients from fully establishing connections to the server.

Could you help with this? Any opinion would be appreciated.

thanks

구재완

Dec 10, 2020, 3:46:12 AM
to metallb-users
Look at this

Another node responds first every time I run arping against the external IP:

[root@dejn-pcf11-or-bastion-00 ~]# k get svc -n ocpcf
NAME                                   TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                         AGE
ocpcf-occnp-app-info                   ClusterIP      10.233.27.114   <none>          5906/TCP                        132d
ocpcf-occnp-audit                      ClusterIP      10.233.5.73     <none>          5807/TCP                        132d
ocpcf-occnp-chf-connector              ClusterIP      10.233.0.201    <none>          5808/TCP,8443/TCP               21d
ocpcf-occnp-config-mgmt                NodePort       10.233.9.96     <none>          5808:31743/TCP                  132d
ocpcf-occnp-config-server              NodePort       10.233.44.157   <none>          5807:31187/TCP                  132d
ocpcf-occnp-egress-gateway             ClusterIP      10.233.32.140   <none>          8080/TCP                        132d
ocpcf-occnp-egress-gateway-cache       ClusterIP      None            <none>          5701/TCP                        21d
ocpcf-occnp-ingress-gateway            LoadBalancer   10.233.57.23    172.20.100.21   80:32744/TCP                    132d


[root@dejn-pcf11-or-k8s-node-02 ~]# arping 172.20.100.21 -I eth2
ARPING 172.20.100.21 from 172.20.100.16 eth2
Unicast reply from 172.20.100.21 [FA:16:3E:93:5C:D7]  0.825ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  0.845ms
Unicast reply from 172.20.100.21 [FA:16:3E:62:21:18]  0.854ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  0.971ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  1.067ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  1.393ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  0.855ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  1.029ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  0.857ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  1.046ms
^CSent 4 probes (1 broadcast(s))

Rodrigo Campos

Dec 10, 2020, 8:10:19 AM
to 구재완, metallb-users
Do you have port 7946 open on the hosts? Layer 2 mode currently needs it.
It seems like the nodes can't talk to each other, so each one thinks it is
the owner (as no other nodes are visible from its point of view). Can you open
that TCP port?

If that doesn't do the trick for you, can you try to disable the new
fast node detection algorithm? Instructions are here:
https://metallb.universe.tf/release-notes/#version-0-9-2
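
For example, a quick check like the following from each node toward the others can confirm the speakers can reach one another on that port (the address below is just a placeholder for another node's IP; memberlist uses 7946 on both TCP and UDP):

nc -zv <other-node-ip> 7946      # TCP reachability
nc -zvu <other-node-ip> 7946     # UDP probe (only indicative, since UDP has no handshake)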

--
Rodrigo Campos
---
Kinvolk GmbH | Adalbertstr.6a, 10999 Berlin | tel: +491755589364
Geschäftsführer/Directors: Alban Crequy, Chris Kühl, Iago López Galeiras
Registergericht/Court of registration: Amtsgericht Charlottenburg
Registernummer/Registration number: HRB 171414 B
Ust-ID-Nummer/VAT ID number: DE302207000

구재완

Dec 10, 2020, 9:45:18 AM
to metallb-users
Thank you for replying.

OK, let me check your suggestion.
As for port 7946, yes, I verified that communication among the speakers looks fine.


Here is what I observed. Could you help with these?
Q1: When the leader goes down, a new leader is elected and only this new leader sends GARP. But once the previous leader comes back up, it takes back leadership.
Is this the right behavior in general?

Q2: When I test with the arping -b option, all nodes reply to ARP requests for the service IP, but only the new leader sends GARP.
Is this also correct behavior?
Have you ever tested arping with '-b' (which keeps broadcasting)?

<without -b>
[root@dejn-pcf11-or-k8s-node-00 ~]# arping 172.20.100.21 -I eth2
ARPING 172.20.100.21 from 172.20.100.14 eth2
Unicast reply from 172.20.100.21 [FA:16:3E:93:5C:D7]  0.858ms
Unicast reply from 172.20.100.21 [FA:16:3E:83:B7:F0]  0.874ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  0.906ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  54.154ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  0.923ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  1.106ms

<with -b>
[root@dejn-pcf11-or-k8s-node-00 ~]# arping 172.20.100.21 -I eth2 -b
ARPING 172.20.100.21 from 172.20.100.14 eth2
Unicast reply from 172.20.100.21 [FA:16:3E:83:B7:F0]  0.833ms
Unicast reply from 172.20.100.21 [FA:16:3E:93:5C:D7]  0.851ms

Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  1.046ms
Unicast reply from 172.20.100.21 [FA:16:3E:83:B7:F0]  0.920ms
Unicast reply from 172.20.100.21 [FA:16:3E:93:5C:D7]  0.977ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  0.997ms
Unicast reply from 172.20.100.21 [FA:16:3E:93:5C:D7]  0.786ms
Unicast reply from 172.20.100.21 [FA:16:3E:83:B7:F0]  0.809ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  0.997ms

Rodrigo Campos

Dec 10, 2020, 10:05:37 AM
to 구재완, metallb-users
On Thu, Dec 10, 2020 at 3:45 PM 구재완 <koo...@gmail.com> wrote:
>
> Thank you for replying.
>
> OK, let me check your suggestion.
> As for port 7946, yes, I verified that communication among the speakers looks fine.
>
> Here is what I observed. Could you help with these?
> Q1: When the leader goes down, a new leader is elected and only this new leader sends GARP. But once the previous leader comes back up, it takes back leadership.
> Is this the right behavior in general?

Yes, there is something like a distributed "leader election", and when the old
node rejoins it becomes the leader again. This is because we order things:
when N nodes are up for a service, only one will be the leader, and it will
always be the same one. So when you get all of them back, the same leader
is elected.

>
> Q2: When I test with the arping -b option, all nodes reply to ARP requests for the service IP, but only the new leader sends GARP.
> Is this also correct behavior?

No, not at all. Can you reproduce with kind by any chance? Do you have
the logs of the speakers for when this happened?

구재완

Dec 10, 2020, 10:57:47 AM
to metallb-users
> Q1: When the leader goes down, a new leader is elected and only this new leader sends GARP. But once the previous leader comes back up, it takes back leadership.
> Is this the right behavior in general?

Yes, there is something like a distributed "leader election", and when the old
node rejoins it becomes the leader again. This is because we order things:
when N nodes are up for a service, only one will be the leader, and it will
always be the same one. So when you get all of them back, the same leader
is elected.

=> Is there any way to keep the new leader as owner when the old leader node rejoins? The second leader change also causes a service impact.

>
> Q2: When I test with the arping -b option, all nodes reply to ARP requests for the service IP, but only the new leader sends GARP.
> Is this also correct behavior?

No, not at all. Can you reproduce with kind by any chance? Do you have
the logs of the speakers for when this happened?

=> This happens without taking any special action. I think you can reproduce it in your lab; try arping with the -b option.
Below are the speaker logs captured while arping is running with -b:

[root@dejn-pcf11-or-k8s-node-00 ~]# arping 172.20.100.21 -I eth2 -b
ARPING 172.20.100.21 from 172.20.100.14 eth2
Unicast reply from 172.20.100.21 [FA:16:3E:93:5C:D7]  0.691ms
Unicast reply from 172.20.100.21 [FA:16:3E:83:B7:F0]  0.863ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  1.137ms
Unicast reply from 172.20.100.21 [FA:16:3E:93:5C:D7]  0.895ms
Unicast reply from 172.20.100.21 [FA:16:3E:83:B7:F0]  1.001ms
Unicast reply from 172.20.100.21 [FA:16:3E:C8:4A:2E]  1.227ms
Unicast reply from 172.20.100.21 [FA:16:3E:83:B7:F0]  0.729ms
Unicast reply from 172.20.100.21 [FA:16:3E:93:5C:D7]  0.755ms

================================================================
speaker-jskqw speaker {"caller":"main.go:202","component":"MemberList","msg":"net.go:785: [DEBUG] memberlist: Initiating push/pull sync with: 172.20.100.104:7946","ts":"2020-12-10T15:54:03.84535759Z"}
speaker-k4j27 speaker {"caller":"main.go:202","component":"MemberList","msg":"net.go:210: [DEBUG] memberlist: Stream connection from=172.20.100.107:51178","ts":"2020-12-10T15:54:03.845476646Z"}
speaker-mzbhl speaker {"caller":"arp.go:102","interface":"eth2","ip":"172.20.100.21","msg":"got ARP request for service IP, sending response","responseMAC":"fa:16:3e:c8:4a:2e","senderIP":"172.20.100.14","senderMAC":"fa:16:3e:62:21:18","ts":"2020-12-10T15:54:04.775015819Z"}
speaker-mzbhl speaker {"caller":"arp.go:102","interface":"eth2","ip":"172.20.100.21","msg":"got ARP request for service IP, sending response","responseMAC":"fa:16:3e:c8:4a:2e","senderIP":"172.20.100.14","senderMAC":"fa:16:3e:62:21:18","ts":"2020-12-10T15:54:05.775194559Z"}
speaker-mzbhl speaker {"caller":"arp.go:102","interface":"eth2","ip":"172.20.100.21","msg":"got ARP request for service IP, sending response","responseMAC":"fa:16:3e:c8:4a:2e","senderIP":"172.20.100.14","senderMAC":"fa:16:3e:62:21:18","ts":"2020-12-10T15:54:06.775413533Z"}
speaker-mzbhl speaker {"caller":"arp.go:102","interface":"eth2","ip":"172.20.100.21","msg":"got ARP request for service IP, sending response","responseMAC":"fa:16:3e:c8:4a:2e","senderIP":"172.20.100.14","senderMAC":"fa:16:3e:62:21:18","ts":"2020-12-10T15:54:07.775970495Z"}
speaker-mzbhl speaker {"caller":"arp.go:102","interface":"eth2","ip":"172.20.100.21","msg":"got ARP request for service IP, sending response","responseMAC":"fa:16:3e:c8:4a:2e","senderIP":"172.20.100.14","senderMAC":"fa:16:3e:62:21:18","ts":"2020-12-10T15:54:08.776155306Z"}
speaker-mzbhl speaker {"caller":"arp.go:102","interface":"eth2","ip":"172.20.100.21","msg":"got ARP request for service IP, sending response","responseMAC":"fa:16:3e:c8:4a:2e","senderIP":"172.20.100.14","senderMAC":"fa:16:3e:62:21:18","ts":"2020-12-10T15:54:09.776534104Z"}
speaker-jskqw speaker {"caller":"main.go:202","component":"MemberList","msg":"net.go:210: [DEBUG] memberlist: Stream connection from=172.20.100.105:50328","ts":"2020-12-10T15:54:09.864017497Z"}
speaker-m4z78 speaker {"caller":"main.go:202","component":"MemberList","msg":"net.go:785: [DEBUG] memberlist: Initiating push/pull sync with: 172.20.100.107:7946","ts":"2020-12-10T15:54:09.863860764Z"}
speaker-mzbhl speaker {"caller":"arp.go:102","interface":"eth2","ip":"172.20.100.21","msg":"got ARP request for service IP, sending response","responseMAC":"fa:16:3e:c8:4a:2e","senderIP":"172.20.100.14","senderMAC":"fa:16:3e:62:21:18","ts":"2020-12-10T15:54:10.776840118Z"}
speaker-mzbhl speaker {"caller":"arp.go:102","interface":"eth2","ip":"172.20.100.21","msg":"got ARP request for service IP, sending response","responseMAC":"fa:16:3e:c8:4a:2e","senderIP":"172.20.100.14","senderMAC":"fa:16:3e:62:21:18","ts":"2020-12-10T15:54:11.777189046Z"}
speaker-mzbhl speaker {"caller":"arp.go:102","interface":"eth2","ip":"172.20.100.21","msg":"got ARP request for service IP, sending response","responseMAC":"fa:16:3e:c8:4a:2e","senderIP":"172.20.100.14","senderMAC":"fa:16:3e:62:21:18","ts":"2020-12-10T15:54:12.777582447Z"}

Rodrigo Campos

Dec 10, 2020, 12:01:18 PM
to 구재완, metallb-users
On Thu, Dec 10, 2020 at 4:57 PM 구재완 <koo...@gmail.com> wrote:
>
> > Q1: When the leader goes down, a new leader is elected and only this new leader sends GARP. But once the previous leader comes back up, it takes back leadership.
> > Is this the right behavior in general?
>
> Yes, there is something like a distributed "leader election", and when the old
> node rejoins it becomes the leader again. This is because we order things:
> when N nodes are up for a service, only one will be the leader, and it will
> always be the same one. So when you get all of them back, the same leader
> is elected.
>
> => Is there any way to keep the new leader as owner when the old leader node rejoins? The second leader change also causes a service impact.

Nope, no way :(

> > Q2: When I test with the arping -b option, all nodes reply to ARP requests for the service IP, but only the new leader sends GARP.
> > Is this also correct behavior?
>
> No, not at all. Can you reproduce with kind by any chance? Do you have
> the logs of the speakers for when this happened?
>
> => This happens without taking any special action. I think you can reproduce it in your lab; try arping with the -b option.
> Below are the speaker logs captured while arping is running with -b:

Why with "-b"? Your previous examples were without it. Do you see
arping without -b being answered by more than one host or not? I'm
confused now :)

Can you share the speaker logs too?

If you can create a repro case (e.g. with kind) it would be super helpful

구재완

Dec 10, 2020, 12:55:34 PM
to metallb-users
When you run arping without -b, only the first request is broadcast; after a reply arrives, subsequent requests are sent to that unicast target MAC, not broadcast.
To keep broadcasting the ARP request every time, you need the '-b' option. You can verify this with a packet capture.
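For example, a capture like the following (the interface name is taken from the arping examples above) shows whether each request goes to the broadcast MAC ff:ff:ff:ff:ff:ff or to a unicast MAC:

tcpdump -e -n -i eth2 arp and host 172.20.100.21
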
I put the speaker logs and the arping output in my previous post.

With the -b option, I can see all nodes reply to ARP requests for the service IP.
One more thing: I am using L2 mode with kube-proxy in IPVS mode, and all nodes have the service IP assigned. Is this normal, or should only the leader have the service IP? See below:


[root@dejn-pcf11-or-k8s-node-00 ~]# ip -4 addr | grep 172.20.100.21
    inet 172.20.100.21/32 brd 172.20.100.21 scope global kube-ipvs0

[root@dejn-pcf11-or-k8s-node-01 ~]# ip -4 addr | grep 172.20.100.21
    inet 172.20.100.21/32 brd 172.20.100.21 scope global kube-ipvs0

[root@dejn-pcf11-or-k8s-node-03 ~]# ip -4 addr | grep 172.20.100.21
    inet 172.20.100.21/32 brd 172.20.100.21 scope global kube-ipvs0

[root@dejn-pcf11-or-k8s-node-04 ~]# ip -4 addr | grep 172.20.100.21
    inet 172.20.100.21/32 brd 172.20.100.21 scope global kube-ipvs0

NAME                                   TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                         AGE
ocpcf-occnp-app-info                   ClusterIP      10.233.27.114   <none>          5906/TCP                        132d
ocpcf-occnp-audit                      ClusterIP      10.233.5.73     <none>          5807/TCP                        132d
ocpcf-occnp-chf-connector              ClusterIP      10.233.0.201    <none>          5808/TCP,8443/TCP               22d
ocpcf-occnp-config-mgmt                NodePort       10.233.9.96     <none>          5808:31743/TCP                  132d
ocpcf-occnp-config-server              NodePort       10.233.44.157   <none>          5807:31187/TCP                  132d
ocpcf-occnp-egress-gateway             ClusterIP      10.233.32.140   <none>          8080/TCP                        132d
ocpcf-occnp-egress-gateway-cache       ClusterIP      None            <none>          5701/TCP                        22d
ocpcf-occnp-ingress-gateway            LoadBalancer   10.233.57.23    172.20.100.21   80:32744/TCP                    132d

Rodrigo Campos

Dec 11, 2020, 6:48:03 AM
to 구재완, metallb-users
On Thu, Dec 10, 2020 at 6:55 PM 구재완 <koo...@gmail.com> wrote:
> When you run arping without -b, only the first request is broadcast; after a reply arrives, subsequent requests are sent to that unicast target MAC, not broadcast.
> To keep broadcasting the ARP request every time, you need the '-b' option. You can verify this with a packet capture.

I meant: do you also see that without -b, or only with -b?

> I put the speaker logs and the arping output in my previous post.

Oh, sorry, missed them :D

> root@dejn-pcf11-or-k8s-node-00 ~]# ip -4 addr | grep 172.20.100.21
> inet 172.20.100.21/32 brd 172.20.100.21 scope global kube-ipvs0
>
> [root@dejn-pcf11-or-k8s-node-01 ~]# ip -4 addr | grep 172.20.100.21
> inet 172.20.100.21/32 brd 172.20.100.21 scope global kube-ipvs0
>
> [root@dejn-pcf11-or-k8s-node-03 ~]# ip -4 addr | grep 172.20.100.21
> inet 172.20.100.21/32 brd 172.20.100.21 scope global kube-ipvs0
>
> [root@dejn-pcf11-or-k8s-node-04 ~]# ip -4 addr | grep 172.20.100.21
> inet 172.20.100.21/32 brd 172.20.100.21 scope global kube-ipvs0

MetalLB is not doing this; maybe you added it manually or something? This is
why all of them answer ARP: they all have the IP, and Linux is answering for
you (although I'm not sure why it's a /32).

None of the nodes should have the IP listed in ip a; MetalLB answers ARP for
the IP without adding it to any interface (it just answers the ARP).

구재완

Dec 13, 2020, 6:25:59 AM
to metallb-users
This issue has been solved by setting strictARP: true and restarting the kube-proxy service.

I had already set this option to true, but it was not taking effect because the option lives in the kube-proxy ConfigMap, which means kube-proxy has to be restarted before the changed value applies.
I had missed the restart step.
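
For reference, this is roughly what the change looks like (the ConfigMap and DaemonSet names follow the usual kube-system layout; adjust for your cluster):

# set strictARP under the ipvs section of the kube-proxy configuration
kubectl -n kube-system edit configmap kube-proxy
#   ipvs:
#     strictARP: true

# restart kube-proxy so the changed value takes effect
kubectl -n kube-system rollout restart daemonset kube-proxy

# with strictARP, kube-proxy sets these sysctls so nodes stop answering
# ARP for the VIPs on the kube-ipvs0 dummy interface (expect 1 and 2)
sysctl net.ipv4.conf.all.arp_ignore
sysctl net.ipv4.conf.all.arp_announce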

BTW, I am still wondering whether it is really not possible to keep the new leader as leader after the old leader rejoins.

thanks

Rodrigo Campos

Dec 14, 2020, 5:39:09 AM
to 구재완, metallb-users
On Sun, Dec 13, 2020 at 12:26 PM 구재완 <koo...@gmail.com> wrote:
>
> This issue has been solved by setting strictARP: true and restarting the kube-proxy service.

Hmm, if you have the IP added manually on the nodes, they will respond
to ARP and it won't work. Have you removed that too?

> BTW, I am still wondering whether it is really not possible to keep the new leader as leader after the old leader rejoins.

It is really, really not possible with the current code :)

